Questions tagged [tfx]

TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines.

TFX is a Google-production-scale machine learning platform based on TensorFlow. It provides a configuration framework and shared libraries to integrate common components needed to define, launch, and monitor your machine learning system.

185 questions
0
votes
0 answers

How to customize tfx Evaluator component

We are developing an ML pipeline with TFX; the model is a Keras sequential DNN with mostly dense layers. We would like to customize the Evaluator component to compare the blessed and the candidate model by comparing their predictions on a test set of…
Sara C.
0
votes
0 answers

How to use jupyter notebook in a local tfx docker image?

I have a Windows laptop with WSL2 installed. In it I have installed Docker and successfully run the Docker installation of TFX. I am not sure how to customize the code and run it as in the example Google Colab notebook (Colab…
Yash Kanojia
0
votes
1 answer

How do you feed Ragged Tensors to a DNN trained by TensorFlow Extended?

We are developing an ML pipeline with TFX, with the most common components such as ExampleGen, Transform, Trainer, and so on. The examples that have to be fed to the DNN have varying lengths, so we decided to use ragged tensors to enable an input…
0
votes
1 answer

What does DataAccessor do in tfx?

I'm reading the TFX tutorials, which all use the DataAccessor to load data. The code looks something like this: return data_accessor.tf_dataset_factory( file_pattern, tfxio.TensorFlowDatasetOptions( batch_size=batch_size,…
bli00
0
votes
1 answer

join datasets with tfx tensorflow transform

I am trying to replicate in TensorFlow Transform some data preprocessing that I have done in pandas. I have a few CSV files, which I joined and aggregated with pandas to produce a training dataset. Now, as part of productionising the model, I would…
DarioB
0
votes
1 answer

Is there an implemented way to use a kubeflow pipeline's output outside the pipeline?

I'm using local Kubeflow Pipelines to build a continuous machine learning test project. I have one pipeline that preprocesses the data using TFX and automatically saves the outputs to MinIO. Outside of this pipeline, I want to train the model…
0
votes
1 answer

How to get vocabulary size in tensorflow_transform before apply_vocabulary?

I also posted this question at https://github.com/tensorflow/transform/issues/261. I am using tft in TFX and need to transform string-list class labels into multi-hot indicators inside preprocessing_fn. Essentially: vocab =…
ynait
0
votes
1 answer

local TFX pipeline run create ERROR Failed to make stateful working dir ; Protocol error

I am following Building a TFX Pipeline Locally (https://www.tensorflow.org/tfx/guide/build_local_pipeline) on Ubuntu 21.04. I am running only the CsvExampleGen component and I am getting the following error: ERROR:absl:Failed to make stateful…
m c
0
votes
1 answer

tfx extension create failed while running npm script

I'm using this package to compile and build my extension: { "id": "extensionAzureDevOps", "name": "extension", "publisher": "ME", "version": "0.0.63", "description": "Azure DevOps Extension", "keywords": [ "extensions", "Azure…
Nadav
0
votes
2 answers

Equivalent of TFX Standard Components in KubeFlow

I have an existing TFX pipeline here that I want to rewrite using the Kubeflow Pipelines SDK. The existing pipeline uses many TFX Standard Components such as ExampleValidator. When checking the Kubeflow SDK, I see a kfp.components.package but no…
pirateofebay
0
votes
0 answers

How to write an aggregation function inside the Transform component of TFX?

I have a datetime feature, a cluster number, and coordinates in my TFRecord file. How can I write a function inside the Transform component of TFX to count the records for each cluster number over 30-minute time intervals? Or is there any built-in…
0
votes
1 answer

Delete a column from TFRecord Dataset (for feature selection)

I am trying to implement a Feature Selection component with the following plan in mind. The implementation: the component takes an InputArtifact[Example] as input. Since the data is stored in the form of TFRecords in the URI of the input artifact, I…
Eagle
0
votes
1 answer

Why pod on GKE cluster is OOMkilled when trying to run a very simple Kubeflow pipeline using TFX?

I'm following the TFX on Cloud AI Platform Pipelines tutorial to implement a Kubeflow orchestrated pipeline on Google Cloud. The main difference is that I'm trying to implement an Object Detection solution instead of the Taxi application proposed by…
0
votes
0 answers

pip install TFX taking way too long

I am trying to deploy a Kubeflow pipeline using TFX. I have upgraded pip to the latest version; however, TFX is taking far too long to install. I've installed TFX on my other projects in GCP, but it has never taken this long. Also,…
0
votes
2 answers

TFX/Apache Beam -> Flink jobs hang when running on more than one task manager

When I try to run a TFX pipeline/Apache Beam job on a Flink runner, it works fine with 1 task manager (on one node) and parallelism 2 (2 task slots per task manager), but it hangs when I try it with higher parallelism on more than one task…