System information

  • Environment: Linux 5.13.0-41-generic #46~20.04.1-Ubuntu
  • TensorFlow version: 2.8.0
  • TFX Version: 1.8.0
  • Python version: 3.8.13
  • Python dependencies: absl-py 1.0.0 alembic 1.7.7 anyio 3.5.0 apache-airflow 2.2.5 apache-airflow-providers-ftp 2.1.2 apache-airflow-providers-http 2.1.2 apache-airflow-providers-imap 2.2.3 apache-airflow-providers-sqlite 2.1.3 apache-beam 2.39.0 apispec 3.3.2 argcomplete 2.0.0 argon2-cffi 21.3.0 argon2-cffi-bindings 21.2.0 asgiref 3.5.1 astunparse 1.6.3 attrs 20.3.0 Babel 2.9.1 backcall 0.2.0 beautifulsoup4 4.10.0 bleach 4.1.0 blinker 1.4 cachelib 0.6.0 cachetools 4.2.4 cattrs 1.10.0 certifi 2021.10.8 cffi 1.15.0 charset-normalizer 2.0.12 click 7.1.2 clickclick 20.10.2 cloudpickle 2.0.0 colorama 0.4.4 colorlog 6.6.0 commonmark 0.9.1 connexion 2.13.0 crcmod 1.7 croniter 1.3.4 cryptography 36.0.2 cycler 0.11.0 Cython 0.29.28 debugpy 1.6.0 decorator 5.1.1 defusedxml 0.7.1 Deprecated 1.2.13 dill 0.3.1.1 dm-tree 0.1.6 dnspython 2.2.1 docker 4.4.4 docopt 0.6.2 docutils 0.16 email-validator 1.1.3 entrypoints 0.4 fastapi 0.78.0 fastapi-utils 0.2.1 fastavro 1.4.10 fasteners 0.17.3 fastjsonschema 2.15.3 Flask 1.1.4 Flask-AppBuilder 3.4.5 Flask-Babel 2.0.0 Flask-Caching 1.10.1 Flask-JWT-Extended 3.25.1 Flask-Login 0.4.1 Flask-OpenID 1.3.0 Flask-Session 0.4.0 Flask-SQLAlchemy 2.5.1 Flask-WTF 0.14.3 flatbuffers 2.0 fonttools 4.31.2 gast 0.5.3 gin-config 0.5.0 google-api-core 1.31.5 google-api-python-client 1.12.11 google-apitools 0.5.31 google-auth 1.35.0 google-auth-httplib2 0.1.0 google-auth-oauthlib 0.4.6 google-cloud-aiplatform 1.11.0 google-cloud-bigquery 2.34.3 google-cloud-bigquery-storage 2.13.0 google-cloud-bigtable 1.7.1 google-cloud-core 1.7.2 google-cloud-datastore 1.15.4 google-cloud-dlp 3.6.2 google-cloud-language 1.3.0 google-cloud-pubsub 2.11.0 google-cloud-pubsublite 1.4.1 google-cloud-recommendations-ai 0.2.0 google-cloud-spanner 1.19.1 google-cloud-storage 2.2.1 google-cloud-videointelligence 1.16.1 google-cloud-vision 1.0.1 google-crc32c 1.3.0 google-pasta 0.2.0 google-resumable-media 2.3.2 googleapis-common-protos 1.56.0 graphviz 0.20 grpc-google-iam-v1 
0.12.3 grpcio 1.45.0 grpcio-gcp 0.2.2 grpcio-status 1.45.0 gunicorn 20.1.0 h11 0.12.0 h5py 3.6.0 hdfs 2.7.0 httpcore 0.14.7 httplib2 0.19.1 httpx 0.22.0 idna 3.3 importlib-metadata 4.11.3 importlib-resources 5.6.0 inflection 0.5.1 ipykernel 6.12.1 ipython 7.32.0 ipython-genutils 0.2.0 ipywidgets 7.7.0 iso8601 1.0.2 itsdangerous 1.1.0 jedi 0.18.1 Jinja2 2.11.3 joblib 0.14.1 jsonschema 3.2.0 jupyter-client 7.2.1 jupyter-core 4.9.2 jupyterlab-pygments 0.1.2 jupyterlab-widgets 1.1.0 kaggle 1.5.12 keras 2.8.0 Keras-Preprocessing 1.1.2 keras-tuner 1.1.2 kiwisolver 1.4.2 kt-legacy 1.0.4 kubernetes 12.0.1 lazy-object-proxy 1.7.1 libclang 13.0.0 lockfile 0.12.2 Mako 1.2.0 Markdown 3.3.6 MarkupSafe 2.0.1 marshmallow 3.15.0 marshmallow-enum 1.5.1 marshmallow-oneofschema 3.0.1 marshmallow-sqlalchemy 0.26.1 matplotlib 3.5.1 matplotlib-inline 0.1.3 mistune 0.8.4 ml-metadata 1.8.0 ml-pipelines-sdk 1.8.0 nbclient 0.5.13 nbconvert 6.4.5 nbformat 5.3.0 nest-asyncio 1.5.5 notebook 6.4.10 numpy 1.21.5 oauth2client 4.1.3 oauthlib 3.2.0 opencv-python-headless 4.5.5.64 opt-einsum 3.3.0 orjson 3.6.7 overrides 6.1.0 packaging 20.9 pandas 1.4.2 pandocfilters 1.5.0 parso 0.8.3 pendulum 2.1.2 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.1.0 pip 21.2.4 portalocker 2.4.0 portpicker 1.5.0 prison 0.2.1 prometheus-client 0.13.1 promise 2.3 prompt-toolkit 3.0.29 proto-plus 1.20.3 protobuf 3.20.0 psutil 5.9.0 ptyprocess 0.7.0 py-cpuinfo 8.0.0 pyarrow 5.0.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pycocotools 2.0.4 pycparser 2.21 pydantic 1.9.0 pydot 1.4.2 pyfarmhash 0.3.2 Pygments 2.11.2 PyJWT 1.7.1 pymongo 3.12.3 pyparsing 2.4.7 pyrsistent 0.18.1 python-daemon 2.3.0 python-dateutil 2.8.2 python-nvd3 0.15.0 python-slugify 4.0.1 python3-openid 3.2.0 pytz 2022.1 pytzdata 2020.1 PyYAML 5.4.1 pyzmq 22.3.0 regex 2022.3.15 requests 2.27.1 requests-oauthlib 1.3.1 rfc3986 1.5.0 rich 12.2.0 rsa 4.8 sacrebleu 2.0.0 scikit-learn 1.0.2 scipy 1.8.0 Send2Trash 1.8.0 sentencepiece 0.1.96 seqeval 1.2.2 setproctitle 1.2.3 
setuptools 58.0.4 six 1.16.0 sniffio 1.2.0 soupsieve 2.3.1 SQLAlchemy 1.3.24 SQLAlchemy-JSONField 1.0.0 SQLAlchemy-Utils 0.38.2 starlette 0.19.1 swagger-ui-bundle 0.0.9 tabulate 0.8.9 tenacity 8.0.1 tensorboard 2.8.0 tensorboard-data-server 0.6.1 tensorboard-plugin-wit 1.8.1 tensorflow 2.8.0 tensorflow-addons 0.16.1 tensorflow-data-validation 1.8.0 tensorflow-datasets 4.5.2 tensorflow-hub 0.12.0 tensorflow-io-gcs-filesystem 0.24.0 tensorflow-metadata 1.8.0 tensorflow-model-analysis 0.39.0 tensorflow-model-optimization 0.7.2 tensorflow-serving-api 2.8.0 tensorflow-text 2.8.1 tensorflow-transform 1.8.0 termcolor 1.1.0 terminado 0.13.3 testpath 0.6.0 text-unidecode 1.3 tf-estimator-nightly 2.8.0.dev2021122109 tf-models-official 2.8.0 tf-slim 1.1.0 tfx 1.8.0 tfx-bsl 1.8.0 threadpoolctl 3.1.0 tornado 6.1 tqdm 4.64.0 traitlets 5.1.1 typeguard 2.13.3 typing_extensions 4.1.1 typing-utils 0.1.0 unicodecsv 0.14.1 uritemplate 3.0.1 urllib3 1.26.9 uvicorn 0.17.6 wcwidth 0.2.5 webencodings 0.5.1 websocket-client 1.3.2 Werkzeug 1.0.1 wheel 0.37.1 widgetsnbextension 3.6.0 wrapt 1.14.0 WTForms 2.3.3 zipp 3.8.0

Issue description: I am using the Transform component with the custom_config argument, like this:

  transform = tfx.components.Transform(
      module_file=os.path.abspath(self.cfg.transformer_fn),
      examples=example_gen.outputs['examples'],
      schema=schema_gen.outputs['schema'],
      custom_config=self.hyper_params
  )

If I implement preprocessing_fn without custom_config:

def preprocessing_fn(inputs):
    config = Configer(
        os.path.join(__ROOT, "configs", "bert.yaml")
    )
    preprocessor = BertPreprocessor(config)
    outputs = preprocessor.run(inputs)

    return outputs

It all works fine with downstream components.

But when I then implement preprocessing_fn with custom_config:

def preprocessing_fn(inputs, custom_config):
    preprocessor = BertPreprocessor(custom_config)
    outputs = preprocessor.run(inputs)

    return outputs

The Transform layer does not work when I save the model for TF Serving with the code below:

    def _get_serve_tf_strings_fn(self, model, tf_transform_output):

        model.tft_layer = tf_transform_output.transform_features_layer()

        @tf.function(input_signature=[
            tf.TensorSpec(shape=[None], dtype=tf.string, name='examples')
        ])
        def serve_tf_examples_fn(text):
            reshaped_text = tf.reshape(text, [-1, 1])
            transformed_features = model.tft_layer({"text": reshaped_text})
            outputs = model(transformed_features)

            return {'outputs': outputs}

        return serve_tf_examples_fn

transformed_features is {}, with nothing in it. Is it that I cannot use custom_config in this way, or is there some other way to do so?

1 Answer

From my understanding, part of the price of using the power of TFX is that you have to stick to its function signatures. In this case, it looks like you are trying to do some hyperparameter tuning in your preprocessing layers/functions. I would recommend putting those hyperparameters in Keras preprocessing layers, integrating those layers into the model, putting the KerasTuner hyperparameter configurations into the tuner, and driving that through the Tuner component.
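
To illustrate what "sticking to the function signature" can mean in practice, here is a minimal pure-Python sketch of signature-based argument binding: the caller inspects a user function and passes `custom_config` only when the function declares a parameter with exactly that name. This is not TFX's actual code. `maybe_bind_custom_config`, the two toy preprocessing functions, and the `max_len` key are all hypothetical names made up for the illustration.

```python
import functools
import inspect


def maybe_bind_custom_config(fn, custom_config):
    """Bind custom_config to fn only if fn declares a parameter with that name.

    Hypothetical helper for illustration; not part of TFX.
    """
    params = inspect.signature(fn).parameters
    if "custom_config" in params:
        # Hand the function a plain dict (empty if nothing was supplied).
        return functools.partial(fn, custom_config=dict(custom_config or {}))
    # The function does not ask for a config, so return it unchanged.
    return fn


def preprocessing_fn_plain(inputs):
    # Toy stand-in for a preprocessing_fn that takes no config.
    return {"text": inputs["text"].lower()}


def preprocessing_fn_configured(inputs, custom_config):
    # Toy stand-in for a preprocessing_fn that reads a hypothetical key.
    max_len = custom_config.get("max_len", 8)
    return {"text": inputs["text"].lower()[:max_len]}


config = {"max_len": 5}
plain = maybe_bind_custom_config(preprocessing_fn_plain, config)
configured = maybe_bind_custom_config(preprocessing_fn_configured, config)

print(plain({"text": "Hello World"}))       # {'text': 'hello world'}
print(configured({"text": "Hello World"}))  # {'text': 'hello'}
```

In this sketch the config arrives as a plain dict, so a preprocessor that expects a richer config object would have to build that object from the dict inside the function.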

Pritam Dodeja