0

I couldn't find a tutorial for Cloud ML Engine + Airflow. Could someone please help me deploy a Cloud ML Engine model and orchestrate it with Airflow so that training runs on new data every hour?

AVR

3 Answers

4

You can schedule ML Engine jobs by following the Composer quick start available here together with the Airflow ML Engine operators' documentation here. The trainer package is created on GCS when you submit a training job to ML Engine, or you can build and upload it manually. If you intend to do hyperparameter optimization, your package will need to contain a setup.py, as mentioned here.

Below is an example DAG for the iris scikit-learn model (reference here):

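import datetime
import uuid

from airflow import models
from airflow.contrib.operators import mlengine_operator

# Placeholder default arguments; the referenced sample defines its own.
default_dag_args = {'start_date': datetime.datetime(2020, 1, 1)}

with models.DAG(
        'composer_sample_ml',
        # Run the DAG once per day; use datetime.timedelta(hours=1) for hourly runs
        schedule_interval=datetime.timedelta(days=1),
        default_args=default_dag_args) as dag:

    train_model = mlengine_operator.MLEngineTrainingOperator(
        task_id='train_model',
        project_id='PROJECT_ID',
        # uuid4().hex has no hyphens; job IDs allow only letters, digits and underscores
        job_id='{}_{}'.format('iris_train_job', uuid.uuid4().hex),
        # package_uris is a list of GCS paths
        package_uris=['gs://BUCKET_ID/scikit_learn_job_dir/packages/PACKAGE_ID/iris_sklearn_trainer-0.1.tar.gz'],
        training_python_module='iris_sklearn_trainer.iris',
        # No shell is involved, so inner quotes would be passed to the trainer literally
        training_args=['--jobDir=gs://BUCKET_ID/scikit_learn_job_dir'],
        region='us-central1',
        scale_tier='BASIC',
        # The operator's keyword arguments are snake_case
        runtime_version='1.8',
        python_version='2.7'
    )

    train_model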
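
Since the question asks for a run every hour, each run needs its own job ID, because ML Engine does not let you reuse one. The snippet below is a minimal sketch of one way to derive such an ID from the run's timestamp. `make_job_id` and `HOURLY` are hypothetical names introduced here for illustration; they are not part of Airflow or ML Engine.

```python
import datetime
import re


def make_job_id(prefix, run_time):
    """Build a job ID from a prefix and the run's timestamp (hypothetical helper)."""
    stamp = run_time.strftime('%Y%m%d_%H%M%S')
    job_id = '{}_{}'.format(prefix, stamp)
    # Replace anything other than letters, digits and underscores.
    return re.sub(r'[^A-Za-z0-9_]', '_', job_id)


# An hourly interval that could be passed as schedule_interval to the DAG.
HOURLY = datetime.timedelta(hours=1)

print(make_job_id('iris_train_job', datetime.datetime(2020, 6, 18, 22, 0)))
# prints "iris_train_job_20200618_220000"
```

If job_id is a templated field in your Airflow version, a template such as 'iris_train_job_{{ ts_nodash }}' may achieve the same thing without a helper; that is an assumption worth checking against the operator you have installed.
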
KVK
  • I have a question about hyperparameter tuning. When I use the Python API, I put the hp_spec in the JSON request. In this case it is not clear to me how to pass all of those specs. – Pablo V. Jun 18 '20 at 22:09
  • @PabloV. Have you tried packaging the training task similarly to the iris sample [here](https://github.com/GoogleCloudPlatform/cloudml-samples/tree/master/tensorflow/standard/iris)? Essentially, the HP spec goes in hptuning_config.yaml, and the package should have a setup.py for it to be picked up. – KVK Jun 29 '20 at 16:41
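
To make the exchange above concrete, here is a rough sketch of the hyperparameter spec written as a Python dict. The field names follow the `trainingInput.hyperparameters` structure of the ML Engine training API, which is the same structure a hptuning_config.yaml file carries. `build_training_input`, the `learning_rate` parameter and the `accuracy` metric tag are assumptions made for illustration; whether an Airflow operator forwards this dict depends on the operator version you run.

```python
import json


def build_training_input(metric_tag, max_trials, max_parallel_trials):
    """Return a trainingInput dict carrying a hyperparameter spec (hypothetical helper)."""
    return {
        'scaleTier': 'BASIC',
        'hyperparameters': {
            'goal': 'MAXIMIZE',
            # Must match the metric name the trainer reports.
            'hyperparameterMetricTag': metric_tag,
            'maxTrials': max_trials,
            'maxParallelTrials': max_parallel_trials,
            'params': [
                {
                    # Hypothetical argument; the trainer must accept --learning_rate.
                    'parameterName': 'learning_rate',
                    'type': 'DOUBLE',
                    'minValue': 0.0001,
                    'maxValue': 0.1,
                    'scaleType': 'UNIT_LOG_SCALE',
                },
            ],
        },
    }


training_input = build_training_input('accuracy', max_trials=4, max_parallel_trials=2)
print(json.dumps({'trainingInput': training_input}, indent=2))
```

The printed JSON mirrors what the YAML file would contain under its top-level trainingInput key, so the two forms can be compared side by side.
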
2

Here's a tutorial that includes ML Engine and Composer: https://cloud.google.com/solutions/machine-learning/recommendation-system-tensorflow-deploy

Trevor Edwards
0

As an update for anyone looking at this after June 2020: in addition to the resources above, these two PRs include updates to the example DAG and a guide for these operators. I don't think the docs reflect them yet, but once they do I'll try to update this answer.

  1. https://github.com/apache/airflow/pull/9727
  2. https://github.com/apache/airflow/pull/9798/files?short_path=b8207cb#diff-b8207cb1601efda85d3cbcf0087a6347
LEC