
I'm trying to submit an ML Engine training job (ml-engine jobs submit training) from Cloud Composer, following the guide recommendation-system-tensorflow-deploy.

I'm using a plugin that Google created (see the implementation here).

I'm trying to make it run on Python 3.5 by changing line 206 from:

training_request = {
    'jobId': job_id,
    'trainingInput': {
        'scaleTier': self._scale_tier,
        'packageUris': self._package_uris,
        'pythonModule': self._training_python_module,
        'region': self._region,
        'args': self._training_args,
        'masterType': self._master_type
    }
}

To:

training_request = {
    'jobId': job_id,
    'trainingInput': {
        'scaleTier': self._scale_tier,
        'packageUris': self._package_uris,
        'pythonModule': self._training_python_module,
        'region': self._region,
        'args': self._training_args,
        'masterType': self._master_type,
        'python-version': '3.5'  # self._python_version
    }
}

I also tried adding the runtime version (runtime-version='1.12'), but I keep getting the following error:

[2019-01-20 11:58:36,331] {models.py:1594} ERROR - <HttpError 400 when requesting https://ml.googleapis.com/v1/projects/hallowed-forge-577/jobs?alt=json returned "Invalid JSON payload received. Unknown name "python-version" at 'job.training_input': Cannot find field.">
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1492, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/home/airflow/gcs/plugins/ml_engine_plugin.py", line 241, in execute
    self._project_id, training_request, check_existing_job)
  File "/home/airflow/gcs/plugins/ml_engine_plugin.py", line 79, in create_job
    request.execute()
  File "/usr/local/lib/python3.6/site-packages/oauth2client/util.py", line 135, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/googleapiclient/http.py", line 838, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 400 when requesting https://ml.googleapis.com/v1/projects/hallowed-forge-577/jobs?alt=json returned "Invalid JSON payload received. Unknown name "python-version" at 'job.training_input': Cannot find field.">
[2019-01-20 11:58:36,334] {models.py:1623} INFO - Marking task as FAILED.
[2019-01-20 11:58:36,513] {models.py:1627} ERROR - Failed to send email to: ['airflow@example.com']
[2019-01-20 11:58:36,516] {models.py:1628} ERROR - HTTP Error 401: Unauthorized
Traceback (most recent call last):
  File "/usr/local/lib/airflow/airflow/models.py", line 1625, in handle_failure
    self.email_alert(error, is_retry=False)
  File "/usr/local/lib/airflow/airflow/models.py", line 1778, in email_alert
    send_email(task.email, title, body)
  File "/usr/local/lib/airflow/airflow/utils/email.py", line 44, in send_email
    return backend(to, subject, html_content, files=files, dryrun=dryrun, cc=cc, bcc=bcc, mime_subtype=mime_subtype)
  File "/usr/local/lib/airflow/airflow/contrib/utils/sendgrid.py", line 116, in send_email
    _post_sendgrid_mail(mail.get())
  File "/usr/local/lib/airflow/airflow/contrib/utils/sendgrid.py", line 122, in _post_sendgrid_mail
    response = sg.client.mail.send.post(request_body=mail_data)
  File "/usr/local/lib/python3.6/site-packages/python_http_client/client.py", line 252, in http_request
    return Response(self._make_request(opener, request, timeout=timeout))
  File "/usr/local/lib/python3.6/site-packages/python_http_client/client.py", line 176, in _make_request
    raise exc
python_http_client.exceptions.UnauthorizedError: HTTP Error 401: Unauthorized

Note that the Python version shown in the traceback does change (to 3.6 from the original 2.7), so changing the Python version has some effect, but the job submission still fails.

Any help on what I'm missing here would be awesome!

Yehoshaphat Schellekens

1 Answer


It seems the example uses an old version of Airflow's MLEngineTrainingOperator. The latest version supports the runtime-version and python-version training parameters. Use the current version: mlengine_operator.py
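For reference, the REST API rejects the hyphenated name because the fields under trainingInput are camelCase: runtimeVersion and pythonVersion. Below is a minimal sketch of how the request body could be built with those keys. The helper name build_training_request, the job ID, the bucket path, and the module name are all hypothetical placeholders, not taken from the plugin.

```python
def build_training_request(job_id, scale_tier, package_uris, python_module,
                           region, args, master_type=None,
                           runtime_version=None, python_version=None):
    """Build a request body for the ML Engine projects.jobs.create call.

    Hypothetical helper for illustration; optional fields are only
    included when a value is given.
    """
    training_input = {
        'scaleTier': scale_tier,
        'packageUris': package_uris,
        'pythonModule': python_module,
        'region': region,
        'args': args,
    }
    if master_type:
        training_input['masterType'] = master_type
    # The API field names are camelCase, not the hyphenated gcloud flag names.
    if runtime_version:
        training_input['runtimeVersion'] = runtime_version
    if python_version:
        training_input['pythonVersion'] = python_version
    return {'jobId': job_id, 'trainingInput': training_input}


# Placeholder values, assumed for the example only.
request = build_training_request(
    job_id='example_training_job_001',
    scale_tier='BASIC',
    package_uris=['gs://example-bucket/trainer-0.1.tar.gz'],
    python_module='trainer.task',
    region='us-central1',
    args=['--epochs', '5'],
    runtime_version='1.12',
    python_version='3.5',
)
```

With the current MLEngineTrainingOperator you should not need to build this dict yourself: the same values are passed as the runtime_version and python_version arguments of the operator.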

YPrei