I'm working on a DoFn that writes to Elastic App Search (via elastic_enterprise_search.AppSearch). It works fine when I run my pipeline with the DirectRunner.

But when I deploy to Dataflow, the Elastic client fails because, I suppose, it can't access a certificate store:

 File "/usr/local/lib/python3.8/site-packages/urllib3/util/ssl_.py", line 402, in ssl_wrap_socket
    context.load_verify_locations(ca_certs, ca_cert_dir, ca_cert_data)
FileNotFoundError: [Errno 2] No such file or directory
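
To see what's actually missing, here's a small diagnostic sketch of my own (not part of the pipeline) that could run inside a DoFn on the worker; if the reported cafile/capath don't exist, load_verify_locations() fails with exactly this FileNotFoundError:

    import os
    import ssl

    # Inspect the default CA locations OpenSSL expects on this worker.
    # On my local machine (DirectRunner) these exist; on a bare Dataflow
    # worker they may not.
    paths = ssl.get_default_verify_paths()
    print(paths.cafile, paths.capath)
    print(os.path.exists(paths.cafile or ''), os.path.isdir(paths.capath or ''))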

Any advice on how to overcome this sort of problem? I'm finding it difficult to get any traction on it via Google searches.

Obviously urllib3 is set up properly on my local machine for the DirectRunner. I have "elastic-enterprise-search" in the REQUIRED_PACKAGES list of my package's setup.py, along with all my other dependencies:

REQUIRED_PACKAGES = [
    'PyMySQL',
    'sqlalchemy',
    'cloud-sql-python-connector',
    'google-cloud-pubsub',
    'elastic-enterprise-search',
]
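
For reference, here's a minimal sketch of how that list is wired up (the package name and version are placeholders); Dataflow installs these dependencies on each worker when the job is submitted with --setup_file:

    import setuptools

    setuptools.setup(
        name='my-pipeline',  # placeholder name
        version='0.0.1',     # placeholder version
        packages=setuptools.find_packages(),
        install_requires=REQUIRED_PACKAGES,  # the list shown above
    )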

Can I package certificates up with my pipeline? How? Should I look into creating a custom Docker image? Any hints on what it should look like?
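
One workaround I'm wondering about, assuming the AppSearch client accepts a ca_certs argument like other Elastic clients do (I haven't confirmed this for my client version), is to point it at the CA bundle that certifi ships inside its own wheel, so no system certificate store is needed:

    import certifi
    from elastic_enterprise_search import AppSearch

    # Point the client at certifi's bundled CA file instead of relying
    # on a system certificate store that may be absent on the worker.
    # The endpoint and key below are placeholders, and 'ca_certs' is an
    # assumption to verify against the client's docs.
    client = AppSearch(
        'https://my-deployment.ent.example.com:443',
        http_auth='private-xxxxxxxxxxxx',
        ca_certs=certifi.where(),
    )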

Robert Moskal

1 Answer

Yes, creating a custom container that has the necessary certificates in it would work well here.
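
As a minimal sketch (assuming a Python 3.8 pipeline on Beam 2.35.0; match the base image to your SDK version), the container only needs the system CA bundle and your Python dependencies layered on top of the stock Beam image:

    FROM apache/beam_python3.8_sdk:2.35.0

    # Install the system CA certificates that urllib3 needs for TLS
    # verification.
    RUN apt-get update \
        && apt-get install -y --no-install-recommends ca-certificates \
        && rm -rf /var/lib/apt/lists/*

    # Preinstall the pipeline's dependencies; certifi also bundles its
    # own CA file inside the wheel, which sidesteps the system store.
    RUN pip install --no-cache-dir \
        PyMySQL sqlalchemy cloud-sql-python-connector \
        google-cloud-pubsub elastic-enterprise-search certifi

Build and push that image to Container Registry or Artifact Registry, then launch the job with --sdk_container_image=REGISTRY/IMAGE:TAG (older SDK versions also require --experiments=use_runner_v2).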

robertwb