3

I am working on building an Apache Beam pipeline but I am running into an AttributeError when attempting to import pipeline options.

I am testing on an Ubuntu server in a clean virtual environment using Python 3.6.

Steps:

virtualenv -p python3.6 beam-env
. beam-env/bin/activate
pip install apache_beam==2.12.0
python3.6 test.py

Inside test.py:

from apache_beam.options.pipeline_options import PipelineOptions

I would expect the import to work successfully but I am getting the following error:

AttributeError: module 'apache_beam.coders.coders' has no attribute 'VarIntCoder'

Justin Miller

4 Answers

28

I ran into some very odd behavior while writing my own data pipeline with apache-beam: you get this error as soon as you have a pipeline in a file called test.py.

Say your current implementation is in main.py. If you copy the whole code into a file named test.py, the error then appears when running either file.

The only fix I have found so far is to delete test.py or rename it to something else, after which the problem goes away.
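One plausible explanation, not confirmed anywhere in this thread, is that a local test.py shadows another importable module named test, because Python puts the script's directory first on the module search path. Here is a minimal sketch for checking a project directory for such name clashes. The helper name shadowed_modules is made up for illustration and is not part of Beam.

```python
import importlib.machinery
import os
import sys


def shadowed_modules(directory):
    """Return names of .py files in `directory` that also resolve elsewhere on sys.path."""
    directory = os.path.abspath(directory)
    # Every search-path entry except the project directory itself.
    # An empty entry ('') stands for the current working directory.
    other_paths = [
        p for p in sys.path
        if os.path.abspath(p or os.getcwd()) != directory
    ]
    shadowed = []
    for filename in sorted(os.listdir(directory)):
        name, ext = os.path.splitext(filename)
        if ext != ".py" or not name.isidentifier():
            continue
        # Built-in modules are resolved before sys.path is searched,
        # so a local file cannot shadow them.
        if name in sys.builtin_module_names:
            continue
        # If the same name is found outside the project directory,
        # the local file would win the import and hide that module.
        if importlib.machinery.PathFinder.find_spec(name, other_paths) is not None:
            shadowed.append(name)
    return shadowed


if __name__ == "__main__":
    for name in shadowed_modules("."):
        print(f"{name}.py shadows another importable module named '{name}'")
```

Run it from the project directory with the same interpreter you use for the pipeline. Any file it reports is a candidate for renaming. Whether test shows up depends on whether a module of that name is installed in your environment.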

KalanyuZ
3

For those using Google Colab: I had the same error and solved it by restarting the runtime and running the whole notebook again.


My install commands are:

!pip install apache-beam[gcp]
!pip install apache-beam[interactive]
!pip install google-apitools
lloiacono
0

Update:

In fact, the error is due to Python 3. Switch to Python 2.7 and the error is gone. Beam is still transitioning to Python 3, and that work is expected to be finished very soon. [I am not sure about an ETA for full support; it may be worth checking with the user group for an exact timeline.]

Original: Do the following as well in your virtualenv:

pip install -e .[gcp,test]

and maybe also do this under the apache_beam folder:

python setup.py sdist

And then try again.

Setting up the environment can be tricky even when a virtualenv is used. I sometimes find the tips on this page useful: https://cwiki.apache.org/confluence/display/BEAM/Python+Tips

Hope it helps.

Ruoyun Huang
  • I pulled the beam github project and installed from scratch using your steps with no luck. The AttributeError happens when running https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/wordcount_minimal.py even though the nosetests are passing – Justin Miller May 20 '19 at 18:25
  • strange, it works for me. During your pull-install of github project, did you quit your python3.6 env and re-enter? – Ruoyun Huang May 20 '19 at 18:31
  • yep. I pulled the project, checked out release-2.12.0, created a fresh virtual env, followed your steps, ran the example I linked above and got the same result – Justin Miller May 20 '19 at 18:44
  • I switched back and forth between release-2.12, release-2.13, and master head. Each of them works (after repeating the setup step above). My command to run is "python -m apache_beam.examples.wordcount --output=/tmp/out*". Are you using linux? – Ruoyun Huang May 20 '19 at 19:07
  • okay so `python -m apache_beam.examples.wordcount --output=/tmp/out*` works when I run it from the base apache_beam directory but when I cd to my custom project directory the same command breaks with the error. I am running Ubuntu – Justin Miller May 20 '19 at 19:21
  • I see. Now I do see the same error as yours. I suspect it is because the transition from python2.7 to python3 is not complete yet (even if not, it should be done very soon). Maybe use python2.7 instead to start? I switched to python2.7 and everything works fine. – Ruoyun Huang May 20 '19 at 19:46
  • Yes, thank you for all the help! Not the solution I was hoping for, but I will continue with python2.7 until python3 is more stable – Justin Miller May 21 '19 at 14:02
0

To fix this error, you can create a new project and then reinstall apache-beam.

Catalina Chircu