I want to read a CSV file into a list in an Apache Beam application, where each element of the list is a tuple or a list (it doesn't really matter which), so that the CSV

1,2,3
4,5,6

become

[(1,2,3), (4,5,6)]

or

[ [1,2,3], [4,5,6] ]

I tried following the instructions in "How to convert csv into a dictionary in apache beam dataflow", but when I try to use

from beam_utils.sources import CsvFileSource

I get

from beam_utils.sources import CsvFileSource
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/beam_utils/sources.py", line 9, in <module>
    from apache_beam.io import fileio
ImportError: cannot import name fileio

If I try to directly import

from apache_beam.io import fileio

I get the same error. However, I can use both of

import apache_beam.io
import beam_utils

without any issues. Does anyone have an idea of what the problem might be, or of how I could do this in a different way?

I currently have

with beam.Pipeline(options=pipeline_options) as p:
    csvfile = p | ReadFromText(known_args.input)

so if I can turn csvfile into the desired format some other way, that works well too.

digestivee

1 Answer

Just ran into this same problem a few minutes ago. The issue is that fileio is apparently no longer in apache_beam (at least it wasn't for me). It appears to have been replaced by filesystem.

Not a great solution, but in sources.py from beam_utils I replaced all instances of "fileio" with "filesystem".

So

from apache_beam.io import fileio

becomes

from apache_beam.io import filesystem
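If patching beam_utils feels too fragile, the lines that ReadFromText already produces can probably be parsed directly. What follows is only a minimal sketch, not something taken from beam_utils: the names parse_csv_line and build_pipeline are made up for illustration, and it assumes every field is an integer, as in the 1,2,3 example from the question. Note that the result is a PCollection whose elements are tuples, not an in-memory Python list.

```python
import csv


def parse_csv_line(line):
    """Parse one line of CSV text into a tuple of ints.

    Hypothetical helper; assumes every field is an integer.
    """
    # csv.reader expects an iterable of lines, so wrap the single line in a list.
    row = next(csv.reader([line]))
    return tuple(int(field) for field in row)


def build_pipeline(p, input_path):
    """Attach the read and parse steps to an existing pipeline object p.

    Hypothetical helper; Beam is imported here so that parse_csv_line
    can be tried out without Beam installed.
    """
    import apache_beam as beam
    from apache_beam.io import ReadFromText

    return (
        p
        | "ReadLines" >> ReadFromText(input_path)
        | "ParseCsv" >> beam.Map(parse_csv_line)
    )


# Quick check of the parsing step on its own, outside of any pipeline.
rows = [parse_csv_line(line) for line in ["1,2,3", "4,5,6"]]
print(rows)  # [(1, 2, 3), (4, 5, 6)]
```

Inside the with beam.Pipeline(...) block from the question, this would be called as csvfile = build_pipeline(p, known_args.input). To get lists instead of tuples, return a list from parse_csv_line.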
Ben Reid
  • I decided to not use beam_utils in the end and just found a way to input the data into a list/dict myself, but I suspected that something like what you described was the issue. Sometimes ugly fixes are the best :) – digestivee Oct 20 '17 at 09:47
  • Hi @TrotteBoman, care to share your solution with us? Been looking for this as well. :) – Ventus Oct 27 '17 at 11:01
  • I suggest these examples: https://github.com/apache/beam/tree/master/sdks/python/apache_beam/examples/complete/game @Ventus. If they don't help, tell me and I will see if I can be of assistance, but basically all I wrote is from here – digestivee Oct 31 '17 at 13:06