While using Apache Beam I/O to preprocess data, I wanted to use the snappy library for compression, but the file transformation does not work because it cannot find the _crc32c function in the library. I'm using snappy version 0.5.2.

The error looks like this:

INFO:tensorflow:Saver not created because there are no variables in the graph to restore
ERROR:root:Exception at bundle <apache_beam.runners.direct.bundle_factory._Bundle object at 0x7f1dd1d60e50>, due to an exception.
 Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/direct/executor.py", line 312, in call
    side_input_values)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/direct/executor.py", line 347, in attempt_call
    evaluator.process_element(value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/direct/transform_evaluator.py", line 551, in process_element
    self.runner.process(element)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 390, in process
    self._reraise_augmented(exn)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 388, in process
    self.do_fn_invoker.invoke_process(windowed_value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 281, in invoke_process
    self._invoke_per_window(windowed_value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/runners/common.py", line 307, in _invoke_per_window
    windowed_value, self.process_method(*args_for_process))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/typehints/typecheck.py", line 63, in process
    return self.wrapper(self.dofn.process, args, kwargs)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/typehints/typecheck.py", line 81, in wrapper
    result = method(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/iobase.py", line 965, in process
    self.writer.write(element)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filebasedsink.py", line 299, in write
    self.sink.write_record(self.temp_handle, value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/filebasedsink.py", line 129, in write_record
    self.write_encoded_record(file_handle, self.coder.encode(value))
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 235, in write_encoded_record
    _TFRecordUtil.write_record(file_handle, value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 97, in write_record
    struct.pack('<I', cls._masked_crc32c(encoded_length)),  #
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 77, in _masked_crc32c
    crc = crc32c_fn(value)
  File "/usr/local/lib/python2.7/dist-packages/apache_beam/io/tfrecordio.py", line 43, in _default_crc32c_fn
    _default_crc32c_fn.fn = snappy._crc32c  # pylint: disable=protected-access
AttributeError: 'module' object has no attribute '_crc32c' [while running 'WriteTrainData/Write/WriteImpl/WriteBundles']

Could anyone help me use snappy with TensorFlow correctly? Thank you.

1 Answer

I just hit this issue; I think it is due to Beam being a little careless about versions of optional test-dependencies (in this case, tensorflow and python-snappy).

The problematic code:

import snappy
snappy._crc32c

works in python-snappy version 0.5.1 but not in 0.5.2 (the latest version).
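A quick way to check which situation you are in is to probe the installed module for the private attribute before running the pipeline. This is only a diagnostic sketch; the helper name find_crc32c is made up for illustration and is not part of Beam or python-snappy.

```python
import importlib


def find_crc32c(module_name="snappy", attr="_crc32c"):
    """Return the module's private CRC32C helper, or None if it is unavailable.

    Hypothetical helper for diagnosis only: returns None both when the module
    cannot be imported and when it lacks the attribute.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, attr, None)


# False here means Beam's snappy._crc32c lookup would fail in the same way
# as in the traceback from the question.
print("snappy._crc32c available: %s" % (find_crc32c() is not None))
```

If this reports False with python-snappy installed, the installed version most likely no longer exposes the attribute.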

I got these Beam tests passing by installing python-snappy 0.5.1 via:

pip install \
  --upgrade --ignore-installed \
  python-snappy==0.5.1 \
  --global-option=build_ext \
  --global-option="-I/usr/local/include" \
  --global-option="-L/usr/local/lib"

On OSX I need the three --global-option flags; otherwise the build doesn't find my snappy headers (symptom: errors about #include <snappy-c.h>) or library files, which brew install snappy placed in /usr/local/include and /usr/local/lib, respectively.

The bits before that seem necessary to override pip's default of wanting to give me the latest version.
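If pinning the version is not an option, the checksum that the traceback's _masked_crc32c is after can also be computed without snappy. Below is a minimal pure-Python sketch, assuming the standard CRC-32C (Castagnoli) polynomial and the mask constant used by the TFRecord format. The function names are my own, this is not Beam's code, and it will be much slower than a native implementation.

```python
# Reflected form of the CRC-32C (Castagnoli) polynomial.
_POLY = 0x82F63B78


def _build_table():
    """Precompute the 256-entry lookup table for byte-at-a-time CRC-32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c(data):
    """Compute the CRC-32C checksum of a bytes-like object."""
    crc = 0xFFFFFFFF
    for byte in bytearray(data):  # bytearray yields ints on Python 2 and 3
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def masked_crc32c(data):
    """Apply the TFRecord-style mask: rotate right by 15 bits, add a constant."""
    crc = crc32c(data)
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF


# Standard CRC-32C check value for the ASCII string "123456789".
assert crc32c(b"123456789") == 0xE3069283
```

Wiring this into Beam would mean patching its CRC lookup, which I have not tried; installing 0.5.1 as above is the route I actually verified.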

Ryan Williams