
When deploying this step:

WriteToBigQuery(
    method='STORAGE_WRITE_API',
    table=[TABLE],
    schema=[PATH TO SCHEMA ON GCS BUCKET],
    create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
    insert_retry_strategy='RETRY_NEVER'
)

I get the following error:

  File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/collections/__init__.py", line 402, in namedtuple
    raise ValueError('Field names cannot start with an underscore: '
ValueError: Field names cannot start with an underscore: '_embedded'

BigQuery does support field names that start with an underscore. Is this a bug, and are there any workarounds?

user912830823

1 Answer


The error occurs in the Python standard library's collections module, specifically in the code that handles the creation of namedtuple.

The message "Field names cannot start with an underscore: '_embedded'" means that a namedtuple is being created with a field name that starts with an underscore ('_embedded'), which namedtuple does not allow.

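According to the Python documentation for collections.namedtuple, names starting with an underscore (_) are invalid field names.

One workaround, if you are calling namedtuple yourself, is to pass the rename=True argument, which automatically replaces invalid field names with positional names. You can also refer to this use case.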
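As a minimal sketch of what rename=True does, using only the standard library: the type name Record and the field list here are made up for illustration, and this shows namedtuple's behavior in isolation, not anything inside Apache Beam.

```python
from collections import namedtuple

# Illustrative field list; '_embedded' is the name from the traceback.
fields = ["id", "_embedded"]

# Without rename=True, namedtuple rejects the leading underscore.
try:
    namedtuple("Record", fields)
except ValueError as exc:
    error_message = str(exc)

# With rename=True, the invalid name is replaced by a positional name
# built from an underscore and the field's index.
Record = namedtuple("Record", fields, rename=True)
renamed_fields = Record._fields
```

Note that the renamed field no longer carries the original name, so this only helps when you control the namedtuple call and do not need the field to stay '_embedded'.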

If you are not creating a namedtuple yourself and the error is raised from inside Apache Beam's WriteToBigQuery transform, it likely stems from field names in your BigQuery schema or data that begin with an underscore. Renaming those fields, in both the data and the schema, so that they no longer start with an underscore is a straightforward workaround.
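The renaming workaround could be sketched roughly as follows. The helper name rename_underscore_fields and the "f" prefix are hypothetical choices for illustration, and the sketch assumes each element is a plain Python dict (possibly nested), which may not match your pipeline.

```python
def rename_underscore_fields(record, prefix="f"):
    """Return a copy of record where keys such as '_embedded' become 'f_embedded'.

    Hypothetical helper: nested dicts and lists are processed recursively.
    """
    if isinstance(record, dict):
        return {
            (prefix + key if key.startswith("_") else key):
                rename_underscore_fields(value, prefix)
            for key, value in record.items()
        }
    if isinstance(record, list):
        return [rename_underscore_fields(item, prefix) for item in record]
    return record


# Made-up example row, not taken from the question.
row = {"id": 1, "_embedded": {"_links": ["a", "b"]}}
cleaned = rename_underscore_fields(row)
```

In a pipeline, a function like this could be applied with beam.Map before the WriteToBigQuery step. The schema file would need the same renamed field names, and the prefix should be chosen so that a renamed key cannot collide with an existing one.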

The Dataflow WriteToBigQuery transform doesn't inherently prevent fields starting with underscores when using the STORAGE_WRITE_API write mode. However, there could be other issues related to BigQuery schema or data validation that might cause problems if your BigQuery table's schema doesn't match the incoming data.

Poala Astrid