I'm using the BigQuery API for Python to insert data in a table with table.insert_data()
.
Sometimes a row will contain a None
value for a field that is set to REPEATED
mode in the table schema. When this happens, the following error is returned by the API:
[{'index': 48,
'errors': [
{u'debugInfo': u'generic::invalid_argument: Field value cannot be empty.',
u'reason': u'invalid',
u'message': u'Field value cannot be empty.',
u'location': u'name_of_my_field'}]}]
I would like these rows to be ignored. This is the prototype of the function i am using, that can also be found here:
insert_data(rows, row_ids=None, skip_invalid_rows=None, ignore_unknown_values=None, template_suffix=None, client=None)
And this is the parameter I am interested in because it seems to perfectly fit my needs:
skip_invalid_rows (bool) – (Optional) skip rows w/ invalid data?
It mirrors this parameter in the main API:
skipInvalidRows boolean [Optional] Insert all valid rows of a request, even if invalid rows exist. The default value is false, which causes the entire request to fail if any invalid rows exist.
However by using the function with skip_invalid_rows=True
, the same errors are thrown.
Does this parameter mean what I think it means?
Worth pointing out:
- why is there a question mark in the doc at the end of the description of
skip_invalid_rows
in the doc? ignore_unknown_values=True
works fine for its own purpose.
Any help much appreciated. :)