I'm trying to insert a non-repeated record into BigQuery, but I keep receiving the error "Array specified for non-repeated field: record."

My question is: How can I insert non-repeated records into BigQuery using the bigrquery library?

If I have the following schema:

bqSchema <- bq_fields(list(
  bq_field(name = "record", type = "RECORD", fields = list(
    bq_field(name = "a", type = "INTEGER"),
    bq_field(name = "b", type = "STRING")
  ))
))

And this data frame:

df <- tibble(
  record = list(
    a = 1,
    b = "B"
  )
)

Inserting the data as below causes the error in BigQuery:

bq_perform_upload(bqTableObj, df, fields = bqSchema)
# Array specified for non-repeated field: record

I think this is partly because bigrquery converts the data frame to JSON with jsonlite::stream_out() but doesn't pass auto_unbox = TRUE, so length-one values are written as arrays rather than as scalars or objects. As a result, the following newline-delimited JSON is sent to BigQuery:

{"record": [1]}
{"record": ["B"]}
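
The serialization can be reproduced locally without touching BigQuery. The following is a minimal sketch, assuming bigrquery hands the data frame to jsonlite::stream_out() as described above; the name list_df is only illustrative.

```r
library(tibble)
library(jsonlite)

# Same shape as the data frame in the question: a list column with two
# elements, which tibble treats as two rows.
list_df <- tibble(
  record = list(
    a = 1,
    b = "B"
  )
)

nrow(list_df)  # 2 rows, not 1

# Write newline-delimited JSON to the console. Each list element is
# serialized as a JSON array because length-one vectors are not unboxed.
stream_out(list_df, verbose = FALSE)
```

The row count suggests that the list column itself, and not only the missing auto_unbox, contributes to the array output: each list element becomes its own row.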

I believe the correct NDJSON to send to BigQuery is:

{"record": {"a": 1, "b": "B"}}

Has anyone run into this problem before, or does anyone have ideas on how I can resolve it?

Rhys Jackson

1 Answer

Try setting mode = "REPEATED" on the record field:

bqSchema <- bq_fields(list(
  bq_field(name = "record", type = "RECORD", mode = "REPEATED", fields = list(
    bq_field(name = "a", type = "INTEGER"),
    bq_field(name = "b", type = "STRING")
  ))
))
  • Thanks for your suggestion. However, the format of the data I would like to load is not a repeated field of records. It feels like this answer hides the error rather than solves the problem. – Rhys Jackson Jun 05 '20 at 12:17
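
For a non-repeated RECORD, one shape that may be worth trying is a data-frame column instead of a list column, since jsonlite serializes a nested data frame as one JSON object per row. This is only a sketch that checks the JSON locally. Whether bq_perform_upload() then accepts the data is an assumption that has not been verified here, and nested_df is an illustrative name.

```r
library(tibble)
library(jsonlite)

# One row whose `record` column is itself a one-row data frame,
# mirroring the RECORD field with sub-fields `a` and `b`.
nested_df <- tibble(
  record = tibble(
    a = 1L,
    b = "B"
  )
)

nrow(nested_df)  # 1 row

# Write newline-delimited JSON to the console: one object per row,
# with `record` as a nested object rather than an array.
stream_out(nested_df, verbose = FALSE)
```

With this shape, the schema from the question (without mode = "REPEATED") describes the data as written. Passing nested_df to bq_perform_upload() is the step that remains untested.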