
I'm using the BigQuery Java library to send data to a BigQuery table. I have a list of rows to send, and I use the bigQuery.insertAll() method to send them all at once. To get rid of duplicates with the same timestamp, I set the insertId of each row to its timestamp. Now, when 2 rows with the same timestamp are added, they collapse into one row, but the values are not merged: only 1 value is kept and the other columns are null. (https://i.stack.imgur.com/MGmGn.png) I want all values that share a timestamp to be merged into 1 row. I searched for a solution and found the "WRITE_APPEND" option, which could possibly help, but I don't know how to use it with InsertAllRequest and the bigQuery.insertAll() method.

Here is the code that uses the timestamp as the insertId (createRowsToInsert is just a method that builds the rows from a Map of field-value data).

List<RowToInsert> rowsToInsert = infoTableHelper.createRowsToInsert(inputInfoTable, fieldsNames);
List<RowToInsert> rowsToInsertWithId = new ArrayList<>();
for (RowToInsert row : rowsToInsert) {
    String insertId = null;

    // Use the row's timestamp (if present) as its insertId.
    if (row.getContent().containsKey("timestamp")) {
        insertId = row.getContent().get("timestamp").toString();
    }

    // Rows without a timestamp are skipped.
    if (insertId != null) {
        InsertAllRequest.RowToInsert rowWithId = InsertAllRequest.RowToInsert.of(insertId, row.getContent());
        rowsToInsertWithId.add(rowWithId);
    }
}

Here I'm using bigQuery.insertAll() to send all the rows, with their insertIds, to BigQuery at once.

InsertAllResponse response = bigQuery.insertAll(InsertAllRequest.newBuilder(tableId)
        .setRows(rowsToInsertWithId)
        .build());

I tried to use JobConfigurationLoad, which has a .setWriteDisposition("WRITE_APPEND") option, but I couldn't find a way to connect it with the insertAll() method.
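
To make the result I'm after concrete, here is a minimal sketch of the merge done on plain maps before anything is sent to BigQuery. It is only an illustration under assumptions: the class name TimestampRowMerger and the method mergeByTimestamp are hypothetical, the rows are assumed to be Map<String, Object> contents like the ones returned by row.getContent(), and a later non-null value for the same column is assumed to overwrite an earlier one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the BigQuery library.
final class TimestampRowMerger {

    // Merges rows that share the same "timestamp" value into a single row.
    // Non-null values from later rows fill in (or overwrite) columns of the
    // merged row. Rows without a timestamp are skipped, as in the loop above.
    static List<Map<String, Object>> mergeByTimestamp(List<Map<String, Object>> rows) {
        // LinkedHashMap keeps the order in which timestamps were first seen.
        Map<String, Map<String, Object>> mergedByTimestamp = new LinkedHashMap<>();

        for (Map<String, Object> row : rows) {
            Object timestamp = row.get("timestamp");
            if (timestamp == null) {
                continue;
            }
            Map<String, Object> merged = mergedByTimestamp.computeIfAbsent(
                    timestamp.toString(), key -> new LinkedHashMap<>());

            for (Map.Entry<String, Object> entry : row.entrySet()) {
                // Null columns never erase a value that is already present.
                if (entry.getValue() != null) {
                    merged.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return new ArrayList<>(mergedByTimestamp.values());
    }
}
```

With something like this, each merged map could be passed to InsertAllRequest.RowToInsert.of(insertId, content) in place of row.getContent(), so that only one row per timestamp is sent in the first place. I don't know whether this is the recommended approach, or whether the same thing can be achieved on the BigQuery side.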
