The Firebase extension "Stream Collections to BigQuery" lets you configure a Transform Function that converts the Firestore data JSON into explicit BigQuery table fields. https://firebase.google.com/products/extensions/firebase-firestore-bigquery-export

Can anyone point me to an example function or detailed docs for such functions?

Thanks, Ben

Renaud Tarnec
Benjamin

1 Answer

The Transform Function should be an HTTP Cloud Function that reads the input object from the request, transforms it, and sends it back in the response, as shown in the Cloud Function skeleton below:

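const functions = require("firebase-functions");

exports.bqTransform = functions.https.onRequest(async (req, res) => {

    // The JSON request body is already parsed into a JS object
    const inputPayload = req.body;
    // ...
    // Transform the object
    // ...
    const outputPayload = { /* ... */ }; // JS object with the same structure as inputPayload

    // Send the transformed object back in the response
    res.send(outputPayload);
});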

As explained in the doc, the inputPayload object (i.e. req.body) contains a data property, an array that holds a representation of the Firestore document, as shown below:

{ 
  data: [{
    insertId: int;
    json: {
      timestamp: int;
      event_id: int;
      document_name: string;
      document_id: int;
      operation: ChangeType;
      data: string;  // <= String containing the stringified object representing the Firestore document data
    },
  }]
}
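
Because json.data arrives as a string, it has to be parsed before its fields can be read. Here is a minimal sketch, assuming the payload has the shape shown above. readDocumentData is a hypothetical helper name, and all sample values (including the "CREATE" operation and the document path) are made up for illustration:

```javascript
// Hypothetical helper: parse the stringified Firestore document of one row.
function readDocumentData(row) {
    // row.json.data is a JSON string, not an object
    return JSON.parse(row.json.data);
}

// Made-up sample payload following the structure described above
const samplePayload = {
    data: [{
        insertId: 1,
        json: {
            timestamp: 1664983515,
            event_id: 1,
            document_name: "projects/demo/databases/(default)/documents/users/abc",
            document_id: "abc",
            operation: "CREATE",
            data: JSON.stringify({ name: "Original Name" }),
        },
    }],
};

const doc = readDocumentData(samplePayload.data[0]);
console.log(doc.name); // prints "Original Name"
```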

The transformation implemented in your code must return an object with the same structure (outputPayload in the skeleton above), in which the data[0].json property is adapted according to your transformation requirements.
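
One way to keep the structure intact is to copy each row and replace only json.data. The sketch below is an illustration under assumptions, not the extension's own code: transformPayload and upperCaseName are hypothetical names, the sample input is made up, and mapping over every element of data (rather than only data[0]) is a choice made here because data is an array.

```javascript
// Hypothetical helper: apply transformDoc to the document of each row,
// leaving every other field of the row unchanged.
function transformPayload(inputPayload, transformDoc) {
    const rows = inputPayload.data.map((row) => ({
        ...row,
        json: {
            ...row.json,
            // parse the stringified document, transform it, stringify it again
            data: JSON.stringify(transformDoc(JSON.parse(row.json.data))),
        },
    }));
    return { data: rows };
}

// Example transformation (assumption): upper-case a "name" field
const upperCaseName = (doc) => ({ ...doc, name: String(doc.name).toUpperCase() });

// Made-up input with the same shape as the payload described above
const input = {
    data: [{
        insertId: 1,
        json: {
            timestamp: 1664983515,
            event_id: 1,
            document_name: "users/abc",
            document_id: "abc",
            operation: "CREATE",
            data: JSON.stringify({ name: "ben" }),
        },
    }],
};

const output = transformPayload(input, upperCaseName);
console.log(output.data[0].json.data); // prints {"name":"BEN"}
```

Inside the Cloud Function, the handler body could then be reduced to res.send(transformPayload(req.body, upperCaseName)).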


Here is a very simple example in which we completely replace the content of the Firestore record with some static data.

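const functions = require("firebase-functions");

exports.bqTransform = functions.https.onRequest(async (req, res) => {

    const inputPayload = req.body;
    // First (and, in this example, only) element of the data array
    const inputData = inputPayload.data[0];

    // Keep all metadata fields as they are and replace only json.data
    const outputPayload = [{
        insertId: inputData.insertId,
        json: {
            timestamp: inputData.json.timestamp,
            event_id: inputData.json.event_id,
            document_name: inputData.json.document_name,
            document_id: inputData.json.document_id,
            operation: inputData.json.operation,
            // Static replacement document, stringified like the original data field
            data: JSON.stringify({ createdOn: { _seconds: 1664983515, _nanoseconds: 745000000 }, array: ["a1", "a2"], name: "Transformed Name" })
        },
    }];

    // Wrap the array in an object with a data property, mirroring the input
    res.send({ data: outputPayload });
});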
Renaud Tarnec
  • Thanks a lot @Renaud Tarnec. That did the job! – Benjamin Oct 09 '22 at 04:28
  • Hi, as a follow-up question: is it recommended to use a Transform Function to extract Firestore data from the JSON and create new fields in the dataset? When I try to, I see log messages such as "Retried to insert 1 row(s) of data into BigQuery (ignoring unknown columns)". Is it better to extract the data from the JSON at query time, or how can the unknown column issue be solved? – Benjamin Oct 09 '22 at 11:17