
I am using this code to read data from Google Cloud Pubsub:

pubsubmessage := pubsubio.Read(s, project, *input, &pubsubio.ReadOptions{Subscription: sub.ID()})

and this code to write to my BigQuery dataset:

bigqueryio.Write(s, project, *output, pubsubmessage)

I get the following error:

panic: schema type must be struct: []uint8
unable to convert []uint8/byte to schema type: must be struct

Please help me.

I am following these examples:

https://github.com/apache/beam/blob/cea122724c5cd87a403684004452305ca64b3a68/sdks/go/examples/cookbook/max/max.go

https://github.com/apache/beam/blob/master/sdks/go/examples/streaming_wordcap/wordcap.go

Kenn Knowles
  • I think you might have a good question, but the way it is structured right now makes it hard to respond. Can you add more of your code so that the type definitions are visible? Also check out the editor to see how to make your question more readable. – Norbert Sep 14 '22 at 15:32

1 Answer


The return value of pubsubio.Read is a PCollection of Pubsub messages. To convert these to a BigQuery row, you will need to apply a DoFn that takes a Pubsub message and converts it to a BigQuery row. This will return a PCollection of BigQuery rows that you can pass to bigqueryio.Write. Something like this:

p := beam.NewPipeline()
s := p.Root()

pubsubmessages := pubsubio.Read(s, project, *input, &pubsubio.ReadOptions{Subscription: sub.ID()})

bigqueryrows := beam.ParDo(s, func(message []byte) MyRow {
        return ...
}, pubsubmessages)

bigqueryio.Write(s, project, *output, bigqueryrows)

You replace the ... with your code that converts the raw bytes of the Pubsub message to a BigQuery row. Note that bigqueryio.Write expects the element type to be a struct whose fields mirror your table's columns (that is exactly what the "schema type must be struct: []uint8" panic is telling you), so MyRow here stands for such a struct type; returning a plain string or []byte would trigger the same panic.
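For instance, here is a minimal, self-contained sketch of such a conversion function. The Row type, its field name, and the bigquery struct tag are assumptions for illustration, not from your code; adjust them to match your table's schema:

```go
package main

import "fmt"

// Row mirrors the BigQuery table schema. The field name and the
// `bigquery:"..."` tag are placeholders -- rename them to match
// your actual table columns.
type Row struct {
	Message string `bigquery:"message"`
}

// toRow converts the raw Pubsub payload bytes into a schema struct.
// bigqueryio.Write requires a PCollection of structs, which is why
// passing the raw []byte elements through panics with
// "schema type must be struct: []uint8".
func toRow(payload []byte) Row {
	return Row{Message: string(payload)}
}

func main() {
	fmt.Println(toRow([]byte("hello")).Message)
}
```

In the pipeline you would then wire it up with `bigqueryrows := beam.ParDo(s, toRow, pubsubmessages)` and pass `bigqueryrows` to `bigqueryio.Write`.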

Kenn Knowles