I am trying to find an example of how to convert Protobuf messages to Parquet using Gobblin, but I have not been able to find one.

Scenario:
- Kafka messages are in Protobuf
- Gobblin consumer: consumes the Protobuf messages from Kafka and writes them to HDFS as Parquet

Gobblin's gobblin-parquet module does have a writer builder:

public class ParquetDataWriterBuilder extends FsDataWriterBuilder<MessageType, Group> 

https://github.com/apache/incubator-gobblin/blob/master/gobblin-modules/gobblin-parquet/src/main/java/org/apache/gobblin/writer/ParquetDataWriterBuilder.java

However, it does not seem to accept Protobuf messages as-is; they first have to be converted to a Group.

I have not been able to figure out how to convert a Protobuf message to a Group.

Any pointer to a working Gobblin consumer that converts Protobuf to Parquet would help.

alex
Pritam

0 Answers