I am trying to find an example of how to convert Protobuf messages to Parquet using Gobblin, but have been unable to find one.
Scenario:
- Kafka messages are in Protobuf
- Gobblin consumer: consumes the Protobuf messages from Kafka and writes them as Parquet to HDFS
Gobblin runtime does have a writer builder called:
public class ParquetDataWriterBuilder extends FsDataWriterBuilder<MessageType, Group>
but it does not seem to accept Protobuf messages as is. They first have to be converted to a Group.
I have been unable to figure out how to convert a Protobuf message to a Group.
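For concreteness, here is a rough, untested sketch of the kind of conversion I have in mind. It assumes the Apache parquet-mr artifacts (parquet-column for Group and SimpleGroup, parquet-protobuf for ProtoSchemaConverter) are on the classpath. Some Gobblin releases may build against the older pre-Apache `parquet.*` packages, in which case the imports would differ. ProtoToGroup, schemaFor and toGroup are hypothetical names of my own, not Gobblin or Parquet APIs, and map fields are not specifically handled.

```java
import com.google.protobuf.ByteString;
import com.google.protobuf.Descriptors.EnumValueDescriptor;
import com.google.protobuf.Descriptors.FieldDescriptor;
import com.google.protobuf.Message;
import java.util.List;
import java.util.Map;
import org.apache.parquet.example.data.Group;
import org.apache.parquet.example.data.simple.SimpleGroup;
import org.apache.parquet.io.api.Binary;
import org.apache.parquet.proto.ProtoSchemaConverter;
import org.apache.parquet.schema.MessageType;

// Hypothetical helper (not part of Gobblin or parquet-mr).
public final class ProtoToGroup {
  private ProtoToGroup() {}

  // Derive the Parquet schema from the generated Protobuf class.
  public static MessageType schemaFor(Class<? extends Message> protoClass) {
    return new ProtoSchemaConverter().convert(protoClass);
  }

  // Copy every set field of the message into a new Group with the given schema.
  public static Group toGroup(Message message, MessageType schema) {
    Group group = new SimpleGroup(schema);
    copyFields(message, group);
    return group;
  }

  private static void copyFields(Message message, Group group) {
    // getAllFields() returns only fields that are set; unset fields are skipped.
    for (Map.Entry<FieldDescriptor, Object> entry : message.getAllFields().entrySet()) {
      FieldDescriptor field = entry.getKey();
      if (field.isRepeated()) {
        // Assumes the converter's default mode, where a repeated Protobuf
        // field becomes a plain repeated Parquet field (no LIST wrapper).
        for (Object element : (List<?>) entry.getValue()) {
          addValue(group, field, element);
        }
      } else {
        addValue(group, field, entry.getValue());
      }
    }
  }

  private static void addValue(Group group, FieldDescriptor field, Object value) {
    String name = field.getName();
    switch (field.getJavaType()) {
      case INT:
        group.add(name, (int) value);
        break;
      case LONG:
        group.add(name, (long) value);
        break;
      case FLOAT:
        group.add(name, (float) value);
        break;
      case DOUBLE:
        group.add(name, (double) value);
        break;
      case BOOLEAN:
        group.add(name, (boolean) value);
        break;
      case STRING:
        group.add(name, (String) value);
        break;
      case BYTE_STRING:
        group.add(name, Binary.fromConstantByteArray(((ByteString) value).toByteArray()));
        break;
      case ENUM:
        // Enums are stored by name, as a binary value.
        group.add(name, ((EnumValueDescriptor) value).getName());
        break;
      case MESSAGE:
        // Nested messages become nested groups, filled recursively.
        copyFields((Message) value, group.addGroup(name));
        break;
      default:
        throw new IllegalArgumentException("Unsupported field type: " + field.getJavaType());
    }
  }
}
```

Presumably something like this would have to sit in a custom Gobblin converter ahead of the writer, with schemaFor supplying the MessageType and toGroup supplying each record, but I have not verified that against ParquetDataWriterBuilder.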
Any pointer to a working Gobblin consumer that does Protobuf-to-Parquet conversion would help.