
I'm using spooldir as the Flume source and Kafka as the sink. Is there any way to transfer both the content and the filename to Kafka? For example, if the filename is test.txt and the content is hello world, I need the Kafka message to read hello world test.txt.

bytecode77
thomas

1 Answer


Some sources can add the name of the file as a header of the Flume event created from the input data; the spooldir source is one of them.
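As a rough sketch of what that looks like in the agent configuration: the agent name `a1`, the source name `r1` and the directory path are placeholders I made up, while `fileHeader`, `fileHeaderKey`, `basenameHeader` and `basenameHeaderKey` are the spooldir properties as I remember them from the Flume user guide, so double-check them against your Flume version.

```properties
# Hypothetical agent "a1" with one spooldir source "r1"
a1.sources = r1
a1.sources.r1.type = spooldir
a1.sources.r1.spoolDir = /path/to/spool

# Store the absolute path of the file in a header named "file"
a1.sources.r1.fileHeader = true
a1.sources.r1.fileHeaderKey = file

# Optionally store only the file name (no directory) in a header named "basename"
a1.sources.r1.basenameHeader = true
a1.sources.r1.basenameHeaderKey = basename
```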

And some sinks, such as the HDFS one, let you configure the serializer used for writing the data; for that case, I've read that a header_and_text serializer exists (never tested it). Nevertheless, the Kafka sink does not expose parameters for doing that.

So, IMHO, you have two options:

  1. Configure the spooldir source to add the above-mentioned file name header, and develop a custom interceptor that rewrites the event data using that header value. Interceptors are pieces of code that run at the output of a source, "intercepting" events and modifying them before they are put into the Flume channel.
  2. Modify the data you send to the spooldir source by adding a first line containing the file name.
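For option 1, the core of such an interceptor could look roughly like the sketch below. It is plain Java with no Flume dependency so that it runs on its own; the class name `FileNameAppender` and the method `appendFileName` are hypothetical. In a real interceptor you would implement `org.apache.flume.interceptor.Interceptor` and, inside `intercept(Event)`, pass `event.getHeaders()` and `event.getBody()` to this logic and then call `event.setBody(...)` with the result. I'm assuming the header key is `file` and that the body is UTF-8 text.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: the body-rewriting logic a custom Flume interceptor could call.
final class FileNameAppender {

    // Returns the body followed by a space and the file name taken from the given header.
    // If the header is missing, the body is returned unchanged.
    static byte[] appendFileName(Map<String, String> headers, byte[] body, String headerKey) {
        String path = headers.get(headerKey);
        if (path == null) {
            return body;
        }
        // The spooldir file header holds a full path; keep only the last path element.
        Path fileName = Paths.get(path).getFileName();
        String name = (fileName != null) ? fileName.toString() : path;
        String text = new String(body, StandardCharsets.UTF_8) + " " + name;
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulates an event produced by the spooldir source with fileHeader enabled.
        Map<String, String> headers = new HashMap<>();
        headers.put("file", "/data/spool/test.txt");
        byte[] body = "hello world".getBytes(StandardCharsets.UTF_8);

        byte[] result = appendFileName(headers, body, "file");
        System.out.println(new String(result, StandardCharsets.UTF_8)); // prints "hello world test.txt"
    }
}
```

Note that the spooldir source emits one event per line by default, so with this approach every line of the file would get the file name appended, not just the file as a whole.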
frb