I'm using spooldir as the Flume source and Kafka as the sink. Is there any way to transfer both the content and the file name to Kafka? For example, if the file name is test.txt and the content is hello world, I need the output to be hello world test.txt.
1 Answer
Some sources can add the name of the file as a header of the Flume event created from the input data; the spooldir source is one of them.
Some sinks, such as the HDFS one, let you configure the serializer used for writing the data; for that case, I've read that a header_and_text serializer exists (I've never tested it). Nevertheless, the Kafka sink does not expose parameters for doing that.
So, IMHO, you have two options:
- Configure the spooldir source to add the above-mentioned file name header, and develop a custom interceptor that rewrites the event data using that header value. Interceptors are pieces of code that run at the output of a source: they "intercept" the events and modify them before they are put into the Flume channel.
- Modify the data you send to the spooldir source by adding a first line containing the file name.
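For the first option, the body rewrite that such an interceptor would perform can be sketched as below. This is only an illustration under a few assumptions: the class and method names (`FileNameAppender`, `appendFileName`) are made up, the spooldir source is assumed to be configured with `basenameHeader = true` so the file name arrives under the `basename` header key, and the body is assumed to be UTF-8 text. In a real interceptor this logic would sit inside `Interceptor.intercept(Event)`, reading `event.getHeaders()` and `event.getBody()` and writing the result back with `event.setBody(...)`. Here it is kept free of Flume dependencies so it runs on its own.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Flume: shows the body rewrite
// a custom interceptor could apply to each event.
class FileNameAppender {

    // Returns the body with " <file name>" appended, or the body
    // unchanged when the header is missing.
    static byte[] appendFileName(byte[] body, Map<String, String> headers, String headerKey) {
        String fileName = headers.get(headerKey);
        if (fileName == null) {
            return body; // header not set: leave the event untouched
        }
        String text = new String(body, StandardCharsets.UTF_8);
        return (text + " " + fileName).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulates one event as the spooldir source would emit it,
        // assuming basenameHeader = true (header key "basename").
        Map<String, String> headers = new HashMap<>();
        headers.put("basename", "test.txt");
        byte[] body = "hello world".getBytes(StandardCharsets.UTF_8);

        byte[] out = appendFileName(body, headers, "basename");
        System.out.println(new String(out, StandardCharsets.UTF_8)); // prints "hello world test.txt"
    }
}
```

Note that with its default line-based deserializer the spooldir source emits one event per line, so with this approach the file name would be appended to every line of the file, not once per file.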
Thanks frb, I'll take option 2. – thomas Nov 16 '15 at 03:41