I am using the Hortonworks Sandbox.
Creating the topic:
./kafka-topics.sh --create --zookeeper 10.25.3.207:2181 --replication-factor 1 --partitions 1 --topic lognew
Tailing the Apache access log file and piping it into the console producer:
tail -f /var/log/httpd/access_log | ./kafka-console-producer.sh --broker-list 10.25.3.207:6667 --topic lognew
In another terminal (also in the Kafka bin directory), I start the consumer:
./kafka-console-consumer.sh --zookeeper 10.25.3.207:2181 --topic lognew --from-beginning
The Apache access logs are being sent to the Kafka topic "lognew" and show up in the consumer.
I now need to store them in HDFS.
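To make the goal concrete, the most naive approach would be to pipe the console consumer straight into an HDFS file. The following is only a minimal sketch under assumptions: the target path /tmp/lognew/access_log, the function name kafka_to_hdfs and the CONSUMER variable are hypothetical, and it assumes the hdfs command is on the PATH and that appends are allowed on the cluster. It relies on "hdfs dfs -appendToFile" reading from stdin when the source is given as "-".

```shell
#!/bin/sh
# Hypothetical sketch: copy everything from a Kafka topic into one HDFS file.
# ZK and TOPIC are taken from the commands above; HDFS_FILE is an assumed path.
ZK="10.25.3.207:2181"
TOPIC="lognew"
HDFS_FILE="/tmp/lognew/access_log"

# Path to the console consumer; defaults to the script in the current directory.
CONSUMER="${CONSUMER:-./kafka-console-consumer.sh}"

kafka_to_hdfs() {
    # The consumer prints each message on stdout; "-" tells appendToFile
    # to read its input from stdin instead of a local file.
    "$CONSUMER" --zookeeper "$ZK" --topic "$TOPIC" --from-beginning \
        | hdfs dfs -appendToFile - "$HDFS_FILE"
}
```

Calling kafka_to_hdfs from the Kafka bin directory would start the pipeline; it keeps running until the consumer is stopped. This is just an illustration of what I want to achieve, not something I expect to be robust (it does no batching, file rolling or recovery).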
Any ideas or suggestions on how to do this?
Thanks in advance.
Deepthy