I have a stream that watches the output of multiple files in a directory, processes the data, and writes it to HDFS. Here is my stream create command:
stream create --name fileHdfs --definition "file --dir=/var/log/supervisor/ --pattern=tracker.out-*.log --outputType=text/plain | logHdfsTransformer | hdfs --fsUri=hdfs://192.168.1.115:8020 --directory=/data/log/appsync --fileName=log --partitionPath=path(dateFormat('yyyy/MM/dd'))" --deploy
The problem is that the source:file module sends all the data read from a file to the log-processing module at once, instead of one line at a time. Because of that, the payload string has millions of characters and I can't process it. For example:
--- PAYLOAD LENGTH---- 9511284
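To make clear what I mean by "line by line", here is a minimal sketch in plain Java of the per-line handling I am after. It assumes the payload arrives as a single String; the class and method names (PayloadLineSplitter, splitLines) are hypothetical and are not part of Spring XD or of the logHdfsTransformer module.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not a Spring XD API).
public class PayloadLineSplitter {

    // Splits one large text payload into its non-empty lines.
    // readLine() treats \n, \r and \r\n as line terminators.
    public static List<String> splitLines(String payload) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(payload))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.isEmpty()) {
                    lines.add(line);
                }
            }
        } catch (IOException e) {
            // Not expected for an in-memory reader, but must be handled.
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Stand-in for the huge payload received from the file source.
        String payload = "first log line\nsecond log line\r\nthird log line\n";
        for (String line : splitLines(payload)) {
            System.out.println(line.length() + " chars: " + line);
        }
    }
}
```

Splitting inside the processor like this would still require the whole file to be held in memory first, which is why I would prefer the source itself to emit one line per message.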
Please tell me how to read a file line by line when using the source:file module. Thanks!