
For compliance purposes, I have to save all raw Kafka documents and keep them for one year. To do this, I use the following command:

kafka-console-consumer.sh --bootstrap-server kafka1:9092,kafka2:9092,kafka3:9092 --topic test1 --consumer.config /usr/local/kafka_2.12-2.2.1/config/consumer.properties >> /data/backup.txt

The saved documents contain escape characters (a backslash before each double quote inside the message field), like this:

{"type":"Fortigate","@timestamp":"2021-09-06T09:20:38.909Z","message":"<189>date=2021-09-06 time=11:20:38 devname=\"FW_N\" devid=\"FG1123\" logid=\"0000000013\" type=\"traffic\" subtype=\"forward\" level=\"notice\" vd=\"FW-D\" eventtime=1630920038 srcip=1.2.3.4 srcport=59349 srcintf=\"LAG11.2418\" srcintfrole=\"undefined\" dstip=1.2.3.4 dstport=8531 dstintf=\"LAG11.1470\" dstintfrole=\"undefined\" poluuid=\"8f80acf4-db3f-51eb-369f-97e43f065438\" sessionid=3122225826 proto=6 action=\"deny\" policyid=1234 policytype=\"policy\" service=\"gTCP/8531\" dstcountry=\"Reserved\" srccountry=\"Reserved\" trandisp=\"noop\" duration=0 sentbyte=0 rcvdbyte=0 sentpkt=0 appcat=\"unscanned\" crscore=30 craction=131072 crlevel=\"high\"","@version":"1","host":"5.6.7.8"}

Expected result:

{"type":"Fortigate","@timestamp":"2021-09-06T09:20:38.909Z","message":"<189>date=2021-09-06 time=11:20:38 devname="FW_N" devid="FG1123" logid="0000000013" type="traffic" subtype="forward" level="notice" vd="FW-D" eventtime=1630920038 srcip=1.2.3.4 srcport=59349 srcintf="LAG11.2418" srcintfrole="undefined" dstip=1.2.3.4 dstport=8531 dstintf="LAG11.1470" dstintfrole="undefined" poluuid="8f80acf4-db3f-51eb-369f-97e43f065438" sessionid=3122225826 proto=6 action="deny" policyid=1234 policytype="policy" service="gTCP/8531" dstcountry="Reserved" srccountry="Reserved" trandisp="noop" duration=0 sentbyte=0 rcvdbyte=0 sentpkt=0 appcat="unscanned" crscore=30 craction=131072 crlevel="high"","@version":"1","host":"5.6.7.8"}

Is there any way to tell the Kafka console consumer not to add these escape characters?

  • That looks like a JSON rendering of your document. The backslashes are supposed to be there. They tell which quotation marks are part of your data and which are not. Your JSON parser will deal with these. You do not need to do anything. – Michael Hampton Sep 06 '21 at 11:33
  • Yes indeed....Thanks Michael – Atreiide Sep 06 '21 at 14:35
  • This seems off-topic for ServerFault. In the future, StackOverflow would be better – OneCricketeer Nov 01 '21 at 21:03
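
To illustrate the point made in the first comment, here is a minimal sketch, assuming the backup file is later read with Python's standard json module. The record is a shortened version of the one in the question, and the helper name is_valid_json is hypothetical, introduced only for this example. It shows that a JSON parser removes the backslashes on its own, and that the "expected result" without backslashes would no longer be parseable.

```python
import json

# One line as written to the backup file (shortened from the example in the question).
# The raw string keeps each backslash exactly as it appears on disk.
raw_line = r'{"type":"Fortigate","message":"<189>date=2021-09-06 devname=\"FW_N\" action=\"deny\"","@version":"1","host":"5.6.7.8"}'


def is_valid_json(text):
    """Hypothetical helper: return True if text parses as JSON, else False."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


# Parsing the saved line yields the message with plain double quotes.
record = json.loads(raw_line)
message = record["message"]
print(message)

# Stripping the backslashes by hand produces the "expected result" from the
# question, which is no longer valid JSON because the inner quotes end the string early.
stripped_line = raw_line.replace('\\"', '"')
print(is_valid_json(raw_line))
print(is_valid_json(stripped_line))
```

In other words, the file can be kept exactly as the console consumer writes it; removing the escapes would only make the stored records harder to read back.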

0 Answers