Every 5 seconds I receive a log file of 10,000 rows in the following format:

log_datetime1 host_name1 log_message1
log_datetime2 host_name2 log_message2
log_datetime3 host_name3 log_message3

I want to send them to a Kudu or Parquet table as the following JSON:

{"current_datetime":"datetime", "log_data":"log_datetime1 host_name1 log_message1"}
{"current_datetime":"datetime", "log_data":"log_datetime2 host_name2 log_message2"}
{"current_datetime":"datetime", "log_data":"log_datetime3 host_name3 log_message3"}

Currently I'm using two ReplaceText processors: the first adds {"current_datetime":"datetime", "log_data":" at the beginning of each line of the 10,000-row log file, and the second adds "} at the end of each line.

I was wondering whether I could do both steps in a single ReplaceText processor.

mongotop

1 Answer

Using the search pattern (.+)(?=\n) and the replacement pattern {"current_datetime":"datetime", "log_data":"$1"} will result in the desired output. The search pattern matches the text of each line that is followed by a newline (the lookahead leaves the newline itself in place), and the replacement wraps the capture group $1 in the templated JSON structure. Note that a final line with no trailing newline will not match this pattern.
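To check what the pattern does before putting it into the processor, here is a minimal Python sketch. It is only an approximation: it assumes Python's re module treats this simple pattern the same way as the regex engine behind ReplaceText, and wrap_lines and the sample text are hypothetical names used for illustration. Python writes the capture group as \g<1> where the replacement above uses $1.

```python
import re

# Same search pattern as above: one or more non-newline characters,
# followed (via lookahead) by a newline that is not consumed.
PATTERN = re.compile(r"(.+)(?=\n)")

# Python spells the first capture group \g<1>; the replacement above uses $1.
REPLACEMENT = r'{"current_datetime":"datetime", "log_data":"\g<1>"}'


def wrap_lines(text: str) -> str:
    """Hypothetical helper: wrap every newline-terminated line in the JSON template."""
    return PATTERN.sub(REPLACEMENT, text)


if __name__ == "__main__":
    sample = (
        "log_datetime1 host_name1 log_message1\n"
        "log_datetime2 host_name2 log_message2\n"
        "log_datetime3 host_name3 log_message3\n"
    )
    print(wrap_lines(sample), end="")
```

Because the line text is inserted as-is, a log line containing double quotes or backslashes would produce invalid JSON unless it is escaped first.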

Andy