
I have Elasticsearch, Filebeat and Kibana running on a Windows machine. Filebeat has a proper log file and is listening to the configured path. When I look at the data in Kibana, it looks fine.

My issue is that the message field is a String.

Example of one log line:

12:58:09.9608 Trace {"message":"No more Excel rows found","level":"Trace","logType":"User","timeStamp":"2020-08-14T12:58:09.9608349+02:00","fingerprint":"226fdd2-e56a-4af4-a7ff-724a1a0fea24","windowsIdentity":"mine","machineName":"NAME-PC","processName":"name","processVersion":"1.0.0.1","jobId":"957ef018-0a14-49d2-8c95-2754479bb8dd","robotName":"NAME-PC","machineId":6,"organizationUnitId":1,"fileName":"GetTransactionData"}

So what I would like to have now is that String converted to a JSON so that it is possible to search in Kibana for example for the level field.

I already had a look at Filebeat. There I tried to enable the Logstash output, but then the data no longer arrives in Elasticsearch, and the log file is not generated in the Logstash folder either.

Then I downloaded Logstash following the install guide, but unfortunately I got this message:

C:\Users\name\Desktop\logstash-7.8.1\bin>logstash.bat
Sending Logstash logs to C:/Users/mine/Desktop/logstash-7.8.1/logs which is now configured via log4j2.properties
ERROR: Pipelines YAML file is empty. Location: C:/Users/mine/Desktop/logstash-7.8.1/config/pipelines.yml
usage:
  bin/logstash -f CONFIG_PATH [-t] [-r] [] [-w COUNT] [-l LOG]
  bin/logstash --modules MODULE_NAME [-M "MODULE_NAME.var.PLUGIN_TYPE.PLUGIN_NAME.VARIABLE_NAME=VALUE"] [-t] [-w COUNT] [-l LOG]
  bin/logstash -e CONFIG_STR [-t] [--log.level fatal|error|warn|info|debug|trace] [-w COUNT] [-l LOG]
  bin/logstash -i SHELL [--log.level fatal|error|warn|info|debug|trace]
  bin/logstash -V [--log.level fatal|error|warn|info|debug|trace]
  bin/logstash --help
[2020-08-14T15:07:51,696][ERROR][org.logstash.Logstash    ] java.lang.IllegalStateException: Logstash stopped processing because of an error: (SystemExit) exit

Edit:

I tried to use Filebeat only. Here I set:

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~
  - add_docker_metadata: ~
  - add_kubernetes_metadata: ~
  - dissect: 
      tokenizer: '"%{event_time} %{loglevel} %{json_message}"' 
      field: "message" 
      target_prefix: "dissect"
  - decode_json_fields: 
      fields: ["json_message"]

but that gave me:

dissect_parsing_error

The tip about removing the "" around the tokenizer helped. Then I got:

(screenshot: index issue)

I simply refreshed the index and the message was gone. Nice.

But the question now is: how do I filter for something in the new fields?

(screenshot: Kibana filter)
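
For reference, an example of what such a search could look like in the Kibana query bar (KQL); this assumes the decoded fields ended up at the event root, with names taken from the example log line above:

level : "Trace" and logType : "User"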

  • You should configure logstash.yml and create a pipeline; you can add the pipeline path to a pipeline .conf. I suggest you search for a sample Logstash pipeline config. – hamid bayat Aug 15 '20 at 05:43
  • Maybe you have an idea how to do that without logstash? – kwoxer Aug 18 '20 at 05:43

1 Answer


The message says your pipeline config is empty; it seems you have not configured any pipeline yet. Logstash can do the trick (JSON filter plugin), but Filebeat is sufficient here. If you don't want to introduce another service, this is the better option.
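If you do stick with Logstash, a minimal pipelines.yml sketch that would resolve the "Pipelines YAML file is empty" error could look like this; the pipeline id and the config path are placeholders to adapt:

# config/pipelines.yml - one entry per pipeline
- pipeline.id: main
  path.config: "C:/Users/mine/Desktop/logstash-7.8.1/config/pipeline.conf"  # placeholder path to your pipeline definition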

Filebeat has the decode_json_fields option to transform specific fields containing JSON strings in your event into structured objects. Here is the documentation.

For the future case where your whole event is JSON, there is the possibility of parsing in Filebeat by configuring json.message_key and the related json.* options.
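
For illustration, a hedged sketch of that input-level JSON parsing; the path and the option values are assumptions to adapt:

filebeat.inputs:
  - type: log
    paths:
      - C:\logs\app\*.json        # placeholder path
    json.message_key: message     # key that holds the log line text
    json.keys_under_root: true    # lift decoded keys to the event root
    json.add_error_key: true      # add an error key if decoding fails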

EDIT: Added a Filebeat snippet as a processors example, dissecting the log line into three fields (event_time, loglevel, json_message). Afterwards the freshly extracted field json_message, whose value is a JSON object encoded as a string, will be decoded into a JSON structure:

 ...

filebeat.inputs:
  - type: log
    paths:
      - path to your logfile

processors:
  # Split the raw line into timestamp, level and the JSON payload
  - dissect:
      tokenizer: '%{event_time} %{loglevel} %{json_message}'
      field: "message"
      target_prefix: "dissect"

  # Decode the JSON string; target "" puts the decoded keys at the event root
  - decode_json_fields:
      fields: ["dissect.json_message"]
      target: ""

  # The raw JSON string is no longer needed once it is decoded
  - drop_fields:
      fields: ["dissect.json_message"]

 ...

If you want to practice the Filebeat processors, try to set the correct event timestamp, taken from the encoded JSON and written into @timestamp, using the timestamp processor.
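
A sketch of what that could look like, assuming the decoded timeStamp field from the example event; the field name and the Go-style layout are assumptions to verify against your data:

processors:
  - timestamp:
      field: timeStamp                             # decoded from the JSON payload above (assumption)
      layouts:
        - '2006-01-02T15:04:05.999999999Z07:00'    # Go reference layout, RFC3339 with fractional seconds
      test:
        - '2020-08-14T12:58:09.9608349+02:00'      # sample value from the question's log line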

  • JSON filter sounds perfect. But I'm not sure which file I need to access to write those filter lines. Also, what happens now with Filebeat? Do I still need that tool? And what about the config; which connection needs to be in there? – kwoxer Aug 15 '20 at 19:34
  • What does your scenario look like? How many machines, and what does the planned dataflow look like? If you have several machines, deploy Filebeat on them and try to solve the parsing/mapping already there (filebeat.yml). It sounds to me like this will be the shorter way for you. But if you can't, set up a single Logstash and forward the Beats events to Logstash instead of directly to the cluster. Logstash has a logstash.yml and a pipelines.yml. If you need multiple pipelines, use pipelines.yml; for a single one, edit logstash.yml. Set there the location(s) of the Logstash pipeline(s) as described here: https – ibexit Aug 15 '20 at 21:55
  • https://www.elastic.co/guide/en/logstash/current/dir-layout.html https://www.elastic.co/guide/en/logstash/current/multiple-pipelines.html https://www.elastic.co/guide/en/logstash/current/configuration.html. And here are some examples: https://www.elastic.co/guide/en/logstash/current/configuration.html – ibexit Aug 15 '20 at 21:57
  • My task is simple. I have a folder with some log files, and those have the kind of message from the example above. On that machine I now want to be able to search the message contents. – kwoxer Aug 16 '20 at 03:03
  • OK, try to omit Logstash in favor of Beats. Smaller and easier to set up. Follow this guide here https://www.elastic.co/guide/en/beats/filebeat/7.8/filebeat-getting-started.html – ibexit Aug 16 '20 at 06:10
  • Alright, but where is the part in Filebeat where I transform the JSON? Maybe you can give me a last hint on that. Will try tomorrow. Thanks – kwoxer Aug 16 '20 at 17:10
  • I added the relevant snippet of filebeat.yml in my answer. I'm glad to help you. Cheers! – ibexit Aug 16 '20 at 18:26
  • Does not work. `"flags": [ "dissect_parsing_error" ]`. I rechecked the `tokenizer`, but it seems correct. I added an image of how my processors look. I can't see the issue. – kwoxer Aug 17 '20 at 06:21
  • Please try to remove the double quotes from tokenizer: '%{event_time} %{loglevel} %{json_message}' – ibexit Aug 17 '20 at 06:31
  • The last question is how I can search in the new field now. It's more of a basic usage question, I think. Or maybe it's not possible to search that JSON? Do I need something else? – kwoxer Aug 17 '20 at 07:14
  • Hey Curtis, is it working? If so, feel free to mark this question as solved for future users. If you want to search, do you mean using Kibana or the query DSL? Regards! – ibexit Aug 18 '20 at 08:15
  • I'm also from Germany. We can continue in chat as well if you want. And yes, I still have trouble with the query. How do I make the query work in Kibana? – kwoxer Aug 18 '20 at 08:24
  • i've opened a room here: https://chat.stackoverflow.com/rooms/220018/room-for-ibexit-and-kwoxer – ibexit Aug 18 '20 at 09:23