
I'm fairly new to the Elastic Stack. I'm using Logstash 6.4.0 to load JSON log data from Filebeat 6.4.0 into Elasticsearch 6.4.0. I'm finding that far too many JSON properties are being converted into fields once I start using Kibana 6.4.0.

I know this because when I navigate to Kibana Discover and select my logstash-* index pattern, I get an error message that states:

Discover: Trying to retrieve too many docvalue_fields. Must be less than or equal to: [100] but was [106]. This limit can be set by changing the [index.max_docvalue_fields_search] index level setting.
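
For reference, that limit is a dynamic index setting, so it can be raised with a request like the one below (the value 200 is just an illustration). But raising it only treats the symptom; my real problem is the number of fields being created.

PUT logstash-*/_settings
{
  "index.max_docvalue_fields_search": 200
}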

If I navigate to Management > Kibana > Index Patterns I see that I have 940 fields. It appears that each child property of my root JSON object (and many of those child properties have JSON objects as values, and so on) is automatically being parsed and used to create fields in my Elasticsearch logstash-* index.

So here's my question: how can I limit this automatic field creation? Is it possible to limit it by property depth, or in some other way?

Here is my Filebeat configuration (minus the comments):

filebeat.inputs:
- type: log
  enabled: true
  paths:
  - d:/clients/company-here/rpms/logs/rpmsdev/*.json
  json.keys_under_root: true
  json.add_error_key: true

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 3

setup.kibana:

output.logstash:
  hosts: ["localhost:5044"]

Here is my current Logstash pipeline configuration:

input {
    beats {
        port => "5044"
    }
}
filter {
    date {
        match => [ "@timestamp" , "ISO8601"]
    }
}
output {
    stdout { 
        #codec => rubydebug 
    }
    elasticsearch {
        hosts => [ "localhost:9200" ]
    }
}

Here is an example of a single log message that I am shipping (one row of my log file) – note that the JSON is completely dynamic and can change depending on what's being logged:

{
    "@timestamp": "2018-09-06T14:29:32.128",
    "level": "ERROR",
    "logger": "RPMS.WebAPI.Filters.LogExceptionAttribute",
    "message": "Log Exception: RPMS.WebAPI.Entities.LogAction",
    "eventProperties": {
        "logAction": {
            "logActionId": 26268916,
            "performedByUserId": "b36778be-6181-4b69-a0fe-e3a975ddcdd7",
            "performedByUserName": "test.sga.danny@domain.net",
            "performedByFullName": "Mike Manley",
            "controller": "RpmsToMainframeOperations",
            "action": "UpdateStoreItemPricing",
            "actionDescription": "Exception while updating store item pricing for store item with storeItemId: 146926. An error occurred while sending the request. InnerException: Unable to connect to the remote server InnerException: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond 10.1.1.133:8800",
            "url": "http://localhost:49399/api/RpmsToMainframeOperations/UpdateStoreItemPricing/146926",
            "verb": "PUT",
            "statusCode": 500,
            "status": "Internal Server Error - Exception",
            "request": {
                "itemId": 648,
                "storeId": 13,
                "storeItemId": 146926,
                "changeType": "price",
                "book": "C",
                "srpCode": "",
                "multi": 0,
                "price": "1.27",
                "percent": 40,
                "keepPercent": false,
                "keepSrp": false
            },
            "response": {
                "exception": {
                    "ClassName": "System.Net.Http.HttpRequestException",
                    "Message": "An error occurred while sending the request.",
                    "Data": null,
                    "InnerException": {
                        "ClassName": "System.Net.WebException",
                        "Message": "Unable to connect to the remote server",
                        "Data": null,
                        "InnerException": {
                            "NativeErrorCode": 10060,
                            "ClassName": "System.Net.Sockets.SocketException",
                            "Message": "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond",
                            "Data": null,
                            "InnerException": null,
                            "HelpURL": null,
                            "StackTraceString": "   at System.Net.Sockets.Socket.InternalEndConnect(IAsyncResult asyncResult)\r\n   at System.Net.Sockets.Socket.EndConnect(IAsyncResult asyncResult)\r\n   at System.Net.ServicePoint.ConnectSocketInternal(Boolean connectFailure, Socket s4, Socket s6, Socket& socket, IPAddress& address, ConnectSocketState state, IAsyncResult asyncResult, Exception& exception)",
                            "RemoteStackTraceString": null,
                            "RemoteStackIndex": 0,
                            "ExceptionMethod": "8\nInternalEndConnect\nSystem, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\nSystem.Net.Sockets.Socket\nVoid InternalEndConnect(System.IAsyncResult)",
                            "HResult": -2147467259,
                            "Source": "System",
                            "WatsonBuckets": null
                        },
                        "HelpURL": null,
                        "StackTraceString": "   at System.Net.HttpWebRequest.EndGetRequestStream(IAsyncResult asyncResult, TransportContext& context)\r\n   at System.Net.Http.HttpClientHandler.GetRequestStreamCallback(IAsyncResult ar)",
                        "RemoteStackTraceString": null,
                        "RemoteStackIndex": 0,
                        "ExceptionMethod": "8\nEndGetRequestStream\nSystem, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\nSystem.Net.HttpWebRequest\nSystem.IO.Stream EndGetRequestStream(System.IAsyncResult, System.Net.TransportContext ByRef)",
                        "HResult": -2146233079,
                        "Source": "System",
                        "WatsonBuckets": null
                    },
                    "HelpURL": null,
                    "StackTraceString": "   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n   at RPMS.WebAPI.Infrastructure.RpmsToMainframe.RpmsToMainframeOperationsManager.<PerformOperationInternalAsync>d__14.MoveNext() in D:\\Century\\Clients\\PigglyWiggly\\RPMS\\PWADC.RPMS\\RPMSDEV\\RPMS.WebAPI\\Infrastructure\\RpmsToMainframe\\RpmsToMainframeOperationsManager.cs:line 114\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n   at RPMS.WebAPI.Infrastructure.RpmsToMainframe.RpmsToMainframeOperationsManager.<PerformOperationAsync>d__13.MoveNext() in D:\\Century\\Clients\\PigglyWiggly\\RPMS\\PWADC.RPMS\\RPMSDEV\\RPMS.WebAPI\\Infrastructure\\RpmsToMainframe\\RpmsToMainframeOperationsManager.cs:line 96\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter`1.GetResult()\r\n   at RPMS.WebAPI.Controllers.RpmsToMainframe.RpmsToMainframeOperationsController.<UpdateStoreItemPricing>d__43.MoveNext() in D:\\Century\\Clients\\PigglyWiggly\\RPMS\\PWADC.RPMS\\RPMSDEV\\RPMS.WebAPI\\Controllers\\RpmsToMainframe\\RpmsToMainframeOperationsController.cs:line 537\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Threading.Tasks.TaskHelpersExtensions.<CastToObject>d__1`1.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Controllers.ApiControllerActionInvoker.<InvokeActionAsyncCore>d__1.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Filters.ActionFilterAttribute.<ExecuteActionFilterAsyncCore>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at 
System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Web.Http.Filters.ActionFilterAttribute.<CallOnActionExecutedAsync>d__6.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Filters.ActionFilterAttribute.<ExecuteActionFilterAsyncCore>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Controllers.ActionFilterResult.<ExecuteAsync>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Filters.AuthorizationFilterAttribute.<ExecuteAuthorizationFilterAsyncCore>d__3.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Controllers.AuthenticationFilterResult.<ExecuteAsync>d__5.MoveNext()\r\n--- End of stack trace from previous location where exception was thrown ---\r\n   at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)\r\n   at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)\r\n   at System.Web.Http.Controllers.ExceptionFilterResult.<ExecuteAsync>d__6.MoveNext()",
                    "RemoteStackTraceString": null,
                    "RemoteStackIndex": 0,
                    "ExceptionMethod": "8\nThrowForNonSuccess\nmscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089\nSystem.Runtime.CompilerServices.TaskAwaiter\nVoid ThrowForNonSuccess(System.Threading.Tasks.Task)",
                    "HResult": -2146233088,
                    "Source": "mscorlib",
                    "WatsonBuckets": null,
                    "SafeSerializationManager": {
                        "m_serializedStates": [{

                        }]
                    },
                    "CLR_SafeSerializationManager_RealType": "System.Net.Http.HttpRequestException, System.Net.Http, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a"
                }
            },
            "performedAt": "2018-09-06T14:29:32.1195316-05:00"
        }
    },
    "logAction": "RPMS.WebAPI.Entities.LogAction"
}

2 Answers


I ultimately never found a way to limit the depth of the automatic field creation. I also posted my question in the Elastic forums and never got an answer. Between then and now, however, I have learned a lot more about Logstash.

My ultimate solution was to extract the JSON properties I needed into named fields and then use the GREEDYDATA pattern in a grok filter to place the rest of the properties into an unextractedJson field, so that I could still query for values within that field in Elasticsearch.

Here is my new Filebeat configuration (minus the comments):

filebeat.inputs:
- type: log
  enabled: true
  paths:
  - d:/clients/company-here/rpms/logs/rpmsdev/*.json
  #json.keys_under_root: true
  json.add_error_key: true

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 3

setup.kibana:

output.logstash:
  hosts: ["localhost:5044"]

Note that I commented out the json.keys_under_root setting. Without it, Filebeat places the parsed JSON log entry under a json field in the event it sends on to Logstash, rather than promoting each property to the event root.
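
To illustrate (values abbreviated, and the exact envelope fields depend on the Filebeat setup), an event arriving at Logstash now looks roughly like this, with the log entry nested under json instead of spread across the event root:

{
    "@timestamp": "2018-09-13T18:36:45.376Z",
    "source": "d:/clients/company-here/rpms/logs/rpmsdev/actionsCurrent.json",
    "json": {
        "time": "2018-09-13T13:36:45.376",
        "level": "DEBUG",
        "logger": "RPMS.WebAPI.Filters.LogActionAttribute",
        "eventProperties": { ... }
    }
}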

Here is a snippet of my new Logstash pipeline configuration:

#...

filter {

    ###########################################################################
    # common date time extraction
    date {
        match => ["[json][time]", "ISO8601"]
        remove_field => ["[json][time]"]
    }

    ###########################################################################
    # configuration for the actions log
    if [source] =~ /actionsCurrent.json/ {

        if ("" in [json][eventProperties][logAction][performedByUserName]) {
            mutate {
                add_field => {
                    "performedByUserName" => "%{[json][eventProperties][logAction][performedByUserName]}"
                    "performedByFullName" => "%{[json][eventProperties][logAction][performedByFullName]}"
                }
                remove_field => [
                    "[json][eventProperties][logAction][performedByUserName]", 
                    "[json][eventProperties][logAction][performedByFullName]"]
            }
        }

        mutate {
            add_field => {
                "logFile" => "actions"
                "logger" => "%{[json][logger]}"
                "level" => "%{[json][level]}"
                "performedAt" => "%{[json][eventProperties][logAction][performedAt]}"
                "verb" => "%{[json][eventProperties][logAction][verb]}"
                "url" => "%{[json][eventProperties][logAction][url]}"
                "controller" => "%{[json][eventProperties][logAction][controller]}"
                "action" => "%{[json][eventProperties][logAction][action]}"
                "actionDescription" => "%{[json][eventProperties][logAction][actionDescription]}"
                "statusCode" => "%{[json][eventProperties][logAction][statusCode]}"
                "status" => "%{[json][eventProperties][logAction][status]}"
            }
            remove_field => [
                "[json][logger]",
                "[json][level]",
                "[json][eventProperties][logAction][performedAt]",
                "[json][eventProperties][logAction][verb]",
                "[json][eventProperties][logAction][url]",
                "[json][eventProperties][logAction][controller]",
                "[json][eventProperties][logAction][action]",
                "[json][eventProperties][logAction][actionDescription]",
                "[json][eventProperties][logAction][statusCode]",
                "[json][eventProperties][logAction][status]",
                "[json][logAction]",
                "[json][message]"
            ]
        }

        mutate {
            convert => {
                "statusCode" => "integer"
            }
        }

        grok {
            match => { "json" => "%{GREEDYDATA:unextractedJson}" }
            remove_field => ["json"]
        }

    }

# ...

Note the add_field options in the mutate filters that extract the properties into named fields, followed by the remove_field options that remove those properties from the JSON. At the end of the filter snippet, notice the grok filter that gobbles up the rest of the JSON and places it in the unextractedJson field. Finally, and most importantly, I remove the json field that Filebeat provided. That last step saves me from exposing all of that JSON data to Elasticsearch/Kibana as individual fields.
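
As an aside, each add_field/remove_field pair could also be collapsed using mutate's rename option, which moves a field in a single step. A sketch for two of the fields above; the behavior is equivalent, except that a missing source field is simply skipped rather than producing a literal %{...} placeholder in the target:

filter {
    mutate {
        rename => {
            "[json][level]" => "level"
            "[json][eventProperties][logAction][verb]" => "verb"
        }
    }
}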

This solution takes log entries that look like this:

{ "time": "2018-09-13T13:36:45.376", "level": "DEBUG", "logger": "RPMS.WebAPI.Filters.LogActionAttribute", "message": "Log Action: RPMS.WebAPI.Entities.LogAction", "eventProperties": {"logAction": {"logActionId":26270372,"performedByUserId":"83fa1d72-fac2-4184-867e-8c2935a262e6","performedByUserName":"rpmsadmin@domain.net","performedByFullName":"Super Admin","clientIpAddress":"::1","controller":"Account","action":"Logout","actionDescription":"Logout.","url":"http://localhost:49399/api/Account/Logout","verb":"POST","statusCode":200,"status":"OK","request":null,"response":null,"performedAt":"2018-09-13T13:36:45.3707739-05:00"}}, "logAction": "RPMS.WebAPI.Entities.LogAction" }

And turns them into Elasticsearch documents that look like this:

{
  "_index": "actions-2018.09.13",
  "_type": "doc",
  "_id": "xvA41GUBIzzhuC5epTZG",
  "_version": 1,
  "_score": null,
  "_source": {
    "level": "DEBUG",
    "tags": [
      "beats_input_raw_event"
    ],
    "@timestamp": "2018-09-13T18:36:45.376Z",
    "status": "OK",
    "unextractedJson": "{\"eventProperties\"=>{\"logAction\"=>{\"performedByUserId\"=>\"83fa1d72-fac2-4184-867e-8c2935a262e6\", \"logActionId\"=>26270372, \"clientIpAddress\"=>\"::1\"}}}",
    "action": "Logout",
    "source": "d:\\path\\actionsCurrent.json",
    "actionDescription": "Logout.",
    "offset": 136120,
    "@version": "1",
    "verb": "POST",
    "statusCode": 200,
    "controller": "Account",
    "performedByFullName": "Super Admin",
    "logger": "RPMS.WebAPI.Filters.LogActionAttribute",
    "input": {
      "type": "log"
    },
    "url": "http://localhost:49399/api/Account/Logout",
    "logFile": "actions",
    "host": {
      "name": "Development5"
    },
    "prospector": {
      "type": "log"
    },
    "performedAt": "2018-09-13T13:36:45.3707739-05:00",
    "beat": {
      "name": "Development5",
      "hostname": "Development5",
      "version": "6.4.0"
    },
    "performedByUserName": "rpmsadmin@domain.net"
  },
  "fields": {
    "@timestamp": [
      "2018-09-13T18:36:45.376Z"
    ],
    "performedAt": [
      "2018-09-13T18:36:45.370Z"
    ]
  },
  "sort": [
    1536863805376
  ]
}
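
A note on the unextractedJson value above: because grok coerces the [json] field (a hash by that point) to a string, the leftover data is stored in Ruby hash notation (=> separators) rather than as valid JSON. If valid JSON is preferred, one alternative, assuming the separately installed json_encode filter plugin (bin/logstash-plugin install logstash-filter-json_encode), would be:

filter {
    # serialize what remains of the [json] object as a proper JSON string
    json_encode {
        source => "json"
        target => "unextractedJson"
    }
    mutate {
        remove_field => ["json"]
    }
}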
Comment: Thank you for answering your own question, it helped me find the correct way to set the field depth limit of an index. See my answer below. – The Django Ninja, Aug 02 '20

The depth limit can be set per index directly in Elasticsearch.

Elasticsearch field mapping documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html#mapping-limit-settings

From the docs:

index.mapping.depth.limit The maximum depth for a field, which is measured as the number of inner objects. For instance, if all fields are defined at the root object level, then the depth is 1. If there is one object mapping, then the depth is 2, etc. Default is 20.
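
On Elasticsearch 6.x this can be applied to an existing index through the update index settings API, or put in an index template so that future daily indices pick it up. A minimal sketch (the index names and the limit of 2 are illustrative, not values taken from the question):

PUT logstash-2018.09.13/_settings
{
  "index.mapping.depth.limit": 2
}

PUT _template/limit-field-depth
{
  "index_patterns": ["logstash-*"],
  "settings": {
    "index.mapping.depth.limit": 2
  }
}

Be aware that the limit works by rejecting documents that would map fields deeper than the limit at index time; it does not silently drop the deeper properties.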

Related answer: Limiting the nested fields in Elasticsearch