
I am using Logstash to feed logs into Elasticsearch. My Logstash configuration is:

input {
    file {
        path => "/tmp/foo.log"
        codec => plain {
            format => "%{message}"
        }
    }
}
output {
    elasticsearch {
        #host => localhost
        codec => json {}
        manage_template => false
        index => "4glogs"
    }
}

I notice that as soon as I start Logstash, it creates a mapping (logs) in ES, as below.

{
    "4glogs": {
        "mappings": {
            "logs": {
                "properties": {
                    "@timestamp": {
                        "type": "date",
                        "format": "dateOptionalTime"
                    },
                    "@version": {
                        "type": "string"
                    },
                    "message": {
                        "type": "string"
                    }
                }
            }
        }
    }
}

How can I prevent Logstash from creating this mapping?

UPDATE:

I have now resolved this error too. "object mapping for [logs] tried to parse as object, but got EOF, has a concrete value been provided to it?"

As John Petrone has stated below, once you define a mapping, you have to ensure that your documents conform to it. In my case, I had defined a mapping with "type": "nested", but the output from Logstash was a string. So I removed all codecs (whether json or plain) from my Logstash config, and that allowed the JSON document to pass through unchanged.

Here is my new Logstash config (with some additional filters for multiline logs).

input {
    kafka {
        zk_connect => "localhost:2181"
        group_id => "logstash_group"
        topic_id => "platform-logger"
        reset_beginning => false
        consumer_threads => 1
        queue_size => 2000
        consumer_id => "logstash-1"
        fetch_message_max_bytes => 1048576
    }
    file {
        path => "/tmp/foo.log"
    }
}
filter {
    multiline {
        pattern => "^\s"
        what => "previous"
    }
    multiline {
        pattern => "[0-9]+$"
        what => "previous"
    }
    multiline {
        pattern => "^$"
        what => "previous"
    }
    mutate {
        remove_field => ["kafka"]
        remove_field => ["@version"]
        remove_field => ["@timestamp"]
        remove_tag => ["multiline"]
    }
}
output {
    elasticsearch {
        manage_template => false
        index => "4glogs"
    }
}
Prakash Shankor
  • Is it created when the first message arrives or when log stash is started ? – Jettro Coenradie Jul 24 '14 at 09:23
  • On closer inspection it seems to be when the first message arrives. – Prakash Shankor Jul 24 '14 at 17:28
  • Ok, then you cannot change it; this is what Elasticsearch does. It needs a mapping, and if one is not available it auto-creates one. What is it you want to accomplish? – Jettro Coenradie Jul 24 '14 at 17:49
  • Jettro, Thanks for your response. I do not want the dynamic mapping that occurs on the first message. So it appears that defining my own mapping ( strictly ) is the way. But I am still having issues with my own mapping. See "object mapping error" below. – Prakash Shankor Jul 24 '14 at 23:01

3 Answers


You will need a mapping to store data in Elasticsearch and to search on it - that's how ES knows how to index and search those content types. You can either let logstash create it dynamically or you can prevent it from doing so and instead create it manually.

Keep in mind you cannot change existing mappings (although you can add to them). So first off you will need to delete the existing index. You would then modify your settings to prevent dynamic mapping creation. At the same time you will want to create your own mapping.
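To start fresh, you can drop the existing index first. A minimal sketch, assuming Elasticsearch is running on localhost:9200 and the index name from the question:

```shell
# Remove the index (and its mappings) so it can be recreated with your own mapping.
# Warning: this deletes all documents already stored in the index.
curl -XDELETE 'http://localhost:9200/4glogs'
```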

For example, this will create the mappings for the logstash data but also restrict any dynamic mapping creation via "strict":

$ curl -XPUT 'http://localhost:9200/4glogs/logs/_mapping' -d '
{
    "logs": {
        "dynamic": "strict",
        "properties": {
            "@timestamp": {
                "type": "date",
                "format": "dateOptionalTime"
            },
            "@version": {
                "type": "string"
            },
            "message": {
                "type": "string"
            }
        }
    }
}
'

Keep in mind that the index name "4glogs" and the type "logs" need to match what is coming from logstash.

For my production systems I generally prefer to turn off dynamic mapping as it avoids accidental mapping creation.

The following links should be useful if you want to make adjustments to your dynamic mappings:

https://www.elastic.co/guide/en/elasticsearch/guide/current/dynamic-mapping.html

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/custom-dynamic-mapping.html

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/dynamic-mapping.html

John Petrone
  • I am able to use the "dynamic":"strict" and have my own mapping. But now I run into a different error: **"object mapping for [logs] tried to parse as object, but got EOF, has a concrete value been provided to it?"** – Prakash Shankor Jul 24 '14 at 21:04
  • To be more clear, I have defined my own mapping for the type "logs" and index "4glogs". I cant paste my mapping in here as it's too huge. My mapping uses **"type": "nested"** under the "message" field. With this mapping, any attempt to index a document results in error: **"object mapping for [logs] tried to parse as object, but got EOF, has a concrete value been provided to it?"** . – Prakash Shankor Jul 24 '14 at 21:10
  • When you changed your mapping to include "type": "nested" you are telling ES to expect some object, fields unknown and for it to dynamically map them - but you have dynamic mapping turned off. Also keep in mind that even with dynamic mapping turned on, the first nested doc will define the mapping for it going forward, so if any data that follows does not match (e.g. field name same but data type different) you will get an error. – John Petrone Jul 24 '14 at 21:52
  • John, Thanks for your help. Are you saying I can not use "nested" and define my own mapping? My nested mapping has the child fields and their types clearly defined. – Prakash Shankor Jul 24 '14 at 23:07
  • (pardon the formatting) A sample of my mapping is: `{ "mappings": { "logs": { "properties": { "message": { "type": "nested", "properties": { "level": { "type": "string", "store": "no" }, "logger": { "type": "string", "store": "yes" }, "timestamp": { "type": "string", "store": "no" } } } } } } }` – Prakash Shankor Jul 24 '14 at 23:14
  • Ok, I misunderstood; your previous comment just had the "type":"nested" without the field mappings. This error message can occur when you load data with the same field name but with a different field type - for instance, after declaring something as "nested", if you then load a string it will fail. I'd advise not using "nested" until you have a really good idea of all possible cases of the data coming in - otherwise you run the risk of a failure. – John Petrone Jul 24 '14 at 23:20
  • I think that might indeed be my problem - declaring "nested" but loading "string" instead. Will post back here in a bit. – Prakash Shankor Jul 24 '14 at 23:54

`logs` in this case is the index type. If you don't want it created as `logs`, specify a different type on your elasticsearch output. Every record in Elasticsearch is required to have an index and a type. Logstash defaults to `logs` if you haven't specified one.
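In the Logstash 1.x elasticsearch output this is the `index_type` option; a minimal sketch based on the question's config (the type name `mylogs` is just an example):

```
output {
    elasticsearch {
        manage_template => false
        index => "4glogs"
        index_type => "mylogs"   # documents are indexed with type "mylogs" instead of the default "logs"
    }
}
```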

There's always an implicit mapping created when you insert records into Elasticsearch, so you can't prevent it from being created. You can create the mapping yourself before you insert anything (via say a template mapping).

Setting manage_template to false just prevents Logstash from creating the template mapping for the index you've specified. You can delete the existing template, if it's already been created, with something like curl -XDELETE http://localhost:9200/_template/logstash?pretty
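If you're not sure what templates are already installed, you can list them before deleting anything (assuming Elasticsearch on localhost:9200):

```shell
# List all installed index templates; the default Logstash template appears under "logstash"
curl -XGET 'http://localhost:9200/_template?pretty'
```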

Alcanzar

Index templates can help you. Please see this jira for more details. You can create index templates with wildcard support to match index names and supply your default mappings.
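As a sketch (Elasticsearch 1.x syntax; the template name and wildcard pattern are just examples), a template that applies the strict mapping from the question to any matching index could look like:

```shell
# Any index whose name matches "4glogs*" gets this mapping when it is created
curl -XPUT 'http://localhost:9200/_template/4glogs_template' -d '
{
    "template": "4glogs*",
    "mappings": {
        "logs": {
            "dynamic": "strict",
            "properties": {
                "@timestamp": { "type": "date", "format": "dateOptionalTime" },
                "@version":   { "type": "string" },
                "message":    { "type": "string" }
            }
        }
    }
}
'
```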

Pankaj Yadav
  • or see this link right away: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-templates.html – radiospiel Jan 11 '16 at 19:35