
I am searching by only a couple of fields, but I want to store the whole document in ES so that I can avoid additional DB (MySQL) queries.

I tried adding index: no, store: no to whole objects/properties in the mapping, but I'm still not sure whether the fields are being indexed and adding unnecessary overhead.

Let's say I've got books and each has an author. I want to search only by book title, but I want to be able to retrieve the whole document.

Is this okay:

mappings:
properties:
    title:
        type: string
        index: analyzed
    author:
        type: object
        index: no
        store: no
        properties:
            first_name:
                type: string
            last_name:
                type: string

Or should I rather do:

mappings:
properties:
    title:
        type: string
        index: analyzed
    author:
        type: object
        properties:
            first_name:
                index: no
                store: no
                type: string
            last_name:
                index: no
                store: no
                type: string

Or maybe I am doing it completely wrong? And what about nested properties that should not be indexed?

Яois
pinkeen

2 Answers


By default the _source of the document is stored regardless of the fields that you choose to index. The _source is used to return the document in the search results, whereas the fields that are indexed are used for searching.
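To make the split between searching and retrieving concrete, here is a minimal Python sketch. It only builds the request body and reads hits from a response dict; it does not talk to a cluster. The helper names and the sample response values are illustrative assumptions, not part of the original answer. The only thing relied on is that each hit carries the original document under `_source`.

```python
import json


def title_search_body(term):
    """Build a search body that matches on the indexed `title` field only."""
    return {"query": {"match": {"title": term}}}


def documents_from_response(response):
    """Return the full original documents carried in each hit's `_source`."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Hand-written sample in the shape of a search response (values are made up).
sample_response = {
    "hits": {
        "hits": [
            {
                "_id": "1",
                "_source": {
                    "title": "some book",
                    "author": {"first_name": "jane", "last_name": "roe"},
                },
            }
        ]
    }
}

print(json.dumps(title_search_body("book")))
print(documents_from_response(sample_response))
```

Even though only `title` is queried, the author object comes back intact because it is read from `_source`, not from the index.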

You can't set index: no on an object to prevent all of its fields from being indexed, but you can get the same effect with Dynamic Templates, using the path_match property to apply the index: no setting to every field within an object. Here is a simple example.

Create an index with your mapping that includes the dynamic templates for the author object and the nested categories object:

POST /shop
{
    "mappings": {
        "book": {
            "dynamic_templates": [
                {
                    "author_object_template": {
                        "path_match": "author.*",
                        "mapping": {
                            "index": "no"
                        }
                    }
                },
                {
                    "categories_object_template": {
                        "path_match": "categories.*",
                        "mapping": {
                            "index": "no"
                        }
                    }
                }
            ],
            "properties": {
                "categories": {
                    "type": "nested"
                }
            }
        }
    }
}
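If you have several objects to exclude, the templates can be generated rather than written by hand. The sketch below reproduces the request body above from a list of object names. The function names are hypothetical, and it keeps the `"index": "no"` syntax used in this answer; more recent Elasticsearch versions expect the boolean `"index": false` instead, so adjust for your version.

```python
import json


def no_index_template(object_path):
    """Dynamic template that disables indexing for every field under object_path."""
    name = object_path.replace(".", "_") + "_object_template"
    return {
        name: {
            "path_match": object_path + ".*",
            "mapping": {"index": "no"},
        }
    }


def build_book_mapping(unindexed_objects, nested_objects=()):
    """Assemble the index-creation body for the `book` type."""
    return {
        "mappings": {
            "book": {
                "dynamic_templates": [
                    no_index_template(path) for path in unindexed_objects
                ],
                "properties": {
                    path: {"type": "nested"} for path in nested_objects
                },
            }
        }
    }


body = build_book_mapping(["author", "categories"], nested_objects=["categories"])
print(json.dumps(body, indent=4))
```

The printed JSON can be sent as the body of the index-creation request.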

Index a document:

POST /shop/book/1
{
    "title": "book one",
    "author": {
        "first_name": "jon",
        "last_name": "doe"
    },
    "categories": [
        {
            "cat_id": 1,
            "cat_name": "category one"
        },
        {
            "cat_id": 2,
            "cat_name": "category two"
        }
    ]
}

If you search on the title field with the search term book, the document will be returned. If you search on author.first_name or author.last_name, there won't be a match because these fields were not indexed:

POST /shop/book/_search
{
    "query": {
        "match": {
            "author.first_name": "jon"
        }
    }
}

The same would be the case for a nested query on the category fields:

POST /shop/book/_search
{
    "query": {
        "nested": {
            "path": "categories",
            "query": {
                "match": {
                    "categories.cat_name": "category"
                }
            }
        }
    }
}

You can also use the Luke tool to inspect the Lucene index and see which fields have been indexed.
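A lighter-weight check, which is my own suggestion rather than part of the Luke workflow, is to fetch the index mapping (for example with the get-mapping API) and look for fields that carry the index setting. The sketch below walks a `properties` dict and lists the dotted paths of unindexed fields. The sample dict is hand-written to illustrate the shape and is an assumption; the exact mapping your cluster returns may differ.

```python
def unindexed_fields(properties, prefix=""):
    """Return dotted paths of fields whose mapping disables indexing."""
    found = []
    for name, spec in properties.items():
        path = prefix + name
        # Older mappings use "no"; newer ones use the boolean false.
        if spec.get("index") in ("no", False):
            found.append(path)
        # Recurse into object and nested fields.
        found.extend(unindexed_fields(spec.get("properties", {}), path + "."))
    return found


# Illustrative `properties` section (hypothetical, not real cluster output).
sample_properties = {
    "title": {"type": "string"},
    "author": {
        "properties": {
            "first_name": {"type": "string", "index": "no"},
            "last_name": {"type": "string", "index": "no"},
        }
    },
}

print(unindexed_fields(sample_properties))
```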

Dan Tuffery
  • Does `"index": "no"` imply `"store": "no"` ? I've read `store` means storing the original property's `_source` in lucene but I'm not sure how it is related to `index`. And just to make sure - I don't have to provide mappings for the non-indexed fields? ES won't throw errors if I put a document with property X that is an int and then a document with the same property but with string? – pinkeen Apr 10 '15 at 19:54
  • No, the setting for index does not determine the setting for store. The default for store is no, which is fine in your use case because the _source is enabled. If you disable the _source field and select the fields you want to store, the stored fields will only be returned in the search results when there is a match. You have to provide a mapping for non-indexed fields in order to tell Elasticsearch not to index them; otherwise Elasticsearch will use the default analyzer (Standard Analyzer) to index the field. – Dan Tuffery Apr 11 '15 at 08:11
  • However, in the above example the dynamic templates are used for the mappings of non indexed fields. If you don't have a mapping for a field an error won't be returned if you change the type of a property. – Dan Tuffery Apr 11 '15 at 08:11

You can simply set "enabled": false in mapping definition.

The enabled setting, which can be applied only to the top-level mapping definition and to object fields, causes Elasticsearch to skip parsing of the contents of the field entirely. The JSON can still be retrieved from the _source field, but it is not searchable or stored in any other way.

"mappings": {
  "properties": {
    "title": { "type": "text" },
    "author": { "type": "object", "enabled": false }
  }
}

Be aware that enabled is not applicable to core types. For core types, use the index option instead by setting "index": false.

The index option controls whether field values are indexed. It accepts true or false and defaults to true. Fields that are not indexed are not queryable.
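Putting the two options together, here is a small Python sketch that builds a mapping body using `enabled: false` for whole objects and `index: false` for individual core-type fields. The function and its parameters are hypothetical helpers for illustration, and the `isbn` field is an invented example that does not appear in the question.

```python
import json


def build_mapping(searchable_text_fields, opaque_objects=(), unindexed_fields=None):
    """Build a mapping body with searchable, opaque, and unindexed fields."""
    properties = {}
    for name in searchable_text_fields:
        # Indexed and analyzed; these are the fields you query.
        properties[name] = {"type": "text"}
    for name in opaque_objects:
        # Contents are not parsed at all; only retrievable via _source.
        properties[name] = {"type": "object", "enabled": False}
    for name, field_type in (unindexed_fields or {}).items():
        # Core-type field that is mapped but not queryable.
        properties[name] = {"type": field_type, "index": False}
    return {"mappings": {"properties": properties}}


body = build_mapping(
    ["title"],
    opaque_objects=["author"],
    unindexed_fields={"isbn": "keyword"},  # hypothetical field
)
print(json.dumps(body, indent=2))
```

`json.dumps` turns Python's `False` into JSON `false`, so the output can be used directly as a request body.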

Ahmad Nabil