7

I just started using Elasticsearch 5.2.

I am trying to get all keys in the index. Suppose I have the following mapping:

"properties": {
  "name": { "type": "text" },
  "article": {
    "properties": {
      "id": { "type": "text" },
      "title": { "type": "text" },
      "abstract": { "type": "text" },
      "author": {
        "properties": {
          "id": { "type": "text" },
          "name": { "type": "text" }
        }
      }
    }
  }
}

Is it possible to get the full name of every field, like this?

 name,
 article.id ,
 article.title ,
 article.abstract ,
 article.author.id,
 article.author.name

How can I get that?

Rahul
igx
  • are you trying to get aggregations or documents by those fields ? – eyildiz Mar 17 '17 at 20:57
  • I am trying to get the fields names list . maybe the aggregation trial is confusing . I will remove it. thanks – igx Mar 17 '17 at 21:19
  • then you can use source filtering -- https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-source-filtering.html – eyildiz Mar 17 '17 at 21:23
  • I don't understand how that will yield only field names of the index – igx Mar 17 '17 at 21:26
  • ES returns all fields by default, if you want to exclude fields from source you can use source filtering. Maybe i couldnt understand your question ? – eyildiz Mar 17 '17 at 21:32
  • thanks @eyildiz, but filtering will not do it. I simply want to list all fields, so for `"article": { "properties": { "id": { "type": "text" }, "title": { "type": "text" }}}` I want article.id, article.title – igx Mar 17 '17 at 21:42
  • i see now, you want to get field names, not fields by name like getting column names on sql. this would be helpful maybe -- https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-field-mapping.html – eyildiz Mar 17 '17 at 21:45

3 Answers

3

You can use the _field_names field.

The _field_names field indexes the names of every field in a document that contains any value other than null.

GET _search
{
  "size": 0,
  "aggs": {
    "Field names": {
      "terms": {
        "field": "_field_names", 
        "size": 100
      }
    }
  }
}

Update: from ES 5 onwards, the _field_names field has been locked down and is only indexed. It does not support fielddata (which is memory intensive) or doc values, so the aggregation above no longer works.

Ref : https://github.com/elastic/elasticsearch/issues/22576

As an alternative, you can use the get mapping API.

The get mapping API can be used to get more than one index or type mapping with a single call. General usage of the API follows this syntax: host:port/{index}/_mapping/{type}

$ curl -XGET 'http://localhost:9200/index/_mapping?pretty'

You can then process the response to extract all the field names in the index.
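The processing step could look roughly like the Python sketch below. It assumes the parsed response has the ES 5.x shape `{index: {"mappings": {type: {"properties": {...}}}}}`; the function names and the hand-written `sample` response (index name `index`, type name `doc_type`) are illustrative only, and multi-fields declared under `fields` are not handled.

```python
def flatten_properties(properties, prefix=""):
    """Return dotted full names for every leaf field under a 'properties' dict."""
    names = []
    for field, spec in properties.items():
        full_name = prefix + field
        if "properties" in spec:
            # Object field: descend and carry the dotted prefix along.
            names.extend(flatten_properties(spec["properties"], full_name + "."))
        else:
            names.append(full_name)
    return names


def field_names_from_mapping(mapping_response):
    """Collect field names from a parsed get mapping response (assumed ES 5.x shape)."""
    names = []
    for index_body in mapping_response.values():
        for type_body in index_body.get("mappings", {}).values():
            names.extend(flatten_properties(type_body.get("properties", {})))
    return names


# Hand-written sample in the assumed response shape, using the mapping from the question.
sample = {
    "index": {
        "mappings": {
            "doc_type": {
                "properties": {
                    "name": {"type": "text"},
                    "article": {
                        "properties": {
                            "id": {"type": "text"},
                            "title": {"type": "text"},
                            "abstract": {"type": "text"},
                            "author": {
                                "properties": {
                                    "id": {"type": "text"},
                                    "name": {"type": "text"},
                                }
                            },
                        }
                    },
                }
            }
        }
    }
}

for name in field_names_from_mapping(sample):
    print(name)
```

With a live cluster, `sample` would be replaced by the parsed JSON body returned by the curl call above.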

Rahul
  • tried that but I am getting : `"root_cause" : [ { "type" : "illegal_argument_exception", "reason" : "Fielddata is not supported on field [_field_names] of type [_field_names]" } ],` – igx Mar 18 '17 at 06:56
  • IIUC I cannot use the `_field_names` in aggregations as suggested – igx Mar 18 '17 at 07:26
  • Yes, you may only query for the existence of a field but can't aggregate on the same. – Rahul Mar 18 '17 at 07:29
  • you can also use get _mapping api but that would require some coding on your part – Rahul Mar 18 '17 at 07:37
  • The get mappings approach will not work if you are using the join functionality. If you want to get the fields from a strictly child document this is not possible, since you cannot match your join field value. – Red-Tune-84 Feb 10 '18 at 18:24
  • I found a solution, which I think may be suggested but not obvious in redlus's answer. See my answer in another thread: https://stackoverflow.com/a/48724411/3453043 – Red-Tune-84 Feb 10 '18 at 18:55
3

The mapping API also allows querying the field names directly. Here is a Python 3 code snippet which should do the job:

import json
import requests

# get mapping fields for a specific index:
index = "INDEX_NAME"
elastic_url = "http://ES_HOSTNAME:9200"
doc_type = "DOC_TYPE"
mapping_fields_request = "_mapping/field/*?ignore_unavailable=false&allow_no_indices=false&include_defaults=true"
mapping_fields_url = "/".join([elastic_url, index, doc_type, mapping_fields_request])
response = requests.get(mapping_fields_url)

# parse the data:
data = response.content.decode()
parsed_data = json.loads(data)
keys = sorted(parsed_data[index]["mappings"][doc_type].keys())
print("index= {} has a total of {} keys".format(index, len(keys)))

# print the keys of the fields, pausing for Enter after every 43 lines:
for i, key in enumerate(keys):
    if i and i % 43 == 0:
        input()
    print("{:4d}:     {}".format(i, key))

Very convenient indeed. Do note that keys which contain "." in their name can make it a bit confusing to see how the fields are nested in the document.
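To make the nesting behind the dotted keys easier to see, the names can be grouped into a tree. The following is only a sketch: `build_field_tree` and `print_tree` are hypothetical helpers, and the code assumes every "." marks one level of nesting.

```python
def build_field_tree(dotted_names):
    """Group dotted field names into a nested dict; leaf fields map to empty dicts."""
    tree = {}
    for name in dotted_names:
        node = tree
        for part in name.split("."):
            # Create the level if it does not exist yet, then step into it.
            node = node.setdefault(part, {})
    return tree


def print_tree(tree, indent=0):
    """Print one name per line, indented by its depth in the tree."""
    for part, children in tree.items():
        print(" " * indent + part)
        print_tree(children, indent + 2)


# Field names taken from the question.
example_keys = [
    "name",
    "article.id",
    "article.title",
    "article.abstract",
    "article.author.id",
    "article.author.name",
]
print_tree(build_field_tree(example_keys))
```

In the snippet above, `example_keys` would be the `keys` list obtained from the field mapping response.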

redlus
0

You can try this approach, which walks the result of the mapping API. First, a helper that removes duplicates:

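def unique_preserving_order(sequence):
    """
    Remove duplicates while preserving order
    :param sequence: list of hashable objects
    :return: new list that keeps the first occurrence of each item
    """

    seen = set()
    # seen.add(x) returns None, so the condition is true only for unseen items
    return [x for x in sequence if not (x in seen or seen.add(x))]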

Get the ES index fields recursively:

def get_fields_recursively(dct, field_types=None):
    """
    Collect full (dotted) field names from a mapping dict
    :param dct: mapping or sub-mapping that has a 'properties' key
    :param field_types: optional collection of es field types to keep
    :return: list of field names, e.g. 'article.author.id'
    """

    if not dct or 'properties' not in dct:
        return []

    fields = []
    for key, ndct in dct['properties'].items():
        if 'properties' in ndct:
            # object field: recurse with the same type filter, then prefix the children
            children = get_fields_recursively(ndct, field_types)
            fields.extend('{0}.{1}'.format(key, f) for f in children)
        elif key.startswith('@'):
            # skip leaf fields such as '@timestamp'
            continue
        elif not field_types or ndct.get('type') in field_types:
            fields.append(key)
    return fields

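Get the fields from the index mappings. You can also filter the fields by type, e.g. only text fields or only numerical fields. This is a method of a class that keeps an Elasticsearch client in self.esclient: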

def get_mapping_fields(self, field_type=None, index=None, params={}):
    """

    :param field_type: es field types, filter fields by type
    :param index: elastic index name
    :param params: mapping additional params
    :return: fields

    <https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-field-mapping.html>
    - http://eshost:9200/_mapping
    - http://eshost:9200/_all/_mapping
    - http://eshost:9200/index_name/_mapping

    """

    _fields = []
    _mapping = self.esclient.indices.get_mapping(index=index, params=params)
    for idx_mapping in _mapping.values():
        mapping = idx_mapping.get('mappings')
        # 'system' and 'doc' appear to be the document type names of this setup;
        # replace them with the type names of your own index
        if 'system' in mapping:
            mapping = mapping.get('system')
        else:
            mapping = mapping.get('doc')
        fields = get_fields_recursively(mapping, field_type)
        if fields:
            _fields.extend(fields)

    # the same field can appear in several indices, so drop duplicates
    return unique_preserving_order(_fields)
Droid