Question: Can the Elasticsearch _reindex API be used to set/reset the "field datatypes" of fields that are copied through it?

This question comes from looking at Elastic's docs for reindex: https://www.elastic.co/guide/en/elasticsearch/reference/6.2/docs-reindex.html

Those docs show the _reindex API can modify things while they are being copied. They give the example of changing a field name:

POST _reindex
{
  "source": {
    "index": "from-index"
  },
  "dest": {
    "index": "new-index"
  },
  "script": {
    "source": "ctx._source[\"New-field-name\"] = ctx._source.remove(\"field-to-change-name-of\")"
  }
}

The script clause will cause "new-index" to have a field called New-field-name instead of the field named field-to-change-name-of from "from-index".

The documentation implies there is a great deal of flexibility available in the "script" functionality, but it's not clear to me whether that includes projecting datatypes: for instance, quoting data to turn it into strings/text/keywords, and/or treating things as literals to attempt to turn string data into non-strings (obviously fraught with danger).

If setting the datatypes in a _reindex is possible, I'm not assuming it will be efficient and/or without (perhaps harsh) limits. I just want to better understand the limits of the _reindex functionality (and figure out if I can force a datatype in just one interaction, instead of setting the mapping on the new index before I run the reindex command).

(P.S. I happen to be working on Elasticsearch 6.2, but I think my question holds for all versions that have had the _reindex API (sounds like everything 2.3.0 and greater).)

Mike Lutz

1 Answer

Maybe you are confusing some terms. The part of the documentation you are pointing out refers to the metadata associated with a document: the _type meta field just tells Elasticsearch that a particular document belongs to a specific type (e.g. a user type). It is not related to the datatype of a field (e.g. integer or boolean).

If you want to set/reset the mapping of particular fields, then depending on your case you may not even need scripting. You just have to create the destination index with the new mapping and then execute the _reindex API.
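
As a rough sketch of that two-step approach: the requests below assume a hypothetical field named some_field that should end up as a keyword, and a hypothetical type name my_type. In 6.x, mappings are nested under a type name, which should match the type of the source documents; 7.x and later drop that level. Neither name comes from the question, so adjust both to your data.

```
# Step 1: create the destination index with the desired mapping.
# "my_type" and "some_field" are placeholder names (assumptions).
PUT new-index
{
  "mappings": {
    "my_type": {
      "properties": {
        "some_field": { "type": "keyword" }
      }
    }
  }
}

# Step 2: copy the documents; each value is indexed according to
# the mapping defined in step 1.
POST _reindex
{
  "source": {
    "index": "from-index"
  },
  "dest": {
    "index": "new-index"
  }
}
```

Fields that are not listed in the explicit mapping would still be picked up by dynamic mapping, so only the fields whose datatype you want to control need to be declared up front.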

But if you want to change the mapping between incompatible values (e.g. indexing a non-numerical string into a field with an "integer" datatype), you would need to do some transformation through scripting or through an ingest node.
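
A minimal sketch of the scripting route, assuming a hypothetical field called count that holds numeric strings such as "42" in the source index. The script only changes the JSON value stored in _source. The field's datatype in the destination index still comes from that index's mapping, whether explicit or dynamic. With dynamic mapping, a JSON integer would typically be mapped as long, and a JSON string as text with a keyword sub-field.

```
# "count" is a placeholder field name (assumption, not from the question).
# The instanceof guard skips documents where the field is missing or
# already holds a non-string value.
POST _reindex
{
  "source": {
    "index": "from-index"
  },
  "dest": {
    "index": "new-index"
  },
  "script": {
    "lang": "painless",
    "source": "if (ctx._source.count instanceof String) { ctx._source.count = Integer.parseInt(ctx._source.count); }"
  }
}
```

A string that cannot be parsed (for example "abc") would make Integer.parseInt throw, which should be expected to fail the request, so real data may need a try/catch or a validity check in the script. Going the other way, String.valueOf(ctx._source.count) would turn a number into a string value.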

Antonio Val
  • Thanks for your reply! - My goal is to know if there is a way to set the datatype of a field/parameter all within a single _reindex call. Thanks for noting one could pre-create the "new-index" with a mapping - but I'm trying to avoid that work - I've updated my question to make this all a bit more clear – Mike Lutz Mar 13 '18 at 16:29
  • I'm not sure I understood what you mean, but do you want to set the mappings of the new index in the same call using the `_reindex` API? I think that would only be possible through dynamic mapping. Maybe knowing your use case would help: what kind of task are you trying to accomplish? – Antonio Val Mar 14 '18 at 05:28
  • What exactly is the difference between _type and the field's datatype? – Mustafa Qamaruddin Jun 30 '21 at 05:20