For example, when is each API the better choice?
You already have the answer: "Reindex does not copy the settings from the source index. Mappings, shard counts, replicas, and so on must be configured ahead of time."
Clone and ReIndex perform similar operations in Elasticsearch, but they differ in some fundamental ways.
What is a Clone operation?
A clone operation copies an existing index into a new index, where each original primary shard is cloned into a new primary shard in the new index. In short, it gives you a new index with the same properties and settings as the original index.
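Internally, a clone operation does the following: it creates a new target index with the same definition as the source index; it hard-links the segments from the source index into the target index (if the file system does not support hard links, the segments are copied instead, which takes much longer); and it recovers the target index as though it were a closed index that had just been reopened.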
Clone is useful when we need an as-is copy of an index in another index. The target index keeps the same number of shards, the same mapping, and the same settings as the source index.
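As a rough sketch of how the clone flow might look from a script: the source index has to be made read-only (a write block) before it can be cloned, so there are two requests rather than one. The helper names clone_requests and send, and the ES_URL value, are my own assumptions for illustration (a local cluster without authentication), not part of any Elasticsearch client.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local cluster, no authentication


def clone_requests(source, target):
    """Return the (method, path, body) steps for cloning `source` into `target`."""
    return [
        # Clone requires the source index to be read-only (write block).
        ("PUT", f"/{source}/_settings", {"settings": {"index.blocks.write": True}}),
        # The clone itself; mappings and settings come from the source index.
        ("POST", f"/{source}/_clone/{target}", None),
    ]


def send(method, path, body=None):
    """Send one request to the cluster and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        ES_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Against a running cluster, the steps could then be executed in order with: for step in clone_requests("my_source_index", "my_target_index"): send(*step)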
What is a ReIndex operation?
A ReIndex operation copies the contents of a source index and writes them to a target index. It copies only the data, not the index settings, so we need to create the target index up front with the required settings and mapping before reindexing. The source and destination can be any pre-existing index, index alias, or data stream, but they must be different. ReIndex is suitable for cases that require changing the number of shards, the mapping, the settings, and so on. I usually reindex to update the mapping.
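To make the "create the target first, then reindex" order concrete, here is a minimal sketch that only builds the two request bodies and prints them. The function name reindex_with_new_mapping and the "title" field are hypothetical, and the typeless mapping format assumes Elasticsearch 7 or later.

```python
import json


def reindex_with_new_mapping(source, target, mappings, shards=1, replicas=1):
    """Return the (method, path, body) steps for reindexing into a new index."""
    create_body = {
        "settings": {"number_of_shards": shards, "number_of_replicas": replicas},
        "mappings": mappings,
    }
    reindex_body = {"source": {"index": source}, "dest": {"index": target}}
    return [
        # Reindex copies documents only, so the target index is created first.
        ("PUT", f"/{target}", create_body),
        ("POST", "/_reindex", reindex_body),
    ]


# Hypothetical mapping: the field name "title" is for illustration only.
steps = reindex_with_new_mapping(
    "source_index",
    "target_index",
    mappings={"properties": {"title": {"type": "keyword"}}},
    shards=3,
)
for method, path, body in steps:
    print(method, path)
    print(json.dumps(body, indent=2))
```

The printed requests can be pasted into the Kibana console or sent with whichever client you already use.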
A ReIndex operation can run in the background if we add the query parameter wait_for_completion=false to the request. The request then returns a task ID right away, and we can check progress with the Tasks API.
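A small sketch of the background flow, assuming the usual response shape for a background reindex, {"task": "<node_id>:<task_number>"}. The two helper names and the sample task ID are made up for illustration; nothing is sent to a cluster here.

```python
from urllib.parse import urlencode


def background_reindex_path():
    """Path for a reindex request that returns immediately with a task ID."""
    return "/_reindex?" + urlencode({"wait_for_completion": "false"})


def task_status_path(reindex_response):
    """Path for checking the task named in a background reindex response."""
    # The response looks like {"task": "<node_id>:<task_number>"}.
    return "/_tasks/" + reindex_response["task"]


# Made-up task ID, for illustration only.
sample_response = {"task": "oTUltX4IQMOUUVeiohTt8A:12345"}
print("POST", background_reindex_path())
print("GET", task_status_path(sample_response))
```

The response to the GET request contains a "completed" field that can be polled until it becomes true.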
Sample API request for a clone operation (the source index must be made read-only first):
POST /my_source_index/_clone/my_target_index
Sample API request for a reindex operation:
POST _reindex
{
  "source": {
    "index": "source_index"
  },
  "dest": {
    "index": "target_index"
  }
}