4

How can I delete data from my elasticsearch database without deleting my index mapping?

I am using the Tire gem, and its delete command deletes all my mappings, so I have to run the create command again. I want to avoid running the create command again and again.

Please help me out with this.

javanna
Scorpy
  • I don't think it's possible, because you use Elasticsearch to store millions or even billions of records and index them. Once you have saved your documents, why would you need to delete them and index them again? What you describe only applies if you keep deleting the entire index and recreating it with small amounts of indexed data. And as @HüseyinBABAL said, it's the data that is important, not the mapping. – Airy Jan 22 '14 at 11:34
  • Why do you want to keep the mapping? The important part here is keeping the indexed data. You can easily create the mapping again. – Hüseyin BABAL Jan 22 '14 at 11:35
  • I have no interest in keeping the indexed data but I want the data to be indexed under the same mapping format. – Scorpy Jan 22 '14 at 12:47
  • Suppose I am downloading tweets from Twitter for a particular search, storing them in Elasticsearch, analyzing the data, and showing some results. For a new search I don't want the earlier data to remain. That is why I wanted to know if there is any way I can delete the earlier data without affecting the mapping. – Scorpy Jan 22 '14 at 20:36
  • this is a good question - none of the current answers address it. – Yehosef Feb 02 '15 at 16:09
  • The deprecated Delete By Query API mentioned in other answers came back in 5.0, as documented here: https://www.elastic.co/guide/en/elasticsearch/reference/5.0/docs-delete-by-query.html –  Aug 29 '17 at 02:10

5 Answers

7

Found it at http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-delete-by-query.html:

DELETE <index>/_query
{
  "query" : {
    "match_all": {}
  }
}

You can also just delete a specific type by changing it to DELETE <index>/<type>/_query

This will delete the data and keep the mappings, settings, etc.

Yehosef
  • Just to note, this is deprecated as of Elasticsearch 1.5.3, and will be removed in 2.0. – mnd Sep 30 '15 at 18:31
  • It's being moved to a plugin - https://www.elastic.co/blog/core-delete-by-query-is-a-plugin - it uses the scroll/scan api internally so it should be safer. – Yehosef Sep 30 '15 at 23:26
  • This [feature is implemented in 7.8](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html). – Ivo Pereira Jul 02 '20 at 10:14
4

You can use index templates, which will be applied to indices whose name matches a pattern.

That way you can simply delete an index, using the delete index api (way better than deleting all documents in it), and when you recreate the same index the matching index templates will get applied to it, so that you don't need to recreate its mappings, settings, warmers as well...

What happens is that the mappings will get deleted as they refer to the index that you deleted, but since they are stored in the index templates as well you won't need to resubmit them again when recreating the same index later on.
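As a rough sketch of that workflow, assuming a node at localhost:9200 and the legacy `_template` API: the index name `tweets`, the template name `tweets_template`, the type `tweet`, and the single `text` field are all hypothetical, chosen only for illustration. The `template` key holds the index-name pattern in older versions (newer versions use `index_patterns` instead), and the `string` field type is the pre-5.0 spelling, so adjust both to your version.

```shell
# Assumption: Elasticsearch is reachable here; override ES_URL if not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the template body. Index pattern, type name, and field are
# hypothetical placeholders, not taken from the question.
template_body() {
  cat <<'EOF'
{
  "template": "tweets*",
  "mappings": {
    "tweet": {
      "properties": {
        "text": { "type": "string" }
      }
    }
  }
}
EOF
}

# Register the template once; it is stored independently of any index.
put_template() {
  template_body | curl -s -XPUT "$ES_URL/_template/tweets_template" -d @-
}

# Drop the whole index. When "tweets" is created again (explicitly or by
# indexing the first document), the matching template is applied to it.
reset_index() {
  curl -s -XDELETE "$ES_URL/tweets"
}
```

With this in place you would call `put_template` once, and then `reset_index` before each new search instead of recreating the mapping by hand.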

javanna
2

Because of the way Elasticsearch deletes its documents (by flagging each document for deletion in a bitset), it wouldn't be worthwhile to iterate through X documents and flag each one for deletion. I believe the space is only freed later, when segments are merged and all documents flagged as deleted are removed; that is an expensive operation and slows down the shards on which the index resides.

Hope this helps.

Nathan Smith
2

Updating Yehosef's answer based on the latest docs (6.2 as of this post):

POST <index>/_delete_by_query
{
  "query" : {
    "match_all": {}
  }
}
tmcdevitt
0

Deleting by query is deprecated as of 1.5.3.

You should use the scroll/scan API to find all matching ids and then issue a bulk request to delete them.

As documented here

curl -XGET 'localhost:9200/realestate/houses/_search?scroll=1m' -d '
{
    "query": {
        "match_all" : { }
    },
    "fields": []
}
'

and then the bulk delete (don't forget to put a newline after the last row):

curl -XPOST 'localhost:9200/_bulk' -d '
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "1" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "2" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "3" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "4" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "5" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "6" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "7" } }
{ "delete" : { "_index" : "realestate", "_type" : "houses", "_id" : "8" } }
'
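Writing the bulk body by hand does not scale past a handful of ids, so here is a minimal sketch of a helper that builds it. The function name `bulk_delete_body` is hypothetical, and it assumes the ids arrive one per line and contain no characters that would need JSON escaping. Every emitted row ends with a newline, which covers the trailing-newline requirement mentioned above.

```shell
# Turn ids read from stdin (one per line) into a _bulk delete body.
# $1 = index name, $2 = type name. Empty lines are skipped.
# Assumption: ids contain no quotes or backslashes (no JSON escaping is done).
bulk_delete_body() {
  bulk_index="$1"
  bulk_type="$2"
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    printf '{ "delete" : { "_index" : "%s", "_type" : "%s", "_id" : "%s" } }\n' \
      "$bulk_index" "$bulk_type" "$id"
  done
}

# Possible usage, assuming ids.txt holds the ids collected from the scroll:
#   bulk_delete_body realestate houses < ids.txt |
#     curl -s -XPOST 'localhost:9200/_bulk' --data-binary @-
```

Note the `--data-binary` in the usage comment: when curl reads the body from a file or stdin, plain `-d` strips newlines, which would break the line-oriented bulk format.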
Ludo - Off the record
  • I am afraid this answer might be outdated. This [feature is implemented in 7.8](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html). – Ivo Pereira Jul 02 '20 at 10:13