I am having trouble with Elasticsearch search performance.
My machine specs are as follows:
- CPU vendor: Intel
- CPU model: Xeon (2001 MHz)
- CPU total logical cores: 24
- CPU cache: 15kb
- VM name: Java HotSpot(TM) 64-Bit Server VM
- VM vendor: Oracle Corporation
- VM version: 25.31-b07
- Java version: 1.8.0_31
- Elasticsearch version: 1.3.2
I have an index (metadatav3) with:
- settings: http://pastebin.com/6Q9f3tPv
- mapping:
{
  "metadatav3": {
    "mappings": {
      "track": {
        "dynamic": "true",
        "numeric_detection": true,
        "properties": {
          "album.album": {
            "type": "string",
            "norms": {
              "enabled": false
            },
            "analyzer": "music_field"
          },
          "album.exact": {
            "type": "string",
            "analyzer": "exact_music_field"
          },
          "artist.artist": {
            "type": "string",
            "norms": {
              "enabled": false
            },
            "analyzer": "music_field"
          },
          "artist.exact": {
            "type": "string",
            "analyzer": "exact_music_field"
          },
          "fullString": {
            "type": "string",
            "norms": {
              "enabled": false
            },
            "analyzer": "nGram_token_field"
          },
          "fullString.token": {
            "type": "string",
            "norms": {
              "enabled": false
            },
            "analyzer": "music_field"
          },
          "id": {
            "type": "string"
          },
          "isHidden": {
            "type": "boolean"
          },
          "lastRankedDate": {
            "type": "long"
          },
          "popularity": {
            "type": "float"
          },
          "tagCount": {
            "type": "long"
          },
          "title.edgeNGNoSplit": {
            "type": "string",
            "norms": {
              "enabled": false
            },
            "analyzer": "edge_nGram_no_split_small_field"
          },
          "title.exact": {
            "type": "string",
            "analyzer": "exact_music_field"
          },
          "title.title": {
            "type": "string",
            "norms": {
              "enabled": false
            },
            "analyzer": "music_field"
          }
        }
      }
    }
  }
}
When I run this query against the nGram-analyzed field (fullString):
{
  "from": 0,
  "size": 20,
  "timeout": 5000,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": {
            "match": {
              "fullString": {
                "query": "test",
                "type": "boolean",
                "operator": "OR",
                "minimum_should_match": "1",
                "cutoff_frequency": 0.01
              }
            }
          },
          "must_not": {
            "term": {
              "isHidden": "true"
            }
          },
          "should": []
        }
      },
      "field_value_factor": {
        "field": "popularity"
      }
    }
  },
  "explain": false
}
at around 50-60 req/sec, search response times rise from 60 ms to 4-5 seconds.
However, when I run the same query against fullString.token (analyzed with music_field instead):
{
  "from": 0,
  "size": 20,
  "timeout": 5000,
  "query": {
    "function_score": {
      "query": {
        "bool": {
          "must": {
            "match": {
              "fullString.token": {
                "query": "test",
                "type": "boolean",
                "operator": "OR",
                "minimum_should_match": "1",
                "cutoff_frequency": 0.01
              }
            }
          },
          "must_not": {
            "term": {
              "isHidden": "true"
            }
          },
          "should": []
        }
      },
      "field_value_factor": {
        "field": "popularity"
      }
    }
  },
  "explain": false
}
I can reach 600 req/sec during the load test.
What I don't understand is this: can using an nGram filter really cause that much CPU usage?
The hot threads dump is here: http://pastebin.com/5sFEZJa5. I can also upload Bigdesk screenshots if that helps.
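To illustrate what I suspect might be happening, here is a minimal Python sketch of how an nGram analyzer could expand the single query term into several terms. Since the mapping sets only "analyzer" on fullString, the same analyzer is applied at search time too. The ngrams helper and the min_gram=2 / max_gram=4 values are assumptions for illustration only (the real values are in the settings linked above), and this is not Elasticsearch's actual implementation.

```python
def ngrams(text, min_gram, max_gram):
    """Return all substrings of length min_gram..max_gram, in position order.

    Hypothetical helper that mimics the general idea of an nGram
    tokenizer/filter; it is not the Lucene implementation.
    """
    grams = []
    for start in range(len(text)):
        for size in range(min_gram, max_gram + 1):
            if start + size <= len(text):
                grams.append(text[start:start + size])
    return grams


# Assumed gram sizes; the actual ones come from the index settings.
query_terms = ngrams("test", 2, 4)
print(len(query_terms), query_terms)
```

With these assumed sizes, the one-word query becomes six terms combined with OR, so any document containing any of those grams is a candidate that has to be scored, whereas fullString.token only has to look up whole tokens.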
Thanks.