I'm currently using Elasticsearch 5.5 to search through source files as fast as possible.
The thing is, I need to search for exact phrases, but the search text can be just a fragment of a phrase, starting or ending in the middle of a word. I've searched extensively but couldn't find any similar cases.
For example, if the source file contains "public static class ElasticMethods", I need to be able to search for "ic static class Elast".
I'm not sure which analyzer I should use. If I use the standard analyzer, it breaks the text apart into words. That is a problem because I need to match the exact phrase: even if the source file contains the words public, static and class, it is not a match unless they appear in exactly the same order.
I then tried the keyword analyzer, but the problem with it is that a single term cannot be larger than 32 KB, and most of my source files exceed that.
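To make the requirement concrete, here is a minimal Python sketch of the matching behaviour I'm after. It is plain Python, not Elasticsearch; the function names are made up for illustration, and `contains_all_words` is only a rough stand-in for word-level matching, not the actual standard analyzer.

```python
import re


def contains_exact(source: str, query: str) -> bool:
    """Desired behaviour: the query must occur as one contiguous substring."""
    return query in source


def words(text: str) -> set:
    """Split text into lowercase words (a simplification of real analysis)."""
    return set(re.findall(r"\w+", text.lower()))


def contains_all_words(source: str, query: str) -> bool:
    """Word-level matching: every query word occurs somewhere, in any order."""
    return words(query) <= words(source)


source = "public static class ElasticMethods"

# A fragment that cuts through words should match as a substring...
fragment_matches = contains_exact(source, "ic static class Elast")
# ...while the same words in a different order should not.
reordered_matches = contains_exact(source, "class static public")
# Word-level matching gets both cases the wrong way round for my purposes.
word_level_reordered = contains_all_words(source, "class static public")
word_level_fragment = contains_all_words(source, "ic static class Elast")
```

So what I need from the index is the behaviour of `contains_exact`, not of `contains_all_words`.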
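As a quick way to see which files are affected, here is a small Python sketch that checks whether a file's contents would fit in a single term. It assumes the limit in question is Lucene's cap of 32766 bytes per term, measured on the UTF-8 encoding; the constant and function names are my own.

```python
# Assumption: Lucene's maximum length for a single indexed term, in bytes.
MAX_TERM_BYTES = 32766


def fits_in_single_term(text: str) -> bool:
    """Return True if the UTF-8 encoding of text is within the term limit."""
    return len(text.encode("utf-8")) <= MAX_TERM_BYTES


small_file = "public static class ElasticMethods"
large_file = "x" * 40000  # stands in for a large source file

small_ok = fits_in_single_term(small_file)
large_ok = fits_in_single_term(large_file)
```

Note that the check is on bytes, not characters, so a file with many non-ASCII characters reaches the limit sooner.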
If someone could help I'd appreciate it.
Here is one of the mappings that I've tried.
PUT /my_index?pretty
{
  "mappings": {
    "programa": {
      "properties": {
        "tFSPath": {
          "type": "text",
          "analyzer": "keyword"
        },
        "fileName": {
          "type": "text",
          "analyzer": "keyword"
        },
        "data": {
          "type": "text",
          "analyzer": "keyword"
        }
      }
    }
  }
}
A document example:
PUT /my_index/programa/2
{
  "tFSPath": "$/Projects/AuthAPI/AuthAPI/Controllers/HomeController.cs",
  "fileName": "HomeController.cs",
  "data": "using System.Web.Mvc; namespace AuthAPI.Controllers { public class HomeController : Controller { public ActionResult Index() { ViewBag.Title = \"Home Page\"; return View(); } } }"
}
And the query that got me the closest to what I need:
POST my_index/_search
{
  "query": {
    "regexp": {
      "data": ".*\"public ActionResult Index\".*"
    }
  }
}
In the example above, "ic ActionResult Ind" would be a match, but "Index ActionResult public" wouldn't. That's what I need.
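For reference, this is roughly how I build that request body for an arbitrary fragment. It is only a sketch: `build_substring_query` is a hypothetical helper of mine, and it relies on the regexp syntax treating text inside double quotes as a literal string, which is why fragments that themselves contain a double quote are rejected here.

```python
import json


def build_substring_query(field: str, fragment: str) -> dict:
    """Build a regexp query body matching values that contain the fragment."""
    if '"' in fragment:
        # The fragment is wrapped in double quotes to make it literal,
        # so an embedded double quote would break the pattern.
        raise ValueError("fragment must not contain double quotes")
    pattern = '.*"' + fragment + '".*'
    return {"query": {"regexp": {field: pattern}}}


body = build_substring_query("data", "ic ActionResult Ind")
request_json = json.dumps(body, indent=2)  # body to send to my_index/_search
```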