
I'm currently using Elasticsearch 5.5 to search through source files as fast as possible.

The thing is, I need to search for exact phrases, but the search term may be only part of a phrase. I've searched extensively but couldn't find any similar cases.

For example, if the source file contains "public static class ElasticMethods", I need to be able to search for "ic static class Elast".

I'm not sure which analyzer to use. If I use the standard analyzer, it will break the text apart into words. That is a problem because I need to match the exact phrase: even if the source file contains the words public, static, and class, it is not a match unless they appear in exactly the same order.

I then tried the keyword analyzer, but the problem is that a single term cannot be larger than 32 KB, and most of my source files are larger than that.
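To make the size problem concrete: the limit is most likely Lucene's maximum term length of 32,766 bytes, measured on the UTF-8 encoding, and the keyword analyzer emits the whole field value as a single term. A minimal Python sketch of that check follows; `MAX_TERM_BYTES` and `exceeds_term_limit` are hypothetical names used only for illustration, and the assumption that this is the limit behind the "32 KB" error is not confirmed by the post.

```python
# Lucene rejects any single term longer than 32,766 bytes (UTF-8 encoded).
# With the keyword analyzer the whole field value becomes one term,
# so the entire file content would have to fit under this limit.
MAX_TERM_BYTES = 32766  # assumed to be the limit behind the "32 KB" error


def exceeds_term_limit(text: str) -> bool:
    """Return True if `text` is too large to be indexed as a single term."""
    return len(text.encode("utf-8")) > MAX_TERM_BYTES


small_source = "public static class ElasticMethods"
large_source = "x" * 40000  # stand-in for a large source file

print(exceeds_term_limit(small_source))
print(exceeds_term_limit(large_source))
```

Note that the check counts bytes, not characters, so files with many non-ASCII characters reach the limit sooner.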

If someone could help, I'd appreciate it.

Here is one of the mappings that I've tried.

PUT /my_index?pretty
{
  "mappings": {
    "programa": {
      "properties": {
        "tFSPath": {
          "type": "text",
          "analyzer": "keyword"
        },
        "fileName": {
          "type": "text",
          "analyzer": "keyword"
        },
        "data": {
          "type": "text",
          "analyzer": "keyword"
        }
      }
    }
  }
}

A document example:

PUT /my_index/programa/2
{
    "tFSPath": "$/Projects/AuthAPI/AuthAPI/Controllers/HomeController.cs",
    "fileName": "HomeController.cs",
    "data": "using System.Web.Mvc; namespace AuthAPI.Controllers { public class HomeController : Controller { public ActionResult Index() { ViewBag.Title = \"Home Page\"; return View(); } } }"
}

And the query that got me closest to what I need:

POST my_index/_search
{
    "query": {
        "regexp":{
            "data": ".*\"public ActionResult Index\".*"
      }
    }
}

In the example above, "ic ActionResult Ind" would be a match, but "Index ActionResult public" wouldn't. That's what I need.
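The matching rule described above can be written down as a small Python sketch. It captures the semantics only, not how Elasticsearch would implement them: a query matches when it occurs in the file content as one contiguous substring. The helper name `is_match` is hypothetical, and case-sensitive comparison is an assumption.

```python
def is_match(source: str, query: str) -> bool:
    """Return True if `query` occurs in `source` as one contiguous substring."""
    # Plain substring test: order and spacing must match exactly,
    # and the comparison is case-sensitive (an assumption).
    return query in source


# Content of the example document's "data" field.
data = (
    'using System.Web.Mvc; namespace AuthAPI.Controllers { '
    'public class HomeController : Controller { '
    'public ActionResult Index() { ViewBag.Title = "Home Page"; return View(); } } }'
)

print(is_match(data, "ic ActionResult Ind"))        # True
print(is_match(data, "Index ActionResult public"))  # False
```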

Jeff Klein
  • Please add more information to the question. What type is the field you're searching on? – aholbreich Dec 22 '17 at 13:56
  • They're source files from multiple languages: Java, C#, JavaScript, etc. I'm using the text type for that. – Jeff Klein Dec 22 '17 at 15:15
  • What does your query look like? And please give an example of a document. Otherwise it's hard to understand what you are trying to do... Also check this: https://stackoverflow.com/questions/30517904/elasticsearch-exact-matches-on-analyzed-fields and this: https://stackoverflow.com/questions/22093334/how-to-make-query-string-search-exact-phrase-in-elasticsearch – aholbreich Dec 22 '17 at 15:39
  • Hey aholbreich, thanks for the help. I've read that post; the problem is that I can't use the keyword analyzer because my files are larger than 32 KB. I've also edited the post with an example file and a query that worked, but only with the keyword analyzer. – Jeff Klein Dec 22 '17 at 16:16

0 Answers