
Which characters does Elasticsearch's standard tokenizer treat as delimiters when it splits a string into tokens?

David Carek

1 Answer


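According to the documentation, the standard tokenizer does not use a fixed list of delimiter characters. I believe it splits text at the word boundaries defined by the Unicode Text Segmentation rules (UAX #29): http://unicode.org/reports/tr29/#Default_Word_Boundaries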
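If you would rather check a particular string than read through the boundary rules, you can ask Elasticsearch itself via the `_analyze` API. Below is a minimal Python sketch using only the standard library. The helper names `build_analyze_request` and `standard_tokens` are made up for illustration, and the `http://localhost:9200` address assumes an unsecured local node, so adjust it for your cluster.

```python
import json
import urllib.request


def build_analyze_request(text, base_url="http://localhost:9200"):
    """Build a POST request asking the _analyze API to run the standard tokenizer.

    base_url is an assumption (a local, unsecured node); change it as needed.
    """
    body = json.dumps({"tokenizer": "standard", "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def standard_tokens(text, base_url="http://localhost:9200"):
    """Send the request and return only the token strings from the response."""
    with urllib.request.urlopen(build_analyze_request(text, base_url)) as resp:
        payload = json.load(resp)
    # Each entry in "tokens" also carries offsets, type and position.
    return [entry["token"] for entry in payload["tokens"]]


# Example call (needs a running Elasticsearch node):
# print(standard_tokens("The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."))
```

Trying a few strings that contain the punctuation you care about (hyphens, apostrophes, dots, underscores) should show how each one is treated in practice.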

Andrei Stefan