
I am trying to figure out how to index the following in Elasticsearch (ES).

I have many documents crawled from websites in various languages. Each document has a category, such as airport, restaurant, river, or beach, and a language, such as Arabic or English. For example:

doc { language: "eng", content: "something here", category: "beach" }

doc { language: "vn", content: "Xin chao", category: "beach" }

I want to index and search the documents per language.

For example, if I choose the English option and search for the query "here", I should get the matching results.

Should I:

  1. Set up a separate Elasticsearch instance per language (one machine per language), simply copying the ES installation for each one? :)

    E.g. create elasticsearch_ENGLISH, elasticsearch_VIETNAMESE

  2. Create one Elasticsearch index per language? E.g. create the indices

/english/type/

/vietnamese/type/

    Then, when I run a query, I search only the index for the chosen language.

Or should I do it some other way that I am not aware of? :)

phuongdo

1 Answer


Not sure I fully understood your concern.

If you need to search across the whole cluster (that is, in every language at once), you can't use a separate setup per language.

That said, you have many options:

It's not a full answer but some clues to help you...
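To illustrate the one-index-per-language option from the question, here is a minimal sketch in Python. It only builds the index names and request bodies as plain dictionaries and does not call Elasticsearch. The index names (`docs_english`, etc.), the `ar` language code, and the helper functions are hypothetical. The mapping layout follows the typeless format of recent Elasticsearch versions, which may differ from the version you run. `english` and `arabic` are built-in language analyzers; Vietnamese has no built-in analyzer, so the sketch assumes a fallback to `standard`.

```python
# Hypothetical lookup: language code -> (index name, analyzer name).
# "english" and "arabic" are built-in Elasticsearch analyzers; Vietnamese
# has none built in, so "standard" is used here as an assumed fallback.
LANGUAGES = {
    "eng": ("docs_english", "english"),
    "ar": ("docs_arabic", "arabic"),
    "vn": ("docs_vietnamese", "standard"),
}


def index_name(language):
    """Return the index that holds documents in the given language."""
    return LANGUAGES[language][0]


def index_settings(language):
    """Build a create-index body with a language-specific analyzer.

    Assumption: typeless mapping format of recent Elasticsearch versions.
    """
    analyzer = LANGUAGES[language][1]
    return {
        "mappings": {
            "properties": {
                "content": {"type": "text", "analyzer": analyzer},
                "category": {"type": "keyword"},
            }
        }
    }


def route_document(doc):
    """Pick the target index for a crawled document from its language field."""
    return index_name(doc["language"]), doc


def search_request(language, text):
    """Build a match query that targets only the chosen language's index."""
    return index_name(language), {"query": {"match": {"content": text}}}


if __name__ == "__main__":
    doc = {"language": "vn", "content": "Xin chao", "category": "beach"}
    print(route_document(doc))
    print(search_request("eng", "here"))
```

With this layout, a query in one language touches only that language's index, while a search across every language can still target several indices in a single request.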

Vinh VO
dadoonet
  • Thanks dadoonet :) So one index per language is the best option for this situation – phuongdo Nov 01 '12 at 06:09
  • Great suggestions, thanks. I have just updated the links in your answer as it seems the pages have moved. – Tom Oct 16 '13 at 21:37
  • There's a good post about this here: http://gibrown.wordpress.com/2013/05/01/three-principles-for-multilingal-indexing-in-elasticsearch/ – hellvinz Oct 24 '13 at 07:30