Introduction

I'm working on a project for a company, and we are already in production. Our QA recently found some odd behavior in ElasticSearch. We use ElasticSearch alongside MongoDB, and the ElasticSearch index is populated through a river, specifically the MongoDB River Plugin for ElasticSearch.

Background

Our service aggregates, filters, and sorts upwards of 2 million job posts. To search this data quickly and effectively we use ElasticSearch, with MongoDB as our main data store. One of the main search functions is searching by region, state, and city. States are stored as abbreviations, e.g. Madison, WI. With this functionality we can search an entire region (e.g. the Midwest) and get results for every state in that region, and we can search a state and get results for every city in that state.

The Problem

Searches for the state of Oregon either return no hits, or return hits that do not include cities within Oregon, only statewide jobs (jobs not tied to any specific city).

The Cause

The most likely cause seems to be that Apache Lucene reserves the word OR as a boolean operator in its query syntax, and OR is also the abbreviation for Oregon. I believe this is the problem because the odd behavior only shows up for searches in the state of Oregon.
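To illustrate the suspected collision, here is a minimal Python sketch that wraps the abbreviation in double quotes so the Lucene query parser reads it as a term rather than as the OR operator. It assumes the search goes through a `query_string` query, which the post does not confirm; the helper names `state_clause` and `query_string_body` are made up for this example. Quoting only affects how the query is parsed, not how the field was analyzed at index time.

```python
def state_clause(field, abbreviation):
    """Build a query_string clause for one state abbreviation.

    The value is wrapped in double quotes so that reserved words such as
    OR, AND, and NOT are parsed as terms instead of boolean operators.
    """
    return '{}:"{}"'.format(field, abbreviation)


def query_string_body(field, abbreviation):
    """Wrap the clause in a query_string search body (hypothetical helper)."""
    return {"query": {"query_string": {"query": state_clause(field, abbreviation)}}}


# Example: a search body for Oregon against the "states" field.
oregon_body = query_string_body("states", "OR")
```

Here `oregon_body` carries the query string `states:"OR"` instead of the bare `states:OR`.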

The Solution?

My proposed solution is to change the "states" field to not_analyzed, so that its values are indexed verbatim, and to change my search query accordingly.
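As a rough sketch of what that change might look like, the Python below builds the two JSON bodies involved: an index-creation body that marks "states" as not_analyzed, and a `term` query that matches the stored value exactly. The mapping uses the `string` / `not_analyzed` syntax of the ElasticSearch versions that supported rivers. The type name `jobs` and both function names are placeholders, not names from the actual project.

```python
def states_mapping(doc_type):
    """Index-creation body that stores "states" without analysis."""
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    # not_analyzed keeps "OR" as the single exact token "OR".
                    "states": {"type": "string", "index": "not_analyzed"}
                }
            }
        }
    }


def state_term_query(abbreviation):
    """Exact-match search body; term queries are not run through an analyzer."""
    return {"query": {"term": {"states": abbreviation}}}


# "jobs" is a hypothetical type name used only for illustration.
mapping_body = states_mapping("jobs")
oregon_query = state_term_query("OR")
```

Because a `term` query skips analysis and query-string parsing, the value has to match the stored case exactly ("OR", not "or").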

Why I cannot get this to work

The MongoDB River is relatively turnkey: I can point it at a database and even narrow that down to a collection, and it will generate its own mapping for my collection(s). The problem is that I have found no documentation or mention of how to define my own mapping for data that is stored in MongoDB and indexed into ElasticSearch through the river.

Conclusion

Does anyone know of a way to change a field in a predefined mapping? Otherwise, does anyone know how I could define my own mapping for the MongoDB River? Documentation or examples would be great. It's a somewhat confusing issue, so if you need more details, feel free to ask.

tsturzl

1 Answer

I believe you need to create the index and define the mappings first, and then create the river. You may find this previous discussion helpful:

mapping in create index in elasticsearch through mongodb river is not taking effect

along with this one:

http://elasticsearch-users.115913.n3.nabble.com/Add-settings-and-mapping-when-create-new-river-mongodb-td4039081.html
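To make the ordering concrete, here is a small Python sketch that only builds the two requests in the order I have in mind and sends nothing over the network. The river `_meta` layout (`type`, `mongodb.db`, `mongodb.collection`, `index.name`, `index.type`) follows the plugin's README as I remember it, so please check it against the version you have installed. All of the names passed in at the bottom are placeholders.

```python
def river_setup_plan(river_name, index, doc_type, db, collection, properties):
    """Return (method, path, body) tuples: create the index first, then the river."""
    create_index = (
        "PUT",
        "/{}".format(index),
        {"mappings": {doc_type: {"properties": properties}}},
    )
    create_river = (
        "PUT",
        "/_river/{}/_meta".format(river_name),
        {
            "type": "mongodb",
            "mongodb": {"db": db, "collection": collection},
            # Point the river at the index and type that already carry the mapping.
            "index": {"name": index, "type": doc_type},
        },
    )
    return [create_index, create_river]


# Placeholder names; substitute your own river, index, database, and collection.
plan = river_setup_plan(
    river_name="jobs_river",
    index="jobs_index",
    doc_type="jobs",
    db="jobsdb",
    collection="jobs",
    properties={"states": {"type": "string", "index": "not_analyzed"}},
)
```

The point of the ordering is that the index and its mapping already exist by the time the river starts indexing, so the river should not get the chance to generate a default mapping for "states".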

John Petrone