I built an auction system that uses Elasticsearch. It has three models: users, auctions, and bids. A user can post an auction and can also bid on other users' auctions.
One of my first search use cases is searching bids. Aside from searching by id, user_id, price, etc., I've run into an interesting case: I want to be able to search for a user's name and get back all of my bids on auctions posted by that user.
For example, when I search for "John", I should get all the bids I have placed on auctions posted by the user "John".
Here's what the index looks like:
Bids
- id (not analyzed)
- user_id (not analyzed)
- price (not analyzed)
- auction_user_name (uses ngrams)
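To make the index concrete, here is roughly how it could be expressed, written as Python dicts. This is only a sketch: it assumes a recent Elasticsearch version where `keyword` fields play the role of "not analyzed", and the analyzer name, the gram sizes, the `double` type for price, and the function names are illustrative assumptions rather than my exact setup.

```python
import json


def build_bids_index_body(min_gram=2, max_gram=10):
    """Build an index-creation body for the bids index (illustrative values)."""
    return {
        "settings": {
            # Needed when max_gram - min_gram exceeds the default allowed difference.
            "index": {"max_ngram_diff": max_gram - min_gram},
            "analysis": {
                "filter": {
                    "name_ngram": {
                        "type": "ngram",
                        "min_gram": min_gram,
                        "max_gram": max_gram,
                    }
                },
                "analyzer": {
                    "name_ngram_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "name_ngram"],
                    }
                },
            },
        },
        "mappings": {
            "properties": {
                "id": {"type": "keyword"},
                "user_id": {"type": "keyword"},
                # Assumption: price is stored as a number rather than a string.
                "price": {"type": "double"},
                "auction_user_name": {
                    "type": "text",
                    "analyzer": "name_ngram_analyzer",
                    # Do not n-gram the search term itself, only the indexed names.
                    "search_analyzer": "standard",
                },
            }
        },
    }


def build_my_bids_by_seller_query(my_user_id, seller_name):
    """Find my bids on auctions whose poster's name matches seller_name."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"auction_user_name": seller_name}}],
                "filter": [{"term": {"user_id": my_user_id}}],
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_my_bids_by_seller_query("42", "John"), indent=2))
```

The `filter` clause restricts results to my own bids without affecting scoring, while the `match` clause does the partial-name lookup against the n-grammed field.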
I have a couple of problems with this index:
- Bids has a lot of rows (10M+), and having n-grams on auction_user_name takes up a lot of space. I'm wondering whether this data really should be denormalized into a single index with a single type, or whether there is a more appropriate alternative (parent-child types?).
- Some users are very active and can have thousands of bids. If one of them changes their name, it will cause thousands of updates to the bids index. This is not ideal: because the name is duplicated across so many documents, it can result in a write-heavy index that is vulnerable to denial of service.
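To illustrate the rename problem, this is roughly the request body a single name change would need if sent to the `_update_by_query` endpoint of the bids index. It is a sketch under assumptions: `auction_user_id` is a hypothetical field that is not in the index above (some identifier like it would be needed to find the affected bids), and the function name is made up for the example.

```python
def build_rename_update_by_query(auction_user_id, new_name):
    """Build an _update_by_query body that rewrites the denormalized name.

    Every bid on every auction posted by this user is rewritten, which is
    where the thousands of updates per rename come from.
    """
    return {
        "script": {
            "lang": "painless",
            "source": "ctx._source.auction_user_name = params.new_name",
            "params": {"new_name": new_name},
        },
        # Hypothetical field: the id of the user who posted the auction.
        "query": {"term": {"auction_user_id": auction_user_id}},
    }


if __name__ == "__main__":
    print(build_rename_update_by_query("7", "Johnny"))
```

Each matched document is reindexed, so the n-grams for the new name are regenerated for every affected bid as well.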
Are there known solutions for these two problems? I'm sure there's some trade-off I can make to solve this.
I have seen some suggestions on: https://www.elastic.co/guide/en/elasticsearch/guide/current/relations.html
The methods described there are not as elegant as I had hoped, so I'm interested in whether there are other ways to tackle the problem.