
I have two indices that are related to one another.

For example, let's say the first index contains all the information pertaining to an e-book: author, published date, title, and so on.

The second index contains all the paragraphs within a book: book ID, paragraph content, page number, complex object information, and so on.

When I want to query for paragraphs from the second index based on information in the first index, such as book title or published date, how do I do that?

  1. Is it advisable to store all the meta information of the first index inside the second index, so that I can apply filters and query its documents directly? That way I would be bloating the second index with duplicate information that I already have in the first index.
  2. Is there a way to form a relationship between these indices?
  3. Is it possible to maintain a single index for my case, for example by storing all the paragraph-related information in the first index itself as a list of objects? In this case, every document in the first index would be huge (let's say a list of 10,000 paragraphs or more). Would querying still be efficient?

Or is there any other way I can solve this?
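To make the two-index setup concrete, here is a minimal sketch of an application-side join: search the first index for matching books, collect their IDs, then filter the second index by those IDs. It only builds Elasticsearch query DSL bodies as plain dictionaries, so no client is needed. The field names `title`, `content`, and `book_id`, and all function names, are assumptions made for illustration.

```python
from typing import Any, Dict, List


def book_ids_query(title: str, size: int = 100) -> Dict[str, Any]:
    """Step 1: search body for the book index. Only document IDs are needed."""
    return {
        "size": size,
        "_source": False,  # skip the document body, we only want the IDs
        "query": {"match": {"title": title}},  # "title" is an assumed field
    }


def extract_book_ids(response: Dict[str, Any]) -> List[str]:
    """Pull the _id of every hit out of a standard search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


def paragraphs_query(book_ids: List[str], text: str) -> Dict[str, Any]:
    """Step 2: search body for the paragraph index, restricted to those books."""
    return {
        "query": {
            "bool": {
                # full-text match on the paragraph content (assumed field)
                "must": [{"match": {"content": text}}],
                # non-scoring restriction to the books found in step 1
                "filter": [{"terms": {"book_id": book_ids}}],
            }
        }
    }
```

This costs two round trips per search, and the `terms` filter is capped at a configurable number of values, so it probably only suits cases where the first query matches a modest number of books.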

Any help is much appreciated.

Vignesh T
  • A relational approach is not advisable. You can go with either approach 1 or approach 3. – Addicted Apr 04 '20 at 10:34
  • @Addicted, what are the complexities involved in approach 3? Won't it cause performance issues when each stored document is huge? – Vignesh T Apr 04 '20 at 11:57
  • You could also use a parent/child setup with a join field: https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html – Alkis Kalogeris Apr 04 '20 at 16:51
  • In your case it would be better to have all the information in only one index. The information in your first index seems to be very simple: all metadata of keyword type, used for filtering and sorting. You can store this information in your second index and get rid of the first index. – leandrojmp Apr 04 '20 at 17:31
  • @leandrojmp, but don't you think there will be data duplication if I need to store the same parent information in all 10,000 child documents? – Vignesh T Apr 05 '20 at 08:05
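For reference, the parent/child setup mentioned in the comments could look roughly like the sketch below. It keeps books and paragraphs in a single index (a join field requires that) without copying book metadata into each paragraph. The join field name `book_relation`, the relation names `book` and `paragraph`, and the other field names are assumptions, and the functions only build request bodies as dictionaries.

```python
from typing import Any, Dict

JOIN_FIELD = "book_relation"  # hypothetical name for the join field


def join_mapping() -> Dict[str, Any]:
    """Index mapping with a join field declaring book -> paragraph."""
    return {
        "mappings": {
            "properties": {
                JOIN_FIELD: {"type": "join", "relations": {"book": "paragraph"}},
                "title": {"type": "text"},
                "content": {"type": "text"},
            }
        }
    }


def paragraph_doc(book_id: str, content: str, page_number: int) -> Dict[str, Any]:
    """Child document body. It must be indexed with routing set to book_id
    so that it lands on the same shard as its parent."""
    return {
        "content": content,
        "page_number": page_number,
        JOIN_FIELD: {"name": "paragraph", "parent": book_id},
    }


def paragraphs_by_book_title(title: str, text: str) -> Dict[str, Any]:
    """Search body: paragraphs matching `text` whose parent book matches `title`."""
    return {
        "query": {
            "bool": {
                "must": [{"match": {"content": text}}],
                "filter": [
                    {
                        "has_parent": {
                            "parent_type": "book",
                            "query": {"match": {"title": title}},
                        }
                    }
                ],
            }
        }
    }
```

The trade-off is that `has_parent` queries tend to be slower than filtering on denormalized fields, so this avoids the duplication at some cost in query performance.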
