
I have downloaded my Inbox and I'm processing my emails with Pig and Hadoop. I have used Pig and Wonderdog to index these emails in ElasticSearch.

Now I am creating a web page for each email address in my inbox, to show messages sent by these addresses.

I can do this in two ways:

1) Group by email address in Pig, store to MongoDB (or ElasticSearch).

2) Query ElasticSearch to return this list for me from the email index using facets.

Which one is the go-to answer, and upon what does it depend?

rjurney

1 Answer

Option 1: Group by email address in Pig, store to MongoDB (or ElasticSearch).

Here you pre-compute the results and store them in MongoDB or ElasticSearch. If the data set is large and is not updated frequently, this is a good choice.
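To make the pre-computation concrete, here is a minimal sketch of the grouping step in plain Python. It only stands in for what a Pig GROUP BY would do at scale; the field names (`from`, `message_id`, `subject`) and the sample messages are assumptions for illustration, not taken from the question.

```python
from collections import defaultdict

def group_by_sender(emails):
    """Return one document per sender address, holding that sender's messages."""
    grouped = defaultdict(list)
    for email in emails:
        # "from", "message_id" and "subject" are hypothetical field names.
        grouped[email["from"]].append(
            {"message_id": email["message_id"], "subject": email["subject"]}
        )
    # One record per address, sorted by address, ready to write to a document store.
    return [
        {"address": address, "messages": messages}
        for address, messages in sorted(grouped.items())
    ]

# Hypothetical sample data.
emails = [
    {"message_id": "m1", "from": "alice@example.com", "subject": "Hello"},
    {"message_id": "m2", "from": "bob@example.com", "subject": "Lunch?"},
    {"message_id": "m3", "from": "alice@example.com", "subject": "Re: Hello"},
]
docs = group_by_sender(emails)
```

Each element of `docs` could then be stored keyed by `address`, so that rendering a page is a single lookup.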

Option 2: Query ElasticSearch to return this list from the email index using facets.

If the data is updated frequently, or if the data set is small, this is the better option: querying the index (on a correctly indexed field) returns results quickly, and you won't have to rely on pre-processing.
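As a rough sketch of the query-time approach, the request bodies might look like the following. This assumes the legacy facets API (later replaced by aggregations), a sender field named `from`, and that the field is indexed un-analyzed so that each address is a single term. The function names, the facet name `senders`, and the field name are all hypothetical.

```python
import json

def sender_facet_query(field="from", size=100):
    """Request body asking for the most frequent sender addresses and their counts."""
    return {
        "query": {"match_all": {}},
        "size": 0,  # return only the facet counts, no individual hits
        "facets": {"senders": {"terms": {"field": field, "size": size}}},
    }

def messages_from_query(address, field="from"):
    """Request body selecting the messages sent by one address."""
    return {"query": {"term": {field: address}}}

# Serialized body, as it would be sent to the index's _search endpoint.
body = json.dumps(sender_facet_query())
per_address = messages_from_query("alice@example.com")
```

The first body gives the list of addresses to build pages for; the second would be issued per page to fetch that address's messages.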

Tejas Patil