Given the ElasticSearch NoSQL db, I'm trying to figure out how best to model social relationship data (yes, a graph db would be the best tool for the job, but in my current situation ElasticSearch might be forced upon me).

I'm new to ElasticSearch, and am reviewing ways to model relationships, but they don't seem to fit a use case for social connections, or at least it's not apparent to me how these would be modeled.

A greatly simplified version of my requirements is as follows:

  • People have IDs, names, and work place (they might not have a work place)
  • People can have friendship relationships with other people (and a date of friendship creation)
  • People can block other people from talking to them (directionality matters, as only the one who blocked can unblock)
  • People can work at the same work place

Things we're likely to query:

  • Give me all the people I'm friends with (given my ID)
  • Give me all the people I work with (given my ID)
  • Give me the union of the above 2, and the names and ids of their work places, but not those I've blocked or who have blocked me.
  • Give me all the friends who have a work place in the city where I work.

While the queries seem like they could be a challenge, I'm more interested in simply modeling people, work places, and the relationships between them in ElasticSearch in such a way that it makes sense, is maintainable, and that could support queries like these.

Documentation tells me ElasticSearch doesn't have joins. It has nested objects and parent-child relationships, but neither seems like a fit for friendship relationships between people: both carry an implicit concept of single ownership, unless I start duplicating person data everywhere, both in other person objects (for friends and for blocked) and in work places. That of course introduces the problem of keeping data consistent, as changing a person's data means changing every duplicated copy, and removing a friendship must also remove the other side of that relationship on the other person. This also brings up the issue of transactions, as I've heard that transactions spanning multiple documents aren't supported.

Aside from denormalization and duplication, or application-side joining outside of the db, are there any better ways (aside from using a different DB) to model this in a sane way that's easier to query?

InverseFalcon

1 Answer

Sample simplified json with some explanation afterward:

{ "type":"person", "id":1, "name":"InverseFalcon", "workplace":"StackOverflow", "friend_ids":[3,4,19], "blocked_ids":[45,24], "blocked_by_ids":[5] }

This should be fast: retrieve the person's document, work your sets in the application (union, intersection, difference on the ID arrays), and then perform a multi-get (mget) to retrieve the names and workplaces for the resulting IDs. The cost of not using a graph database is that traversals such as friends of friends need repeated calls, one round of lookups per level.
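As a rough sketch of that workflow, the Python below runs the "friends plus coworkers, minus blocked" query and a one-level friends-of-friends lookup against an in-memory dict. Everything beyond the sample document is an assumption made for illustration: the `PEOPLE` dict stands in for an index, the helper functions `get`, `mget`, and `coworker_ids` stand in for a document get, a multi-get, and a term query on `workplace`, and the extra people and their names are made up.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical in-memory stand-in for the index, keyed by person id.
# Only person 1 comes from the sample document; the rest is invented test data.
PEOPLE: Dict[int, dict] = {
    1: {"type": "person", "id": 1, "name": "InverseFalcon", "workplace": "StackOverflow",
        "friend_ids": [3, 4, 19], "blocked_ids": [45, 24], "blocked_by_ids": [5]},
    3: {"type": "person", "id": 3, "name": "Person 3", "workplace": "StackOverflow",
        "friend_ids": [1, 5], "blocked_ids": [], "blocked_by_ids": []},
    4: {"type": "person", "id": 4, "name": "Person 4", "workplace": "Acme",
        "friend_ids": [1], "blocked_ids": [], "blocked_by_ids": []},
    5: {"type": "person", "id": 5, "name": "Person 5", "workplace": "StackOverflow",
        "friend_ids": [3], "blocked_ids": [1], "blocked_by_ids": []},
    19: {"type": "person", "id": 19, "name": "Person 19", "workplace": None,
         "friend_ids": [1, 3], "blocked_ids": [], "blocked_by_ids": []},
    45: {"type": "person", "id": 45, "name": "Person 45", "workplace": "StackOverflow",
         "friend_ids": [], "blocked_ids": [], "blocked_by_ids": [1]},
}


def get(person_id: int) -> dict:
    """Stand-in for fetching a single document by id."""
    return PEOPLE[person_id]


def mget(ids: Iterable[int]) -> List[dict]:
    """Stand-in for a multi-get; ids with no document are skipped."""
    return [PEOPLE[i] for i in ids if i in PEOPLE]


def coworker_ids(person: dict) -> Set[int]:
    """Stand-in for a term query on workplace; empty if the person has none."""
    workplace = person.get("workplace")
    if workplace is None:
        return set()
    return {doc["id"] for doc in PEOPLE.values()
            if doc.get("workplace") == workplace and doc["id"] != person["id"]}


def visible_contacts(person_id: int) -> List[dict]:
    """Friends and coworkers, excluding anyone blocked in either direction."""
    me = get(person_id)
    excluded = set(me["blocked_ids"]) | set(me["blocked_by_ids"])
    wanted = (set(me["friend_ids"]) | coworker_ids(me)) - excluded
    return [{"id": d["id"], "name": d["name"], "workplace": d.get("workplace")}
            for d in mget(sorted(wanted))]


def friends_of_friends(person_id: int) -> Set[int]:
    """One extra level of traversal: a second round of lookups over direct friends."""
    me = get(person_id)
    direct = set(me["friend_ids"])
    second: Set[int] = set()
    for friend in mget(sorted(direct)):
        second.update(friend["friend_ids"])
    return second - direct - {person_id}


if __name__ == "__main__":
    print(visible_contacts(1))
    print(friends_of_friends(1))
```

Note that `blocked_by_ids` is the duplicated reverse side of someone else's `blocked_ids`, so the application has to update both documents whenever a block is added or removed; the sketch only reads them.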

NaturalData