1

The setup for this question is a bit elaborate, please bear with me :)

My scenario is that of migrating data from relational table form to document collection form ("ETL"). My source data is represented in a relational table where each row describes a "Person" and one of his/her "Addresses". (Naturally this table isn't normalized).

A simplified example would be:

PersonId PersonName AddressId AddressText
1        John Doe   1         Somewhere
1        John Doe   2         Elsewhere
2        Jane Doe   3         Some other place

To complicate things, some of the rows can update previous existing values. Ex:

PersonId PersonName AddressId AddressText
1        John Doe   1         A new address

I would like to migrate this data into a document collection in a document store (DB) for easy retrieval of specific Person entities with all their related Addresses. To begin with I am considering a single document per Person, like so:

{
   "id":"1",
   "addresses":[
      {
         "id":"1",
         "text":"A new address"
      },
      {
         "id":"2",
         "text":"Elsewhere"
      }
   ]
}

MY QUESTION IS: What is the best way to merge the table rows into a document?

Is the best approach to take a document store that supports partial updates (MongoDB, RavenDB) and use that feature?

Or maybe I should fetch the document as a whole from the document DB, update it in my business logic layer and save it whole back to the DB?

Or should I aim for a different document representation of the data altogether?

Community
  • 1
  • 1
urig
  • 16,016
  • 26
  • 115
  • 184
  • 1
    With "merge the rows into a document", do you mean that you want to migrate data from a relational database into a document db or are you just wondering how to model this? – Jens Pettersson Dec 16 '14 at 20:57
  • @Jaynard good question. The scenario is indeed data migration - A daily scheduled ETL process. I'll also update my question with this detail. – urig Dec 17 '14 at 10:08
  • Did you resolve this? – datasci Dec 31 '14 at 21:17
  • @datasci Yes. In my particular scenario the document store did not support partial updates - This was Azure DocumentDB. Because of this constraint my team opted to fetch the document from the store, update it at the application layer and save it back. In retrospect, I would have gone with a document store that does support partial updates and leverage that feature. – urig Feb 17 '15 at 22:17

2 Answers2

1

Your document could look something like this in MongoDB:

{
    person_id:PersonId,
    name: PersonName,
    addresses:[{
              id:AddressId1,
              street:AddressText1
              },
              {
              id:AddressId2,
              street:AddressText2
              },
              ...
              {
              id:AddressIdn,
              street:AddressTextn
              }]
}

Then, we could update the second address this way:

db.collection.update({person_id:PersonId,"addresses.id":AddressId2},{$set:{"addresses.$.street":NewAddress}})

See here for additional detail: http://docs.mongodb.org/manual/reference/operator/update/positional/#update-embedded-documents-using-multiple-field-matches

Hope this helps.

datasci
  • 1,019
  • 2
  • 12
  • 29
1

You might want to ask yourself WHY you want to do a "partial update" on a document. If the requirement is that you should be able to change/update a person's address you might want to model that behaviour in the Person entity (kind of what you say with "update it in my business logic layer").

Changing address information seems like a domain responsibility and should (imo) be modeled accordingly. A quick example using RavenDb in C#:

using (var session = _documentStore.OpenSession())
{
    var person = session.Load<Person>("persons/1");
    person.UpdateAddressInformation(someAddressInformation);

    session.SaveChanges();
}

But if you really need to do partial document updates, you can do that with RavenDb as well:

http://ravendb.net/docs/article-page/2.5/csharp/client-api/partial-document-updates

//J

Jens Pettersson
  • 1,167
  • 1
  • 9
  • 14