
I'm trying to model some highly connected but also hierarchical data in App Engine.

Here is an example:

Person:
    Phone Numbers:
        Number: 555-555-5555, Ext: 123, Notes: Work
        Number: 444-444-4444, Ext: 456, Notes: Mobile

One entity, contained data structures stored as JSON blobs:

One way to do this would be to store the phone_numbers collection as an unindexed blob of JSON text, and then add a search property so that a person could be queried by phone number:

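p_entity = Person()

# Full records go in an unindexed JSON blob (db is google.appengine.ext.db).
p_entity.phone_numbers = db.Text(simplejson.dumps([
    {'Number': '555-555-5555', 'Ext': '123', 'Notes': 'Work'},
    {'Number': '444-444-4444', 'Ext': '456', 'Notes': 'Mobile'}]))
# Digits-only copies go in an indexed list property for querying.
p_entity.phone_numbers_search_property = ['5555555555', '4444444444']

p_entity.put()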
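
The search property could be derived from the stored records rather than maintained by hand. Here is a minimal sketch of that idea, assuming the standard-library json module in place of simplejson; build_search_property is a hypothetical helper name, not something from App Engine:

```python
import json
import re


def build_search_property(phone_numbers_json):
    """Return digits-only numbers from a JSON list of phone records.

    Hypothetical helper for illustration; assumes each record has a
    'Number' key, as in the example above.
    """
    records = json.loads(phone_numbers_json)
    # Strip everything except digits so '555-555-5555' becomes '5555555555'.
    return [re.sub(r'\D', '', record['Number']) for record in records]


blob = json.dumps([
    {'Number': '555-555-5555', 'Ext': '123', 'Notes': 'Work'},
    {'Number': '444-444-4444', 'Ext': '456', 'Notes': 'Mobile'},
])
search_values = build_search_property(blob)
```

The resulting list would be assigned to phone_numbers_search_property before calling put(), so the blob and the indexed values stay in step.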

Multiple entities with parent-child relationships:

Another way would be to use child and parent entities:

person_entity = Person()
person_entity.put()

# Each phone number is a child entity in the person's entity group.
phone_entity1 = PhoneNumber(parent=person_entity)
phone_entity1.Number = '5555555555'
phone_entity1.Ext    = '123'
phone_entity1.Notes  = 'Work'
phone_entity1.put()

phone_entity2 = PhoneNumber(parent=person_entity)
phone_entity2.Number = '4444444444'
phone_entity2.Ext    = '456'
phone_entity2.Notes  = 'Mobile'
phone_entity2.put()

A use case:

This is highly connected data. A person object contains multiple phone numbers. But phone calls can also be made to and from those phone numbers. Records of phone calls will also need to refer to these phone numbers.

The purpose of parent-child entity relationships:

After reading over the documentation, I was under the impression that the purpose of parent-child entity relationships was for performing transactions.

However, could they also be appropriate in this case? Is it almost as efficient to pull a parent and all of its children out of the datastore as to pull out one entity with its "children" stored as JSON text blobs instead?

Basic question

Is there a normal and accepted way to handle this kind of data in Google App Engine?

Chris Dutrow

2 Answers


Take a look at the new NDB API (in particular the StructuredProperty: http://code.google.com/appengine/docs/python/ndb/properties.html#structured)

Also, from my experience and what I've read, when you update an existing entity you do not pay for writes on properties that did not change. That is, in contrast to what Riley said, you would only pay for writing the object, plus 2 writes for each indexed property that was modified, plus 1 write for each composite index that contains the model and properties you modified.

From all the articles I've read and my experience (I too had to come up with a solution for this and ended up with the JSON method), you want to pack as much as you can into a single entity to minimize trips to the datastore, which cost the most in terms of $$ and time.

someone1
  • Re. updating existing properties: Oh, really? I'd love for you to be right. Can you point me to the docs that indicate that you don't pay for properties that didn't change? – Riley Lark Mar 12 '12 at 17:34
  • I can't find the exact documentation, but look at Nick Johnson's reply to this question: http://stackoverflow.com/questions/8113363/what-does-google-classify-as-a-datastore-write-operation-in-google-app-engine. Nick is an App Engine team member and I believe a credible source on this matter. I also believe Guido mentioned this in the NDB Google group. To be fair though, my math is incorrect: the number of writes is 4 per modified property, not 2. – someone1 Mar 12 '12 at 17:44
  • Ah, beautiful. That will save me a lot of money ;) – Riley Lark Mar 12 '12 at 18:01
  • It's not really well documented, I guess. Luckily, it all happens behind the scenes, so you benefit from it whether you are aware of it or not, but nonetheless you can make better modeling decisions knowing this. – someone1 Mar 12 '12 at 18:14
  • Is NDB a lot better to use than the Python APIs directly? I used the ORM "entity framework" to build an application a few years ago and had a bad experience, so I'm a little standoffish when it comes to ORMs. – Chris Dutrow Mar 14 '12 at 16:39
  • NDB is just a new Python datastore API. It's meant as an improvement on, and replacement for, google.appengine.ext.db. It offers additional features and optimizations over the old API. Are you using the datastore API directly, bypassing even the google.appengine.ext.db library? – someone1 Mar 14 '12 at 18:25
  • In support of Riley Lark's comment (App Engine doesn't charge for properties that don't change), see also the [Billing page](https://developers.google.com/appengine/docs/billing) in the official documentation: it says that the cost for an Existing Entity Put (per entity) - i.e. an "update" - is "1 Write + 4 Writes _**per modified indexed property value**_ + 2 Writes per modified composite index value" – AngularChef Jul 10 '12 at 04:29
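
The billing formula quoted in the last comment can be turned into a quick estimate. This is only a sketch under the figures quoted there (1 write, 4 per modified indexed property value, 2 per modified composite index value); existing_entity_put_writes is a hypothetical helper, and the actual pricing may differ or change:

```python
def existing_entity_put_writes(modified_indexed_values,
                               modified_composite_values=0):
    """Estimate write operations for updating an existing entity.

    Uses the figures quoted from the Billing page in the comment above.
    Hypothetical helper for illustration only.
    """
    return 1 + 4 * modified_indexed_values + 2 * modified_composite_values


# Changing only the unindexed JSON blob modifies no indexed values.
blob_only = existing_entity_put_writes(0)
# Changing one number also modifies one value in the indexed search list.
one_number = existing_entity_put_writes(1)
```

Under these assumptions, edits that touch only the unindexed blob (an extension or a note, say) cost a single write, while changing a number also pays for the indexed search value.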

There's no special bonus for pulling child entities out of the datastore. If you get two entities, the cost is the same whether they're in the same entity group or not. The only purpose of entity groups in App Engine is transactions.

Should your records of phone calls change when the phone number is changed? My initial thought is that the records should have a separate copy of the phone number data, not a reference to a phone number object. You can still query the call log by phone number. It makes more sense to store a reference to the contacts involved, so that the call log can be updated if, say, a contact's name changes.
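
As a rough illustration of that record shape, here is a sketch in plain Python rather than a datastore model. CallRecord, its field names, and calls_for_number are all hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CallRecord:
    # Hypothetical shape for illustration, not a datastore model.
    contact_id: str      # reference to the contact involved in the call
    number_dialed: str   # copy of the number as it was when the call was made


def calls_for_number(call_log, number):
    """Return the records whose copied number matches, in log order."""
    return [call for call in call_log if call.number_dialed == number]


log = [
    CallRecord('person-1', '5555555555'),
    CallRecord('person-2', '4444444444'),
]
matches = calls_for_number(log, '5555555555')
```

Because the number is copied into the record, a later change to the contact's phone numbers leaves the call history intact, while the contact reference still leads back to the current name.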

Riley Lark
  • Hey Riley, thanks so much for your response! Yeah, I keep the phone numbers in the call records as well, but I also refer back to the phone number record that originated the call. That way if someone changes their 'WORK' number, then I know a call was made to a 'WORK' number, but I also know the actual number that was called. – Chris Dutrow Mar 12 '12 at 17:19