
I have four main kinds: Account, Company, Service and Address. I would like Address entities to be shared between Company and Service entities.

  • Account: this is the user account (email, password)
  • Company: A business which provides Services (ancestor: Account)
  • Service: A service rendered by a Company (ancestor: Company)
  • Address: An address (group of fields: street, city, country) of a Company OR a Service (ancestor: Account)

The Challenge: Company and Service entities may have different addresses; after all, a company's address is not necessarily where its services are acquired. Services may have many addresses, since a company may set up different franchises/outlets where its services may be acquired.

I would like to model data in such a way that Addresses can be referenced by either Company or Service entities, or both. I have tried these two approaches:

Let's assume this is the Address model:

from google.appengine.ext import ndb

class Address(ndb.Model):
    street = ndb.StringProperty(required=True)
    city = ndb.StringProperty(required=True)
    country = ndb.StringProperty(required=True)

Approach 1: Store list of address keys inside Service or Company

class Service(ndb.Model):
    title = ndb.StringProperty(required=True)
    addresses = ndb.KeyProperty(repeated=True)

class Company(ndb.Model):
    name = ndb.StringProperty(required=True)
    addresses = ndb.KeyProperty(repeated=True)

Problem: For each page view of a Service or Company, I would need to perform additional queries to fetch their respective addresses. This becomes an expensive problem as our entities grow in number.

Approach 2: Create an AddressMapping entity which forms a relationship between two entities:

class Service(ndb.Model):
    title = ndb.StringProperty(required=True)
    addresses = ndb.KeyProperty(repeated=True)

class AddressMapping(ndb.Model):
    entity = ndb.StringProperty(required=True)  # service or company
    address = ndb.KeyProperty(repeated=True)

Problem: If a service is disabled/deleted/modified, we need to delete/modify all accompanying AddressMapping entities, or else they will be orphaned. Additional queries are still required when viewing pages. This also seems expensive.

These are the two approaches I've come up with; they both seem bad. Any ideas on how I may improve this?

Dan McGrath
hyang123
  • Why not combine Company, Service and Address in a single entity? – voscausa Feb 05 '16 at 17:15
  • Do it the correct way (normalized), then test it. If you only care about performance, write everything in assembly. – Neil McGuigan Feb 05 '16 at 20:57
  • Companies and Services sound like subclasses of some superclass to me. But I can't think of a name for that superclass. If you think of one, please edit your question accordingly. – Walter Mitty Feb 06 '16 at 20:31

2 Answers


If you store keys of addresses in your Company and Service models, you do not need "additional queries to fetch them" - you can simply get all address entities that you need. This is fast and cheap.
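
A minimal sketch of the idea, using a plain-Python stand-in rather than the real datastore so that it runs without the App Engine SDK. `FakeDatastore`, the tuple-based `Key` and the sample addresses are hypothetical names made up for illustration; in ndb the equivalent batch lookup would be `ndb.get_multi(service.addresses)`.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[str, int]  # hypothetical (kind, id) pair standing in for ndb.Key


@dataclass
class Address:
    street: str
    city: str
    country: str


@dataclass
class Company:
    name: str
    addresses: List[Key] = field(default_factory=list)


@dataclass
class Service:
    title: str
    addresses: List[Key] = field(default_factory=list)


class FakeDatastore:
    """Dict-backed stand-in for the datastore that counts entity reads."""

    def __init__(self) -> None:
        self._rows: Dict[Key, Address] = {}
        self.reads = 0

    def put(self, key: Key, entity: Address) -> Key:
        self._rows[key] = entity
        return key

    def get_multi(self, keys: List[Key]) -> List[Optional[Address]]:
        # One batch lookup by key: one read per requested key, no query.
        self.reads += len(keys)
        return [self._rows.get(k) for k in keys]


store = FakeDatastore()
hq = store.put(("Address", 1), Address("1 Main St", "Springfield", "US"))
outlet = store.put(("Address", 2), Address("9 Side Rd", "Shelbyville", "US"))

# The same Address key is referenced by both entities; nothing is duplicated.
company = Company(name="Acme", addresses=[hq])
service = Service(title="Repairs", addresses=[hq, outlet])

service_addresses = store.get_multi(service.addresses)
print([a.city for a in service_addresses if a is not None])
```

The point of the sketch is that the key list already identifies every address to load, so rendering a Service page costs one batch lookup with one read per address.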

Andrei Volgin
  • Not so sure about the 'fast and cheap' of this solution. If you get an entity by key it will still be a datastore read / query. If you require multiple (or all) addresses you would require multiple key queries, instead of a single filter query. – konqi Feb 05 '16 at 18:37
  • (a) You don't need any queries when you have keys. (b) 1 read per entity is the lowest possible cost for retrieving data. Every other solution will be more expensive. – Andrei Volgin Feb 05 '16 at 18:40

This is a pretty standard problem with the datastore. The solution is denormalisation. By allowing duplicates you can break the problem down into a one-to-many relationship. So in your example: Allow Address duplicates and let each Address have a parent, either Company or Service. Or split your Address entity into two (ServiceAddress, CompanyAddress).

When you now modify a Service or Company, you can do a simple ancestor query for the addresses and you will only get the corresponding addresses.
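
A rough sketch of this denormalised layout, again with a plain-Python stand-in so it runs without the App Engine SDK. The key paths, `put_child` and `ancestor_query` are hypothetical helpers invented for illustration, and the Company and Service keys are treated as unrelated roots here purely to keep the example small. In ndb the lookup would be an ancestor query such as `Address.query(ancestor=service_key)`.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Key = Tuple[str, ...]  # hypothetical key path, e.g. ("Service", "repairs", "Address", "1")


@dataclass
class Address:
    street: str
    city: str
    country: str


rows: Dict[Key, Address] = {}


def put_child(parent: Key, address_id: str, address: Address) -> Key:
    """Store an address under a parent key path (stand-in for parent=...)."""
    key = parent + ("Address", address_id)
    rows[key] = address
    return key


def ancestor_query(parent: Key) -> List[Address]:
    """Return every address whose key path lies below the parent path."""
    return [
        address
        for key, address in rows.items()
        if len(key) > len(parent) and key[:len(parent)] == parent
    ]


company = ("Company", "acme")
service = ("Service", "repairs")  # assumption: not nested under the company

put_child(company, "1", Address("1 Main St", "Springfield", "US"))
# Denormalised: the service stores its own copy of the shared address.
put_child(service, "1", Address("1 Main St", "Springfield", "US"))
put_child(service, "2", Address("9 Side Rd", "Shelbyville", "US"))

print([a.city for a in ancestor_query(service)])
print([a.city for a in ancestor_query(company)])
print(len(rows))  # three stored entities for two distinct addresses
```

The trade-off shows in the last line: the shared address exists twice, so an edit to it has to be applied to each copy.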

This approach assumes that you will not update an Address (or any other entity in the same entity group) more than once per second, since otherwise you will run into the limit of 1 write per second per entity group.

konqi
  • There is absolutely no reason to create duplicate address entities. – Andrei Volgin Feb 05 '16 at 18:40
  • Depending on the consistency requirements there might be. I would prefer ancestors over lists of keys or ids, but that's just my opinion. – konqi Feb 05 '16 at 19:10
  • Child entities are a good option when a child always belongs to one parent, which is not the case here. You have to be very careful duplicating data, because you create complex logic (remember to edit/delete in both places, etc.). In this particular case, it's totally unnecessary. – Andrei Volgin Feb 05 '16 at 19:50
  • No argument there. Anyway, this argument is pointless because it is a question of requirements and personal taste. Both approaches work; there will always be scenarios where one approach works better than the other. I really don't want to construct hypothetical scenarios to prove my point, so I will let the OP decide which approach he likes best. – konqi Feb 05 '16 at 22:25