
I have several tens of thousands of small, related entities (NDB on top of Master/Slave; I will have to move to HRD one day), which I'd like to put in the same entity group to enable transactions.

Small subsets of those entities will be updated by transactions.

What are the performance implications of this setup?

Does it mean the whole group gets locked during the update, i.e. only one transaction at a time?

Thanks!

Srg

1 Answer


There's an approximate performance limit of 1 write transaction per second per entity group. The whole group does get locked for the update, so a concurrent transaction on the same group will fail and be retried.

10k entities in an entity group sounds like a lot, but it really depends on your write patterns. For example, if only a few entities in the group are ever updated, it may not be a big issue. However, if random users are constantly updating random entities in the group, you'll want to split it up into more entity groups.
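To make the contention concrete, here is a toy model of it in plain Python. It is only a sketch under stated assumptions: it is not the NDB API and not how the Datastore is implemented. The function `run_simultaneous`, the per-group version counter, and the "one commit per group per round" rule are all made up for illustration; real retries are limited in number and can give up.

```python
def run_simultaneous(txns):
    """Simulate transactions that are all issued at the same moment.

    txns is a list of (group, entity_id, new_value) tuples.
    Toy rule (an assumption): a commit succeeds only if nobody else has
    committed to the same group since the transaction started; otherwise
    the transaction fails and is retried in the next round.
    Returns (rounds, conflicts).
    """
    versions = {}  # group -> number of commits seen so far
    store = {}     # (group, entity_id) -> value
    pending = list(txns)
    rounds = 0
    conflicts = 0
    while pending:
        rounds += 1
        # Every pending transaction starts by noting its group's version.
        snapshots = [versions.get(group, 0) for group, _, _ in pending]
        retry = []
        for (group, entity_id, value), seen in zip(pending, snapshots):
            if versions.get(group, 0) == seen:
                # Nobody committed to this group in the meantime: commit.
                store[(group, entity_id)] = value
                versions[group] = seen + 1
            else:
                # Another transaction won the group this round: retry later.
                conflicts += 1
                retry.append((group, entity_id, value))
        pending = retry
    return rounds, conflicts


# Five transactions on five DIFFERENT entities in ONE entity group.
same_group = run_simultaneous([("group", i, "v") for i in range(5)])

# The same five writes, with each entity in its own entity group.
own_groups = run_simultaneous([("group-%d" % i, i, "v") for i in range(5)])

print(same_group, own_groups)
```

In this model the single-group batch needs one round per transaction even though no two transactions touch the same entity, while the one-entity-per-group batch finishes in a single round with no conflicts.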

dragonx
  • "performance limit of 1 write transaction per second" - does this mean that it will take ~5 seconds to complete 5 simultaneously issued transactions (on entities of the same group)? – Srg Dec 20 '12 at 15:56
  • That can happen, since there's some auto-retry in the transaction infrastructure. It's also possible that the first few transactions will succeed and the later ones might just fail (this is more likely with more traffic) – dragonx Dec 20 '12 at 16:02
  • There are two types of entities per group. Entities of the first type (a few thousand) change rarely (a few times a week); entities of the second type (also a few thousand) will change several times a day. – Srg Dec 20 '12 at 16:02
  • Does it matter if there is no actual conflict between the updates? I expect the occasions when two transactions attempt to update the same entity to be very rare. – Srg Dec 20 '12 at 16:26
  • When you put them in the same entity group, you limit the whole entity group. Essentially, if you put entities in the same entity group, they're treated as one in terms of transactions. If you want better performance, you want your entities to be in separate groups. Because of that you usually want to keep the number of entities in a group to a minimum if possible. – dragonx Dec 20 '12 at 17:00
  • So, when I use get_or_insert for the root entries (parent-less), is it a free lunch performance-wise? Or do I lock the whole Kind for a second? – Srg Dec 20 '12 at 23:58
  • In theory you can have as many of those going on in parallel as you want, until Google's hardware gets bogged down. – dragonx Dec 21 '12 at 01:44
  • Wow, thanks! That helps :) I think I'll put one entity per group (for protection) and try to use tasklets to regain some performance. – Srg Dec 21 '12 at 10:45
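
As a rough check on the write rates mentioned in the comments, the average load on a single group can be estimated with a few lines of arithmetic. The concrete numbers are assumptions standing in for "a few thousand", "a few times a week" and "several times a day"; the helper `average_writes_per_second` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_writes_per_second(entity_count, changes_per_entity_per_day):
    """Average write rate if the changes were spread evenly over the day."""
    return entity_count * changes_per_entity_per_day / float(SECONDS_PER_DAY)


# Assumed figures, not taken from the thread:
#   3000 entities changing about 3 times a week,
#   3000 entities changing about 5 times a day.
rare = average_writes_per_second(3000, 3.0 / 7)
frequent = average_writes_per_second(3000, 5)
total = rare + frequent

print("average writes/sec to the group: %.3f" % total)
```

With these assumed figures the average comes out well under one write per second, but an average hides bursts: what causes retries and failures is several transactions hitting the same group at the same moment, not the daily total.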