I have a datastore entity called lineItems, which consists of individual line items to be invoiced. The users find the line items and attach a purchase order number to them. These are then displayed on the web page where the users can create the invoice.

I would post my code for fetching the entities, but I don't think it matters: this also happened a couple of times a few months ago when I was using Managed VMs, and that code was completely different. (I was using Objectify before; now I am using the Datastore API.) In a nutshell, I am currently just using a StructuredQuery.setFilter(new PropertyFilter.eq("POnum",ponum)).setFilter(new PropertyFilter.eq("Invoiced", false)); (this is pseudo code; you can't chain two .setFilter calls like this. The real code accepts a list of PropertyFilters and builds a composite filter properly.)

What happened this morning was the admin person created the invoice, and all but two of the lines were on the invoice. There were two lines which the code never fetched, and those lines were stuck in the "invoices to create" section.

The admin person simply created the invoice again for the given purchase order number, but the second time it DID pick up the two remaining lines and created a second invoice.

Note that the entities were created/edited almost 24 hours earlier (when she assigned the purchase order number to them), so they had been sitting in the database for quite a while. (I checked my logs.) This is not a case where they were just created and then accessed within a short period of time. It is also NOT a case of failing to update the entities: the code creates the invoice in a third-party accounting package, and the two lines simply were not on it. Upon successful invoice creation, all of the entities are updated with "invoiced = true" and written to the datastore. So the lines which were not on the invoice in the accounting program are the ones that weren't updated in the datastore. (This is not a "smart" check either; it does not check line by line. It simply checks whether the invoice creation was successful, and then updates all of the entities that it has in memory.)

As far as I can tell, the datastore simply did not return all of the entities which matched the query the first time, but it did the second time.

There are approximately 40,000 lineItems entities.

What are the conditions which can cause a datastore fetch to randomly fail to grab all of the entities which meet the search parameters of a StructuredQuery? (Note that this also happened twice while using Objectify on the now deprecated Managed VM architecture.) How can I stop this from happening, or check to see if it has happened?
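
One way to check whether it has happened, sketched in plain Java: compare the IDs of the line items shown in the "invoices to create" section with the IDs returned by the fetch at invoice time, and refuse to create the invoice if any are missing. This assumes the displayed IDs can be passed along with the request; the class and method names (`InvoiceFetchCheck`, `missingIds`) and the sample IDs are hypothetical and not part of the Datastore API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the Datastore API: compares the line item
// IDs shown to the user with the IDs returned by the fetch at invoice time.
class InvoiceFetchCheck {

    // Returns the IDs that were displayed but are absent from the second fetch.
    static Set<String> missingIds(List<String> displayedIds, List<String> fetchedIds) {
        Set<String> missing = new LinkedHashSet<>(displayedIds);
        missing.removeAll(fetchedIds);
        return missing;
    }

    public static void main(String[] args) {
        // Illustrative IDs only; real code would use the entity keys.
        List<String> displayed = Arrays.asList("line-1", "line-2", "line-3", "line-4");
        List<String> fetched = Arrays.asList("line-1", "line-3");

        Set<String> missing = missingIds(displayed, fetched);
        if (!missing.isEmpty()) {
            // Abort (or retry) instead of creating a partial invoice.
            System.out.println("Fetch incomplete, missing: " + missing);
        }
    }
}
```

A check like this would only detect the discrepancy; it says nothing about why the second fetch returned fewer entities.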

KevinG
  • With Objectify, you were probably grabbing everything from Memcache and it didn't bother w/ Datastore. If you don't use a parent/child relationship to find stuff, then it's difficult. You should also look at denormalization of data where possible. – Les Vogel - Google DevRel Jul 28 '17 at 21:18
  • I wasn't using memcache... and I'm getting the same problem with the new code using the flexible app engine datastore API. It was just a simple search, return a List where purchaseOrder="123456". The code ran once, and picked up 8/10 entities. Then the user ran it again and it picked up the last two... The entities were updated 24 hours prior to this, as well as accessed right before this properly. It seems to be a case of one random bad-read out of 3. (One to display the data on the page, one while processing it, and the last when the user re-processed the data). – KevinG Jul 29 '17 at 04:38
  • I'm not sure what the advantage of a parent/child relationship would be... If I have a field in my entity called "purchaseOrderNumber", "invoiceNumber", etc... I'd like to just keep things simple and just search all of the entities for "purchaseOrderNumber". (It's actually much more complex than this: PO numbers, work order numbers, authorized by, location, date of work... It is the combination of all of these fields matching which determines what goes on each invoice. After the invoice is created, I could see using a parent/child relationship as each entity would only have one parent...) – KevinG Jul 29 '17 at 04:49

1 Answer

You may be seeing eventual consistency because you are not using an ancestor query.

See: https://cloud.google.com/datastore/docs/articles/balancing-strong-and-eventual-consistency-with-google-cloud-datastore/
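
If eventual consistency is the cause, one possible guard is to re-run the query until two consecutive results agree before acting on them. The sketch below is plain Java with a simulated query; `StableRead`, `readUntilStable`, and the sample IDs are hypothetical names for illustration, and the approach is an assumption rather than something taken from the linked article. Two matching reads can still both be stale, so this reduces the chance of a partial result but does not rule it out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper: re-runs a query until two consecutive results agree.
class StableRead {

    // maxAttempts is the total number of query executions allowed (at least 2).
    static Set<String> readUntilStable(Supplier<Set<String>> query, int maxAttempts) {
        Set<String> previous = query.get();
        for (int attempt = 1; attempt < maxAttempts; attempt++) {
            Set<String> current = query.get();
            if (current.equals(previous)) {
                return current;
            }
            previous = current;
        }
        throw new IllegalStateException("Query results did not stabilize");
    }

    public static void main(String[] args) {
        // Simulated query: the first result misses two line items, later ones are complete.
        Set<String> partial = new HashSet<>(Arrays.asList("line-1", "line-3"));
        Set<String> full = new HashSet<>(Arrays.asList("line-1", "line-2", "line-3", "line-4"));
        Iterator<Set<String>> results = Arrays.asList(partial, full, full).iterator();

        Set<String> stable = readUntilStable(results::next, 3);
        System.out.println("Stable result size: " + stable.size());
    }
}
```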

Joshua Melcon
  • I do not believe this to be the case. The entity was last updated on July 26th. At 8:08 am on July 27th, the query was executed, and it missed 2 entities. Approximately 30 seconds later, the user ran the query again, only this time it picked up the two entities it missed the first time... Also, to display the data in the first place in the "invoices to create" section, the same code is used to fetch the entities. So it worked to display, failed to fetch them again from the server, and then the third fetch worked. – KevinG Jul 27 '17 at 23:07