I have a RESTful web service that runs on the Google App Engine, and uses JPA to store entities in the GAE Data Store.

New entities are created using a POST request (as the server will generate the entity ID).

However, I am uncertain which status code is best to return, as the GAE Data Store is eventually consistent. I have considered the following:

  • 200 OK: The RFC states that the response body should contain “an entity describing or containing the result of the action”. This is achievable because the entity is updated with its generated ID when it is persisted to the DS, so the updated entity can be serialized and returned straight away. However, subsequent GET requests for that entity by ID may fail because not all nodes may have reached consistency yet (I have observed this as a real-world problem in my client application).
  • 201 Created: As above, returning a URI for the new entity may cause the client problems if consistency has not yet been reached.
  • 202 Accepted: Would eliminate the problems discussed above, but would not be able to inform the client of the ID of the new entity.

What would be considered best practice in this scenario?

Ben Owen

2 Answers

A GET by key is always consistent, so a 200 response would be fine by your criteria, unless there is a problem on Google's side. Are you certain the problems you observed come from GETs rather than queries? There is a difference between a query that selects on a key and a GET by key.

For a query to be consistent it must be an ancestor query. A GET by key is also consistent. Anything else may see stale data, because the indexes may not have been updated yet.
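To make the distinction concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not the App Engine API: `ToyDatastore`, `getByKey`, `queryById`, and `applyPendingIndexWrites` are all hypothetical names, and the "index lag" is simulated by hand.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Toy model of the behaviour described above. NOT the real datastore API. */
final class ToyDatastore {
    // Primary storage: what a lookup by key reads.
    private final Map<Long, String> entities = new HashMap<>();
    // Index: what a non-ancestor query reads. It lags behind the primary storage.
    private final Map<Long, String> index = new HashMap<>();
    private final List<Long> pendingIndexWrites = new ArrayList<>();
    private long nextId = 1;

    /** Stores the payload and returns the generated ID. The index update is deferred. */
    long put(String payload) {
        long id = nextId++;
        entities.put(id, payload);
        pendingIndexWrites.add(id);
        return id;
    }

    /** GET by key: reads primary storage, so it sees the write immediately. */
    Optional<String> getByKey(long id) {
        return Optional.ofNullable(entities.get(id));
    }

    /** Non-ancestor query: only sees what the index has caught up with. */
    Optional<String> queryById(long id) {
        return Optional.ofNullable(index.get(id));
    }

    /** Simulates the index eventually catching up. */
    void applyPendingIndexWrites() {
        for (long id : pendingIndexWrites) {
            index.put(id, entities.get(id));
        }
        pendingIndexWrites.clear();
    }
}
```

In JPA terms, the closest analogue to `getByKey` is presumably `EntityManager.find(Message.class, id)`, while a JPQL `SELECT ... WHERE` corresponds to `queryById`. How the GAE JPA plugin maps each call is an assumption you should verify against its documentation.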

This all assumes there isn't an actual problem on Google's side. We have seen problems in the past where datacenters were late replicating and eventual consistency was very late, sometimes by hours.

But you have no way of knowing that, so you either have to assume all is OK, or take an extremely pessimistic approach.

Tim Hoffman
  • He says in the question that the GET by key sometimes fails in the real world. That's the reason he's asking the question. – Eric Stein Apr 08 '14 at 10:38
  • But is it really from a GET or a query? The only time I am aware of failed GETs is when Google has an outage. Is this something you can deal with at that level, with response codes? You don't know either way, which means you can't ever rely on anything, and so even sending whole resources back is meaningless. – Tim Hoffman Apr 08 '14 at 10:43
  • Do you know if this is also true for the Eclipse GAE dev server? There seems to be a noticeable delay between persisting the entity (which is instantly updated with its generated ID) and being able to successfully retrieve it by said ID. EDIT - the query is "SELECT m FROM Message m WHERE Id = " – Ben Owen Apr 08 '14 at 10:43
  • You can't measure consistency or performance based on the dev server. You can only make statements about how the production environment behaves. The dev server tries (as much as possible) to simulate the production environment, but it fundamentally is not the production environment. – Tim Hoffman Apr 08 '14 at 10:45
  • Yes, there is a delay with a query, and your example is a query, not a GET by key. Google only states that GETs and ancestor queries are consistent. – Tim Hoffman Apr 08 '14 at 10:46
  • Not sure the -1 was warranted. – Tim Hoffman Apr 08 '14 at 10:47
  • @TimHoffman I now agree, but I can't remove it unless you edit the answer. Just add a space at the end or something. – Eric Stein Apr 08 '14 at 10:47
  • I would certainly agree that the dev server is by no means the real thing. However, my question was what is the best practice for this situation? Is it simply such an edge case in production that it is safe to ignore the possibility? – Ben Owen Apr 08 '14 at 10:50
  • I believe that if you can't trust the primary behavior, then sending a copy of the object back is probably a worse situation, because it could hide what is in fact going on in the event of an outage. – Tim Hoffman Apr 08 '14 at 10:52
  • If you are really paranoid, you could add a GUID/transaction ID to the entity. Then return the key and the transaction ID to the client, along with whatever response code you choose (I would choose 200 or 201). They can then retrieve the entity with a GET; if the transaction ID doesn't match, then you know the data is out of date. Though it could also have been overwritten in the intervening time. – Tim Hoffman Apr 08 '14 at 10:55
  • Unfortunately, I don't have control over the entity itself. I do agree with the comment about returning the full entity masking a potential problem - so I am leaning towards a 201 Created accompanied by the URI of the new entity. Thanks. – Ben Owen Apr 08 '14 at 11:01
  • That's what I would do, given your situation. – Tim Hoffman Apr 08 '14 at 11:02
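
The GUID/transaction ID idea from the comments above could look roughly like the following sketch. It assumes you can add a token field to the stored entity (which the asker says is not possible in his case); `WriteReceipt`, `StoredEntity`, and `StalenessCheck` are hypothetical names, not part of any library.

```java
import java.util.UUID;

/** What the server returns from the POST: the generated ID plus a per-write token. */
final class WriteReceipt {
    final long id;
    final String writeToken;

    WriteReceipt(long id, String writeToken) {
        this.id = id;
        this.writeToken = writeToken;
    }
}

/** What the client later gets back from a GET (hypothetical shape). */
final class StoredEntity {
    final long id;
    final String writeToken;
    final String payload;

    StoredEntity(long id, String writeToken, String payload) {
        this.id = id;
        this.writeToken = writeToken;
        this.payload = payload;
    }
}

final class StalenessCheck {
    /** Server side: generate a fresh token to store with the entity on each write. */
    static String newWriteToken() {
        return UUID.randomUUID().toString();
    }

    /**
     * Client side: true only if the fetched entity is the one described by the receipt.
     * A missing entity or a different token means the read does not reflect that write.
     */
    static boolean matchesReceipt(WriteReceipt receipt, StoredEntity fetched) {
        return fetched != null
                && fetched.id == receipt.id
                && receipt.writeToken.equals(fetched.writeToken);
    }
}
```

A mismatch does not say why the read differs: the data may be stale, or the entity may have been overwritten by a later write in the meantime.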
It depends on which JSON REST protocol you are using. Always just returning a JSON object is not very RESTful.

You should look at some of these:

To answer your question: I would prefer a format where the resource itself is aware of its URL, so I would use 201 but also return the whole resource.

The easiest way would be to use jsonapi with a convenient URL scheme, so that you can find a resource by URL because you know the ID.

CansasCity
  • Thanks for your answer. Could you please expand on how this helps with the eventual-consistency problem? If the client requests the URI provided in the response, they will still potentially hit an inconsistent node. Also, my question never mentioned JSON :) – Ben Owen Apr 08 '14 at 10:25
  • If you provide the whole object in the response, the client does not have to fetch the resource again, so there should be enough time for your nodes to sync. Also, if syncing takes very long for some reason, you could provide the URL of your current node. – CansasCity Apr 08 '14 at 10:29