I'm tuning an app we run on App Engine, and one of the largest costs is datastore reads and writes. One of the biggest sources of writes turns out to be persisting an order.
The basic model is that an Order has many Items. We store the two as separate entities and relate them like this:
@PersistenceCapable
public class Order implements Serializable {
    @Persistent(mappedBy="order")
    @Element(dependent = "true")
    private List<Item> orderItems;
    // other fields too obviously
}
@PersistenceCapable
public class Item implements Serializable {
    @Persistent(dependent = "true")
    @JsonIgnore
    private Order order;
    // more fields...
}
AppStats shows two datastore puts for an order with a single item, but both use a massive number of writes. I'd like to hear the best way to optimize this from anyone who has experience with it.
AppStats data:
real=34ms api=1695ms cost=6400 billed_ops=[DATASTORE_WRITE:64]
real=42ms api=995ms cost=3600 billed_ops=[DATASTORE_WRITE:36]
Some of the areas I know of that would probably help:
- Fewer indexes - there are implicit indexes on a number of order and item properties that I could tell App Engine not to index; for example, item.quantity is not something I need to query by. But is that what all these writes are for?
- De-relate item and order, so that I just have a single entity, OrderItem, removing the need for a relationship at all (but paying for it with extra storage).
- In terms of explicit indexes, I only have one on the order table, by order date, and one on the order items, by SKU/date, plus the implicit one for the relationship.
- If the items were a collection, not a list, would that remove the need for an index on the children's _IDX property entirely?
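To sanity-check the index theory, here is a rough sketch of the arithmetic. It assumes the billing formula as I understand it from the docs (my assumption, not something AppStats reports): a new entity put costs 2 writes, plus 2 per indexed property value, plus 1 per composite index value. The class and method names are made up for illustration, and I'm ignoring composite indexes when working backward.

```java
public class WriteOpEstimate {

    // ASSUMPTION: billed writes for a brand-new entity put are
    // 2 + 2 per indexed property value + 1 per composite index value.
    static int newEntityPutWrites(int indexedValues, int compositeValues) {
        return 2 + 2 * indexedValues + compositeValues;
    }

    // Work backward from a billed write count to the number of indexed
    // property values that would explain it (rough: integer division).
    static int impliedIndexedValues(int billedWrites, int compositeValues) {
        return (billedWrites - 2 - compositeValues) / 2;
    }

    // Writes saved per put if some properties were marked unindexed,
    // under the same assumed formula.
    static int writesSavedByUnindexing(int propertiesUnindexed) {
        return 2 * propertiesUnindexed;
    }

    public static void main(String[] args) {
        // The two puts from the AppStats output above.
        System.out.println(impliedIndexedValues(64, 0)); // prints 31
        System.out.println(impliedIndexedValues(36, 0)); // prints 17
        // Hypothetical: unindexing 10 properties on one entity.
        System.out.println(writesSavedByUnindexing(10)); // prints 20
    }
}
```

If that formula is right, the 64 and 36 writes would correspond to roughly 31 and 17 indexed property values, which would suggest that nearly every field on both entities is being indexed.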
So, my question would be, are any of the above items going to herald big wins, or are there other options I've missed that would be better to focus on initially?
Bonus points: Is there a good 'guide to less datastore writes' article somewhere?