I have a typical scenario where a consumer is calling a Azure Function (EP1) (synchronously) which then queries Azure Table storage (having 5 million records), based upon the input parameters of the Azure Function API. Azure Table Storage has following columns:
- Order Number (incremental number)
- IsConfirmed (can have value Y or N)
- Type of Order (can be of 6 types maximum)
- Order Date
- Order Details
- UUID
Now when consumer queries, it generally searches with the Order Number and expects the Order Date and Order Details in response, along with Order Number.
For this, we had chosen:
- Partition Key: IsConfirmed + Type of Order
- Row Key: UUID
Now for 5 million records search, because of the partition key type, the search partition often runs into more than 3 million records (maximum orders have IsConfirmed as Y and Type of Order a specific one among the six types) and the Table query takes more than 5 minutes. As a result, the consumer generally times out as the wait configured on consumer side is 60 secs.
So looking for recommendation on how to do this efficiently.
- Can we choose partition key as Order Number (but that will create 5 million partitions) or a combination of Order NUmber+IsConfirmed+TypeofOrder?
- Ours is a write heavy Java application and READ happens much less.
+++++++++++ UPDATE +++++++++++++++
As suggested by Gaurav in the answer, after making orderid as partition key, the query is working as expected.
Now that brings to the next problem - we do have other API queries where the order data and type are only used as input search criteria.
Since this doesn't match with the partition key, so in this 2nd type of query, its basically making a whole scan and the consumer is again timed out again.
So what should be the design to handle these types of queries.. Azure doc says creating a separate table where order type + order date becomes partition key. However that will mean that whenever we are writing to the table, we will have to write on both tables (one with orderid as part key and other as order date + type as part key).