I need to compare two tables in Cassandra to find the differences between them. The requirement is as follows: we perform an inventory count in which every item in stock is entered or scanned, and once the count is finished we compare the scanned items against the master inventory table to get the variance. I have created a temporary table in Cassandra into which I insert one record for each scan.
**TempInventory**
userId
storeId
skuId
PK(storeId, skuId)
I also have a master table with additional details:
**Inventory**
storeId
skuId
skuDesc
..
..
PK(storeId)
Once scanning is complete, on submit I have to compare TempInventory with the Inventory table to get the differences. Since Cassandra does not support joins, what is the best way to do this? The options I am considering:
- Load everything into a collection of objects in Java and compare there (using Java 8 features for better performance). The Inventory table may contain more than 3000 rows, so is it acceptable to load everything into the JVM?
- Use Spark SQL with Cassandra, which allows joins. Spark is new to me, so I do not have a good idea of how to do this; links to examples would be helpful.
- Is there any other utility available (e.g. from Apache)?
- I am also using GemFire, but I think we cannot create a region in GemFire with a composite key. Please correct me if I am wrong.
Please suggest which approach is most suitable.
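
To make the first option concrete, here is a minimal sketch of the in-memory comparison I have in mind. It assumes the skuId values for a single store have already been read from both tables into Java sets and that skuId is a string. The Cassandra reads are left out, and the class and method names are made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper; the names are illustrative and not part of any library.
// Assumes skuId is a String and that both sets hold SKUs for the same storeId.
class InventoryVariance {

    // SKUs present in the master Inventory table but never scanned.
    static Set<String> missingFromScan(Set<String> masterSkus, Set<String> scannedSkus) {
        Set<String> diff = new TreeSet<>(masterSkus); // copy so the input is not modified
        diff.removeAll(scannedSkus);
        return diff;
    }

    // SKUs that were scanned but do not exist in the master Inventory table.
    static Set<String> unexpectedInScan(Set<String> masterSkus, Set<String> scannedSkus) {
        Set<String> diff = new TreeSet<>(scannedSkus); // copy so the input is not modified
        diff.removeAll(masterSkus);
        return diff;
    }
}
```

For example, with master SKUs A, B, C and scanned SKUs B, C, D, `missingFromScan` would contain only A and `unexpectedInScan` would contain only D. The two results together would be the variance for that store.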