So, as you are aware, GemFire (and by extension, Apache Geode) stores JSON in PDX format (as a PdxInstance). This is so GemFire can interoperate with many different language-based clients (native C++/C#, web-oriented (JavaScript, Pyhton, Ruby, etc) using the Developer REST API, in addition to Java) and also to be able to use OQL to query the JSON data.
After a bit of experimentation, I am surprised GemFire is not behaving as I would expect. I created an example, self-contained test class (i.e. no Spring XD, of course) that simulates your use case... essentially storing JSON data in GemFire as PDX and then attempting to read the data back out as the Order application domain object type using the Repository abstraction, logical enough.
Given the use of the Repository abstraction and implementation from Spring Data GemFire, the infrastructure will attempt to access the application domain object based on the Repository generic type parameter (in this case "Order" from the "OrderRepository" definition).
However, the data is stored in PDX, so now what?
No matter, Spring Data GemFire provides the MappingPdxSerializer class to convert PDX instances back to application domain objects using the same "mapping meta-data" that the Repository infrastructure uses. Cool, so I plug that in...
@Bean
public CacheFactoryBean gemfireCache(Properties gemfireProperties) {
CacheFactoryBean cacheFactoryBean = new CacheFactoryBean();
cacheFactoryBean.setProperties(gemfireProperties);
cacheFactoryBean.setPdxSerializer(new MappingPdxSerializer());
cacheFactoryBean.setPdxReadSerialized(false);
return cacheFactoryBean;
}
You will also notice, I set the PDX 'read-serialized' property (cacheFactoryBean.setPdxReadSerialized(false);
) to false in order to ensure data access operations return the domain object and not the PDX instance.
However, this had no affect on the query method. In fact, it had no affect on the following operations either...
orderRepository.findOne(amazonOrder.getTransactionId());
ordersRegion.get(amazonOrder.getTransactionId());
Both calls returned a PdxInstance. Note, the implementation of OrderRepository.findOne(..)
is based on SimpleGemfireRepository.findOne(key), which uses GemfireTemplate.get(key), which just performs Region.get(key)
, and so is effectively the same as (ordersRegion.get(amazonOrder.getTransactionId();
). The outcome should not be, especially with Region.get()
and read-serialized set to false.
With the OQL query (SELECT * FROM /Orders WHERE transactionId = $1
) generated from the findByTransactionId(String id)
, the Repository infrastructure has a bit less control over what the GemFire query engine will return based on what the caller (OrderRepository) expects (based on the generic type parameter), so running OQL statements could potentially behave differently than direct Region access using get
.
Next, I went onto try modifying the Order
type to implement PdxSerializable, to handle the conversion during data access operations (direct Region access with get, OQL, or otherwise). This had no affect.
So, I tried to implement a custom PdxSerializer for Order
objects. This had no affect either.
The only thing I can conclude at this point is something is getting lost in translation between Order -> JSON -> PDX
and then from PDX -> Order
. Seemingly, GemFire needs additional type meta-data required by PDX (something like @JsonTypeInfo(use = JsonTypeInfo.Id.CLASS, include = JsonTypeInfo.As.PROPERTY, property = "@type")
in the JSON data that PDXFormatter recognizes, though I am not certain it does.
Note, in my test class, I used Jackson's ObjectMapper
to serialize the Order
to JSON and then GemFire's JSONFormatter to serialize the JSON to PDX, which I suspect Spring XD is doing similarly under-the-hood. In fact, Spring XD uses Spring Data GemFire and is most likely using the JSON Region Auto Proxy support. That is exactly what SDG's JSONRegionAdvice object does (see here).
Anyway, I have an inquiry out to the rest of the GemFire engineering team. There are also things that could be done in Spring Data GemFire to ensure the PDX data is converted, such as making use of the MappingPdxSerializer
directly to convert the data automatically on behalf of the caller if the data is indeed of type PdxInstance
. Similar to how JSON Region Auto Proxying works, you could write AOP interceptor for the Orders Region to automagicaly convert PDX to an Order
.
Though, I don't think any of this should be necessary as GemFire should be doing the right thing in this case. Sorry I don't have a better answer right now. Let's see what I find out.
Cheers and stay tuned!
See subsequent post for test code.