
I am trying to find a way to serialize data with a directory structure in Java. The records I want to save are stock trades, each consisting of the stock name, price, time, and volume. I know how to serialize such records (instances of a "Trade" class) without any hierarchical structure, but then I would have to scan all of the records, possibly billions, to collect the data for a single stock when reading. So I would like the data to be partitioned by stock name, which should make reads much faster when I need data for only a few stocks.

I know you can create such hierarchical structures (directories) in HDF5, but I'm looking for a Java serialization library that does not use JNI and is more HDFS-friendly. After searching online, I found that Kryo is one of the newest and easiest-to-use Java serialization libraries, so I am hoping there is some way to build a directory structure in Kryo files. Other modern serialization libraries such as Avro or Thrift would work too.

Thank you for your help.

tksmrch

1 Answer


It is probably too late, but in case you still need it, take a look at the dfs-datastores library developed by Nathan Marz. It lets you define your own data storage in terms of a folder structure. Here is the link: https://github.com/nathanmarz/dfs-datastores/tree/develop/dfs-datastores/src
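To illustrate the folder-per-stock idea independently of any library, here is a minimal sketch in plain Java. It does not use the dfs-datastores API. The class name PartitionedTradeStore, the "symbol=<name>" directory naming, the "trades.bin" file name, and the fields of Trade are all assumptions made up for this example, and it writes to the local file system through java.nio rather than to HDFS. It also assumes stock names are safe to use as directory names.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical store that keeps one directory per stock name under a root folder. */
public class PartitionedTradeStore {

    /** Stand-in for the "Trade" class from the question; the field types are assumptions. */
    public static final class Trade {
        public final String symbol;
        public final double price;
        public final long timeMillis;
        public final long volume;

        public Trade(String symbol, double price, long timeMillis, long volume) {
            this.symbol = symbol;
            this.price = price;
            this.timeMillis = timeMillis;
            this.volume = volume;
        }
    }

    private final Path root;

    public PartitionedTradeStore(Path root) {
        this.root = root;
    }

    /** Maps a stock name to its partition file, e.g. root/symbol=AAPL/trades.bin. */
    private Path fileFor(String symbol) {
        return root.resolve("symbol=" + symbol).resolve("trades.bin");
    }

    /** Appends one trade to its stock's partition. The symbol is implied by the directory. */
    public void append(Trade t) throws IOException {
        Path file = fileFor(t.symbol);
        Files.createDirectories(file.getParent());
        try (DataOutputStream out = new DataOutputStream(Files.newOutputStream(
                file, StandardOpenOption.CREATE, StandardOpenOption.APPEND))) {
            out.writeDouble(t.price);
            out.writeLong(t.timeMillis);
            out.writeLong(t.volume);
        }
    }

    /** Reads only the partition for one stock; other stocks' files are never opened. */
    public List<Trade> read(String symbol) throws IOException {
        List<Trade> trades = new ArrayList<>();
        Path file = fileFor(symbol);
        if (!Files.exists(file)) {
            return trades;
        }
        try (DataInputStream in = new DataInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            while (true) {
                double price;
                try {
                    price = in.readDouble();
                } catch (EOFException endOfFile) {
                    break; // no more records in this partition
                }
                trades.add(new Trade(symbol, price, in.readLong(), in.readLong()));
            }
        }
        return trades;
    }
}
```

Reopening the file on every append keeps the sketch short but would be slow for billions of records, so a real writer would probably keep one open stream per stock and write in batches. The fixed-width record encoding could be swapped for Kryo, Avro, or Thrift serialization of each record, and on HDFS the same layout would go through Hadoop's FileSystem API instead of java.nio.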

supernovae