Riak supports a rich query language, including term searches and field queries. Additionally, Riak indexes JSON documents as key/value pairs, so their fields can be queried.

I'm storing some objects in Riak through Riak CS, which exposes an implementation of the S3 API and is available within the Cloud Foundry marketplace. The documentation mentions:

On write, Riak CS breaks large objects into blocks. Riak CS distributes data across physical machines using consistent hashing and replicates objects a default of 3 times in the underlying Riak storage system. A manifest is maintained for each object that points to which blocks comprise the object. The manifest is used to retrieve all blocks and present them to a client on read.

I'm wondering whether there is a way to query Riak for objects stored through the Riak CS S3 API, and so use the powerful Riak query language on them.

Is there a size threshold above which Riak CS breaks objects into multiple blocks as described above, making such querying (including JSON parsing) unavailable for large CS objects while still available for small ones?

Guillaume Berche

1 Answer

I believe the bucket and key that Riak CS chooses when storing a block of data in Riak are based on a hash of the S3 bucket and a UUID. The first trick would be enabling search on the right Riak bucket; you would probably have to spelunk through the source to find the bucket name. A search result is the bucket/key that contained the match, so you would probably need to store the document's name within the document itself to be able to get from the UUID back to a document name.
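As a minimal sketch of that last point, assuming the documents are JSON, the S3 location can be written into the document before it is uploaded. The function name `embed_object_name` and the field names `_s3_bucket` and `_s3_key` are made up for illustration; nothing in Riak or Riak CS requires them.

```python
import json


def embed_object_name(s3_bucket: str, s3_key: str, document: dict) -> bytes:
    """Return the JSON body to upload, with the S3 location stored inside it.

    "_s3_bucket" and "_s3_key" are hypothetical field names chosen for this
    sketch; any names that do not clash with the document's own fields work.
    """
    enriched = dict(document)  # shallow copy, so the caller's dict is untouched
    enriched["_s3_bucket"] = s3_bucket
    enriched["_s3_key"] = s3_key
    return json.dumps(enriched, sort_keys=True).encode("utf-8")


# Hypothetical bucket, key and document, used only to show the call.
original = {"customer": "acme", "total": 42}
body = embed_object_name("invoices", "2014/03/inv-001.json", original)
```

The resulting `body` is what would be sent through the S3 API. If a search hit then returns an internal bucket/key, the matched content itself says which S3 object it came from.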

The size threshold seems to be 1 MB, but there may be a configuration setting for that.
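To illustrate why that threshold matters for JSON, here is a rough sketch of block splitting with a manifest. It assumes a 1 MB block size and invents its own manifest layout and block keys (`split_into_blocks`, `block_keys`, and so on are hypothetical); it is not Riak CS's actual storage format.

```python
import json
import uuid

BLOCK_SIZE = 1024 * 1024  # assumed 1 MB block size, not read from any config


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> dict:
    """Split data into fixed-size blocks and build a manifest listing them.

    The manifest layout and the (uuid, sequence) block keys are made up for
    this sketch.
    """
    object_id = str(uuid.uuid4())
    blocks = {}
    for seq, offset in enumerate(range(0, len(data), block_size)):
        blocks[(object_id, seq)] = data[offset:offset + block_size]
    manifest = {"uuid": object_id, "size": len(data), "block_keys": list(blocks)}
    return {"manifest": manifest, "blocks": blocks}


def is_valid_json(chunk: bytes) -> bool:
    """Report whether a single stored block parses as JSON on its own."""
    try:
        json.loads(chunk.decode("utf-8"))
        return True
    except (UnicodeDecodeError, ValueError):
        return False


small = json.dumps({"name": "short"}).encode("utf-8")
large = json.dumps({"payload": "x" * (2 * BLOCK_SIZE)}).encode("utf-8")

small_stored = split_into_blocks(small)
large_stored = split_into_blocks(large)
```

In this sketch the small document fits in one block that is still valid JSON, while the large one is cut into several blocks, none of which parses on its own. Whether a single-block object can actually be indexed by Riak search in a Riak CS deployment is a separate question this does not settle.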

Joe