
I have a BinData field in a MongoDB collection and I need to run a find on it using only partial information.

Let's say that the bindata that I have looks like this:

{ "_id" : ObjectId("5480356518e91efd34e9b5f9"), "test" : BinData(0,"dGVzdA==") }

If I do this query I get the result:

> db.test.find({"test" : BinData(0,"dGVzdA==")})
{ "_id" : ObjectId("5480356518e91efd34e9b5f9"), "test" : BinData(0,"dGVzdA==") }

However, I would like to find it using only part of the binary object.
Is that possible?

Thanks!

Kuu
  • What object are you storing as BinData? If they are files, you could use GridFS and query on its metadata. – BatScream Dec 04 '14 at 18:54
  • It's a byte[] and I want to query by its content, not only the metadata – Kuu Dec 05 '14 at 06:51
  • As of 2.6, MongoDB doesn't have operators to support searching part of a piece of BinData, unfortunately. I couldn't find a feature request for such a thing - if you like, you can make one in the [MongoDB JIRA](https://jira.mongodb.org/browse/SERVER). As mnemosyn says, I think doing such a thing is more complicated than you might naively believe. – wdberkeley Dec 05 '14 at 16:44
  • Thank you @wdberkeley. Yeah, I know it's hard, that's why I asked :) Thanks for the info ;) – Kuu Dec 05 '14 at 20:35
  • Well, there's a request for bitwise query operators (https://jira.mongodb.org/browse/SERVER-3518) and support for `$bit` queries (https://jira.mongodb.org/browse/SERVER-3281) with the following comment by a Pieter Willem Jordaan: 'This should also be applied to do queries over a section of binary data with an offset and length query'. However, the ticket is from June 2011... – mnemosyn Dec 09 '14 at 13:33

1 Answer


"partial" is a vague term - if you're searching for a contiguous block of binary data (needle) at any point in the haystack, you're going to need a very different solution I think, maybe something based on a suffix tree / suffix array for binary data.

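To make the "needle anywhere in the haystack" case concrete, here is a minimal Node.js sketch of the naive fallback: fetch candidate documents and scan each one client-side. This is not the suffix tree / suffix array approach mentioned above, only a linear scan, so every candidate document has to be read. The `containsBytes` helper and the sample `docs` array are hypothetical; `Buffer.indexOf` is the standard Node.js call.

```javascript
// Hypothetical helper: true when `needle` occurs as a contiguous run of
// bytes anywhere inside `haystack`. Both arguments may be Buffers or
// plain arrays of byte values.
function containsBytes(haystack, needle) {
  return Buffer.from(haystack).indexOf(Buffer.from(needle)) !== -1;
}

// Stand-ins for documents already fetched from the collection.
// Buffer.from('test') holds the same bytes as BinData(0,"dGVzdA==").
const docs = [
  { _id: 1, test: Buffer.from('test') },
  { _id: 2, test: Buffer.from([0xab, 0xcd, 0xef]) },
];

// Keep only the documents whose binary field contains the bytes of "es".
const hits = docs.filter((d) => containsBytes(d.test, Buffer.from('es')));
console.log(hits.map((d) => d._id));
```

The scan runs in the application, not in MongoDB, so it only makes sense when another indexed field narrows the candidate set first.
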
If you want to find binary data that starts with specific bytes, you might want to consider storing the data as hex- or base64-encoded strings and using a rooted regex (one anchored with `^`) so that the query can use an index. But that is fraught with its own perils (padding, endianness, etc.) and incredibly ugly...
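
A rough sketch of the hex variant, assuming the bytes are stored a second time as a lowercase hex string in a hypothetical field called `testHex` (the field name and both helper functions are made up for illustration). With hex, every byte maps to exactly two characters, so a byte prefix is always a string prefix, which is not generally true of base64:

```javascript
// Hypothetical helper: encode bytes as a lowercase hex string.
function toHex(bytes) {
  return Buffer.from(bytes).toString('hex');
}

// Hypothetical helper: build a filter document with a rooted regex on
// `field`. Hex output only contains [0-9a-f], so the prefix needs no
// regex escaping.
function prefixFilter(field, prefixBytes) {
  const filter = {};
  filter[field] = { $regex: '^' + toHex(prefixBytes) };
  return filter;
}

// The bytes "ab cd ef" stored as hex, and a filter for the prefix "ab cd".
const stored = toHex([0xab, 0xcd, 0xef]);
const filter = prefixFilter('testHex', [0xab, 0xcd]);

// Check locally that the rooted pattern matches the stored string.
const matches = new RegExp(filter.testHex.$regex).test(stored);
console.log(JSON.stringify(filter), matches);
```

The resulting filter could then be passed to something like `db.test.find(filter)`, ideally with an index on `testHex`. The cost is that the hex copy takes twice as many bytes as the original data.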

Isn't there a way to store the binary data in a way that MongoDB understands it? That might be easier...

mnemosyn
  • Thanks for your answer. Yes, I'm looking for some way to make MongoDB understand the binary so I can run searches on it the way you can with text, for instance. Imagine that I have the bytes "ab cd ef" in a binary and I want to find the binaries that contain the byte "cd". I also thought of the possibility of representing it as hex but yeah, it's ugly and not very efficient in terms of space consumed. – Kuu Dec 04 '14 at 14:14