
I'm already familiar with bitmasks and with the bitwise operators used to modify them, but I have one specific problem that I can't solve:

I have a large number of fairly large bitmasks (around 10,000,000, each 256 bits long). Building an SQL index that lets me look up a specific one in O(log n) time is simple enough. What I need, however, is to match a given 256-bit query against the entire dataset and find the N (variable) items that are "least different" from the query, where "least different" means the number of non-matching bits is minimal. For example, if the database contains {0110, 1101, 0000, 1110}, then the closest matches to 0100 are 0110 and 0000, each differing from it by one bit.
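To make the "least different" criterion concrete, here is a minimal Python sketch of it. The number of non-matching bits is the Hamming distance, i.e. the popcount of the XOR of the two masks. The function names `hamming` and `closest` are made up for illustration, and the masks are held as plain Python integers rather than database rows. This is only the linear-scan baseline, not the sub-linear search I'm after.

```python
import heapq

def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

def closest(masks, query, n):
    """Return the n masks with the fewest differing bits (linear scan, O(len(masks)))."""
    return heapq.nsmallest(n, masks, key=lambda m: hamming(m, query))

# The 4-bit example from the question; real masks would be 256 bits wide.
masks = [0b0110, 0b1101, 0b0000, 0b1110]
print([format(m, "04b") for m in closest(masks, 0b0100, 2)])
# prints ['0110', '0000']
```

Python integers are arbitrary precision, so the same functions work unchanged for 256-bit values.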

Given the number of entries, a linear search would be very inefficient, and I believe that is what would happen if I used aggregate operators. I'm looking for a way to speed up the search but haven't found one so far. Any ideas would be highly appreciated.

Arshia001
  • This is _highly_ DBMS dependent. So which DBMS are you using? Postgres? Oracle? –  Oct 05 '14 at 11:56
  • I'm in the planning phase, and I have yet to decide on one. I'm hoping to use MSSQL, since other components of the system already use it; however, I could use another DBMS for this part if need be, seeing as it's (almost) completely independent of the rest of the system. – Arshia001 Oct 06 '14 at 12:20

0 Answers