searching a boost multi_index container the fastest possible way

Question

The Problem

My task is to analyze two 1-dimensional matrices. The first one is the data matrix, holding around 500 million floats. The second one is the query matrix, holding around 4000 floats.

The goal at the end is to find (sub-)sequences in data and query which are "similar". Matching (sub-)sequences are of equal length but their positions are independently of each other. Needless to say the result must be calculated with lightning speed.

First Try

To achieve this goal I've tried dynamic time warping first but failed as it searches for the whole query in total. Nevertheless, it can be very fast in finding matches like this link shows.

Actual Try

At this try I want to organize the data matrix in two ways. First as a std::multiset<float> and secondly as an std::vector<float>. I've choosen the boost::multi_index container to do that:

typedef multi_index_container <
    float
  , indexed_by <
        ordered_non_unique<identity<float>> // similar to std::multiset 
      , sequenced<>                         // similar to std::vector
    >
> Floats;

The query matrix is a simple one:

std::vector<float>

Actual Idea

In human words I want to do following:

take the n_th element (qn) of the query
search for a set of elements (S_qn) in data
- within a range (e.g. +- 0.001) around qn
- via ordered_non_unique index (because it's fast)
go to elements in data (N_S_qn)
- which are next to the elements of S_qn
- without searching
- but via sequence index (because it's fast)
store the results (see example output)
qn = qn+1
go to step 1.
find (sub-)sequences in results (see explanation)

Example Output

q_pos q_val d_pos d_val
0 0.00221354 416 0.000512153
1 0.000512153 413 0.00480252
1 0.000512153 419 0
1 0.000512153 418 0.00046875
1 0.000512153 417 0.000494792
1 0.000512153 415 0.00221354
2 0.000494792 413 0.00480252
2 0.000494792 419 0
2 0.000494792 418 0.00046875
2 0.000494792 417 0.000494792
2 0.000494792 415 0.00221354
3 0.241994 289 0.151168
3 0.241994 390 0.543963
3 0.241994 374 0.237778
3 0.241994 285 0.0940256
3 0.241994 153 0.136803
4 0.0940256 139 0.0823915
4 0.0940256 135 0.102841
4 0.0940256 239 0.0704384
4 0.0940256 145 0.11661
4 0.0940256 286 0.19742
4 0.0940256 193 0.136753
4 0.0940256 190 0.0665234
4 0.0940256 130 0.111161
4 0.0940256 357 0.156578
5 0.19742 350 0.202847
5 0.19742 287 0.301233
6 0.00221354 416 0.000512153
7 0.000512153 413 0.00480252
7 0.000512153 419 0
7 0.000512153 418 0.00046875
7 0.000512153 417 0.000494792
7 0.000512153 415 0.00221354
8 0.000494792 413 0.00480252
8 0.000494792 419 0
8 0.000494792 418 0.00046875
8 0.000494792 417 0.000494792
8 0.000494792 415 0.00221354

Explanation

The output shows three short subsequences

0 0.00221354 416 0.000512153
1 0.000512153 417 0.000494792
2 0.000494792 418 0.00046875

3 0.241994 285 0.0940256
4 0.0940256 286 0.19742
5 0.19742 287 0.301233

6 0.00221354 416 0.000512153
7 0.000512153 417 0.000494792
8 0.000494792 418 0.00046875

The subsequences differ from the other ones. E.g. The number right to 286 is equal to the number right to 5 AND 286-predecessor(285)==1 which is equal to 4-predecessor(3)==1

Code

#include <iostream>
#include <vector>
#include <deque>
#include <iterator>
#include <boost/multi_index_container.hpp>
#include <boost/multi_index/ordered_index.hpp>
#include <boost/multi_index/sequenced_index.hpp>
#include <boost/multi_index/identity.hpp>

using namespace boost::multi_index;

typedef multi_index_container <
    float
  , indexed_by <
        ordered_non_unique<identity<float>>
      , sequenced<>
    >
> Floats;

Floats data {
    0.0880295,  0.226261,   0.391886,   0.222645,   0.245799,   0.217122,
    0.346469,   0.247485,   0.13814,    0.097832,   0.146678,   0.125512,
    0.144755,   0.222828,   0.223468,   0.0831966,  0.0876042,  0.256921,
    0.127172,   0.266851,   0.144833,   0.120677,   0.0984158,  0.149169,
    0.193381,   0.189253,   0.0868424,  0.0873459,  0.087309,   0.0881337,
    0.0682639,  0.0847938,  0.0902474,  0.0776411,  0.0660243,  0.064056,
    0.180571,   0.0884874,  0.167904,   0.116372,   0.11505,    0.125082,
    0.0733008,  0.0693511,  0.07523,    0.0887326,  0.103388,   0.192474,
    0.146061,   0.193763,   0.148796,   0.126345,   0.310319,   0.292378,
    0.15633,    0.238895,   0.314837,   0.103016,   0.229716,   0.264258,
    0.214961,   0.166955,   0.282539,   0.375046,   0.117567,   0.202693,
    0.159527,   0.106814,   0.114234,   0.109835,   0.125579,   0.113058,
    0.105328,   0.07727,    0.0995747,  0.0735286,  0.217444,   0.12372,
    0.113071,   0.136882,   0.223724,   0.0816884,  0.0726367,  0.0786393,
    0.0903125,  0.14577,    0.104147,   0.0639258,  0.0664323,  0.0600347,
    0.204054,   0.208231,   0.275519,   0.0914431,  0.0875499,  0.075549,
    0.0773872,  0.0961089,  0.200859,   0.156671,   0.182509,   0.194792,
    0.143105,   0.143895,   0.118826,   0.0815885,  0.0691775,  0.0650543,
    0.077322,   0.167313,   0.0872938,  0.118657,   0.0793251,  0.235447,
    0.175475,   0.108485,   0.125725,   0.0840929,  0.133828,   0.136571,
    0.159488,   0.161682,   0.128987,   0.105339,   0.0722765,  0.0852756,
    0.077602,   0.0875152,  0.102541,   0.0948633,  0.111161,   0.114935,
    0.112914,   0.0963824,  0.0935395,  0.102841,   0.0886719,  0.107177,
    0.093316,   0.0823915,  0.0716254,  0.11276,    0.110408,   0.0929405,
    0.0939779,  0.11661,    0.194408,   0.302181,   0.144759,   0.475888,
    0.341254,   0.259182,   0.242298,   0.136803,   0.260516,   0.301554,
    0.282263,   0.193691,   0.319655,   0.265002,   0.174405,   0.213088,
    0.151474,   0.117726,   0.123275,   0.177355,   0.28036,    0.220469,
    0.116304,   0.171237,   0.0682943,  0.0786306,  0.0988672,  0.113257,
    0.0676953,  0.0730534,  0.0844835,  0.0783919,  0.0744401,  0.118225,
    0.0797266,  0.141914,   0.0825282,  0.0807509,  0.104989,   0.0984028,
    0.190592,   0.13076,    0.141476,   0.0946745,  0.0665234,  0.0721311,
    0.0941819,  0.136753,   0.0672287,  0.0794444,  0.0927517,  0.0761372,
    0.0795877,  0.107576,   0.0785113,  0.0746376,  0.078342,   0.0666298,
    0.0685286,  0.0639453,  0.0668533,  0.0527062,  0.0772743,  0.069796,
    0.0708854,  0.0855469,  0.0715712,  0.0567209,  0.0524653,  0.101489,
    0.12847,    0.0838455,  0.180063,   0.153225,   0.0834701,  0.0778407,
    0.0876563,  0.0802648,  0.081122,   0.153535,   0.130744,   0.0985937,
    0.111478,   0.130879,   0.105336,   0.12355,    0.0956402,  0.131895,
    0.0957661,  0.0646072,  0.0774067,  0.0981641,  0.0936827,  0.0704384,
    0.0611675,  0.0550391,  0.0525434,  0.186207,   0.186398,   0.121753,
    0.131758,   0.138077,   0.131426,   0.200039,   0.281137,   0.203429,
    0.178511,   0.25569,    0.155723,   0.148943,   0.1202,     0.10926,
    0.183735,   0.183743,   0.255571,   0.941289,   0.89924,    0.214197,
    0.17352,    0.300163,   0.186087,   0.153168,   0.156487,   0.272346,
    0.112641,   0.330271,   0.177431,   0.22755,    0.247396,   0.297569,
    0.229143,   0.178937,   0.377181,   0.303188,   0.107593,   0.0850846,
    0.0823806,  0.189846,   0.241994,   0.0940256,  0.19742,    0.301233,
    0.241302,   0.151168,   0.387852,   0.152539,   0.0814909,  0.100688,
    0.12049,    0.100516,   0.110681,   0.160803,   0.192363,   0.21661,
    0.126094,   0.463813,   0.422639,   0.71377,    0.394594,   0.190033,
    0.744705,   0.646013,   0.534883,   0.327589,   0.468882,   0.210421,
    0.804282,   0.477276,   0.686155,   0.745506,   0.678203,   0.557526,
    0.585575,   0.142582,   0.194462,   1.37214,    0.838103,   0.895616,
    0.902333,   0.437261,   0.529586,   0.28717,    0.16661,    0.154312,
    0.409946,   0.574219,   0.617726,   0.600436,   0.439466,   0.302841,
    0.315707,   0.188863,   0.116543,   0.155879,   0.201241,   0.245734,
    0.181951,   0.265951,   0.286096,   0.527272,   0.564462,   0.60875,
    0.173657,   0.196421,   0.202847,   0.261918,   0.257372,   0.247153,
    0.175794,   0.125009,   0.0950022,  0.156578,   0.413086,   0.401313,
    0.171013,   0.164028,   0.116259,   0.134299,   0.158809,   0.135697,
    0.161725,   0.100851,   0.126265,   0.114529,   0.490764,   0.474581,
    0.27171,    0.241534,   0.237778,   0.205924,   0.263615,   0.378711,
    0.237318,   0.522326,   0.604104,   0.578088,   0.318513,   0.418655,
    0.467813,   0.584076,   0.258585,   0.56207,    0.239733,   0.241463,
    0.543963,   0.595384,   0.470145,   0.415202,   0.341942,   0.593281,
    0.370286,   0.250712,   0.209967,   0.320931,   0.476341,   0.239861,
    0.266044,   0.140451,   0.359811,   0.283312,   0.330161,   0.205534,
    0.273644,   1.87506,    0.0116385,  0.0199392,  0.000299479,0.00480252,
    0.000601128,0.00221354, 0.000512153,0.000494792,0.00046875
};

std::vector<float> query {
    0.00221354, 0.000512153,    0.000494792,    0.241994,   0.0940256,
    0.19742,    0.00221354,     0.000512153,    0.000494792
};

int main() {
    Floats::nth_index<0>::type::iterator lower;
    Floats::nth_index<0>::type::iterator upper;
    Floats::nth_index<1>::type::iterator next;
    std::cout << "q_pos q_val d_pos d_val";
    for (auto i=query.begin(),j=query.end();i<j;++i) {
        lower=data.get<0>().lower_bound(*i-0.001);
        upper=data.get<0>().lower_bound(*i+0.001);
        while (lower!=upper) {
            next=++(data.project<1>(lower++));
            std::cout
                << "\n" << std::distance(query.begin(),i)
                << " " << *i
                << " " << std::distance(get<1>(data).begin(),next)
                << " " << *next
            ;
        }
    }
    // TODO find an intelligent way through output to detect sequences
}

Question

Is the show implementation the fastest way to achieve my goal??

Joaquín M López Muñoz · Accepted Answer · 2015-11-26T08:13:45.773

I don't quite get the overall subsequence matching strategy you follow, but focusing on the code alone there are two things you can do to improve efficiency:

Having a sequenced index means that the expression
```
std::distance(get<1>(data).begin(),next)
```
runs in linear time. Define this index as random_access instead
```
typedef multi_index_container <
    float
  , indexed_by <
        ordered_non_unique<identity<float>>
      , random_access<>
    >
> Floats;
```
and the same expression will execute in constant time. I expect the resulting speedup wil be huge.
The code
```
lower=data.get<0>().lower_bound(*i-0.001);
upper=data.get<0>().lower_bound(*i+0.001);
```
looks inside the index twice; you can do it slightly more efficiently by bundling the two ops into one:
```
std::tie(lower,upper)=data.get<0>().range(
    [lo=*i-0.001](float x){return lo<=x;},
    [hi=*i+0.001](float x){return x<=hi;}
);
```
range coalesces the two lookups into one to the maximum extent possible (the internal rb-tree is walked down just once till the point where the two lookup paths diverge), which I expect will yield a modest improvement in performance (of course, you can only know for sure by measuring). The named captures [lo=*i-0.001] and [hi=*i+0.001] are only supported in C++14, if you just have C++11 you'll need to precalculate these values outside the lambda functions. BTW, the interval you get is not exactly symmetrical: to make it so, you should do upper=data.get<0>().upper_bound(*i+0.001).