Data structure for selecting groups of machines

Question

I have this old batch system. The scheduler stores all computational nodes in one big array. Now that's OK for the most part, because most queries can be solved by filtering for nodes that satisfy the query.

The problem I have now is that apart from some basic properties (number of cpus, memory, OS), there are also these weird grouping properties (city, infiniband, network scratch).

Now the issue with these is that when a user requests nodes with infiniband I can't just give him any nodes, but I have to give him nodes connected to one infiniband switch, so the nodes can actually communicate using infiniband.

This is still OK, when user only requests one such property (I can just partition the array for each of the properties and then try to select the nodes in each partition separately).

The problem comes with combining multiple such properties, because then I would have to generate all combination of the subsets (partitions of the main array).

The good thing is that most of the properties are in a sub-set or equivalence relation (It sort of makes sense for machines on one infiniband switch to be in one city). But this unfortunately isn't strictly true.

Is there some good data structure for storing this kind of semi-hierarchical mostly-tree-like thing?

EDIT: example

node1 : city=city1, infiniband=switch03, networkfs=server01
node2 : city=city1, infiniband=switch03, networkfs=server01
node3 : city=city1, infiniband=switch03
node4 : city=city1, infiniband=switch03
node5 : city=city2, infiniband=switch03, networkfs=server02
node6 : city=city2, infiniband=switch03, networkfs=server02
node7 : city=city2, infiniband=switch04, networkfs=server02
node8 : city=city2, infiniband=switch04, networkfs=server02

Users request:

2x node with infiniband and networkfs

The desired output would be: (node1, node2) or (node5,node6) or (node7,node8).

In a good situation this example wouldn't happen, but we actually have these weird cross-site connections in some cases. If the nodes in city2 would be all on infiniband switch04, it would be easy. Unfortunately now I have to generate groups of nodes, that have the same infiniband switch and same network filesystem.

In reality the problem is much more complicated, since users don't request entire nodes, and the properties are many.

Edit: added the desired output for the query.

Perhaps if you give a more concrete example set, it will be easier to describe your problem — Peaches491, Jun 26 '12 at 12:37
And you can't use an in-memory DB which already handles this kind of complexity for you? I guess you already looked at the multi_index container from boost and decided not to roll something from that? — Nim, Jun 26 '12 at 12:51
@Nim Well, if there wouldn't be any other possibility, then maybe. But I would much prefer some data structure that I can completely integrate into the system. I wouldn't like to pull something that could limit me in the future. — Šimon Tóth, Jun 26 '12 at 12:54
So, as a response to the sample query above, what would you want your answer to look like? A set of all possible machine combos? `{(node1, node2), (node5, node6), (node7, node8)}` — Peaches491, Jun 26 '12 at 12:59
@Peaches491 800 nodes right now, but I need to design it for thousands, the number of grouping properties is 5-6 right now, but again, that can grow significantly. The real problem is much more complicated, but that shouldn't invalidate the data structure if chosen properly. — Šimon Tóth, Jun 26 '12 at 13:10
and what about XML with X-path queries? This is not really a 'data-structure', but, wouldn't it solve the problem? — PermanentGuest, Jun 26 '12 at 13:17
When you say "with infiniband and networkfs" you mean a set of computers which are on the same IB and netfs networks? Or that just have that property in general? — Peaches491, Jun 26 '12 at 13:18
@Peaches491 Yeah, just like the output you posted. The only difference is that I would be in most cases only interested in the first match (not all matches). — Šimon Tóth, Jun 26 '12 at 13:25
@PermanentGuest I don't think that it would provide enough performance. I have heavily oversimplified the problem in this post. — Šimon Tóth, Jun 26 '12 at 13:30
What percentage of your cases are degenerate? if it is very small, it might change the kinds of solutions available to you. For instance, imagine if there was only a handful of cases where the properties were not bound together by a well defined rule. You could just solve the general case and then brute force the exceptions. — frankc, Jun 26 '12 at 14:31

score 3 · Accepted Answer · answered Jun 26 '12 at 13:40

Assuming you have p grouping properties and n machines, a bucket-based solution is the easiest to set up and provides O(2^p·log(n)) access and updates.

You create a bucket-heap for every group of properties (so you would have a bucket-heap for "infiniband", a bucket-heap for "networkfs" and a bucket-heap for "infiniband × networkfs") — this means 2^p bucket-heaps.
Each bucket-heap contains a bucket for every combination of values (so the "infiniband" bucket would contain a bucket for key "switch04" and one for key "switch03") — this means a total of at most n·2^p buckets split across all bucket-heaps.
Each bucket is a list of servers (possibly partitioned into available and unavailable). The bucket-heap is a standard heap (see std::make_heap) where the value of each bucket is the number of available servers in that bucket.
Each server stores references to all buckets that contain it.
When you look for servers that match a certain group of properties, you just look in the corresponding bucket for that property group, and climb down the heap looking for a bucket that's large enough to accomodate the number of servers requested. This takes O(log(p)·log(n)).
When servers are marked as available or unavailable, you have to update all buckets containing those servers, and then update the bucket-heaps containing those buckets. This is an O(2^p·log(n)) operation.

If you find yourself having too many properties (and the 2^p grows out of control), the algorithm allows for some bucket-heaps to be built on-demand from other bucket-heaps : if the user requests "infiniband × networkfs" but you only have a bucket-heap available for "infiniband" or "networkfs", you can turn each bucket in the "infiniband" bucket-heap into a bucket-heap on its own (use a lazy algorithm so you don't have to process all buckets if the first one works) and use a lazy heap-merging algorithm to find an appropriate bucket. You can then use a LRU cache to decide which property groups are stored and which are built on-demand.

Yeah, this looks reasonable. I was thinking about something similar as well, but the problem was the future growth of the amount of properties (I'm on 4-6 already). — Šimon Tóth, Jun 26 '12 at 13:49
The lazy construction, possibly combined with an LRU, might be acceptable even with 20+ properties, if some groups are used far more often than others. — Victor Nicollet, Jun 26 '12 at 14:02
+1. But here's my next complaint :-P When you say that updating after a server becomes available/unavailable is O(2^p·log(n)): I know there are O(log n) (and even O(1)) decrease-key operations for heaps, but don't you also need an O(log n) or better increase-key operation to keep this time complexity? TTBOMK no heaps support both operations this fast -- but I'd love to be proved wrong! — j_random_hacker, Jun 26 '12 at 16:11

score 0 · Answer 2 · answered Jun 26 '12 at 13:35

0

My guess is that there won't be an "easy, efficient" algorithm and data structure to solve this problem, because what you're doing is akin to solving a set of simultaneous equations. Suppose there are 10 categories (like city, infiniband and network) in total, and the user specifies required values for 3 of them. The user asks for 5 nodes, let's say. Your task is then to infer values for the remaining 7 categories, such that at least 5 records exist that have all 10 category fields equal to these values (the 3 specified and the 7 inferred). There may be multiple solutions.

Still, provided there aren't too many different categories, and not too many distinct possibilities within each category, you can do a simple brute force recursive search to find possible solutions, where at each level of recursion you consider a particular category, and "try" each possibility for it. Suppose the user asks for k records, and may choose to stipulate any number of requirements via required_city, required_infiniband, etc.:

either(x, y) := if defined(x) then [x] else y

For each city c in either(required_city, [city1, city2]):
  For each infiniband i in either(required_infiniband, [switch03, switch04]):
    For each networkfs nfs in either(required_nfs, [undefined, server01, server02]):
      Do at least k records of type [c, i, nfs] exist?  If so, return them.

The either() function is just a way of limiting the search to the subspace containing points that the user gave constraints for.

Based on this, you will need a way to quickly look up the number of points (rows) for any given [c, i, nfs] combination -- nested hashtables will work just fine for this.

answered Jun 26 '12 at 13:35

j_random_hacker

50,331
10
105
169

Yeah, well, that's exactly what I can't do. I can't brute force due to performance issues. Plus I'm not expecting and easy solution :-D – Šimon Tóth Jun 26 '12 at 13:39
The total number of computers isn't important at all -- all that matters for execution time is the product of the number of possibilities in each category. So how many categories, and how many possibilities in each? – j_random_hacker Jun 26 '12 at 13:41
My issue is actually the number of queries and future scaling (the number of categories is going to grow with time). – Šimon Tóth Jun 26 '12 at 13:43
OK, but what sort of numbers are we talking for the product of numbers of possibilities? < 1,000,000 will take ~1ms :) – j_random_hacker Jun 26 '12 at 13:48
Yeah well, I don't have a return on the bottom. There is complex matching involved. The O(1) in this case can reach upto 1s. – Šimon Tóth Jun 26 '12 at 13:51
Instead of the "If so, return them", I have another set of complex checks. – Šimon Tóth Jun 26 '12 at 14:10
I see. But won't those complex checks apply regardless of what method you use to find an initial "candidate set" of servers? I mean, how do you hope to overcome these complex checks? – j_random_hacker Jun 26 '12 at 14:12
That's not the point :-) The point is that the assumption that it will take ~1ms is wrong. The complexity matters a lot since the bottom part is expensive. – Šimon Tóth Jun 26 '12 at 14:18
The time taken by your extra complex checks will be the same for any method proposed here though, won't it! – j_random_hacker Jun 26 '12 at 14:22

score 0 · Answer 3 · answered Jun 26 '12 at 13:56

Step 1: Create an index for each property. E.g. for each property+value pair, create a sorted list of nodes with that property. Put each such list into an associative array of some kind- That is something like and stl map, one for each property, indexed by values. Such that when you are done you have a near constant time function that can return to you a list of nodes that match a single property+value pair. The list is simply sorted by node number.

Step 2: Given a query, for each property+value pair required, retrieve the list of nodes.

Step 3: Starting with the shortest list, call it list 0, compare it to each of the other lists in turn removing elements from list 0 that are not in the other lists.

You should now have just the nodes that have all the properties requested.

Your other option would be to use a database, it is already set up to support queries like this. It can be done all in memory with something like BerkeleyDB with the SQL extensions.

The queries are not of the form `WHERE A = $1 AND B = $2`, but rather of the form `GROUP BY A, B HAVING COUNT(*) >= $1` : there are no values involved, the point of the algorithm is precisely to return those values. — Victor Nicollet, Jun 26 '12 at 14:25

svinja · Answer 4 · 2012-06-26T14:22:14.177

If sorting the list by every criteria mentioned in the query is viable (or having the list pre-sorted by each relative criteria), this works very well.

By "relative criteria", I mean criteria not of the form "x must be 5", which are trivial to filter against, but criteria of the form "x must be the same for each item in the result set". If there are also criteria of the "x must be 5" form, then filter against those first, then do the following.

It relies on using a stable sort on multiple columns to find the matching groups quickly (without trying out combinations).

The complexity is number of nodes * number of criteria in the query (for the algorithm itself) + number of nodes * log(number of nodes) * number of criteria (for the sort, if not pre-sorting). So Nodes*Log(Nodes)*Criteria.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace bleh
{
    class Program
    {
        static void Main(string[] args)
        {
            List<Node> list = new List<Node>();

            // create a random input list
            Random r = new Random();
            for (int i = 1; i <= 10000; i++)
            {
                Node node = new Node();
                for (char c = 'a'; c <= 'z'; c++) node.Properties[c.ToString()] = (r.Next() % 10 + 1).ToString();
                list.Add(node);
            }

            // if you have any absolute criteria, filter the list first according to it, which is very easy
            // i am sure you know how to do that

            // only look at relative criteria after removing nodes which are eliminated by absolute criteria
            // example 
            List<string> criteria = new List<string> {"c", "h", "r", "x" };
            criteria = criteria.OrderBy(x => x).ToList();

            // order the list by each relative criteria, using a ***STABLE*** sort
            foreach (string s in criteria)
                list = list.OrderBy(x => x.Properties[s]).ToList();

            // size of sought group
            int n = 4;

            // this is the algorithm
            int sectionstart = 0;
            int sectionend = 0;
            for (int i = 1; i < list.Count; i++)
            {
                bool same = true;
                foreach (string s in criteria) if (list[i].Properties[s] != list[sectionstart].Properties[s]) same = false;
                if (same == true) sectionend = i;
                else sectionstart = i;
                if (sectionend - sectionstart == n - 1) break;
            }

            // print the results
            Console.WriteLine("\r\nResult:");
            for (int i = sectionstart; i <= sectionend; i++)
            {
                Console.Write("[" + i.ToString() + "]" + "\t");
                foreach (string s in criteria) Console.Write(list[i].Properties[s] + "\t");
                Console.WriteLine();
            }
            Console.ReadLine();
        }
    }
}

You don't need multiple stable sorts - sorting once with a lexicographic comparison function yields the same result but with less shuffling around. Either way, you have a complexity of O(p·n·log(n)) for a single query. — Victor Nicollet, Jun 26 '12 at 14:22

score -1 · Answer 5 · answered Jun 26 '12 at 17:03

I would do something like this (obviously instead of strings you should map them to int, and use int's as codes)

struct structNode
{
    std::set<std::string> sMachines;
    std::map<std::string, int> mCodeToIndex;    
    std::vector<structNode> vChilds;        
};

void Fill(std::string strIdMachine, int iIndex, structNode* pNode, std::vector<std::string> &vCodes)
{
    if(iIndex < vCodes.size())
    {           
        // Add "Empty" if Needed
        if(pNode->vChilds.size() == 0)
        {
            pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), make_pair("empty", 0));
            pNode->vChilds.push_back(structNode());
        }

        // Add for "Empty"
        pNode->vChilds[0].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[0], vCodes );

        if(vCodes[iIndex] == "empty")
            return;


        // Add for "Any"        
        std::map<std::string, int>::iterator mIte = pNode->mCodeToIndex.find("any");
        if(mIte == pNode->mCodeToIndex.end())
        {
            mIte = pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), make_pair("any", pNode->vChilds.size()));
            pNode->vChilds.push_back(structNode());     
        }

        pNode->vChilds[mIte->second].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[mIte->second], vCodes );


        // Add for "Segment"
        mIte = pNode->mCodeToIndex.find(vCodes[iIndex]);
        if(mIte == pNode->mCodeToIndex.end())
        {
            mIte = pNode->mCodeToIndex.insert(pNode->mCodeToIndex.begin(), make_pair(vCodes[iIndex], pNode->vChilds.size()));
            pNode->vChilds.push_back(structNode());                 
        }

        pNode->vChilds[mIte->second].sMachines.insert(strIdMachine);
        Fill(strIdMachine, (iIndex + 1), &pNode->vChilds[mIte->second], vCodes );

    }
}

//////////////////////////////////////////////////////////////////////
// Get
//
// NULL on empty group
//////////////////////////////////////////////////////////////////////
set<std::string>* Get(structNode* pNode, int iIndex, vector<std::string> vCodes, int iMinValue)
{       
    if(iIndex < vCodes.size())
    {       
        std::map<std::string, int>::iterator mIte = pNode->mCodeToIndex.find(vCodes[iIndex]);       
        if(mIte != pNode->mCodeToIndex.end())
        {
            if(pNode->vChilds[mIte->second].sMachines.size() < iMinValue)
                return NULL;
            else
                return Get(&pNode->vChilds[mIte->second], (iIndex + 1), vCodes, iMinValue);
        }
        else
            return NULL;        
    }

    return &pNode->sMachines;   
}

To fill the tree with your sample

structNode stRoot;

    const char* dummy[] = { "city1", "switch03", "server01" };  
    const char* dummy2[] = { "city1", "switch03", "empty" };
    const char* dummy3[] = { "city2", "switch03", "server02" };
    const char* dummy4[] = { "city2", "switch04", "server02" };

    // Fill the tree with the sample    
    Fill("node1", 0, &stRoot, vector<std::string>(dummy, dummy + 3));
    Fill("node2", 0, &stRoot, vector<std::string>(dummy, dummy + 3));
    Fill("node3", 0, &stRoot, vector<std::string>(dummy2, dummy2 + 3));
    Fill("node4", 0, &stRoot, vector<std::string>(dummy2, dummy2 + 3));
    Fill("node5", 0, &stRoot, vector<std::string>(dummy3, dummy3 + 3));
    Fill("node6", 0, &stRoot, vector<std::string>(dummy3, dummy3 + 3));
    Fill("node7", 0, &stRoot, vector<std::string>(dummy4, dummy4 + 3));
    Fill("node8", 0, &stRoot, vector<std::string>(dummy4, dummy4 + 3));

Now you can easily obtain all the combinations that you want for example you query would be something like this:

vector<std::string> vCodes;
    vCodes.push_back("empty"); // Discard first property (cities)
    vCodes.push_back("any");   // Any value for infiniband
    vCodes.push_back("any");   // Any value for networkfs (except empty)

    set<std::string>* pMachines = Get(&stRoot, 0, vCodes, 2);

And for example only City02 on switch03 with networfs not empty

vector<std::string> vCodes;
    vCodes.push_back("city2"); // Only city2
    vCodes.push_back("switch03");   // Only switch03
    vCodes.push_back("any");   // Any value for networkfs (except empy)

    set<std::string>* pMachines = Get(&stRoot, 0, vCodes, 2);

Data structure for selecting groups of machines

5 Answers5