6

TL;DR: I would very much like for the order of iteration of in_edges on my graph (adjacency_list with edge_list of setS) to be determinstic, but as far as I can tell, the order of iteration is determined by a comparison operator that is just pointer comparison---thus iteration order is determined by the vagaries of malloc. Help!

For concreteness, my graph and related types are:

struct VertexCargo { int Id; ... };

typedef adjacency_list<setS, vecS, bidirectionalS, property<vertex_info_t, VertexCargo> > Graph;

typedef graph_traits<Graph>::edge_descriptor   ED;
typedef graph_traits<Graph>::vertex_descriptor VD;

My logic, in case anyone spots a fallacy anywhere:

  1. in_edges iteration is directly determined by iteration of the edge list container

  2. edge list of setS implies underlying container std::set<edge_desc_impl> Note: this assumption was incorrect; it is actually a std::set<StoredEdge, which offers a comparator that compares on edge target.

  3. std::set<edge_desc_impl> iteration order determined by operator<(edge_desc_impl, edge_desc_impl)

  4. operator<(edge_desc_impl, edge_desc_impl) ends up being just a pointer comparison;

The operator< is defined in boost/graph/detail/edge.hpp (boost 1.58):

30      template <typename Directed, typename Vertex>
31      class edge_desc_impl  : public edge_base<Directed,Vertex> {

...

35        typedef void                              property_type;
36
37        inline edge_desc_impl() : m_eproperty(0) {}
38
39        inline edge_desc_impl(Vertex s, Vertex d, const property_type* eplug)
40          : Base(s,d), m_eproperty(const_cast<property_type*>(eplug)) { }
41
42        property_type* get_property() { return m_eproperty; }
43        const property_type* get_property() const { return m_eproperty; }
44
45        //  protected:
46        property_type* m_eproperty;
47      };
48

...

63
64      // Order edges according to the address of their property object
65      template <class D, class V>
66      inline bool
67      operator<(const detail::edge_desc_impl<D,V>& a,
68                 const detail::edge_desc_impl<D,V>& b)
69      {
70        return a.get_property() < b.get_property();
71      }

I would really like it if there were a way to induce the comparison (and thus iteration order) to be based on something deterministic (i.e. not pointer addresses determined by mallloc; for example a custom operator I write that looks at VertexCargo::Id for source and target of the edges) but it looks like that might not be workable here because of the void* cast.

From the code above it also looks possible (?) to inject a property giving the desired ordering. But that strikes me as a hack.

Anybody have wisdom to share?


Note on the answer

@sehe's response is the correct answer---the underlying container is in fact a std::set<StoredEdge>, which defines an operator< that compares on the edge target; this is deterministic when the vertex container selector is vecS but not in general (since the vertex identifiers in other cases are basically pointers gotten from malloc).

David Alexander
  • 358
  • 1
  • 12
  • What is the goal. Just in-order iteration, or in-order underlying storage? (IOW are you hoping for this to be an optimization, or is a clarity-of-code concern) – sehe Jun 21 '15 at 19:58
  • my goal is in-order iteration; I do not require in-order storage if that is not necessary to achieve that. but I'm hoping not to have to dump to a vector and then sort manually, if that's what you're getting at... – David Alexander Jun 21 '15 at 20:43
  • I expected this to be easier than it looked, and it is. Some of the assumed types weren't quite alright. See my answer :) – sehe Jun 21 '15 at 22:40

1 Answers1

7

As I expected (from my recollection) the edge lists (both in/out) are already ordered by their targets, hence they are deterministic.

This is especially unambiguous when the VertexContainer selector is vecS because, there, the vertex_descriptor is a simple integral type, that doubles as your vertex_index_t property anyways.

Down The Rabbit Hole

Because I'm not a Boost Graph developer, and hence I donot know the architecture of the BGL types like adjacency_list, I naively started at our top-level entry point:

template <class Config>
inline std::pair<typename Config::in_edge_iterator,
                 typename Config::in_edge_iterator>
in_edges(typename Config::vertex_descriptor u,
         const bidirectional_graph_helper<Config>& g_)
{
  typedef typename Config::graph_type graph_type;
  const graph_type& cg = static_cast<const graph_type&>(g_);
  graph_type& g = const_cast<graph_type&>(cg);
  typedef typename Config::in_edge_iterator in_edge_iterator;
  return
    std::make_pair(in_edge_iterator(in_edge_list(g, u).begin(), u),
                   in_edge_iterator(in_edge_list(g, u).end(), u));
}

Which is instantiated as:

std::pair<typename Config::in_edge_iterator, typename Config::in_edge_iterator>
boost::in_edges(typename Config::vertex_descriptor, const boost::bidirectional_graph_helper<C> &)

with

Config = boost::detail::adj_list_gen<
    boost::adjacency_list<boost::setS, boost::vecS, boost::bidirectionalS, VertexCargo>, boost::vecS, boost::setS,
    boost::bidirectionalS, VertexCargo, boost::no_property, boost::no_property, boost::listS>::config;

typename Config::in_edge_iterator = boost::detail::in_edge_iter<
    std::_Rb_tree_const_iterator<boost::detail::stored_edge_iter<
        long unsigned int, std::_List_iterator<boost::list_edge<long unsigned int, boost::no_property> >,
        boost::no_property> >,
    long unsigned int, boost::detail::edge_desc_impl<boost::bidirectional_tag, long unsigned int>, long int>;

 typename Config::vertex_descriptor = long unsigned int

Filling in

using Config = boost::detail::adj_list_gen<
     boost::adjacency_list<boost::setS, boost::vecS, boost::bidirectionalS, VertexCargo>, boost::vecS, boost::setS,
     boost::bidirectionalS, VertexCargo, boost::no_property, boost::no_property, boost::listS>::config;

Points us the instantiation at adj_list_gen<...>::config and there the iterator is declared as

typedef in_edge_iter<
    InEdgeIter, vertex_descriptor, edge_descriptor, InEdgeIterDiff
> in_edge_iterator;

// leading to
typedef OutEdgeIter InEdgeIter;

// leading to
typedef typename OutEdgeList::iterator OutEdgeIter;

// leading to
typedef typename container_gen<OutEdgeListS, StoredEdge>::type OutEdgeList;

And because the container selector is setS, it will be a std::set of StoredEdge, which is

typedef typename mpl::if_<on_edge_storage,
    stored_edge_property<vertex_descriptor, EdgeProperty>,
    typename mpl::if_<is_edge_ra,
    stored_ra_edge_iter<vertex_descriptor, EdgeContainer, EdgeProperty>,
    stored_edge_iter<vertex_descriptor, EdgeIter, EdgeProperty>
    >::type
>::type StoredEdge;

Which evaluates to

boost::detail::stored_edge_iter<
    long unsigned int,
    std::_List_iterator<boost::list_edge<long unsigned int, boost::no_property> >,
    boost::no_property>

Now this, of course, points to the implementation of the EdgeList...

Now Hold It! Hit The Brakes

But what's most important is the total weak ordering imposed - so we don't go further down that rabbit-hole, but instead shift our attention to stored_edge_iter<>::operator< or similar.

  inline bool operator<(const stored_edge& x) const
    { return m_target < x.get_target(); }

Aha! The ordering is already deterministically defined. You can access it directly using e.g.

for (auto v : make_iterator_range(vertices(g))) {
    std::cout << v <<  " --> ";

    auto const& iel = boost::in_edge_list(g, v);
    for (auto e : iel) std::cout << e.get_target() << " ";
    std::cout << "\n";
}

But you don't need to. Using the generic graph accessors would be pretty much the same:

std::cout << v <<  " --> ";
for (auto e : make_iterator_range(in_edges(v, g)))  std::cout << source(e, g) << " ";
std::cout << "\n";

And you can verify that the collections are properly sorted as expect using e.g.

assert(boost::is_sorted(make_iterator_range(in_edges(v,g))  | transformed(sourcer(g))));
assert(boost::is_sorted(make_iterator_range(out_edges(v,g)) | transformed(targeter(g))));

DEMO

Here's a full demo including all the above and asserting all the expected orderings on a large random-generated graph:

Live On Coliru

#include <iostream>
#include <boost/graph/adjacency_list.hpp>
#include <boost/graph/properties.hpp>
#include <boost/graph/random.hpp>
#include <boost/graph/graph_utility.hpp>
#include <random>

#include <boost/range/adaptors.hpp>
#include <boost/range/algorithm.hpp>
#include <boost/range/algorithm_ext.hpp>

using boost::adaptors::transformed;
using namespace boost;

struct VertexCargo { int Id = rand() % 1024; };

typedef adjacency_list<setS, vecS, bidirectionalS, VertexCargo> Graph;

typedef graph_traits<Graph>::edge_descriptor   ED;
typedef graph_traits<Graph>::vertex_descriptor VD;

struct sourcer {
    using result_type = VD;

    Graph const* g;
    sourcer(Graph const& g) : g(&g) {}
    VD operator()(ED e) const { return boost::source(e, *g); }
};

struct targeter {
    using result_type = VD;

    Graph const* g;
    targeter(Graph const& g) : g(&g) {}
    VD operator()(ED e) const { return boost::target(e, *g); }
};

int main() {

    std::mt19937 rng { std::random_device{}() };

    Graph g;
    generate_random_graph(g, 1ul<<10, 1ul<<13, rng);

    for (auto v : make_iterator_range(vertices(g))) {
        std::cout << v <<  " --> ";

        //auto const& iel = boost::in_edge_list(g, v);
        //for (auto e : iel) std::cout << e.get_target() << " ";

        for (auto e : make_iterator_range(in_edges(v, g)))  std::cout << source(e, g) << " ";
        //for (auto e : make_iterator_range(out_edges(v, g))) std::cout << target(e, g) << " ";

        std::cout << "\n";

        assert(boost::is_sorted(make_iterator_range(in_edges(v,g))  | transformed(sourcer(g))));
        assert(boost::is_sorted(make_iterator_range(out_edges(v,g)) | transformed(targeter(g))));
    }
}
sehe
  • 374,641
  • 47
  • 450
  • 633
  • Thanks for going down the rabbit hole! Did you do this using IDE help or just reading all the source? I find the boost source a bit daunting I must admit, and in some cases it defeat my IDE. iRegarding your answer: it seems to be correct---still reading the code... --- but in fact since this order relationship is only a partial order, `in_edges` may remain nondeterministic concerning the order of edges converging on a common target. – David Alexander Jun 22 '15 at 04:10
  • Erm. `std::set` by definition doesn't contain elements with equivalent ordering, so your problem doesn't exist :) – sehe Jun 22 '15 at 06:38
  • 1
    I used gcc + vim + YouCompleMe, by the way. I started off by planting `std::cerr<<__PRETTY_FUNCTION__<< "\n";` into `in_edges()` to get the overload+deduced types. Then it was mainly reading the source, and the `operator<` was located using `YcmCompleter GoToDeclaration` on [the value of foo here](http://paste.ubuntu.com/11755038/). Yes, boost is daunting for me too. Feels more than necessary, sometimes. Oh well :) – sehe Jun 22 '15 at 06:44
  • Ah, yes you are correct regarding `std::set`, of course... it only contains one entry per equivalence class... Anyway my mental model for the way edges were stored was incorrect, hence my confusion here. Thanks for drawing my attention to `stored_edge`. The comparison is unambiguous. Let me see where my problem now lies.... – David Alexander Jun 22 '15 at 14:51
  • Hi @sehe, thanks to your help I think I have identified the problem. In fact in my codebase it turns out my vector list selector is `listS` not `vecS`, and in this case the comparison you identified is dependent on in-memory addresses. Frankly I am a bit surprised this is so; perhaps it would be more correct for the BGL implementers to compare on `vertex_index`? – David Alexander Jun 22 '15 at 18:49
  • It could be "as correct" and incur costs (ordering of EdgeLists is not specified because it isn't required for BGL). All other places with non-integral vertex-descriptors only supply vertex_index when one is needed. Requiring it internally to sort the edges, would incur the overhead of updating it for every mutation operation. Also, the vertex index would be `const` because any change would silently invalidate all the EdgeLists that contained the related vertex. – sehe Jun 22 '15 at 22:14
  • So, in fact if you need it, it might be time to adapt your own Graph model. As a long shot you could look at using `stable_vector` (I haven't tried whether `adjacency_list` works with that as the VertexContainer, but it _would_ give stability of iterators and integral descriptors iff BGL picks up on the random-access nature of the iterators) – sehe Jun 22 '15 at 22:15