2

I need to do something a little strange with std::set_intersection and I cannot quite figure it out. I asked a similar question about a month ago, and thanks to the excellent responses to the question, I solved the issue of making the std::set_intersection work using a common link field between 2 vectors, each containing a different type of object.

The problem that I am now facing is that I am trying to get the code below to work, I basically need to write std::set_intersection's output to a new type which is effectively a union between some fields from StructA and other fields from StructB. I used a slightly modified sample written by user tclamb but it doesn't compile and I am a bit lost in the compiler errors. I am pretty sure that some of the problems I am facing are to do with the restriction that

According to the section Requirements on types in std::set_intersection InputIterator1 and InputIterator2 have the same value type. In my case this is not true, also in the case of the solution by tclamb it was also not the case, however it seemed to work.

I just edited the code below and incorporated @ivar's suggestions for some redundant code - this makes the problem easier to read - it is now compiling and running - but still producing results that are not quite what I want. The live code is also posted at coliru

#include<vector>
#include<algorithm>
#include<string>
#include <iostream>
#include <iterator>

// I wish to return a vector of these as the result
struct StructC {
    std::string mCommonField;
    std::string mNameFromA; // cherry picked from StructA
    std::string mNameFromB; // cherry picked from StructB
    float mFloatFromA;      // cherry picked from StructA
    int mIntFromB;          // cherry picked from StructB
};

struct StructA {
    // conversion operator from StructA to StructC
    operator StructC() { return { mCommonField, mNameAString, "[]", mFloatValueA, 0 }; }
    std::string mCommonField;
    std::string mNameAString;
    float mFloatValueA;
};

struct StructB {
    // conversion operator from StructB to StructC
    operator StructC() { return { mCommonField, "[]", mNameBString, 0.0f, mIntValueB }; }
    std::string mCommonField;
    std::string mNameBString;
    int mIntValueB;
};

// Comparator thanks to @ivar
struct Comparator {
    template<typename A, typename B>
    bool operator()(const A& a, const B& b) const {
        return a.mCommonField < b.mCommonField;
    }
};

template<typename CharT, typename Traits>
std::basic_ostream<CharT, Traits>& operator<<(std::basic_ostream<CharT, Traits>& os, StructC const& sc) {
    return os << sc.mCommonField << " - " << sc.mNameFromA << " - " 
       << sc.mNameFromB << " - " << std::fixed << sc.mFloatFromA << " - " << sc.mIntFromB << std::endl;
}

int main() {
    Comparator comparator;

    // initially unsorted list of StructA
    std::vector<StructA> aStructs = {
        {"hello", "nameA1", 1.0f}, 
        {"goodbye", "nameA2", 2.0f}, 
        {"foo", "nameA3", 3.0f}
    };

    // initially unsorted list of StructB
    std::vector<StructB> bStructs = {
        {"hello", "nameB1", 10},     // <-- intersection as mCommonField("hello") also in aStructs
        {"goodbye", "nameB2", 20},   // <-- intersection as mCommonField("goodbye") also in aStructs
        {"bar", "nameB3", 30}
    };

    // in the above list, "hello" & "goodbye" are the common in both aStructs & bStructs

    // pre-sort both sets before calling std::intersection
    std::sort(aStructs.begin(), aStructs.end(), comparator);
    std::sort(bStructs.begin(), bStructs.end(), comparator);

    std::vector<StructC> intersection;
    std::set_intersection(aStructs.begin(), aStructs.end(),
                          bStructs.begin(), bStructs.end(),
                          std::back_inserter(intersection),
                          comparator);

    std::copy(intersection.begin(), intersection.end(),
              std::ostream_iterator<StructC>(std::cout, ""));
    return 0;
}
Community
  • 1
  • 1
johnco3
  • 2,401
  • 4
  • 35
  • 67
  • Add to `C` two copy/move constructors with `A` and `B` as input parameters. I think this should be enough. – iavr Apr 18 '14 at 16:44
  • 1
    Do yourself a favor, and don't look at the SGI STL documentation anymore: The C++ standard library effectively forked from STL 20 years ago, and things have diverged since then. The [cppreference.com page for `std::set_intersection`](http://en.cppreference.com/w/cpp/algorithm/set_intersection), for example, doesn't list a requirement that both input iterator types need to have the same value type - because the C++ standard doesn't require them to do so. – Casey Apr 18 '14 at 17:23
  • Thanks, that is goood advice, It indicates "types Type1 and Type2 must be such that objects of types InputIt1 and InputIt2 can be dereferenced and then implicitly converted to Type1 and Type2 respectively." – johnco3 Apr 18 '14 at 17:58
  • @iavr could you give me a simple example of what you mean? I changed my code to have C conversion operators for both StructA and StructB - is that not equivalent, even so it is still not sufficient to achieve my needs. I could see this as being a useful thing if StructA and StructB were not my structs and were therefore unmodifiable. – johnco3 Apr 18 '14 at 18:38
  • @johnco3 General advice: don't do such major editing to your question, it makes it very hard to follow what's going on between question and answers. Better leave original question as is and add your new information below that, with clear indication that it's an edit. – iavr Apr 18 '14 at 21:32
  • @ivar Good advice, I agree. – johnco3 Apr 18 '14 at 21:49

3 Answers3

3

There're two mistakes. First, back_inserter creates a back_insert_iterator<vector<StructC>>, which has operator= on vector<StructC>::value_type, which is StructC. There's no conversion from StructA and StructB to StructC, so we need one. The easiest way to add one is

struct StructA {
    // ...
    operator StructC() { return {mCommonField, int(mFloatValue), 0}; }
};

etc. Second, there's no operator << for StructC. Fixing these bugs we have a fully functional solution.

UPD. Either you could put your results into vector<Common>, which looks very much like it's designed for this issue.

polkovnikov.ph
  • 6,256
  • 6
  • 44
  • 79
  • Thanks, that was really helpful, but now that you helped me with the correct conversion to an output iterator - through the operator conversion semantics, the problem is a little more complex. see http://coliru.stacked-crooked.com/a/9c1bd60ffdf37a56 where I have updated the comments and code to see what I mean, basically I need fields besides the common ones from both StructA and StructB to be in StructC - however the conversion operator in StructA would need fields from StructB to really fill the StructC properly, I'm not sure if this is possible. Very helpful answer though. – johnco3 Apr 18 '14 at 17:47
  • Also I am not sure why only 1 conversion operator was required from StructA to StructC and not from StructB to StructC. I suspect that the implementation of set::intersection after it finds a match - inserts the matched copy from the InputIterator1 - effectively *outputIter = *matchingIterator1 - however I need a kind of merge between the *matchingIterator1 and *matchingIterator2 (StructA and StructB) but I have no idea how to achieve that. – johnco3 Apr 18 '14 at 17:56
  • @johnco3 If you read the documentation carefully, it says "If some element is found m times in [first1, last1) and n times in [first2, last2), the first std::min(m, n) elements will be copied from the first range to the destination range." This answers your question why only one conversion is needed from `A`. – iavr Apr 18 '14 at 20:40
  • @johnco3 As for the kind of merging that you are implying, I am afraid `std::intersection` is not *that* generic. It has a function object for comparison, but it would need another function object that takes two parameters `A`,`B` for the two equivalent items found and gives a `C` to be copied to the output. You could have this object do the merging. Currently only `A` is copied. I think I would just take the code of `std::intersection` and generalize it myself. – iavr Apr 18 '14 at 20:44
1

I am a bit confused but your edits, so I hope I get the latest question right.

As I said previously in a comment, you can add to C constructors for A and B

struct StructA {
    std::string mCommonField;
    std::string mNameAString;
    float mFloatValueA;
};

struct StructB {
    std::string mCommonField;
    std::string mNameBString;
    int mIntValueB;
};

struct StructC {
    std::string mCommonField;
    std::string mNameFromA = "[]";
    std::string mNameFromB = "[]";
    float mFloatFromA = 0;
    int mIntFromB = 0;

    StructC(const StructA& a) : mCommonField{a.mCommonField},
        mNameFromA{a.mNameAString}, mFloatFromA{a.mFloatValueA} {}

    StructC(const StructB& b) : mCommonField{b.mCommonField},
        mNameFromB{b.mNameBString}, mIntFromB{b.mIntValueB} {}
};

instead of A/B conversions to C. The idea is that C should know better how to construct itself from partial information. I think this was your concern in the comment below polkovnikov.ph's answer. Plus, with in-class initializers, members can be initialized to default values so that each constructor only cares for what is relevant.

Another issue is that you don't really need to construct Common objects just for comparison; it's pure overhead. You can do the same with a generic comparator:

struct Comparator
{
    template<typename A, typename B>
    bool operator()(const A& a, const B& b) const
    {
        return a.mCommonField < b.mCommonField;
    }
};

Now struct Common is not needed anymore (except if you need it for other purposes).

Here's a complete live example.

Other than that, I don't see any problem.

Community
  • 1
  • 1
iavr
  • 7,547
  • 1
  • 18
  • 53
  • Thanks for the updates to the live example, however the results in the live example are not quite what I am looking for. the current live example gives the following (putting default constructor values from the fields from B) goodbye - nameA2 - [] - 2.000000 - 0 hello - nameA1 - [] - 1.000000 - 0 I was hoping to have something along the following lines where the values from B were Merged in with the Values from A goodbye - nameA2 - nameB2 - 2.000000 - 20 hello - nameA1 - nameB1 - 1.000000 - 10 – johnco3 Apr 18 '14 at 20:54
  • Maybe it's just not clear what you're looking for. I think I have a better idea now from your comments, I'll write another answer. – iavr Apr 18 '14 at 20:57
  • Thank you so much, I'm not sure if its possible, good luck trying :) – johnco3 Apr 18 '14 at 20:58
  • @johnco3 Done, check again. – iavr Apr 18 '14 at 21:21
1

I think I now have a better idea what you were originally looking for:

Given two input ranges A, B, for every two instances a in A, b in B found to be equivalent, you want to merge a, b into a new object c and copy that to the output range C.

If I got this right now, I am afraid I can think of no combination of standard algorithms that can accomplish that. std::set_intersection is simply not that generic.

So my second attempt is to generalize std::set_intersection itself by adding a Merger function object:

template<
    class InputIt1, class InputIt2,
    class OutputIt, class Compare, class Merge
>
OutputIt set_intersection
(
    InputIt1 first1, InputIt1 last1,
    InputIt2 first2, InputIt2 last2,
    OutputIt d_first, Compare comp, Merge merge
)
{
    while (first1 != last1 && first2 != last2)
    {
        if (comp(*first1, *first2))
            ++first1;
        else
        {
            if (!comp(*first2, *first1))
                *d_first++ = merge(*first1++, *first2);
            ++first2;
        }
    }
    return d_first;
}

Here's what your Merger class can look like:

struct Merger
{
    template<typename A, typename B>
    StructC operator()(const A& a, const B& b) const { return {a, b}; }
};

and here's how StructC actually handles merging:

struct StructC {
    std::string mCommonField;
    std::string mNameFromA;
    std::string mNameFromB;
    float mFloatFromA;
    int mIntFromB;

    StructC(const StructA& a, const StructB& b) : mCommonField{a.mCommonField},
        mNameFromA{a.mNameAString}, mNameFromB{b.mNameBString},
        mFloatFromA{a.mFloatValueA}, mIntFromB{b.mIntValueB} {}
};

As you require, the live example now gives

goodbye - nameA2 - nameB2 - 2.000000 - 20
hello - nameA1 - nameB1 - 1.000000 - 10

Watch out: the code of std::set_intesection is just what I copied from cppreference.com, which is a simplified functional equivalent. In production code, I would check actual implementations for issues related to forwarding, exceptions, special cases, optimizations and anything I haven't thought of.

iavr
  • 7,547
  • 1
  • 18
  • 53
  • Fantastic, thank you so much for this generic solution, its going to save me a ton of std::find_if's – johnco3 Apr 18 '14 at 21:30