2

What is the best C++ alternative to dict(zip(values...)) in Python?

I'm currently tutoring a C++ student in my off time. I came across a piece of Python code at work and found I did not know the best way to translate it.

The code looks like the following (I changed the names of variables, and generalized it a bit, but it's the same idea):

(dict(zip(wordCollection, [word.strip() for word in currentLine.split(',')][1:-1])))

I've replaced the stripped and split words with a trimmed, tokenized vector using Boost, and that works fine; however, I was at a loss when trying to decide the best way to translate the dict/zip combination.

Brian Deragon

4 Answers

4

Well once you have your vectors like:

std::vector<std::string> wordCollection;
std::vector<std::string> splitWords;

then you can just iterate:

std::map<std::string, std::string> dict; // or std::unordered_map
std::size_t minSize = std::min(wordCollection.size(), splitWords.size());
for (std::size_t i = 0; i != minSize; ++i) {
    dict.insert(std::make_pair(wordCollection[i], splitWords[i]));
}
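
One difference from the Python original is worth checking: dict(zip(...)) keeps the last value when a key repeats, whereas insert leaves an existing entry untouched, so the first value wins. A minimal sketch contrasting the two behaviours follows; the names zipKeepFirst and zipKeepLast are made up for illustration and do not come from any library.

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

using StringMap = std::map<std::string, std::string>;

// Hypothetical helper: the same loop as above. insert() does nothing when the
// key is already present, so the first value seen for a repeated key is kept.
StringMap zipKeepFirst(const std::vector<std::string>& keys,
                       const std::vector<std::string>& values) {
    StringMap dict;
    const std::size_t minSize = std::min(keys.size(), values.size());
    for (std::size_t i = 0; i != minSize; ++i) {
        dict.insert(std::make_pair(keys[i], values[i]));
    }
    return dict;
}

// Hypothetical helper: operator[] overwrites an existing entry, so the last
// value seen for a repeated key is kept, which is what Python's dict() does.
StringMap zipKeepLast(const std::vector<std::string>& keys,
                      const std::vector<std::string>& values) {
    StringMap dict;
    const std::size_t minSize = std::min(keys.size(), values.size());
    for (std::size_t i = 0; i != minSize; ++i) {
        dict[keys[i]] = values[i];
    }
    return dict;
}
```

If the keys are known to be unique, the two versions produce the same map and the choice does not matter.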
Barry
2

You really shouldn't be trying to translate idioms directly from one language to another.

In C++, you generally don't write functions that take iterators and generate new iterators; instead, you write functions that take input and output iterators and copy from one to the other. So, you could write a zip function that takes an input iterator over T, an input iterator over U, and an output iterator over pair<T, U>.

But then you're not going to chain the two calls together this way, because your zip function isn't going to be returning anything (like an iterator range) that could be usefully passed to any kind of dict function. Instead, you might create a dict analog (an unordered_map), create an output iterator into that, and use a zip function to copy pairs into it.

Something like this:

template <typename I1, typename I2, typename O>
void zip(I1 it1, I1 it1end, I2 it2, I2 it2end, O o) {
    while ((it1 != it1end) && (it2 != it2end)) {
        *o++ = std::make_pair(*it1++, *it2++);
    }
}

std::unordered_map<T, U> mapping;
zip(c1.begin(), c1.end(), c2.begin(), c2.end(), std::inserter(mapping, mapping.end()));

Except I'm not sure whether you can actually use inserter on an unordered_map this way; if not, you'd have to write a map_inserter function instead.
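
For what it's worth, std::inserter should work here: std::insert_iterator only needs the container to offer insert(hint, value), which std::unordered_map does, and the pair<T, U> built by make_pair converts to the map's pair<const T, U>. A minimal sketch, in which zipIntoMap is a hypothetical wrapper added only for illustration:

```cpp
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// The zip template from above, with the template parameters spelled out.
template <typename I1, typename I2, typename O>
void zip(I1 it1, I1 it1end, I2 it2, I2 it2end, O o) {
    while ((it1 != it1end) && (it2 != it2end)) {
        *o++ = std::make_pair(*it1++, *it2++);
    }
}

// Hypothetical wrapper (not part of any library): zips two vectors straight
// into an unordered_map through std::inserter.
std::unordered_map<int, std::string> zipIntoMap(
    const std::vector<int>& keys, const std::vector<std::string>& values) {
    std::unordered_map<int, std::string> mapping;
    // The hint passed to std::inserter is only a hint for unordered
    // containers; mapping.end() is a valid choice.
    zip(keys.begin(), keys.end(), values.begin(), values.end(),
        std::inserter(mapping, mapping.end()));
    return mapping;
}
```

Because the insert iterator calls insert, a repeated key keeps its first value rather than its last.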

If you don't know the types T and U locally, you may want to wrap this all up in a function template that extracts the types from the element types of the iterators so you can auto it. (In C++11, you can decltype it without needing a function, but the expression will be a mess.)


If you have multiple uses for zip and map_inserter, it may be worth writing them. But otherwise, a better solution would be to expand it into an explicit loop:

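// Two declarations: a single auto statement requires every declarator to
// deduce the same type, which fails when c1 and c2 are different containers.
auto it1 = c1.begin(), it1end = c1.end();
auto it2 = c2.begin(), it2end = c2.end();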
std::unordered_map<T, U> mapping;
while ((it1 != it1end) && (it2 != it2end)) {
    mapping[*it1++] = *it2++;
}
abarnert
  • true, I wouldn't normally do that directly in practice, I believe heavily in not translating idioms directly as well; I was just curious what was the best and proper "C++ way" to solve the same problem – Brian Deragon Nov 03 '14 at 19:29
  • I would write zip very differently. I'd want it to be container-like, so that the usage could be like `for (auto pr : zip(c1, c2)) { ... }`. I think that's much more usable than the OutputIterator-style of algorithms. – Barry Nov 03 '14 at 19:40
  • @Barry: I agree that such things are more useful, but they don't really fit the STL algorithm idiom. (BTW, I'm not defending the STL algorithm idiom. I loved it until I discovered things like lazy lists and Python generators in other languages, and noticed that instead of being able to use those algorithms every once in a while when it's worth writing a lot of scaffolding, you can use them all the time trivially… Which is one of the reasons I don't use C++ as much as I used to.) – abarnert Nov 03 '14 at 19:41
  • @abarnert that's not a trend that C++ preaches, quite the contrary. Seeing how good C++ coders like Sean Parent preach about never forgetting the algorithms in the standard library and beyond, I have a rather different view. – oblitum Nov 03 '14 at 20:50
1

IMO, the best C++ alternative for dict is std::unordered_map, which is a hash table, and for zip, it's ranges::view::zip from the D4128 ranges proposal, whose reference implementation is available at github.com/ericniebler/range-v3.

C++11 code:

#include <string>
#include <vector>
#include <unordered_map>
#include <range/v3/view/zip.hpp>

int main() {
    using namespace std;
    using ranges::view::zip;

    int ints[] = {1, 2, 3};
    vector<string> strings = {"a", "b"};
    unordered_map<int, string> dict(zip(ints, strings));
}

I hope this ends up in the C++ standard in the future.

oblitum
0
dict(zip(labels,values))  --->  dict([("a",1),("b",0)]) ---> dict(a=1,b=0)

A dict is simply a hash table, and this is simply building a hash table of labels and values, where the labels (or keys) come from wordCollection and the values come from the tokenized string.

So probably a hash table, although it will probably take more than one line to do it in C++.
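
To give a rough idea of how many lines that is, here is a minimal sketch using only the standard library: the two-range form of std::transform does the pairing and std::inserter fills the hash table. The helper name makeDict and the choice of std::string keys with int values are assumptions made for illustration, mirroring the labels/values example above.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper (not from any library): builds a hash table from
// parallel vectors of labels and values.
std::unordered_map<std::string, int> makeDict(
    const std::vector<std::string>& labels, const std::vector<int>& values) {
    std::unordered_map<std::string, int> dict;
    // std::transform assumes the second range is at least as long as the
    // first, so trim to the shorter input, as Python's zip would.
    const auto n = static_cast<std::ptrdiff_t>(
        std::min(labels.size(), values.size()));
    std::transform(labels.begin(), labels.begin() + n, values.begin(),
                   std::inserter(dict, dict.end()),
                   [](const std::string& label, int value) {
                       return std::make_pair(label, value);
                   });
    return dict;
}
```

Calling makeDict({"a", "b"}, {1, 0}) yields a table mapping "a" to 1 and "b" to 0.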

Joran Beasley