6

I have a std::string. I want the set of unique characters in it, with each character represented as a std::string.

I can get the set of characters easily:

std::string some_string = ...
std::set<char> char_set(some_string.begin(), some_string.end());

And I could convert them to strings like this:

std::set<std::string> string_set;
for (char c: char_set) {
    string_set.emplace(1, c);
}

But such an approach seems awkward. Is there a better (preferably a standard-library one-liner) way to do this?

EMBLEM
  • 7
    What seems awkward exactly? To me the whole thing seems awkward. Why would you do this? – Shoe Apr 21 '15 at 18:07
  • @Jefffrey I have an aversion to looping when it doesn't use the standard library, I mean, instant O(*n*). I want to do this because I will be taking the union of this set with another set of `std::string`s, so both sets need to have the same type. – EMBLEM Apr 21 '15 at 18:11
  • 2
    @EMBLEM So we're basically having an XY-Problem here? – Columbo Apr 21 '15 at 18:13
  • 1
    I can't understand the problem. – gomons Apr 21 '15 at 18:15
  • 2
    @EMBLEM, you'll end up with a loop anyway, this is a typical O(n) problem. The only things the standard library or boost can do for you is to hide the loop from plain view. – Timo Geusch Apr 21 '15 at 18:26
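
Since the stated goal in the comments is to take the union of this set with another set of std::strings, here is a minimal sketch of that step. The helper names chars_as_strings, union_of and union_example are hypothetical and exist only for illustration; the loop is the one from the question.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical helper: one single-character string per distinct character.
std::set<std::string> chars_as_strings(const std::string& text) {
    std::set<std::string> result;
    for (char c : text) {
        result.emplace(1, c);  // constructs std::string(1, c) in place
    }
    return result;
}

// Hypothetical helper: union of two sets of strings.
std::set<std::string> union_of(std::set<std::string> lhs,
                               const std::set<std::string>& rhs) {
    lhs.insert(rhs.begin(), rhs.end());  // std::set ignores duplicates
    return lhs;
}

// Small self-check showing the intended use.
void union_example() {
    const std::set<std::string> other{"b", "xyz"};
    const std::set<std::string> merged =
        union_of(chars_as_strings("abca"), other);
    assert((merged == std::set<std::string>{"a", "b", "c", "xyz"}));
}
```

Building the set<std::string> directly from the string skips the intermediate set<char>; the range insert is still a loop internally, as the comments point out.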

5 Answers

6

You can use:

std::for_each(some_string.begin(), some_string.end(),
              [&string_set] (char c) -> void { string_set.insert(std::string({c}));});

You can also use:

   for (char c: some_string)
   {
      string_set.insert(std::string{c});
   }

Working program:

#include <iostream>
#include <string>
#include <set>
#include <algorithm>

int main()
{
   std::string some_string = "I want the set of unique characters in it";
   std::set<std::string> string_set;
   for (char c: some_string)
   {
      string_set.insert(std::string{c});
   }

   for (std::string const& s: string_set)
   {
      std::cout << s << std::endl;
   }
}

Output:


I
a
c
e
f
h
i
n
o
q
r
s
t
u
w

R Sahu
4

A transform can be used as a one-liner:

std::transform(std::begin(some_string), std::end(some_string),
               std::inserter(string_set, std::begin(string_set)),
               [] (char c) -> std::string { return {c}; });

I wouldn't recommend this solution, though, as it's horribly unreadable. Typically you want to write code that is intuitive and easy to understand. What you've written in your question already suffices, and I wouldn't recommend looking for shortcuts that reduce your code to a one-liner at the cost of its clarity.
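
For reference, here is a self-contained sketch of the transform call with the headers it needs. The function names unique_chars_via_transform and transform_example are hypothetical, and the lambda uses std::string(1, c) rather than {c}, which builds the same one-character string.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>

// Hypothetical wrapper around the std::transform one-liner.
std::set<std::string> unique_chars_via_transform(const std::string& some_string) {
    std::set<std::string> string_set;
    std::transform(some_string.begin(), some_string.end(),
                   std::inserter(string_set, string_set.begin()),
                   [](char c) { return std::string(1, c); });
    return string_set;
}

// Small self-check showing the intended use.
void transform_example() {
    const std::set<std::string> expected{"A", "B", "C", "D", "E", "Z"};
    assert(unique_chars_via_transform("ZABCDECD") == expected);
}
```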

David G
  • 5
    That basically does the same as the unwanted code in the OP but now with more overhead and less readable code. – LB-- Apr 21 '15 at 18:11
  • @LB-- He said he wanted a "preferrably standard-library one-liner" solution. – David G Apr 21 '15 at 18:13
  • 1
    Why is this downvoted? OP wanted a one-liner and he got a one-liner. It works doesn't it? +1 – Barry Apr 21 '15 at 18:14
  • 2
    @0x499602D2 .. yeah, but taking that literally and providing some unreadable chunk of code that still just about fits the criteria is not helpful. The OP obviously asked the wrong question. – Columbo Apr 21 '15 at 18:14
  • 5
    @EMBLEM What? How is this *any* better than your for loop? – Columbo Apr 21 '15 at 18:15
  • 1
    OK, so the OP has said this is exactly what they wanted. It's still a horrible solution. – LB-- Apr 21 '15 at 18:17
  • 2
    @Columbo It's clear that the OP's original solution is much cleaner and easier to understand, but if he wanted a one-liner, he's got one. I can explain in my answer why his is better though. – David G Apr 21 '15 at 18:18
  • 1
    @0x499602D2 So your answer becomes "here is some ludicrously unreadable solution. Now let me explain why your initial one is better." Why not just explain that what he's got is optimal? – Columbo Apr 21 '15 at 18:19
  • @Columbo It's not necessarily unless if I educate him as to *why* this code shouldn't be used. – David G Apr 21 '15 at 18:20
2

Is there a better (preferably standard-library one-liner) way to do this?

No. Anything you would find in the C++ Standard Library is intended for more complex cases, where it simplifies the code you would otherwise have to write. In your case, your own code is simpler. Forcing yourself to use something from the Standard Library here would make your code more convoluted.

Three answers have already been posted that demonstrate this - they do exactly what you want but they are nearly unreadable at a glance and they add unnecessary overhead when the compiler is unable to optimize them.

Your for loop is the better solution. It is simple, it conveys intent to the reader, and it is easy for the compiler to optimize. There's no reason to waste time looking for a more complex solution to a simple problem.

All solutions are correct, but you should always pick the simplest correct solution. Write less code, not more.

LB--
0

I doubt that what you want is a great idea, but if you really insist, you could create a class that supports implicit conversion from char and implicit conversion to std::string, and that can be compared either with another instance of itself or with a string:

class cvt {
    char val;
public:
    cvt(char val) : val(val) {}

    bool operator<(cvt other) const { return val < other.val; }

    bool operator<(std::string const &s) const {
        return !s.empty() && val < s[0];
    }
    friend bool operator<(std::string const &s, cvt const &c) {
        return !s.empty() && s[0] < c.val;
    }
    operator std::string() const { return std::string(1, val); }
};

With this, we can create our set<cvt>, but use it as if it were a set<std::string> (since the elements in it can/will convert to std::string implicitly and compare with std::string):

int main() {
    std::string some_string = "ZABCDECD";

    // Create our (sort of) set<string> from characters in some_string:
    std::set<cvt> char_set(some_string.begin(), some_string.end());

    // An actual set<string> to use with it:    
    std::set<std::string> strings{ "A", "C", "E", "F", "Y" };

    // demonstrate compatibility:
    std::set_intersection(char_set.begin(), char_set.end(), strings.begin(), strings.end(),
        std::ostream_iterator<std::string>(std::cout, "\n"));
}

Live on Coliru.

If we look at the generated code for this on Godbolt, we see that it really is nearly all syntactic sugar: the only code actually generated for the cvt class consists of the tiny bits that copy a byte to create a cvt from a char and that compare a cvt to a string. Everything else has been optimized out of existence.

If we're sure our strings won't be empty, we can simplify the comparisons to return val < s[0]; and return s[0] < val;, in which case they get optimized away as well, so the only code generated from using cvt is copying a byte from the source to construct a cvt object.

Depending on what you have in mind, that might fit what you want. It's a fair amount of extra typing, but it optimizes nicely--to the point that it's probably faster to compare a cvt to a string than to compare a string to a string. By far the largest disadvantage is likely to stem from questioning your basic premise, and wondering why you wouldn't just write a loop and be done with it.
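
As a self-contained sketch of the idea, the block below repeats the cvt class and collects the intersection into a vector instead of printing it. The names intersect_with and cvt_example are hypothetical. It also illustrates a caveat that follows from the comparisons as written: they look only at the first character of the string, so a longer string such as "AB" compares as equivalent to the character 'A'.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Same class as in the answer, repeated so the sketch is self-contained.
class cvt {
    char val;
public:
    cvt(char val) : val(val) {}
    bool operator<(cvt other) const { return val < other.val; }
    bool operator<(std::string const &s) const {
        return !s.empty() && val < s[0];
    }
    friend bool operator<(std::string const &s, cvt const &c) {
        return !s.empty() && s[0] < c.val;
    }
    operator std::string() const { return std::string(1, val); }
};

// Hypothetical helper: intersect the characters of text with a set of strings.
std::vector<std::string> intersect_with(const std::string& text,
                                        const std::set<std::string>& strings) {
    std::set<cvt> char_set(text.begin(), text.end());
    std::vector<std::string> out;
    std::set_intersection(char_set.begin(), char_set.end(),
                          strings.begin(), strings.end(),
                          std::back_inserter(out));  // each cvt converts to std::string
    return out;
}

// Small self-check showing the intended use and the caveat.
void cvt_example() {
    const std::set<std::string> strings{"A", "C", "E", "F", "Y"};
    const std::vector<std::string> expected{"A", "C", "E"};
    assert(intersect_with("ZABCDECD", strings) == expected);
    // Caveat: "AB" is not a single character, yet it is treated as matching 'A'.
    assert((intersect_with("A", {"AB"}) == std::vector<std::string>{"A"}));
}
```

So the trick is only safe when every string in the other set is exactly one character long.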

Jerry Coffin
  • This is excessive additional typing, it adds overhead that is difficult for compilers to optimize, and more than anything it violates KISS. However it is very clever and it is what I would do if for loops didn't exist. – LB-- Apr 21 '15 at 18:47
0
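#include <numeric>
#include <set>
#include <string>

// Concatenates the characters of a set<char> into a single string.
std::string setToString(const std::set<char> &s) {
    // std::accumulate returns the result; it does not modify its init argument.
    return std::accumulate(s.begin(), s.end(), std::string{});
}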

Maybe this could be useful.