18

Suppose I need to call a function foo that takes a const std::string reference from a great number of places in my code:

int foo(const std::string&);
..
foo("bar");
..
foo("baz");

Calling a function with a string literal like this will create temporary std::string objects, copying the literal each time.

Unless I'm mistaken, compilers won't optimize this by creating one static std::string object per literal that can be reused for subsequent calls. I know that g++ has advanced string pool mechanisms, but I don't think they extend to the std::string objects themselves.

I can do this "optimization" myself, which makes the code somewhat less readable:

static std::string bar_string("bar");
foo(bar_string);
..
static std::string baz_string("baz");
foo(baz_string);

Using Callgrind, I can confirm that this does indeed speed up my program.

I thought I'd try to make a macro for this, but I don't know if it's possible. What I would want is something like:

foo(STATIC_STRING("bar"));
..
foo(STATIC_STRING("baz"));

I tried creating a template with the literal as a template parameter, but that proved impossible. And since a function definition in a code block isn't possible, I'm all out of ideas.

Is there an elegant way of doing this, or will I have to resort to the less readable solution?

Brian Tompsett - 汤莱恩
Masseman
  • So, what about `[]() -> std::string const& { static std::string const s("bar"); return s; } ()`? – dyp Sep 09 '14 at 20:01
  • With C++14, you can use the user-defined literal `"bar"s`, but I am not sure whether it makes a copy or not – Bryan Chen Sep 10 '14 at 03:00
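
The lambda idea from the first comment above can be wrapped in a macro, provided the local string is declared `static` so that the returned reference stays valid. The following is only a sketch, assuming a C++11 compiler; the body of `foo` and the helper `bar_address` are hypothetical and exist just to make the example self-contained.

```cpp
#include <string>

// STATIC_STRING is the macro name proposed in the question. Each expansion
// creates a distinct lambda type, so each use site gets its own
// function-local static, constructed once on first use.
#define STATIC_STRING(literal)                     \
    ([]() -> const std::string& {                  \
        static const std::string s(literal);       \
        return s;                                  \
    }())

// Stand-in for the question's foo; this body is an assumption.
int foo(const std::string& s) { return static_cast<int>(s.size()); }

// Hypothetical helper: exposes the address of the static object so that
// repeated calls can be compared.
const std::string* bar_address() { return &STATIC_STRING("bar"); }
```

A call then reads `foo(STATIC_STRING("bar"));`, as the question asked for. Two separate expansions with the same literal still produce two separate objects, so this saves repeated construction at one call site rather than pooling identical strings across the program.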

5 Answers

10

If that function foo does not make a copy of the string then its interface is sub-optimal. It is better to change it to accept char const* or string_view, so that the caller is not required to construct std::string.

Or add overloads:

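#include <cstring>  // std::strlen
#include <string>

void foo(char const* str, std::size_t str_len); // Does the real work.

// Forwarding overloads: neither one allocates or copies.
inline void foo(std::string const& s) { foo(s.data(), s.size()); }
inline void foo(char const* s) { foo(s, std::strlen(s)); }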
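
For the `string_view` variant mentioned above, a minimal sketch could look like this, assuming C++17 and its `std::string_view`. The body of `foo` and the two `call_with_*` helpers are hypothetical and only illustrate that no `std::string` temporary is needed at the call site.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical body: returns the index of the first 'a', or -1 if absent.
// std::string_view offers find() just like std::string, without owning
// or copying the characters.
int foo(std::string_view s)
{
    const std::size_t pos = s.find('a');
    return pos == std::string_view::npos ? -1 : static_cast<int>(pos);
}

// Both a string literal and an existing std::string convert implicitly
// to std::string_view, so neither caller constructs a std::string.
int call_with_literal() { return foo("bar"); }
int call_with_string(const std::string& s) { return foo(s); }
```

Passing the view by value is the usual convention, since it is just a pointer and a length.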
Maxim Egorushkin
  • If `foo` makes a copy of the string, the interface is sub-optimal. If `foo` makes a copy of the string, it's better to pass by value. – Luchian Grigore Sep 09 '14 at 09:20
  • I agree. The interface should not needlessly require `std::string`, but sometimes the interface isn't yours to change, or the string object is actually needed further down the line in `foo`. – Masseman Sep 09 '14 at 09:22
  • @LuchianGrigore True. For gcc reference counted `std::string` accepting it by reference to const is as good as by value in terms of data copies. – Maxim Egorushkin Sep 09 '14 at 09:22
  • I don't see how this helps. What if the function `foo` uses `std::string` member functions such as `find`? – Rapptz Sep 09 '14 at 17:40
  • @Rapptz Well, it amounts to insisting on allocating memory and copying the string just to be able to call `std::string::find`, doesn't it? There is always the [boost string algorithms](http://www.boost.org/doc/libs/1_56_0/doc/html/string_algo.html) library, which decouples string algorithms from string representation. The bloated and monolithic interface of `std::string` [has often been called into question](http://www.gotw.ca/gotw/084.htm). – Maxim Egorushkin Sep 09 '14 at 21:22
6

You may use something like this to create your static std::string "in place":


#include <cstddef>
#include <string>

// Sequence of char
template <char...Cs> struct char_sequence
{
    template <char C> using push_back = char_sequence<Cs..., C>;
};

// Remove all chars in char_sequence from the first '\0' onward
template <typename, char...> struct strip_sequence;

template <char...Cs>
struct strip_sequence<char_sequence<>, Cs...>
{
    using type = char_sequence<Cs...>;
};

template <char...Cs, char...Cs2>
struct strip_sequence<char_sequence<'\0', Cs...>, Cs2...>
{
    using type = char_sequence<Cs2...>;
};

template <char...Cs, char C, char...Cs2>
struct strip_sequence<char_sequence<C, Cs...>, Cs2...>
{
    using type = typename strip_sequence<char_sequence<Cs...>, Cs2..., C>::type;
};

// struct to create a std::string
template <typename chars> struct static_string;

template <char...Cs>
struct static_string<char_sequence<Cs...>>
{
    static const std::string str;
};

template <char...Cs>
const
std::string static_string<char_sequence<Cs...>>::str = {Cs...};

// helper to get the i_th character (`\0` for out of bound)
template <std::size_t I, std::size_t N>
constexpr char at(const char (&a)[N]) { return I < N ? a[I] : '\0'; }

// helper to check if the c-string will not be truncated
template <std::size_t max_size, std::size_t N>
constexpr bool check_size(const char (&)[N])
{
    static_assert(N <= max_size, "string too long");
    return N <= max_size;
}

// Helper macros to build char_sequence from c-string
#define PUSH_BACK_8(S, I) \
    ::push_back<at<(I) + 0>(S)>::push_back<at<(I) + 1>(S)> \
    ::push_back<at<(I) + 2>(S)>::push_back<at<(I) + 3>(S)> \
    ::push_back<at<(I) + 4>(S)>::push_back<at<(I) + 5>(S)> \
    ::push_back<at<(I) + 6>(S)>::push_back<at<(I) + 7>(S)>

#define PUSH_BACK_32(S, I) \
        PUSH_BACK_8(S, (I) + 0) PUSH_BACK_8(S, (I) + 8) \
        PUSH_BACK_8(S, (I) + 16) PUSH_BACK_8(S, (I) + 24)

#define PUSH_BACK_128(S, I) \
    PUSH_BACK_32(S, (I) + 0) PUSH_BACK_32(S, (I) + 32) \
    PUSH_BACK_32(S, (I) + 64) PUSH_BACK_32(S, (I) + 96)

// Macro to create char_sequence from c-string (at most 128 chars, terminator included) without the trailing '\0's
#define MAKE_CHAR_SEQUENCE(S) \
    strip_sequence<char_sequence<> \
    PUSH_BACK_128(S, 0) \
    ::push_back<check_size<128>(S) ? '\0' : '\0'> \
    >::type

// Macro to return a static std::string
#define STATIC_STRING(S) static_string<MAKE_CHAR_SEQUENCE(S)>::str


gcc has an extension to simplify MAKE_CHAR_SEQUENCE:

template <typename CHAR, CHAR... cs>
constexpr auto operator ""_c() { return char_sequence<cs...>{}; }
Jarod42
  • Amazing! The only downside is that it uses some C++11 syntax that isn't available in g++ 4.4. With 4.8 it works perfectly, though. – Masseman Sep 09 '14 at 11:14
2

If you can use Boost 1.55 or greater, you can do:

#include <boost/utility/string_ref.hpp>

void foo(const boost::string_ref& xyz)
{
}
James
1

You could use Boost.Flyweight to make a key-value flyweight from const char* to std::string. I'm not sure about the details; it might be enough to use flyweight<std::string> everywhere.

filmor
1

This will work only for simple strings that are also valid identifiers (no whitespace or punctuation):

#define DECL_STR(s) const std::string str_##s (#s)

Usage in header (parse once!):

DECL_STR(Foo);
DECL_STR(Bar);

In code:

func(str_Foo);
func(str_Bar);
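
Put together, the pieces above might look like the following sketch. The body of `func` is a hypothetical stand-in, since the answer does not show it.

```cpp
#include <cstddef>
#include <string>

// Token pasting builds the variable name; stringizing builds its contents.
#define DECL_STR(s) const std::string str_##s (#s)

DECL_STR(Foo);  // expands to: const std::string str_Foo ("Foo");
DECL_STR(Bar);  // expands to: const std::string str_Bar ("Bar");

// Hypothetical stand-in for func; takes the string by const reference,
// so passing str_Foo or str_Bar creates no temporary.
std::size_t func(const std::string& s) { return s.size(); }
```

Note that a `const` object at namespace scope has internal linkage, so every translation unit that includes the header gets its own copy of each string. If that matters, marking the declaration `inline` (C++17) should give a single shared object instead.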
egur