Efficient custom string creator

Question

I'd like to know what's the most effective way of passing a custom string. For example, I have this code segment:

outputFile << addSpace(data.len());

where

string addSpace(int n) {
    string result("");
    for (int i = 0; i < n; i++) {
        result += ' ';
    }
    return result;
}

It is clear that the function is not so effective, since the string is returned by-val and then even used in a place where RValue could fit as well. If N was fixed, say N=5, I could just use

outputFile << "     ";

but clearly that's not the case.

So what would be the best solution? (Regardless of this specific example of N white spaces, let's say for any parameter-dependent string creation.) I thought about lambda functions, but I'm not really sure.

If `data.len` has a reasonable upper limit and you are calling the function very often you could cache each produced string in a function-static array (or associative array) of strings and then just index with the length to check and if there, return the cached value. Or even more blunt, pre-produce all those strings, if max len is small. Oh, and you should make the return value a const reference if all you do is print it. — Peter - Reinstate Monica, Nov 18 '16 at 10:54
maybe you are looking for something like `std::setw`, but it is unlikely to have any efficiency difference anyway. — apple apple, Nov 18 '16 at 11:09

eerorika · Accepted Answer · 2016-11-18T11:12:09.577

It is clear that the function is not so effective, since the string is returned by-val

The string is local, so return by value is the only option. It is not really the return by value that makes the function less efficient, it's the fact that you create a new string in every call. The fact that you must return by value is simply a symptom of that design. However, the lack of efficiency is probably marginal unless you call the function a lot, with differing n (if caller uses same n, they can keep the returned string and reuse it).

You could of course pass a reference to a string to the function, let the function modify the string, and return the reference, which would indeed be (possibly only marginally) more efficient if the function is called many times, with the same argument string because then you avoid creating a new string every call. But then you need to manage that external string in the calling code, which would make the use of the function more complex, and therefore worse. And if you don't reuse the argument string, this isn't any more efficient anyway.
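A minimal sketch of the pass-by-reference variant described above. The name `fillSpaces` and its signature are made up for illustration; the point is only that the caller owns the string and the function overwrites it in place.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: overwrites `buffer` with n spaces and returns it.
// assign() can reuse the buffer's existing capacity when it is large
// enough, so repeated calls with the same buffer usually avoid a fresh
// allocation.
std::string& fillSpaces(std::string& buffer, std::size_t n) {
    buffer.assign(n, ' ');
    return buffer;
}
```

The caller would keep one string alive across calls, for example `std::string pad;` declared once and then `outputFile << fillSpaces(pad, data.len());` wherever padding is needed. That extra variable is exactly the added complexity mentioned above.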

If you choose to create a new string every time, there is a simpler way than your function (and a potentially more efficient one, although the gain is even more marginal than the gain from reusing a string). Simply use the constructor of std::string:

outputFile << std::string(data.len(), ' ');

Not creating a string at all might be even more efficient for this particular case:

outputFile << std::setfill(' ') << std::setw(data.len()) << "";

So for the general case, the choice boils down to this: if efficiency is crucially more important than a nice interface and the strings can be reused, pass a reference to a string and modify it. Otherwise, returning a new string as in your example is better.

Perhaps the idea of passing a reference is indeed the most efficient solution. And yet, I wonder if there's a way to pass an rvalue to the operator <<, just like your second solution which used the constructor of 'std::string', but in a generic way, meaning that we are not limiting ourselves to the case where we want to repeat a fixed character. — Eliran Abdoo, Nov 18 '16 at 11:33
It may be possible to create a class, 'customString', implement a proper constructor and a cast operator which returns a string, then use its constructor just like in the second solution you've provided. The only problem I can think of is that the cast operation will make it inefficient, but I don't have enough knowledge about it to be sure. — Eliran Abdoo, Nov 18 '16 at 11:39
@anonanon the value returned by the function *is* an rvalue. — eerorika, Nov 18 '16 at 11:50

score 0 · Answer 2 · answered Nov 18 '16 at 10:33

You can use the string::insert function. It can insert any number of copies of a character starting at a given position in a string.

string& insert (size_t pos, size_t n, char c);

reference: http://www.cplusplus.com/reference/string/string/insert/

example:

string result = "foo";
int n = 5;
result.insert(result.length(), n, ' '); //result will become "foo     "

cokceken

It might indeed be very good for the specific example I've given, but as I said, I'm looking for a generic solution for varied options of string creation, and not just repetition of some character. – Eliran Abdoo Nov 18 '16 at 10:48

Peter - Reinstate Monica · Answer 3 · 2016-11-18T11:23:52.780

I think your function is not that inefficient. Let's inspect the potential inefficiencies:

- string::operator+=(char) looks terrible (and could be, for custom types). But for strings it does not create and assign new strings, it really just appends a character to the existing one; appending a char should have constant time complexity. If you really just need repetitions of the same character in the string, user2079303's constructor suggestion is indeed the proper way to create it.
- Return the result by value: A modern compiler would move the string out of the function or construct it in-place to begin with, instead of creating temporary copies. Moving the string should be very fast as well (in particular, no dynamic memory is (de-)allocated).
- Dynamically allocating a string each time the function is called: This is the one part which indeed is inefficient. If you really need strings (and your example was just a trivial illustration), handing in a reference to a pre-existing string, as user2079303 suggested, is a good way to solve it. Some of the Java Swing API was amended with functions which accept pre-existing object references in order to avoid dynamic memory operations for short-lived objects like in your example.

Another way to avoid string creation is a re-design. If the code is indeed about output, hand an ostream reference to the function and let the function decide how to most efficiently output something. As I said, that depends on your use-cases.

A modern (C++11 compliant) compiler **must** move the string out of the function. Even a pre-C++11 compiler probably avoids the copy (and a modern compiler avoids the move) if it implements NRVO. — eerorika, Nov 18 '16 at 11:01
Oh, I didn't know that modern compiler will move out the string instead of copying it, if that's the case then I think the initial setup of a return by-val function will be just fine. Thank you — Eliran Abdoo, Nov 18 '16 at 11:42

score 0 · Answer 4 · answered Nov 18 '16 at 11:37

You can return something that contains enough information to do your I/O operations.

In this case your addspace can become:

#include <iostream>

struct secret_thing_to_add_space{
   int number;
};

std::ostream& operator<<(std::ostream& os, const secret_thing_to_add_space& s){
   // just use a simple method here; you can use something different
   for(int i = 0; i < s.number; ++i) os << ' ';
   return os;
}

secret_thing_to_add_space addspace(int n){
   return {n};
}

int main(){
   std::cout<< "begin" << addspace(10) << "end";
}

As you can see, it becomes complex (and is unlikely to be more efficient in this case).