
Recently, I needed to write a small string library in C for my own use case. Specifically, I wanted to lowercase every character in a string. Is it more efficient to use a for loop that increments an index by 1 until the end of the string, or is it (marginally) faster to increment a character pointer?

Without incrementing the char pointer:

void string_tolowercase_s(string_t *dest) {
    for (int i = 0; i < dest->text_length; i++)
        dest->text[i] = tolower(dest->text[i]);
}

Incrementing the char pointer:

void string_tolowercase_s(string_t *dest) {
    for (char *letter = dest->text; *letter; letter++)
        *letter = tolower(*letter);
}

EDIT: The string is guaranteed to contain the null terminator ('\0').

moonasteroid
  • As chux has mentioned, you'd probably want to use `size_t` instead of `int` in the first version. – dragonroot Jul 27 '21 at 02:44
  • Here is a benchmark with optimization, https://quick-bench.com/q/qtfIp_I7VeIDRkF6kl-EPFtq4RA, where the first version is just slightly faster. And here is a benchmark with no optimization, https://quick-bench.com/q/j5mFh7DQBGEv1RUoqbSXFAZRLwg, where the second version is slightly faster. Just like @dragonroot predicted. – Jerry Jeremiah Jul 27 '21 at 04:43
  • Over the years, the answer to this question has varied. Sometimes, for some processors, pointer-style access was definitely faster, while at other times, array indexing won out. Use whichever one feels more comfortable to you. The chance it will actually make a measurable performance difference is quite small. – Steve Summit Jul 27 '21 at 21:43

2 Answers


The second version would probably be faster when compiled without optimizations, since it technically does less work. On an optimized build, however, the first version is more likely to be vectorized, because the compiler sees a clearly bounded loop with an incrementing index, so it might perform better in that case. That is, if the compiler manages to vectorize tolower.

I also find the second version more readable. In the end, it is mostly a matter of personal preference.

P.S. One thing you didn't mention is whether your strings may contain embedded '\0' characters. If they may (as in C++ std::string, for example), the second version would not work correctly, because it stops at the first such character. Another thing is that size_t is better suited than int for indexing arrays, since it is guaranteed to be wide enough to hold the size of any object and therefore any valid array index.
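To make the embedded '\0' point concrete, here is a minimal sketch of both loops side by side. The question never shows the definition of string_t, so the struct layout below is an assumption, and the names lower_by_index and lower_by_pointer are made up for illustration. The index loop also uses size_t, and both loops cast to unsigned char before calling tolower, since passing a negative char value is undefined behavior.

```c
#include <ctype.h>
#include <stddef.h>

/* Assumed layout: the question does not show string_t, so the field
   types here are a guess based on the names used in the loops. */
typedef struct {
    char  *text;
    size_t text_length;
} string_t;

/* Index-based: visits exactly text_length bytes, so bytes after an
   embedded '\0' are still lowercased. */
void lower_by_index(string_t *dest) {
    for (size_t i = 0; i < dest->text_length; i++)
        dest->text[i] = (char)tolower((unsigned char)dest->text[i]);
}

/* Pointer-based: stops at the first '\0' it meets, wherever that is. */
void lower_by_pointer(string_t *dest) {
    for (char *letter = dest->text; *letter; letter++)
        *letter = (char)tolower((unsigned char)*letter);
}
```

With a five-byte buffer holding 'A', 'B', '\0', 'C', 'D' and text_length set to 5, lower_by_index lowercases all four letters, while lower_by_pointer leaves "CD" untouched. If the string can never contain an embedded '\0', as the question's edit states, the two behave the same.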

dragonroot

The int version fails for very long strings, as a string's length may exceed INT_MAX. size_t i would have made better sense here.

The second, incrementing the pointer, does not have this limitation.


The efficiency comparison is secondary to full functionality.

The efficiency difference between the two is slight, and a given compiler may emit better code with either one. Beware of premature optimization.
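If you do want numbers for your own compiler and flags, a rough timing harness is enough to start with. The sketch below is only an illustration: the string_t layout is assumed (the question does not show it), and time_lowercase, lower_index_version, and lower_pointer_version are hypothetical names. It measures CPU time with the standard clock() function, which is coarse, so use a long string and many repeats.

```c
#include <ctype.h>
#include <stddef.h>
#include <time.h>

/* Assumed layout; the question does not show string_t. */
typedef struct {
    char  *text;
    size_t text_length;
} string_t;

typedef void (*lower_fn)(string_t *);

/* Index-based loop, using size_t as suggested above. */
void lower_index_version(string_t *s) {
    for (size_t i = 0; i < s->text_length; i++)
        s->text[i] = (char)tolower((unsigned char)s->text[i]);
}

/* Pointer-based loop, relying on the terminating '\0'. */
void lower_pointer_version(string_t *s) {
    for (char *p = s->text; *p; p++)
        *p = (char)tolower((unsigned char)*p);
}

/* Returns the CPU seconds spent calling fn `repeats` times on s.
   After the first call the text is already lowercase; every byte is
   still read, passed to tolower, and written back on later calls. */
double time_lowercase(lower_fn fn, string_t *s, int repeats) {
    clock_t start = clock();
    for (int r = 0; r < repeats; r++)
        fn(s);
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Call time_lowercase once per version on the same input, at the optimization level you actually ship with, and compare the results. As the comments under the question show, the winner can flip between optimized and unoptimized builds.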

chux - Reinstate Monica