-1

I've just been introduced to toupper, and I'm a little confused by the syntax; it seems like it's repeating itself. What I've been using it for is for every character of a string, it converts the character into an uppercase character if possible.

for (int i = 0; i < string.length(); i++)
{
    if (isalpha(string[i]))
    {
        if (islower(string[i]))
        {
            string[i] = toupper(string[i]);
        }
    }
}

Why do you have to list string[i] twice? Shouldn't this work? toupper(string[i]); (I tried it, so I know it doesn't.)

Glenn Jansson
  • 31
  • 1
  • 8

5 Answers5

4

toupper is a function that takes its argument by value. It could have been defined to take a reference to character and modify it in-place, but that would have made it more awkward to write code that just examines the upper-case variant of a character, as in this example:

// compare chars case-insensitively without modifying anything
if (std::toupper(*s1++) == std::toupper(*s2++))
  ...

In other words, toupper(c) doesn't change c for the same reasons that sin(x) doesn't change x.


To avoid repeating expressions like string[i] on the left and right side of the assignment, take a reference to a character and use it to read and write to the string:
for (size_t i = 0; i < string.length(); i++) {
  char& c = string[i];  // reference to character inside string
  c = std::toupper(c);
}

Using range-based for, the above can be written more briefly (and executed more efficiently) as:

for (auto& c: string)
    c = std::toupper(c);
user4815162342
  • 141,790
  • 18
  • 296
  • 355
  • It looks obvious now when I look at it again, but I guess I never realized it worked by assigning a new value. Which also means I wouldn't have to have the same thing in both places; I could assign the toupper-ed value to another string entirely. Thanks! – Glenn Jansson Feb 13 '16 at 21:50
  • In C++ standard library functions are not allowed to be macro. – Revolver_Ocelot Feb 13 '16 at 21:55
  • @Revolver_Ocelot Thanks for the correction, I've updated the answer. Interestingly, g++'s `` doesn't bother to replace the macros with equivalent inline functions. I wonder if that incurs an observable performance penalty compared to the macro versions from ``. – user4815162342 Feb 13 '16 at 22:10
2

As from the documentation, the character is passed by value.
Because of that, the answer is no, it shouldn't.

The prototype of toupper is:

int toupper( int ch );

As you can see, the character is passed by value, transformed and returned by value.
If you don't assign the returned value to a variable, it will be definitely lost.
That's why in your example it is reassigned so that to replace the original one.

skypjack
  • 49,335
  • 19
  • 95
  • 187
1

As many of the other answers already say, the argument to std::toupper is passed and the result returned by-value which makes sense because otherwise, you wouldn't be able to call, say std::toupper('a'). You cannot modify the literal 'a' in-place. It is also likely that you have your input in a read-only buffer and want to store the uppercase-output in another buffer. So the by-value approach is much more flexible.

What is redundant, on the other hand, is your checking for isalpha and islower. If the character is not a lower-case alphabetic character, toupper will leave it alone anyway so the logic reduces to this.

#include <cctype>
#include <iostream>

int
main()
{
  char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
  for (auto s = text; *s != '\0'; ++s)
    *s = std::toupper(*s);
  std::cout << text << '\n';
}

You could further eliminate the raw loop by using an algorithm, if you find this prettier.

#include <algorithm>
#include <cctype>
#include <iostream>
#include <utility>

int
main()
{
  char text[] = "Please send me 400 $ worth of dark chocolate by Wednesday!";
  std::transform(std::cbegin(text), std::cend(text), std::begin(text),
                 [](auto c){ return std::toupper(c); });
  std::cout << text << '\n';
}
5gon12eder
  • 24,280
  • 5
  • 45
  • 92
0

toupper takes an int by value and returns the int value of the char of that uppercase character. Every time a function doesn't take a pointer or reference as a parameter the parameter will be passed by value which means that there is no possible way to see the changes from outside the function because the parameter will actually be a copy of the variable passed to the function, the way you catch the changes is by saving what the function returns. In this case, the character upper-cased.

Hatted Rooster
  • 35,759
  • 6
  • 62
  • 122
0

Note that there is a nasty gotcha in isalpha(), which is the following: the function only works correctly for inputs in the range 0-255 + EOF.

So what, you think.

Well, if your char type happens to be signed, and you pass a value greater than 127, this is considered a negative value, and thus the int passed to isalpha will also be negative (and thus outside the range of 0-255 + EOF).

In Visual Studio, this will crash your application. I have complained about this to Microsoft, on the grounds that a character classification function that is not safe for all inputs is basically pointless, but received an answer stating that this was entirely standards conforming and I should just write better code. Ok, fair enough, but nowhere else in the standard does anyone care about whether char is signed or unsigned. Only in the isxxx functions does it serve as a landmine that could easily make it through testing without anyone noticing.

The following code crashes Visual Studio 2015 (and, as far as I know, all earlier versions):

int x = toupper ('é'); 

So not only is the isalpha() in your code redundant, it is in fact actively harmful, as it will cause any strings that contain characters with values greater than 127 to crash your application.

See http://en.cppreference.com/w/cpp/string/byte/isalpha: "The behavior is undefined if the value of ch is not representable as unsigned char and is not equal to EOF."

BenG
  • 14,826
  • 5
  • 45
  • 60
H. Guijt
  • 3,325
  • 11
  • 16