3

I am new to C++ programming and would like to write a program that meets the following requirement:

Given a text consisting of

  • words
  • letters
  • numbers
  • punctuation, and
  • whitespace.

Filter out any characters that are not in the range 0..9, a..z or A..Z.

This means that when I type in:

The quick brown fox jumps over the lazy dog!

the output should be:

Thequickbrownfoxjumpsoverthelazydog

I typed the following code and ran it, and the output looked fine. However, when I submitted it to another C++ platform for checking, no output was generated.

I am so confused. Please help if you can. Thank you all very much.

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string line;
    getline(cin, line);
    for (int i = 0; i < line.size(); ++i)
    {
        if (!((line[i] >= 'a' && line[i] <= 'z') || (line[i] >= 'A' && line[i] <= 'Z') || (line[i] >= '0' && line[i] <= '9')))
        {
            line[i] = '\0';
        }
    }
    cout << line;
    return 0;
}
JeJo
Chun Tsz
  • 1
    You are not erasing any character. You substitute `\0`, which commonly has a special meaning - it marks end of string. Your "other C++ platform" could parse strings this way and it concludes it received empty string. Try [`erase(i, 1)`](https://en.cppreference.com/w/cpp/string/basic_string/erase) in your `if` instead. – Yksisarvinen Dec 07 '18 at 10:21
  • Your mistake is thinking that `line[i] = '\0';` erases a character. It doesn't, it just replaces one character with another. `std::string` has a method `erase` for erasing characters, try using that. – john Dec 07 '18 at 10:37
  • Do you mean changing `line[i] = '\0';` to `erase(i, 1)`? I do not quite understand what you mean. – Chun Tsz Dec 07 '18 at 10:44

3 Answers

3

If you want to remove the characters other than letters and digits, a better choice is the erase–remove idiom.

  1. Use std::isalnum to check whether a character in the string is a letter or a digit. If you wrap it in a unary predicate (a lambda function), you can pass it to the algorithm in the next step.
  2. Using std::remove_if with that predicate, move all the characters to keep to the front of the string. It returns an iterator to the new logical end; only leftover characters remain past that point.
  3. Lastly, using std::string::erase, erase everything from that iterator to the end of the string.

Something like the following:

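#include <algorithm>  // std::remove_if
#include <cctype>     // std::isalnum
#include <iostream>
#include <string>

int main()
{
    std::string str{ "The quick brown fox jumps over the lazy dog!" };

    // predicate: true for characters that are neither letters nor digits
    // (taken as unsigned char, since std::isalnum is undefined for negative values)
    const auto notAlnum = [](unsigned char eachChar) -> bool { return !std::isalnum(eachChar); };

    // move the characters to keep to the front; returns the new logical end
    const auto newEnd = std::remove_if(str.begin(), str.end(), notAlnum);

    // erase the leftover tail
    str.erase(newEnd, str.end());

    std::cout << str << '\n';  // prints Thequickbrownfoxjumpsoverthelazydog
}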

Disclaimer: The above solution does not address the OP's actual concern (@john has explained that well in his answer); rather, it may be helpful for future readers.

JeJo
  • 1
    So what chance do you think the OP has of understanding the above code? – john Dec 07 '18 at 10:38
  • Well, I suppose it's a common issue: do you answer the question for the OP or for anyone else who might be reading? Personally I always answer for the OP when it's basic stuff like this. – john Dec 07 '18 at 10:47
  • @john Yes... I always prefer to address the actual concern. When there are alternative shortcuts (i.e., using C++ algorithms), I would rather answer using them. As you mentioned, that might be helpful for future readers. Sometimes the OP can even learn about the std facilities that already exist for what he/she is about to reinvent. (IMHO) – JeJo Dec 07 '18 at 10:58
1

Your code just replaces one character with another. The simple way to erase characters from a string is to use the erase method. Something like this:

#include <iostream>
#include <string>
using namespace std;

int main() 
{
    string line;
    getline(cin, line);
    for (int i = 0; i < line.size(); )
    {
        if (!((line[i] >= 'a' && line[i]<='z') || (line[i] >= 'A' && line[i]<='Z')||(line[i] >= '0' && line[i]<='9')))
        {
            line.erase(i, 1);
        }
        else
        {
            ++i;
        }
    }
    cout << line; 
    return 0;
}

Note that the code only adds one to i when we don't erase a character. After an erase, the following characters shift down by one position, so the next character to check is already at index i; incrementing as well would skip it.

john
  • Thank you for your answer. You are great, and the answer is easy to understand for someone like me. However, there is still one problem: the output is indented by a space. How should I remove the indentation? – Chun Tsz Dec 07 '18 at 10:47
  • @ChunTsz I'm not sure, what's the input and what's the output that you see? There could easily be a bug in the above code, I haven't tested anything. – john Dec 07 '18 at 10:49
  • @john I have solved that problem, but another one came up that I am unable to handle. For example, if the input is abcdefg \n hijklmn, the program now outputs only abcdefg, without hijklmn appended after it. How should I amend the code? Can I get some hints? – Chun Tsz Dec 07 '18 at 11:48
  • 1
    You need to read multiple lines, `getline` reads only one line, so you need some sort of loop around your code. E.g. `while (getline(cin, line)) { ... }` – john Dec 07 '18 at 12:00
  • @john Sorry, John. To be honest, I wrote part of this code with coaching from my friend. Do you mean I should add a while loop around the getline call only? – Chun Tsz Dec 07 '18 at 12:15
  • 1
    Try both ways and think about what the difference is when you run the program. Programming isn't magic; you can figure it out for yourself. – john Dec 07 '18 at 13:52
0

\0 marks the end of a C-style string, so anything that reads your output as a C string cuts it off at the first occurrence.

It is better to remove that character from the string, and when doing so I'd advise going from the end back to the beginning:

Pseudo-code:

for i = size(line)-1 back to i = 0:
  if line[i] not in ('a'-'z', 'A'-'Z', ...):
    for j = i to size(line)-2:
      line[j] = line[j+1]
    reduce_by_the_last_character(line)
Dominique