0

I am working on a C function which must input a string and remove all the non-letter characters at the beginning only. For example, if the input string was "123 456 My dog has fleas." then the output string must be: "My dog has fleas."

Here's what I have, which works on the above example:

int isALetter(char x){
   // Checks to see if x is an ASCII letter
   if(  ((int)x>=65 && (int)x<=90)  ||  ((int)x>=97 && (int)x<=122)  )
      return 0;      // TRUE
   return 1;         // FALSE
}
char* removeNonLettersAtBeginning(char* str){
   while( isALetter(str[0]) == 1  &&  &str[0] != NULL )
      str++;
   return str;
}

Here's what bugs me... If the string has no letters at all, the code doesn't seem to work. If I submit the string " " (no letters), then I get "XDG_SESSION_ID=3818". I don't know what that string is, but I'm assuming it's "garbage" in the system.

But my removeNonLettersAtBeginning() function should be returning a "" string, an empty string. I can't figure out what the problem is, but I'm betting it lies here:

   while( isALetter(str[0]) == 1  &&  &str[0] != NULL )

The "&str[0] != NULL" part of that line is to ensure I don't run off the end of the string; I'm trying to check to see if I've hit the Null character which terminates the string. Anyone see where I'm going wrong?

Pete

3 Answers

2

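Your check for the null terminator is wrong. The terminator is the character '\0', not the pointer NULL, and &str[0] is the address of the current character, which is never NULL. The loop therefore never stops at the end of the string. Compare the character itself against '\0':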

#include <stdio.h>

int isALetter(char x){
   // Checks to see if x is an ASCII letter
   if( (x>='A' && x<='Z') || (x>='a' && x<='z') )
      return 0;      // TRUE
   return 1;         // FALSE
}
char* removeNonLettersAtBeginning(char* str){
   if (str != NULL)
   {
      while( isALetter(*str) == 1  &&  *str != '\0' )
         str++;
   }
   return str;
}

int main (void)
{
    char test_string[] = "        test\n";
    char *test_ptr = test_string;

    printf ("%s", test_ptr);

    test_ptr = removeNonLettersAtBeginning(test_ptr);

    printf ("%s", test_ptr);
}

As a side note, to make your code more readable, avoid magic numbers like 65 and 90. As shown, character constants such as 'A' and 'Z' express the same comparison directly.
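Going one step further than character constants, the standard isalpha() from <ctype.h> removes the range checks entirely. What follows is only a minimal sketch, assuming the default "C" locale. The name skipLeadingNonLetters is made up for this example and does not come from the question.

```c
#include <ctype.h>   /* isalpha */
#include <stddef.h>  /* NULL */

/* Hypothetical helper (name is an assumption): returns a pointer to the
 * first letter in str, or to the terminating '\0' if str has no letters.
 * The string itself is not modified. */
char *skipLeadingNonLetters(char *str)
{
    if (str == NULL)
        return NULL;

    /* Test for the terminator first, then for a letter. The cast keeps
     * isalpha() well defined when plain char is signed. */
    while (*str != '\0' && !isalpha((unsigned char)*str))
        str++;

    return str;
}
```

Because the function only advances the pointer, the caller must keep the original pointer if the buffer was allocated dynamically and has to be freed later.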

LPs
  • I think the main problem was not the `NULL`, but the extra `&`. (But, yes, `'\0'` is a much better null character constant.) – Steve Summit Jan 24 '17 at 16:30
  • Strictly speaking, `x>='A' && x<='Z'` isn't the right solution, either, as it can fail on certain obscure, non-ASCII character sets. Much better to use `isalpha()` from `<ctype.h>`. – Steve Summit Jan 24 '17 at 16:31
2

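You wrote:

while( isALetter(str[0]) == 1  &&  &str[0] != NULL ) // error: &str[0] is an address, never NULL
      str++;                                         // compare the character *str with '\0' instead

Here str is a char * that points to the string being tested.

You said you want to remove all the non-letter characters from the beginning of the string, but the end-of-string test uses the pointer in the wrong way. &str[0] is the address of the current character (the same value as str), so comparing it with NULL never detects the terminator. The character itself, str[0] or equivalently *str, is what must be compared with '\0'.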

Corrected code:

 while( isALetter(*str) == 1  &&  *str != '\0' )
       str++;

That should work for you :)
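To make the address-versus-character distinction concrete, here is a small sketch. The helper name findTerminator is an assumption introduced only for illustration. It walks to the end of a string using the character test and returns the address it ends up at.

```c
/* Hypothetical helper (name is an assumption): returns the address of
 * the terminating '\0' of str. str must point to a valid C string. */
const char *findTerminator(const char *str)
{
    /* str[0] and *str are the same character; &str[0] and str are the
     * same address. Only the character comparison can end the walk. */
    while (str[0] != '\0')
        str++;

    /* At this point &str[0] is still a valid, non-NULL address inside
     * the array, which is why comparing it with NULL never stopped the
     * original loop. */
    return &str[0];
}
```

For the one-character string " ", the returned pointer is the start of the array plus 1, not NULL, while the character stored at that address is '\0'.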

Shivam Sharma
1

Here's another approach.

#include <ctype.h>
...
void stripNonAlpha( char *str )
{
  size_t r = 0, w = 0; // read and write indices

  /**
   * Find the first alpha character in the string
   */
  while ( str[r] && !isalpha( (unsigned char) str[r] ) )
    r++;

  /**
   * Shift remaining characters to the left, including the 0 terminator
   */
  while ( (str[w++] = str[r++] ) )
    ; //empty loop
}

Basically, this code searches for the first alphabetical character in the string; once found, that character and all following characters are copied over the initial part of the string. For example, let's take the string "123 test". Initially, here's what everything looks like:

  r
  |
  v
+---+---+---+---+---+---+---+---+---+
|'1'|'2'|'3'|' '|'t'|'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
  ^
  |
  w

The first loop checks the value of the character at index r; while it's neither the end of the string nor an alpha character, advance r. At the end of the loop, we have this:

                  r
                  |
                  v
+---+---+---+---+---+---+---+---+---+
|'1'|'2'|'3'|' '|'t'|'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
  ^
  |
  w

The second loop copies characters from r and writes them to w (up to and including the 0 terminator), like so:

                      r
                      |
                      v
+---+---+---+---+---+---+---+---+---+
|'t'|'2'|'3'|' '|'t'|'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
      ^
      |
      w

                          r
                          |
                          v
+---+---+---+---+---+---+---+---+---+
|'t'|'e'|'3'|' '|'t'|'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
          ^
          |
          w

                              r
                              |
                              v
+---+---+---+---+---+---+---+---+---+
|'t'|'e'|'s'|' '|'t'|'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
              ^
              |
              w

                                  r
                                  |
                                  v
+---+---+---+---+---+---+---+---+---+
|'t'|'e'|'s'|'t'|'t'|'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
                  ^
                  |
                  w

                                      r
                                      |
                                      v
+---+---+---+---+---+---+---+---+---+
|'t'|'e'|'s'|'t'| 0 |'e'|'s'|'t'| 0 |
+---+---+---+---+---+---+---+---+---+
                      ^
                      |
                      w

Some sample output:

$ ./stripper "123 345 this is a test"
before: "123 345 this is a test"
after:  "this is a test"

$ ./stripper "this is a test"
before: "this is a test"
after:  "this is a test"

$ ./stripper "          "
before: "          "
after:  ""

$ ./stripper "12345"
before: "12345"
after:  ""

$ ./stripper "12345 abc 23456"
before: "12345 abc 23456"
after:  "abc 23456"
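The shifting step can also be handed to the standard memmove(), which is defined for overlapping source and destination regions. The sketch below is an alternative under that assumption, not the code that produced the output above. The name stripNonAlphaMemmove is made up for this example.

```c
#include <ctype.h>   /* isalpha */
#include <string.h>  /* memmove, strlen */

/* Hypothetical variant of stripNonAlpha (name is an assumption): same
 * result, but the left shift is done by memmove instead of a loop. */
void stripNonAlphaMemmove(char *str)
{
    size_t r = 0; /* read index: position of the first letter, if any */

    /* Find the first alpha character, or stop at the 0 terminator */
    while (str[r] != '\0' && !isalpha((unsigned char)str[r]))
        r++;

    /* Shift the remainder to the front; +1 copies the 0 terminator too */
    if (r > 0)
        memmove(str, str + r, strlen(str + r) + 1);
}
```

Plain memcpy() or strcpy() would not be a safe substitute here, since their behavior is undefined when the regions overlap.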

Obviously, this operation is destructive - the input string is modified. If you don't want that, you'll need to write to a different target string. That should be easy enough to figure out, though.
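One possible shape for a non-destructive version is sketched below. The function name copyFromFirstAlpha, its dstSize parameter, and the truncation policy are all assumptions made for this illustration. The caller supplies the target buffer and its capacity.

```c
#include <ctype.h>   /* isalpha */
#include <stddef.h>  /* size_t */

/* Hypothetical non-destructive variant (name and signature are
 * assumptions): copies src, minus its leading non-letters, into dst.
 * dstSize is the capacity of dst in bytes. The result is truncated if
 * it does not fit and is 0-terminated whenever dstSize > 0.
 * Returns the number of characters written, excluding the terminator. */
size_t copyFromFirstAlpha(const char *src, char *dst, size_t dstSize)
{
    size_t r = 0, w = 0; /* read index into src, write index into dst */

    if (dstSize == 0)
        return 0; /* no room even for the terminator */

    /* Find the first alpha character in src */
    while (src[r] != '\0' && !isalpha((unsigned char)src[r]))
        r++;

    /* Copy the rest, always leaving one byte for the terminator */
    while (src[r] != '\0' && w + 1 < dstSize)
        dst[w++] = src[r++];

    dst[w] = '\0';
    return w;
}
```

The source string is taken as const char *, so string literals and read-only buffers can be passed without risk of modification.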

John Bode