4

One of my assignments in to write my own UNIX Shell. To receive input from the user, I am using fgets to capture the input as a string but I'm not really sure how it works. When I run:

char command[50];
fgets(command, sizeof(command), stdin);

printf("Your Command: %s", &command);
int length = strlen(command);
printf("Length of String: %d\n", length);

Lets say my the input was "exit". strlen says that the string is 5 characters long, instead of four. I want to do this:

if( (strcmp(command, "exit")) == 0 ){
    doSomething();
}

but command is never equaling the string that I want it to; its like it has an unknown character that Im not sure of. Is it the null character at the end? How do I change the if statement to check that the user input caught with fgets equals "exit"? Thanks!

Mark Elliot
  • 75,278
  • 22
  • 140
  • 160
user446836
  • 733
  • 4
  • 16
  • 24

7 Answers7

8

fgets considers the line terminator as a valid character. That's the extra character you are receiving.

Just do something like command[strlen(command) - 1] = '\0'; to remove the line terminator. Then you are free to do all your strcmp's.

Marlon
  • 19,924
  • 12
  • 70
  • 101
4

From the fgets manual page:

fgets() reads in at most one less than size characters from stream and stores them into the buffer pointed to by s. Reading stops after an EOF or a newline. If a newline is read, it is stored into the buffer. A '\0' is stored after the last character in the buffer.

Bottom-line: you have an extra newline at the end of your string when comparing.

thkala
  • 84,049
  • 23
  • 157
  • 201
3

fgets will always include the line termination character in the input string. You can remove any space, including the newline characters, from the end of your "command" by doing:

char command[50];
fgets(command, sizeof(command), stdin);

size_t length = strlen(command);
// Trim off trailing "spaces" including newline characters
while ((length > 0) && isspace(command[length-1]))
      command[--length] = '\0';

printf("Your Command: %s\n", &command); // Include newline now...
// This is computed above...
// int length = strlen(command);

// Continue as before
Foo Bah
  • 25,660
  • 5
  • 55
  • 79
Reed Copsey
  • 554,122
  • 78
  • 1,158
  • 1,373
  • 1
    I'm sure there is a faster way to dump whitespace characters than calling `strlen` 3 times in a `while` loop (given that `strlen` is going to iterate the whole string) – Mark Elliot Feb 04 '11 at 01:14
  • @Mark: Better? I wasn't really worried about that level of optimization given the length of these strings, and the fact that he's waiting on user input... It now has a single strlen (which was already in his original code) – Reed Copsey Feb 04 '11 at 01:16
  • @Mark - You can even do it with only one assignment, regardless of how much trailing space you have. (One could argue that this is unsafe, but it'll only cause problems for code that's unsafe in the first place.) – Chris Lutz Feb 04 '11 at 01:18
  • Almost better. Someone with 113k rep should know better than to store a string length in a signed type. `size_t`! `size_t`! `size_t`! – Chris Lutz Feb 04 '11 at 01:19
  • Your right, i don't think optimization is a huge deal for this assingment. if((strncmp(command, "exit", 4)) == 0) works, but I'm going to use this while loop so typing "exitasdjhsd" won't exit the shell as well – user446836 Feb 04 '11 at 01:27
2

fgets is capturing the line break, too.

Note that you can overcome this in a few ways, one might be using strncmp:

if((strncmp(command, "exit", 4)) == 0)

which checks if only the first 4 characters of command match (though this might not be the right option for you here).

Another tactic is to check with the line break in place:

if((strcmp(command, "exit\n")) == 0)
Mark Elliot
  • 75,278
  • 22
  • 140
  • 160
2

Probably the easiest way to handle this is to switch to using scanf to read the input:

char command[51];

scanf("%50[^\n]", command);

if (0 == strcmp(command, "exit"))
    do_something();
Jerry Coffin
  • 476,176
  • 80
  • 629
  • 1,111
  • That doesn't look like a valid format string. What compiler (or more properly, libc implementation) does it work with? – Ben Voigt Feb 04 '11 at 02:16
  • @Ben Voigt: Unless I've made a typo that I can't see, it is a valid format string. It should work with any conforming implementations of C. I haven't done much conformance testing recently, but I'd be rather surprising to see one get this wrong (it was in the C89 standard, and wasn't new even then). – Jerry Coffin Feb 04 '11 at 02:27
  • hmmm, [Microsoft claims that it isn't ANSI standard](http://msdn.microsoft.com/en-us/library/xdb9w69d.aspx), but I guess they could be wrong. – Ben Voigt Feb 04 '11 at 02:47
  • The C standard, §7.19.6.2/12 claims it is standard. I think it's *probably* right. Using a '-' to signify a range (e.g., `a-z`) is probably what they're talking about as an extension, but I haven't used that above. – Jerry Coffin Feb 04 '11 at 02:54
1

Your string still has the newline at the end. You could compare with "exit\n" or use something like strncmp(command, "exit", 4). Note that that would accept anything that started with "exit" and ignore the rest.

Fred Larson
  • 60,987
  • 18
  • 112
  • 174
0

As noted, fgets(3) gives you the trailing '\n'. If you use gets(3), you don't gets the trailing newline. Nothing like consistency, sez I.

Perl has a hand chomp() function that trims the trailing newline if its present — you could do worse than to roll your own:

#define NUL ((char)0)
void chomp( char *s )
{
  if ( s != null )
  {
    int len = strlen(s) ;
    if ( len >= 1 && s[len-1] == "\n" )
    {
      s[len-1] = NUL ;
    }
  }
  return ;
}
Nicholas Carey
  • 71,308
  • 16
  • 93
  • 135
  • 2
    Though not it's originally intended use, `strtok` works nicely for this: `strtok(s, "\n");` In fact, it's about the only thing I *do* use `strtok` for. – Jerry Coffin Feb 04 '11 at 01:36
  • 1
    `gets` should not be considered a fix though, because it doesn't protect against buffer overflow. – Ben Voigt Feb 04 '11 at 02:16
  • Your function has some problems. a) I prefer to denote the nul character as `'\0'` which is clearly a character literal, though you could use `0` just as well. Calling it `NUL` makes it easy to confuse with the null pointer. Which, b) is called `NULL`, not `null`. c) `strlen` returns a `size_t`, which is neither guaranteed to be the same size as an `int` nor is signed. Don't use an `int` to store object sizes or array indices, and don't go halfway with an unsigned integer type. Use `size_t`. That's what it's there for. d) I prefer `if(s == NULL) return;` to reduce the overall indentation. – Chris Lutz Feb 04 '11 at 02:18
  • @Ben Voigt: You're certainly right about that -- `gets` is a problem, not a fix. – Jerry Coffin Feb 04 '11 at 02:58
  • @Chris Lutz: reason for using `NUL` is that that is the name of that character, in ASCII, of the control character at code point 0x00. The name `NUL` has been used for more than 40 years (dating back to the 1968 ASCII Standard produced by ANSI X3.2. Here's a scanned image of the code table from that standard: http://www.asciitable.biz/images/ascii-small.jpg (the 1963 version of the ASCII standard just used descriptive phrasing and 0x00 was `NULL` at that point. The C programming language macro `NULL` is the newcomer here. – Nicholas Carey Feb 04 '11 at 17:56
  • Further, The Unicode block _CO Controls and Basic Latin Characters_ in the Basic Multilingual Plane (for all intents and purposes, ASCII), also names the character at code point U+0000 `NUL`. http://www.unicode.org/charts/PDF/U0000.pdf – Nicholas Carey Feb 04 '11 at 18:10
  • @Nicholas Carey - I know it's the newcomer but it's vastly less useful. The C standard is the one we're programming with, and that defines the `NULL` macro as a null pointer, which is going to be used a lot more and explicitly needed (over plain `0`) in a few cases. A special macro for ASCII NUL is only going to be used when dealing with bad string functions, and is never going to be explicitly needed over the constant `0` or literal `'\0'`, which are equivalent (and both of type `int` rather than `char`). Your rebutall time woild have been better spent fixing the other issues in your answer. – Chris Lutz Feb 04 '11 at 18:26