I have written a simple calculator in C using a recursive-descent parser. It all works great, but I have a problem when the expression starts with a '(' sign.

I take the user input from argv[1] and, for simplicity for now, store it in a global variable. Then I simply go through each character in the string and check which pattern it matches, e.g. a number, a plus sign or a multiplication sign (just as a regular recursive-descent parser works, nothing fancy).

But, if I do this:

./calculator (1+2)*0.5

I receive an error message,

bash: syntax error near unexpected token `1+2'

This is because I have to escape the '(' and ')' so

./calculator \(1+2\)*0.5

works fine.

So my questions are:

How do I solve this without having to think about either putting single or double quotes around the equation or escaping the parentheses?

Why does 0.5*(1+2) still work? Shouldn't I have to escape the parentheses there as well?

Salviati
  • This appears to be a shell issue, not a C issue. Added [tag:bash] tag. – Fred Larson May 23 '18 at 16:18
  • Yes I am using bash, what do you think I should search for to find out the problem? – Salviati May 23 '18 at 16:19
  • It's fine. I just added a tag so the bash experts will find it and give you better answers. – Fred Larson May 23 '18 at 16:19
  • @Salviati: The fundamental thing you need to realize is that whatever you type on the command line *must be valid shell syntax*. Your script cannot change that. The shell parses the command line, according to *its* rules, and then (if it works right) passes the result of that to your script. `./calculator (1+2)*0.5` is not valid shell syntax, and there's nothing your script can do to change that. – Gordon Davisson May 23 '18 at 16:55
  • Thank you for the explanation! I understand that now and the reason for me asking is to avoid this in the future and just know it is how it is built. Nothing wrong with it, I just want to understand. Thank you! – Salviati May 23 '18 at 17:05

2 Answers

4

What happens here is that your line looks like a function definition:

$ ./calculator () {
>     echo "function called with arguments '$@'"
> }

defines a shell function called ./calculator. It can then be called like a command would be executed:

$ ./calculator arguments go here
function called with arguments 'arguments go here'

Your error stems from the fact that Bash expects ( to be followed by ) for it to be a proper function definition, but your parentheses weren't empty!

Bash reserves many metacharacters on the command line, not only parentheses. * is used for pathname expansion. Different shells work differently; in Z shell (zsh), even this wouldn't work:

% ./calculator 1*2
zsh: no match

You must escape all these metacharacters when they are given on the command line. Do not learn a "safe subset", because sooner or later you will try another shell and it will fail. Or this might happen:

$ echo 1*2
1*2
$ touch 1-31337-2
$ echo 1*2
1-31337-2
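
The surprise above can be reproduced safely in a scratch directory. The following is only a sketch: the variable names are made up for illustration. It also shows that quoting, or the standard `set -f` (noglob) option, suppresses pathname expansion. Note that `set -f` would not help with parentheses, because those are rejected by the shell's parser before any expansion happens.

```shell
#!/usr/bin/env bash
cd "$(mktemp -d)" || exit 1   # scratch directory so no real files are touched

unquoted_before=$(echo 1*2)   # no matching file yet: the pattern is left alone
touch 1-31337-2
unquoted_after=$(echo 1*2)    # now the glob matches and is replaced by the file name
quoted=$(echo '1*2')          # quoting suppresses pathname expansion

set -f                        # noglob: disable pathname expansion entirely
noglob=$(echo 1*2)
set +f                        # restore the default

printf '%s\n' "$unquoted_before" "$unquoted_after" "$quoted" "$noglob"
```

Only the second line of output changes with the contents of the directory, which is why relying on an unquoted * is fragile.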

There are two simple solutions to avoid backslashitis:

  1. use single quotes around everything:

    $ ./calculator '(1+2)*0.5'
    

    works nicely if your string doesn't contain '.

    Double quotes would also work, but bash still treats more characters as special within double quotes, for example $.

  2. read the calculation from standard input instead, with a prompt

    $ ./calculator
    calculator> 1 + 2 * 0.5
    

    You can also use, for example, the readline library for easy interactive editing!
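
As a rough illustration of both options, here is a small sketch. `show_arg` and `read_expr` are hypothetical stand-ins for the calculator (they only print what they receive), so the point is just what the shell hands over in each case, not how the expression is evaluated.

```shell
#!/usr/bin/env bash
# show_arg is a hypothetical stand-in for ./calculator: it prints its first argument.
show_arg() { printf '%s\n' "$1"; }

show_arg '(1+2)*0.5'      # single quotes: nothing inside is special
show_arg "(1+2)*0.5"      # double quotes: fine here, but $, ` and \ keep their special meaning
show_arg \(1+2\)\*0.5     # backslashes: every metacharacter escaped by hand

# Option 2: take the expression from standard input instead of argv.
# read_expr is also hypothetical; it reads one line verbatim and prints it.
read_expr() { IFS= read -r expr && printf '%s\n' "$expr"; }
read_expr <<'EOF'
(1+2)*0.5
EOF
```

All four calls print the same text, (1+2)*0.5. With standard input the shell never parses the expression at all, so no quoting rules apply to what the user types at the prompt.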

  • Just realized I made a typo, sorry about that, but I did know that both double quotes and single quotes work as well. I mentioned it at the bottom of the first question but accidentally said "parenthesis", sorry about that. But anyway, is there no other way around this? If not, I will mark it as solved and just stop putting parentheses at the start of the equation. – Salviati May 23 '18 at 16:30
  • @Salviati All tools that accept parameters that may contain parentheses require that the user escapes them in their shell, including `sed`, `awk` and `bash` itself. This is simple, predictable, well understood, canonical Unix behavior. You can technically get around it with shell specific hacks like bash's "magic aliases", but you'll quickly find that it's better to work *with* the system than against it. – that other guy May 23 '18 at 16:49
  • Thank you for the explanation. Absolutely, I will work with the system but I do not like to not know why something does not work so I can avoid similar mistakes or questions in the future. Anyway, now I understand why, thank you! – Salviati May 23 '18 at 17:02
  • @AnttiHaapala Thank you so much for all your help! – Salviati May 23 '18 at 17:05
  • @Antti: That's a good answer but the bit about function definitions is slightly misleading. Parentheses would be errors inside arguments as well (with a couple of interesting exceptions). The full explanation was too long for a comment so I added an answer. – rici May 23 '18 at 19:30
  • @rici yea thanks, it took a while to realize what was going on here anyway, as I've been a zsh user for life ;-) – Antti Haapala -- Слава Україні May 24 '18 at 07:43
2

Summary: Either quote your expressions (preferably with single quotes) or use something other than ( and ) for grouping.

For an answer to "why does 0.5*(1+2) work?", go to the end. (Hint: it's because you don't have a file named 0.5.)


Parentheses are what the bash manual refers to as metacharacters. (POSIX no longer uses this term; instead, it refers to such characters as "operators". But the basic effect is the same.) Unless quoted, metacharacters are always tokens by themselves (or, in cases like << and &&, along with the rest of the operator they start), and they have syntactic significance.

This is different from braces ({ and }) which are reserved words, not metacharacters, and so do not delimit tokens. As reserved words, they only have special significance when they are tokens by themselves and are the first token in a command:

{echo x      # The command to be executed is `{echo`, which probably doesn't exist
echo {x      # No problem. Prints the string '{x'
echo { }     # Also no problem. Prints '{ }'
{ echo x; }  # A compound command. The ; is necessary.
(echo x)     # Also a compound command but ; and whitespace are optional

[ and ] are somewhat similar. [ is a command (not even a reserved word), while [[ is a reserved word which starts a conditional compound command but only if it is the first word in a command.

So you could use brackets or braces as grouping operators without worrying about quoting, because the arguments to your function are never going to be the first words in the command.
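
A minimal sketch of the brace variant follows. `to_parens` is a hypothetical helper, not part of any calculator; it maps { and } to ( and ) so that a parser written for parentheses could be reused. The sketch assumes an empty directory, because the * in the expression is still a glob character, and it sticks to braces because an unquoted [1+2] is itself a glob bracket expression that could match a file name.

```shell
#!/usr/bin/env bash
# to_parens is a hypothetical helper: it maps { } to ( ) in its first argument.
to_parens() { printf '%s\n' "$1" | tr '{}' '()'; }

cd "$(mktemp -d)" || exit 1   # empty directory, so the * below cannot match a file
expr=$(to_parens {1+2}*0.5)   # unquoted: braces without , or .. are left literal
printf '%s\n' "$expr"         # prints (1+2)*0.5
```

In a C program the same idea would live in the tokenizer, which would simply accept { and } wherever it accepts ( and ).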

As a side-note, the difference between [ (a command) and [[ (a reserved word) is shown by the fact that only the first one can be preceded by a variable assignment (in this case, the assignment has no useful effect):

$ foo=3 [ -z "$foo" ] && echo yes
yes
$ foo=3 [[ -z "$foo" ]] && echo yes
[[: command not found
$ [[ -z "$foo" ]] && echo yes
yes
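
The same classification can be checked directly with bash's `type -t` builtin, which reports how a word would be interpreted as a command name. This is just a quick check, assuming bash; the words are quoted only so that they are passed to `type` as plain arguments.

```shell
#!/usr/bin/env bash
type -t '['     # prints: builtin
type -t '[['    # prints: keyword
type -t '{'     # prints: keyword
```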

The precise syntactic significance of parentheses depends, as usual, on the syntax in which they appear. In the case of (, this might be:

  • A function definition:

    func () { echo "$@"; }
    
  • A compound command executed in a subshell

    (sleep 1; echo "Hello..."; sleep 5; echo "World!")&
    
  • Surrounding the pattern in a case clause:

    case "$word" in
      (Hello) echo "Hi" ;;
         Bye) echo "Seeya" ;;  # The open parenthesis is optional in this syntax
    esac
    
  • In bash, it may also be used as part of array assignment:

    local numbers=(one two three)
    

    and it can form part of the (( operator, used in arithmetic conditional compound commands and arithmetic for statements.

Parentheses might also appear as part of a longer construct not starting with a parenthesis, such as command substitution: $(. But if a parenthesis is recognised as a token and it doesn't fit any of the syntactic constructs which include parentheses, a syntax error will be signalled:

$ echo a b(c)
bash: syntax error near unexpected token `('

That leaves us with a small mystery: how do we explain the following:

$ echo a+(b+4)
a+(b+4)
$ echo a-(b+4)
bash: syntax error near unexpected token `('
$ echo a*(b+4)
a*(b+4)
$ echo a/(b+4)
bash: syntax error near unexpected token `('

The answer is that I have shopt -s extglob in my bash start-up files. And you probably do, too, because many distributions do that for you by default. If "extended glob" patterns are available, then the following are patterns:

?(pattern-list)
       Matches zero or one occurrence of the given patterns
*(pattern-list)
       Matches zero or more occurrences of the given patterns
+(pattern-list)
       Matches one or more occurrences of the given patterns
@(pattern-list)
       Matches one of the given patterns
!(pattern-list)
       Matches anything except one of the given patterns

A pattern-list may consist of just a single pattern, so b+4 is a valid pattern-list, and a+(b+4) will therefore match a file whose name starts with an a followed by one or more repetitions of the string b+4:

$ touch ab+4b+4b+4
$ echo a+(b+4)
ab+4b+4b+4

Like any other filename pattern, if no filename is matched, the pattern is not substituted:

$ rm ab+4b+4b+4
$ echo a+(b+4)
a+(b+4)

Unless you have other shell options set:

$ shopt -s failglob
$ echo a+(b+4)
bash: no match: a+(b+4)
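
The same reasoning applies to the 0.5*(1+2) from the question: with extglob enabled it is a pattern meaning "0.5 followed by zero or more occurrences of 1+2". The sketch below, run in a scratch directory, shows that it stays unchanged only as long as no file named 0.5 exists. The pattern is kept in a variable (a choice made for this example) so that the script does not depend on extglob being enabled when the line is parsed.

```shell
#!/usr/bin/env bash
shopt -s extglob              # make *(...) a glob pattern
cd "$(mktemp -d)" || exit 1   # empty scratch directory

pattern='0.5*(1+2)'           # held in a variable so the parser never sees a bare (

before=( $pattern )           # unquoted: undergoes pathname expansion, nothing matches
touch 0.5
after=( $pattern )            # now matches the file 0.5 (zero occurrences of 1+2)

printf 'before: %s\n' "${before[@]}"   # before: 0.5*(1+2)
printf 'after:  %s\n' "${after[@]}"    # after:  0.5
```

So an unquoted 0.5*(1+2) reaches the calculator intact only by luck, which is one more reason to quote the whole expression.
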
rici