I've had quite a few headaches trying to debug my recursive function. It turns out that Dash treats local variables in a way I did not expect. Consider the following snippet:

iteration=0;

MyFunction()
{
    local my_variable;

    iteration=$(($iteration + 1));

    if [ $iteration -lt 2 ]; then
        my_variable="before recursion";
        MyFunction
    else
        echo "The value of my_variable during recursion: '$my_variable'";
    fi
}

MyFunction

In Bash, the result is:

The value of my_variable during recursion: ''

But in Dash, it is:

The value of my_variable during recursion: 'before recursion'

It looks like Dash makes local variables visible across nested invocations of the same function. What is the point of this, and how can I avoid problems when I can't tell which recursive call changed the value of a variable?

pevik
Livy
  • It says *The shell uses dynamic scoping, so that if you make the variable x local to function f, which then calls function g, references to the variable x made inside g will refer to the variable x declared inside f,...* in dash manpage – oguz ismail Sep 01 '19 at 09:54

2 Answers

4

local is not part of the POSIX specification, so bash and dash are free to implement it any way they like.

dash does not allow assignments with local, so the variable is unset unless it inherits a value from a surrounding scope. (In this case, the surrounding scope of the second iteration is the first iteration.)

bash does allow assignments (e.g., local x=3), and it always creates a variable with a default empty value unless an assignment is made.

chepner
  • I don't think that's the case, the `local x=3` statement produces equal results in dash and bash. It's also not true that dash can implement "local" in any way it likes. The Debian Almquist shell has to support [the following extensions of POSIX.1-2017](https://www.debian.org/doc/debian-policy/ch-files.html#scripts) that clearly include both the "local" keyword and the "local x=3" use-case. – salmin Oct 20 '22 at 15:05
  • Hm, pretty sure I tested that before posting this answer. (Could that be a recent-ish change in `dash`?) Debian is free to set any additional requirements it likes on `/bin/sh` in its own operating system and its own shell; that does not mean they were not free to do so. *POSIX* does not require a conforming shell to support `local`, but nor does it *forbid* any particular definition for `local`. – chepner Oct 20 '22 at 15:42
  • I don't think it's new, it works well with dash 0.5.7 from Ubuntu 14.04. The migration of Ubuntu and Debian from bash to dash happened around 2006. Some bashisms were already supported at that time and some were implemented during the migration. – salmin Oct 21 '22 at 08:59
3

This is a consequence of reading the variable in the innermost invocation without having set it there explicitly. In that case, the variable is indeed local to the function, but it inherits its initial value from the outer context (where you set it to "before recursion").

The local marker on a variable thus only affects the value of the variable in the caller after the function invocation has returned. If you set a local variable in a called function, its value will not affect the value of the same variable in the caller.

To quote the dash man page:

Variables may be declared to be local to a function by using a local command. This should appear as the first statement of a function, and the syntax is

 local [variable | -] ...

Local is implemented as a builtin command.

When a variable is made local, it inherits the initial value and exported and readonly flags from the variable with the same name in the surrounding scope, if there is one. Otherwise, the variable is initially unset. The shell uses dynamic scoping, so that if you make the variable x local to function f, which then calls function g, references to the variable x made inside g will refer to the variable x declared inside f, not to the global variable named x.

The only special parameter that can be made local is “-”. Making “-” local causes any shell options that are changed via the set command inside the function to be restored to their original values when the function returns.
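The dynamic scoping described in the man page can be seen with a minimal sketch. The names f, g, and x follow the man page's wording; the strings are made up for illustration. It should behave the same in bash and dash, since f assigns x explicitly.

```shell
#!/bin/sh
# Illustrative sketch of dynamic scoping; the values are hypothetical.
x="global"

g()
{
    # g declares no x of its own, so it sees the x of its caller.
    echo "g sees x='$x'"
}

f()
{
    local x
    x="local to f"   # changes f's local x only, not the global one
    g
}

f                        # prints: g sees x='local to f'
echo "after f: x='$x'"   # prints: after f: x='global'
```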

To be sure about the value of a variable in a specific context, always set it explicitly in that context. Otherwise, you rely on the "fallback" behavior of the shell, which may differ from one shell to another.
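Applied to the snippet from the question, that advice could look like the sketch below. It assumes an empty string is the initial value you want; the only change is the explicit reset right after the local declaration, which is written as a separate statement so it does not depend on how a given shell handles assignments inside local.

```shell
#!/bin/sh
iteration=0

MyFunction()
{
    local my_variable
    my_variable=""   # explicit reset: nothing is inherited from the caller

    iteration=$((iteration + 1))

    if [ "$iteration" -lt 2 ]; then
        my_variable="before recursion"
        MyFunction
    else
        echo "The value of my_variable during recursion: '$my_variable'"
    fi
}

MyFunction   # prints: The value of my_variable during recursion: ''
```

With the reset in place, the inner invocation starts from an empty value regardless of what the outer invocation assigned.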

pevik
Holger Just
  • By declaring the variable with `local my_variable`, I thought that it will be created and set to NULL or an empty string, thus hiding global variables of the same name. – Livy Sep 01 '19 at 10:08
  • Attempting to read an undefined variable is generally considered undefined behavior (which you should thus avoid). Most shells however will by default hide the undefinedness of a variable on read and return an empty value. It is still a code-smell though and can lead to hard-to-debug errors. You can enforce that the shell throws an error in that case with `set -u`. – Holger Just Sep 01 '19 at 10:17
  • In compiled languages, declaring a variable allocates a location in system memory for it, usually with garbage value. It seems declaring a variable with `local my_var` does nothing other than preventing `my_var` from affecting outer scope. What is the correct way of declaring and setting a variable to NULL/empty? I guess it would be `local my_var=;` – Livy Sep 01 '19 at 10:40
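The `set -u` suggestion from the comments can be tried out with a small sketch. The function names strict_read and defaulted_read and the variable my_var are hypothetical; each check runs in a subshell so that a failed expansion does not terminate the calling shell.

```shell
#!/bin/sh
# Fails: under set -u, expanding an unset variable is an error.
strict_read()
{
    ( set -u; unset my_var; : "$my_var" ) 2>/dev/null
}

# Succeeds: ${my_var-} supplies an empty default, so set -u is not triggered.
defaulted_read()
{
    ( set -u; unset my_var; : "${my_var-}" )
}

if strict_read; then echo "strict read: ok"; else echo "strict read: failed"; fi
if defaulted_read; then echo "defaulted read: ok"; else echo "defaulted read: failed"; fi
```

This should print "strict read: failed" followed by "defaulted read: ok". The exact exit status of the failing subshell differs between shells, so the sketch only checks for success or failure.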