
Problem

I have this bash script:

ACTIVE_DB=$(grep -P "^[ \t]*db.active" config.properties | cut -d= -f2 | tr -s " ")
echo $ACTIVE_DB
if [ "$ACTIVE_DB" = "A" ]
then
    ln -sf config-b.properties config.properties
else
    ln -sf config-a.properties config.properties
fi

config-a.properties

db.active = A

config-b.properties

db.active = B

When I run the script, a full copy is made (as if by cp): config.properties is often not a symbolic link (nor a hard link, for that matter) but a whole new file with the same content as config-a.properties or config-b.properties.

$ ls -li
53 -rw-r--r-- 1 ogregoir ogregoir     582 Sep 30 15:41 config-a.properties
54 -rw-r--r-- 1 ogregoir ogregoir     582 Sep 30 15:41 config-b.properties
56 -rw-r--r-- 1 ogregoir ogregoir     582 Oct  2 11:28 config.properties

When I run the same commands manually at the prompt, line by line, I have no trouble: a symbolic link is always created and config.properties points to config-a.properties or config-b.properties.

$ ls -li
53 -rw-r--r-- 1 ogregoir ogregoir     582 Sep 30 15:41 config-a.properties
54 -rw-r--r-- 1 ogregoir ogregoir     582 Sep 30 15:41 config-b.properties
55 lrwxrwxrwx 1 ogregoir ogregoir      20 Oct  2 11:41 config.properties -> config-b.properties

Notes

  • No file is open anywhere else (I'm the only active user and the application using the configuration isn't running).
  • Sometimes ln -sf acts normally, but more often than not it makes a full copy.
  • The script is run from another directory, but cds to the directory where the config*.properties files are located before performing the actions here.
  • The script is much longer, but this is the shortest example that reproduces the error.
  • bash version is 4.1.2 (it's local, so I don't care about shellshock).
  • ln version is 8.4.
  • Operating System: Red Hat Enterprise Linux Server release 6.5 (Santiago).
  • Filesystem used for that folder: ext4.

Question

  • Why does my script not consistently create a symbolic link, and make a full copy instead?
  • How can I force a symbolic link here?
Olivier Grégoire
  • The `ln` command will *not* create a copy. Never – hek2mgl Oct 02 '14 at 10:02
  • Yes, I can read `man ln`, but yet it does... randomly! – Olivier Grégoire Oct 02 '14 at 10:05
  • Which OS and which filesystem? – Cyrus Oct 02 '14 at 10:08
  • OS: `Red Hat Enterprise Linux Server release 6.5 (Santiago)` ; filesystem : ext4. – Olivier Grégoire Oct 02 '14 at 10:09
  • Stupid question, but... did you try to restart the machine? Is it still a copy afterwards? – stuXnet Oct 02 '14 at 10:23
  • Sorry, but I can't reproduce this problem with RHEL6.0 and 100.000 changed links: `cp /etc/passwd /tmp/config-a.properties; cp /etc/passwd /tmp/config-b.properties; cd /tmp; c=0; while true; do ln -sf config-a.properties config.properties; [ ! -h config.properties ] && exit; ln -sf config-b.properties config.properties; [ ! -h config.properties ] && exit; echo $c; c=$((c+1)); done` – Cyrus Oct 02 '14 at 10:35
  • @stuXnet: no, I haven't and I will not, as this is supposed to go on a server that won't shut down afterwards and I can reproduce this issue on the server (same setup, except shellshock will be patched). @Cyrus: is it possible that it happens because the file is still not properly processed from the previous command (`grep|cut|tr`)? This is actually our best guess at the moment because when we remove that test, we keep getting symbolic links. Your test doesn't take that into account. – Olivier Grégoire Oct 02 '14 at 11:00
  • Have you tried quoting the file names and the `echo` before the `ln -sf` ? – tchap Oct 02 '14 at 11:48
  • Add `command -v ln` just before each call to show what `ln` actually calls (to rule out a shell function or unexpected binary like `/this/is/wrong/ln`). Since a symbolic link's target doesn't even need to exist, it is unlikely the previous command has any effect on what you are observing. – chepner Oct 02 '14 at 12:22
  • Have you tried using full path `/bin/ln`? This will eliminate the possibility that your script is executing a different `ln`. – alvits Oct 13 '14 at 23:37
  • I suspect you have some other script or code that is overwriting the symlinks. For example, `sed -i` will trash symlinks. There are a variety of commands and utilities that modify a file by creating a copy, modifying the copy, and then moving the copy over top of the original, which destroys the original symlink. Or an alternate explanation: you are not running the script you think you are, or are not modifying the files you think you are. – John Kugelman Oct 14 '14 at 22:12
  • Did someone replace the contents of `/bin/ln` with the contents of `/bin/cp` or something else, possibly even more nefarious? – twalberg Oct 14 '14 at 22:17
  • @JohnKugelman Oh, nice! I do indeed use `sed -i` on `config.properties` later on. I'll check when I get back to work. – Olivier Grégoire Oct 14 '14 at 22:21
  • I don't think that `grep|cut|tr` pipeline can be the culprit, but it's ugly and inefficient. `awk -F '[ \t=]+' '$1=="db.active"{print $2}'` seems both more efficient and more robust to me. – tripleee Oct 15 '14 at 04:05
  • Finally! The code in the question is NOT the source of the problem. –  Oct 15 '14 at 15:45

2 Answers


I suspect you have some other script or code that is overwriting the symlinks. For example, sed -i will trash symlinks. There are a variety of commands and utilities that modify a file by creating a copy, modifying the copy, and then moving the copy over top of the original, which destroys the original symlink.
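
The effect described above can be reproduced with a minimal sketch. It assumes GNU sed (the `--follow-symlinks` option mentioned in the comments below is GNU-specific) and works in a scratch directory created with `mktemp`; the substitution patterns and variable names are made up for the demonstration.

```shell
# Work in a scratch directory so that no real configuration is touched.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1

printf 'db.active = A\n' > config-a.properties
ln -sf config-a.properties config.properties

# With --follow-symlinks, GNU sed edits the link target in place,
# so config.properties is still a symbolic link afterwards.
sed --follow-symlinks -i 's/= A/= B/' config.properties
if [ -L config.properties ]; then after_follow=symlink; else after_follow=regular; fi

# Plain "sed -i" writes a temporary file and renames it over config.properties,
# which replaces the link with a regular file.
sed -i 's/= B/= A/' config.properties
if [ -L config.properties ]; then after_plain=symlink; else after_plain=regular; fi

echo "after sed --follow-symlinks -i: $after_follow"
echo "after sed -i: $after_plain"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Running `ls -li` between the two `sed` calls shows the same change as in the question: the `l` entry with its `->` target turns into a plain `-rw-r--r--` file with a new i-node number.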

John Kugelman
  • This is it indeed. I checked with and without `sed -i` and the result was different. Thank you very much! I didn't think it would replace the file so I left it in my small test but didn't copy/paste it in the question. Sorry. – Olivier Grégoire Oct 15 '14 at 09:36
  • I now use `sed --follow-symlinks -i` and my script is doing exactly what I want. Thanks! – Olivier Grégoire Oct 15 '14 at 11:56
  • This: `sed -i "$(realpath ./config.properties)"` also works by (fully) resolving the file. `https://stackoverflow.com/questions/7665/how-to-resolve-symbolic-links-in-a-shell-script` –  Oct 15 '14 at 16:33

The only possible answer to the question as asked (why does ln behave like cp?) is: it cannot.

The only other possible answer is: what you presented to us is not exactly what is being executed, or there are other scripts running that alter the result.

Some possible alternatives:
1.- The ln command is actually creating a hard link. The i-node listing (ls -li) shows that the i-node numbers are distinct, so no, that is not the reason.

2.- Is there an alias or function for ln?
That is easy to check: just run type -a ln inside Bash. The result shows what Bash interprets ln to be. If it is ONLY the file /bin/ln, then it is correct.
You confirmed that there is no alias or function involved.

3.- "The script is run from another directory". The point here is: is there another file anywhere in the filesystem that has the same i-node number (in case ln is actually creating a hard link)? The existence of another file with the same i-node can be verified with the following (use the i-node numbers 53, 54, 56 from your listing):

find / -follow -inum <your inum>

4.- I hope that you are aware that config.properties does not actually exist as a file of its own (it is only a link). Editing such a file may trash the link.

Is the actual script executed also changing/updating the file contents?
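
The checks in the list above can be sketched as follows. This is only an illustration under assumptions: `check_link` is a hypothetical helper, not part of the original script, and the files are recreated in a scratch directory. In the real script, the idea would be to call such a helper after each step to find the command that replaces the link.

```shell
# check_link is a hypothetical helper, not part of the original script.
check_link() {
    if [ -L "$1" ]; then
        echo "$1 is a symbolic link to $(readlink "$1")"
    else
        echo "$1 is NOT a symbolic link"
    fi
}

workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
printf 'db.active = A\n' > config-a.properties

# Item 2: list everything Bash could run as "ln" (alias, function, binary).
type -a ln

ln -sf config-a.properties config.properties
status_after_ln=$(check_link config.properties)
echo "$status_after_ln"

# Item 1: compare i-node numbers; a hard link would share one with its source.
ls -li

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```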

Note 1: the \K trick performs the extraction in just one external call: http://www.charlestonsw.com/perl-regular-expression-k-trick/

ACTIVE_DB=$(grep -Po "^[ \t]*db.active[ ]+=[ ]+\K." config.properties)
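
As a rough comparison of the two extractions, the sketch below runs both against a scratch copy of the sample line `db.active = A`. It assumes a GNU grep built with PCRE support (`-P`); the variable names are made up for the demonstration.

```shell
workdir=$(mktemp -d) || exit 1
printf 'db.active = A\n' > "$workdir/config.properties"

# Pipeline from the question: "tr -s" squeezes runs of spaces into one,
# it does not remove the single leading space left by cut.
from_pipeline=$(grep -P "^[ \t]*db.active" "$workdir/config.properties" | cut -d= -f2 | tr -s " ")

# Single grep call: \K discards everything matched so far,
# so only the character after "= " is printed.
from_grep=$(grep -Po "^[ \t]*db.active[ ]+=[ ]+\K." "$workdir/config.properties")

# Brackets make any surrounding whitespace visible.
printf 'pipeline: [%s]\n' "$from_pipeline"
printf 'grep -Po: [%s]\n' "$from_grep"

rm -rf "$workdir"
```

With this sample line, the pipeline value keeps its leading space, which an unquoted `echo $ACTIVE_DB` hides but a quoted comparison against `"A"` does not ignore; the `\K` form yields the bare letter.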

It has been confirmed that a sed -i on config.properties later in the actual script was the source of the problem.

  • Regarding the quotes, I know that the result of the cut is " A". That's why I trim it afterwards. As for the rest, I'll check once I'm back at work. Thanks for the insights. – Olivier Grégoire Oct 12 '14 at 23:31
  • The result of `type -a ln` is `ln is /bin/ln`. Seems ok for me. The snippets I showed in my question are taken as is from my script and configuration files. We really have databases that we call `A` and `B` and that can replace each other. Finally, I don't understand your second point. Could you elaborate? – Olivier Grégoire Oct 13 '14 at 11:50
  • The command `ln` cannot do what you report. Something else must be happening. I'll edit my answer to give more options. –  Oct 14 '14 at 22:59