I have a text file with a list of filenames. I would like to create a variable from a specific line number using AWK. I get the correct output using:

awk "NR==\$Line" /myPath/fileList.txt

I want to assign this output to a variable and from documentation I found I expected the following to work:

INFILE=$(awk "NR==\$Line" /myPath/fileList.txt)

or

INFILE=`awk "NR==\$Line" /myPath/fileList.txt`

However,

echo "\$INFILE" 

is blank. I am new to bash scripting and would appreciate any pointers.

ghoti
Sara
  • Neither of those commands are _supposed_ to give any output, they set the variable INFILE. (First version is "better".) – Mat Apr 16 '12 at 20:30
  • Sorry, I should have clarified. echo "\$INFILE" is blank. – Sara Apr 16 '12 at 20:50
  • If you escape your dollar sign, you don't expand the INFILE variable. Try `echo "$INFILE"` instead. – ghoti Apr 17 '12 at 01:27
  • @ghoti: I found when I tested this by hard coding the line number that the escape was necessary. I think this is because I am submitting my script to a job scheduler. – Sara Apr 17 '12 at 01:49
  • `echo "\$INFILE"` is definitely not blank. It means echo the characters `$INFILE` literally: dollar sign, followed by `INFILE`. – Kaz Apr 17 '12 at 15:57
  • @Kaz: Normally that is true. That's how I know that there's a problem. – Sara Apr 17 '12 at 17:53
  • I did not find out why I need to escape the variables, this is probably an SGE issue. I did figure out how to get them to be evaluated correctly. I needed another escape in front of the first $ and the quotes removed. The following works for me: infile=\$(awk -v line=\$SGE_TASK_ID 'NR == line' /myPath/my_outfile_list.txt) – Sara Apr 17 '12 at 17:57

2 Answers

The output of the AWK command is assigned to the variable. To see the contents of the variable, do this:

echo "$INFILE"

Use single quotes around your AWK program so that you don't have to escape the literal dollar sign. The literal string itself should be quoted inside AWK; see below if you want to substitute a shell variable instead:

awk 'NR == "$Line"' /myPath/fileList.txt

The $() form is much preferred over the backtick form (I don't understand why you have the backticks escaped, by the way). Also, you should habitually use lowercase or mixed case variable names to avoid name collision with shell or environment variables.

infile=$(awk 'NR == "$Line"' /myPath/fileList.txt)
echo "$infile"

If your intention is that the value of a variable named $Line should be substituted rather than the literal string "$Line" being used, then you should use AWK's -v variable passing feature:

infile=$(awk -v "line=$Line" 'NR == line' /myPath/fileList.txt)
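
As a minimal, self-contained sketch of the `-v` approach (the temporary directory, the sample file names and the value of `line` are assumptions made up for illustration, not part of the original setup):

```shell
#!/bin/bash
# Hypothetical demo data: a throwaway file list in a temporary directory.
tmpdir=$(mktemp -d)
printf '%s\n' first.dat second.dat third.dat > "$tmpdir/fileList.txt"

line=2   # assumed line number; in the question it comes from the scheduler

# -v copies the shell value into the awk variable "line" before the program
# runs, so the single-quoted awk program never contains a shell "$".
infile=$(awk -v "line=$line" 'NR == line' "$tmpdir/fileList.txt")

echo "$infile"   # prints: second.dat
rm -r "$tmpdir"
```

Keeping the AWK program in single quotes and passing the value with `-v` means the shell and AWK each see only their own syntax, which avoids most of the escaping questions.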
Dennis Williamson
  • the $line variable is an environmental variable set by my job scheduler, SGE. I believe I need to escape the variable name in my submit script. I attempted infile=$(awk -v "line=$SGE_TASK_ID" 'NR == line' /myPath/fileList.txt) but am getting the same result from echo $infile. – Sara Apr 16 '12 at 22:00
  • @Sara: Then probably my last example would be the way to go. Then you can `echo "$infile"` or use the variable in other ways, of course. – Dennis Williamson Apr 16 '12 at 22:02
  • Thanks, your example makes sense and looks like it should work, but after following it I'm still getting the same blank result from echo "$infile". – Sara Apr 16 '12 at 22:41
  • Dennis Williamson's example should (and *does*) output **abc**; assuming you have set *SGE_TASK_ID* to a suitable value, eg. `SGE_TASK_ID=1; echo abc >file; infile=$(awk -v "line=$SGE_TASK_ID" 'NR == line' file); echo "$infile"` – Peter.O Apr 16 '12 at 23:21
  • Thanks for verifying. I suspect I have some additional problem since I cannot duplicate these results. I think it's interesting that I need to escape both my variables and the environment variable when using echo. I am wondering if the job scheduler could cause this behavior, although I cannot find this in the documentation. Any thoughts or suggestions would be appreciated. – Sara Apr 16 '12 at 23:29
  • @Sara: I just noticed that you used the phrase "submit script" and made the connection with all the extra escaping and a light went off. You're submitting this script to some program for *it* to actually execute it rather than running the script in a more usual way - is that correct? I would then recommend that you make your script in a separate file and only submit the filename to the program that's executing it. Then no special escaping should be necessary. – Dennis Williamson Apr 16 '12 at 23:53
  • I found that: echo awk -v "line=\$SGE_TASK_ID" 'NR == line' /myPath/fileList.txt produces: awk -v line=undefined NR == line /myPath/fileList.txt. Adding an escape before the $ produces: awk -v line=3 NR == line /myPath/fileList.txt. This is exactly what I want to execute and assign to infile. But infile=$(awk -v "line=\$SGE_TASK_ID" 'NR == line' /myPath/fileList.txt) echo "$infile" produces a blank. The problem is definitely the variable assignment because I can hard code a number and get the correct result. – Sara Apr 17 '12 at 00:05
  • @Dennis: Thanks for your input. My script is saved as a file and submitted to the job scheduler which uses the script to create an array of jobs. I'm not completely clear on how to apply what you're recommending to this case? – Sara Apr 17 '12 at 00:15
  • @Sara - Why not simplify things, write your automation in your own script somewhere else in the filesystem, then have that script called by a small wrapper that you submit to your job scheduler? – ghoti Apr 17 '12 at 01:29
  • @ghoti: I'm a beginner and am not familiar with what you're describing, so this would probably not be simple for me. I hoped I was getting close to a solution here, since I just need to get awk to properly interpret my variable. Please correct me if I'm wrong and need to try a different approach. – Sara Apr 17 '12 at 02:00
  • @Sara - Far be it from me to push you in a new direction if you feel you're getting close. :-) But the largest part of your struggle seems to be making things work with your scheduler. If you can avoid the issue of escaping `$` characters by running a more "pure" script stored in a separate file, then the help you get here would be more about the code and less about the analysis of your scheduler. OTOH, it's great that you've attracted some folks who seem to know your environment. Good luck with it. :) – ghoti Apr 17 '12 at 02:23
  • @Sara: If SGE is the Sun Grid Engine, you may find the information and example scripts [here](http://web.njit.edu/all_topics/HPC/basement/sge/SGE.html) to be instructive. Basically, you create a shell script file with its first line as `#!/bin/sh` or `#!/bin/bash` and put the rest of your script after that without any extra escaping (read: "perhaps almost none"). Save the file with a name you choose and in a directory that the scheduler has access to. Then submit `/dir/where/script/is/scriptname` (substituting the actual path and file name) to the scheduler. – Dennis Williamson Apr 17 '12 at 04:44
  • You'll notice that the example scripts at the link don't have any escaping. – Dennis Williamson Apr 17 '12 at 04:44
  • @Dennis: That's correct, I'm using Sun Grid Engine as my job scheduler. I do have these lines in my script, so I'm not sure why I'm still needing to use escapes. I noticed the online examples did not use them while examples from another user on my compute cluster did. I will look into whether this could be system specific or if there is another error in my submit script. – Sara Apr 17 '12 at 06:43

Don't mask the dollar sign.

wrong:

 echo "\$INFILE" 

right:

 echo $INFILE
 echo ${INFILE}
 echo "$INFILE"
 echo "${INFILE}"
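
A small sketch of the difference in an ordinary bash session (the value `sample.txt` is a made-up placeholder, and this says nothing about how a job scheduler might preprocess a submitted script):

```shell
#!/bin/bash
INFILE="sample.txt"   # hypothetical value, for illustration only

literal=$(echo "\$INFILE")    # backslash suppresses expansion
expanded=$(echo "$INFILE")    # normal parameter expansion

echo "$literal"     # prints the characters: $INFILE
echo "$expanded"    # prints: sample.txt
```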

The ${x} construct is useful if you want to glue text together.

 echo $INFILE0

will look for a variable named INFILE0. If $INFILE is 5, it will not produce "50".

 echo ${INFILE}0

This will produce 50, if INFILE is 5.
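
The two cases can be checked side by side with a short sketch (it assumes no variable named `INFILE0` is set, which the `unset` enforces, and that the shell is not running with `set -u`):

```shell
#!/bin/bash
INFILE=5
unset INFILE0   # make sure the longer name really is unset

without_braces="$INFILE0"    # expands the unset variable INFILE0: empty
with_braces="${INFILE}0"     # expands INFILE, then appends a literal 0

echo "without braces: '$without_braces'"   # prints: without braces: ''
echo "with braces:    '$with_braces'"      # prints: with braces:    '50'
```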

The double quotes are useful if your variable contains whitespace or otherwise unpredictable text.

If your row number is a parameter:

#!/bin/bash
Line=$1
INFILE=$(awk "NR==$Line" ./demo.txt)
echo "$INFILE"

If INFILE contains multiple spaces or tabs, echo $INFILE would condense them to single spaces, while "$INFILE" preserves them.
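
A quick way to see this, as a sketch that assumes the default `IFS` and uses a made-up value containing three consecutive spaces:

```shell
#!/bin/bash
INFILE="a   b"   # hypothetical value with three spaces between a and b

unquoted=$(echo $INFILE)      # word splitting: echo receives two arguments
quoted=$(echo "$INFILE")      # one argument, spacing intact

echo "unquoted: [$unquoted]"   # prints: unquoted: [a b]
echo "quoted:   [$quoted]"     # prints: quoted:   [a   b]
```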

user unknown
  • No, that won't produce INFILE ten times. It will produce INFILE with a "0" after it. In this context, the curly braces aren't used for both parameter expansion and brace expansion - only the former. You could do `INFILE=foo; echo "${INFILE}"{0..9}` which would output `foo0 foo1 foo2 foo3 foo4 foo5 foo6 foo7 foo8 foo9`. By the way, you should *always* quote variables for output. And those aren't apostrophes (single quotes), they're double quotes. – Dennis Williamson Apr 16 '12 at 21:26
  • I can get this to work by hard coding the line number: infile=$(awk "NR==1" /myPath/fileList.txt). But I can only see the value using echo \$infile. echo $infile is blank. I'm still working on understanding why this is. – Sara Apr 16 '12 at 22:31
  • @DennisWilliamson: If INFILE is 7, it will produce 70, that's what I meant. But my first impression, that the content of a line is a number, is wrong, and while I realized it in the end, I was interrupted while writing, and forgot to adjust the first example. I don't agree about "you should always ..." - that's cargo cult programming. – user unknown Apr 17 '12 at 01:05
  • @Sara: Do you want to echo the command, or the result of the command? I updated my answer to clarify Dennis's questions (it was really badly explained with `10 times`) and also added a short variation, which takes a parameter for $LINE. Note that $LINE isn't masked in my example either, but maybe I'm getting you completely wrong. – user unknown Apr 17 '12 at 01:26
  • I tried echoing the command from inside the $() to see how it was being evaluated. The variable was undefined w/ out the escape, and looked correct with the escape. I think this is because I am submitting my script through a job scheduler. Since the command looked correct using echo, I'm not sure why it's not being evaluated correctly when I put it inside the $(). – Sara Apr 17 '12 at 02:42
  • Which job scheduler? There are some pitfalls with cronjobs: No.1: No path is set (by default). No.2: Your (or the crontabs owner) environment variables aren't set. Test your script without cron, and specify where the problem is, or specify your scheduler problem, if it only exists in the scheduler. Note, that you tagged your question `bash`. Use `env -i ` to test your script how it is started without environment. – user unknown Apr 17 '12 at 04:00
  • @userunknown: It's not "cargo cult programming". It protects against unexpected behavior when the value contains whitespace. "[Also unfortunately, quoting in shell programming is extremely important. It's something no one can avoid learning. Improper shell quoting is one of the most common sources of scripting bugs and security issues.](http://mywiki.wooledge.org/Quotes)" The scheduler isn't `cron`, it's something called "SGE" (which may be the Sun Grid Engine) which Sara reveals in one of her comments. – Dennis Williamson Apr 17 '12 at 04:39
  • @DennisWilliamson: Quoting everything is often a technique to avoid learning when it is necessary to quote and when it is not. I often see people quoting literal constants which don't need a quote. – user unknown Apr 17 '12 at 08:46
  • @userunknown: You'll note that I said "on output". I see lots of unnecessary quoting, too, however it's safer to quote variables than not and it doesn't hurt to do so. I certainly know when it's safe not to. Perhaps the warning should be modified to "quote all variables until you learn when you don't have to and don't come crying to me when you don't quote one and it bites you." ;-) – Dennis Williamson Apr 17 '12 at 12:35