I doubt that I am making the best use of grep in my code, and I would like to find a better, cleaner way to extract the session ID and security level from my cookie file:

cat mycookie 
# Netscape HTTP Cookie File
# https://curl.haxx.se/docs/http-cookies.html
# This file was generated by libcurl! Edit at your own risk.

#HttpOnly_127.0.0.1 FALSE   /   FALSE   0   PHPSESSID   1hjs18icittvqvpa4tm2lv9b12
#HttpOnly_127.0.0.1 FALSE   /mydir/ FALSE   0   security    medium

The expected output is the SSID hash:

1hjs18icittvqvpa4tm2lv9b12

Piping grep into tr '\n' '\0' works like a charm on the command line, but generates warnings ("warning: command substitution: ignored null byte in input") when the bash script runs. Here is the related code (with warnings):

ssid=$(grep -Po 'PHPSESSID.*' path/sessionFile | grep -Po '[a-z]|[0-9]' | tr '\n' '\0')

I am using bash 4.4.12 (x86_64-pc-linux-gnu) and read this crystal-clear explanation here:

Bash variables are stored as C strings. C strings are NUL-terminated. They thus cannot store NULs by definition.
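
As a small illustration of that statement, the sketch below captures a byte stream containing a NUL into a variable and checks what survives. The variable name `captured` is made up for the example. On bash 4.4 and later the capture should also print the "ignored null byte" warning quoted above.

```shell
# printf emits "a", a NUL byte, then "b".
# Command substitution cannot keep the NUL, so bash drops it.
captured=$(printf 'a\0b')

# Only the two visible characters remain in the variable.
printf 'length=%s value=%s\n' "${#captured}" "$captured"
```

This is why translating newlines to NULs before a command substitution only appears to work: the NULs never reach the variable.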

I have seen here and there, in both cases, a solution using read:

# read content from stdin into array variable and a scalar variable "suffix"
array=( )
while IFS= read -r -d '' line; do
  array+=( "$line" )
done < <(process that generates NUL stream here)
suffix=$line # content after last NUL, if any

# emit recorded content
printf '%s\0' "${array[@]}"; printf '%s' "$suffix"

I don't want to use arrays or a while loop for this specific case, or for others. I found this workaround using sed:

ssid=$(grep -Po 'PHPSESSID.*' path/sessionFile | grep -Po '[a-z]|[0-9]' | tr '\n' '_' | sed -e 's/_//g')

My two questions are:

1) Is there a better way to replace tr '\n' '\0' without using read in a while loop?
2) Is there a better way to properly extract the SSID and security level?

Thanks

hornetbzz
  • `ssid=$(awk '/PHPSESSID/{printf("%s",$NF)}' file)`? – Cyrus Dec 29 '18 at 06:06
  • Please add your desired output for that sample input to your question. – Cyrus Dec 29 '18 at 06:06
  • I ran your command `ssid=$(...`, both on the command line and from a script, but didn't receive any error or warning. And I don't understand that substitution of \n... what am I missing? – linuxfan says Reinstate Monica Dec 29 '18 at 07:07
  • @Cyrus: Thx, I edited my question and added the expected output. – hornetbzz Dec 29 '18 at 19:29
  • @linuxfan: I did not say there is an error. I said I'm looking for a more elegant solution than my ugly piped `grep` and `sed` sequence, and that I'd like to understand how to appropriately replace `tr \n \0` with something else, without using a `while read` loop. Thx. – hornetbzz Dec 29 '18 at 19:32

2 Answers

It looks like you're trying to get rid of the newlines in the output from grep, but turning them into nulls doesn't do this. Nulls aren't visible in your terminal, but are still there and (like many other nonprinting characters) will wreak havoc if they get treated as part of your actual data. If you want to get rid of the newlines, just tell tr to delete them for you with ... | tr -d '\n'. But if you're trying to get the PHPSESSID value from a Netscape-format cookie file, there's a much much better way:

ssid=$(awk '($6 == "PHPSESSID") {print $7}' path/sessionFile)

This looks for "PHPSESSID" only in the sixth field (not in e.g. the path or cookie values -- both places it could legally appear), and specifically prints the seventh field of matching lines (not just anything after "PHPSESSID" that happens to be a digit or lowercase letter).
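
The question also asks for the security level, and the same field test should carry over. Here is a minimal sketch, assuming the cookie is named `security` as in the sample file and that the fields are whitespace-separated. The temporary file and the variable names `cookie_file` and `level` are invented for the example.

```shell
# Recreate the two sample cookie lines (tab-separated) in a temp file
# so the sketch is self-contained.
cookie_file=$(mktemp)
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
  '#HttpOnly_127.0.0.1' FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12 \
  '#HttpOnly_127.0.0.1' FALSE /mydir/ FALSE 0 security medium > "$cookie_file"

# Field 6 is the cookie name, field 7 its value.
ssid=$(awk '$6 == "PHPSESSID" {print $7}' "$cookie_file")
level=$(awk '$6 == "security" {print $7}' "$cookie_file")

printf 'ssid=%s level=%s\n' "$ssid" "$level"
rm -f "$cookie_file"
```

Because awk prints one value per matching line and command substitution strips the trailing newline, no tr or sed cleanup is needed afterwards.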

Gordon Davisson

You could also try this, if you don't want to use awk:

ssid=$(grep -P '\bPHPSESSID\b' your_cookies_file)
echo $ssid   # for debug only

which outputs something like

#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12

Then with cut(1) extract the relevant field:

echo $ssid |cut -d" " -f7

which outputs

1hjs18icittvqvpa4tm2lv9b12

Of course, you should capture the output of that last command in a variable.

UPDATE

If you don't want to use cut, it is possible to emulate it with:

echo $ssid | (read a1 b2 c3 d4 e5 f6 g7; echo $g7)

Demonstration to capture in a variable:

$ field=$(echo $ssid | (read a1 b2 c3 d4 e5 f6 g7; echo $g7))
$ echo $field
1hjs18icittvqvpa4tm2lv9b12
$

Another way is to use positional parameters, passing the string to a function that then refers to $7; this is perhaps cleaner. Otherwise, you can use an array:

array=($(echo $ssid))
echo ${array[6]}   # outputs the 7th field
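
The positional-parameter variant mentioned above could look roughly like this. The function name `seventh_field` and the variable `line` are hypothetical, and the sketch relies on the line being passed unquoted so that the shell splits it into words.

```shell
# Hypothetical helper: when the line is passed unquoted, word splitting
# turns each field into a positional parameter, so $7 is the value.
seventh_field() {
  printf '%s\n' "$7"
}

line='#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12'

# set -f disables globbing inside the subshell, so a field such as "*"
# would not be expanded into file names; $line stays unquoted on purpose.
field=$(set -f; seventh_field $line)
echo "$field"
```

Like the unquoted echo in the examples above, this depends on word splitting, so it only works when no field contains embedded whitespace.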

It should also be possible to use regular expressions and/or string manipulation in bash, but they seem a little more difficult to me.
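
For completeness, here is a rough sketch of both ideas in bash, assuming the value is the last whitespace-separated field of the matching line. The variable names are invented for the example.

```shell
line='#HttpOnly_127.0.0.1 FALSE / FALSE 0 PHPSESSID 1hjs18icittvqvpa4tm2lv9b12'

# String manipulation: remove the longest prefix that ends in a
# whitespace character, leaving only the last field.
value_by_expansion=${line##*[[:space:]]}

# Regular expression: capture the run of non-blank characters that
# follows the cookie name; the capture lands in BASH_REMATCH[1].
re='[[:space:]]PHPSESSID[[:space:]]+([^[:space:]]+)'
if [[ $line =~ $re ]]; then
  value_by_regex=${BASH_REMATCH[1]}
fi

printf '%s\n%s\n' "$value_by_expansion" "$value_by_regex"
```

The regular expression is kept in a variable because quoting the pattern on the right-hand side of `=~` would make bash match it as a literal string.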