I have a string:

The disk 'virtual memory' also known as 'Virtual Memory' has exceeded the maximum utilization threshold of 95 Percent.

Each time, I need to search this string for the words The disk and, if they are found, extract only the phrase '*' also known as '*' and put it in a variable MONITOR.

In other words, I want to search the string and end up with:

MONITOR="'virtual memory' also known as 'Virtual Memory'"

How can I do it using awk?

Jonathan Leffler
Nik

2 Answers

1

Here's a snippet that does what you describe. Wrap it in $(...) to assign its output to the MONITOR variable:

$ awk '/The disk '\''.*'\'' also known as '\''.*'\'' has exceeded/ {gsub(/The disk /,"");gsub(/ has exceeded.*$/,"");print}' input.txt

The two problems with awk in this case are:

  • It doesn't have submatch extraction on its regexes, which is why my solution uses gsub() in the body to get rid of the first and last parts of the line.
  • To use the quotes in your awk regex in a shell script you need the '\'' sequence to escape them (more info here).
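
Both points can be sidestepped to some degree in POSIX awk. The sketch below is one possible approach, not the answer's method: it uses match(), which sets RSTART and RLENGTH, to pull out the matched span, and passes the single quote in as an awk variable so the program needs no '\'' sequences. The names string, q and re are chosen here for illustration only.

```shell
string="The disk 'virtual memory' also known as 'Virtual Memory' has exceeded the maximum utilization threshold of 95 Percent."

# q holds a single quote, so the awk program can stay inside one pair of
# shell single quotes without any '\'' escapes.
MONITOR=$(printf '%s\n' "$string" | awk -v q="'" '
    # Build the regex  '"'"'[^'"'"']*'"'"' also known as '"'"'[^'"'"']*'"'"'  from pieces.
    BEGIN { re = q "[^" q "]*" q " also known as " q "[^" q "]*" q }
    # Only lines containing "The disk " are considered; match() records
    # where the quoted phrase starts (RSTART) and how long it is (RLENGTH).
    /The disk / && match($0, re) { print substr($0, RSTART, RLENGTH) }
')

printf '%s\n' "$MONITOR"
```

Lines without The disk, or without the quoted phrase, produce no output, so MONITOR is left empty for them.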
Wernsey
  • Don't use backticks — use `$(...)` notation instead. Granted, there isn't as clear-cut an advantage here, but in general, the `$(...)` notation is superior for a variety of reasons. – Jonathan Leffler Feb 11 '13 at 15:24
  • Instead of having input.txt, i have string stored in a variable – Nik Feb 11 '13 at 15:54
  • @Nik Remove the `input.txt` and pipe the contents of your variable like so: `echo $THEVARIABLE | awk '...'` – Wernsey Feb 11 '13 at 16:19
0

It might be a little easier with sed than awk:

string="The disk 'virtual memory' also known as 'Virtual Memory' has exceeded the maximum utilization threshold of 95 Percent."

MONITOR=$(echo "$string" | sed -n "/The disk \('[^']*' also known as '[^']*'\) .*/s//\1/p")

If awk is necessary, then:

MONITOR=$(echo "$string" | awk "/The disk '[^']*' also known as '[^']*'/ {
                                print \$3, \$4, \$5, \$6, \$7, \$8, \$9; } {}")

The empty braces {} match every line and print nothing, so only lines that match the regex produce output. Note that this assumes each disk has a name with exactly two words in it. For names of other lengths you need more powerful processing (the gsub function, for example) to do regex-based substitution. This is not awk's forte; sed is easier to use for that task.

Both commands are set up to handle multiple lines of data interspersed with non-matching lines (but also work on single lines containing the matching information). It would also not be very difficult to just print the names between quotes on separate lines, so that you have less dissection to do afterwards (to get the two space-separated names).
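
As a rough sketch of that last idea, assuming the names never contain a single quote themselves, awk can split each line on the quote character so that the two names fall into fields 2 and 4. The variable name NAMES is made up for this example.

```shell
string="The disk 'virtual memory' also known as 'Virtual Memory' has exceeded the maximum utilization threshold of 95 Percent."

# With a single quote as the field separator the line splits into:
#   $1 = "The disk "   $2 = first name   $3 = " also known as "
#   $4 = second name   $5 = the rest of the message
NAMES=$(printf '%s\n' "$string" | awk -F"'" '/The disk / { print $2; print $4 }')

printf '%s\n' "$NAMES"
```

Because the split is on quotes rather than spaces, this does not depend on each name having two words.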

Jonathan Leffler
  • Instead of having stored in input.txt it is stored in a variable $CONTAINER_STRING and i want to extract the match value and store it in variable $MONTIOR, does below expression looks ok now? $CONTAINER=`echo "$CONTAINER_STRING" | awk '/The disk '\''.*'\'' also known as '\''.*'\'' has exceeded/ {gsub(/The disk /,"");gsub(/ has exceeded.*$/,"");print}'' – Nik Feb 11 '13 at 15:56
  • @Nik: If your data has double quotes around the names, then that looks almost right—I've not tested what you suggest, but I see some inconsistencies in the quote handling. If your data has single quotes around the names(as shown in the question), it won't work. It might be simpler (in terms of dealing with quotes) to put the `awk` program in a file (say `script.awk`) and then use that: `CONTAINER=$(echo "$CONTAINER_STRING" | awk -f script.awk)`. Note that there is no `$` at the start of the assignment. – Jonathan Leffler Feb 11 '13 at 16:50
  • Jonathan i liked your idea of using sed too...i will try this – Nik Feb 11 '13 at 18:44
  • $Container is already a variable containing the string "The disk 'virtual memory' also known as 'Virtual Memory' has exceeded the maximum utilization threshold of 95 Percent." @Jonathan $MONITOR=$(echo "$Container" | sed -n "/The disk \('[^']*' also known as '[^']*'\) .*/s//\1/p") What will the output of $MONITOR be? – Nik Feb 11 '13 at 18:46
  • You keep changing variable names; `$Container` is quite separate from `$CONTAINER`. You still would not normally write `$MONITOR=Something`; that assigns the value `Something` to the variable whose name is held in `$MONITOR`. To modify the variable itself, you write: `MONITOR=Something`. The `sed` command you quote uses `(` and `)` without backslashes; what happens depends on the version of `sed` you're using. With most versions, the match fails as unescaped parentheses are not metacharacters. If you're using GNU `sed`, it might (or might not) interpret the parentheses as metacharacters. – Jonathan Leffler Feb 11 '13 at 18:57
  • I am sorry about that; let me try to write it cleanly. $CONTAINER is a variable which already contains the string "The disk 'virtual memory' also known as 'Virtual Memory' has exceeded the maximum utilization threshold of 95 Percent." Now I have applied the sed expression you provided: $MONITOR=$(echo "$CONTAINER" | sed -n "/The disk \('[^']*' also known as '[^']*'\) .*/s//\1/p") I am expecting the output of $MONITOR to be 'virtual memory' also known as 'Virtual Memory'. Is this the right assumption? – Nik Feb 11 '13 at 19:04
  • No! Please note that my answer writes: `MONITOR=$(...)` and ***not*** `$MONITOR=$(...)`. As I explained in my previous comment, putting the `$` at the start of the assignment is almost invariably incorrect, though not syntactically illegal. The one thing it doesn't do is assign to the variable `MONITOR` unless you get `Yes` from the following fragment: `if [ "$MONITOR" = "MONITOR" ]; then echo Yes; else echo No; fi`. Please pay attention to what I am saying (or what you are writing — I can't tell which is the problem)! I tested the code in the answer before posting it; it works on Mac OS X. – Jonathan Leffler Feb 11 '13 at 19:11
  • Taken care of that, Sir. What about the rest of the logic: MONITOR=$(echo "$CONTAINER" | sed -n "/The disk ('[^']*' also known as '[^']*') .*/s//\1/p") I am expecting the output of $MONITOR to be 'virtual memory' also known as 'Virtual Memory'. Is this the right assumption? – Nik Feb 11 '13 at 19:16
  • OK, I think it worked, but only partially, maybe because I did not provide the right string. Here was the string content: Mon 11 Feb, 2013 - 13:46:32 The disk 'disk partition 01pgp3:/pgp' also known as 'Disk partition 01pgp3:/pgp' is no longer available. Sed extracted: Mon 11 Feb, 2013 - 13:46:32 'disk partition 01pgp3:/pgp' also known as 'Disk partition 01pgp3:/pgp'. I just need to drop the day and time; how can I do that? – Nik Feb 11 '13 at 20:02
  • Put a `.*` in front of `The disk`, as in: `MONITOR=$(echo "$CONTAINER" | sed -n "s/.*The disk \('[^']*' also known as '[^']*'\) .*/\1/p")`. – Jonathan Leffler Feb 11 '13 at 20:30
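
A minimal sketch of the leading `.*` idea, applied to the timestamped string quoted in the thread. It is written as a single sed `s` command with `-n` and the `p` flag, so only lines where the substitution succeeds are printed; the variable names follow the comments above.

```shell
CONTAINER="Mon 11 Feb, 2013 - 13:46:32 The disk 'disk partition 01pgp3:/pgp' also known as 'Disk partition 01pgp3:/pgp' is no longer available."

# The leading .* consumes the date and time, the trailing .* consumes the
# rest of the message, and \1 keeps only the quoted phrase in between.
MONITOR=$(printf '%s\n' "$CONTAINER" | sed -n "s/.*The disk \('[^']*' also known as '[^']*'\) .*/\1/p")

printf '%s\n' "$MONITOR"
```

Note the assignment is written `MONITOR=...`, with no `$` in front of the variable name.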