
I am trying to get the best-matching word in a file, together with its number of errors, using agrep. So far I can only get the word, using this script:

array=(bla1 bla2 bla3)
for eachWord in "${array[@]}"; do
  result=$(yes "yes" | agrep -B "${eachWord}" /home/victoria/file.txt)
  printf '%s\n' "$result"
done

Here bla1, bla2, and bla3 stand for arbitrary words.

The output I have is the following:

agrep: 4 words match within 2 errors; search for them? (y/n)counting
first
and
should
agrep: 1 word matches within 1 error; search for it? (y/n)should
agrep: 2 words match within 4 errors; search for them? (y/n)must
must
agrep: 1 word matches within 2 errors; search for it? (y/n)should

Is there any way to get the number of errors (2, 1, 4, 2 in the example output above)?

agc
Victoria
  • What do you want with it? – Inian Mar 25 '19 at 08:58
  • The Levenshtein distance between my word and the best-matching word – Victoria Mar 25 '19 at 09:24
  • As far as I understand, you want the output to be: 2 1 4 2 (i.e., the number of errors). Can you try this: `result=$(yes "yes" | agrep -B ${eachWord} /home/victoria/file.txt|sed -E -n 's/.*\s+within\s+([0-9]+)\s+errors\;.*/\1/p')`. I'm pretty sure this can be done using one `sed` or `awk`. – User123 Mar 25 '19 at 09:43
  • What I get after this is the following. Is there a way to extract only the second number? : agrep: 4 words match within 2 errors; search for them? (y/n) agrep: 1 word matches within 1 error; search for it? (y/n) agrep: 2 words match within 4 errors; search for them? (y/n) agrep: 1 word matches within 2 errors; search for it? (y/n) – Victoria Mar 25 '19 at 10:02

1 Answer


The main problem is that agrep reports the error count on standard error (file descriptor 2), not on standard output (file descriptor 1). To throw away stdout and keep stderr, first redirect stderr to where stdout currently points, then redirect stdout to /dev/null, in that order:

2>&1 1>/dev/null
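
The order of the two redirections matters. Here is a minimal sketch of the difference, using a made-up function `emit_both` (not part of agrep) that writes one line to each stream:

```shell
# Hypothetical stand-in for a command that writes to both streams.
emit_both() {
  echo "to stdout"
  echo "to stderr" >&2
}

# stderr is pointed at the capture first, then stdout is discarded.
captured=$(emit_both 2>&1 1>/dev/null)
echo "captured: $captured"

# Reversed order: stdout goes to /dev/null, then stderr follows it there.
reversed=$(emit_both 1>/dev/null 2>&1)
echo "reversed: [$reversed]"
```

With the first form only the stderr line is captured; with the reversed form nothing is, because stderr is duplicated from a stdout that already points at /dev/null.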

The minor problem is that agrep does not end its prompt with a newline when you feed it with yes. You have to write a newline to stderr yourself:

echo >&2

Finally, as User123 told you, you need a sed command to extract the number of errors.

This is an example:

for a in r1234t rot ruht rood; do
  yes y | agrep -B "$a" /etc/passwd
  echo >&2
done 2>&1 1>/dev/null |
sed -n 's/.* \([0-9]\+\) error.*/\1/p'

Output is:

4
1
2
1
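
If you want the count per word rather than one combined list, the extraction step can be wrapped in a small function. This is only a sketch: `error_count` is a hypothetical helper name, and the two sample lines are copied from the prompts shown in the question rather than produced by running agrep here. The pattern accepts both "error" and "errors" and uses only POSIX basic regular expression syntax:

```shell
# Hypothetical helper: print the error count from one agrep prompt line.
error_count() {
  printf '%s\n' "$1" |
    sed -n 's/.* within \([0-9][0-9]*\) errors\{0,1\};.*/\1/p'
}

# Sample prompt lines taken from the question's output.
plural='agrep: 4 words match within 2 errors; search for them? (y/n)'
singular='agrep: 1 word matches within 1 error; search for it? (y/n)'

n_plural=$(error_count "$plural")
n_singular=$(error_count "$singular")
echo "$n_plural $n_singular"
```

Inside the loop this could presumably be called as `error_count "$(yes y | agrep -B "$a" /etc/passwd 2>&1 1>/dev/null)"`, although that combination has not been checked against agrep itself.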
ceving
  • I guess the OP only wants the second number (more precisely, the number of errors: the number before the string 'errors'), so the `sed` command could be modified a bit: `'s/.*\s+within\s+([0-9]+)\s+errors\;.*/\1/p'` – User123 Mar 26 '19 at 05:30