I am trying to update the numbering in a test.html file:

<td class="no">(8)</td>
<td class="no">(9)</td>
<td class="no">(10)</td>
<td class="no">(11)</td>
<td class="no">(23)</td>

New lines may be inserted between the existing ones, so I don't want to update the numbering by hand every time. One more condition: the update should only start after number 7.

I tried using gensub to replace the line with the matched number, but it doesn't work the way I expected. There must be an easier way to extract the numbers! The tutorials and forum posts I found either didn't help or I didn't understand them...

What I have so far:

/^<td class="no">\([0-9]+\)<\/td>$/ {
  a = gensub(/(.*)([0-9]+)(.*)/, "\\2", "g") # this finds only 1 digit, why?
  if (a > 7) print a
}
Sanane Lan
  • The answer to your `why` is that `.` in `.*` matches digits too. In any case, you can apply the answer given at http://stackoverflow.com/a/40512703/1745001 to this problem. See in particular the simplified gawk-specific solution for non-nested terminators at the end of the answer. – Ed Morton Nov 10 '16 at 17:07
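
As a minimal sketch of the point in the comment above: because `.*` is greedy and also matches digits, the first group in `(.*)([0-9]+)(.*)` swallows everything up to the last digit and leaves a single digit for the middle group. One way around this, assuming the number is always wrapped in literal parentheses as in the sample lines, is to match `\([0-9]+\)` directly. The helper name `extract_no` is hypothetical, and the sketch uses POSIX `match()`/`substr()` rather than gawk's `gensub`:

```shell
# extract_no is a hypothetical helper, not part of the original question.
# It reads HTML lines on stdin and prints every bracketed number greater than 7.
extract_no() {
  awk '
    /^<td class="no">\([0-9]+\)<\/td>$/ {
      # match() sets RSTART and RLENGTH to the span of "(digits)"
      if (match($0, /\([0-9]+\)/)) {
        # drop the surrounding "(" and ")", then add 0 to force a number
        n = substr($0, RSTART + 1, RLENGTH - 2) + 0
        if (n > 7) print n
      }
    }'
}

printf '%s\n' '<td class="no">(7)</td>' '<td class="no">(8)</td>' '<td class="no">(10)</td>' | extract_no
```

Because the pattern is anchored on the parentheses, multi-digit numbers such as 10 are captured whole instead of being cut down to their last digit.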

1 Answer

If you only need to extract the numbers, it is enough to strip every character that is not a digit:

/^<td class="no">\([0-9]+\)<\/td>$/ {
  gsub("[^0-9]","")        # strip every non-digit, leaving only the number in $0
  if ((0+$0) > 7) print    # 0+$0 forces a numeric comparison
}

Update: (0+$0) > 7 replaces my original $0 > 7 because the Cygwin gawk compares $0 and 7 as strings rather than as numbers. I do not know why; I'm not familiar with Cygwin.

This solution prints the following output:

8
9
10
11
23

If the test.html file had contained a line like

<td class="no">(71)</td>

the original code ($0 > 7) would also have printed

71

in Cygwin, because the string "71" sorts after "7", whereas "10", "11" and "23" sort before it.
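
The difference between the two comparisons can be sketched in isolation. This assumes that $0 holds a plain string after gsub(), as the Cygwin behaviour reported here suggests; the function name `compare_demo` is hypothetical:

```shell
# compare_demo is a hypothetical name used only for this illustration.
compare_demo() {
  awk 'BEGIN {
    s = "10"              # plain string, the assumed state of $0 after gsub()
    print (s > 7)         # string comparison: "10" sorts before "7", so this prints 0
    print ((0 + s) > 7)   # 0 + s is numeric, so 10 > 7 prints 1
    s = "71"
    print (s > 7)         # string comparison again: "71" sorts after "7", so this prints 1
  }'
}

compare_demo
```

When one operand is a string and the other a number, awk compares them as strings, which is why adding 0 to the value first gives the expected numeric result.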

Jdamian
  • This prints only 8 and 9 into the console. I am using Cygwin on Windows. – Sanane Lan Nov 14 '16 at 14:48
  • @SananeLan, my code prints the numbers inside the brackets when they are greater than 7. What does your code print? – Jdamian Nov 16 '16 at 07:51
  • Well, as I said, only the lines where 8 and 9 are in brackets. Does your code also print the numbers with 2 digits? – Sanane Lan Nov 16 '16 at 09:16
  • @SananeLan, yes it does. It prints any number greater than 7. Have you tried my code? – Jdamian Nov 16 '16 at 10:41
  • I used only your code, and I tried it with a test.html file containing (8) to (12) and it prints only 8 and 9 into the console. I used it like: gawk -f listnums.awk test.html. I tried it with cygwin and mingw, same result. – Sanane Lan Nov 17 '16 at 06:52
  • Could you please post the test.html file you used? – Jdamian Nov 17 '16 at 07:37
  • I edited my question. (I don't seem to be able to use [at]user, it disappears after posting a comment, or maybe I just can't see it?) – Sanane Lan Nov 21 '16 at 06:53
  • @SananeLan, I edited my solution. I cannot explain why Cygwin gawk behaves differently from RedHat or CentOS gawk. – Jdamian Nov 21 '16 at 08:06
  • Yes, you are right, now it works. I don't know why Cygwin behaves like that either, and I have no experience with (g)awk, so I wouldn't have thought of that... – Sanane Lan Nov 21 '16 at 11:02