
I'm having difficulty matching a particular word in a CSV file when that word may also appear prefixed with a question mark. I need to use tcsh.

What I mean is that I can match "cat" while excluding "zcat", but the match still includes "?cat". Here is my code:

#!/bin/tcsh -f
set viewSet = PRE_IMPL
set nbViewSet=`awk -F ";" '{ for (i=1; i<=NF; i++) { if ($i == "VIEW SETS") print i } }' csv.csv`
/usr/bin/awk -F ";" -v col="$nbViewSet" '(match($col, '"/\<"$viewSet"\>/"') != 0) {print}' csv2.csv

With this code, I use the following input CSV file:

STANDARD KEYS;COMPATIBLE KEYS;CELL KEY;COND KEY;WORKSPACE PATH;VIEW PATH;CATEGORIES;CONDS SECTION;VIEW SETS;TYPES
;;;;;;;;PRE_IMPL;
;;;;;;;;zPRE_IMPL;
;;;;;;;;?PRE_IMPL;
;;;;;;;;PRE_IMPL;

Here I want to match only the word "PRE_IMPL", and neither "zPRE_IMPL" nor "?PRE_IMPL". My code manages to exclude "zPRE_IMPL" but not "?PRE_IMPL", and I haven't found a way to change that. The output is:

;;;;;;;;PRE_IMPL;
;;;;;;;;?PRE_IMPL;
;;;;;;;;PRE_IMPL;

How do I change my code to match "PRE_IMPL" only?

tripleee
  • @Sylvain_cmz, good that you have shown your efforts in the form of code. Could you please post clearer samples in your question and let us know? – RavinderSingh13 Oct 28 '20 at 09:44

1 Answer

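The word boundary \< is satisfied between "?" and "P" because a question mark is not a word character, which is why "?PRE_IMPL" still matches. You can use a regex like (^|[^A-Za-z0-9_?])PRE_IMPL instead, to require the match to be either at the beginning of the field, or immediately preceded by a character which is neither a word character nor a question mark.

Tangentially, there is no need to run Awk twice here. (Or to use /usr/bin/awk in one place and just awk in another.)

awk -F ";" -v viewSet="$viewSet" '
  NR==1{ for (i=1; i<=NF; i++) if ($i == "VIEW SETS") col=i; next }
  match($col, "(^|[^A-Za-z0-9_?])" viewSet "\>")' csv2.csv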
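
If your Awk does not understand the \< and \> word-boundary operators (they are an extension, not POSIX), both boundaries can be spelled out as explicit character classes. The following is only a sketch under a few assumptions: filter_view_set is a hypothetical helper name, the header row is in the same file as the data, and the view set name contains no regex metacharacters. It keeps "PRE_IMPL" while rejecting both "zPRE_IMPL" and "?PRE_IMPL".

```shell
#!/bin/sh
# Sketch only: filter_view_set is a hypothetical helper, not part of the original script.
# $1 = view set name to match as a whole word
# $2 = semicolon-separated CSV file whose first row is the header
filter_view_set() {
    awk -F ';' -v viewSet="$1" '
        # Header row: remember which column is titled "VIEW SETS", then skip it.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == "VIEW SETS") col = i; next }
        # Print rows where viewSet is bounded on the left by the start of the
        # field or a character that is not a letter, digit, "_" or "?", and on
        # the right by a non-word character or the end of the field.
        col && $col ~ ("(^|[^A-Za-z0-9_?])" viewSet "([^A-Za-z0-9_]|$)")
    ' "$2"
}
```

Called as filter_view_set PRE_IMPL csv2.csv, this prints only the data rows whose "VIEW SETS" field contains PRE_IMPL as a standalone word. The function wrapper is POSIX sh; the Awk program itself is what does the matching.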
tripleee
  • This luckily avoids any code which depends on which shell you are using. I'll repeat my recommendation to try to transition away from `tcsh` which only has a very niche user base any longer. – tripleee Oct 28 '20 at 09:51
  • My Awk doesn't like the `"\>"` but if it works for you, go for it. I got it to work by similarly replacing it with `"([^A-Za-z0-9_]|$)"`. – tripleee Oct 28 '20 at 09:59
  • Thanks, that's all I needed, I'm filtering my CSV file fine now! Glad I could get help so fast :) I'll see what I can do to move away from tcsh; I was using this script to learn regexps and shell use at the beginning. It's true that tcsh isn't the most efficient for this kind of script, I'll see what I can change! – Sylvain_cmz Oct 28 '20 at 10:10