I am new to bash scripting and need help with the question below. I parsed a log file to get the data below and am now stuck on the next part. I have a file1.csv with this content:

mac-test-1,10.32.9.12,15
mac-test-2,10.32.9.13,10
mac-test-3,10.32.9.14,11
mac-test-4,10.32.9.15,13

and a second file, file2.csv, with this content:

mac-test-3,10.32.9.14
mac-test-4,10.32.9.15

I want to compare the two files and, if a line in the second file matches any line in the first file, change the content of file 1 as below:

mac-test-1,10.32.9.12, 15, no match
mac-test-2,10.32.9.13, 10, no match
mac-test-3,10.32.9.14, 11, matched
mac-test-4,10.32.9.15, 13, matched

I tried this

awk -F "," 'NR==FNR{a[$1]; next} $1 in a {print $0",""matched"}' file2.csv file1.csv 

but it prints the output below and doesn't include the non-matching records:

mac-test-3,10.32.9.14,11,matched 
mac-test-4,10.32.9.15,13,matched

Also, in some cases file2 can be empty, so the result should be like this:

mac-test-1,10.32.9.12,15, no match
mac-test-2,10.32.9.13,10, no match
mac-test-3,10.32.9.14,11, no match
mac-test-4,10.32.9.15,13, no match
RavinderSingh13
Akash Deep
  • You should include in your example cases where the 1st field matches between the 2 files but the 2nd field doesn't (and vice-versa) so we can see how you want those handled. Right now we can't tell if you want to match on the first field or the second or both and your statement `if the line in second file matches any line in first file` doesn't reflect your expected output since there are no cases where the whole line from file1 matches the whole line from file2. – Ed Morton Oct 07 '22 at 13:27

3 Answers

With your shown samples, please try the following awk code. You don't need to test the condition first and then print, because when $1 in a is used as the pattern, lines that don't exist in the array never enter that block. It's better to print every line of file1.csv and append its status, matched or not matched, based on whether its key exists in the array.

awk '
BEGIN  { FS=OFS="," }
FNR==NR{
  arr[$0]
  next
}
{
  print $0,(($1 OFS $2) in arr)?"Matched":"Not-matched"
}
' file2.csv file1.csv


EDIT: Adding a solution for the case where file2.csv is empty. The concept is the same as above; it only adds handling for an empty file2.csv.

awk -v lines="$(wc -l < file2.csv)" '
BEGIN  { FS=OFS=","}
(lines==0){
  print $0,"Not-matched"
  next
}
FNR==NR{
  arr[$0]
  next
}
{
  print $0,(($1 OFS $2) in arr)?"Matched":"Not-matched"
}
' file2.csv file1.csv
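
For context on why the first version prints nothing when file2.csv is empty, here is a small sketch. The scratch directory and the helper names `plain` and `guarded` are made up for the demo, and `lines + 0` is only a defensive numeric conversion added here. awk never reads a record from an empty file, so NR is still 0 when file1.csv starts; from then on FNR and NR grow together, FNR==NR stays true, and every line of file1.csv is stored in the array instead of being printed.

```shell
#!/bin/sh
# Scratch directory and helper names are illustrative, not part of the answer.
tmp=$(mktemp -d)
printf '%s\n' 'mac-test-1,10.32.9.12,15' 'mac-test-2,10.32.9.13,10' > "$tmp/file1.csv"
: > "$tmp/file2.csv"    # create an empty file2.csv

# First version: relies on FNR==NR alone to recognise the first file.
plain() {
  awk '
  BEGIN { FS = OFS = "," }
  FNR == NR { arr[$0]; next }
  { print $0, ((($1 OFS $2) in arr) ? "Matched" : "Not-matched") }
  ' "$1" "$2"
}

# Guarded version: a zero line count short-circuits straight to "Not-matched".
guarded() {
  awk -v lines="$(wc -l < "$1")" '
  BEGIN { FS = OFS = "," }
  lines + 0 == 0 { print $0, "Not-matched"; next }
  FNR == NR { arr[$0]; next }
  { print $0, ((($1 OFS $2) in arr) ? "Matched" : "Not-matched") }
  ' "$1" "$2"
}

plain_out=$(plain "$tmp/file2.csv" "$tmp/file1.csv")
guarded_out=$(guarded "$tmp/file2.csv" "$tmp/file1.csv")

[ -z "$plain_out" ] && echo "plain version printed nothing"
printf '%s\n' "$guarded_out"
rm -rf "$tmp"
```

One caveat with the line-count guard: wc -l counts newline characters, so a file2.csv holding a single line with no trailing newline would also be reported as 0 lines.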
RavinderSingh13
  • Thanks, the above script prints nothing if the file2 is empty. I want to handle a scenario that when file 2 is empty then it should print like this: mac-test-1,10.32.9.12,15, no match mac-test-2,10.32.9.13,10, no match mac-test-3,10.32.9.14,11, no match mac-test-4,10.32.9.15,13, no match – Akash Deep Oct 07 '22 at 09:26
  • @AkashDeep, Sure let me check and get back here. – RavinderSingh13 Oct 07 '22 at 09:34
  • @AkashDeep, I have added **EDIT** solution now in my answer, kindly do check it once and let me know how it goes. When I tested it, it worked fine for me with an empty file2.csv scenario. – RavinderSingh13 Oct 07 '22 at 09:37
  • Thank you so much, I will try this and post here if it works :) – Akash Deep Oct 07 '22 at 09:38
  • @AkashDeep, You're welcome, try it out and let me know how it goes. – RavinderSingh13 Oct 07 '22 at 09:45
  • @AkashDeep, Also note that file2.csv appears in two places in the code, so change both of them if your real file has a different name. – RavinderSingh13 Oct 07 '22 at 09:48

You are not printing the else case:

awk -F "," 'NR==FNR{a[$1]; next}
{
 if ($1 in a) {
  print $0 ",matched"
 } else {
  print $0 ",no match"
 }
}' file2.csv file1.csv

Output

mac-test-1,10.32.9.12,15,no match
mac-test-2,10.32.9.13,10,no match
mac-test-3,10.32.9.14,11,matched
mac-test-4,10.32.9.15,13,matched

Or in short, without manually printing the comma but using OFS:

awk 'BEGIN{FS=OFS=","} NR==FNR{a[$1];next}{ print $0 OFS (($1 in a)?"matched":"no match")}' file2.csv file1.csv
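
Note that these commands key on the first field only, and the question does not say whether the IP address has to match as well. The sketch below uses made-up data (a host whose name is in both files but with a different IP) to show how the two key choices differ; the helper names `by_name` and `by_name_and_ip` are hypothetical.

```shell
#!/bin/sh
tmp=$(mktemp -d)
# Illustrative data only: mac-test-3 has a different IP in file2.csv.
printf '%s\n' 'mac-test-3,10.32.9.14,11' 'mac-test-4,10.32.9.15,13' > "$tmp/file1.csv"
printf '%s\n' 'mac-test-3,10.32.9.99' 'mac-test-4,10.32.9.15' > "$tmp/file2.csv"

# Key on the host name (field 1) only.
by_name() {
  awk 'BEGIN { FS = OFS = "," }
       NR == FNR { seen[$1]; next }
       { print $0, (($1 in seen) ? "matched" : "no match") }' "$1" "$2"
}

# Key on host name and IP (fields 1 and 2) together.
by_name_and_ip() {
  awk 'BEGIN { FS = OFS = "," }
       { key = $1 FS $2 }
       NR == FNR { seen[key]; next }
       { print $0, ((key in seen) ? "matched" : "no match") }' "$1" "$2"
}

name_out=$(by_name "$tmp/file2.csv" "$tmp/file1.csv")
pair_out=$(by_name_and_ip "$tmp/file2.csv" "$tmp/file1.csv")
printf 'field 1 only:\n%s\n\nfields 1 and 2:\n%s\n' "$name_out" "$pair_out"
rm -rf "$tmp"
```

With this data, the first variant reports mac-test-3 as matched and the second reports it as no match, so the choice of key depends on what counts as "the same line" for your logs.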

Edit

I found a solution on this page handling FNR==NR on an empty file.

When file2.csv is empty, all output lines will be:

mac-test-1,10.32.9.12,15,no match

Example

awk -F "," '
ARGV[1] == FILENAME{a[$1];next}
{
 if ($1 in a) {
  print $0 ",matched"
 } else {
  print $0 ",no match"
 }
}' file2.csv file1.csv
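
As a quick check of the empty-file case, the sketch below runs the same FILENAME comparison against an empty file2.csv created in a scratch directory (the directory and the `label` function name are assumptions for the demo). While file1.csv is being read, FILENAME holds its path and never equals ARGV[1], so every line falls through to the else branch.

```shell
#!/bin/sh
# Scratch directory and function name are illustrative only.
tmp=$(mktemp -d)
printf '%s\n' 'mac-test-1,10.32.9.12,15' 'mac-test-2,10.32.9.13,10' > "$tmp/file1.csv"
: > "$tmp/file2.csv"    # empty lookup file

label() {
  awk -F "," '
  ARGV[1] == FILENAME { a[$1]; next }   # only true while the first operand is read
  {
    if ($1 in a) {
      print $0 ",matched"
    } else {
      print $0 ",no match"
    }
  }' "$1" "$2"
}

empty_out=$(label "$tmp/file2.csv" "$tmp/file1.csv")
printf '%s\n' "$empty_out"
rm -rf "$tmp"
```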
The fourth bird
  • This is what I was looking for: how to use an if/else condition here. Only one issue with this solution: how do I handle the case when file 2 is empty/null, so that all the records in file 1 are marked as no match? – Akash Deep Oct 07 '22 at 08:50
  • @AkashDeep That is how the mechanism works, you first create an array with all the entries from `file2.csv` and then when processing the second file there is a print for the if and the else. – The fourth bird Oct 07 '22 at 08:56
  • @AkashDeep, What do you want to do with empty lines? – RavinderSingh13 Oct 07 '22 at 09:05
  • @RavinderSingh13 I have a use case where in some scenario file 2 can be empty so in that case all the records of file 1 should be updated with "no match". – Akash Deep Oct 07 '22 at 09:33
  • @AkashDeep I have added another example in case file2.csv is empty – The fourth bird Oct 07 '22 at 10:51

@RavinderSingh13's and @Thefourthbird's answers each contain large parts of the solution, but here it is all together:

awk '
    BEGIN { FS=OFS="," }
    { key = $1 FS $2 }
    FILENAME == ARGV[1] {
        arr[key]
        next
    }
    {
        print $0, ( key in arr ? "matched" : "no match") 
    }
' file2.csv file1.csv

or if you prefer:

awk '
    BEGIN { FS=OFS="," }
    { key = $1 FS $2 }
    !f {
        arr[key]
        next
    }
    {
        print $0, ( key in arr ? "matched" : "no match") 
    }
' file2.csv f=1 file1.csv
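
The second form relies on awk treating an operand of the form var=value as an assignment that is performed when awk reaches it in the argument list, that is, after file2.csv has been read and before file1.csv is opened. A small sketch with two throwaway files (the file names are made up for the demo) shows the flag changing between files:

```shell
#!/bin/sh
# Throwaway files in a scratch directory; names are illustrative only.
tmp=$(mktemp -d)
printf 'x\n' > "$tmp/first.txt"
printf 'y\n' > "$tmp/second.txt"

# f is unset (so f + 0 is 0) while first.txt is read,
# and 1 once awk has passed the f=1 operand.
flag_out=$(cd "$tmp" && awk '{ print FILENAME, "f=" (f + 0) }' first.txt f=1 second.txt)
printf '%s\n' "$flag_out"
rm -rf "$tmp"
```

Because the flag depends only on the position of f=1 in the argument list, the !f block is simply skipped for file1.csv even when file2.csv is empty.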
Ed Morton