I've created a CSV file from the shell, and now I need to filter it by column. I used this command:

$ cut -d ';' -f 12,22 big_file.csv

The input looks like:

ACT;XXXXXX;MCD;881XXXX;881017XXXXXX;ABCD;BMORRR;GEN;88XXXXXXXXXX;00000;01;2;000008608008602;AAAAAAAAAAA;0051;;;;;;093505;
ACT;XXXXXX;MCD;881XXXX;881017XXXXXX;ABCD;BMORRR;GEN;88XXXXXXXXXX;00000;01;3;000008608008602;AAAAAAAAAAA;0051;;;;;;085000;anl@mail.com

The output is:

ID CLIENT;email
00000xxxxxxxxx
00000000xxxxxx;anl@mail.com

As you can see, the last column does not appear when it is empty (note that the semicolon is missing in the first data line). I want this:

ID CLIENT;email
00000xxxxxxxxx;
00000000xxxxxx;anl@mail.com

The same command works on another CSV file. I've checked this CSV, and the columns exist.

2 Answers

There doesn't seem to be a way to make cut do this. The next step up in expressivity is awk, which does it easily:

$ cat testfile
one;two;three;four
1;2;3
first;second
only
$ awk -F';' '{ OFS=FS; print $1, $3 }' < testfile
one;three
1;3
first;
only;
$ 
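
Applied to the columns from the question, the same idea might look like the sketch below. The sample rows are made up for illustration (one with 22 fields, one with only 12), and OFS is set with -v instead of inside the action, which should behave the same way here.

```shell
# Hypothetical sample data: the first row has 22 fields, the second only 12.
sample='a;b;c;d;e;f;g;h;i;j;k;ID1;m;n;o;p;q;r;s;t;u;mail@example.com
a;b;c;d;e;f;g;h;i;j;k;ID2'

# Print fields 12 and 22 joined by ';'. A missing field 22 expands to an
# empty string, so the separator is still printed.
result=$(printf '%s\n' "$sample" | awk -F';' -v OFS=';' '{ print $12, $22 }')
printf '%s\n' "$result"
```

On a real file you would drop the sample variable and pass the file name to awk as its last argument.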
hmakholm left over Monica

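You don't get the semicolon in the output of your second line because that line contains just 21 fields (the first contains 23). cut prints only the selected fields that actually exist on a line, so with field 22 missing it emits field 12 alone, without a delimiter. You can check that using: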

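(tr -d -c ';\n' < bigfile.csv; echo "123456789012345678901") | cat -n | grep -v -E ';{21}'

This strips everything except semicolons and newlines, then prints every line of bigfile.csv that has fewer than 21 semicolons (that is, fewer than 22 fields), along with its line number. The digits appended by echo show up as the last line of the output and serve as a ruler for counting the semicolons.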
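
If awk is available, a possibly simpler way to find the short lines is to let awk count the fields itself. This is only a sketch on made-up sample rows (one with 22 fields, one with 12); NF and NR are awk's built-in field count and line number.

```shell
# Hypothetical sample data: the first row has 22 fields, the second only 12.
sample='a;b;c;d;e;f;g;h;i;j;k;ID1;m;n;o;p;q;r;s;t;u;mail@example.com
a;b;c;d;e;f;g;h;i;j;k;ID2'

# Report every line that has fewer than 22 fields, with its line number.
short=$(printf '%s\n' "$sample" | awk -F';' 'NF < 22 { print NR ": " NF " fields" }')
printf '%s\n' "$short"
```

For the sample above this reports only line 2. On a real file, pass the file name to awk instead of piping in the sample.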

To fix that, you can add a bunch of empty fields at the end of each line and pipe the result to cut like this:

sed -e 's|^\(.*\)|\1;;;;;;;;;;;;;;;;;;;;;;;;|g' bigfile.csv | cut -d ';' -f 12,22

The result is:

XXXXXXXXYYY;XXXNNN
XXXXYYYYXXXXX;
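
To see why the padding helps, here is a small sketch on a made-up three-field line, selecting fields 2 and 5 instead of 12 and 22. It assumes cut behaves as described above: a selected field that does not exist is skipped together with its delimiter.

```shell
# Hypothetical three-field line; field 5 does not exist.
line='a;b;c'

# Field 5 is missing, so cut prints field 2 alone.
unpadded=$(printf '%s\n' "$line" | cut -d ';' -f 2,5)

# After padding, field 5 exists (and is empty), so the delimiter is printed.
padded=$(printf '%s\n' "$line;;;" | cut -d ';' -f 2,5)

printf '%s\n' "$unpadded" "$padded"
```

The first line of output should be b and the second b; with the trailing semicolon.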
jottbe