I'm trying to count the number of elements (words) in each field of a big table. Fields are delimited by whitespace, and the elements within a field ("words") by commas. The table also contains empty fields (marked by two or more consecutive whitespace characters), which should count as 0 elements.
For example, from a table such as this:
val1 this,is,text this,more,text  stop
val2  this,is a field
val3    end,text
This would be the desired output:
val1 3 3 0 1
val2 0 2 1 1
val3 0 0 0 2
(I'd like to keep the first column as is)
Please note that there are two blank spaces before the stop value in the first line, indicating that the fourth field has 0 elements. Similar things happen in other lines.
I've been using the split function of awk to create an array with the desired number of elements for each field:
awk '{ for(i = 2; i <= NF; i++) {
$i=split($i,a,",") ; { if (!$i) { $i="0" }};
}; print $0}' input
I'm splitting each field i into an array a of n elements, and assigning this value to the variable $i. In the case of 0 elements in the given field (!$i), $i=0.
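To show the counting step in isolation, here is a minimal sketch of what split returns for a few strings. The helper name count_items is hypothetical and only wraps a one-line awk call; it is not part of my real script.

```shell
# Hypothetical helper: print the number of comma-separated items in a string.
count_items() {
    # split() fills array a and returns the number of pieces it produced.
    awk -v s="$1" 'BEGIN { print split(s, a, ",") }'
}

count_items 'this,is,text'   # three comma-separated items
count_items 'stop'           # a single item, no comma
count_items ''               # empty string: split() returns 0
```

Since split already returns 0 for an empty string, the (!$i) check is probably redundant, as long as the empty fields actually reach the loop.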
But this is my current, unwanted output:
val1 3 3 1
val2 2 1 1
val3 2
As you can see, 0 values are omitted. I think that there's some issue with the assignment of the 0 value to empty fields.
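As a quick check on where the zeros might be getting lost, here is a small sketch that prints how many fields awk sees on each line, once with the default separator and once splitting on every single space. The helper name show_nf is hypothetical, and the exact number of spaces in the sample lines is my assumption, chosen so that they match the desired output above.

```shell
# Hypothetical diagnostic helper: for each line, print the field count under
# the default separator (NF) and under a single-space separator ("[ ]").
show_nf() {
    awk '{
        n = split($0, f, "[ ]")   # "[ ]" is a regex matching exactly one space
        printf "%s default=%d single-space=%d\n", $1, NF, n
    }' "$@"
}

# Sample input rebuilt with printf so the repeated spaces are explicit.
printf '%s\n' \
    'val1 this,is,text this,more,text  stop' \
    'val2  this,is a field' \
    'val3    end,text' | show_nf
```

If the default count comes out lower than the single-space count, the empty fields are being collapsed before the loop ever sees them, which would suggest the problem lies in the field splitting rather than in the assignment of 0.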
Can anyone help me? Thanks a lot in advance!