I have a gzipped, tab-delimited file, for example file.dat.gz.

Here is a sample line, with tabs shown as ^I:

hi^Iapple^Itoast

Is it possible to count the characters between the tabs using wc?

For the line above the field lengths would be 2, 5, and 5, so nothing would be reported (a result of 0). But if a field were longer than 8000 characters, could the command report 1, or the exact length?

I eat toast
  • Count *what* in between tabs? What is "it" in "if it was greater"? What do you mean by "list 1 or the exact value"? Can you show expected output? – Benjamin W. Mar 23 '20 at 17:47
  • Ah, I mean counting the characters in between the tabs. Essentially, what I'm asking is whether it is possible to do a count on the text between two tabs. The overall goal is to output the fields whose count is greater than 8000. – I eat toast Mar 23 '20 at 18:26
  • so an expected output would be a character blob that is greater than 8*10^3. – I eat toast Mar 23 '20 at 18:27
  • Show what you've tried and provide a [mcve] in your question – oguz ismail Mar 23 '20 at 18:31

2 Answers

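You don't need wc for this.

Set IFS to a tab for the duration of a single read by putting the assignment on the same line, directly in front of the read.
Splitting on tabs only means that spaces stay inside a field (cf. "a b c").
Read the fields into an array, and loop over each element.

Test whether the length is greater than 8000 and behave accordingly. Here's a quick example you should be able to adapt.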

 $: IFS=$'\t' read -r -a lst < in
 $: for x in "${lst[@]}"
 >  do l="${#x}"
 >     if (( l > 8000 ))
 >     then x='<too long>'
 >     fi
 >     printf "'%s' = %d\n" "$x" "$l"
 >  done
 'hi' = 2
 'a b c' = 5
 'apple' = 5
 '<too long>' = 10000
 'toast' = 5
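
The transcript above only reads the first line of the input. As a rough sketch of how the same idea could cover every line, here is a small bash function. The name `report_long_fields` and the `line:field:length` output format are my own assumptions for illustration, not something from the question. One caveat: because a tab counts as IFS whitespace, bash collapses consecutive tabs, so empty fields are skipped and the field numbers count non-empty fields only.

```bash
#!/usr/bin/env bash
# Hypothetical helper (name and output format are assumptions):
# reads tab-delimited text on stdin and prints "line:field:length"
# for every field longer than the limit (default 8000).
report_long_fields() {
  local limit=${1:-8000} n=0 i
  local -a fields
  # The second test handles a last line that has no trailing newline.
  while IFS=$'\t' read -r -a fields || (( ${#fields[@]} > 0 )); do
    n=$((n + 1))
    for i in "${!fields[@]}"; do
      if (( ${#fields[i]} > limit )); then
        printf '%d:%d:%d\n' "$n" "$((i + 1))" "${#fields[i]}"
      fi
    done
  done
}

# Possible usage with the file from the question:
#   gzip -dc file.dat.gz | report_long_fields 8000
```

Nothing is printed when no field exceeds the limit, which would correspond to the "0" case in the question.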

If you are processing a really big file, write it in awk or perl for better performance.

Paul Hodges
 awk -F'\t' '{for (i=1; i<=NF;i++) if(length($i)>8000)  print $i}'

Demo

$ echo -e "hi\tapple\ttoast"
hi	apple	toast
$ echo -e "hi\tapple\ttoast" | awk -F'\t' '{print length($1), length($2), length($3)}'
2 5 5
$ echo -e "hi\tapple\ttoast" | awk -F'\t' '{for (i=1; i<=NF; i++) if (length($i)>2) print $i}'
apple
toast
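
Since the file in the question is gzipped and the asker also wanted the exact value, here is a hedged sketch that decompresses to a pipe and reports where each long field is and how long it is. The function name `long_fields_gz` and the tab-separated `line, field, length` output are assumptions made for illustration. Unlike bash's read, awk with -F'\t' keeps empty fields, so the field number matches the column position.

```shell
# Hypothetical helper (name and output layout are assumptions):
#   $1 = path to a gzip-compressed, tab-delimited file
#   $2 = length limit (default 8000)
# Prints "line<TAB>field<TAB>length" for each field longer than the limit.
long_fields_gz() {
  gzip -dc < "$1" |
    awk -F'\t' -v OFS='\t' -v limit="${2:-8000}" '
      {
        for (i = 1; i <= NF; i++)
          if (length($i) > limit)
            print NR, i, length($i)
      }'
}

# Possible usage with the file from the question:
#   long_fields_gz file.dat.gz 8000
```

Piping the result through `wc -l` would give a count of the long fields instead of the list.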
Digvijay S