17

I have data of the following form:

num1    This is a string
num2    This is another string

I want to limit the length of every string after the first tab, such that length(string)<4. The output I want is:

num1    This is a string
num2    This is another 

I can do this in Python, but I am looking for a Linux command-line equivalent that achieves the same result.

Danstahr
Jannat Arora

3 Answers

33

In bash, you can use substring expansion to truncate a string. In this case, it keeps 17 characters starting at index 0 (that is, indices 0 through 16).

$ var="this is a another string"

$ echo "${var:0:17}"

this is a another
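
To apply the same expansion to every line of a file rather than to one variable, a loop along the following lines might work. This is only a sketch: the function name truncate_after_tab and the default limit of 17 are assumptions made for illustration, and it relies on bash-specific features.

```shell
#!/usr/bin/env bash
# truncate_after_tab is a hypothetical helper name, not part of the answer above.
# It reads lines on stdin and truncates everything after the first tab
# to at most $1 characters (default: 17).
truncate_after_tab() {
    local max=${1:-17} line key text
    # "|| [[ -n $line ]]" also handles a final line without a trailing newline
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line == *$'\t'* ]]; then
            key=${line%%$'\t'*}     # everything before the first tab
            text=${line#*$'\t'}     # everything after the first tab
            printf '%s\t%s\n' "$key" "${text:0:max}"
        else
            printf '%s\n' "$line"   # no tab: pass the line through unchanged
        fi
    done
}
```

It could be used as `truncate_after_tab 16 < file`. A shell loop like this is likely to be slow on very large files, where an awk-based approach is usually the better fit.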
TrebledJ
jramirez
  • Yes, you are right... I have a very big file and I want to automate the procedure instead of doing the same thing manually, one line at a time – Jannat Arora Nov 08 '13 at 22:29
19

Using awk, by columns:

$ awk '{print $1, $2, $3, $4}' file

or with sed:

sed -r 's@^(\S+\s+\S+\s+\S+\s+\S+).*@\1@' file

or by length using cut:

$ cut -c 1-23 file
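
Note that cut -c counts characters from the start of the line, so the first column and the tab are included in the 23. If the first column varies in width, a variant that counts only the characters after the first tab may be closer to what the question asks for. The following is a sketch under that assumption; the function name truncate_awk and the default limit of 16 are made up for illustration.

```shell
# truncate_awk is a hypothetical helper name.
# It reads lines on stdin and keeps at most $1 characters (default: 16)
# of whatever follows the first tab on each line.
truncate_awk() {
    awk -v n="${1:-16}" '{
        i = index($0, "\t")                  # position of the first tab, 0 if none
        if (i)
            print substr($0, 1, i) substr($0, i + 1, n)
        else
            print                            # no tab: leave the line unchanged
    }'
}
```

It could be run as `truncate_awk 16 < file`. Whether substr counts bytes or characters for non-ASCII text depends on the awk implementation and the locale.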
Gilles Quénot
0

If you'd like to truncate strings on word boundaries, you could use fold with the -s option:

awk -F"\t" '{
    printf "%s\t", $1; system(sprintf("fold -sw 17 <<< \"%s\" | sed q", $2))
}'

The drawback is that fold and sed need to be called for each line (sed q prints only the first line, the same as head -n1).
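
The command above also depends on the shell started by system() understanding the <<< here-string, which is not the case for every /bin/sh. One way to avoid both the per-line processes and that dependency might be to do the word-boundary cut inside awk itself. This is a different technique from fold, sketched here only as an approximation of fold -s: the names truncate_words and cut_at_blank are hypothetical, and only spaces (not tabs) are treated as word boundaries.

```shell
# truncate_words is a hypothetical helper name.
# For each stdin line, the text after the first tab is cut to at most
# $1 characters (default: 17), backing up to the last space inside that
# window when there is one.
truncate_words() {
    awk -v n="${1:-17}" '
    function cut_at_blank(s,    t) {
        if (length(s) <= n) return s         # already short enough
        t = substr(s, 1, n)
        # the greedy match ends at the last space within the first n characters
        if (match(t, /.* /)) return substr(t, 1, RLENGTH)
        return t                             # no space: hard cut at n characters
    }
    {
        i = index($0, "\t")
        if (i)
            print substr($0, 1, i) cut_at_blank(substr($0, i + 1))
        else
            print
    }'
}
```

It could be run as `truncate_words 17 < file`. The cut keeps the space it breaks on, so truncated lines may end with a trailing blank.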

Cole Tierney