
Say I have an input stream consisting of lines separated into a certain number of fields. I would like to cut on the various fields, pipe a certain field (or fields) to a program (which is assumed to return one line for each input line) and leave the other fields as is, and paste the results back together. I can probably imagine convoluted solutions, but there ought to be a clean and natural way to do that.

As a specific example, say I have a program producing lines of the form:

$ inputprog
<a> hello world!
<b> hi everyone!
<a> hi!

Say I would like to put the message in uppercase while leaving the first field unchanged. Here is how I would imagine things:

$ inputprog | program -d' ' -f2- "tr a-z A-Z"
<a> HELLO WORLD!
<b> HI EVERYONE!
<a> HI!

I am looking for a reasonably clean way to approximate program. (I am not interested in solutions which are specific to this example.)

Thanks in advance for your help!

a3nm

1 Answer


awk can do what you want. For example:

$ echo "field1 field2" | awk '{$2 = toupper($2); print;}'
field1 FIELD2

This comes pretty close to what you want to do. $2 = toupper($2); changes the second field, while print prints out the whole (modified) line.

However, there is a problem in how you define a 'field'. In the example above, fields are separated by whitespace (you can change the field separator to an arbitrary regexp, like so: -F'<[a-zA-Z]+>', which would treat any text matching that pattern as a field separator). But in your example you seem to view <a> as one field and hello world! as another. A program could only arrive at that behaviour by guessing: why wouldn't world! be considered a third field? So, if you can get input with a clear policy for separating fields, awk is exactly what you want.
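If the policy is "the first field ends at the first space, and everything after it is the second field", one way to state that explicitly is to split the line yourself with index() and substr() instead of relying on awk's field splitting. This is only a sketch: upper_rest is a made-up helper name, and a single space is assumed as the delimiter.

```shell
# Hypothetical helper (not a standard tool): uppercase everything after
# the first space, leaving the first field and the original spacing untouched.
upper_rest() {
    awk '{
        i = index($0, " ")            # position of the first space, 0 if none
        if (i == 0) { print; next }   # no delimiter: pass the line through
        print substr($0, 1, i) toupper(substr($0, i + 1))
    }'
}

printf '%s\n' '<a> hello world!' '<b> hi everyone!' | upper_rest
```

Unlike assigning to $2, $3, ... this does not make awk rebuild the line, so runs of several spaces inside the message are preserved.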

Check out pages like http://people.cs.uu.nl/piet/docs/nawk/nawk_92.html (awk string functions) and http://www.pement.org/awk/awk1line.txt (awk one-liners) for more information.

BTW, one could also make your specific example above work by looping over all the fields except the first one (NF == Number of Fields):

$ echo "<a> hello world!
<b> hi everyone!
<a> hi" |  
awk '{for(i=2;i<=NF;++i) { $i=toupper($i); }; print;}'
<a> HELLO WORLD!
<b> HI EVERYONE!
<a> HI

Even though you are not interested in the solution to this example. ;-)

P.S.: sed should also be able to do the job (http://en.wikipedia.org/wiki/Sed)
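If the field has to go through an external filter rather than an awk built-in, a rough sketch with cut and paste might look like the following. map_rest is a hypothetical name, not an existing tool. The sketch assumes a single-space delimiter, that every line contains at least one delimiter, that the filter prints exactly one line per input line, and that mktemp is available.

```shell
# Hypothetical helper: run the given filter command on fields 2.. of each
# line and glue the result back onto the untouched first field.
# Assumptions: single-space delimiter present on every line; the filter
# emits exactly one output line per input line.
map_rest() {
    src=$(mktemp) || return 1
    res=$(mktemp) || { rm -f "$src"; return 1; }
    cat > "$src"                              # buffer stdin so it can be read twice
    cut -d' ' -f2- "$src" | "$@" > "$res"     # filtered remainder of each line
    cut -d' ' -f1 "$src" | paste -d' ' - "$res"
    rm -f "$src" "$res"
}

printf '%s\n' '<a> hello world!' '<b> hi everyone!' '<a> hi!' | map_rest tr a-z A-Z
```

The input is buffered in a temporary file because it has to be read twice, so this does not stream. A line without any space would be passed whole to both cut calls, which is why the delimiter assumption matters.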

firefrorefiddle
  • Thank you for this detailed reply! I know about `sed` and `awk`, but I really need to pipe the field into an external program for the use case that I have. `toupper()` works for this simple case, but `awk` built-ins would not be enough for what I want to do. However, it seems that `awk` has facilities to pipe fields to external commands, and I can probably use this to do what I want. Thanks for giving me the idea! :-) – a3nm Jul 28 '11 at 15:53