
I have two files like this:

# step         distance            
           0    4.48595407961296e+01
        2500    4.50383737781376e+01
        5000    4.53506757198727e+01
        7500    4.51682465277482e+01
       10000    4.53410353656445e+01

  # step   distance             
           0    4.58854106214881e+01
        2500    4.58639266431320e+01
        5000    4.60620560167519e+01
        7500    4.58990075106227e+01
       10000    4.59371359946124e+01

I want to join the two files while preserving the column spacing. In particular, the step counter of the second file should continue from the last step value of the first file instead of restarting at 0.

output:

  # step         distance            
               0    4.48595407961296e+01
            2500    4.50383737781376e+01
            5000    4.53506757198727e+01
            7500    4.51682465277482e+01
           10000    4.53410353656445e+01
           12500    4.58854106214881e+01
           15000    4.58639266431320e+01
           17500    4.60620560167519e+01
           20000    4.58990075106227e+01
           22500    4.59371359946124e+01

This was easy to do in Calc. The problem is that the spacing has to stay intact for the file to work, and Calc makes a complete mess of it.

Litisqe Kumar

2 Answers


Perl to the rescue!

#!/usr/bin/perl
use warnings;
use strict;

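open my $F1, '<', 'file1' or die $!;
my ($before, $after, $diff);
my $max = 0;
while (<$F1>) {
    print;  # The first file is copied through unchanged.
    # Capture the leading whitespace, the step, and the gap after it; skip non-data lines.
    my ($space1, $num, $space2) = /^(\s*) ([0-9]+) (\s*)/x or next;

    ($before, $after) = ($space1, $space2);
    $diff = $num - $max;  # Step increment between consecutive rows.
    $max = $num;          # Last step seen so far.
}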

$before = length "$before$max";  # We'll need it to format the computed numbers.

open my $F2, '<', 'file2' or die $!;
<$F2>; # Skip the header.
while (<$F2>) {
    my ($step, $distance) = split;
    $step += $max + $diff;
    printf "% ${before}d%s%s\n", $step, $after, $distance;
}

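The program remembers the last step of the first file in $max and the last step increment in $diff, so the steps of the second file continue at $max + $diff. It also stores the length of the leading whitespace plus $max in $before, so that printf can format all subsequent numbers to take up the same space.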

You didn't show how the distance column is aligned, i.e.

       20000    4.58990075106227e+01
       22500   11.59371359946124e+01 # dot aligned?
       22500    11.34572478912301e+01 # left aligned?

The program would align it the latter way. If you want the former, use a trick similar to the one used for the step column.
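As a rough illustration of that trick, here is a minimal sketch using the shell's printf, whose `%12d` and `%24s` conversions mean the same thing in Perl's printf. The helper name `format_row` is hypothetical, and the widths 12 and 24 are assumptions read off the sample data (step column plus a four-space gap and a 20-character distance). Right-aligning the distance only lines up the dots because every value has the same number of digits after the dot.

```shell
# format_row STEP DISTANCE: print one row with both columns right-aligned.
# Hypothetical helper; the widths 12 and 24 are assumptions taken from the sample.
format_row() {
    printf '%12d%24s\n' "$1" "$2"
}

# A longer integer part eats into the gap instead of pushing the dot to the right.
format_row 20000 4.58990075106227e+01
format_row 22500 11.59371359946124e+01
```

With a fixed field width the second row gets three spaces of padding instead of four, so the decimal points stay in the same column.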

choroba
  • Many thanks, but I'm no Perl expert. How am I supposed to launch the script? Just putting the input files after it is not working – user2710445 Sep 25 '15 at 13:27
  • @user2710445: I hardcoded the file names as `file1` and `file2`. If you want them as parameters to the script, just replace them with `shift`. As in `open my $F1, '<', shift or die $!`. – choroba Sep 25 '15 at 13:59
# Start awk and set the Step between files to 2500
awk -v 'Step=2500' '

   # First line of the first file (NR counts every line across all files): initialize and print the header
   NR == 1 {LastFile = FILENAME; OFS = "\t"; print}

   # When the file changes (the file name differs from the one of the previous line),
   #  set a new start index (to turn relative steps into absolute ones) and remember the new file name
   FILENAME != LastFile { StartIndex = LastIndex + Step; LastFile = FILENAME}

   # After the first line, for every line starting with a digit (after optional blanks),
   #  replace the relative step with the absolute one and print the new content
   NR > 1 && /^[[:blank:]]*[0-9]/ { $1 += StartIndex; LastIndex = $1;print }
   ' YourFiles*
  • The result depends on the order of the files
  • The output separator is set by the OFS value (a tab here)
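Because assigning to $1 makes awk rebuild the line with OFS, the command above emits tab-separated rows instead of the original spacing. If the column layout has to survive, one possible variant is sketched below. It assumes the step column is right-aligned to a fixed width, as in the sample files. The function name `join_steps` and the variable names are hypothetical, and `[ \t]` stands in for `[[:blank:]]` only to stay compatible with older awk implementations.

```shell
# join_steps FILE...: concatenate step/distance tables, continuing the step
# counter across files while keeping the column layout of the first file.
# Hypothetical helper; the gap between files is assumed to be 2500 steps.
join_steps() {
    awk -v Step=2500 '
        # Header of the first file: print it unchanged.
        NR == 1 { LastFile = FILENAME; print; next }

        # A new file starts: shift its steps past the last step seen so far.
        FILENAME != LastFile { Offset = Last + Step; LastFile = FILENAME }

        # Data lines: optional leading blanks followed by digits.
        match($0, /^[ \t]*[0-9]+/) {
            if (!Width) Width = RLENGTH      # width of the step column
            Rest = substr($0, RLENGTH + 1)   # gap and distance, left untouched
            Last = $1 + Offset
            printf "%" Width "d%s\n", Last, Rest
        }
    ' "$@"
}
```

It could be called as `join_steps file1 file2 > joined` (file names are placeholders). Headers of the second and later files are dropped because they do not match the data-line pattern.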
NeronLeVelu
  • Works like a charm, huge thanks. I find awk syntax complicated to understand. If you have the time and inclination, could you add some explanations or point me to a reference where I can find them? – user2710445 Sep 25 '15 at 13:24
  • I added some comments directly in the source code. There are a lot of PDFs, HTML pages and books about (g)awk on the web that will help you understand it more deeply – NeronLeVelu Sep 25 '15 at 13:55