0

I have two files, file1 and file2. Their contents are:

file1 - input

Line1
Line2
Line3
Line4

file2 - input

<head>
<intro> This is an introduction </intro>
 <line> this is a line1 </line>
 </head>
<head>
 <intro> This is another intro </intro>
 <line> this is a line2 </intro>
 </head>
<head>
<intro> This is an introduction </intro>
 <line> this is a line3 </line>
 </head>
<head>
 <intro> This is another intro </intro>
 <line> this is a line4 </intro>
 </head>

I want to read file1 and replace the value of each line tag in file2 with Line1, Line2, Line3, Line4, in order (see output). Which is the easiest method (sed, awk, grep, perl, python ...) of doing this?

Output

    <head>
    <intro> This is an introduction </intro>
     <line> Line1 </line>
     </head>
    <head>
     <intro> This is another intro </intro>
     <line> Line2 </intro>
     </head>
    <head>
    <intro> This is an introduction </intro>
     <line> Line3 </line>
     </head>
    <head>
     <intro> This is another intro </intro>
     <line> Line4 </intro>
     </head>

If you think this is a duplicate, please link the duplicate. I have gone through solutions that look similar, but found none that fits.

Edit: In case someone wants to append/concatenate instead of replacing, the markline expression in @cdarke's Python 2 code can easily be modified as below:


markline = re.sub(r'</line>$',''+subt+'</line>',markline)
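To make the append variant concrete, here is a minimal Python 3 sketch. The helper name `append_to_line` is hypothetical (it is not part of @cdarke's code), and it assumes the line ends with a correctly spelled `</line>` tag; a line that ends in `</intro>` is returned unchanged.

```python
import re

def append_to_line(markline, subt):
    """Insert subt just before a closing </line> tag at the end of the line."""
    # '$' also matches just before a trailing newline, so lines read
    # straight from a file work without stripping them first.
    # A function is used as the replacement so that backslashes in subt
    # are inserted literally instead of being treated as escapes.
    return re.sub(r'</line>$', lambda m: subt + '</line>', markline)

# In-memory demo; prints the line with Line1 appended before </line>.
print(append_to_line(' <line> this is a line1 </line>\n', 'Line1'), end='')
```

Passing `' ' + subt + ' '` instead of `subt` would keep a space between the appended text and the closing tag.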

deepseefan

2 Answers

3

With GNU sed and bash's Process Substitution:

sed -e '/<line>[^<]*<\/[^>]*>/{R '<(sed 's|.*| <line> & </line>|' file1) -e 'd;}' file2

Output:

<head>
<intro> This is an introduction </intro>
 <line> Line1 </line>
 </head>
<head>
 <intro> This is another intro </intro>
 <line> Line2 </line>
 </head>
<head>
<intro> This is an introduction </intro>
 <line> Line3 </line>
 </head>
<head>
 <intro> This is another intro </intro>
 <line> Line4 </line>
 </head>
Cyrus
2

The easiest method is probably the one that you are familiar with. It is easy in Perl and Python (and Ruby, and Lua) if you know those languages. 'Easy' is subjective.

(Examples edited to add spaces)

Here is a Python 2 version:

import re

lines = open('file1').readlines()

with open('file2') as fh:
    for markline in fh:
        if '<line>' in markline:
            subt = lines.pop(0).rstrip()
            markline = re.sub(r'<line>.*</line>', '<line> ' + subt + ' </line>',
                          markline)

        print markline,
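
For readers on Python 3, the same approach might look roughly like the sketch below. The function name `replace_line_values` is hypothetical, and the demo uses in-memory lists in place of file1 and file2. As in the Python 2 version, a replacement is consumed for every line containing `<line>`, even when the closing tag is misspelled and the regular expression does not match.

```python
import re

def replace_line_values(marked_lines, replacements):
    """Yield marked_lines with each <line>...</line> body replaced in order."""
    pending = [r.rstrip() for r in replacements]
    for markline in marked_lines:
        if '<line>' in markline and pending:
            subt = pending.pop(0)
            # A function replacement keeps any backslashes in subt literal.
            markline = re.sub(r'<line>.*</line>',
                              lambda m: '<line> ' + subt + ' </line>',
                              markline)
        yield markline

# In-memory stand-ins for file2 and file1.
sample = ['<head>\n', ' <line> this is a line1 </line>\n', ' </head>\n']
print(''.join(replace_line_values(sample, ['Line1\n'])), end='')
```

With real files, the two arguments could be the open file objects themselves, for example `replace_line_values(open('file2'), open('file1'))`, printing each result with `end=''`.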

Here is a Perl version:

use strict;
use warnings;

open(my $fh1, 'file1') or die "Unable to open file1 for read: $!";

my @lines = <$fh1>;
chomp(@lines);
close($fh1);

open(my $fh2, 'file2') or die "Unable to open file2 for read: $!";

while (<$fh2>) {
    s/<line>.*<\/line>/'<line> ' . shift(@lines) . ' <\/line>'/e;
    print;
}

close($fh2);

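I have assumed that the mismatched closing tags in the input data (`<line> ... </intro>`) are typos and that `</line>` was intended. Lines that do not end the tag with `</line>` are left unchanged by both scripts.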

The code I have shown works, but it is inflexible. All of these languages have several XML parsers, and you really should learn one of these languages together with an XML parser.
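
As a rough idea of what the parser route could look like, here is a sketch using Python 3's standard `xml.etree.ElementTree`. It rests on two assumptions: the mismatched `</intro>` closing tags have already been corrected, and, because the input has several top-level `<head>` elements, the text is wrapped in a temporary `<root>` element so that it parses as one document. The function name `replace_with_parser` and the `<root>` wrapper are illustrative only.

```python
import xml.etree.ElementTree as ET

def replace_with_parser(xml_fragment, replacements):
    """Set the text of each <line> element, in document order, to the next replacement."""
    # Wrap the fragment so the parser sees a single root element.
    root = ET.fromstring('<root>' + xml_fragment + '</root>')
    # zip stops at the shorter sequence, so extra elements or values are ignored.
    for elem, value in zip(root.iter('line'), replacements):
        elem.text = ' ' + value + ' '
    # Serialize the children only, dropping the temporary wrapper.
    return ''.join(ET.tostring(child, encoding='unicode') for child in root)

fragment = ('<head>\n'
            '<intro> This is an introduction </intro>\n'
            ' <line> this is a line1 </line>\n'
            ' </head>\n')
print(replace_with_parser(fragment, ['Line1']), end='')
```

Unlike the regular-expression versions, this fails with a parse error on malformed input instead of silently skipping a line, which is usually what you want for anything beyond a one-off edit.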

cdarke
  • Both scripts produce the same results; however, there is no space on either side of the replaced `LineN` string, unlike the Output shown in the OP. In the Python script the relevant part of the line of code should be `' ' + subt + ' '`, and in the Perl script it should be `' ' . shift(@lines) . ' <\/line>'`. I tried editing it myself, but adding 4 spaces was not enough to allow the edit. – user3439894 May 16 '15 at 16:58
  • @user3439894: thanks. I added the spaces to the code. – cdarke May 17 '15 at 09:03