
file: input_file.xml

<Property>

<Name>ACTIVE</Name>

<ColumnHeader>Active&#xD;

Role</ColumnHeader>

</Property>

<Property>

<Name>DEFAULT</Name>

<ColumnHeader>Default&#xD;

Role</ColumnHeader>

</Property>

I need to extract all the text between <ColumnHeader> and </ColumnHeader>, replace it with new text, and write the result to a new file.

The problem is the &#xD; inside the text: I don't know how to write a command that handles it.

The output should look something like this (output_file.xml):

<Property>
<Name>ACTIVE</Name>
<ColumnHeader>New Text 1</ColumnHeader>
</Property>
<Property>
<Name>DEFAULT</Name>
<ColumnHeader>New Text 2</ColumnHeader>
</Property>

Guys, I need to do this for several files. Thanks a lot for your help.

kikin
  • Can you edit your question and add your exact expected output? – Jack Fleeting Aug 11 '23 at 12:14
  • What do you want to replace it with? Where do those new strings come from? – Ed Morton Aug 11 '23 at 13:44
  • Hi Guys, I already updated the question. Thanks for your help. – kikin Aug 11 '23 at 15:04
  • You've now shown expected output so that's good but you still haven't told us where those 2 new text strings `New Text 1` and `New Text 2` came from. Do you have a second file full of such strings, or will they be hard-coded in the script, or piped to it from some other command or what? – Ed Morton Aug 11 '23 at 15:09
  • Hi Ed, the new strings could come from another file, or I can put them in manually. The second file is an example of how the new file will look after applying the script. Thanks. – kikin Aug 11 '23 at 15:12
  • The solution to your problem depends on the answer to my question so please pick an implementation and show that in your question. In particular I don't know how a tool could implement "I can put them in manually", and if your new strings can contain newlines then you need to decide how to delimit them if they're stored in a file. – Ed Morton Aug 11 '23 at 15:14
  • I do understand that the second file is your expected output. I don't understand where `New Text 1` and `New Text 2` came from. – Ed Morton Aug 11 '23 at 15:17
  • Regarding "I need to do that for several files" - that's trivial with any Unix tool and anyone can show you how to do that once you've updated your question to state and show where to get the new text strings from and then eventually have a working solution for 1 file. – Ed Morton Aug 11 '23 at 15:29
  • Hi Ed, the new text doesn't come from anywhere, because I think I can put the new text on the script and then when I run the script I'd have the new file with the new texts on it. Another solution would be if we can extract the text inside the labels, and then I can put the new texts on another file (replace.txt) and after that run the script replace the original text for the text on replace.txt on the correct order. Thanks. – kikin Aug 11 '23 at 17:16

2 Answers


Assuming your input is stored in input_file.xml, you can extract the values between the <ColumnHeader> and </ColumnHeader> tags with a shell command like this. It needs GNU grep: -z reads the whole file as one record, so values that span several lines still match, and tr turns the NUL separators in the output back into newlines:

grep -zoP '(?s)<ColumnHeader>\K.*?(?=</ColumnHeader>)' input_file.xml | tr '\0' '\n' > extracted_text.txt

To store the extracted values in a variable in the same script for later use:

extracted_text=$(grep -zoP '(?s)<ColumnHeader>\K.*?(?=</ColumnHeader>)' input_file.xml | tr '\0' '\n')
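Because the values in the sample input wrap across lines and carry a &#xD; character reference, a purely line-based extraction can miss them. Here is a minimal sketch with plain awk, assuming at most one <ColumnHeader> element starts on any line. The function name extract_headers and the file sample_input.xml are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample file mirroring the structure shown in the question.
workdir=$(mktemp -d)
printf '%s\n' \
  '<Property>' \
  '<Name>ACTIVE</Name>' \
  '<ColumnHeader>Active&#xD;' \
  '' \
  'Role</ColumnHeader>' \
  '</Property>' \
  '<Property>' \
  '<Name>DEFAULT</Name>' \
  '<ColumnHeader>Default&#xD;' \
  '' \
  'Role</ColumnHeader>' \
  '</Property>' > "$workdir/sample_input.xml"

# Print each ColumnHeader value on a single line.
# Wrapped lines are joined with a space and the &#xD; reference is dropped.
extract_headers() {
  awk '
    # Opening tag: start collecting and strip everything up to the tag.
    /<ColumnHeader>/ { inside = 1; buf = ""; sub(/.*<ColumnHeader>/, "") }
    inside {
      line = $0
      # sub() returns 1 when the closing tag was found on this line.
      closed = sub(/<\/ColumnHeader>.*/, "", line)
      if (line != "") buf = (buf == "" ? line : buf " " line)
      if (closed) { gsub(/&#xD;/, "", buf); print buf; inside = 0 }
    }
  ' "$1"
}

extract_headers "$workdir/sample_input.xml"
```

The output has one value per line, so it can be redirected to a text file and edited by hand before being fed back into a replacement step.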

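If you need to replace the original text with some new text and write the result to a new file (which I think is your use case):

sed -E 's/<ColumnHeader>.*<\/ColumnHeader>/<ColumnHeader>REPLACEMENT_TEXT<\/ColumnHeader>/' input_file.xml > replaced_file.xml

You can of course use a variable instead of REPLACEMENT_TEXT. The sed script must then be in double quotes so that the shell expands the variable:

sed -E "s/<ColumnHeader>.*<\/ColumnHeader>/<ColumnHeader>$replacement_text<\/ColumnHeader>/" input_file.xml > replaced_file.xml

If you need to replace the text in the original file itself, do this:

sed -E -i 's/<ColumnHeader>.*<\/ColumnHeader>/<ColumnHeader>REPLACEMENT_TEXT<\/ColumnHeader>/' input_file.xml

Note that sed processes one line at a time, so these commands only match when both tags are on the same line.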
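One caveat when the replacement comes from a variable: a backslash, an & or the / delimiter in the value is special in the replacement part of sed's s command. Here is a minimal sketch of escaping the value first. The helper name escape_sed_replacement and the sample value are hypothetical, and it assumes the value contains no newlines and that both tags are on one line:

```shell
#!/bin/sh
# Escape backslash, & and / so the text is taken literally
# in the replacement part of s/.../.../ .
# The helper itself uses | as its delimiter, so / needs no escaping there.
escape_sed_replacement() {
  printf '%s\n' "$1" | sed 's|[&/\\]|\\&|g'
}

new_text='R&amp;D / Sales'   # hypothetical replacement value
escaped=$(escape_sed_replacement "$new_text")

# [^<]* stops at the next tag, so only the text between one pair is replaced.
printf '%s\n' '<ColumnHeader>Old</ColumnHeader>' |
  sed "s/<ColumnHeader>[^<]*<\/ColumnHeader>/<ColumnHeader>$escaped<\/ColumnHeader>/"
```

Without the escaping step, the & in the value would be expanded to the matched text and the / would end the sed expression early.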

Hope this helps.

Edit: After going through the updated question, here's a new approach that might suit this scenario, using awk, a powerful text-processing tool.

@kikin - Please try the command below on a sample input.xml file that contains your input XML. It needs GNU awk, because it uses a regular expression as the record separator (RS) and the RT variable to print each matched tag back out:

awk -v RS='</?ColumnHeader>' -v ORS= 'NR % 2 == 0 { $0 = (++c == 1 ? "new text 1" : "new text 2") } { print $0 RT }' input.xml > output.xml

Note: Instead of the hard-coded strings 'new text 1' and 'new text 2', you can pass shell variables to awk with -v:

str1="new text 1"
str2="new text 2"
awk -v s1="$str1" -v s2="$str2" -v RS='</?ColumnHeader>' -v ORS= 'NR % 2 == 0 { $0 = (++c == 1 ? s1 : s2) } { print $0 RT }' input.xml > output.xml

If this works for your core requirement of replacing the texts, you can then use a while loop over a text file (e.g. file_names.txt) that lists all the files you want to change:

while IFS= read -r file_name; do
    awk -v RS='</?ColumnHeader>' -v ORS= 'NR % 2 == 0 { $0 = (++c == 1 ? "new text 1" : "new text 2") } { print $0 RT }' "$file_name" > "${file_name%.xml}_output.xml"
done < file_names.txt

file_names.txt file sample content: Note: This file list can be populated from a

input_file1.xml
input_file2.xml
input_file3.xml
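
The idea from the comments of keeping the new texts in a separate file (replace.txt) and applying them in order can be sketched with plain awk as follows. This is only a sketch under some assumptions: replace.txt holds one replacement per line, at most one <ColumnHeader> starts on any line, the replacement text is already valid XML text, and every file restarts from the first replacement. The function name replace_headers and the directory layout are made up for illustration:

```shell
#!/bin/sh
workdir=$(mktemp -d)
mkdir "$workdir/out"

# Hypothetical input file and replacement list.
cat > "$workdir/input_file1.xml" <<'EOF'
<Property>
<Name>ACTIVE</Name>
<ColumnHeader>Active&#xD;
Role</ColumnHeader>
</Property>
<Property>
<Name>DEFAULT</Name>
<ColumnHeader>Default&#xD;
Role</ColumnHeader>
</Property>
EOF
printf '%s\n' 'New Text 1' 'New Text 2' > "$workdir/replace.txt"

# $1 = file with one replacement per line, $2 = XML file to rewrite
replace_headers() {
  awk '
    BEGIN { otag = "<ColumnHeader>"; ctag = "</ColumnHeader>" }
    # First file: load the replacement strings in order.
    NR == FNR { repl[++n] = $0; next }
    # Inside a wrapped value: drop lines until the closing tag shows up.
    skipping {
      pos = index($0, ctag)
      if (pos) { skipping = 0; print substr($0, pos + length(ctag)) }
      next
    }
    {
      pos = index($0, otag)
      if (pos && used < n) {
        # Emit the element with the next replacement on a single line.
        printf "%s%s%s%s", substr($0, 1, pos - 1), otag, repl[++used], ctag
        rest = substr($0, pos + length(otag))
        pos = index(rest, ctag)
        if (pos) print substr(rest, pos + length(ctag))
        else skipping = 1
        next
      }
      print
    }
  ' "$1" "$2"
}

# Rewrite every XML file into a separate output directory.
for f in "$workdir"/*.xml; do
  replace_headers "$workdir/replace.txt" "$f" > "$workdir/out/$(basename "$f")"
done

cat "$workdir/out/input_file1.xml"
```

Writing the results to a separate directory keeps the originals untouched and avoids picking up already processed files if the loop is run again.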
akshbhaw
  • Those sed commands would replace from the first in the file to the last <\/ColumnHeader> in the file, not between pairs of them as I expect the OP wants. You'd also have to make sure REPLACEMENT_TEXT didn't contain at least `\`, `&`, or `/`. – Ed Morton Aug 11 '23 at 14:29
  • Hi akshbhaw, thanks, so, I need to do that for several files, is there a way to do that only by running a script?? Thanks. – kikin Aug 11 '23 at 15:08
  • @kikin please re-read [my comment](https://stackoverflow.com/questions/76883411/how-to-extract-and-replace-text-by-using-a-linux-script/76884172?noredirect=1#comment135540095_76883678) as this solution won't work for 1 file, never mind many files. – Ed Morton Aug 11 '23 at 15:12
  • Hi Ed, it is not working, because I need to replace those texts with different texts. In my example I have two <ColumnHeader> elements with different text inside them, so I need to replace the first string with, say, ROLE and the second one with ROLES, and so on. – kikin Aug 11 '23 at 15:29
  • @kikin yes, I know. Please read my comments under your question and then edit your question to show us where to get the new text strings from. – Ed Morton Aug 11 '23 at 15:37
  • @kikin - Please refer to edit in my answer if you are still looking for a solution. – akshbhaw Aug 22 '23 at 17:41

Since you seem to be dealing with an XML file, you should be using an XML parser. Something like:

xmlstarlet edit \
  --update "//Property[Name='ACTIVE']/ColumnHeader" \
  --value 'New Text 1' \
  --update "//Property[Name='DEFAULT']/ColumnHeader" \
  --value 'New Text 2' \
  file.xml
Jack Fleeting