
I am very new to regular expressions. I am using UltraEdit, and would like to use regular expressions to make the changes described below.

I have some text in the following pattern:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="000756.rock" title="333"/>
</Music>

I need to add the prefix 'Z' to every href value that has the extension .rock.

href="000760.rock" --> href="Z000760.rock"

The output should look like this:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="Z000760.rock" title="222"/>
    <Music format="ditamap" href="Z000756.rock" title="333"/>
</Music>

What would be the regular expression to do this in UltraEdit?

mjuarez
user1749707

2 Answers


I rewrote my answer to:

  1. Cover the new use case the OP added, where some values already have the X prefix and must not be replaced.
  2. Remove the square brackets I had initially put around the double quote character, which were not needed.

The first case I answered is where none of the HREF values already have the X prefix.

Find:

href="([^"]*)\.rock"

And replace:

href="X\1.rock"

Start:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="000756.rock" title="333"/>
</Music>

Finish:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="X000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
</Music>

(Screenshot showing the first result.)

Breakdown of the regex:

  1. Find: href="([^"]*)\.rock"
    1. href=" - this finds href="
    2. ([^"]*) - this creates the first capturing group (backreference \1) - it tells the engine to look for and remember everything matched between the parentheses, [^"]*, so that we can reference it in the replace part.
      1. [^"] - this part of the pattern matches any single character that is not a double quote.
      2. The asterisk at the end of [^"]* is a repetition operator that matches zero or more of the thing just before it (so: zero or more characters that are not a double quote).
    3. \.rock" - this defines the rest of the pattern, which must be the literal text .rock"
    4. Note that I have escaped the period character: \. That is because an unescaped period has a special meaning in a regex (any character), and we are telling the engine that we mean a literal dot.
  2. Replace: href="X\1.rock"
    1. href="X - says to output the literal text href="X
    2. \1 - says to replace \1 with the first backreference we created (zero or more characters that are not a double quote).
    3. .rock" - says to output literally .rock".
      1. Note that I didn't need to escape the period here, because it doesn't have the same meaning in replace - it just means the literal dot.
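
As a quick check outside the editor, the same find/replace pair can be tried in Python. This is only a sketch: Python's re module stands in for UltraEdit's engine here (an assumption, since the two may differ in details), and the variable names are made up for illustration.

```python
import re

# Sample input copied from the question.
text = '''<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="000756.rock" title="333"/>
</Music>'''

# Same find and replace strings as in the answer.
# Python's re module also writes the first captured group as \1.
pattern = r'href="([^"]*)\.rock"'
replacement = r'href="X\1.rock"'

result = re.sub(pattern, replacement, text)
print(result)
```

Only the two .rock values should gain the prefix; the .genre value on the first line does not end in .rock" and is left alone.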

The second case is in response to OP's comment that some of the HREF values already have the X prefix. In this case, change the regex as below.

Find:

href="([^X][^"]*)\.rock"

And replace:

href="X\1.rock"

Start:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
    <Music format="ditamap" href="000757.rock" title="444"/>
    <Music format="ditamap" href="X000758.rock" title="555"/>
    <Music format="ditamap" href="000759.rock" title="666"/>
</Music>

Finish:

<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="X000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
    <Music format="ditamap" href="X000757.rock" title="444"/>
    <Music format="ditamap" href="X000758.rock" title="555"/>
    <Music format="ditamap" href="X000759.rock" title="666"/>
</Music>

(Screenshot showing the second result.)

Breakdown of the regex:

  1. Find: href="([^X][^"]*)\.rock"
    1. href=" - this finds href="
    2. ([^X][^"]*) - this creates the first capturing group (backreference \1) - it tells the engine to look for and remember everything matched between the parentheses, [^X][^"]*, so that we can reference it in the replace part.
      1. [^X] - this part of the pattern matches exactly one character that is not an X. It is what stops values that already start with X from matching.
      2. [^"] - this part of the pattern matches any single character that is not a double quote.
      3. The asterisk at the end of [^"]* is a repetition operator that matches zero or more of the thing just before it (so: zero or more characters that are not a double quote).
    3. \.rock" - this defines the rest of the pattern, which must be the literal text .rock"
    4. Note that I have escaped the period character: \. That is because an unescaped period has a special meaning in a regex (any character), and we are telling the engine that we mean a literal dot.
  2. Replace: href="X\1.rock"
    1. href="X - says to output the literal text href="X
    2. \1 - says to replace \1 with the first backreference we created (one character that is not an X, followed by zero or more characters that are not a double quote).
    3. .rock" - says to output literally .rock".
      1. Note that I didn't need to escape the period here, because it doesn't have the same meaning in replace - it just means the literal dot.
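
The second pattern can be checked the same way. Again this is a sketch that uses Python's re module as a stand-in for UltraEdit (an assumption), on a shortened version of the sample above. Running the replacement twice is a simple way to see that values which already have the X are not touched.

```python
import re

# Shortened sample: one value already carries the X prefix.
text = '''<Music href="6000111.genre" title="AAA">
    <Music format="ditamap" href="000760.rock" title="222"/>
    <Music format="ditamap" href="X000756.rock" title="333"/>
    <Music format="ditamap" href="000757.rock" title="444"/>
</Music>'''

# [^X] consumes the first character of the value, so a value that
# starts with X cannot match and is left unchanged.
pattern = r'href="([^X][^"]*)\.rock"'
replacement = r'href="X\1.rock"'

once = re.sub(pattern, replacement, text)
twice = re.sub(pattern, replacement, once)

print(once)
print(once == twice)  # a second pass finds nothing left to change
```

Note that [^X] is case-sensitive here, so a value starting with a lowercase x would still be prefixed.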
Robert Mark Bram
  • Thanks Robert! Works great. One question: how do I modify it so that if the X is already there, it is not found (or replaced)? I.e., it should only find '000760.rock' but not 'X000760.rock'. Otherwise, the solution you mentioned adds another X as a prefix. – user1749707 Feb 13 '14 at 07:55
  • @user1749707, if the X may (or may not) already be there, use this as your **find**: `href=["]X?([^"]*)\.rock["]` and keep the **replace** the same: `href="X\1.rock"`. – Robert Mark Bram Feb 13 '14 at 12:02
  • Hi, I used ["]X?([^"]*)\.rock["] but it finds values with X as well. It should NOT find and replace values that already have the X prefix. Is anything wrong? – user1749707 Feb 13 '14 at 19:44
  • @user1749707, `href=["]X?([^"]*)\.rock["]` would still be ok in most cases because the output would be the same - i.e. you would still end up with all entries having the X before them. Just in case there is some other reason why you would absolutely not want those cases changed, I have added a second solution to my answer for that. I also added explanations of the regular expressions and corrected an error where I was putting double quotes in square brackets when that wasn't needed. – Robert Mark Bram Feb 13 '14 at 23:58

I'm not sure about UltraEdit, but I assume it's close to Notepad++:

Find what: (href=")(.+?\.rock")
Replace with: $1X$2

Use X or Z as the prefix; it's not clear from your question which one you want.
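
For comparison, here is a rough Python equivalent of this find/replace. It is a sketch under assumptions: Python writes the groups as \g<1> and \g<2> rather than $1 and $2, and the second input line is a hypothetical example that does not appear in the question.

```python
import re

# Same pattern as above; the replacement uses Python's group syntax.
pattern = r'(href=")(.+?\.rock")'
replacement = r'\g<1>X\g<2>'

# A line taken from the question.
line = '    <Music format="ditamap" href="000760.rock" title="222"/>'
fixed = re.sub(pattern, replacement, line)
print(fixed)

# Hypothetical line: .+? may run past the closing quote of the href value
# until it reaches the first .rock" later on the same line.
tricky = '<Music href="a.genre" src="b.rock"/>'
tricky_result = re.sub(pattern, replacement, tricky)
print(tricky_result)
```

On the question's sample this gives the same result as the [^"]* version. The hypothetical line shows why a negated character class can be the safer choice: .+? is not limited to a single attribute value, so a non-.rock href may get the prefix if another attribute on the same line ends in .rock".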

Toto