I'm trying to use regular expressions to search and replace sub-patterns, but I'm stuck. Note: I'm using TextWrangler on OS X for this.

SCENARIO:

Here is an example of a complete match:

{constant key0="variable/three" anotherkey=$variable.inside.same.match key2="" thirdkey='exists'}

Each match will always:

  • start with the following: {constant key0=
  • terminate with a single curly brace: }
  • contain one or more key=value pairs
    • the key of the first pair is constant (in this case, the key is key0)
    • the value of the first pair is variable (in this case, the value is "variable/three")
    • additional pairs, if any, are separated by whitespace

Here's an example of what a minimal (but complete) match would look like (with only one key=value pair):

{constant key0="first/variable/example"}

Here's another example of a valid match, but with trailing whitespace after the last (and only) key=value pair:

{constant key0="same/as/above/but/with/whitespace/after/quote" }
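To make the structure above concrete, here is a minimal Python sketch that checks whether a string has that shape. The names VALUE, COMPLETE_MATCH, and is_complete_match are made up for illustration, and the sketch assumes a value is either a double-quoted string with no embedded double quotes or a run of characters containing no whitespace, double quote, or closing brace.

```python
import re

# Assumption: a value is either "..." (no embedded double quotes) or a bare
# token without whitespace, double quotes, or a closing brace.
VALUE = r'(?:"[^"]*"|[^\s"}]+)'

# {constant key0=VALUE, then zero or more whitespace-separated key=VALUE
# pairs, optional trailing whitespace, and the closing brace.
COMPLETE_MATCH = re.compile(
    r'\{constant\s+key0=' + VALUE + r'(?:\s+[^\s=]+=' + VALUE + r')*\s*\}'
)

def is_complete_match(text):
    """Return True if the whole string has the structure described above."""
    return COMPLETE_MATCH.fullmatch(text) is not None

examples = [
    '{constant key0="variable/three" anotherkey=$variable.inside.same.match key2="" thirdkey=\'exists\'}',
    '{constant key0="first/variable/example"}',
    '{constant key0="same/as/above/but/with/whitespace/after/quote" }',
]
for example in examples:
    print(is_complete_match(example), example)
```

All three examples above pass this check; a string whose first key is not key0 does not.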

GOAL:

What I need to be able to do is extract each key and each value from each match and then rearrange them. For example, I might need the following:

{constant key0="variable/4" variable_key_1="yes" variable_key_2=0}

... to look like this after all is said and done:

$variable_key_1 = "yes"; $variable_key_2 = 0; {newword "variable/4"}

... where

  • a $ has been added to the extracted keys
  • spaces have been added around the = in each key=value pair
  • a ; has been appended to each extracted value
  • the word constant has been changed to newword, and
  • key0= has been removed completely.

Here are some examples of what I've tried (note that the first one actually works, but only when there is exactly one key/value pair):

Search:
(\{constant\s+key0=\s*)([^\}\s]+)(\s*\})
Replace:
{newword \2}

Search:
(\{constant\s+key0=)([^\s]+)(([\s]+[^\s]+)([\s]*=\s*)([^\}]+)+)(\s*\})
Replace:
I wasn't able to come up with a good way to replace the output of this one.

Any help would be most appreciated.

jerdiggity

1 Answer

Because of the nature of this match, it actually takes three different regexes: one to find each complete match, and two others to process the matches. I don't know how you intend to escape quotes inside values, so I'll give one set for each common escaping convention.
Without further ado, here's the set for backslash escaping:

Find:
\{constant\s+key0=([^\s"]\S*|"(\\.|[^\\"])*")(\s+[^\s=]+=([^\s"]\S*|"(\\.|[^\\"])*"))*\s*\}
Search 1:
(?<=\s)([^\s=]+)=([^\s"]\S*|"(\\.|[^\\"])*")(?=.*\})
Replace 1:
$1 = $2;
Search 2:
^\{constant\s+key0 = ([^\s"]\S*|"(\\.|[^\\"])*");\s*(?=\S)(.*)\}
Replace 2:
$3 {newword $1}
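As a rough illustration outside TextWrangler, the three steps can be chained with Python's re module. This is only a sketch: the names FIND, SEARCH_1, SEARCH_2, rewrite_block, and rewrite are made up for the example, and the replacement syntax is Python's (\1 rather than $1). Each complete match is processed on its own, so the ^ anchor in Search 2 refers to the start of that match. Under Python's group numbering, the remaining pairs captured by (.*) in Search 2 are group 3, because the inner (\\.|[^\\"]) is itself a capturing group.

```python
import re

# The three backslash-escaping patterns from above, copied as raw strings.
FIND = re.compile(
    r'\{constant\s+key0=([^\s"]\S*|"(\\.|[^\\"])*")'
    r'(\s+[^\s=]+=([^\s"]\S*|"(\\.|[^\\"])*"))*\s*\}'
)
SEARCH_1 = re.compile(r'(?<=\s)([^\s=]+)=([^\s"]\S*|"(\\.|[^\\"])*")(?=.*\})')
SEARCH_2 = re.compile(
    r'^\{constant\s+key0 = ([^\s"]\S*|"(\\.|[^\\"])*");\s*(?=\S)(.*)\}'
)

def rewrite_block(match):
    """Apply Search 1 and then Search 2 to one complete match."""
    block = match.group(0)
    # Step 1: turn every key=value into "key = value;" (key0 included).
    block = SEARCH_1.sub(r'\1 = \2;', block)
    # Step 2: move the key0 value to the end, wrapped in {newword ...}.
    # Group 3 is the text after "key0 = value;"; group 1 is the key0 value.
    return SEARCH_2.sub(r'\3 {newword \1}', block)

def rewrite(text):
    """Find every complete match in text and rewrite it in place."""
    return FIND.sub(rewrite_block, text)

print(rewrite('{constant key0="variable/4" variable_key_1="yes" variable_key_2=0}'))
```

With the question's sample input this prints variable_key_1 = "yes"; variable_key_2 = 0; {newword "variable/4"}. Note that these replacements do not add the $ prefix to the keys.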

Now the set for URL/XML/HTML-style escaping, which is much easier to parse:

Find:
\{constant\s+key0=([^\s"]\S*|"[^"]*")(\s+[^\s=]+=([^\s"]\S*|"[^"]*"))*\s*\}
Search 1:
(?<=\s)([^\s=]+)=([^\s"]\S*|"[^"]*")(?=.*\})
Replace 1:
$1 = $2;
Search 2:
^\{constant\s+key0 = ([^\s"]\S*|"[^"]*");\s*(?=\S)(.*)\}$
Replace 2:
$2 {newword $1}
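If a scripting language is an option, the same rearrangement can also be done in one pass with a replacement function instead of chained search/replace steps. The sketch below is a different technique from the regex sets above, not a translation of them: it assumes the simpler quoting rule (no double quotes inside quoted values), also excludes } from unquoted values, and adds the $ prefix the question asks for. All names in it (VALUE, BLOCK, PAIR, rewrite_block, rewrite) are hypothetical.

```python
import re

# Assumption: a value is "..." without embedded double quotes, or a bare
# token that contains no whitespace and no closing brace.
VALUE = r'"[^"]*"|[^\s"}][^\s}]*'

# Group 1: the key0 value. Group 2: all remaining key=value pairs as one string.
BLOCK = re.compile(
    r'\{constant\s+key0=(' + VALUE + r')'
    r'((?:\s+[^\s=]+=(?:' + VALUE + r'))*)\s*\}'
)
# One key=value pair inside group 2.
PAIR = re.compile(r'([^\s=]+)=(' + VALUE + r')')

def rewrite_block(match):
    """Build '$key = value; ... {newword key0_value}' for one match."""
    first_value, rest = match.group(1), match.group(2)
    parts = [f'${key} = {value};' for key, value in PAIR.findall(rest)]
    parts.append(f'{{newword {first_value}}}')
    return ' '.join(parts)

def rewrite(text):
    """Rewrite every complete match found in text."""
    return BLOCK.sub(rewrite_block, text)

print(rewrite('{constant key0="variable/4" variable_key_1="yes" variable_key_2=0}'))
```

For the question's sample input this prints $variable_key_1 = "yes"; $variable_key_2 = 0; {newword "variable/4"}, and a match with only the key0 pair becomes just {newword ...}.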

Hope this helps.

RamenChef