
I have a string that may or may not have multiple matches for a designated pattern.

Each needs to be replaced.

I have this code:

var pattern = @"\$\$\@[a-zA-Z0-9_]*\b";
var stringVariableMatches = Regex.Matches(strValue, pattern);
var sb = new StringBuilder(strValue);

foreach (Match stringVarMatch in stringVariableMatches)
{
    var stringReplacment = variablesDictionary[stringVarMatch.Value];
    sb.Remove(stringVarMatch.Index, stringVarMatch.Length)
            .Insert(stringVarMatch.Index, stringReplacment);
}

return sb.ToString();

The problem is that when there are several matches, the first one is replaced and the starting indexes of the others shift. In some cases, when the replacement makes the string shorter, I get an index-out-of-range error.

I know I could just call Regex.Replace for each match, but that sounds performance-heavy, and I wanted to see if someone could point out a different way to substitute multiple matches, each with a different string.
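For illustration, the index shift described above can be avoided while keeping the `StringBuilder` approach by applying the matches from last to first, so that an edit never moves text belonging to a match that is still pending. This is only a minimal sketch: the `ReplaceFromEnd` helper and its class are hypothetical names, the pattern uses `+` rather than `*` so that at least one name character is required, and unknown keys are assumed to be left untouched.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class ReverseReplaceSketch
{
    // Hypothetical helper, not part of the original code.
    public static string ReplaceFromEnd(string strValue,
                                        IDictionary<string, string> variablesDictionary)
    {
        var pattern = @"\$\$@[a-zA-Z0-9_]+\b";
        var matches = Regex.Matches(strValue, pattern);
        var sb = new StringBuilder(strValue);

        // Walk the matches backwards: editing at a later index never moves
        // the text in front of it, so the earlier Index values stay valid.
        for (int i = matches.Count - 1; i >= 0; i--)
        {
            Match m = matches[i];
            string replacement;
            // Assumption: a variable missing from the dictionary is left as-is.
            if (variablesDictionary.TryGetValue(m.Value, out replacement))
            {
                sb.Remove(m.Index, m.Length).Insert(m.Index, replacement);
            }
        }

        return sb.ToString();
    }

    public static void Main()
    {
        var vars = new Dictionary<string, string>
        {
            { "$$@long_name", "xy" },
            { "$$@a", "1" }
        };

        // The first replacement is shorter than its match, which is the case
        // that breaks a front-to-back loop.
        Console.WriteLine(ReplaceFromEnd("$$@long_name + $$@a", vars)); // prints: xy + 1
    }
}
```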

Mortalus
  • *Regex.Replace for each match but this **sound** performance heavy* What's the size of your data? – Thomas Ayoub May 11 '16 at 09:10
  • I have around 100,000 strings like this to iterate through. Each may have 1-3 matches to replace, each with a different string. – Mortalus May 11 '16 at 09:11
  • @LucasTrzesniewski: I think Mortalus means that using `Regex.Replace` inside the `foreach` is a performance killer. The point is that `Regex.Replace` can be used *instead* of `Regex.Matches`. – Wiktor Stribiżew May 11 '16 at 09:15
  • @WiktorStribiżew yes, I think I have misunderstood what he intended to do (and his comment on your answer confirms just that) – Lucas Trzesniewski May 11 '16 at 09:16

1 Answer


Use a Match evaluator inside the Regex.Replace:

var pattern = @"\$\$\@[a-zA-Z0-9_]*\b";
var stringVariableMatches = Regex.Replace(strValue, pattern, 
        m => variablesDictionary[m.Value]);

The Regex.Replace method performs a global replacement: it searches for all non-overlapping substrings that match the pattern and replaces each match with the value returned by the evaluator, here variablesDictionary[m.Value].

Note that it might be a good idea to check whether the key exists in the dictionary, since the indexer throws a KeyNotFoundException for a missing key.

See a C# demo:

using System;
using System.Text.RegularExpressions;
using System.Collections.Generic;
public class Test
{
    public static void Main()
    {
        var variablesDictionary = new Dictionary<string, string>();
        variablesDictionary.Add("$$@Key", "Value");
        var pattern = @"\$\$@[a-zA-Z0-9_]+\b";
        var stringVariableMatches = Regex.Replace("$$@Unknown and $$@Key", pattern, 
                m => variablesDictionary.ContainsKey(m.Value) ? variablesDictionary[m.Value] : m.Value);
        Console.WriteLine(stringVariableMatches);
    }
}

Output: $$@Unknown and Value.
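Since the comments mention roughly 100,000 input strings, the same idea can be sketched with a single `Regex` instance that is built once and reused, and with `TryGetValue` so that each match costs one dictionary lookup instead of two. The `VariableExpander` class and its `Expand` method are hypothetical names, and whether `RegexOptions.Compiled` actually pays off for a given workload is an assumption worth measuring.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VariableExpander
{
    // Built once and reused for every input string.
    // RegexOptions.Compiled is an optional tuning choice, not a requirement.
    private static readonly Regex VariablePattern =
        new Regex(@"\$\$@[a-zA-Z0-9_]+\b", RegexOptions.Compiled);

    // Hypothetical helper: replaces every known variable in one pass
    // and leaves unknown variables unchanged.
    public static string Expand(string input, IDictionary<string, string> variables)
    {
        return VariablePattern.Replace(input, m =>
        {
            string value;
            return variables.TryGetValue(m.Value, out value) ? value : m.Value;
        });
    }

    public static void Main()
    {
        var variables = new Dictionary<string, string>
        {
            { "$$@Key", "Value" }
        };

        var inputs = new[] { "$$@Unknown and $$@Key", "no variables here" };
        foreach (var line in inputs)
        {
            Console.WriteLine(Expand(line, variables));
        }
        // prints:
        // $$@Unknown and Value
        // no variables here
    }
}
```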

Wiktor Stribiżew