2

I'm working on a C# class to parse BBCodes out of text pulled from a database for PHPBB posts. On the PHPBB there is a custom BBCode added which looks like this:

[deck={TEXT1}]{TEXT2}[/deck]

Which, sitting in the database, looks like this:

[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]

I'm attempting to replace that using a Regex in C#. My C# looks like this:

string text = "[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]";
string replace = "my replacement string";
string pattern = @"\[deck=((.|\n)*?)(?:\s*)\]((.|\n)*?)\[/deck(?:\s*)\]";
RegexOptions options = RegexOptions.Compiled | RegexOptions.IgnoreCase | RegexOptions.Singleline;
Regex regex = new Regex(pattern, options);
string result = regex.Replace(text, replace);

I'm pretty sure it all just comes down to my Regex pattern being wrong. Which comes as no surprise to me, since Regex isn't exactly my strong suit.

Thanks in advance. Any help is greatly appreciated.

EDIT: Since some people found it unclear, I'll add larger examples.

Source text:

[deck=Foo:13giljne]
    Item #1
    Item #2
    Item #3
    Item #4
[/deck:13giljne]

Desired result:

<span>Foo</span>
<div>
    Item #1
    Item #2
    Item #3
    Item #4
</div>

Hopefully this gives a clearer picture of what I'm trying to do.

BRW
  • If you can elaborate a little more... "I'm attempting to replace that using a Regex in C#" attempting to replace what with what? – m0skit0 Nov 15 '11 at 17:21
  • @m0skit0 I'm trying to have the regex match the string in the variable called "text", and replace it with the string in the variable called "replace". – BRW Nov 15 '11 at 17:22
  • But isn't that the whole string? Why not just use *replace* instead of *text*? I don't understand your problem... – m0skit0 Nov 15 '11 at 17:24
  • @m0skit0 I just put that small part into _text_ as an example of the string I'm trying to match. The actual value of _text_ contains other text before and after the part I'm trying to match. That value is being pulled straight from the database, and I'm trying to parse out the BBCodes and replace them before they are displayed. – BRW Nov 15 '11 at 17:28
  • I assume that you're trying to get rid of the `[deck=...]` and `[/deck...]` tags, and that you want **only** the text in between the BBCode tags. Is that correct? – jwheron Nov 15 '11 at 17:36
  • Can you show an example of the text before the replacement and then the text after the replacement so I can see what the end result should look like that you are trying to achieve? – M3NTA7 Nov 15 '11 at 17:36
  • @jwiscarson You're close. What I'm trying to do is replace the `[deck=...]` and `[/deck...]` with HTML. I'll edit the question with some examples, I guess. – BRW Nov 15 '11 at 17:39

4 Answers

2

I think your regex shows that you need to match "FirstText" and "Large Multiline Text" separately.

\[deck=([^\:]+?):(?:[^\]]+)\]([^\[]+?)\[/deck:(?:[^\]]+)\]

This should help and it's very close to yours.
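
As a minimal sketch (not part of the original answer), the pattern above can be checked against the sample string from the question by reading its two capture groups. The variable names `name` and `body` are assumptions made for this example, and the code is written as top-level statements, which needs C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

string text = "[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]";

// Verbatim string so the backslashes reach the regex engine unchanged.
string pattern = @"\[deck=([^\:]+?):(?:[^\]]+)\]([^\[]+?)\[/deck:(?:[^\]]+)\]";

Match m = Regex.Match(text, pattern, RegexOptions.IgnoreCase);

// Group 1 is the text between "[deck=" and ":"; group 2 is the body.
string name = m.Groups[1].Value;
string body = m.Groups[2].Value;
Console.WriteLine($"matched={m.Success}, name={name}, body={body}");
```

The `(?:...)` groups that skip over the `:13giljne` suffix are non-capturing, so only the two values of interest are numbered.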

dereli
  • 1
    I think you have a bug. Replace `([^\]]+?)` with `([^\[]+?)` in the part that's trying to match `Large Multiline Text`. You're looking for everything up to the next open bracket, not the next close bracket. – Jim Mischel Nov 15 '11 at 18:17
  • Thanks Jim. Copy-paste chorus. – dereli Nov 16 '11 at 09:42
1

If you're new to regular expressions, you might try matching a little at a time so that you're sure your string will match. For example, given the string:

string text = "[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]";

Write an expression that matches the first part:

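string firstPart = @"\[deck=[^\]]+\]";

The [^\]]+ says, "match one or more characters that aren't a ] character". The pattern is written as a verbatim string (@"...") so that C# passes the backslashes through to the regex unchanged.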

Verify that it matches:

Match m = Regex.Match(text, firstPart);

Then tack on the second part:

string firstAndSecond = firstPart + @"[^\[]*";

And test that.

Once that's working, you can add the last part:

string search = firstAndSecond + @"\[/deck[^\]]+\]";

The final regular expression would be (\[deck=[^\]]+\])([^\[]+)(\[/deck[^\]]+\]).

I grouped the individual parts to make it easier to see them. You can remove the groups if you want or make them non-capturing.
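
The step-by-step approach described above could be sketched as follows. This is an illustration under a few assumptions: the patterns are written as verbatim strings, the closing-tag part uses `[^\]]+` so it can consume the whole `:13giljne` suffix, and the loop over the three stages is a convenience added for this example. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

string text = "[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]";

// Build the pattern one part at a time.
string firstPart = @"\[deck=[^\]]+\]";                // opening tag
string firstAndSecond = firstPart + @"[^\[]*";        // plus the body, up to the next "["
string search = firstAndSecond + @"\[/deck[^\]]+\]";  // plus the closing tag

// Check each stage and print what it matched.
foreach (string stage in new[] { firstPart, firstAndSecond, search })
{
    Match m = Regex.Match(text, stage);
    Console.WriteLine($"{m.Success}: {m.Value}");
}
```

Each stage should report a successful match, with the matched text growing by one part at a time. If a stage fails, the part that was just added is the one to look at.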

EDIT:

I see from your edit that you also want to capture the FirstText, in addition to the three groups:

string search = @"(\[deck=([^:]+):[^\]]+\])([^\[]+)(\[/deck[^\]]+\])";

The replacement string, then, would be something like:

string replace = "<span>$2</span>\n<div>$3</div>";
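
Putting the search and replacement strings together, a minimal sketch of the whole replacement on the larger example from the question might look like this. It assumes the pattern is a verbatim string with the final group's parenthesis closed; the surrounding "Intro text" and "Closing text" are hypothetical, added only to show that other content is left alone. It uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Text.RegularExpressions;

// The larger example from the question, with hypothetical surrounding text.
string text =
    "Intro text\n" +
    "[deck=Foo:13giljne]\n    Item #1\n    Item #2\n    Item #3\n    Item #4\n[/deck:13giljne]\n" +
    "Closing text";

// Group 1: whole opening tag; group 2: name; group 3: body; group 4: closing tag.
string search = @"(\[deck=([^:]+):[^\]]+\])([^\[]+)(\[/deck[^\]]+\])";
string replace = "<span>$2</span>\n<div>$3</div>";

string result = Regex.Replace(text, search, replace, RegexOptions.IgnoreCase);
Console.WriteLine(result);
```

One limitation of this sketch: the body is matched with `[^\[]+`, so it stops at the first `[`. A deck whose body contains another BBCode tag would not match; a lazy `.*?` combined with `RegexOptions.Singleline` is one possible alternative in that case.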
Jim Mischel
  • I need to separately match the first part, with `[deck=FirstText:13giljne]`, because I need to use what comes between `[deck=` and `:13giljne]`. – BRW Nov 15 '11 at 17:46
  • Your final regex still did not match the string. Thank you for the help, though. – BRW Nov 15 '11 at 17:58
0

Check out STML Parser on GitHub or NuGet. It doesn't use regex, but it is much more efficient and faster.

Tawani
-1

If you want to replace "[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]", don't use regex. Use Replace.

string result= text.Replace("[deck=FirstText:13giljne]Large Multiline Text[/deck:13giljne]", replace);

Regexes are usually used where the string is not fully known, but its structure is known.

m0skit0
  • That won't help. "FirstText" is not a static value, ":13giljne" is not a static value, and "Large Multiline Text" is not a static value. These are all variables the values of which I have no control over. That was just an example of something that MIGHT come out of the database that I need to match. – BRW Nov 15 '11 at 17:33
  • 2
    @m0skit0, um, I don't think you are quite grasping the spirit of what the OP is asking for. – Kirk Woll Nov 15 '11 at 17:34
  • I think Wayne did a pretty good job, especially on a first question. – jwheron Nov 15 '11 at 17:38
  • Well then, why don't you answer him? :P I don't care if it's his/her first question. He/she should explain better what he/she wants. Period. Anyway, good luck Wayne. – m0skit0 Nov 15 '11 at 17:41
  • 2
    I'll try to say this without being accusatory. I don't know if you've spent time in the review interface or read a lot of questions from new people. Not everyone has the benefit of years of programming experience. Not everyone understands how much detail we need to answer questions appropriately. You **should** care that it's Wayne's first question, because we were all in that place once, and someone had to help us out when we had no clue what was going on. – jwheron Nov 15 '11 at 17:52
  • If you look at the question comments, you'll see that I tried to understand what he meant. And I don't feel accused, because I've answered far more questions than I've asked, and of course I have no guilt about helping people. It's just that some people forget that they are the ones in need of help, not me. Thanks for the comment btw. – m0skit0 Nov 15 '11 at 19:16