2

I am not clued up on Regex as much as I should be, so this may seem like a silly question.

I am splitting a string into a string[] with .Split(' ').
The purpose is to check the words, or replace any.

The problem I'm having now is that for a word to be replaced, it has to be an exact match, but with the way I'm splitting, there might be a ( or [ attached to the split word.

So far, to counter that, I'm using something like this:
formattedText.Replace(">", "> ").Replace("<", " <").Split(' ').

This works fine for now, but I want to incorporate more special chars, such as [;\\/:*?\"<>|&'].

Is there a quicker way than the method of my replacing, such as Regex? I have a feeling my route is far from the best answer.

EDIT
This is an (example) string
would be replaced to
This is an ( example ) string

Brad Larson
TheGeekZn
  • I don't think I understand the question. What do you need these special characters to be replaced with? It would help if you showed an example string. – Chirag Bhatia - chirag64 Mar 05 '13 at 09:30
  • If you want to extract all words from a string, I guess this should be the answer to your question : http://stackoverflow.com/q/2159026/1236044 – jbl Mar 05 '13 at 09:59
  • @Chirag64 sorry - updated it. – TheGeekZn Mar 05 '13 at 09:59
  • If you're only interested in the words (and not the structure of the string), have you considered just replacing all the special characters with spaces, or removing them altogether? You could use lookarounds to replace any special characters not adjacent to a space with a space, and remove all the others. – rvalvik Mar 05 '13 at 10:40
  • Sorry for the late reply. I need the structure intact. – TheGeekZn Mar 05 '13 at 13:34

2 Answers

4

If you want to replace whole words, you can do that with a regular expression like this.

string text = "This is an example (example) noexample";
string newText = Regex.Replace(text, @"\bexample\b", "!foo!");

newText will contain "This is an !foo! (!foo!) noexample"

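The key here is \b, the word-boundary metacharacter. It is a zero-width assertion that matches at any position where a word character (\w) sits next to a non-word character (\W), and also at the start or end of the string when the adjacent character is a word character. The biggest difference from using \w or \W directly is that those consume a character, so they can't match at the very beginning or end of the string, where there is no character to consume.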
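To tie this back to the question, here is a minimal sketch of a whole-word replace that needs no splitting at all. The helper name ReplaceWord is made up for illustration; Regex.Escape is there on the assumption that the word to find may come from user input, and the lambda keeps the replacement literal so that a $ in it is not treated as a substitution.

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeWord
{
    // Hypothetical helper: replaces `word` only where it stands alone as a word.
    public static string ReplaceWord(string text, string word, string replacement)
    {
        // Regex.Escape neutralises any regex metacharacters inside `word`.
        string pattern = @"\b" + Regex.Escape(word) + @"\b";
        // The evaluator returns the replacement verbatim (no $-substitution).
        return Regex.Replace(text, pattern, m => replacement);
    }

    static void Main()
    {
        // The brackets stay attached; only the word between them changes.
        Console.WriteLine(ReplaceWord("This is an (example) string", "example", "sample"));
        // "noexample" is left alone: there is no boundary between 'o' and 'e'.
        Console.WriteLine(ReplaceWord("example first, noexample last", "example", "sample"));
    }
}
```

One caveat: \b only behaves this way when the word itself starts and ends with a word character. For something like "C#" the trailing \b would not match before a space.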

Jim Mischel
0

I think this is what you want.

If you want to put spaces around these symbols: ;\/:*?"<>|&'

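string input = "(exam;\\/:*?\"<>|&'ple)";
// Verbatim string, so \\ reaches the regex as an escaped backslash and "" is a literal quote.
// (In a regular string, "\\/" would only give the regex \/, which matches "/" but not "\".)
Regex reg = new Regex(@"[;\\/:*?""<>|&']");
// Surround every matched symbol with spaces.
string result = reg.Replace(input, delegate(Match m)
{
    return " " + m.Value + " ";
});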
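The delegate above can probably be dropped in favour of a substitution pattern, since all it does is re-emit the match with a space on each side. A minimal sketch, with the class and method names (PadSymbols, Pad) invented for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

static class PadSymbols
{
    // Verbatim string: \\ is an escaped backslash for the regex, "" is a literal quote.
    static readonly Regex Symbols = new Regex(@"[;\\/:*?""<>|&']");

    public static string Pad(string input)
    {
        // "$0" stands for the whole match, so each symbol comes back with a space on either side.
        return Symbols.Replace(input, " $0 ");
    }

    static void Main()
    {
        Console.WriteLine(Pad("a<b>c"));   // a < b > c
    }
}
```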

If you want to put spaces around every character that is not a word character (\W; roughly anything other than a-zA-Z0-9_, and that includes whitespace):

string input = "(example)";
Regex reg = new Regex(@"\W");
string result = reg.Replace(input, delegate(Match m)
{
    return " " + m.Value + " ";
});
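If the end goal is still a string[] of words with the brackets as separate items, the pad-then-Split step could possibly be skipped by letting Regex.Split do both jobs: text matched by a capturing group is kept in the result, while uncaptured whitespace is dropped. This is only a sketch; the Tokenizer name and the particular set of symbols in the character class are assumptions, so extend the class as needed.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class Tokenizer
{
    public static string[] Tokenize(string text)
    {
        // Whitespace runs are separators and are discarded;
        // a single bracket is also a separator, but it is captured, so Split keeps it.
        return Regex.Split(text, @"\s+|([()\[\]<>])")
                    .Where(t => t.Length > 0)   // drop empty pieces between adjacent separators
                    .ToArray();
    }

    static void Main()
    {
        string[] tokens = Tokenize("This is an (example) string");
        Console.WriteLine(string.Join(" ", tokens));   // This is an ( example ) string
    }
}
```

Note that this throws the original spacing away. If the whitespace has to survive a round trip, capture it as well, e.g. (\s+), and skip those tokens when checking words.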
Civa