There are a few aspects to this.
Premature optimization
The method given works and is easy to understand/maintain. Is it causing a performance problem?
If not, then don't worry about it. If it ever causes a problem, then look at it.
Expected Results
In the example, what do you want the output to be?
"Did you this asking"
or
"Did you  this   asking" (keeping the spaces that surrounded the removed words)
You have added spaces to the end of "try" and "before" but not "yourself". Why? Is that a typo?
string.Replace() is case-sensitive. If you care about casing, you need to modify the code.
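As a rough sketch of one way to ignore casing, the replacement can go through Regex with RegexOptions.IgnoreCase. The class and method names here are made up for illustration, and Regex.Escape is used on the assumption that stop words should be treated as literal text rather than patterns.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class CaseInsensitiveRedactor
{
    // Removes every occurrence of each stop word, ignoring case.
    public static string Redact(string input, string[] stopWords)
    {
        foreach (var word in stopWords)
        {
            // Regex.Escape keeps characters such as '.' or '+' literal.
            input = Regex.Replace(input, Regex.Escape(word), "", RegexOptions.IgnoreCase);
        }
        return input;
    }
}
```

Newer frameworks (.NET Core 2.0 and later, as far as I know) also offer a string.Replace overload that takes a StringComparison, which may be simpler if it is available to you.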
Working with partials is messy.
Words change form in different tenses. The example removes 'do' from 'doing', but what about 'take' and 'taking'?
The order of the stop words matters because you are changing the input. It is possible (I've no idea how likely but possible) that a word which was not in the input before a change 'appears' in the input after the change. Do you want to go back and recheck each time?
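A contrived sketch of the ordering problem, using a made-up input and a hypothetical helper that applies string.Replace once per stop word in the order given:

```csharp
// Hypothetical helper for illustration only.
public static class OrderDemo
{
    // Removes each stop word once, in the order supplied.
    public static string RemoveInOrder(string input, string[] stopWords)
    {
        foreach (var word in stopWords)
        {
            input = input.Replace(word, "");
        }
        return input;
    }
}

// OrderDemo.RemoveInOrder("beftryore", new[] { "before", "try" }) returns "before":
//   "before" is not present at first, and only appears once "try" has been removed.
// OrderDemo.RemoveInOrder("beftryore", new[] { "try", "before" }) returns "":
//   the same stop words in a different order catch the 'discovered' word.
```

Real input is unlikely to look like this, but it shows why the result can depend on the order of the list.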
Do you really need to remove the partials?
Optimizations
The current method is going to work its way through the input string n times, where n is the number of words to be redacted, creating a new string each time a replacement occurs. This is slow.
Using StringBuilder (akatakritos above) will speed that up somewhat, so I would try this first. Retest to see if this makes it fast enough.
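A minimal sketch of what the StringBuilder variant might look like. This is my own guess at the shape, not necessarily the code from the other answer, and the names are illustrative. It is still one pass per stop word and still case-sensitive, but the replacements happen inside a single mutable buffer.

```csharp
using System.Text;

// Hypothetical helper for illustration only.
public static class StringBuilderRedactor
{
    public static string Redact(string input, string[] stopWords)
    {
        var sb = new StringBuilder(input);
        foreach (var word in stopWords)
        {
            // StringBuilder.Replace modifies the buffer rather than returning a new string.
            sb.Replace(word, "");
        }
        return sb.ToString();
    }
}
```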
LINQ can be used.
EDIT
This just splits on ' ' to demonstrate. You would also need to allow for punctuation marks and decide what should happen to them.
END EDIT
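    // Requires: using System.Linq; using Microsoft.VisualStudio.TestTools.UnitTesting;
    [TestMethod]
    public void RedactTextLinqNoPartials()
    {
        var arrToCheck = new string[] { "try", "yourself", "before" };
        var input = "Did you try this yourself before asking";
        // Split into words, drop exact (case-sensitive) matches, and rejoin with single spaces.
        var output = string.Join(" ", input.Split(' ').Where(wrd => !arrToCheck.Contains(wrd)));
        Assert.AreEqual("Did you this asking", output);
    }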
This removes all the whole words, along with their spaces, so it will not be possible to see where the words were removed from. Without some benchmarking, though, I would not say that it is faster.
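If punctuation matters, one option (an assumption on my part, not something the question asked for) is to match whole words with regex word boundaries instead of splitting on ' '. A rough sketch with illustrative names:

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration only.
public static class WholeWordRedactor
{
    public static string Redact(string input, string[] stopWords)
    {
        // Builds a pattern such as \b(?:try|yourself|before)\b
        var pattern = @"\b(?:" + string.Join("|", stopWords.Select(w => Regex.Escape(w))) + @")\b";
        var withoutWords = Regex.Replace(input, pattern, "");
        // Collapse the runs of whitespace left behind by the removed words.
        return Regex.Replace(withoutWords, @"\s+", " ").Trim();
    }
}
```

With the input "Did you try this yourself, before asking?" this leaves "Did you this , asking?", so you still have to decide what to do with the orphaned comma.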
Handling partials with LINQ becomes messy, but it can work if we only want one pass (no checking for 'discovered' words).
    [TestMethod]
    public void RedactTextLinqPartials()
    {
        var arrToCheck = new string[] { "try", "yourself", "before", "ask" };
        var input = "Did you try this yourself before asking";
        var output = string.Join(" ", input.Split(' ').Select(wrd =>
        {
            // Only the first stop word found in each word is removed (single pass).
            var found = arrToCheck.FirstOrDefault(chk => wrd.Contains(chk));
            return found != null
                ? wrd.Replace(found, "")
                : wrd;
        }).Where(wrd => wrd != "")); // drop words that were removed entirely
        Assert.AreEqual("Did you this ing", output);
    }
Just from looking at this, I would say that it is slower than string.Replace(), but without some numbers there is no way to tell. It is definitely more complicated.
Bottom Line
The string.Replace() approach (modified to use StringBuilder and to be case-insensitive) looks like a good first-cut solution. Before trying anything more complicated, I would benchmark it under realistic conditions.
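For a quick first measurement, something along these lines with Stopwatch would do. The helper name and the iteration count are arbitrary, and the numbers only mean something if the input resembles your real data.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper for illustration only.
public static class RedactTimer
{
    // Runs the given redaction function repeatedly and returns the total elapsed time.
    public static TimeSpan Time(Func<string, string> redact, string input, int iterations)
    {
        redact(input); // warm-up call so JIT compilation is not included in the timing
        var watch = Stopwatch.StartNew();
        for (var i = 0; i < iterations; i++)
        {
            redact(input);
        }
        watch.Stop();
        return watch.Elapsed;
    }
}

// Example use:
// var elapsed = RedactTimer.Time(s => s.Replace("try ", ""), "Did you try this yourself before asking", 100000);
```

Pass each candidate implementation in as the redact argument and compare the timings on the same input.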
hth,
Alan.