
I want to split a string on both characters and words, such as `,`, `.`, `;`, `and`, `or`, `though`, `but`, etc.
Original string: "This movie is great. I like the story, acting is nice and direction is perfect but music is not good."
Result:
This movie is great
I like the story
acting is nice
direction is perfect
music is not good

I have tried this.

string test = "This movie is great. I like the story, acting is nice and direction is perfect but music is not good.";
var splittC = Regex.Split(test, ",");
foreach(var a in splittC){
    var splittD = Regex.Split(test, "."); 
    foreach(var b in splittD){
       var splittA = Regex.Split(test, "and"); 
    }
}// and so on....

This takes too many nested loops.
And if there is no comma in the string, it will not check the other delimiters. How can I solve these problems? Please help.

rafaelc
Ahmad Vaceem
  • 3
    Possible duplicate of [splitting a string based on multiple char delimiters](http://stackoverflow.com/questions/7605785/splitting-a-string-based-on-multiple-char-delimiters) – nikib3ro Sep 21 '16 at 19:54
  • 1
    there's a [string.Split overload](https://msdn.microsoft.com/en-us/library/tabh47cf(v=vs.110).aspx) that will do this for you – Jonesopolis Sep 21 '16 at 19:54
  • You can split on all of the delimiters without using Regex, for example `var splittC = test.Split(new[] { ',', '.' }, StringSplitOptions.RemoveEmptyEntries);` – MethodMan Sep 21 '16 at 19:56
  • `test.Split(new string[] { ",", ".", ";", "and", "or", "though", "but" }, StringSplitOptions.RemoveEmptyEntries);` – Arturo Menchaca Sep 21 '16 at 19:57

5 Answers

1

String.Split allows a string[] parameter.

Try this:

string test = "This movie is great. I like the story, acting is nice and direction is perfect but music is not good.";
var splitVals = test.Split(new string[] { ",", ".", ";", " and ", " or ", " though ", " but ", " etc. "}, StringSplitOptions.RemoveEmptyEntries);
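Because "." and "," are matched without their trailing space, some of the pieces returned by the call above start with a space (for example " I like the story"). A minimal sketch of one way to clean that up follows; the class name `ClauseSplitter` and the method `SplitClauses` are hypothetical names chosen for illustration, and the delimiter list is copied from the answer.

```csharp
using System;
using System.Linq;

public static class ClauseSplitter
{
    // Delimiters copied from the answer; word delimiters keep their
    // surrounding spaces so that "or" does not match inside "story".
    private static readonly string[] Delimiters =
    {
        ",", ".", ";", " and ", " or ", " though ", " but ", " etc. "
    };

    // Splits on every delimiter, then trims each piece and drops
    // pieces that are empty after trimming.
    public static string[] SplitClauses(string text)
    {
        return text
            .Split(Delimiters, StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0)
            .ToArray();
    }
}
```

Calling `ClauseSplitter.SplitClauses(test)` on the sample string should give the five clauses listed in the question, without leading or trailing spaces.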
John Bustos
  • This will split the `story` because of `or` – L.B Sep 21 '16 at 20:00
  • Nope, @L.B - Look closer, you have ` or ` – John Bustos Sep 21 '16 at 20:00
  • 1
    You do not need to declare `new StringSplitOptions()`. Pass `StringSplitOptions.RemoveEmptyEntries` instead and it will get rid of the last element, which you will see is `""` if you debug your current code. Change the call to the following: `var splitVals = test.Split(new string[] { ",", ".", ";", " and ", " or ", " though ", " but ", " etc. " },StringSplitOptions.RemoveEmptyEntries);` – MethodMan Sep 21 '16 at 20:06
  • Thanks. But with this approach I have to hard-code all the strings. What if I want to fetch the splitting characters from a file or a DB? – Ahmad Vaceem Sep 21 '16 at 20:06
  • @AhmadVaceem Don't extend your question. This post answers your question. Read some docs and if you get other problems post a new question. – L.B Sep 21 '16 at 20:08
  • @AhmadVaceem, so long as you feed in an array of words to split by, it'll work. So get the info from the DB as an array and you're good to go. – John Bustos Sep 21 '16 at 20:08
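The suggestion in the last comment, feeding in an array obtained at runtime, could be sketched roughly as below. `ConfigurableSplitter` and `BuildDelimiters` are hypothetical names, and the rule "entries made only of letters are words and get padded with spaces" is an assumption made for this sketch, not something stated in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ConfigurableSplitter
{
    // Turns raw entries (for example lines read from a file or rows
    // from a DB) into delimiters for String.Split.
    // Assumption: an entry made only of letters is a word and is padded
    // with spaces so that "or" does not match inside "story";
    // any other entry is used as-is.
    public static string[] BuildDelimiters(IEnumerable<string> rawEntries)
    {
        return rawEntries
            .Select(entry => entry.Trim())
            .Where(entry => entry.Length > 0)
            .Select(entry => entry.All(char.IsLetter) ? " " + entry + " " : entry)
            .ToArray();
    }

    // Splits the text on the runtime-supplied delimiters and trims the pieces.
    public static string[] Split(string text, IEnumerable<string> rawEntries)
    {
        return text
            .Split(BuildDelimiters(rawEntries), StringSplitOptions.RemoveEmptyEntries)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0)
            .ToArray();
    }
}
```

The entries could come from something like `File.ReadAllLines("delimiters.txt")`, where the file name is only a placeholder with one delimiter per line. Note that the space padding means a word delimiter directly next to punctuation or at the very start of the text is not matched.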
1

Parsing natural languages is hard because computers don't understand context. If they did, we could talk to them as if they were people.

Sometimes the ands and periods in sentences are not separators, and sometimes sentences don't start with capital letters.

iPhones are great, said Mr. Smith.

"A one and a two and a three and a four." sang the musicians.

To do the job well, I recommend you either

(a) very strictly control the input allowed, or

(b) use a natural language parsing library, such as SharpNLP which is native, or you can call NLTK from C#. NLTK is probably the best but even it sometimes fails. It's also 5 GB in size due to the training data its machine learning requires.

Community
0

To make this work you need to parse the sentence with a lexical analyser then process the objects produced. Example keyword lexical items are "and", "," etc. The rest of the text in the parsed items between the keyword items can then be concatenated and sent to the output.
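A very small sketch of that idea, under stated assumptions: the tokens are produced by a single regular expression rather than a real lexer, the keyword set is the one from the question, and `KeywordLexer` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class KeywordLexer
{
    // Keyword items that end the current clause (compared case-insensitively).
    private static readonly HashSet<string> Keywords =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ",", ".", ";", "and", "or", "though", "but"
        };

    public static List<string> Split(string text)
    {
        var clauses = new List<string>();
        var current = new List<string>();

        // A token is either a run of word characters/apostrophes
        // or a single punctuation character.
        foreach (Match token in Regex.Matches(text, @"[\w']+|[^\w\s]"))
        {
            if (Keywords.Contains(token.Value))
            {
                // A keyword closes the clause collected so far.
                if (current.Count > 0)
                {
                    clauses.Add(string.Join(" ", current));
                    current.Clear();
                }
            }
            else
            {
                current.Add(token.Value);
            }
        }

        // Flush whatever follows the last keyword.
        if (current.Count > 0)
        {
            clauses.Add(string.Join(" ", current));
        }
        return clauses;
    }
}
```

Since whole tokens are compared, "or" inside "story" is never treated as a keyword. Tokens are rejoined with single spaces, so the original spacing around non-keyword punctuation is not preserved.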

Brian Leeming
0

Try this simple regex I wrote; it may be helpful for you:

var splitRegex=@"\.|\,|\;|(?:\sand\s)|(?:\sor\s)|(?:\sthough\s)|(?:\sbut\s)";
var splittC = Regex.Split(test, splitRegex);
...

The result was shown in a screenshot ("Split by regex"). The pattern may need some modifications to work in all situations.
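One possible modification, sketched under assumptions: build the pattern from lists instead of writing it by hand, escape each entry with `Regex.Escape`, and use `\b` word boundaries instead of `\s` so a word delimiter also matches next to punctuation. `RegexClauseSplitter` and `BuildPattern` are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class RegexClauseSplitter
{
    // Joins escaped punctuation and whole-word alternatives into one pattern.
    // No capturing groups are used, so Regex.Split does not return the delimiters.
    public static string BuildPattern(string[] punctuation, string[] words)
    {
        var punctParts = punctuation.Select(p => Regex.Escape(p));
        var wordParts = words.Select(w => @"\b" + Regex.Escape(w) + @"\b");
        return string.Join("|", punctParts.Concat(wordParts));
    }

    // Splits on the generated pattern, then trims and drops empty pieces.
    public static string[] Split(string text, string[] punctuation, string[] words)
    {
        string pattern = BuildPattern(punctuation, words);
        return Regex.Split(text, pattern, RegexOptions.IgnoreCase)
            .Select(part => part.Trim())
            .Where(part => part.Length > 0)
            .ToArray();
    }
}
```

With punctuation `{ ",", ".", ";" }` and words `{ "and", "or", "though", "but" }`, the sample string should split into the five clauses from the question, and "story" stays intact because there is no word boundary before its "or".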

Abdo
0
string test = "This movie is great. I like the story, acting is nice and direction is perfect but music is not good.";
var splitVals = test.Split(new string[] 
{   ",", ".", ";", " and ", " or ",
    " though ", " but ", " etc. "
},StringSplitOptions.RemoveEmptyEntries);
MethodMan