5

I would like to split the example string:

~Peter~Lois~Chris~Meg~Stewie

on the character ~ and have the result be

Peter
Lois
Chris
Meg
Stewie

Using a standard string split function in JavaScript or C#, the first result is of course an empty string, because the input starts with the delimiter. I'd like to avoid blindly discarding the first result, because the first field may legitimately be an empty string.

I've been fiddling around with a regular expression and I'm stumped. I'm sure somebody has come across an elegant solution to this.

António Almeida
Craig
  • Er, what is it that you actually want? You seem to want to not ditch the first element, but you want to allow it to be an empty string...can you reword that section? – Rob Feb 01 '09 at 02:17
  • Agreed, you say you want to allow the first element to be an empty string, how will that differ from the case where you want the result to start with a non-empty string? – Adam Bellaire Feb 01 '09 at 02:25

3 Answers

4

For your requirements, I see two options:

(1) Remove the initial prefix character, if present.

(2) Use a full regular expression to separate the string.

Both are illustrated in this code:

using System;
using System.Linq;
using System.Text.RegularExpressions;

class APP { static void Main() {

string s = "~Peter~Lois~Chris~Meg~Stewie";

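// #1 - Trim+Split
// Caveat: TrimStart removes every leading '~', so for input like "~~Peter" a genuinely empty first field is dropped too.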
Console.WriteLine ("[#1 - Trim+Split]");
string[] result = s.TrimStart('~').Split('~');
foreach (string t in result) { Console.WriteLine("'"+t+"'"); }

// #2 - Regex
Console.WriteLine ("[#2 - Regex]");
Regex RE = new Regex("~([^~]*)");
MatchCollection theMatches = RE.Matches(s);
foreach (Match match in theMatches) { Console.WriteLine("'"+match.Groups[1].Value+"'"); }

// #3 - Regex with LINQ [ modified from @ccook's code ]
Console.WriteLine ("[#3 - Regex with LINQ]");
Regex.Matches(s, "~([^~]*)")
    .OfType<Match>()
    .ToList()
    .ForEach(m => Console.WriteLine("'"+m.Groups[1].Value+"'"))
    ;
}}

The regular expression in #2 matches the delimiter character followed by a match group containing zero or more non-delimiter characters. The resulting matches are the delimited strings (including any empty strings). For each match, "match.Value" is the entire matched string including the leading delimiter, and "match.Groups[1].Value" is the first match group, containing the delimiter-free string.

For completeness, the third variant (#3) applies the same regular expression as #2, but in a LINQ coding style.
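Since the question also mentions JavaScript, here is a rough sketch of the same two options in that language. The helper names `splitAfterPrefix` and `splitByRegex` are made up for illustration, and the first helper assumes that "remove the initial prefix character" means removing exactly one leading delimiter.

```javascript
// Hypothetical helpers; only built-in String and Array methods are used.

// Option 1: drop a single leading delimiter (if present), then split.
function splitAfterPrefix(s, delimiter = "~") {
  const body = s.startsWith(delimiter) ? s.slice(delimiter.length) : s;
  return body.split(delimiter);
}

// Option 2: same pattern as the C# regex above: a delimiter followed by
// a captured run of zero or more non-delimiter characters.
function splitByRegex(s) {
  return Array.from(s.matchAll(/~([^~]*)/g), (m) => m[1]);
}

const input = "~Peter~~Chris";
console.log(splitAfterPrefix(input)); // fields: "Peter", "", "Chris"
console.log(splitByRegex(input));     // fields: "Peter", "", "Chris"
```

Both keep interior empty fields. One difference worth knowing: the regex variant skips any text that appears before the first `~`, so it only suits input that always starts with the delimiter.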

If you are struggling with regular expressions, I highly recommend Mastering Regular Expressions, Third Edition by Jeffrey E. F. Friedl. It is, by far, the best aid to understanding regular expressions and later serves as an excellent reference or refresher as needed.

rivy
1

In C#, this seems to get what you want:

"~Peter~Lois~Chris~Meg~Stewie".Split("~".ToCharArray(), StringSplitOptions.RemoveEmptyEntries);
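For the JavaScript side of the question, a rough counterpart might look like the sketch below. JavaScript's `split` has no built-in option like `RemoveEmptyEntries`, so filtering afterwards is an assumed equivalent, and `splitDroppingEmpties` is a hypothetical name.

```javascript
// Hypothetical helper: split on the delimiter, then discard every empty field.
function splitDroppingEmpties(s, delimiter = "~") {
  return s.split(delimiter).filter((part) => part !== "");
}

console.log(splitDroppingEmpties("~Peter~Lois~Chris~Meg~Stewie"));
// fields: "Peter", "Lois", "Chris", "Meg", "Stewie"

console.log(splitDroppingEmpties("~Peter~~Chris"));
// fields: "Peter", "Chris" (the empty field between the two '~' is gone)
```

As the second call shows, this style drops all empty fields, not just the leading one, which matters if interior empty strings are meaningful.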
Jay Bazuzi
  • .ToCharArray() call is not needed if you just use single quotes around the ~ character, '~' instead of "~" – Eric Schoonover Feb 01 '09 at 02:29
  • @spoon16: For the default overload, that's correct, but because I'm using the overload that takes the enum, I have to pass an actual array for the first character. I got a compile error without it. – Jay Bazuzi Feb 01 '09 at 03:15
1

Here's a LINQ approach...

Note: with RegexOptions.ExplicitCapture, the unnamed group (~) does not capture, so the delimiters are not included in the result. Without it, Regex.Split would include each captured '~' in the output as well.

using System;
using System.Linq;
using System.Text.RegularExpressions;

namespace ConsoleApplication2
{
    class Program
    {
        static void Main(string[] args)
        {
            string s = "~Peter~Lois~Chris~Meg~Stewie";
            Regex.Split(s, "(~)", RegexOptions.ExplicitCapture)
                .Where(i=>!String.IsNullOrEmpty(i))
                .ToList().ForEach(i => Console.WriteLine(i));
            Console.ReadLine();
        }
    }
}
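The capture-group behavior has a close analogue in JavaScript, sketched here for comparison. JavaScript has no `ExplicitCapture` option; using a non-capturing group `(?:~)` is an assumed stand-in that plays a similar role.

```javascript
const sample = "~Peter~Lois";

// A capturing group makes split() include the delimiters in the result.
const withDelimiters = sample.split(/(~)/);
console.log(withDelimiters); // "", "~", "Peter", "~", "Lois"

// A non-capturing group leaves the delimiters out.
const withoutDelimiters = sample.split(/(?:~)/);
console.log(withoutDelimiters); // "", "Peter", "Lois"

// Filtering afterwards removes every empty field, as the LINQ Where() does.
const nonEmpty = withoutDelimiters.filter((part) => part !== "");
console.log(nonEmpty); // "Peter", "Lois"
```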
ccook
  • Unfortunately, this removes all empty strings [similar to the ".Split('~', StringSplitOptions.RemoveEmptyEntries)" answer], but the questioner wants internal empty strings preserved. The approach is enlightening though, so, I've posted a modification to my answer including a similar LINQ coding. – rivy Feb 01 '09 at 08:25