2

I'm looking for a way, in .NET, to split a string while ignoring split characters that are within quotes (or another delimiter). (This functionality would match what a typical CSV parser does if the split delimiter is a comma.) I'm not sure why this ability isn't built into String.Split().

Daniel Brückner
Pat

4 Answers

5

You can use a regular expression for that. Example:

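using System;
using System.Linq;
using System.Text.RegularExpressions;

string test = @"this,i""s,a"",test";
// Each match is a run of quoted sections ("...") and non-comma characters,
// so a comma inside quotes does not end the current value.
string[] parts =
  Regex.Matches(test, @"(""[^""]*""|[^,])+")
  .Cast<Match>()
  .Select(m => m.Value)
  .ToArray();

foreach (string s in parts) Console.WriteLine(s);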

Output:

this
i"s,a"
test
Guffa
  • Nice! This method works well for everything I need. I added a `.Select(m => m.Value.Trim())` to clean things up. – Pat Jul 06 '10 at 15:55
  • This didn't work for my test case, "java.exe -cp \"a stupid jar with a space.jar\" my.MainClass", which should have returned 4 parts, but I ended up with one. – spy Apr 07 '19 at 00:34
  • @spy: The example uses comma as a separator. You would need to use a space instead of the comma in the regular expression. – Guffa Apr 17 '19 at 23:58
1

Check out Marc's answer in this post:

Input array is longer than the number of columns in this table. Exception

He mentions a library you can use for this.

Community
spinon
0

If you also want to allow single quotes (') then change the expression to @"(""[^""]*""|'[^']*'|[^\s])+". Note that this version splits on whitespace rather than on commas.

If you want to remove the surrounding quotes from each value, change your Select to .Select(m => m.Value.Trim(new char[] { '\'', '"' })).
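
As a rough sketch of how the whitespace-splitting, quote-stripping variant behaves, here it is applied to the command line from the comment on the accepted answer. The helper name SplitArgs is hypothetical and not part of this answer, and each quantified group is written with `*` so that quoted sections of any length match.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the thread): splits on whitespace, keeps
// double- or single-quoted sections together, then strips the outer quotes.
static string[] SplitArgs(string input) =>
    Regex.Matches(input, @"(""[^""]*""|'[^']*'|[^\s])+")
         .Cast<Match>()
         .Select(m => m.Value.Trim('\'', '"'))
         .ToArray();

foreach (string part in SplitArgs("java.exe -cp \"a stupid jar with a space.jar\" my.MainClass"))
    Console.WriteLine(part);

// Prints:
// java.exe
// -cp
// a stupid jar with a space.jar
// my.MainClass
```

Trim only removes quote characters at the start and end of each value, so a quote in the middle of a value (as in i"s,a") is left in place.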

Brad
0

Using @Guffa's method, here is my full solution:

/// <summary>
/// Splits the string while preserving quoted values (i.e. instances of the delimiter character inside of quotes will not be split apart).
/// Trims leading and trailing whitespace from the individual string values.
/// Does not include empty values.
/// </summary>
/// <param name="value">The string to be split.</param>
/// <param name="delimiter">The delimiter to use to split the string, e.g. ',' for CSV.</param>
/// <returns>A collection of individual strings parsed from the original value.</returns>
public static IEnumerable<string> SplitWhilePreservingQuotedValues(this string value, char delimiter)
{
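    // Write the delimiter as a \uXXXX escape so that a delimiter such as '\' or ']' cannot break the character class.
    Regex csvPreservingQuotedStrings = new Regex(string.Format("(\"[^\"]*\"|[^\\u{0:X4}])+", (int)delimiter));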
    var values =
        csvPreservingQuotedStrings.Matches(value)
        .Cast<Match>()
        .Select(m => m.Value.Trim())
        .Where(v => !string.IsNullOrWhiteSpace(v));
    return values;
}
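
A usage sketch may help show what the trimming and empty-value filtering do in practice. The class name StringSplitExtensions is hypothetical (extension methods have to live in some static class), and writing the delimiter as a \uXXXX escape is a precaution assumed here so that unusual delimiter characters stay literal inside the character class.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Surrounding whitespace is trimmed, the empty value between ",," is dropped,
// and the quotes around "b, c" are kept.
foreach (string field in " a, \"b, c\" ,, d ".SplitWhilePreservingQuotedValues(','))
    Console.WriteLine("[" + field + "]");

// Prints:
// [a]
// ["b, c"]
// [d]

// Hypothetical container class for the extension method.
public static class StringSplitExtensions
{
    public static IEnumerable<string> SplitWhilePreservingQuotedValues(this string value, char delimiter)
    {
        // Assumption: emit the delimiter as \uXXXX so it is always treated literally.
        string pattern = string.Format("(\"[^\"]*\"|[^\\u{0:X4}])+", (int)delimiter);
        return Regex.Matches(value, pattern)
            .Cast<Match>()
            .Select(m => m.Value.Trim())
            .Where(v => !string.IsNullOrWhiteSpace(v));
    }
}
```

Because the method returns a lazily evaluated sequence, call ToList() or ToArray() on the result if you need to enumerate it more than once.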
Pat