22

When Span<T> was announced, I wanted to use it in a parser for my toy programming language. (Actually, I'd probably store a Memory<char>, but that's beside the point.)

However, I have grown used to switching on strings:

switch (myString) {
    case "function":
        return TokenType.Function;
    // etc.
}

Switching on a Span<char> won't work, and allocating a String to compare against kind of defeats the purpose of using a Span.

Switching to using if-else statements would result in the same problem.

So, is there a way to do this efficiently? Does ToString() on a Span<char> not allocate?

Plasticcaz

2 Answers

23

System.MemoryExtensions contains methods that compare contents of Spans.

On .NET Core, which supports an implicit conversion from String to ReadOnlySpan<char>, you would have:

ReadOnlySpan<char> myString = "function";

if (MemoryExtensions.Equals(myString, "function", StringComparison.Ordinal))
{
    return TokenType.Function;
}
else if (MemoryExtensions.Equals(myString, "...", StringComparison.Ordinal))
{
    ... 
}

I'm calling the MemoryExtensions.Equals explicitly here because that way it is happy with the implicit conversion of the string literal (e.g. "function") to a ReadOnlySpan<char> for comparison purposes. If you were to call this extension method in an object-oriented way, you would need to explicitly use AsSpan:

if (myString.Equals("function".AsSpan(), StringComparison.Ordinal))

If you are particularly attached to the switch statement, you could abuse the pattern matching feature to smuggle the comparisons in, but that would not look very readable or even helpful:

ReadOnlySpan<char> myString = "function";

switch (myString)
{
    case ReadOnlySpan<char> s when MemoryExtensions.Equals(s, "function", StringComparison.Ordinal):
        return TokenType.Function;
    case ReadOnlySpan<char> s when MemoryExtensions.Equals(s, "...", StringComparison.Ordinal):
        ...
        break;
}

If you are not using .NET Core and had to install the System.Memory NuGet package separately, you would need to append .AsSpan() to each of the string literals.
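As a rough sketch of how the if-else chain above could be folded into one helper, the comparisons can be driven by a small keyword table. The names `KeywordLookup`, `Classify`, `Keywords`, and the extra `TokenType` members are hypothetical and not from the question; the only library call used is `MemoryExtensions.Equals` with `StringComparison.Ordinal`, and `.AsSpan()` is written out so the code does not depend on the implicit conversion.

```csharp
using System;

// Hypothetical token kinds; only Function appears in the question.
public enum TokenType { Identifier, Function, Return }

public static class KeywordLookup
{
    // Hypothetical keyword table, created once when the class is first used.
    private static readonly (string Text, TokenType Type)[] Keywords =
    {
        ("function", TokenType.Function),
        ("return", TokenType.Return),
    };

    // Compares the span against each keyword without creating a string from it.
    public static TokenType Classify(ReadOnlySpan<char> word)
    {
        foreach (var (text, type) in Keywords)
        {
            // AsSpan() wraps the existing string's characters; it does not copy them.
            if (MemoryExtensions.Equals(word, text.AsSpan(), StringComparison.Ordinal))
                return type;
        }

        // Anything that is not a keyword is treated as an identifier in this sketch.
        return TokenType.Identifier;
    }
}
```

For example, `KeywordLookup.Classify("function test();".AsSpan().Slice(0, 8))` would return `TokenType.Function`. A linear scan is probably fine for a handful of keywords; a larger table might first branch on `word.Length`.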

GSerg
2

Calling ToString() would cause an allocation, because strings are immutable and a new one has to be created from the span's contents. Something you could consider instead is using the various MemoryExtensions class methods to perform the comparison. That way you could leave the source code being parsed in a Span<char> and use code such as the following:

System.ReadOnlySpan<char> myString = "function test();".AsSpan();
if (myString.StartsWith("function".AsSpan()))
    Console.WriteLine("function");

The string literals here are interned, so each one is allocated only once rather than on every comparison (the myString literal is just for demonstration), but you could still initialize the token table as a one-off operation outside the token parser method. You might also want to look into the Slice method as an efficient way to move through the code as you're parsing it.
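To illustrate the Slice suggestion, here is a minimal sketch of reading one word off the front of the input and keeping the remainder as another span. `Scanner` and `ReadWord` are hypothetical names, and treating a "word" as a run of letters is an assumption made for the example, not a rule from the question.

```csharp
using System;

public static class Scanner
{
    // Returns the leading run of letters in `source`; `rest` receives everything after it.
    // Both results are views over the same underlying characters, so nothing is copied.
    public static ReadOnlySpan<char> ReadWord(ReadOnlySpan<char> source, out ReadOnlySpan<char> rest)
    {
        int length = 0;
        while (length < source.Length && char.IsLetter(source[length]))
            length++;

        rest = source.Slice(length);
        return source.Slice(0, length);
    }
}
```

Calling `Scanner.ReadWord("function test();".AsSpan(), out var rest)` returns a span over `function` and leaves `rest` as ` test();`, which the parser can keep slicing as it advances.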

And thanks to GSerg for pointing out, among other things, that .NET Core can handle the implicit conversion from string to ReadOnlySpan<char>, so you can omit the AsSpan() if using .NET Core.

PeterJ
  • Why via `ToCharArray()`? `ReadOnlySpan<char> myString = "function";`, and then `if (MemoryExtensions.Equals(myString, "function", StringComparison.Ordinal)) { ... }` or `if (myString.Equals("function".AsSpan(), StringComparison.Ordinal)) { ... }`. – GSerg Oct 06 '18 at 14:15
  • @GSerg is `ReadOnlySpan` only available under .NET Core? I just tried it using LINQPad / 4.7 and it didn't find a reference. – PeterJ Oct 06 '18 at 14:28
  • But `Span` is also only available in Core; for regular .NET you need to [install the nuget package](https://stackoverflow.com/a/47321870/11683). – GSerg Oct 06 '18 at 14:31
  • Just realized I did something dumb and typed your code into a new LINQPad window without the reference. But it still couldn't do an implicit type conversion from string to ReadOnlySpan but making it all a ReadOnlySpan is a good idea which I'll change, I'd just used `StartsWith` because I assume that's what he'd really want for a parser. – PeterJ Oct 06 '18 at 14:55
  • The implicit conversion only works for .NET Core. You have to add `AsSpan()` otherwise. – GSerg Oct 06 '18 at 14:56
  • @GSerg Thanks for the help I haven't done anything with core yet and assumed that sort of thing would be the same but have updated with that info. – PeterJ Oct 06 '18 at 15:04