Which common operations can be made more efficient by Span?

Question

Let's say I have a Web application and I want to make use of the new Span<T> type to reduce GC pressure and improve performance.

Which patterns should I look out for? Are there any typical operations that the .NET team had in mind when implementing this new feature?

https://blogs.msdn.microsoft.com/dotnet/2017/11/15/welcome-to-c-7-2-and-span/ or http://adamsitnik.com/Span/ may be of interest. — mjwills, Nov 16 '17 at 12:43
[MSDN Magazine: C# 7.2: Understanding Span Video](https://channel9.msdn.com/Events/Connect/2017/T125) — Tim Schmelter, Nov 16 '17 at 12:45
If anything it can simplify the API. I do a lot of vector/matrix operations and I am excited to commonize storage. Currently I have different classes for unmanaged (`fixed`) points & managed arrays for storage. — John Alexiou, Nov 17 '17 at 12:32
No *common* operations can be sped up by using `Span`, as in, just drop it in and watch the performance go up -- it's more an issue of allowing you to rethink the way you handle buffers and allocation. If you have a code path where arrays are passed/manipulated but never stored in fields (or you can factor out the field storage into parameters), `Span` allows you to bypass copying and reduce GC pressure. Code bases that are already heavily invested in the use of unmanaged memory to avoid GC obviously benefit from `Span`, but that's not your average web application. — Jeroen Mostert, Nov 17 '17 at 12:35
For web applications, you can expect infrastructure code like Kestrel and other parts of the managed network stack to benefit from `Span` on the lower levels, without even involving your code. — Jeroen Mostert, Nov 17 '17 at 12:36

Patrick Hofman · Answer 1 · 2018-03-11T07:56:56.610

There are quite a few cases where this new type and its infrastructure can help, but whether they are common depends on your code.

As an example, see this pre-C# 7.2 implementation:

static void Main(string[] args)
{
    byte[] file = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
    byte[] header = file.Take(4).ToArray();
    byte[] content = file.Skip(4).ToArray();

    bool isValid = IsValidHeader(header);
}

private static bool IsValidHeader(byte[] header)
{
    return header[0] == 0 && header[1] == 1;
}

The calls file.Take(4).ToArray() and file.Skip(4).ToArray() are the problem here: each one allocates a new array and copies data into it, just to split the byte array into two parts. The bigger the byte array gets, the bigger the impact on performance and memory usage (imagine a 10 MB file array that suddenly takes 20 MB in memory).

Now see the C# 7.2 implementation:

static void Main(string[] args)
{
    byte[] file = new byte[] { 0, 1, 2, 3, 4, 5, 6, 7, 8 };
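    ReadOnlySpan<byte> header = file.AsSpan(0, 4);  // view over the first 4 bytes, no copy
    ReadOnlySpan<byte> content = file.AsSpan(4);    // view over the remaining bytes, no copy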

    bool isValid = IsValidHeader(header);
}

private static bool IsValidHeader(ReadOnlySpan<byte> header)
{
    return header[0] == 0 && header[1] == 1;
}

Using ReadOnlySpan<T> (part of the Span<T> infrastructure) here makes it possible to use the data in the array without duplicating it. A span is only a view over the existing array, so the memory footprint is still 10 MB. And since arrays, array segments, stack-allocated buffers and unmanaged memory can all be exposed as spans, you can build one common method for all of these sources.
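To make the "one common method" point concrete, here is a minimal sketch. The class name HeaderCheck and the extra length guard are my own additions for illustration, not part of the original example. It feeds the same method from a whole array, a slice of that array, and a stack-allocated buffer.

```csharp
using System;

static class HeaderCheck
{
    // Same check as above, plus a length guard (an addition for this sketch)
    // so that very short inputs return false instead of throwing.
    public static bool IsValidHeader(ReadOnlySpan<byte> header)
    {
        return header.Length >= 2 && header[0] == 0 && header[1] == 1;
    }

    public static void Main()
    {
        byte[] file = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };

        // Source 1: the whole array (implicit conversion, no copy).
        Console.WriteLine(IsValidHeader(file));               // True

        // Source 2: slices of the array (still no copy).
        Console.WriteLine(IsValidHeader(file.AsSpan(0, 4)));  // True
        Console.WriteLine(IsValidHeader(file.AsSpan(4)));     // False

        // Source 3: a stack-allocated buffer (no heap allocation).
        Span<byte> stackBuffer = stackalloc byte[4];
        stackBuffer[0] = 0;
        stackBuffer[1] = 1;
        Console.WriteLine(IsValidHeader(stackBuffer));        // True
    }
}
```

The caller decides where the bytes live; IsValidHeader only sees a contiguous, read-only window onto them.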

Of course, this could have been implemented with IEnumerable<T> too, but a span performs much better: slicing only records an offset and a length, whereas a lazy Skip may have to step past the skipped elements again every time you enumerate the content variable.
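The repeated skipping can be made visible with a small sketch. The Read helper and the Visited counter are hypothetical, added only to count how many source elements the lazy pipeline touches. A plain array source may be optimized by some LINQ implementations, so a counting iterator stands in for a generic lazy source here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SkipVersusSlice
{
    // Hypothetical counter: number of source elements handed out lazily.
    public static int Visited;

    // Hypothetical lazy source that counts every element it yields.
    public static IEnumerable<byte> Read(byte[] source)
    {
        foreach (byte b in source)
        {
            Visited++;
            yield return b;
        }
    }

    public static int SumLazy(IEnumerable<byte> bytes)
    {
        int total = 0;
        foreach (byte b in bytes) total += b;
        return total;
    }

    public static int SumSpan(ReadOnlySpan<byte> bytes)
    {
        int total = 0;
        foreach (byte b in bytes) total += b;
        return total;
    }

    public static void Main()
    {
        byte[] file = { 0, 1, 2, 3, 4, 5, 6, 7, 8 };

        // Lazy version: every enumeration restarts at the first element.
        IEnumerable<byte> lazyContent = Read(file).Skip(4);
        Console.WriteLine(SumLazy(lazyContent)); // 30
        Console.WriteLine(SumLazy(lazyContent)); // 30
        Console.WriteLine(Visited);              // 18: all 9 elements, twice

        // Span version: the slice starts at offset 4; nothing is skipped again.
        ReadOnlySpan<byte> spanContent = file.AsSpan(4);
        Console.WriteLine(SumSpan(spanContent)); // 30
    }
}
```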
