An Introduction to Writing High-Performance C# Using Span<T> Struct
Let’s talk about Span<T> today. It has been discussed for a couple of years now, since it was introduced with C# 7.2 and is supported in .NET Core 2.1 and later runtimes. In this article we will go through some examples of how Span<T> is used and discuss why you should consider using it when you write your next lines of code.
📖 What is Span<T>?
System.Span<T> is a new value type at the heart of .NET. It enables the representation of contiguous regions of arbitrary memory, regardless of whether that memory is associated with a managed object, is provided by native code via interop, or is on the stack. And it does so while still providing safe access with performance characteristics like that of arrays. 🙄 Yeah, I was a bit confused too. Let’s break it down!
First, it’s a type, and a value type at that. (There are two kinds of types in C#: reference types and value types. Variables of reference types store references to their data (objects), while variables of value types directly contain their data.) Span<T> provides type-safe access (i.e. it prevents objects of one type from peeking into the memory assigned to another object) to a contiguous area of memory (adjacent elements, one after another in sequence). This memory can be located on the heap, on the stack, or even in unmanaged memory.
Developers generally don’t need to understand how a library they’re using is implemented. However, in the case of Span<T>, it’s worthwhile to have at least a basic understanding of the details behind it. As I mentioned earlier, it is a value type containing a ref and a length, defined approximately as follows:
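Here is a rough sketch of that shape. It is not the real System.Span<T> source: the name SpanSketch<T> and its members are made up for illustration, it assumes a non-empty array, and ref fields need C# 11 or later.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative only: NOT the real System.Span<T> implementation.
// `ref` fields require C# 11 / .NET 7 or later.
public readonly ref struct SpanSketch<T>
{
    private readonly ref T _reference; // managed pointer to the first element
    private readonly int _length;      // number of accessible elements

    public SpanSketch(T[] array)
    {
        _reference = ref array[0];     // assumes a non-empty array, for brevity
        _length = array.Length;
    }

    public int Length => _length;

    // Bounds-checked access, computed relative to the stored reference.
    public ref T this[int index]
    {
        get
        {
            if ((uint)index >= (uint)_length)
                throw new IndexOutOfRangeException();
            return ref Unsafe.Add(ref _reference, index);
        }
    }
}
```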
Because of this ref field, a span refers to its data (an object, an array, etc.) by reference, much like a pointer in C, so what you hold on the stack is effectively a ref T. As a result, operations can be as efficient as on arrays: indexing into a span doesn’t require computation to determine the beginning from a pointer and its starting offset, as the ref field itself already encapsulates both.
This also means that spans are only a view into the underlying memory; they are not a way to instantiate a block of memory. Span<T> provides read-write access to the memory and ReadOnlySpan<T> provides read-only access. Therefore, creating multiple spans on the same array creates multiple views of the same memory. Let me elaborate.
Suppose you have an array of four strings allocated somewhere on the heap. You can wrap a span around this string array by passing it to the span constructor. Doing so points the span's reference field at the memory address where the data starts (the 0th element of the array) and sets the length field to the number of consecutive accessible elements (in this case, 4).
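A minimal sketch of that wrapping, assuming a hypothetical four-element array called names (the class and variable names are mine, not from the original figure):

```csharp
using System;

public static class WrapDemo
{
    public static int Run()
    {
        // Hypothetical data: four strings living in one heap-allocated array.
        string[] names = { "alpha", "beta", "gamma", "delta" };

        // Wrapping allocates no new array; the span refers to names[0] onwards.
        var span = new Span<string>(names);

        // Writing through the span changes the underlying array.
        span[0] = "ALPHA";
        Console.WriteLine(names[0]);    // prints "ALPHA"
        Console.WriteLine(span.Length); // prints 4

        return span.Length;
    }
}
```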
To understand this further, consider the following example. Let’s create two spans over slices of the same array and call them the first view and the second view.
You can see that firstView and secondView overlap. This is not a problem because, as I mentioned earlier, Span<T> is only a view into the underlying memory.
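The overlap can be sketched like this. The array contents and slice bounds are assumptions chosen for illustration; only the names firstView and secondView come from the text.

```csharp
using System;

public static class OverlapDemo
{
    public static int[] Run()
    {
        int[] numbers = { 0, 1, 2, 3, 4, 5, 6, 7 };

        Span<int> firstView = numbers.AsSpan(0, 5);  // elements 0..4
        Span<int> secondView = numbers.AsSpan(3, 5); // elements 3..7

        // numbers[3] is visible as firstView[3] and as secondView[0].
        firstView[3] = 42;
        Console.WriteLine(secondView[0]); // prints 42

        return numbers;
    }
}
```

Neither view owns the data, so a write through one is immediately visible through the other and through the array itself.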
We can use Span with:
- Heap (Managed objects) — e.g. Arrays, Strings
- Stack (via stackalloc)
- Native/Unmanaged (P/Invoke)
This is very useful because we can simply slice an existing chunk of memory and work with it; we don’t have to copy it or allocate new memory.
List of types that we can convert to Span<T>:
- Arrays
- Pointers
- stackalloc
- IntPtr
To ReadOnlySpan<T> we can convert all of the above, plus string.
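A short sketch of the safe conversions (array, stackalloc and string). Pointers and IntPtr are left out because they need an unsafe context; the class name and values are illustrative.

```csharp
using System;

public static class ConversionDemo
{
    public static int Run()
    {
        // Array: implicit conversion to Span<T>.
        int[] array = { 1, 2, 3, 4 };
        Span<int> fromArray = array;

        // stackalloc: assigned straight to a span, no unsafe block needed.
        Span<int> fromStack = stackalloc int[4];
        fromArray.CopyTo(fromStack);

        // string: only ReadOnlySpan<char>, because strings are immutable.
        ReadOnlySpan<char> fromString = "hello world";
        ReadOnlySpan<char> word = fromString.Slice(0, 5); // no new string is allocated

        // Span<T> converts implicitly to ReadOnlySpan<T>.
        ReadOnlySpan<int> readOnly = fromStack;

        return readOnly[3] + word.Length; // 4 + 5
    }
}
```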
Okay, now that you understand the structure and basic implementation of Span<T>, let’s move on to optimizations.
🚀 Optimize How?
Consider this requirement: we need a method that takes an array and returns a quarter of its elements, starting from the middle element.
In a world without Span<T>, I would write something like: return myArray.Skip(Size / 2).Take(Size / 4).ToArray();
This works, but we want to compare it with a few other implementations, so we are using BenchmarkDotNet, a benchmarking library for .NET. Designing benchmarks is outside the scope of this article, but it is fairly simple. You can read more in the BenchmarkDotNet documentation.
In our example we have three methods that do the same thing: one with the earlier implementation, another with Array.Copy(), and finally one with Span<T> (it uses the AsSpan extension method, which creates a new span over a portion of the target array, from a specified position for a specified length).
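The three variants might look roughly like this. This is a sketch without the BenchmarkDotNet attributes; the class name, the ArrayCopy and SpanSlice method names and the use of int elements are my assumptions, not the original benchmark code.

```csharp
using System;
using System.Linq;

public static class QuarterFromMiddle
{
    // LINQ version: enumerates the source and allocates a new array.
    public static int[] Original(int[] source) =>
        source.Skip(source.Length / 2).Take(source.Length / 4).ToArray();

    // Array.Copy version: one allocation plus a block copy.
    public static int[] ArrayCopy(int[] source)
    {
        var copy = new int[source.Length / 4];
        Array.Copy(source, source.Length / 2, copy, 0, source.Length / 4);
        return copy;
    }

    // Span version: a view over the same array; nothing is copied or allocated.
    public static Span<int> SpanSlice(int[] source) =>
        source.AsSpan(source.Length / 2, source.Length / 4);
}
```

All three return the same elements. The difference is that the first two hand back a fresh array, while the span version hands back a window onto the original one.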
We have created a BenchmarkDemo1 class with the [MemoryDiagnoser] attribute and a SetUp() method that fills the array. The methods are annotated with [Benchmark] to make them benchmark tests, and the Original() method is set as the Baseline. When you run this console application, you need to set the Solution Configuration to Release, not Debug.
Okay, let’s talk about the results, especially the Mean and Allocated columns of the summary.
You can clearly see that for all three sizes, Span<T> took only around 1 ns (nanosecond), whereas the other methods took drastically longer. Also, notice that Span<T> allocated no memory at all. 😮
Let’s take another example: given a date string in the format “dd mm yyyy”, convert it to a DateTime.
Usually I would do something like this.
Let’s do the same with Span. For this we are going to use a version of Span<T> called ReadOnlySpan<T>. Just as a span is a synchronous accessor for memory, a read-only span is a synchronous accessor for read-only memory.
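Both approaches can be sketched side by side. This assumes two-digit day and month fields separated by single spaces; the class and method names are illustrative rather than the original benchmark code.

```csharp
using System;

public static class DateParsing
{
    // String version: every Substring call allocates a new string.
    public static DateTime WithSubstring(string date)
    {
        int day = int.Parse(date.Substring(0, 2));
        int month = int.Parse(date.Substring(3, 2));
        int year = int.Parse(date.Substring(6));
        return new DateTime(year, month, day);
    }

    // Span version: Slice only narrows the view over the original string.
    public static DateTime WithSpan(string date)
    {
        ReadOnlySpan<char> span = date; // implicit conversion, same as date.AsSpan()
        int day = int.Parse(span.Slice(0, 2));
        int month = int.Parse(span.Slice(3, 2));
        int year = int.Parse(span.Slice(6));
        return new DateTime(year, month, day);
    }
}
```

The int.Parse overload that accepts a ReadOnlySpan<char> is what makes the second version allocation-free; it is available from .NET Core 2.1 onwards.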
Now when we run this benchmark, we see the same kind of optimization:
zero allocations and a 25% decrease in the mean.
Full source code for these benchmarks can be found here.
⚠️ Limitations of Span<T>
Span<T> is a ref struct that is allocated on the stack rather than on the managed heap. Ref struct types have a number of restrictions to ensure that they cannot be promoted to the managed heap, including that they can't be boxed (the process of converting a value type to the type object or to any interface type implemented by this value type), they can't be assigned to variables of type Object, dynamic or to any interface type, they can't be fields in a reference type, and they can't be used across await and yield boundaries. In addition, calls to two methods, Equals(Object) and GetHashCode, throw a NotSupportedException.
Because it is a stack-only type, Span<T> is unsuitable for many scenarios that require storing references to buffers on the heap. This is true, for example, of routines that make asynchronous method calls. For such scenarios, you can use the complementary System.Memory<T> and System.ReadOnlyMemory<T> types, which is another topic for a future discussion. 😉
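As a small taste of that, here is a sketch of an async method that accepts Memory<int> (a Span<int> parameter would not compile there) and only turns it into a span inside a synchronous helper. The class and method names and the delay are purely illustrative.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncBufferDemo
{
    // Memory<int> can live on the heap, so it is allowed in async methods.
    public static async Task<int> SumAfterDelayAsync(Memory<int> buffer)
    {
        await Task.Delay(10); // stands in for real asynchronous work

        // The span is created only after the await and handed
        // straight to synchronous code.
        return Sum(buffer.Span);
    }

    // Synchronous helper: spans are fine here.
    private static int Sum(ReadOnlySpan<int> values)
    {
        int total = 0;
        foreach (int value in values)
            total += value;
        return total;
    }
}
```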
Thank you for coming this far. I will see you in the next one. 🖖
Further Reading: