[DotNetBook] Span: a new .NET data type


With this article I continue publishing a series of articles that will eventually become a book on how the .NET CLR, and .NET as a whole, works (about 200 pages of the book are already written, so see the links at the end of the article).

Both the language and the platform have existed for many years, and all this time they have offered plenty of ways to work with unmanaged code. So why introduce yet another API for working with unmanaged code now, when such tools have been around for so long? To answer this question, it is enough to understand what we were missing before.

The platform developers have already tried to brighten up our everyday work with unmanaged resources: there are automatic wrappers for imported methods, there is marshalling, which in most cases works automatically, and there is the stackalloc instruction, described in the chapter on the thread stack. However, as I see it, while the early C# developers came from the C++ world (as I did), today they come from higher-level languages (for example, I know a developer who came from JavaScript). What does this mean? It means that people have grown more suspicious of unmanaged resources and of constructs that are close in spirit to C/C++, let alone assembly language.

Note

The chapter published on Habr is not updated and may already be somewhat outdated. For the most recent text, please refer to the original:

CLR Book: GitHub, table of contents
CLR Book: GitHub, chapter
Release 0.5.2 of the book, PDF: GitHub Release

As a result of this attitude, projects contain less and less unsafe code and rely more and more on the API of the platform itself. This is easy to verify by searching open repositories for uses of the stackalloc construct: they are vanishingly rare. But take any code that does use it:

Class Interop.ReadDir
/src/mscorlib/shared/Interop/Unix/System.Native/Interop.ReadDir.cs

unsafe
{
    // s_readBufferSize is zero when the native implementation does not support reading into a buffer.
    byte* buffer = stackalloc byte[s_readBufferSize];
    InternalDirectoryEntry temp;
    int ret = ReadDirR(dir.DangerousGetHandle(), buffer, s_readBufferSize, out temp);
    // We copy data into DirectoryEntry to ensure there are no dangling references.
    outputEntry = ret == 0 ?
            new DirectoryEntry() { 
               InodeName = GetDirectoryEntryName(temp), InodeType = temp.InodeType 
            } 
            : default(DirectoryEntry);
    return ret;
}

The reason for the unpopularity becomes clear. Glance at the code without reading it closely and answer one question for yourself: do you trust it? I would guess the answer is no. Then answer another: why? The answer is obvious: besides the word Dangerous, which hints that something can go wrong, the second thing that shapes our attitude is the line byte* buffer = stackalloc byte[s_readBufferSize];, and more specifically the byte*. This notation triggers the same thought in anyone's head: "couldn't this have been done some other way?". Let's dig a little deeper into the psychoanalysis: where does such a thought come from? On the one hand, we are using ordinary language constructs, and the syntax offered here is a far cry from, say, C++/CLI, which lets you do anything at all (up to inline pure assembler); on the other hand, it still looks unusual.

So what is the question? How do we bring developers back into the fold of unmanaged code? We need to give them confidence that they cannot make a mistake by accident or out of ignorance. That is what the types Span<T> and Memory<T> were introduced for.

Span<T>, ReadOnlySpan<T>

The Span type represents a part of some data array, a subrange of its values, while allowing you, just as with an array, to both read and write the elements of that range. To warm up and get the general picture, let's compare the kinds of data that Span is implemented for and look at the possible goals behind its introduction.

The first data type that comes to mind is a regular array. For arrays, working with Span looks like this:

    var array = new [] {1,2,3,4,5,6};
    var span = new Span<int>(array, 1, 3);
    var position = span.BinarySearch(3);
    Console.WriteLine(span[position]);  // -> 3

As this example shows, we first create an array of data. Then we create a Span (a subset) that refers to the array itself but gives the code using it access only to the range of values specified at initialization.
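To make the "window over the array" idea tangible, here is a minimal sketch (the class and method names are made up for illustration). It shows that writing through a Span changes the underlying array and that nothing outside the chosen range is reachable:

```csharp
using System;

public static class SpanViewDemo
{
    // Writes through a Span and returns the underlying array,
    // showing that the Span is a window over the array, not a copy.
    public static int[] WriteThroughSpan()
    {
        var array = new[] { 1, 2, 3, 4, 5, 6 };
        var span = new Span<int>(array, 1, 3); // covers array[1], array[2], array[3]

        span[0] = 20;           // same memory as array[1]
        span.Slice(1).Fill(0);  // zeroes array[2] and array[3]

        // span[3] would throw IndexOutOfRangeException:
        // the span exposes only three elements.
        return array;           // { 1, 20, 0, 0, 5, 6 }
    }
}
```

Note that array[0], array[4] and array[5] stay untouched: the code that received the span has no way to address them.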

Here we see the first property of this data type: it creates a context. Let's develop this idea of contexts:

void Main()
{
    var array = new [] {'1','2','3','4','5','6'};
    var span = new Span<char>(array, 1, 3);
    if(TryParseInt32(span, out var res))
    {
        Console.WriteLine(res);
    }
    else
    {
        Console.WriteLine("Failed to parse");
    }
}
public bool TryParseInt32(ReadOnlySpan<char> input, out int result)
{
    result = 0;
    for (int i = 0; i < input.Length; i++)
    {
        // Reject anything that is not an ASCII digit
        if (input[i] < '0' || input[i] > '9')
            return false;
        result = result * 10 + (input[i] - '0');
    }
    return true;
}
-----
234

As we can see, Span<T> introduces an abstraction for accessing a section of memory, for both reading and writing. What does this give us? If we recall what else a Span can be built on, both unmanaged resources and strings come to mind:

// Managed array
var array = new[] { '1', '2', '3', '4', '5', '6' };
var arrSpan = new Span<char>(array, 1, 3);
if (TryParseInt32(arrSpan, out var res1))
{
    Console.WriteLine(res1);
}
// String (AsSpan() returns ReadOnlySpan<char>, so TryParseInt32 must accept ReadOnlySpan<char>)
var srcString = "123456";
var strSpan = srcString.AsSpan().Slice(1, 3);
if (TryParseInt32(strSpan, out var res2))
{
    Console.WriteLine(res2);
}
// Stack memory (stackalloc)
Span<char> buf = stackalloc char[6];
buf[0] = '1'; buf[1] = '2'; buf[2] = '3';
buf[3] = '4'; buf[4] = '5'; buf[5] = '6';
if (TryParseInt32(buf.Slice(1, 3), out var res3))
{
    Console.WriteLine(res3);
}
-----
234
234
234

In other words, Span<T> unifies work with memory, both managed and unmanaged, and keeps that work safe during garbage collection: if the memory regions holding managed arrays start to move, this is safe for us.
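For completeness, here is a hedged sketch of the truly unmanaged case: a Span over memory obtained from Marshal.AllocHGlobal. The class and method names are invented for illustration, and the code assumes the project is compiled with unsafe code allowed (AllowUnsafeBlocks), because the pointer-based Span constructor needs an unsafe context:

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeSpanDemo
{
    // Fills a block of native (unmanaged) memory through a Span and sums it.
    public static unsafe int SumNativeBytes()
    {
        const int length = 4;
        IntPtr memory = Marshal.AllocHGlobal(length); // not tracked by the GC
        try
        {
            var span = new Span<byte>(memory.ToPointer(), length);
            for (int i = 0; i < span.Length; i++)
                span[i] = (byte)(i + 1);              // 1, 2, 3, 4

            return Sum(span);
        }
        finally
        {
            Marshal.FreeHGlobal(memory);              // we own the memory, so we free it
        }
    }

    // The consumer neither knows nor cares where the memory lives.
    public static int Sum(ReadOnlySpan<byte> data)
    {
        int total = 0;
        foreach (byte b in data)
            total += b;
        return total;
    }
}
```

The only unsafe line is the one that creates the span; Sum itself is ordinary safe code and works equally well with a byte[] argument.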

However, is this really worth so much joy? Couldn't all of this have been achieved before? For managed arrays there is no doubt: just wrap the array in another class that provides a similar interface, and you are done. The same can be done with strings: they have the necessary methods. Again, just wrap the string in the same kind of type and provide methods for working with it. The catch is that to store a string, a buffer, or an array in a single type, you have to go to some trouble and keep a reference to each possible option in one instance (only one of them being active, of course):

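public unsafe readonly ref struct OurSpan<T> where T : unmanaged
{
    // Only one of these three fields is in use at any time
    private readonly T[] _array;
    private readonly string _str;
    private readonly T* _buffer;
    // ...
}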

Or, approaching it from the architecture side, make three types that implement a common interface. It turns out that to build a single managed interface over these kinds of data while keeping maximum performance, there is no other way than Span<T>.

Continuing the argument: what is a ref struct in terms of Span? These are the very "structures that live only on the stack" that we hear about so often in interviews. It means that this data type can travel only through the stack and has no right to end up on the heap. Therefore Span, being a ref struct, is a context data type that supports the work of methods rather than of objects in memory. This is the starting point for understanding it.
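A small sketch of what "only on the stack" means in practice (the class name is hypothetical). The compiling part shows the allowed uses; the comments list a few constructs that the compiler rejects because they would move the Span to the heap:

```csharp
using System;

public static class RefStructRules
{
    // Allowed: a Span travels as a parameter or a local variable.
    public static int FirstElement(Span<int> data) => data[0];

    public static int Demo()
    {
        Span<int> local = stackalloc int[] { 7, 8, 9 };
        return FirstElement(local);
    }

    // Each of the following is a compile-time error, because it would
    // let the Span escape to the heap:
    //
    //   class Holder { Span<int> _field; }   // field of a class
    //   object boxed = local;                 // boxing
    //   Func<int> f = () => local[0];         // capture in a lambda
}
```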

From here we can formulate a definition of the Span type and of its read-only counterpart, ReadOnlySpan:

Span is a data type that provides a single interface for working with heterogeneous kinds of data arrays, as well as the ability to pass a subset of such an array to another method so that, regardless of the depth of the context, access to the original array stays constant in speed and as fast as possible.

And indeed, if we have code like this:

public void Method1(Span<byte> buffer)
{
    buffer[0] = 0;
    Method2(buffer.Slice(1, 2));
}
public void Method2(Span<byte> buffer)
{
    buffer[0] = 0;
    Method3(buffer.Slice(1, 1));
}
public void Method3(Span<byte> buffer)
{
    buffer[0] = 0;
}

then access to the source buffer will be as fast as possible: you are working not with a managed object but with a managed pointer, that is, not with a .NET managed type but with an unsafe type wrapped in a managed shell.
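The "regardless of the context depth" part can be illustrated with a recursive sketch (names invented for this example): every level of recursion receives a narrower slice, yet each write lands directly in the caller's original array, with no copies made along the way.

```csharp
using System;

public static class SliceDepthDemo
{
    // Reverses the characters in place. Each recursive call works on a
    // narrower slice of the same memory; nothing is copied or allocated.
    public static void Reverse(Span<char> s)
    {
        if (s.Length < 2)
            return;

        char tmp = s[0];
        s[0] = s[s.Length - 1];
        s[s.Length - 1] = tmp;

        Reverse(s.Slice(1, s.Length - 2));
    }

    public static string Run()
    {
        char[] data = "abcde".ToCharArray();
        Reverse(data);              // char[] converts implicitly to Span<char>
        return new string(data);    // "edcba"
    }
}
```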

Span<T> examples

People are built in such a way that, until they gain some experience, a real understanding of what a tool is for often does not come. So, since we need experience, let's turn to examples.

ValueStringBuilder

One of the most algorithmically interesting examples is the ValueStringBuilder type, which is buried somewhere in the depths of mscorlib and, like many other interesting data types, is for some reason marked with the internal modifier. This means that if we had not studied the mscorlib source code, we would never have learned about it.

What is the main drawback of the system StringBuilder type? Its very nature, of course: both the type itself and what it is built on (an array of characters, char[]) are reference types. This means at least two things: we still put (albeit slight) pressure on the heap, and we increase the chances of a miss in the processor caches.

Another question I had about StringBuilder concerned building small strings, that is, cases when the resulting string is guaranteed to be short, say, less than 100 characters. With formatting this short, performance raises questions:

    $"{x} is in range [{min};{max}]"

How much worse is this notation than building the string manually with StringBuilder? The answer is far from always obvious: it depends heavily on where the string is built and how often the method is called. After all, string.Format first allocates memory for an internal StringBuilder, which creates an array of characters (SourceString.Length + args.Length * 8), and if it turns out while filling the array that the length was guessed wrong, another StringBuilder is created to hold the continuation, forming a singly linked list. Finally, the built string has to be returned, which is yet another copy. Sheer waste. If only we could avoid placing that first array on the heap, it would be great: we would get rid of one problem for sure.
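As a rough illustration of the idea of keeping the intermediate buffer off the heap, here is a sketch that builds the same short message in a stackalloc buffer. The class and helper names are made up; it assumes a runtime where int.TryFormat and the string(ReadOnlySpan<char>) constructor are available (.NET Core 2.1 or later). It is not how string.Format is implemented, just a hand-written alternative for a known-short string:

```csharp
using System;
using System.Globalization;

public static class StackFormatting
{
    // Builds "{x} is in range [{min};{max}]" in a stack buffer;
    // the only heap allocation is the final string.
    public static string Describe(int x, int min, int max)
    {
        // 64 chars is ample: three Int32 values (11 chars each at most)
        // plus 16 chars of literal text.
        Span<char> buffer = stackalloc char[64];
        int pos = 0;

        Append(x, buffer, ref pos);
        Append(" is in range [", buffer, ref pos);
        Append(min, buffer, ref pos);
        Append(";", buffer, ref pos);
        Append(max, buffer, ref pos);
        Append("]", buffer, ref pos);

        return new string(buffer.Slice(0, pos));
    }

    private static void Append(int value, Span<char> buffer, ref int pos)
    {
        if (!value.TryFormat(buffer.Slice(pos), out int written, default, CultureInfo.InvariantCulture))
            throw new InvalidOperationException("Buffer too small.");
        pos += written;
    }

    private static void Append(string text, Span<char> buffer, ref int pos)
    {
        text.AsSpan().CopyTo(buffer.Slice(pos));
        pos += text.Length;
    }
}
```

Whether this actually beats the interpolated string depends on the call frequency and the runtime version, so it is worth measuring before adopting.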

Take a look at this type from the depths of mscorlib:

Class ValueStringBuilder
/src/mscorlib/shared/System/Text/ValueStringBuilder

    internal ref struct ValueStringBuilder
    {
        // this field is used when there are too many characters
        private char[] _arrayToReturnToPool;
        // this is the main field
        private Span<char> _chars;
        private int _pos;
        // the type takes a buffer from outside, delegating the choice of its size to the caller
        public ValueStringBuilder(Span<char> initialBuffer)
        {
            _arrayToReturnToPool = null;
            _chars = initialBuffer;
            _pos = 0;
        }
        public int Length
        {
            get => _pos;
            set
            {
                int delta = value - _pos;
                if (delta > 0)
                {
                    Append('\0', delta);
                }
                else
                {
                    _pos = value;
                }
            }
        }
        // Getting the string: copying characters from one array into another
        public override string ToString()
        {
            var s = new string(_chars.Slice(0, _pos));
            Clear();
            return s;
        }
        // Inserting into the middle shifts the characters of the source string
        // apart, by copying, to make room for the inserted ones
        public void Insert(int index, char value, int count)
        {
            if (_pos > _chars.Length - count)
            {
                Grow(count);
            }
            int remaining = _pos - index;
            _chars.Slice(index, remaining).CopyTo(_chars.Slice(index + count));
            _chars.Slice(index, count).Fill(value);
            _pos += count;
        }
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public void Append(char c)
        {
            int pos = _pos;
            if (pos < _chars.Length)
            {
                _chars[pos] = c;
                _pos = pos + 1;
            }
            else
            {
                GrowAndAppend(c);
            }
        }
        [MethodImpl(MethodImplOptions.NoInlining)]
        private void GrowAndAppend(char c)
        {
            Grow(1);
            Append(c);
        }
        // If the initial buffer passed to the constructor is not enough,
        // we rent an array of the required size from the pool.
        // Ideally, the algorithm would also round array sizes to discrete steps
        // so that the pool does not become fragmented
        [MethodImpl(MethodImplOptions.NoInlining)]
        private void Grow(int requiredAdditionalCapacity)
        {
            Debug.Assert(requiredAdditionalCapacity > _chars.Length - _pos);
            char[] poolArray = ArrayPool<char>.Shared.Rent(Math.Max(_pos + requiredAdditionalCapacity, _chars.Length * 2));
            _chars.CopyTo(poolArray);
            char[] toReturn = _arrayToReturnToPool;
            _chars = _arrayToReturnToPool = poolArray;
            if (toReturn != null)
            {
                ArrayPool<char>.Shared.Return(toReturn);
            }
        }
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        private void Clear()
        {
            char[] toReturn = _arrayToReturnToPool;
            this = default; // for safety, to avoid using pooled array if this instance is erroneously appended to again
            if (toReturn != null)
            {
                ArrayPool<char>.Shared.Return(toReturn);
            }
        }
        // Omitted methods: they are self-explanatory
        private void AppendSlow(string s);
        public bool TryCopyTo(Span<char> destination, out int charsWritten);
        public void Append(string s);
        public void Append(char c, int count);
        public unsafe void Append(char* value, int length);
        public Span<char> AppendSpan(int length);
    }

In its functionality this type resembles its older brother StringBuilder, while having one interesting and very important feature: it is a value type, that is, it is stored and passed entirely by value. The new ref type modifier in the type declaration tells us that the type has an additional restriction: it may live only on the stack, so putting its instances into class fields results in an error. Why all these gymnastics? To answer this question, just look at the StringBuilder class, whose essence we have just described:

Class StringBuilder
/src/mscorlib/src/System/Text/StringBuilder.cs

public sealed class StringBuilder : ISerializable
{
    // A StringBuilder is internally represented as a linked list of blocks each of which holds
    // a chunk of the string.  It turns out string as a whole can also be represented as just a chunk,
    // so that is what we do.
    internal char[] m_ChunkChars;                // The characters in this block
    internal StringBuilder m_ChunkPrevious;      // Link to the block logically before this block
    internal int m_ChunkLength;                  // The index in m_ChunkChars that represent the end of the block
    internal int m_ChunkOffset;                  // The logical offset (sum of all characters in previous blocks)
    internal int m_MaxCapacity = 0;
    // ...
    internal const int DefaultCapacity = 16;

StringBuilder is a class that holds a reference to an array of characters. So when you create one, at least two objects are actually created: the StringBuilder itself and an array of at least 16 characters (which, by the way, is why it is so important to specify the expected length of the string: otherwise it will be built as a singly linked list of 16-character arrays. A waste, you must agree). What does this mean for our discussion of ValueStringBuilder? By default it allocates nothing, because it borrows memory from outside; on top of that, it is a value type itself and pushes the user toward placing the character buffer on the stack. As a result, the whole instance ends up on the stack together with its contents, and the optimization question is settled. No heap allocation? No performance degradation. But you may ask: why not always use ValueStringBuilder (or a self-written version of it, since the original is internal and not available to us)? The answer is: look at the problem you are solving. Will the resulting string have a known size? Will it have some known maximum length? If the answer is "yes" and the size of the string stays within reasonable bounds, you can use the value-type version of StringBuilder. Otherwise, if we expect long strings, we switch to the regular version.
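Since ValueStringBuilder itself is internal, here is what a self-written version might look like in its most reduced form. TinyStringBuilder and the demo class are hypothetical names, and the sketch keeps only the essentials: a caller-supplied buffer, a fallback to ArrayPool when the buffer overflows, and ToString that returns any rented array.

```csharp
using System;
using System.Buffers;

// Hypothetical, cut-down take on the idea behind ValueStringBuilder.
public ref struct TinyStringBuilder
{
    private char[] _rented;     // non-null only after the initial buffer overflowed
    private Span<char> _chars;  // current storage: caller's buffer or rented array
    private int _pos;

    public TinyStringBuilder(Span<char> initialBuffer)
    {
        _rented = null;
        _chars = initialBuffer;
        _pos = 0;
    }

    public void Append(ReadOnlySpan<char> text)
    {
        if (text.Length > _chars.Length - _pos)
            Grow(text.Length);
        text.CopyTo(_chars.Slice(_pos));
        _pos += text.Length;
    }

    private void Grow(int extra)
    {
        char[] bigger = ArrayPool<char>.Shared.Rent(Math.Max(_pos + extra, _chars.Length * 2));
        _chars.Slice(0, _pos).CopyTo(bigger);
        char[] old = _rented;
        _chars = _rented = bigger;
        if (old != null)
            ArrayPool<char>.Shared.Return(old);
    }

    // Produces the string and hands any rented array back to the pool.
    public override string ToString()
    {
        string result = new string(_chars.Slice(0, _pos));
        char[] toReturn = _rented;
        this = default;
        if (toReturn != null)
            ArrayPool<char>.Shared.Return(toReturn);
        return result;
    }
}

public static class TinyStringBuilderDemo
{
    public static string Build(string name)
    {
        Span<char> initial = stackalloc char[32];   // short results never touch the pool
        var sb = new TinyStringBuilder(initial);
        sb.Append("Hello, ");
        sb.Append(name);
        sb.Append("!");
        return sb.ToString();
    }
}
```

For a short name everything happens in the 32-character stack buffer; a long name transparently falls back to a pooled array, which is exactly the trade-off described above.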

ValueListBuilder

The second data type I would especially like to mention is ValueListBuilder. It was created for situations where you need to build a collection of elements for a short time and immediately hand it over to some algorithm for processing.

You must agree that the task is very similar to that of ValueStringBuilder. And it is solved in a very similar way:

File ValueListBuilder.cs

Frankly, such situations are quite common. Previously, however, we solved the problem differently: we created a List, filled it with data, and then dropped the reference to it. If the method is called often enough, a sad situation arises: many List instances hang around in the heap, together with the arrays associated with them. Now this problem is solved: no additional objects are created. However, as with ValueStringBuilder, it is solved only for Microsoft programmers: the type has the internal modifier.
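Because the real type is internal, the following is only a sketch of the same idea under invented names (TinyListBuilder<T>, TinyListBuilderDemo): a short-lived list that starts in a caller-supplied buffer, typically on the stack, and rents from ArrayPool only when it overflows.

```csharp
using System;
using System.Buffers;

// Hypothetical, simplified take on the idea behind ValueListBuilder<T>.
public ref struct TinyListBuilder<T>
{
    private Span<T> _items;   // caller's buffer at first, a rented array after overflow
    private T[] _rented;
    private int _count;

    public TinyListBuilder(Span<T> initialBuffer)
    {
        _items = initialBuffer;
        _rented = null;
        _count = 0;
    }

    public void Add(T item)
    {
        if (_count == _items.Length)
            Grow();
        _items[_count++] = item;
    }

    public ReadOnlySpan<T> AsSpan() => _items.Slice(0, _count);

    // Returns the rented array, if any, to the pool.
    public void Dispose()
    {
        if (_rented != null)
        {
            ArrayPool<T>.Shared.Return(_rented);
            _rented = null;
        }
    }

    private void Grow()
    {
        T[] bigger = ArrayPool<T>.Shared.Rent(Math.Max(_items.Length * 2, 4));
        _items.CopyTo(bigger);
        T[] old = _rented;
        _items = _rented = bigger;
        if (old != null)
            ArrayPool<T>.Shared.Return(old);
    }
}

public static class TinyListBuilderDemo
{
    // Collects separator positions, consumes them right away, keeps nothing on the heap.
    public static int CountFields(ReadOnlySpan<char> line, char separator)
    {
        Span<int> initial = stackalloc int[16];
        var positions = new TinyListBuilder<int>(initial);
        try
        {
            for (int i = 0; i < line.Length; i++)
                if (line[i] == separator)
                    positions.Add(i);

            return positions.AsSpan().Length + 1;
        }
        finally
        {
            positions.Dispose();
        }
    }
}
```

With up to 16 separators the method performs no heap allocation at all; beyond that it borrows an array from the pool and gives it back in the finally block.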

Rules and practice of use

To finally grasp the essence of the new data type, you need to "play around" with it by writing a couple of methods that use it, or better yet, more. However, the basic rules can be learned right now:

If your method processes some incoming data set without changing its size, you can try to settle on the Span type. If the method does not modify the buffer either, use ReadOnlySpan;
If your method works with strings, computing some statistics or parsing a string, it must accept ReadOnlySpan<char>. Must: this is a new rule. After all, if you accept a string, you force someone to make a substring for you;
If within a method you need a fairly short array of data (say, 10 KB at most), you can easily create one with Span<TType> buf = stackalloc TType[size]. Of course, TType must be a value type (strictly speaking, an unmanaged one), since stackalloc works only with such types.

In all other cases, it is worth looking at Memory or using the classic data types.
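The second rule is perhaps the easiest to show in code. In this sketch (TextStats and its methods are illustrative names), the analysis method accepts ReadOnlySpan<char>, so the caller can hand over just a part of a string without calling Substring:

```csharp
using System;

public static class TextStats
{
    // Read-only analysis of text: callers can pass a whole string, a slice of one,
    // a char[] or a stackalloc buffer, and nothing is copied.
    public static int CountDigits(ReadOnlySpan<char> text)
    {
        int count = 0;
        foreach (char c in text)
            if (c >= '0' && c <= '9')
                count++;
        return count;
    }

    public static int Demo()
    {
        string line = "id=42;code=7";
        // Analyse only the part after the first ';' with no Substring allocation.
        int start = line.IndexOf(';') + 1;
        return CountDigits(line.AsSpan(start));   // looks at "code=7"
    }
}
```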

How Span Works

Additionally, I would like to talk about how Span works and what is so remarkable about it. And there is plenty to talk about: the data type itself comes in two versions, one for .NET Core 2.0+ and one for everything else.

File Span.Fast.cs, .NET Core 2.0

public readonly ref partial struct Span<T>
{
    /// A reference to a .NET object or a raw pointer
    internal readonly ByReference<T> _pointer;
    /// The length of the data buffer behind the pointer
    private readonly int _length;
    // ...
}

File ??? [decompiled]

public ref readonly struct Span<T>
{
    private readonly System.Pinnable<T> _pinnable;
    private readonly IntPtr _byteOffset;
    private readonly int _length;
    // ...
}

The thing is that the big .NET Framework and .NET Core 1.* do not have a specially modified garbage collector (unlike .NET Core 2.0+), and so they have to carry along an additional pointer: one to the beginning of the buffer being worked with. In other words, internally Span works with managed .NET objects as if they were unmanaged. Take a look at the insides of the second version of the structure: it has three fields. The first is a reference to the managed object. The second is the offset in bytes from the beginning of that object to the beginning of the data buffer (in strings this is the buffer of char characters, in arrays the buffer with the array's data). And the third field is the number of elements laid out one after another in that buffer.

As an example, take how Span works with strings:

File coreclr::src/System.Private.CoreLib/shared/System/MemoryExtensions.Fast.cs

public static ReadOnlySpan<char> AsSpan(this string text)
{
    if (text == null)
        return default;
    return new ReadOnlySpan<char>(ref text.GetRawStringData(), text.Length);
}

where string.GetRawStringData() looks as follows:

File with the field definitions: coreclr::src/System.Private.CoreLib/src/System/String.CoreCLR.cs

File with the GetRawStringData definition: coreclr::src/System.Private.CoreLib/shared/System/String.cs

public sealed partial class String :
    IComparable, IEnumerable, IConvertible, IEnumerable<char>,
    IComparable<string>, IEquatable<string>, ICloneable
{
    //
    // These fields map directly onto the fields in an EE StringObject.  See object.h for the layout.
    //
    [NonSerialized] private int _stringLength;
    // For empty strings, this will be '\0' since
    // strings are both null-terminated and length prefixed
    [NonSerialized] private char _firstChar;
    internal ref char GetRawStringData() => ref _firstChar;
}

That is, the method reaches directly into the string, and the ref char specification lets the GC track this unmanaged reference into the string and move it together with the string when the GC runs.

The same story happens with arrays: when a Span is created, some code inside the JIT computes the offset of the beginning of the array data and initializes the Span with it. As for how to compute offsets for strings and arrays, we learned that in the chapter on the structure of objects in memory.
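The "reference plus length" model can be observed from user code. The sketch below (the class name is invented) uses MemoryMarshal.GetReference and the Unsafe helper class, both available out of the box on .NET Core 3.0 and later, to check that a slice starts at exactly the address of the corresponding array element:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

public static class SpanLayoutDemo
{
    public static bool SliceIsJustAnOffset()
    {
        int[] array = { 10, 20, 30, 40, 50 };
        Span<int> whole = array;
        Span<int> tail = whole.Slice(2);                  // starts at array[2]

        ref int first = ref MemoryMarshal.GetReference(whole);
        ref int third = ref Unsafe.Add(ref first, 2);     // first + 2 elements

        // The slice's starting reference is the same address as array[2]:
        // slicing only moves the reference and shortens the length.
        return Unsafe.AreSame(ref third, ref MemoryMarshal.GetReference(tail))
            && Unsafe.AreSame(ref third, ref array[2]);
    }
}
```

No unsafe keyword is needed here: these are managed references, which the GC keeps up to date if the array moves.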

Span<T> as return value

Despite all the idyll surrounding Span, there are logical but unexpected restrictions on returning it from a method. Look at the following code:

unsafe void Main()
{
    var x = GetSpan();
}
public Span<byte> GetSpan()
{
    Span<byte> reff = new byte[100];
    return reff;
}

Everything here looks logical and fine. However, replace one instruction with another:

unsafe void Main()
{
    var x = GetSpan();
}
public Span<byte> GetSpan()
{
    Span<byte> reff = stackalloc byte[100];
    return reff;
}

and the compiler will forbid such an instruction. But before I explain why, I ask you to guess for yourself what problems this construct would cause.

So, I hope you have thought it over, made your guesses and assumptions, and maybe even worked out the reason. If so, I did not describe the thread stack in such fine detail for nothing. After all, by handing out a reference to the local variables of a method that has already finished, you make it possible to call another method, wait for it to finish, and then read the values of its local variables through x[0..99].

Fortunately, when we try to write this kind of code, the compiler slaps our wrists with the error CS8352 Cannot use local 'reff' in this context because it may expose referenced variables outside of their declaration scope, and rightly so: if you got around this error, you could, for example, arrange things from inside a plugin so as to steal other people's passwords or elevate the privileges of the plugin.
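One common way to live with this restriction is to turn the ownership around: the caller allocates the buffer (on its own stack frame if it likes), and the method only fills it and returns a slice of what it was given. A minimal sketch, with invented names:

```csharp
using System;

public static class CallerOwnedBuffer
{
    // The caller owns the memory; the method fills it and returns a slice of it.
    // Returning a slice of a parameter is allowed, because the buffer outlives the call.
    public static Span<byte> FillSequence(Span<byte> destination, int count)
    {
        if (count > destination.Length)
            throw new ArgumentException("Destination is too small.", nameof(destination));

        for (int i = 0; i < count; i++)
            destination[i] = (byte)i;

        return destination.Slice(0, count);
    }

    public static int Demo()
    {
        Span<byte> buffer = stackalloc byte[100];      // lives in Demo's stack frame
        Span<byte> filled = FillSequence(buffer, 4);   // 0, 1, 2, 3
        return filled[3] + filled.Length;              // 3 + 4
    }
}
```

The stack memory never leaves the frame that allocated it, so the compiler has nothing to object to.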

If you have questions

If you have questions about Span<T>, let's discuss them. These data types are very fresh and hardly anyone uses them yet, so collecting real use cases is very, very difficult.

Link to the whole book

CLR Book: GitHub
Release 0.5.0 of the book, PDF: GitHub Release
