
Writing a vector in Dlang
Good day, Habr!
In this post I want to look at some features of the D language, using the implementation of an algebraic vector struct as the example. The post does not cover linear algebra or any other mathematics.
It is worth recalling that in D, unlike C++, classes and structs serve different purposes and are laid out differently. Structs cannot be inherited from, and they carry no data besides their fields (classes, for example, also hold a pointer to a virtual function table). Structs are value types, while classes are always reference types. Structs are a good fit for simple data types.
So, suppose we want a vector that we can use comfortably in calculations and pass to OpenGL, and that is still easy to work with.
Let's start simple:
struct Vector(size_t N,T)
{
    T[N] data;
    this( in T[N] vals... ) { data = vals; }
}
Everything is clear here: the size and element type of the vector are set by the template parameters.
Now the constructor. The ellipsis after vals makes it a typesafe variadic parameter, so the constructor can be called with a plain list of values, without array brackets:
auto a = Vector!(3,float)(1,2,3);
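As a quick check of how the typesafe variadic parameter behaves, here is a minimal self-contained sketch. It repeats the struct from above; the unittest block and the variable arr are only illustrative and are not part of the original code.

```d
struct Vector(size_t N, T)
{
    T[N] data;

    // typesafe variadic parameter: the listed arguments are collected into a T[N]
    this(in T[N] vals...) { data = vals; }
}

unittest
{
    // values listed directly, without array brackets
    auto a = Vector!(3, float)(1, 2, 3);
    assert(a.data == [1f, 2f, 3f]);

    // an existing static array of the right size is accepted as well
    float[3] arr = [4, 5, 6];
    auto b = Vector!(3, float)(arr);
    assert(b.data == [4f, 5f, 6f]);
}
```

The number of listed values has to match N, since they are gathered into a static array.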
Writing out the full type every time we create a variable is inconvenient, so let's make an alias:
alias Vector3f = Vector!(3,float);
auto a = Vector3f(1,2,3);
If this approach does not seem flexible enough, D also allows aliases with template parameters:
alias Vector3(T) = Vector!(3,T);
auto a = Vector3!float(1,2,3);
auto b = Vector3!int(1,2,3);
But if we pass 0 as the size, we get a zero-length vector, which is hardly useful. Let's add a constraint:
struct Vector(size_t N,T) if( N > 0 ) { ... }
Now, if we try to instantiate a zero-length vector:
Vector!(0,float) a;
we get an error:
vector.d(10): Error: template instance vector.Vector!(0, float) does not match template declaration Vector(ulong N, T) if (N > 0)
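The same constraint can also be checked without triggering a build error. The sketch below is only an illustration and uses a reduced struct with no constructor.

```d
struct Vector(size_t N, T) if (N > 0)
{
    T[N] data;
}

// a non-zero size satisfies the constraint
static assert(is(Vector!(3, float)));

// a zero size fails the constraint, so the instantiation does not compile
static assert(!__traits(compiles, Vector!(0, float)));
```

Checks like these are handy in unit tests that must guarantee a misuse is rejected at compile time.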
Add some math:
struct Vector(size_t N,T) if( N > 0 )
{
    ...
    auto opBinary(string op)( in Vector!(N,T) b ) const
    {
        Vector!(N,T) ret;
        foreach( i; 0 .. N )
            mixin( "ret.data[i] = data[i] " ~ op ~ " b.data[i];" );
        return ret;
    }
}
Now we can use our vector like this:
auto a = Vector3!float(1,2,3);
auto b = Vector3!float(2,3,4);
auto c = Vector3!float(5,6,7);
c = a + b / c * a;
D preserves the usual operator precedence here (multiplication and division first, then addition).
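To see the precedence rule on concrete numbers, here is a small self-contained sketch. It uses int elements so that the results can be compared exactly; the values are arbitrary.

```d
struct Vector(size_t N, T) if (N > 0)
{
    T[N] data;
    this(in T[N] vals...) { data = vals; }

    // op is "+", "-", "*" and so on; the mixin pastes it into an element-wise statement
    auto opBinary(string op)(in Vector!(N, T) b) const
    {
        Vector!(N, T) ret;
        foreach (i; 0 .. N)
            mixin("ret.data[i] = data[i] " ~ op ~ " b.data[i];");
        return ret;
    }
}

unittest
{
    auto a = Vector!(3, int)(1, 2, 3);
    auto b = Vector!(3, int)(10, 20, 30);
    auto c = Vector!(3, int)(2, 2, 2);

    // parsed as a + (b * c), exactly as for built-in numbers
    assert((a + b * c).data == [21, 42, 63]);

    // explicit parentheses change the result
    assert(((a + b) * c).data == [22, 44, 66]);
}
```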
But if we try to combine vectors with different element types, we find that these vector types are incompatible. Let's work around that:
...
auto opBinary(string op,E)( in Vector!(N,E) b ) const
if( is( typeof( mixin( "T.init" ~ op ~ "E.init" ) ) : T ) )
{ ...}
...
Without changing the function body, we have added support for every suitable element type, including user-defined ones. The only requirements are that the binary operation op is defined for T and E and that its result converts implicitly to T. Note that a float vector cannot be added to an int vector on the left-hand side: int + float yields float, and float converts to int only through an explicit cast.
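The asymmetry between float + int and int + float can be verified directly. The following is a minimal sketch under the same constraint; the variable names vf and vi are illustrative.

```d
struct Vector(size_t N, T) if (N > 0)
{
    T[N] data;
    this(in T[N] vals...) { data = vals; }

    // accepted only when (T op E) converts implicitly back to T
    auto opBinary(string op, E)(in Vector!(N, E) b) const
        if (is(typeof(mixin("T.init" ~ op ~ "E.init")) : T))
    {
        Vector!(N, T) ret;
        foreach (i; 0 .. N)
            mixin("ret.data[i] = data[i] " ~ op ~ " b.data[i];");
        return ret;
    }
}

unittest
{
    auto vf = Vector!(2, float)(1.5f, 2.5f);
    auto vi = Vector!(2, int)(1, 2);

    // float + int yields float, which fits the left-hand element type
    assert((vf + vi).data == [2.5f, 4.5f]);

    // int + float yields float, which does not convert implicitly to int
    static assert(!__traits(compiles, vi + vf));
}
```

The element type of the result always follows the left operand, which is why the order of the operands matters here.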
Element-wise operations with scalars are implemented in the same way:
auto opBinary(string op,E)( in E b ) const
if( is( typeof( mixin( "T.init" ~ op ~ "E.init" ) ) : T ) )
{ ...}
If desired, the set of allowed operations can be narrowed in the template constraint (the if before the function body) by checking op against the operations you want to support.
Suppose we want our vector to be accepted by functions that take static arrays:
void foo(size_t N)( in float[N] arr ) { ... }
For this we can use an interesting D construct, alias this:
struct Vector(size_t N,T) if (N > 0)
{
    T[N] data;
    alias data this;
    ...
}
Now, wherever the compiler expects a static array and is given a vector, the data field is passed instead. As a side effect, writeln now also receives data and no longer prints the full type name. There is also no need to define opIndex:
auto a = Vector3!float(1,2,3);
a[2] = 10;
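Here is a compact sketch of what alias this gives us. The function sum3 is a hypothetical consumer that takes a fixed-size array; it is deliberately not a template, to keep the example simple.

```d
struct Vector(size_t N, T) if (N > 0)
{
    T[N] data;
    alias data this; // the vector converts implicitly to its T[N] field

    this(in T[N] vals...) { data = vals; }
}

// hypothetical consumer that knows nothing about Vector
float sum3(in float[3] arr)
{
    float s = 0;
    foreach (v; arr) s += v;
    return s;
}

unittest
{
    auto a = Vector!(3, float)(1, 2, 3);

    a[2] = 10;              // indexing is forwarded to data
    assert(a.length == 3);  // so are the array properties
    assert(sum3(a) == 13);  // and the implicit conversion at call sites
}
```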
Let's add some variety. Right now nothing stops us from instantiating a vector of, say, strings:
auto a = Vector2!string("hell", "habr");
auto b = Vector2!string("o", "ahabr");
writeln( a ~ b ); // ["hello", "habrahabr"]
but some vector operations, such as computing the length or the unit vector, make no sense for such element types. That is not a problem in D. Let's add the length and unit vector methods like this:
import std.algorithm;
import std.math;
struct Vector(size_t N,T) if (N > 0)
{
    ...
    static if( is( typeof( T.init * T.init ) == T ) )
    {
        const @property
        {
            // seed the fold with zero: without a seed, reduce starts from data[0] unsquared
            auto len2() { return reduce!((r,v)=>r+v*v)( cast(T)0, data[] ); }
            static if( is( typeof( sqrt(T.init) ) ) )
            {
                auto len() { return sqrt( len2 ); }
                auto e() { return this / len; }
            }
        }
    }
}
Now len2 (the squared length) is declared for almost all numeric types, while len and e exist only for float, double and real. But if we really want to, we can provide them for every numeric type:
...
import std.traits;
struct Vector(size_t N,T) if (N > 0)
{
    this(E)( in Vector!(N,E) b ) // allows constructing a vector from other compatible vectors
        if( is( typeof( cast(T)(E.init) ) ) )
    {
        foreach( i; 0 .. N )
            data[i] = cast(T)(b[i]);
    }
    ...
    static if( isNumeric!T )
    {
        auto len(E=CommonType!(T,float))() { return sqrt( cast(E)len2 ); }
        auto e(E=CommonType!(T,float))() { return Vector!(N,E)(this) / len!E; }
    }
    ...
}
Now len and e take a template parameter, which defaults to the common type of T and float, that is, the wider of the two:
CommonType!(int,float) a; // float a;
CommonType!(double,float) b; // double b;
If needed, we can specify it explicitly, for example when we want the length of an int vector in double precision.
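The pieces above can be put together in one reduced sketch. It makes two simplifications that are assumptions of this example rather than the article's code: the squared length is computed with a plain loop, and the unit vector method is left out.

```d
import std.math : sqrt;
import std.traits : CommonType, isNumeric;

struct Vector(size_t N, T) if (N > 0)
{
    T[N] data;
    this(in T[N] vals...) { data = vals; }

    // these members exist only for numeric element types
    static if (isNumeric!T)
    {
        T len2() const
        {
            T sum = 0;
            foreach (v; data) sum += v * v;
            return sum;
        }

        // E defaults to the common type of T and float
        E len(E = CommonType!(T, float))() const
        {
            return sqrt(cast(E) len2());
        }
    }
}

static assert(is(CommonType!(int, float) == float));
static assert(is(CommonType!(double, float) == double));

unittest
{
    auto v = Vector!(2, int)(3, 4);
    assert(v.len2() == 25);
    static assert(is(typeof(v.len()) == float));          // default precision
    static assert(is(typeof(v.len!double()) == double));  // explicit precision
    assert(v.len!double() == 5.0);

    // a string vector simply has no length members
    static assert(!__traits(hasMember, Vector!(2, string), "len2"));
}
```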
A little more about the constructor. We can write one that builds a vector from a mix of arguments, for example like this:
auto a = Vector3f(1,2,3);
auto b = Vector2f(1,2);
auto c = Vector!(8,float)( 0, a, 4, b, 3 );
The implementation looks simple:
struct Vector(size_t N,T) if (N > 0)
{
    ...
    this(E...)( in E vals )
    {
        size_t i = 0;
        foreach( v; vals ) i += fillData( data, i, v );
    }
    ...
}
Such a constructor accepts any number of arguments of various types.
Let's define the fillData function:
size_t fillData(size_t N,T,E)( ref T[N] data, size_t no, E val )
{
    static if( isNumeric!E )
    {
        data[no] = cast(T)val;
        return 1;
    }
    else static if( isStaticArray!E &&
                    isNumeric!(typeof(E.init[0])) )
    {
        foreach( i, v; val )
            data[no+i] = cast(T)v; // cast explicitly, as in the other branches
        return val.length;
    }
    else static if( isVector!E )
    {
        foreach( i, v; val.data )
            data[no+i] = cast(T)v;
        return val.data.length;
    }
    else static assert(0,"incompatible type");
}
It handles only three kinds of argument: a number, a static array and a vector. A more flexible version takes much more space and contains little of interest. Consider the isVector template, which determines whether the type E is a vector. As before, this is done by checking whether an expression has a type, this time a function call.
template isVector(E)
{
    enum isVector = is( typeof( impl(E.init) ) );
    void impl(size_t N,T)( Vector!(N,T) x );
}
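For reference, here is a condensed and self-contained variant of the variadic constructor together with its helpers. Several things in it are assumptions of this sketch and not the article's code: the helper is called fill, isVector is built on isInstanceOf from std.traits instead of the impl trick, the parameters are taken by value, and the constructor asserts that exactly N components were supplied.

```d
import std.traits : isInstanceOf, isNumeric, isStaticArray;

struct Vector(size_t N, T) if (N > 0)
{
    T[N] data;

    // accepts any mix of numbers, static arrays and vectors
    this(Args...)(Args vals)
    {
        size_t pos = 0;
        foreach (v; vals)  // unrolled at compile time, one step per argument
            pos += fill(data, pos, v);
        assert(pos == N, "wrong number of components");
    }
}

// true for any instance of the Vector template
enum isVector(E) = isInstanceOf!(Vector, E);

// writes val into data starting at pos and returns how many slots it used
size_t fill(size_t N, T, E)(ref T[N] data, size_t pos, E val)
{
    static if (isNumeric!E)
    {
        data[pos] = cast(T) val;
        return 1;
    }
    else static if (isStaticArray!E)
    {
        foreach (i, v; val) data[pos + i] = cast(T) v;
        return val.length;
    }
    else static if (isVector!E)
    {
        foreach (i, v; val.data) data[pos + i] = cast(T) v;
        return val.data.length;
    }
    else static assert(0, "incompatible type " ~ E.stringof);
}

unittest
{
    auto a = Vector!(3, float)(1, 2, 3);
    auto b = Vector!(2, float)(1, 2);
    auto c = Vector!(8, float)(0, a, 4, b, 3);
    assert(c.data == [0f, 1, 2, 3, 4, 1, 2, 3]);

    static assert(isVector!(Vector!(3, float)));
    static assert(!isVector!(float[3]));
}
```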
A vector would not be complete if we could not access its fields like this: a.x + b.y
We could simply create several properties with the corresponding names:
...
auto x() const @property { return data[0]; }
...
but that is not for us. Let's try to implement a more flexible way of access:
- with the ability to create vectors with different sets of fields (xyz, rgb, uv)
- with access to several fields at once, not just one (a.xy = vec2(1,2))
- with several access variants available on a single vector type
We will use the opDispatch magic method for this. The idea is that if a member of a class (or a struct, in our case) is not found, the identifier after the dot is passed to this method as a string template parameter:
class A
{
    void opDispatch(string str)( int x ) { writeln( str, ": ", x ); }
}
auto a = new A;
a.hello( 4 ); // hello: 4
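The same mechanism in a form that is easy to test: the sketch below returns the string instead of printing it. The struct Echo and its member known are hypothetical names used only for illustration.

```d
import std.conv : to;

struct Echo
{
    // called for any member name that Echo does not define itself
    string opDispatch(string name)(int x)
    {
        return name ~ ": " ~ x.to!string;
    }

    // a regular member always wins over opDispatch
    string known() { return "regular method"; }
}

unittest
{
    Echo e;
    assert(e.hello(4) == "hello: 4");
    assert(e.world(7) == "world: 7");
    assert(e.known() == "regular method");
}
```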
Let's add an access string as a parameter of our vector type and put a few restrictions on it:
enum string SEP1=" ";
enum string SEP2="|";
struct Vector(size_t N,T,alias string AS)
    if ( N > 0 && ( AS.length == 0 || isCompatibleArrayAccessStrings(N,AS,SEP1,SEP2) ) )
{
    ...
}
The isCompatibleArrayAccessStrings function checks that a field access string is valid. Let's define the rules:
- field names must be valid D identifiers;
- the number of names in each variant must match the vector dimension N;
- names are separated by a space (SEP1);
- variants are separated by a vertical bar (SEP2).
There is nothing special about this function, but its text is worth giving for completeness.
Text of isCompatibleArrayAccessStrings and the other helper functions:
import std.algorithm : all, canFind;
import std.array : split;
import std.string : strip;

/// compatible for creating access dispatches
pure bool isCompatibleArrayAccessStrings( size_t N, string str, string sep1="", string sep2="|" )
in { assert( sep1 != sep2 ); } do
{
    auto strs = str.split(sep2);
    foreach( s; strs )
        if( !isCompatibleArrayAccessString(N,s,sep1) )
            return false;
    string[] fa;
    foreach( s; strs )
        fa ~= s.split(sep1);
    foreach( ref v; fa ) v = strip(v);
    // a field name must not repeat across variants
    foreach( i, a; fa )
        foreach( j, b; fa )
            if( i != j && a == b ) return false;
    return true;
}
/// compatible for creating access dispatches
pure bool isCompatibleArrayAccessString( size_t N, string str, string sep="" )
{ return N == getAccessFieldsCount(str,sep) && isArrayAccessString(str,sep); }
///
pure bool isArrayAccessString( in string as, in string sep="", bool allowDot=false )
{
    if( as.length == 0 ) return false;
    auto splt = as.split(sep);
    foreach( i, val; splt )
        if( !isValueAccessString(val,allowDot) || canFind(splt[0..i],val) )
            return false;
    return true;
}
///
pure size_t getAccessFieldsCount( string str, string sep )
{ return str.split(sep).length; }
///
pure ptrdiff_t getIndex( string as, string arg, string sep1="", string sep2="|" )
in { assert( sep1 != sep2 ); } do
{
    foreach( str; as.split(sep2) )
        foreach( i, v; str.split(sep1) )
            if( arg == v ) return i;
    return -1;
}
///
pure bool oneOfAccess( string str, string arg, string sep="" )
{
    auto splt = str.split(sep);
    return canFind(splt,arg);
}
///
pure bool oneOfAccessAll( string str, string arg, string sep="" )
{
    auto splt = arg.split("");
    return all!(a=>oneOfAccess(str,a,sep))(splt);
}
///
pure bool oneOfAnyAccessAll( string str, string arg, string sep1="", string sep2="|" )
in { assert( sep1 != sep2 ); } do
{
    foreach( s; str.split(sep2) )
        if( oneOfAccessAll(s,arg,sep1) ) return true;
    return false;
}
/// check symbol count for access to field
pure bool isOneSymbolPerFieldForAnyAccessString( string str, string sep1="", string sep2="|" )
in { assert( sep1 != sep2 ); } do
{
    foreach( s; str.split(sep2) )
        if( isOneSymbolPerFieldAccessString(s,sep1) ) return true;
    return false;
}
/// check symbol count for access to field
pure bool isOneSymbolPerFieldAccessString( string str, string sep="" )
{
    foreach( s; str.split(sep) )
        if( s.length > 1 ) return false;
    return true;
}
pure
{
    bool isValueAccessString( in string as, bool allowDot=false )
    {
        return as.length > 0 &&
            startsWithAllowedChars(as) &&
            (allowDot?(all!(a=>isValueAccessString(a))(as.split("."))):allowedCharsOnly(as));
    }
    bool startsWithAllowedChars( in string as )
    {
        switch(as[0])
        {
            case 'a': .. case 'z': goto case;
            case 'A': .. case 'Z': goto case;
            case '_': return true;
            default: return false;
        }
    }
    bool allowedCharsOnly( in string as )
    {
        foreach( c; as ) if( !allowedChar(c) ) return false;
        return true;
    }
    bool allowedChar( in char c )
    {
        switch(c)
        {
            case 'a': .. case 'z': goto case;
            case 'A': .. case 'Z': goto case;
            case '0': .. case '9': goto case;
            case '_': return true;
            default: return false;
        }
    }
}
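All of these helpers are ordinary functions that the compiler evaluates at compile time when they appear in template constraints. A reduced sketch of the index lookup shows the idea. The name fieldIndex and the hard-coded separators are assumptions of this example; the helper getIndex above takes the separators as arguments.

```d
import std.array : split;

// simplified lookup: position of name within any variant of access, or -1
ptrdiff_t fieldIndex(string access, string name)
{
    foreach (variant; access.split("|"))
        foreach (i, field; variant.split(" "))
            if (field == name) return i;
    return -1;
}

// static assert forces evaluation at compile time (CTFE)
static assert(fieldIndex("x y z|r g b", "x") == 0);
static assert(fieldIndex("x y z|r g b", "g") == 1);
static assert(fieldIndex("x y z|r g b", "w") == -1);
```

The same function can still be called at run time, which makes such helpers easy to test in isolation.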
Now declare the methods:
struct Vector( size_t N, T, alias string AS="" )
    if( N > 0 && ( isCompatibleArrayAccessStrings(N,AS,SEP1,SEP2) || AS.length == 0 ) )
{
    ...
    static if( AS.length > 0 ) // only if an access string is given
    {
        @property
        {
            // values can be both read and assigned: a.x = b.y;
            ref T opDispatch(string v)()
                if( getIndex(AS,v,SEP1,SEP2) != -1 )
            { mixin( format( "return data[%d];", getIndex(AS,v,SEP1,SEP2) ) ); }
            // const method
            T opDispatch(string v)() const
                if( getIndex(AS,v,SEP1,SEP2) != -1 )
            { mixin( format( "return data[%d];", getIndex(AS,v,SEP1,SEP2) ) ); }
            // only if some access variant names every field with a single letter
            static if( isOneSymbolPerFieldForAnyAccessString(AS,SEP1,SEP2) )
            {
                // auto a = b.xy; // typeof(a) == Vector!(2,int,"x y");
                // auto a = b.xx; // typeof(a) == Vector!(2,int,"");
                auto opDispatch(string v)() const
                    if( v.length > 1 && oneOfAnyAccessAll(AS,v,SEP1,SEP2) )
                {
                    mixin( format( `return Vector!(v.length,T,"%s")(%s);`,
                        isCompatibleArrayAccessString(v.length,v)?v.split("").join(SEP1):"",
                        array( map!(a=>format( `data[%d]`,getIndex(AS,a,SEP1,SEP2)))(v.split("")) ).join(",")
                    ));
                }
                // a.xy = b.zw;
                auto opDispatch( string v, U )( in U b )
                    if( v.length > 1 && oneOfAnyAccessAll(AS,v,SEP1,SEP2) && isCompatibleArrayAccessString(v.length,v) &&
                        ( isCompatibleVector!(v.length,T,U) || ( isDynamicVector!U && is(typeof(T(U.datatype.init))) ) ) )
                {
                    foreach( i; 0 .. v.length )
                        data[getIndex(AS,""~v[i],SEP1,SEP2)] = T( b[i] );
                    return opDispatch!v;
                }
            }
        }
    }
}
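To show the core of this mechanism in isolation, here is a heavily reduced sketch that supports only single-field access. Its simplifications are assumptions of this example: the lookup function fieldIndex with hard-coded separators stands in for getIndex, the access string is a plain string parameter, and there are no const or multi-field overloads.

```d
import std.array : split;

// simplified stand-in for getIndex: position of name in any variant, or -1
ptrdiff_t fieldIndex(string access, string name)
{
    foreach (variant; access.split("|"))
        foreach (i, field; variant.split(" "))
            if (field == name) return i;
    return -1;
}

struct Vector(size_t N, T, string AS) if (N > 0)
{
    T[N] data;
    this(in T[N] vals...) { data = vals; }

    // v.x, v.g and so on resolve to data[index]; unknown names fail the constraint
    ref T opDispatch(string name)() @property
        if (fieldIndex(AS, name) != -1)
    {
        enum size_t idx = fieldIndex(AS, name); // computed at compile time
        return data[idx];
    }
}

unittest
{
    auto v = Vector!(3, float, "x y z|r g b")(1, 2, 3);

    v.x = 10;                // assignment through the returned reference
    assert(v.r == 10);       // r is another name for the same component
    assert(v.y + v.b == 5);

    // a name that is absent from the access string is a compile-time error
    static assert(!__traits(compiles, v.w));
}
```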
The full text of the vector can be found on github or in the descore package on dub (the dub version is currently not the latest and lacks the field access variants, but that will change soon).