NeoCode January 5, 2012 at 01:19

Rust Programming Language Overview

Rust is a new experimental programming language developed by Mozilla. It is a compiled, multi-paradigm language positioned as an alternative to C/C++, which is interesting in itself, since there are not many contenders for that role. D by Walter Bright and Go from Google come to mind.
Rust supports functional, parallel, procedural, and object-oriented programming, i.e. almost the entire spectrum of paradigms actually used in applied programming.

I am not setting out to translate the documentation (besides, it is very sparse and constantly changing, because there has not yet been an official release of the language); instead, I want to highlight the most interesting features of the language. The information was collected both from the official documentation and from the extremely few mentions of the language on the Internet.

First impression

The syntax of the language follows the traditional C-like style (which is welcome, since this is already a de facto standard). Naturally, the well-known C/C++ design errors have been taken into account.
The traditional Hello World looks like this:

use std;
fn main(args: [str]) {
    std::io::println("hello world from " + args[0] + "!");
}

A slightly more complicated example, a function that computes the factorial:

fn fac(n: int) -> int {
    let result = 1, i = 1;
    while i <= n {
        result *= i;
        i += 1;
    }
    ret result;
}

As you can see from the example, functions are declared in the “functional” style (this style has some advantages over the traditional “int fac(int n)”). We see automatic type inference (the let keyword) and the absence of parentheses around the while condition (similar to Go). The compactness of the keywords is immediately apparent. The creators of Rust deliberately made all the keywords as short as possible, and to be honest, I like it.

Small but interesting syntax features

Underscores can be inserted into numeric constants. A handy thing; this feature is now being added to many new languages.
0xffff_ffff_ffff_ffff_ffff_ffff
Binary constants. Of course, a real programmer should convert binary to hex in his head, but this is more convenient!
0b1111_1111_1001_0000
The body of any statement (even one consisting of a single expression) must be enclosed in curly braces. For example, in C you could write if(x>0) foo();, while in Rust you must put curly braces around foo().
The conditions of if, while, and similar statements, on the other hand, do not need to be enclosed in parentheses.
In many cases, blocks of code can be treated as expressions. In particular, this is possible:
```
let x = if the_stars_align() { 4 }
        else if something_else() { 3 }
        else { 0 };
```
Function declaration syntax: first the keyword fn, then the function name and the list of arguments, with the type of each argument given after its name; then, if the function returns a value, the arrow "->" and the type of the return value.
Variables are declared in the same way: the let keyword, the name of the variable, optionally a type after a colon, and then the initial value.
let count: int = 5;
By default, all variables are immutable; the mutable keyword is used to declare mutable variables.
The names of the basic types are the most compact I have ever encountered: i8, i16, i32, i64, u8, u16, u32, u64, f32, f64.
As mentioned above, automatic type inference is supported.
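
For comparison, most of these small features can be sketched in present-day Rust, whose syntax has changed since this overview was written. This is only an illustration under that assumption: `pick`, `the_stars_align` and `something_else` are placeholder names, and `mut` and `i32` stand in for the `mutable` and `int` described above.

```rust
// Placeholder predicates, named after the ones in the example above.
fn the_stars_align() -> bool { false }
fn something_else() -> bool { true }

fn pick() -> i32 {
    // `if` is an expression, so its value can be bound directly.
    let x = if the_stars_align() { 4 }
            else if something_else() { 3 }
            else { 0 };
    x
}

fn main() {
    let big: u128 = 0xffff_ffff_ffff_ffff_ffff_ffff; // underscores in a hex literal
    let bits: u16 = 0b1111_1111_1001_0000;           // binary literal
    let count: i32 = 5;                              // explicit type after the colon
    let mut total = 0;                               // bindings are immutable unless marked
    total += count;
    println!("{:#x} {:#b} {} {}", big, bits, total, pick());
}
```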

The language has built-in debugging tools:
The fail keyword terminates the current process
The log keyword outputs any language expression to the log (for example, in stderr)
The assert keyword checks the expression, and if it is false, the current process terminates
The note keyword allows you to display additional information in case of abnormal termination of the process.

Data types

Rust, like Go, supports structural typing (although, according to the authors, the languages were developed independently, so this is the influence of their common predecessors: Alef, Limbo, etc.). What is structural typing? Suppose you have a structure (or, in Rust's terminology, a “record”) declared in some file:
type point = {x: float, y: float};
You can declare a bunch of variables and functions with argument types of point. Then, somewhere else, you can declare some other structure, for example
type MySuperPoint = {x: float, y: float};
and variables of this type will be fully compatible with variables of type point.

In contrast, the nominative typing adopted in C, C++, C# and Java does not allow such constructions. With nominative typing, each structure is a unique type, which by default is incompatible with other types.
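
The difference between the two kinds of typing can be sketched in present-day Rust, where (as an assumption worth stating) structs are nominal while tuples are still compared by structure. The type names below are illustrative; `Point` and `MySuperPoint` are modelled as tuple aliases rather than records.

```rust
// Tuples are structural: both aliases name one and the same type.
type Point = (f64, f64);
type MySuperPoint = (f64, f64);

// Structs are nominal: identical fields, yet two incompatible types.
struct NamedPoint { x: f64, y: f64 }
#[allow(dead_code)]
struct OtherPoint { x: f64, y: f64 }

fn length(p: Point) -> f64 {
    (p.0 * p.0 + p.1 * p.1).sqrt()
}

fn named_length(p: &NamedPoint) -> f64 {
    (p.x * p.x + p.y * p.y).sqrt()
}

fn main() {
    let a: MySuperPoint = (3.0, 4.0);
    // Accepted: a MySuperPoint is structurally the same as a Point.
    println!("{}", length(a));
    let b = NamedPoint { x: 3.0, y: 4.0 };
    println!("{}", named_length(&b));
    // Passing an OtherPoint to named_length would be rejected by the compiler.
}
```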

Structures in Rust are called “records”. There are also tuples, which are the same as records but with unnamed fields. Tuple elements, unlike record fields, cannot be mutable.

There are vectors, in some ways similar to regular arrays and in others to std::vector from the STL. List initializers use square brackets rather than curly braces as in C/C++.

let myvec = [1, 2, 3, 4];

A vector is nevertheless a dynamic data structure; in particular, vectors support concatenation.

let v: mutable [int] = [1, 2, 3];
v += [4, 5, 6];

There are templates (generics). Their syntax is quite logical, without the piles of "template" keywords from C++. Both function templates and data type templates are supported.

fn for_rev<T>(v: [T], act: block(T)) {
    let i = std::vec::len(v);
    while i > 0u {
        i -= 1u;
        act(v[i]);
    }
}
type circular_buf<T> = {start: uint,
                        end: uint,
                        buf: [mutable T]};
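
As a rough sketch, the same two generic definitions could look like this in present-day Rust. The assumptions are mine: a slice and a closure parameter replace the `[T]` vector and `block(T)`, `Vec<T>` replaces `[mutable T]`, and `collect_rev` is a hypothetical helper added only to show a call.

```rust
// Call `act` on each element of `v`, from the last to the first.
fn for_rev<T>(v: &[T], mut act: impl FnMut(&T)) {
    let mut i = v.len();
    while i > 0 {
        i -= 1;
        act(&v[i]);
    }
}

// A generic record type mirroring circular_buf.
#[allow(dead_code)]
struct CircularBuf<T> {
    start: usize,
    end: usize,
    buf: Vec<T>,
}

// Hypothetical helper: gathers the elements in reverse order.
fn collect_rev(v: &[i32]) -> Vec<i32> {
    let mut out = Vec::new();
    for_rev(v, |x| out.push(*x));
    out
}

fn main() {
    println!("{:?}", collect_rev(&[1, 2, 3]));
    let cb = CircularBuf { start: 0, end: 0, buf: vec![0u8; 4] };
    println!("{}", cb.buf.len());
}
```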

The language supports so-called tags. A tag is nothing more than a C union with an additional field holding the code of the variant in use (that is, something between a union and an enumeration). Or, from a theoretical point of view, an algebraic data type.

tag shape {
    circle(point, float);
    rectangle(point, point);
}

In the simplest case, a tag is identical to an enumeration:

tag animal {
       dog;
       cat;
     }
let a: animal = dog;
a = cat;

In more complex cases, each element of the “enumeration” is an independent structure with its own “constructor”.
Another interesting example is a recursive structure that defines an object of type “list”:

tag list<T> {
       nil;
       cons(T, @list<T>);
     }
let a: list<int> = cons(10, @cons(12, @nil));

Tags can participate in pattern matching expressions, which can be quite complex.

alt x {
         cons(a, @cons(b, _)) {
             process_pair(a,b);
         }
         cons(10, _) {
             process_ten();
         }
         _ {
             fail;
         }
     }
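
For readers who want to try the idea today, here is a minimal sketch of the recursive list and a similar match in present-day Rust, where tags have become `enum` and `alt` has become `match`. The assumptions are mine: `Box` stands in for the `@` box (which was reference-counted, so `Rc` would be closer), and `classify` is a hypothetical function that returns a description instead of calling `process_pair` or failing.

```rust
// A recursive list; each `Cons` owns the rest of the list through a Box.
enum List<T> {
    Nil,
    Cons(T, Box<List<T>>),
}
use List::{Cons, Nil};

// Hypothetical stand-in for the alt expression above.
fn classify(x: &List<i32>) -> String {
    match x {
        Cons(a, rest) => match &**rest {
            // At least two elements: report the first pair.
            Cons(b, _) => format!("pair {} {}", a, b),
            // A single element equal to 10.
            Nil if *a == 10 => "ten".to_string(),
            Nil => "other".to_string(),
        },
        Nil => "other".to_string(),
    }
}

fn main() {
    let a = Cons(10, Box::new(Cons(12, Box::new(Nil))));
    println!("{}", classify(&a));
}
```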

Pattern matching

To begin with, you can think of pattern matching as an improved switch. The keyword alt is followed by the expression being analyzed, and the body of the statement contains patterns and the actions to take when a pattern matches.

alt my_number {
  0       { std::io::println("zero"); }
  1 | 2   { std::io::println("one or two"); }
  3 to 10 { std::io::println("three to ten"); }
  _       { std::io::println("something else"); }
}

As “patterns” you can use not only constants (as in C), but also more complex expressions: variables, tuples, ranges, types, and placeholders ('_'). Additional conditions can be added with a when clause immediately following the pattern. There is a special operator for matching on types. This is possible because the language has a universal variant type any, whose objects can contain values of any type.
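
A minimal sketch of the same switch-like matching in present-day Rust, assuming the current spellings: `match` instead of `alt`, `3..=10` instead of `3 to 10`, and an `if` guard in place of `when`. The function name `describe` and the extra "negative" arm are illustrative additions.

```rust
fn describe(n: i32) -> &'static str {
    match n {
        0 => "zero",
        1 | 2 => "one or two",
        3..=10 => "three to ten",
        // A guard: the arm applies only if the extra condition holds.
        x if x < 0 => "negative",
        _ => "something else",
    }
}

fn main() {
    for n in [0, 2, 7, -5, 11] {
        println!("{} is {}", n, describe(n));
    }
}
```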

Pointers. In addition to the usual C-style pointers, Rust supports special “smart” pointers: shared boxes, which have built-in reference counting, and unique boxes. They are somewhat similar to shared_ptr and unique_ptr from C++. They have their own syntax: @ for shared and ~ for unique. For unique pointers there is a special operation, moving, which is used instead of copying:

let x = ~10;
let y <- x;

After such a move, the pointer x is uninitialized.
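
The two kinds of boxes can be approximated in present-day Rust, assuming `Box` as the counterpart of the unique `~` box and `Rc` as the counterpart of the shared `@` box. The function names are hypothetical and exist only to make the sketch checkable.

```rust
use std::rc::Rc;

fn move_box() -> i32 {
    let x = Box::new(10); // uniquely owned heap value
    let y = x;            // ownership moves from x to y
    // println!("{}", x); // would not compile: x was moved
    *y
}

fn share_count() -> usize {
    let a = Rc::new(10);
    let b = Rc::clone(&a); // a second handle to the same value
    Rc::strong_count(&b)   // number of live handles
}

fn main() {
    println!("{} {}", move_box(), share_count());
}
```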

Closures, partial application, iterators

This is where functional programming begins. Rust fully supports the concept of higher-order functions, that is, functions that can take other functions as arguments and return them.

1. The lambda keyword is used to declare a nested function or function data type.

fn make_plus_function(x: int) -> lambda(int) -> int {
    lambda(y: int) -> int { x + y }
}
let plus_two = make_plus_function(2);
assert plus_two(3) == 5;

In this example, we have a function make_plus_function that takes one argument “x” of type int and returns a function of type “int -> int” (lambda here is a keyword). The returned function is defined in the body of make_plus_function. The absence of a “return” statement is a bit confusing, but for FP this is common.
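
A sketch of the same function in present-day Rust, assuming a closure returned as `impl Fn` in place of the `lambda` type; `move` makes the closure take its own copy of `x`.

```rust
fn make_plus_function(x: i32) -> impl Fn(i32) -> i32 {
    // The closure captures x by value and adds it to its argument.
    move |y| x + y
}

fn main() {
    let plus_two = make_plus_function(2);
    assert!(plus_two(3) == 5);
    println!("{}", plus_two(3));
}
```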

2. The block keyword is used to declare a function type for a function argument, in whose place something that looks like a block of regular code can be passed.

fn map_int(f: block(int) -> int, vec: [int]) -> [int] {
    let result = [];
    for i in vec { result += [f(i)]; }
    ret result;
}
map_int({|x| x + 1 }, [1, 2, 3]);

Here we have a function that takes a block, essentially a lambda function of type “int -> int”, and a vector of int (vector syntax is described above). The “block” itself is written in the calling code using the somewhat unusual syntax {|x| x + 1}. Personally, I prefer the lambdas in C#; the | symbol is stubbornly perceived as bitwise OR (which, by the way, also exists in Rust, like all the good old C operations).
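
The same map_int can be sketched in present-day Rust, assuming a closure parameter (`impl Fn`) in place of the `block` type and a slice in place of the vector argument. The closure syntax `|x| x + 1` kept the vertical bars but lost the surrounding braces.

```rust
fn map_int(f: impl Fn(i32) -> i32, vec: &[i32]) -> Vec<i32> {
    let mut result = Vec::new();
    for &i in vec {
        result.push(f(i)); // apply the closure to each element
    }
    result
}

fn main() {
    println!("{:?}", map_int(|x| x + 1, &[1, 2, 3]));
}
```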

3. Partial application is the creation of a function from another function that has more arguments, by fixing the values of some of that function's arguments. This is done with the bind keyword and the placeholder "_":

let daynum = bind std::vec::position(_, ["mo", "tu", "we", "do", "fr", "sa", "su"])

To make it clearer, I’ll say right away that the same can be done in ordinary C with a simple wrapper that looks the string up in a fixed array, something like this (here -1 stands for “not found”):
#include <string.h>
int daynum(const char *day) { static const char *s[] = {"mo", "tu", "we", "do", "fr", "sa", "su"}; for (int i = 0; i < 7; i++) if (strcmp(s[i], day) == 0) return i; return -1; }

But partial application is a functional style, not a procedural one (by the way, the example does not make it clear how to use partial application to get a function with no arguments).

Another example: the add function is declared with two int arguments, returning an int. Next, the functional type single_param_fn is declared, which has one int argument and returns an int. Using bind, two function objects add4 and add5 are declared, built on the basis of the add function, which has partially given arguments.

fn add(x: int, y: int) -> int {
         ret x + y;
     }
type single_param_fn = fn(int) -> int;
let add4: single_param_fn = bind add(4, _);
let add5: single_param_fn = bind add(_, 5);

Functional objects can be called just like regular functions.

assert (add(4,5) == add4(5));
assert (add(4,5) == add5(4));
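
Present-day Rust has no bind keyword, so the following is only a sketch of the same idea using closures that capture the fixed argument. All helper names (`bind_first`, `bind_second`, `make_daynum`, `DAYS`) are hypothetical.

```rust
fn add(x: i32, y: i32) -> i32 {
    x + y
}

// Fix the first argument of add, leaving the second open.
fn bind_first(x: i32) -> impl Fn(i32) -> i32 {
    move |y| add(x, y)
}

// Fix the second argument of add, leaving the first open.
fn bind_second(y: i32) -> impl Fn(i32) -> i32 {
    move |x| add(x, y)
}

const DAYS: [&str; 7] = ["mo", "tu", "we", "do", "fr", "sa", "su"];

// Fix the list being searched, leaving the day name open.
fn make_daynum() -> impl Fn(&str) -> Option<usize> {
    |day: &str| DAYS.iter().position(|d| *d == day)
}

fn main() {
    let add4 = bind_first(4);
    let add5 = bind_second(5);
    assert!(add(4, 5) == add4(5));
    assert!(add(4, 5) == add5(4));
    let daynum = make_daynum();
    println!("{:?}", daynum("we"));
}
```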

4. Pure functions and predicates
Pure functions are functions that have no side effects (which includes not calling any functions other than pure ones). Such functions are marked with the keyword pure.

     pure fn lt_42(x: int) -> bool {
         ret (x < 42);
     }

Predicates are pure functions that return a bool type. Such functions can be used in the typestate system (see below), that is, called at the compilation stage for various static checks.

Syntax macros
A planned feature, but a very useful one. In Rust, it is still at an early stage of development.

std::io::println(#fmt("%s is %d", "the answer", 42));

An expression similar to C's printf, but processed at compile time (accordingly, all argument errors are detected at compile time). Unfortunately, there is very little material on syntax macros, and they are themselves still under development, but there is hope that something like Nemerle macros will come out of it.
By the way, unlike Nemerle, Rust marks macros syntactically with the # symbol, a decision I consider very sensible: a macro is an entity very different from a function, and I think it is important to see at a glance where the code calls functions and where it calls macros.
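
In present-day Rust the closest counterpart is, as far as I can tell, the `format!` family of macros, marked with a trailing `!` instead of a leading `#`. A minimal sketch, with `message` as a hypothetical wrapper:

```rust
fn message() -> String {
    // The format string is checked at compile time: a missing or
    // surplus argument is a compile error, not a runtime failure.
    format!("{} is {}", "the answer", 42)
}

fn main() {
    println!("{}", message());
}
```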

Attributes

A concept similar to C# attributes (and even with similar syntax). Special thanks to the developers for this. As you would expect, attributes add meta-information to the entity they annotate:

#[cfg(target_os = "win32")]
fn register_win_service() { /* ... */ }

There is also another variant of the attribute syntax: the same line, but with a semicolon at the end, annotates the current context, that is, whatever corresponds to the innermost curly braces enclosing the attribute.

fn register_win_service() {
    #[cfg(target_os = "win32")];
    /* ... */
}
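
Both attribute forms survive in present-day Rust, with some assumptions to note: the operating system value is now spelled "windows" rather than "win32", and the form that annotates the enclosing context is written `#![...]` instead of using a trailing semicolon. The function names below are illustrative.

```rust
// Outer attribute: applies to the item that follows it.
#[cfg(target_os = "windows")]
fn platform() -> &'static str { "windows" }

#[cfg(not(target_os = "windows"))]
fn platform() -> &'static str { "not windows" }

fn helper() -> i32 {
    // Inner attribute: applies to the enclosing function body.
    #![allow(unused_variables)]
    let unused = 1;
    2
}

fn main() {
    println!("{} {}", platform(), helper());
}
```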

Parallel computing

Perhaps one of the most interesting parts of the language. At the same time, it is currently not described in the tutorial at all :)
A Rust program consists of a “task tree”. Each task has an entry function, its own stack, means of interacting with other tasks (channels for outgoing information and ports for incoming), and owns some of the objects in the dynamic heap.
Many Rust tasks can exist within a single operating system process. Rust tasks are “lightweight”: each task consumes less memory than an OS process, and switching between tasks is faster than switching between OS processes (here, probably, “threads” are what is really meant).

A task consists of at least one function with no arguments. A task is launched with the spawn function. Each task can have channels through which it passes information to other tasks. A channel is a special generic type chan, parameterized by the type of the channel's data. For example, chan<u8> is a channel for transmitting unsigned bytes.
To send to a channel, the send function is used; its first argument is the channel and the second is the value to send. In effect, this function places the value in the channel's internal buffer.
Ports are used to receive data. A port is a generic type port, parameterized by the type of the port's data: port<u8> is a port for receiving unsigned bytes.
To read from a port, the recv function is used; its argument is the port and its return value is the data from the port. Reading blocks the task: if the port is empty, the task waits until another task sends data to a channel connected to the port.
Linking channels to ports is very simple: a channel is initialized with a port using the chan keyword:

let reqport = port();
let reqchan = chan(reqport);

Several channels can be connected to one port, but not vice versa: one channel cannot be connected to several ports simultaneously.
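
The channel and port model can be tried out in present-day Rust with the standard library's `std::sync::mpsc`, assuming its `Sender` plays the role of a channel and its `Receiver` the role of a port. Note that `std::thread::spawn` starts OS threads rather than lightweight tasks, so this is a sketch of the communication pattern only; `sum_from_workers` is a hypothetical name.

```rust
use std::sync::mpsc;
use std::thread;

// Spawn n workers that each send their id as a byte; return the sum received.
fn sum_from_workers(n: u8) -> u32 {
    let (tx, rx) = mpsc::channel::<u8>();
    let mut handles = Vec::new();
    for id in 0..n {
        let tx = tx.clone(); // several senders feed a single receiver
        handles.push(thread::spawn(move || {
            tx.send(id).unwrap();
        }));
    }
    drop(tx); // drop the original sender so the receiving loop can end
    // Receiving blocks until a value arrives or every sender is gone.
    let total: u32 = rx.iter().map(u32::from).sum();
    for h in handles {
        h.join().unwrap();
    }
    total
}

fn main() {
    println!("{}", sum_from_workers(4));
}
```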

Typestate

I did not find a generally accepted Russian translation of the term “typestate”, so I will simply speak of “type states”. The essence of this feature is that, in addition to the usual type checking of static typing, additional context-dependent checks are possible at compile time.
In one form or another, type states are familiar to all programmers from the compiler message “variable is used without initialization”. The compiler finds places where a variable that has never been written is read, and issues a warning. More generally, the idea looks like this: each object has a set of states it can be in. In each state, certain operations on the object are valid and others are invalid. The compiler can check whether a specific operation on an object is allowed at a particular place in the program. It is important that these checks are performed at compile time.

For example, if we have an object of type “file”, it may be in the state “closed” or “open”, and reading from the file is not allowed if the file is closed. In modern languages, the read function usually either throws an exception or returns an error code. A typestate system could detect such an error at compile time: just as the compiler determines that a variable is read before any possible write, it could determine that the Read method, which is valid only in the “open” state, is called before the Open method, which puts the object into that state.

Rust has a notion of “predicates” - special functions that have no side effects and return a bool type. Such functions can be used by the compiler to be called at the compilation stage for the purpose of static checks of certain conditions.

Constraints are special checks that can be performed at the compilation stage. To do this, use the check keyword.

pure fn is_less_than(a: int, b: int) -> bool {
          ret a < b;
     }
 fn test() {
   let x: int = 10;
   let y: int = 20;
   check is_less_than(x,y);
 }

Predicates can be “hung” on the input parameters of functions in this way:

fn test(x: int, y: int) : is_less_than(x,y) { ... }

There is very little information on typestate, so many points are not yet clear, but the concept is interesting anyway.
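
The file example described above can at least be imitated in present-day Rust by encoding the state in a type parameter. This is a common pattern rather than the check/predicate mechanism discussed here. Everything in the sketch is hypothetical: `FileHandle`, the marker types `Closed` and `Open`, and a `read` that returns a made-up string instead of doing real I/O.

```rust
use std::marker::PhantomData;

// Marker types for the two states.
struct Closed;
struct Open;

struct FileHandle<State> {
    name: String,
    _state: PhantomData<State>,
}

impl FileHandle<Closed> {
    fn new(name: &str) -> Self {
        FileHandle { name: name.to_string(), _state: PhantomData }
    }
    // Consumes the closed handle and returns an open one.
    fn open(self) -> FileHandle<Open> {
        FileHandle { name: self.name, _state: PhantomData }
    }
}

impl FileHandle<Open> {
    // Only defined in the Open state, so reading a closed
    // handle is rejected at compile time.
    fn read(&self) -> String {
        format!("contents of {}", self.name)
    }
    fn close(self) -> FileHandle<Closed> {
        FileHandle { name: self.name, _state: PhantomData }
    }
}

fn main() {
    let f = FileHandle::new("a.txt").open();
    println!("{}", f.read());
    let closed = f.close();
    // closed.read(); // would not compile: no read method in the Closed state
    println!("{}", closed.name);
}
```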

That's all. I may well have missed some interesting points, but the article has already grown too long. If you wish, you can build the Rust compiler now and try playing with various examples. Build instructions are available on the official language website.
