Why Your First Rust FizzBuzz Implementation May Not Work

Original author: Chris Morgan
  • Transfer
Полное оригинальное название статьи: «Why your first FizzBuzz implementation may not work: an exploration into some initially surprising but great parts of Rust (though you still might not like them)»

tl;dr;-версия: На первый взгляд некоторые аспекты Rust могут показаться странными и даже неудобными, однако, они оказываются весьма удачными для языка, который позиционируется как системный. Концепции владения (ownership) и времени жизни (lifetime) позволяют привнести в язык сильные статические гарантии и сделать программы на нём эффективными и безопасными, как по памяти, так и по времени.

Лицензия: CC-BY, автор Chris Morgan.

Why your first FizzBuzz implementation may not work: exploring some of Rust's features that are initially shocking, but are actually its best (though you may not like them anyway)

http://chrismorgan.info/media/images/rust-fizzbuzz.svgFizzBuzz is supposed to be a simple assignment for a beginner, but there are a few pitfalls in Rust that you should be aware of. These pitfalls are not Rust's problems, but rather differences from what most programmers are familiar with, limitations that at first glance may seem very tough, but in reality give huge advantages at a low price.

Rust is a "moving target", however, the language becomes more stable. The code from the article works with version 0.12. If something breaks, please contact me . Regarding Python code, it will work in both the two and the three.

Simple implementation


Ok, I said in the title that your first FizzBuzz implementation may not work, well, it may work. You could write it as in the example below. For brevity, omit fn main() { … }. If you are concerned that the code in Python is shorter than in Rust, then for you there is a special form of code in Python, it is available by clicking on the checkbox. ( approx. lane in the original article there is a checkbox that switches Python code to a “special form” from __future__ import braces, something like an Easter egg from the author ).
FizzBuzz implementation with separate instructions print: everything just works.
PythonRust
for i in range(1, 101):
    if i % 15 == 0:
        print('FizzBuzz')
    elif i % 5 == 0:
        print('Buzz')
    elif i % 3 == 0:
        print('Fizz')
    else:
        print(i)

for i in range(1i, 101) {
    if i % 15 == 0 {
        println!("FizzBuzz");
    } else if i % 5 == 0 {
        println!("Buzz");
    } else if i % 3 == 0 {
        println!("Fizz");
    } else {
        println!("{}", i);
    }
}


Both programs produce the desired result, and they are obviously very similar. The main thing worth mentioning here is that in Rust println!()[1] it requires a string literal as the first argument, a format string; the corresponding Python code would look like this: print('{}'.format(i)).

But what if we want to get rid of duplicate calls printin code? Here's what it might look like:
FizzBuzz with one instruction print.
PythonRust (not compiled)
for i in range(1, 101):
    if i % 15 == 0:
        result = 'FizzBuzz'
    elif i % 5 == 0:
        result = 'Buzz'
    elif i % 3 == 0:
        result = 'Fizz'
    else:
        result = i
    print(result)

for i in range(1i, 101) {
    let result = if i % 15 == 0 {
        "FizzBuzz"
    } else if i % 5 == 0 {
        "Buzz"
    } else if i % 3 == 0 {
        "Fizz"
    } else {
        i
    };
    println!("{}", result);
}


Notice how in Rust we can use the whole block ifas an expression. Even the result of the assignment is not really needed, we could just cram the whole block into the construction where it is used. This is a very familiar approach for rubists, but not for pythonists, because in Python everything is an instruction, not an expression. If you are skeptical about this approach, I understand; when I started getting acquainted with Rust, its bias in expressions and the ability to omit instructions returnseemed to me odd. But, using Rust, I realized that this is not the case. This is actually great.

Python has been my favorite language for five years, but despite the fact that I continue to write professionally in this language (though I would like to switch to Rust), I found that I often lack Rust chips. When working with Rust, I don’t feel the same shortcoming of any of Python, except for the need for libraries that Rust does not already have. In general, Rust has very much downgraded Python in my eyes.

Rust code looks good, but in reality it does not work due to strict typing rules in this language. So what is the type of variable result? The first three branches ifreturn strings, and the fourth - an integer:

f.rs:7:12: 11: 6 error: if and else have incompatible types: expected `& 'static str`, found` int` (expected & -ptr, found int)
f.rs:7} else if i% 3 == 0 {
f.rs:8 "Fizz"
f.rs:9} else {
f.rs:10 i
f.rs:11};
error: aborting due to previous error

This does not work. How about turning a number into a string?
for i in range(1i, 101) {
    let result = if i % 15 == 0 {
        "FizzBuzz"
    } else if i % 5 == 0 {
        "Buzz"
    } else if i % 3 == 0 {
        "Fizz"
    } else {
        i.to_string()
    };
    println!("{}", result);
}

Here we pulled off a chip with which many are familiar from other programming languages ​​( to_string) and applied it in an area in which not many understand. In general, this does not work.

f.rs:7:12: 11: 6 error: if and else have incompatible types: expected `& 'static str`, found` collections :: string :: String` (expected & -ptr, found struct collections :: string :: String)
f.rs:7} else if i% 3 == 0 {
f.rs:8 "Fizz"
f.rs:9} else {
f.rs:10 i.to_string ()
f.rs:11};
error: aborting due to previous error

“What?” I just hear you say, “Now aren't they all the lines?” What’s the matter with this &'static str(how the hell is that to pronounce?) And collections::string::String? ” At this stage, we should take a closer look at the analysis of the types of values ​​produced by branches: the first three branches do not just produce some kind of “string”, they produce &'static str, and the fourth branch does not produce just “whole”, but int. In languages ​​like Python, Ruby, and JavaScript, integer types are combined (and JS went even further and combined all numeric types), while C #, Java, Go have many integer types that vary in size. But even languages ​​like C #, Java, Go have only one type for a string.

But Rust is not. He has two of them.

Two types of lines? Is that what?


Here we could confine ourselves to a simple explanation and go further, but since we have gone so deep, why not go down to the end and understand what has been done and why it is absolutely worth it. So why could C♯, Java and Go be satisfied with the same string type, but Rust not? To answer this question we must go down to the level of memory management.

Both C♯, Java, and Go are all managed languages [2] (also known as garbage collection languages)) That is, they have a mechanism in runtime that controls the allocation and freeing of memory at the appropriate time: when no one else uses the string, it can be special. Thus, they can return a link to a string without worrying about its life time: strings that are still in use will not be special.

There is also one concession for these languages. As a rule, they have immutable (immutable) lines - if two lines are concatenated, memory will be allocated (allocation) for a new line of the required size. (This also means that for a solution that will concatenate Fizz and Buzz, as appropriate, there will be two allocations for numbers divisible by 15. True, some languages ​​can mitigate this negative effect a little by applying what is calledrow pool or interning . The success of this mechanism depends on the optimizer and how the code is written) I suppose the lines are immutable because, as an alternative, we will have more of the evils - changing the line may affect other lines depending on it. This greatly affects the correctness of the program and can lead to race conditions in what is essentially a primitive type.
Also, for unicode lines, this can lead to incorrect line cuts. Of course, these problems also occur elsewhere, but having them in lines can be much worse. (I said that these languages ​​have one string type, but this is not entirely true - there are also specialized string types, for example, both Java and .NET have a mechanism calledStringBuilder).

Модель Rust отличается от используемой в языках со сборкой мусора и основана на понятии владения (ownership). В этой модели каждый объект имеет одного владельца (прим. пер. is owned) в одном месте, в один момент времени, а в других местах можно безопасно получать указатель на него, одалживать (borrow).

collections::string::String это тип с владением. Это значит, что он имеет исключительное право владения содержимым стоки. Когда объект такого типа покидает свою область видимости (scope), строка освобождается. Поэтому, любая подстрока не может иметь тип String, потому что между строкой и подстрокой не будет связи, и когда первая покинет свою область видимости, вторая станет некорректной. Вместо этого, подстроки (или string slices ) use a type that is a reference to an object that someone else owns &str. Rust, thanks to the concept of the lifetime of an object, is able to ensure that not a single line slice survives its original string, thus maintaining memory safety.

The lifetime guide has a more detailed explanation. Here, if you see the construction 'такого_видаafter the reference type, be aware that this is how the lifetime of the link is determined. There is a special lifetime 'static, which means that the object exists throughout the entire program. Such objects are baked directly into the executable file, as well as string literals that are found in the code - that is, the type of a string literal &'static str.

Previously, when the type ~Twas what it is now , but was fake, the type was a variable-size string type. He kept the current (size) and maximum (capacity) size - as the current type (which replaced ). All wrapper types were supposed to work this way. Now, this is a simple wrapped meaning. That's why it is not used - without an additional capacity, it would need to reallocate the memory each time it is added to the string. knows how to reallocate memory and does it by default. Therefore, the difference between and is significant. I can add that during this change, the new type was namedBoxstr~strString~strBoxStringBox&str

StrBuf. In fact, the situation is not much different from that in other languages. In fact, this is the effect of the lack of compulsory garbage collection, which makes some applications &strstupid. In Rust, you will have to access the string buffer a bit more often than in other languages, simply because other languages allow you to access your main string type more lightly.


Back to FizzBuzz


That is, the problem is that in one branch we have a row with ownership, and the rows in the other three are just static string slices (links to statically defined strings). How do we solve this problem? Maybe we’ll try to make them all string slices (yes, we can implicitly result in a type of &strany lifetime , if longer than . Since longer than anything, the compiler can freely convert it to a suitable lifetime):'a'b'a'b'static
for i in range(1i, 101) {
    let result = if i % 15 == 0 {
        "FizzBuzz"
    } else if i % 5 == 0 {
        "Buzz"
    } else if i % 3 == 0 {
        "Fizz"
    } else {
        i.to_string().as_slice()
    };
    println!("{}", result);
}

Looks like a good idea, huh? Sorry, this will not work either:
f.rs:10:9: 10:22 error: borrowed value does not live long enough
f.rs:10 i.to_string (). as_slice ()
                ^ ~~~~~~~~~~~~
f.rs:2:25: 13: 2 note: reference must be valid for the block at 2:24 ...
f.rs:2 for i in range (1i, 101) {
f.rshaps let result = if i% 15 == 0 {
f.rs:4 "FizzBuzz"
f.rscla} else if i% 5 == 0 {
f.rs:6 "Buzz"
f.rs:7} else if i% 3 == 0 {
       ...
f.rs:9:12: 11: 6 note: ... but borrowed value is only valid for the expression at 9:11
f.rs:9} else {
f.rs:10 i.to_string (). as_slice ()
f.rs:11};
error: aborting due to previous error

Here we rest against the lifetime: the line generated in is i.to_string()not stored enough time and is freed at the end of the block. Thus, a link to it also cannot leave the block. This is a potential bug with a link to invalid memory that the Rust compiler has successfully caught. In some languages ​​this is called a “dangling pointer” and it is Very Bad.

Here we can simply raise the string variable per block, it is enough for us that the string is valid during the body of the loop. Sometimes you will encounter situations in which this will be enough, but often not.
for i in range(1i, 101) {
    let x;
    let result = if i % 15 == 0 {
        "FizzBuzz"
    } else if i % 5 == 0 {
        "Buzz"
    } else if i % 3 == 0 {
        "Fizz"
    } else {
        x = i.to_string();
        x.as_slice()
    };
    println!("{}", result);
}
We place the link in the enclosing block, this works.

How about making everything a type String?


We can go in the opposite direction, requiring all branches to return rows with ownership:
for i in range(1i, 101) {
    let result = if i % 15 == 0 {
        "FizzBuzz".to_string()
    } else if i % 5 == 0 {
        "Buzz".to_string()
    } else if i % 3 == 0 {
        "Fizz".to_string()
    } else {
        i.to_string()
    };
    println!("{}", result);
}
We do everything in lines, but not free for runtime.

This approach works well, but it means that memory will be allocated for each iteration, not only for those in which we get a number.

Write function


We went as much as we could in this direction, without rolling the code into absurdity. How about changing the very statement of the problem, namely that we do not print the result, but return it from the function?

Let's start with this code:
Function fizz_buzzreturning String.
PythonRust
def fizz_buzz(i):
    if i % 15 == 0:
        return 'FizzBuzz'
    elif i % 5 == 0:
        return 'Buzz'
    elif i % 3 == 0:
        return 'Fizz'
    else:
        return i
for i in range(1, 101):
    print(fizz_buzz(i))

fn fizz_buzz(i: int) -> String {
    if i % 15 == 0 {
        "FizzBuzz".to_string()
    } else if i % 5 == 0 {
        "Buzz".to_string()
    } else if i % 3 == 0 {
        "Fizz".to_string()
    } else {
        i.to_string()
    }
}
for i in range(1i, 101) {
    println!("{}", fizz_buzz(i));
}

Now we have an extra level of encapsulation. It demonstrates just the case when a solution with taking a variable to a higher level will not work, because the variable will leave the function itself.
( You can try it yourself ; the return value of the function cannot be represented in the Rust type system, since there is no suitable lifetime — xit will not receive the lifetime 'static, and there is nothing to which we could attach it.)

Also, since we put the code in the function, we highlight new lines for those cases when it is not needed.

We introduce SendStr


Fortunately, Rust supports algebraic data types (also known as enum). And also, in the standard library there is a suitable type that can describe an object that is either a string slice or an ownership string.

The following is a definition of this type (without a description of the methods that make it even more useful):
pub enum MaybeOwned<'a> {
    Slice(&'a str),
    Owned(String)
}
pub type SendStr = MaybeOwned<'static>;

Definitions MaybeOwnedand SendStrout std::str.

SendThis is a restriction that indicates that an object can be safely transferred between tasks (that is, between threads, without losing memory security); it also implies that the object is self-contained, and can be returned from a function. Let there be a string of type &'static str, as in the definition SendStr; it does not contain references to any objects inside the function, right? Therefore, it can exist as long as necessary. The same is true for String. Therefore, either of these two objects can be captured inside the enum-type, which says that we own one or the other object. Therefore, it SendStrsatisfies the condition Send. This type stores some value in itself and the user canвыполнять над ним разные операции. Сейчас самый примечательный факт в том, что мы можем извлекать строковой срез из этого типа с помощью as_slice(). Данный тип также реализует std::fmt::Show, что означает, что мы можем использовать его в форматированном выводе напрямую, указывая {} (типаж Show это прямой аналог __str__() в Python или to_s(), toString(), &c в других языках, но он работает напрямую с объектом writer, что позволяет избавиться от промежуточного строкового объекта. Вызов to_string() на любом типе, реализующем Show также вызывает этот механизм).

Вот как выглядит применение:
use std::str::SendStr;
fn fizz_buzz(i: int) -> SendStr {
    if i % 15 == 0 {
        "FizzBuzz".into_maybe_owned()
    } else if i % 5 == 0 {
        "Buzz".into_maybe_owned()
    } else if i % 3 == 0 {
        "Fizz".into_maybe_owned()
    } else {
        i.to_string().into_maybe_owned()
    }
}
for i in range(1i, 101) {
    println!("{}", fizz_buzz(i));
}

The fizz_buzz function returns SendStr. It works.
( .into_maybe_owned()taken from IntoMaybeOwnedand available by default )

Cool! Now we have reduced the amount of work that the computer needs to do and made our well-known example faster.
But can we go further?

Writing your own enumtype and type implementationstd::fmt::Show


Of course, what we pass is not really a “string”; these are some values ​​of “Fizz”, “Buzz”, “FizzBuzz”, or a number. We simply converted all options to a string in advance; we can easily do this lazily, avoiding unnecessary allocations (in fact, all allocations here can be avoided).

Let's make your own enum.
In the process, we also implement it std::fmt::Showfor him, which will allow us to output it directly to stdout, without the need for an intermediate line.
Using an isolated data type to efficiently represent possible value options.
Python counterpart (extremely tight)Rust
class FizzBuzzItem:
    def __init__(self, value):
        self._value = value
    def __str__(self):
        if self is Fizz:
            return "Fizz"
        elif self is Buzz:
            return "Buzz"
        elif self is FizzBuzz:
            return "FizzBuzz"
        else:
            return str(self._value)
# притворимся, что эти типы непрозрачны
Fizz = FizzBuzzItem(object())
Buzz = FizzBuzzItem(object())
FizzBuzz = FizzBuzzItem(object())
def Number(number):
    return FizzBuzzItem(number)
def fizz_buzz(i):
    if i % 15 == 0:
        return FizzBuzz
    elif i % 5 == 0:
        return Buzz
    elif i % 3 == 0:
        return Fizz
    else:
        return Number(i)
for i in range(1, 101):
    print(fizz_buzz(i))

use std::fmt;
enum FizzBuzzItem {
    Fizz,
    Buzz,
    FizzBuzz,
    Number(int),
}
impl fmt::Show for FizzBuzzItem {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match *self {
            Fizz => f.write(b"Fizz"),
            Buzz => f.write(b"Buzz"),
            FizzBuzz => f.write(b"FizzBuzz"),
            Number(num) => write!(f, "{}", num),
        }
    }
}
fn fizz_buzz(i: int) -> FizzBuzzItem {
    if i % 15 == 0 {
        FizzBuzz
    } else if i % 5 == 0 {
        Buzz
    } else if i % 3 == 0 {
        Fizz
    } else {
        Number(i)
    }
}
for i in range(1i, 101) {
    println!("{}", fizz_buzz(i));
}


Please note that this is a really good way to present the data, although we might not bother so much in this case and just replace the first three branches with the type Word(&'static str): Word("FizzBuzz")etc. (In truth, this was the first version I wrote at this step. Even I was turned on using strings where this is not required!)

We could go further by writing a separate iterator, but given how the iterators work in Rust, this is completely optional - you can just write range(1, 101).map(fizz_buzz). This will give a lot more flexibility. Once it is implemented somewhere , you can simply add it to the end and you will get a type that implements . The cycle can be rewritten in this style once or twice:Iterator.map(fizz_buzz)Iterator

We apply the function fizz_buzzby the iterator of integers.
PythonRust
for f in map(fizz_buzz, range(1, 101)):
    print(f)

for f in range(1, 101).map(fizz_buzz) {
    println!("{}", f);
}


Whichever of the methods we choose, as a result, we get the good old FizzBuzz program exhaust.

Conclusion


Now you know why your first Rust FizzBuzz implementation might not work. Some of the difficulties described in the article are typical for statically typed languages, some are specific to Rust. (In fact, the situation is similar to that in C ++, except that C ++ will allow you to make a bunch of stupid mistakes and does not give any guarantees for working with memory. Do not argue with me about this, here I am only quoting other people, I I don’t know C ++ to the proper extent.)

We went over the subject of ownership model in Rust, and how it can prevent you from writing in the style you are used to, and why (without the description of specific advantages). We also mentioned the effective concept of enum-types (algebraic data types), which allows us to describe data more strictly and efficiently.

I hope you saw the power of all these things, and it interested you.

Is the described additional semantic load? Yes.

It is unpleasant? Periodically. (My experience says that all this saves from difficulties as often as it creates them.)

Does this improve the efficiency of your programs? Of course, and with complete confidence. Previously, these things required a loss of security and correctness, now in Rust, you do not need such compromises.

Does this simplify code penetration? In simple cases, like this one, there isn’t much difference, but in complex cases these mechanisms become real help. (I really miss them in Python.)

Summing up these concepts, we can say that there are good and bad sides: sometimes you will love them, sometimes you will hate them. But at least I don't hate them that often.

Should you use Rust? Well, I suggest at least trying it. You may find it raw or unsuitable for your purposes due to its emphasis on system programming. For many high-level tasks, it can be somewhat cumbersome. But I believe that the time will come, and it will be a cool tool for things like web programming, which I talked about in a talk at StrangeLoop (you can also see slides, 2MB SVG ).

Finally, if you are new to Rust or have not understood some part of the article, I suggest you familiarize yourself with the official documentation; a thirty-minute introduction to Rust describes the concept of ownership quite well, and in Hyde there are well-disclosed enumtypes and much more. There are also more detailed guides on specific issues . If you still have questions, places like the #rust channel on irc.mozilla.org can help a lot - I am there for a long time, my nickname is ChrisMorgan .

Well, if you really like messing around with FizzBuzz optimization


Yes, please . This is the final version, with the minimum corrections necessary to compile with the modern version of Rust, and the string version OUTfor improved readability (!?):
#![no_std]
#![feature(asm, lang_items)]
extern crate libc;
static OUT: &'static [u8] = b"\
    1\n2\nFizz\n4\nBuzz\nFizz\n7\n8\nFizz\nBuzz\n11\nFizz\n13\n14\nFizzBuzz\n\
    16\n17\nFizz\n19\nBuzz\nFizz\n22\n23\nFizz\nBuzz\n26\nFizz\n28\n29\nFizzBuzz\n\
    31\n32\nFizz\n34\nBuzz\nFizz\n37\n38\nFizz\nBuzz\n41\nFizz\n43\n44\nFizzBuzz\n\
    46\n47\nFizz\n49\nBuzz\nFizz\n52\n53\nFizz\nBuzz\n56\nFizz\n58\n59\nFizzBuzz\n\
    61\n62\nFizz\n64\nBuzz\nFizz\n67\n68\nFizz\nBuzz\n71\nFizz\n73\n74\nFizzBuzz\n\
    76\n77\nFizz\n79\nBuzz\nFizz\n82\n83\nFizz\nBuzz\n86\nFizz\n88\n89\nFizzBuzz\n\
    91\n92\nFizz\n94\nBuzz\nFizz\n97\n98\nFizz\nBuzz\n";
#[start]
fn start(_argc: int, _argv: *const *const u8) -> int {
    unsafe {
        asm!(
            "
            mov $$1, %rax
            mov $$1, %rdi
            mov $0, %rsi
            mov $$0x19d, %rdx
            syscall
            "
            :
            : "r" (&OUT[0])
            : "rax", "rdi", "rsi", "rdx"
            :
        );
    }
    0
}
#[lang = "stack_exhausted"] extern fn stack_exhausted() {}
#[lang = "eh_personality"] extern fn eh_personality() {}
#[lang = "fail_fmt"] extern fn fail_fmt() {}


Translator Notes :
1. Rust has a developed macro system, in this case, println!in compile-time, it is deployed in a call specialized for a specific type println.
2. When you read the original for the first time, you might get the impression that we are talking about managed code , however, here we mean managed memory . Despite the various formulations inside and outside the brackets, we are talking about the same thing.

The material is large enough, stylistic or semantic errors of translation are quite possible. Also, due to the fact that I am not an expert in Rust and statically typed languages, inaccuracies in the description of some mechanisms may occur. In both cases, I will be grateful if you send me your corrections in private messages.
Thanks for attention.

Also popular now: