Focusing on ownership

Original author: Nicholas D. Matsakis
  • Translation

Translator's note: the original post is dated May 13, 2014, so some details, including the source code, may no longer match the current state of the language. The reason for translating such an old post is the value of its content for understanding ownership, one of the fundamental concepts of Rust.


Over time, I have become convinced that it would be better to drop the distinction between mutable and immutable local variables in Rust. Many people are skeptical, to say the least, so I wanted to lay out my position in public. I will give several motives (philosophical, educational, and practical) and also address the main defense of the current system. (Note: I considered submitting this as a Rust RFC, but decided that the tone was better suited to a blog post, and I don't have time to rewrite it now.)


Explanation


I have written this article rather forcefully, and I do believe that the path I am advocating is the right one. However, if we end up keeping the current system, it will not be a disaster or anything like that. It has its advantages, and overall I find it quite pleasant. I just think we can improve it.


In a word


I would like to remove the distinction between immutable and mutable local variables and rename &mut pointers to &my, &only, or &uniq (I don't care which). There would be no mut keyword at all.


Philosophical motive


The main reason I want to do this is that I think it will make the language more consistent and easier to understand. Essentially, it shifts the conversation from mutability to aliasing (which I will call "sharing"; see below).


Mutability then becomes a consequence of uniqueness: "You can always mutate anything you have unique access to. Shared data is generally immutable, but if you need to, you can mutate it using some kind of Cell type."


In other words, over time it has become clear to me that problems with data races and memory safety arise when you have both aliasing and mutability at the same time. The functional approach to this problem is to remove mutability. Rust's approach would be to remove aliasing. This gives us a story we can tell, and one that helps people find their bearings.
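A minimal sketch of the two sides of that story, written against today's Rust rather than the 2014 compiler. The function names (`bump_unique`, `bump_shared`, `demo`) are made up for illustration: the first mutates through a unique reference, the second mutates through a shared one by opting in with `std::cell::Cell`.

```rust
use std::cell::Cell;

/// Mutation through a unique (`&mut`) reference: no alias can observe `n`.
fn bump_unique(n: &mut usize) {
    *n += 1;
}

/// Mutation through a shared (`&`) reference, opted into via `Cell`.
fn bump_shared(n: &Cell<usize>) {
    n.set(n.get() + 1);
}

/// Hypothetical driver: returns both counters after the updates.
fn demo() -> (usize, usize) {
    let mut unique = 0;
    bump_unique(&mut unique);

    let shared = Cell::new(0);
    let (a, b) = (&shared, &shared); // two aliases of the same cell
    bump_shared(a);
    bump_shared(b);

    (unique, shared.get())
}

fn main() {
    println!("{:?}", demo());
}
```

Note that `shared` is never declared `mut`, yet its contents change: the mutation is licensed by the `Cell` type, not by the binding.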


Terminology note: I think we should refer to aliasing as sharing (translator's note: from here on, "sharing" is used in place of "aliasing", in the sense of shared ownership, because a literal rendering of "aliasing" does not convey what is meant). In the past we avoided this term because of its multithreaded connotations. However, if and when we implement the data-parallelism plans I have proposed, that connotation will not be inappropriate. In fact, given the close relationship between memory safety and data races, I really want to promote this connotation.


Educational motive


I think the current rules are harder to understand than they need to be. It is not obvious, for example, that &mut T implies no sharing. In addition, the notation &mut T suggests that &T implies no mutability, which is not entirely accurate because of types such as Cell. And nobody can agree on what to call them ("mutable/immutable references" is the most common, but it is not quite right).


In contrast, a type like &my T or &only T seems to make the explanation simpler. It is a unique reference, so naturally you cannot have two of them pointing to the same place. Mutability is an orthogonal matter: it follows from uniqueness, but it is also available through cells. And the type &T is simply its opposite, a shared reference. RFC PR #58 makes a number of similar arguments; I will not repeat them here.


Practical motive


Currently there is a gap between borrowed pointers, which can be either shared or mutable + unique, and local variables, which are always unique but may be mutable or immutable. The end result is that users have to put mut declarations on things that are not directly mutated.


Local variables cannot be modeled using references


This phenomenon arises because references are not as expressive as local variables. In general, this gets in the way of abstraction. Let me give a few examples to explain what I mean. Imagine that I have an environment struct that stores a pointer to an error counter:


struct Env<'a> { errors: &'a mut usize }

Now I can create instances of this structure (and use them):


let mut errors = 0;
let env = Env { errors: &mut errors };
...
if some_condition {
    *env.errors += 1;
}

OK, now imagine that I want to extract the code that mutates env.errors into a separate function. I might think that, since the variable env is not declared as mutable, I can use an immutable reference &:


let mut errors = 0;
let env = Env { errors: &mut errors };
helper(&env);

fn helper(env: &Env) {
  ...
  if some_condition {
      *env.errors += 1; // ERROR
  }
}

But that is not so. The problem is that &Env is a shared type (translator's note: more than one immutable reference to an object can exist at a time), and therefore env.errors appears in a location that permits sharing of the env object. To make this code work, I have to declare env as mutable and use a &mut reference (translator's note: &mut tells the compiler that access to env is unique, since only one mutable reference to an object can exist at a time and data races are ruled out; mut is needed because a mutable reference cannot be created to an immutable object):


let mut errors = 0;
let mut env = Env { errors: &mut errors };
helper(&mut env);

This problem arises because we know that local variables are unique, but we cannot carry that knowledge into a borrowed reference without making it mutable.
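For reference, here is a compilable sketch of the working version of the Env example, assuming a recent stable Rust compiler. The `'a` lifetime parameter is something today's compiler requires, and `count_errors` is a made-up wrapper so the result can be inspected.

```rust
struct Env<'a> {
    errors: &'a mut usize,
}

// Taking `&Env` here would be rejected by the compiler: the counter
// would then sit behind a shared reference.
fn helper(env: &mut Env<'_>) {
    *env.errors += 1;
}

/// Hypothetical wrapper: runs the helper twice and returns the count.
fn count_errors() -> usize {
    let mut errors = 0;
    {
        // `mut` is required on `env` only so that `&mut env` can be taken,
        // even though `env` itself is never reassigned.
        let mut env = Env { errors: &mut errors };
        helper(&mut env);
        helper(&mut env);
    }
    errors
}

fn main() {
    println!("{}", count_errors());
}
```

The `mut` on `env` is exactly the kind of declaration the text describes: it marks a binding that is never assigned to a second time.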


This problem crops up in a number of other places. So far we have worked around it in various ways, but the feeling persists that we are papering over a gap that simply should not exist.


Type checking for closures


We had to work around this restriction in the case of closures. Closures are mostly desugared into structs like Env, but not quite. This is because I did not want to require local variables to be declared mut when they are used through &mut in closures. In other words, take some code like this:


fn foo(errors: &mut usize) {
    do_something(|| *errors += 1)
}

The closure expression will actually create an instance of an Env-like struct:


struct ClosureEnv<'a, 'b> {
    errors: &uniq &mut usize
}

Note the &uniq reference. This is not something the end user can write. It means a "unique, but not necessarily mutable" pointer. It is needed to pass type checking. If the user tried to write this struct by hand, they would have to write &mut &mut usize, which in turn would require the errors parameter to be declared as mut errors: &mut usize.
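This behavior can still be observed in today's Rust, where a closure that writes through a `&mut` parameter captures it by a unique borrow without the parameter being declared `mut`. A minimal sketch, in which `do_something` and `run` are hypothetical helpers added for illustration:

```rust
/// Hypothetical consumer: calls the closure exactly once.
fn do_something(f: impl FnOnce()) {
    f()
}

// `errors` is not declared `mut`, yet the closure may write through it:
// the compiler captures it by a unique borrow, not by `&mut errors`.
fn foo(errors: &mut usize) {
    do_something(|| *errors += 1)
}

/// Hypothetical driver so the effect can be inspected.
fn run() -> usize {
    let mut errors = 0;
    foo(&mut errors);
    errors
}

fn main() {
    println!("{}", run());
}
```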


Unboxed closures and procs


I foresee this restriction being a problem for unboxed closures. Let me elaborate on the design I was considering. Basically, the idea was that a || expression is equivalent to some fresh struct type that implements one of the Fn traits:


trait Fn<A, R> { fn call(&self, ...); }
trait FnMut<A, R> { fn call(&mut self, ...); }
trait FnOnce<A, R> { fn call(self, ...); }

The exact trait would be selected based on the expected type, as it is today. Consumers of closures could then write one of two things:


fn foo(&self, closure: FnMut<usize, usize>) { ... }
fn foo<T: FnMut<usize, usize>>(&self, closure: T) { ... }

We will probably want to tidy up the syntax, perhaps adding sugar such as FnMut(usize) -> usize, or keeping |usize| -> usize, and so on. That is not so important; what matters is that we would pass the closure by value. Note that under the current DST (Dynamically-Sized Types) rules it is legal to pass a trait type by value as an argument, so the FnMut<usize, usize> argument is valid DST and is not a problem.


Aside: this design is not complete, and I will describe all the details in a separate post.


The problem is that calling the closure requires a &mut reference. Since the closure is passed by value, users would again have to write mut where it looks out of place:


fn foo(&self, mut closure: FnMut<usize, usize>) {
    let x = closure.call(3);
}

This is the same problem as in the Env example above: what is really going on is that the FnMut trait just wants a unique reference, but since that is not part of the type system, it asks for a mutable reference.
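The same pattern is visible in the closure traits Rust has today. A minimal sketch using the stable `FnMut(usize) -> usize` sugar, where `apply_twice` and `demo` are made-up names: the by-value closure parameter must be declared `mut` purely so that it can be called.

```rust
/// Calls the closure twice and returns the second result.
/// `mut` is required on the parameter because calling an `FnMut`
/// borrows it mutably, even though `closure` is never reassigned.
fn apply_twice<F: FnMut(usize) -> usize>(mut closure: F) -> usize {
    closure(1);
    closure(2)
}

/// Hypothetical driver: the closure keeps a running total.
fn demo() -> usize {
    let mut total = 0;
    apply_twice(|x| {
        total += x;
        total
    })
}

fn main() {
    println!("{}", demo());
}
```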


Now, we can perhaps work around this in various ways. One option is to have the || syntax expand not to "some struct type" but rather to "a struct type or a pointer to a struct type, as dictated by type inference". In that case, one could write:


fn foo(&self, closure: &mut FnMut<usize, usize>) {
    let x = closure.call(3);
}

I do not mean to say that this is the end of the world. But it is one more step in the growing series of contortions we have to go through to maintain this gap between local variables and references.


Other parts of the API


I have not done an exhaustive survey, but naturally this distinction creeps in elsewhere. For example, to read from a Socket I need a unique pointer, so I have to declare it mutable. As a result, code like this sometimes does not work:


let socket = Socket::new();
socket.read() // ERROR: a mutable reference is required

Naturally, under my proposal such code would work fine. You would still get an error if you tried to read from a &Socket, but then it would say something like "cannot create a unique reference to a shared reference", which I personally find clearer.
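The same friction exists in today's standard library, where the methods of `std::io::Read` take `&mut self`. A minimal sketch, assuming a byte slice as a stand-in for the Socket in the text (the function name `read_all` is made up for illustration):

```rust
use std::io::Read;

/// Reads everything from an in-memory byte source.
fn read_all() -> String {
    // `mut` is required: `read_to_string` takes `&mut self`, even though
    // `source` is never assigned to by this code.
    let mut source: &[u8] = b"hello";
    let mut out = String::new();
    source
        .read_to_string(&mut out)
        .expect("the bytes are valid UTF-8");
    out
}

fn main() {
    println!("{}", read_all());
}
```

Removing `mut` from `source` makes the call fail to compile, which is the situation the Socket example describes.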


But don't we need mut for safety?


No, not at all. Rust programs would be just as sound if you simply declared all bindings as mut. The compiler is perfectly capable of tracking which local variables are being mutated at any point, precisely because they are local to the current function. What the type system really cares about is uniqueness.


The value I see in the current mut rules, and I will not deny that there is value, is primarily that they help declare intent. That is, when I read code, I know which variables may be reassigned. On the other hand, I also spend a lot of time reading C++ code and, frankly, have never noticed this being a major stumbling block. (The same goes for the time I have spent reading Java, JavaScript, Python, or Ruby code.)


It is also true that I have occasionally found bugs because I declared a variable as mut and then forgot to mutate it. I think we could get similar benefits from other, more aggressive lints (for example, one that checks whether none of the variables used in a loop condition are mutated in the loop body). I personally cannot recall ever running into the opposite situation: that is, if the compiler says something needs to be mutable, it basically always means that I forgot a mut keyword somewhere. (Think: when was the last time you responded to a compiler error about an illegal mutation by doing something other than restructuring the code to make the mutation legal?)


Alternatives


I see three alternatives to the current system:


  1. The one I have presented, where you simply throw out "mutability" and track only uniqueness.
  2. One with three reference types: &, &uniq, and &mut. (As I wrote, this is in fact the type system we have today, at least from the borrow checker's point of view.)
  3. A stricter variant in which "non-mut" variables are always considered shared. This would mean that you have to write:


    let mut errors = 0;
    let mut p = &mut errors; // Note that `p` must be declared as `mut`.
    *p += 1;

    You need to declare p as mut because otherwise the variable would be considered shared, even though it is a local variable, and therefore mutating *p would not be allowed. What is strange about this scheme is that the local variable is NOT shared, and we know that for certain, since it can be moved, have its destructor run, and so on. That is, we still have a notion of "owned" that is distinct from "not shared".


    On the other hand, if we described this system by saying that mutability is inherited through &mut pointers, without ever bringing up sharing, it might make sense.
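For comparison with the third alternative, here is a minimal sketch of how the same snippet behaves under the rules Rust has today, assuming a recent stable compiler (the function name `bump` is made up for illustration). Only `errors` needs `mut`; the binding `p` does not.

```rust
/// Increments a counter through a `&mut` reference held in a
/// binding that is not itself declared `mut`.
fn bump() -> i32 {
    let mut errors = 0;
    let p = &mut errors; // no `mut` on `p`: it is never reassigned
    *p += 1; // allowed, because `p` is a unique reference
    errors
}

fn main() {
    println!("{}", bump());
}
```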



Of these three, I definitely prefer #1. It is the simplest, and right now I am most interested in how we can simplify Rust while keeping its character. Failing that, I prefer the system we have right now.


Conclusion


Basically, I find that the current rules on mutability have some value, but they come at a cost. They are a kind of leaky abstraction: that is, they tell a simple story that turns out to be incomplete. This causes confusion as people move from the initial understanding, in which &mut is how mutability works, to the full understanding: sometimes mut is needed only to ensure uniqueness, and sometimes mutability is achieved without the mut keyword.


Moreover, we have to take care to maintain the fiction that mut means mutable, not unique. We have added special cases to the borrow checker for closures. We have to make the rules around &mut mutability more complex in general. We must either add mut to closures so that we can call them, or make the closure syntax expand in a less obvious way. And so on.


In the end, it all adds up to a more complicated language overall. Instead of just thinking about sharing and uniqueness, the user has to think about sharing and mutability, and the two are somehow tangled together.


I do not think it's worth it.

