Gradual programming
- Translation
Programming is essentially an incremental (or gradual, step-by-step) process, and the programming languages we use should reflect this fact. This article discusses several axes along which program models move as they develop, and also raises the question of how future research on the usability of programming languages could help shape the idea of human-oriented programming languages.
Choosing the right problems
What serious problems do the programming languages we use at work in 2018 have? Which of them, once solved, would have the greatest impact on the next generation of programmers?
If you are interested in this question, we recommend reading the post by Graydon Hoare (creator of Rust), “What next?”, as well as the post by Stephen Diehl, “Near Future of Programming Languages”. For me, this question captures the most attractive feature of programming language research: the tools and theories we develop affect not just one specific area, but potentially everyone involved in programming. This raises a further question: how on earth are we supposed to know the needs of every programmer? It is easy to work on language X, based on a new type theory, or on language Y, which has a new feature that interests me personally, but what about all the other programmers?
This is one of the most important shortcomings of programming languages as a modern field of research. A huge share of this research is driven by the intuition of the researchers themselves, which in turn is shaped by their particular experience with specific programming tools, languages, environments, and so on. Obviously, intuition has taken us quite far, since we were able to reach our current level, which supports the thesis that smart people tend to have good intuition. Still, let me suggest that the apparent stagnation in the widespread adoption of modern PL research is due, at least in part, to a lack of attention to the end user (in other words, to the ordinary programmer). An opinion I have come across more than once is that the last really new idea was Prolog.
It seems to me that looking at programming languages (PL) through the lens of human-computer interaction (HCI) is the most important meta-issue in the field today. More than ever, we need to conduct surveys and interviews, study user experience, and involve sociologists and psychologists, in order to formulate data-driven hypotheses about the “hard” parts of programming. We need to make programming comfortable not only for those who are just starting to learn it, but for everyone else too, from gray-haired low-level systems developers to young web developers. Interest in this area is already beginning to take shape: for example, the CHI conference hosts a Usability of Programming Languages Special Interest Group, papers such as "An Empirical Analysis of Programming Language Adoption" are being published, and working groups on language usability are forming.
However, while our knowledge of language usability is still limited, nothing prevents us from continuing to work on the key PL problems that, in our opinion, will bring tangible results. The manifesto I formulate below is based mainly on my personal experience: I have been programming for over ten years and have worked on game development (Lua, C), websites (PHP, JS), high-load / distributed systems (C++, Go, Rust), compilers (Standard ML, OCaml), and data science (Python, R). During this time I have worked on small scripts, personal projects, open source software, and products in tiny (2 people), small (15 people), medium (500 people), and large (2000+ people) companies, and I am now engaged in academic research. I studied programming language theory at Carnegie Mellon University, and I now teach the CS242 programming languages course at Stanford.
I say all this to make one thing clear: even though we need much more data to discuss these problems thoroughly, I have tried hard to form a reasoned opinion about problems in modern programming languages that matter across many fields and really do occur in the real world. Of course, there is much that I do not know, so, as usual, I suggest you read this post with a healthy dose of skepticism.
Thinking gradually
I firmly believe the following: programming languages should be designed to correspond directly to the programming process that takes place in the programmer's head. We should strive to understand exactly how people think about programs, and try to determine which programming processes the human mind grasps intuitively. There is no shortage of fascinating questions here, for example:
- Is imperative programming more intuitive for people than functional programming? If so, is this because our brain is configured this way, or simply because it is the most popular form of programming?
- How far should we go in striving to match the natural processes taking place in the human head? Or maybe, on the contrary, the balance should be shifted towards changing the way of thinking of programmers?
- How much do comments really affect our understanding of programs? Variable names? Types? Flow control?
Simply observing how people program shows that the process is incremental. No one writes an entire program from scratch in one pass, compiles it, releases it immediately, and never opens the code again. Programming means working for a long time by trial and error, where the length of each trial and the severity of each error depend heavily on the specific domain and tooling. This is why the ability to inspect output and fast compile times matter so much, for example, being able to modify an HTML document and immediately see the result by refreshing the page. Bret Victor discusses this idea in detail in his Learnable Programming article.
I call this process “gradual programming”.
I would use the term “incremental programming”, but incremental computing already has its own established meaning, different from mine, whereas the term “gradual” is already used among PL enthusiasts in this context.
While the imperative and functional paradigms focus on the aspects that underlie our mental model of a program, gradual programming describes the process by which that mental model is formed. In this sense, gradual programming is just ... programming; but it seems to me that the new term is appropriate here, since it will help us avoid confusion later on.
As far as I know, the only recorded use of the term “gradual programming” other than this article is this publication, but there the term is used in a slightly different sense. One of its authors is Jeremy Siek, one of the creators of gradual typing.
In gradual programming, the programmer tracks the co-evolution (parallel evolution) of two things: 1) the syntactic representation of the program, expressed for the computer in a programming language, and 2) the conceptual representation of the program, located in the programmer's head. At the beginning of this process, the programmer starts with no syntax at all (an empty file) and usually has only a very vague idea of how the resulting program should work. From this point, the programmer keeps taking small steps toward building the components of the program, until the final version is ready.
If you program, you have almost certainly gone through this process many times, and most likely recognized it in my description. However, most of our thought process usually occurs implicitly (that is, inside your head) and is never communicated explicitly. To make sure this gradual process exists, let us walk through an example in detail. Suppose I want to write a program that appends a line of text to a file. In my head, I have a model of the program that looks something like this:
input file = some other input
input line = some input
write input line to the end of input file
Then I decide in which language I will write this program - in our case it will be Python. To begin with, instead of trying to write the whole program at once, I just take the first line from the model and try to write it as it is in Python.
$ cat append.py
input_file = input()
print(input_file)
Here I made several decisions. First, I decided that the input would come from stdin (for simplicity), and used input(), a built-in Python function. I had to come up with a name for this value, input_file, and this name had to be consistent with Python's naming conventions. I also added a print statement that was not part of my original program model, but is part of a temporary program model meant for debugging my small program. Then I try to run it:
$ echo "test.txt" | python append.py
Traceback (most recent call last):
  File "append.py", line 1, in <module>
    input_file = input()
  File "<string>", line 1, in <module>
NameError: name 'test' is not defined
Oops, I mixed up input() and raw_input(). The problem was not with my program model, since I still think of the program in exactly the same way as before, but with my encoding of it in Python. I correct my mistake:
$ cat append.py
input_file = raw_input()
input_line = raw_input()
print(input_file, input_line)
$ printf "test.txt\ntest\n" | python append.py
('test.txt', 'test')
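The NameError in the first attempt comes from the fact that Python 2's input() evaluates the typed text as a Python expression, while raw_input() returns it as a plain string. The following is a minimal sketch of that difference, written for Python 3 (where input() already behaves like the old raw_input()); the helper names python2_style_input and python2_style_raw_input are hypothetical and exist only for this illustration.

```python
def python2_style_input(text):
    """Mimic Python 2's input(): evaluate the typed text as an expression."""
    return eval(text)


def python2_style_raw_input(text):
    """Mimic Python 2's raw_input(): return the typed text unchanged."""
    return text


# "test.txt" is parsed as the attribute .txt of a variable named test,
# and no such variable exists, hence the NameError from the traceback.
try:
    python2_style_input("test.txt")
    error_message = None
except NameError as err:
    error_message = str(err)

file_name = python2_style_raw_input("test.txt")
print(error_message)
print(file_name)
```

The mental model of the program is untouched by this mistake; only its translation into one specific version of one specific language changes.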
Next I have to figure out how to add a line to the end of the file. In my original mental model, this was encapsulated in the phrase “write input line to the end of input file”, but now it is time to turn this vague idea into more specific steps that I can easily express in Python. In particular, if I already understand how the file system works, then I know that I must first open the file in append mode, write the line, and then close the file.
After some thought, my mental model begins to look like this:
input file = some other input
input line = some input
file = open input file for writing in append mode
write input line to the end of input file
close file
So now, let's translate all of this into Python:
$ cat append.py
input_file = raw_input()
input_line = raw_input()
file = open(input_file, 'a')
file.write(input_line)
file.close()
$ printf "test.txt\ntest\n" | python append.py
$ cat test.txt
test
Success! Again, the goal of our example was to demonstrate the co-evolution of the syntactic and conceptual model of a program as you work on it. Based on my programming experience, along with teaching others to program, I can say that it provides a fairly common example of the thought process that accompanies the way many of us program.
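As a complement to the shell session, here is a small Python 3 sketch of the same three steps (open in append mode, write, close) that checks that a second call really adds to the end of the file instead of overwriting it. The function name append_to_file and the use of a temporary directory are assumptions made for this illustration; unlike the script above, the sketch also writes a trailing newline so that the two lines stay separate.

```python
import os
import tempfile


def append_to_file(path, line):
    # Mode "a" opens the file for appending and creates it if it is missing.
    # The with block closes the file, matching the "close file" step.
    with open(path, "a") as f:
        f.write(line + "\n")


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")
    append_to_file(path, "test")
    append_to_file(path, "test again")
    with open(path) as f:
        contents = f.read()

print(contents)
```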
Axes of evolution
The example described above showed us the gradual nature of the programming process, but did not shed any light on how we should approach the process of creating tools that would suit this process. To simplify our task, let's divide the evolution of the program into many small axes of evolution. Essentially, let's ask ourselves: what types of useful information about the program do developers gradually learn and understand? Then we can assume how programming languages can help optimize each axis individually.
1. Concrete / abstract
When creating programs, the generally accepted way of working is to start with a specific example that you are trying to implement, and then generalize (or abstract) this example, which is done so that it can cover a wider range of use cases. Abstraction is the cornerstone of programming languages, usually provided through functions and interfaces. For example, we can turn our script into a function:
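def append_line(input_file, input_line):
    # Our code from above: open in append mode, write, close.
    file = open(input_file, 'a')
    file.write(input_line)
    file.close()

append_line('test.txt', 'test')
append_line('test.txt', 'test again')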
However, the vaguer your model is at the start, the harder it is to jump straight to an abstract solution, so this evolution from concrete to abstract is often observed today when working with modern programming languages (again, see the section "Create by Abstracting" in Learnable Programming).
2. Anonymous / named
When we are at the start of the programming process, in the iteration / experimentation stage, it is natural that we as programmers want to optimize for the speed of writing code rather than reading it. One form of such optimization is short variable names and anonymous values. For example, a shortened version of the first version of our script might look like this:
s = raw_input()
f = open(s, 'a')
f.write(raw_input())
f.close()
Here, the variable names are less informative than before: we use s instead of input_file and f instead of file, while input_line has lost its name altogether. However, if this is faster to write and no one will ever read the script again, why not? If we plan to keep using this script in a large code base, we can incrementally change the names to more informative ones so that the code reviewers are satisfied. This is another example of a gradual change that is easy to put into practice and is common among programmers who use modern programming languages.
3. Imperative / declarative
For a variety of reasons, straight-line, sequential imperative code seems to come more naturally to programmers than functional / declarative code, at least in terms of their conceptual model of a program. For example, a simple list transformation will almost certainly be written with a for loop:
in_l = [1, 2, 3, 4]
out_l = []
for x in in_l:
    if x % 2 == 0:
        out_l.append(x * 2)
The more declarative version, by contrast, abstracts the control flow into domain-specific primitives:
in_l = [1, 2, 3, 4]
out_l = map(lambda x: x * 2, filter(lambda x: x % 2 == 0, in_l))
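The two versions above are written in Python 2 style. As a hedged sketch for Python 3, where map and filter return lazy iterators, the snippet below checks that the loop, the map/filter pipeline, and a list comprehension produce the same result, and then hands the same map to a thread pool to show that each element can be processed independently. The helper names is_even and double are introduced only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def is_even(x):
    return x % 2 == 0


def double(x):
    return x * 2


in_l = [1, 2, 3, 4]

# Imperative version: explicit loop and accumulator.
loop_out = []
for x in in_l:
    if is_even(x):
        loop_out.append(double(x))

# Declarative version; list() forces the lazy iterators of Python 3.
map_out = list(map(double, filter(is_even, in_l)))

# List comprehension: the idiomatic middle ground in Python.
comp_out = [double(x) for x in in_l if is_even(x)]

# The same map distributed over worker threads; results keep input order.
with ThreadPoolExecutor(max_workers=2) as pool:
    pool_out = list(pool.map(double, filter(is_even, in_l)))

print(loop_out, map_out, comp_out, pool_out)
```

Swapping map for pool.map required no change to the transformation itself, which is harder to achieve with the accumulator loop.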
The difference between these two approaches is not only stylistic: declarative code is usually much easier to analyze for structure. For example, a map can be parallelized trivially, while a general for loop is much harder to parallelize. Such transformations most often occur in languages that support a mixture of imperative and functional code (at a minimum, closures).
4. Dynamically typed / statically typed
The rise in popularity of dynamically typed languages over the past 20 years (Python, JavaScript, R, Lua, ...) should be sufficient evidence that people find dynamic typing useful; whichever side of the barricade you are on, the fact remains. Dynamic typing has many advantages (heterogeneous data structures, free duck typing, etc.), but the simplest one is productivity through omission: the types of variables do not need to be known at compile time, so the programmer does not have to spend mental energy on them.
However, types remain extremely useful tools for ensuring correctness and performance, so a programmer may want to gradually add type signatures to an untyped program once he is sure that a variable must have a specific type. This still-young idea, called optional or gradual typing, has already gained acceptance in Python, JavaScript, Julia, Clojure, Racket, Ruby, Hack, and other languages.
For example, our program after rewriting might look like this:
input_file: String = raw_input()
input_line: String = raw_input()
file: File = open(input_file, 'a')
file.write(input_line)
file.close()
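The annotations above use illustrative type names. As a rough sketch of how the same idea looks with Python 3's actual optional type hints, the function below annotates its parameters with str and its file handle with typing.TextIO, while the calling code is left unannotated. The function name append_line_typed and the temporary directory are assumptions for this example; the hints are not enforced at run time and only matter to an external checker such as mypy.

```python
import os
import tempfile
from typing import TextIO


def append_line_typed(input_file: str, input_line: str) -> None:
    # Annotated part of the program: a static checker can verify these types.
    file: TextIO = open(input_file, "a")
    file.write(input_line)
    file.close()


# Unannotated part of the program: it still runs, just without static checks.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")
    append_line_typed(path, "test")
    with open(path) as f:
        typed_contents = f.read()

annotations = append_line_typed.__annotations__
print(typed_contents, annotations)
```

Typed and untyped code coexist in one file here, which is the kind of mixing that gradual typing is meant to allow.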
5. Dynamically deallocated / statically deallocated
You can look at memory management, or lifetimes, through the same prism through which we looked at types. In 2018, all programming languages should be memory safe; the only question is whether memory deallocation is determined at compile time (as in Rust with its borrow checker) or at run time (as in any language with a garbage collector). Garbage collection is undoubtedly a big plus for programmer productivity, so it is natural that our initial program model need not specify how long each value should live before it is deallocated.
However, as before, fine-grained control over the lifetime of a value is still useful for correctness and performance. Ownership and borrowing, as implemented in Rust, can help structure a system to eliminate data races in concurrent programming, as well as avoid the need for a garbage collector at run time.
Extending our typed example, this might look like this:
input_file: String = raw_input()
input_line: String = raw_input()
file: mut File = open(&input_file, 'a')
file.write(&input_line)
file.close()
As far as I know, in contrast to gradual typing, no work is being done on gradual (optional) memory management (with the exception of this publication).
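Python has no borrow checker, so the annotated example above cannot be run as written. As a loose analogy only, and not an implementation of gradual memory management, the sketch below contrasts a resource whose release point is fixed by the structure of the code (a with block) with one whose release is left to an explicit call made later at run time. The variable names and the temporary directory are assumptions for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.txt")

    # Scope-based release: the file is closed where the block ends,
    # which is visible from the program text alone.
    with open(path, "a") as scoped_file:
        scoped_file.write("test")
    closed_after_block = scoped_file.closed

    # Release decided at run time: the file stays open until some
    # later code closes it (or the runtime reclaims it).
    unscoped_file = open(path, "a")
    open_until_closed = not unscoped_file.closed
    unscoped_file.close()

print(closed_after_block, open_until_closed)
```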
6. General purpose / domain-specific
When a programmer begins to write a program, he wants every feature of the language to be available to him during implementation, in order to maximize prototyping speed and the productivity of the creative process. Few people usually think about this during software development, except perhaps in terms of coding style (“which subset of Python should I use?”).
However, a growing wave of high-performance domain-specific languages such as TensorFlow, Halide, Ebb, and Weld points to the fact that if a programmer uses only a small subset of a general-purpose language (for example, differentiable pure functions), then the optimizer can produce significantly more efficient code. From the point of view of gradualness, this suggests a future workflow in which the programmer gradually narrows the subset of the language used in a particular part of the program, so that the compiler can provide a much better optimized backend for it.
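To make "narrowing the subset of the language" slightly more concrete, here is a minimal sketch that uses Python's standard ast module to report which constructs in a piece of source code fall outside a small, made-up "pure arithmetic" subset. The allowed-node list and the function name nodes_outside_subset are assumptions for this illustration; real systems such as those named above define their subsets in their own ways.

```python
import ast

# Hypothetical subset: function definitions, names, constants, arithmetic.
ALLOWED_NODES = (
    ast.Module, ast.FunctionDef, ast.arguments, ast.arg, ast.Return,
    ast.BinOp, ast.UnaryOp, ast.Name, ast.Load, ast.Constant,
    ast.operator, ast.unaryop,
)


def nodes_outside_subset(source):
    """Return the sorted names of AST node types not in the subset."""
    tree = ast.parse(source)
    return sorted({
        type(node).__name__
        for node in ast.walk(tree)
        if not isinstance(node, ALLOWED_NODES)
    })


pure_src = "def f(x):\n    return x * 2 + 1\n"
impure_src = "def g(x):\n    print(x)\n    return x\n"

pure_report = nodes_outside_subset(pure_src)
impure_report = nodes_outside_subset(impure_src)
print(pure_report, impure_report)
```

A tool built on this idea could accept a function for an optimized backend only when the report is empty, and otherwise tell the programmer which constructs to remove.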
Concept of gradual programming
Not that these axes are a new idea: the trade-off between static and dynamic typing, for example, has been known for a long time. What I wanted to demonstrate, however, is that these decisions are not made once and for all: they can change as the program develops. All of these axes are therefore most likely 1) changing as an individual program evolves, and 2) fine-grained, meaning that, for example, typed and untyped code must be able to mix within the same system. This runs counter to the all-or-nothing approach that most languages adhere to today: everything must either be typed or untyped; everything is either garbage collected or not collected at all.
In light of this, advancing gradual programming involves the following research process:
- Identify parts of the programming process that change gradually over time but today require unjustified overhead or a switch between languages.
- Develop language mechanisms that let programmers move gradually along a specific axis within a single, uniform programming environment.
- Test empirically, with real programmers, whether the proposed mechanisms work in practice and whether the results match the hypotheses.
Each of these steps requires further research. I gave an initial analysis of my perspective on the important incremental parts of the programming process, but I inevitably missed many others. For some of the axes I mentioned (memory management, language specialization), there are still no documented attempts to systematize the attitude towards them at the language level. I think that work on extensible compilation will help speed up the development of language extensions on these fronts.
Even for more well-trodden areas like gradual typing, papers began to appear in 2016 whose authors asked “Is Sound Gradual Typing Dead?” (it is alive and well, thank you very much for your concern). CircleCI dropped gradual typing in Clojure after two years of use. Even if we assume that the theory itself is well understood and that performance is improving, there is no practical information on how programmers interact with optional (gradual) types. Is it easy to write programs using this kind of typing? Are partially typed programs more confusing than fully typed or untyped programs? Can an IDE solve any of the problems listed here? And so on: we have no answers to these questions.
Another important issue in gradual programming is the choice between inference and annotation. As our compilers get smarter, it becomes easier for them to infer information such as types and lifetimes when the programmer has not specified it explicitly. However, inference engines are far from perfect, and when they fail, every inference-based language feature (as far as I know) requires an explicit annotation from the user rather than inserting appropriate dynamic checks.
I imagine it this way: gradual systems have three modes of operation. For any specific kind of program information (for example, a type), it is either explicitly annotated, inferred, or deferred to runtime.
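Python performs no static inference, so the following sketch can only illustrate two of the three modes: explicit annotation and deferral to runtime. It uses a hypothetical decorator, check_annotated, that verifies annotated parameters with isinstance when the function is called and leaves unannotated parameters unchecked. It assumes that every annotation is a plain class such as str or int.

```python
import functools
import inspect


def check_annotated(func):
    """Check annotated parameters at call time; skip unannotated ones."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            expected = signature.parameters[name].annotation
            if expected is inspect.Parameter.empty:
                continue  # no annotation: nothing is checked for this value
            if not isinstance(value, expected):
                raise TypeError(
                    f"{name} should be {expected.__name__}, "
                    f"got {type(value).__name__}"
                )
        return func(*args, **kwargs)

    return wrapper


@check_annotated
def repeat(text: str, times):
    # text is explicitly annotated; times is left without an annotation.
    return text * times


ok_result = repeat("ab", 2)

try:
    repeat(3, 2)
    rejected = False
except TypeError:
    rejected = True

print(ok_result, rejected)
```

In a gradual system with inference, the unannotated times parameter is where the compiler would either infer a type or fall back to a check like this one.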
This question is interesting in itself from the point of view of HCI (human-computer interaction): how effectively can people program in a system where a missing type annotation may or may not be inferred? How does this affect usability, performance, and correctness? Most likely, all of these issues will become another important area of research for gradual programming.
In general, I welcome the wide opportunities that gradual programming methods can provide us with. When they begin to gain popularity, programmers of all skill levels will be able to benefit from languages that better suit their way of thinking.
Comments can be sent to the author by email or left on Hacker News.