mou March 29, 2009 at 09:57

Seaside 2.9: Partial Continuations

Translation

Some time ago a post about continuations by Habr user qmax appeared on Habr. The idea was very impressive, but the post did not explain it in detail. Recently Julian Fitzell, one of the Seaside developers, wrote a remarkably clear article on the subject. With his permission, I have translated it and would like to share it with the Habr community.

First, a word about terminology. I translate the word "continuation" with the term closest to it in meaning. The article's general terminology may seem unusual to a developer with no Smalltalk experience: it speaks of a "chain of contexts" rather than a call stack, and of a "process" rather than a thread. If you still have questions after reading, feel free to ask them in the comments. Thanks.

This is the second post in a series about the upcoming release of Seaside. Take a look at the first post, on exception handling.

Continuations in Seaside

Seaside is often referred to as a “continuation-based” web framework, and indeed, in its early days continuations were used all over the place to work magic. Seaside 2.8 still uses first-class continuations (I'll explain what that means a bit later) in three different cases:

to stop processing a request and immediately return a response;
to interrupt the execution of code and resume it after the user clicks a link or follows a redirect (for example, to set cookies for the user);
to implement the call/answer mechanism for components.

However, the upcoming release of Seaside will completely eliminate the use of continuations in the core of the framework. The first of these cases will be re-implemented using exceptions, and the code for the second and third cases will be moved to an optional, separately installable package. This means that you can install Seaside without using continuations at all, which should improve portability to Smalltalk dialects that currently do not support continuations.

At the same time, we will also replace first-class continuations with partial continuations, and this article should give an idea of what that means and why we are making these changes. All this can be mind-bending (especially during debugging!), so do not worry: let the information settle, then come back and re-read it. I have simplified several things at the expense of detail, in the hope of making the topic more approachable for people who are put off by the very idea of how continuations work. I welcome any feedback on how well I managed to strike this balance.

What are continuations?

First of all, when I mention continuations, I mean first-class continuations. Seaside also uses continuation-passing style to implement the rendering cycle (this is the _k parameter that you see in the URLs generated by Seaside). This is a closely related concept, but not what I will be talking about here.

Continuations are often defined as “the rest of the computation,” but I find this definition a bit fuzzy if you do not already understand the idea. For me, the simplest explanation is that a continuation saves a “snapshot” of a running process, which can be resumed later. You call a method that calls another method that calls another method, and so on, and then you take a snapshot of this chain of calls and save the snapshot object somewhere. Later, at any time, you can restore it, abandoning the code that is currently executing, and your program will continue to run from the very place, in the very method, recorded in the “snapshot”. This is a first-class continuation.

Smalltalk users may find this easier to understand, because when you save a Smalltalk image and open it later, you see exactly the same picture as when you saved it. You can open the saved image as many times as you want, and each time you will return to the same state. If you save the image to a new file, you can still return to the old one. Continuations do essentially the same thing, except that instead of the whole image they save a single process.

Implementation of “Call and Answer”

One of the most striking features of Seaside is the ability to write multi-step tasks that require user interaction in an ordinary linear style:

answer := self confirm: 'Do it?'.
answer ifTrue: [ self doItAlready ]

This is exactly the kind of thing that becomes easier with continuations: we want to stop in the middle of the method and ask the user for input. When the user answers, we want to continue execution from where we left off. Now let's see how first-class continuations can be used to achieve this.

How to read the diagrams

A small digression. The following diagrams depict chains of contexts (although they are abstract enough that you could also call them stacks of frames). Each time you call a method or evaluate a block, a new context is created at the “top” of the chain. Each time a method returns a value or a block finishes, the context at the “top” is removed. A method context knows which method it belongs to, which object it was called on, and the values of the variables defined in that method. It also knows the context below it in the chain. If you need help understanding this process, take a look at the illustration, which depicts everything step by step.
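The same idea can be observed in other languages. The following is a minimal Python sketch, not Smalltalk and not Seaside code: it walks the chain of call frames from the top down, using the fact that each CPython frame holds a reference to the frame below it. The function names (read_request, dispatch_callback, user_callback) are hypothetical and only stand in for framework and user code.

```python
import sys

def context_chain():
    """Return the names of the active calls, top of the chain first."""
    frame = sys._getframe(1)  # CPython-specific: start at the caller of this helper
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # the context "below" this one in the chain
    return names

def read_request():           # hypothetical framework code
    return dispatch_callback()

def dispatch_callback():      # hypothetical framework code
    return user_callback()

def user_callback():          # hypothetical user code
    return context_chain()

chain = read_request()
print(chain[:3])  # ['user_callback', 'dispatch_callback', 'read_request']
```

The last framework call appears lowest in the printed slice, just as the framework contexts sit below the user code in the diagrams.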

The following diagrams represent the chain of contexts for processing a single HTTP request. Each request is the result of a click on a link that triggers a callback. Each callback ultimately sends either #call: or #answer:.

The diagrams show the chain of contexts at the moment #call: or #answer: is sent, and depict what happens next. Up arrows show progress as methods are called, and down arrows show progress as they return. I depict exceptions as a dashed arrow whose tail is at the place where the exception was signaled and whose head points to the place where it is handled. When a continuation has been saved, both chains are shown on the diagram: the one being executed now and the saved one, with the arrows drawn as usual. Obviously, these are very simplified illustrations: I am more interested in conveying the general idea than the specific details.

For clarity, each diagram is marked with a gray bar. Everything above it is user code: the part of the callback that will be executed. Everything below the bar is part of the framework: reading from a socket, managing a session, and so on.

A Naïve Implementation

OK, let's take a look at one possible implementation using continuations. Imagine that a user is on a web page containing a “do it” link. Clicking the link runs the callback given above as an example, which should ask the user “Do it?”. While this request is being processed, the following happens:

The framework searches for the correct callback and executes it.
During the callback execution (inside the #confirm: method in the example above), the message #call: is sent.
The resulting chain of contexts is saved in a continuation for later use.
An exception is thrown that stops the callback processing and returns control to the framework.
The framework continues to work and returns a response to the browser (in Seaside, the rendering phase is performed to display the components in the response, but I simplify it a bit here).

As a result, the browser should display a “Do it?” prompt and a link or button to confirm the action. When the user clicks this link (or button), a callback is triggered that executes self answer: true. When this second request is received, the following happens:

The framework searches for the corresponding callback and executes it.
The callback sends the message #answer:.
The current chain of contexts is discarded and the one we saved in the continuation is restored in its place. Note that this means the method returns a second time. This is strange, of course, but no stranger than saving a Smalltalk image right in the middle of a computation. Each time you open the image, you will see the result of the same computation.
Now that we have restored the previous chain of contexts, execution continues in the first callback as if our #call: (the place where we saved the continuation) had just returned.
The restored callback completes its execution (in our example, it checks the value of the user's response and sends #doItAlready).
The framework sends a response to the browser.

But there is a problem, and that is why I called this implementation naïve. As you can see, the response is mistakenly returned to the first request. The socket associated with the first request, unfortunately, was closed long ago, and the browser is no longer waiting for a response on it. The browser is waiting for a response on the socket associated with the second request, and that response will apparently never arrive. Oops!

(Almost) Working Call and Answer

So, the first implementation does not work, but I hope it showed what happens with continuations. The problem is that when we restore the continuation, we do not want to throw away absolutely everything the framework has done. At a minimum, we need to keep the context that will return the response to the correct socket.

An easy way to limit the number of contexts captured by a continuation is to create a new process. A new process starts with a new, empty chain of contexts, so when we create a continuation, only the contexts in this chain are captured. We can use a semaphore to make the first process wait while the new one processes the request. When the second process finishes, it signals the semaphore, and the original process returns the response to the correct socket.

The following diagram depicts this scheme (contexts of different processes are represented by different symbols):

At some point in the framework code, a new process is created, and the original one waits for the semaphore to be signaled.
The new process finds and executes the corresponding callback.
The callback sends the message #call:.
The continuation is saved (note that this time the continuation begins at the starting point of the new process).
An exception is signaled, which stops the callback processing and returns control to the framework.
The framework creates a response for the browser and signals the semaphore.
The original process resumes and returns the response to the browser.

So far, the only advantage is that the continuation is smaller. But when the second request arrives, it becomes obvious how this approach solves our problem:

At some point in the framework code, a new process is created, and the original one waits for the semaphore to be signaled.
The new process finds and executes the corresponding callback.
The callback sends the message #answer:.
The current chain of contexts is discarded and the one we saved in the continuation is restored (but note that this time only the contexts in the spawned process are discarded, and the waiting process remains unaffected).
After we have restored the saved chain of contexts, execution continues as if the #call: had just returned.
The callback completes its execution.
The framework creates a response for the browser and signals the semaphore, informing the parent process that its work is done.
The original process resumes, this time correctly returning the response to the browser.

Now we have not only made the continuation smaller, but have also ensured that the response to the second request is returned as intended. This is the implementation that was used in Seaside 2.8 and earlier.
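The hand-off between the two processes can be sketched in Python with a thread and a semaphore. This is only an illustration of the waiting and signaling described above, under the assumption that a Python thread stands in for a Smalltalk process; it does not save or restore a continuation, and handle_request is a hypothetical name, not Seaside's API.

```python
import threading

def handle_request(callback):
    """Framework side: run the callback in a new thread, reply from the original one."""
    done = threading.Semaphore(0)  # starts with nothing to acquire
    result = {}

    def worker():
        # The worker starts with a fresh, empty call stack of its own.
        result["body"] = callback()
        done.release()             # signal the semaphore

    threading.Thread(target=worker).start()
    done.acquire()                 # the original thread waits for the signal
    # The response is produced by the thread that owns the request's socket.
    return "response: " + result["body"]

print(handle_request(lambda: "Do it?"))  # response: Do it?
```

Note that if the callback raised an error, the worker would die without signaling and the original thread would wait forever, which hints at why this design needs extra care.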

But there are a number of significant problems:

Interprocess communication increases the complexity of the system.
Exceptions cannot cross the boundary at which the new process was created. Indeed, if you signal an exception, the first process will never know about it (technically this can be overcome, and you can simulate the behavior to some extent, but that complicates the system even more). This means that error handling must be done entirely within the spawned process. This also adds difficulty, for example, when working with a database that uses exceptions to mark objects as “dirty”, or to indicate the transaction status of the current process.
Exceptions signaled after a continuation is restored propagate through the restored chain of contexts. Likewise, when the exception is handled, it is the restored chain of contexts that is unwound, not the one that was discarded. Look at the framework contexts colored red in the last diagram: they never get a chance to finish executing, and any ensure blocks they define will never run. Believe me when I say that this can give rise to some insidious bugs.
A compromise has to be found between size and accuracy with respect to the second and third problems. If you start the new process immediately before executing the callback, you get a very small continuation and more accurate exception handling. Unfortunately, your exceptions then cannot propagate far enough, and the code may finish executing in a completely different place, for example in the rendering phase.
Debugging turns into a nightmare (well, at least in Squeak) when the code depends on the running process. I am not sure whether debuggers could be taught to step directly into the process in which the error occurred, but at the very least they cannot do so reliably today.
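The point about exceptions not crossing the process boundary has a direct analogue in Python threads, shown in the hedged sketch below. It assumes Python 3.8 or later for threading.excepthook, and the worker function and error message are made up for the illustration.

```python
import threading

seen_by_parent = []
seen_by_hook = []

def hook(args):
    # Called in place of the default handler for uncaught thread exceptions.
    seen_by_hook.append(args.exc_type.__name__)

threading.excepthook = hook  # available since Python 3.8

def worker():
    raise ValueError("object marked dirty")  # hypothetical error in the spawned thread

try:
    t = threading.Thread(target=worker)
    t.start()
    t.join()
except ValueError:
    # Never reached: the exception stays inside the worker's own call stack.
    seen_by_parent.append("ValueError")

print(seen_by_parent, seen_by_hook)  # [] ['ValueError']
```

The handler in the parent never fires, so any error handling has to live inside the spawned thread, just as described for the spawned process.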

Partial Continuations

Partial continuations mean that instead of saving the entire chain of contexts, we save only the part that is of interest to us. And when we restore a partial continuation, we replace not the entire chain but only the part that is no longer of interest. Let's take a look at how this works.

When the first request arrives, everything happens exactly as in the first example, so I won’t go through it step by step, except for one thing: with partial continuations, we can specify the exact range of contexts to save in the continuation. In this case, we save only the contexts that are part of the user code, the callback. Remember the problem with the first implementation? The framework code processes one specific request; its contexts are absolutely useless when processing any other request (even for the same URL, it is still a new request). Since a callback can span several HTTP requests during its execution, we only need to save the callback contexts, which do not depend on any particular request, for future restoration.

Remember also that the chain of contexts in real life can be much longer than shown in these diagrams, so we might save 5 contexts instead of, say, 40. Not a bad saving.

Now let's take a look at how the second request is processed. This illustration is slightly different and more complex because the chain of contexts changes at runtime, so I will cover it step by step:

The request is being processed.
The framework finds the appropriate callback and executes it.
The callback sends the message #answer:.
The saved partial continuation is then looked up, and its contexts are literally “grafted” in place of the current callback contexts, with their senders rewritten. I am hand-waving over the details here, but trust me, this is really what happens. The right side of the diagram shows the state after the “graft” is complete. Notice that all the framework contexts are left untouched, and we are still in the original process.
Execution of the saved callback continues as if the #call: had just returned.
As soon as the restored callback finishes executing, it returns control (because we replaced the senders) directly to the framework code that is processing the current request.
Next, a response will be generated and transmitted through the appropriate socket to the browser.

Magic! I'm sure it looks that way, but it works just fine. As a result, we have small continuations, we don’t need to create a new process, and all the framework code gets a chance to finish executing properly.
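A rough analogy for this behavior exists in Python generators: a suspended generator holds only its own frame, and when it is resumed it returns to whoever resumed it. The sketch below uses that to mimic the two requests described above. It is an analogy under stated assumptions, not Seaside's mechanism: generators can be resumed only once from a given point, and the names saved, handle_request and the key "k1" are hypothetical.

```python
saved = {}  # hypothetical session store: key -> suspended user code

def callback():
    # User code in a linear style; `yield` plays the role of #call:.
    answer = yield "Do it?"
    return "done" if answer else "skipped"

def handle_request(socket, key=None, answer=None):
    # Framework frames: created anew for each request and never saved.
    gen = callback() if key is None else saved.pop(key)
    try:
        # Run (or resume) only the user-code part on top of this request's stack.
        page = next(gen) if key is None else gen.send(answer)
        saved["k1"] = gen           # suspended again: keep just the user-code part
    except StopIteration as finished:
        page = finished.value       # the callback ran to completion
    socket.append(page)             # the reply goes to the current request's socket

first_socket, second_socket = [], []
handle_request(first_socket)                          # first request: shows the prompt
handle_request(second_socket, key="k1", answer=True)  # second request: resumes the callback
print(first_socket, second_socket)  # ['Do it?'] ['done']
```

The second response lands on the second socket because the resumed user code returns into the framework frames of the request that resumed it, with no extra thread involved.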

Conclusion

The partial continuation solution is currently implemented in the development version of Seaside and will be included in the next release. Squeak and VisualWorks already have implementations of partial continuations in the code. GemStone is close to completing an implementation in its VM. Dialects that cannot implement partial continuations have a choice:

they can simulate partial continuations, with varying degrees of completeness, using first-class continuations;
they can continue to use a system similar to the one in Seaside 2.8;
they can do without them entirely. As I noted above, we have removed the use of continuations from the Seaside core: platforms can simply not support #call:, which is now easy, since all they need to do is not provide the Seaside-Flow package.

I hope this was useful and interesting reading, and I would be grateful for your comments on anything that seemed difficult or that helped your understanding. Happy Seasiding.
