powerman June 24, 2014 at 23:22

Mojolicious Documentation: Lost Chapters

Tutorial

Update: The article has been updated to comply with Mojolicious 6.0.

Mojolicious is a delightful modern web framework for Perl. I can name only two shortcomings: its backward compatibility policy and its documentation.

This series of articles assumes that the reader already has a passing familiarity with the framework and needs to understand details that the documentation either omits or does not describe fully and clearly. The official documentation (in English) is ideal for a first introduction.

Shortcomings

The official FAQ says: "… we will always deprecate a feature before removing or changing it in incompatible ways between major releases … as long as you are not using anything marked experimental, untested or undocumented, you can always count on backwards compatibility …". For a start, the second sentence contradicts the first. Next, here is a quote from Guides::Contributing: "Features may only be changed in a major release or after being deprecated for at least 3 months." Frankly, 3 months is already a laughable period when it comes to backward compatibility, but it seems that even this period is not always observed (support for "X-Forwarded-HTTPS" was deprecated two months ago and removed a month ago; yes, it was a major release, so the rules were not formally violated, but the general attitude towards backward compatibility is telling). How many developers update the framework more often than once every 3 months, and also carefully read Changes or their application logs for deprecation warnings? Meanwhile, about 20 functions or features were deprecated during the last year. In practice, of course, things are not as bad as they sound: something actually breaks not that often (personally, in the last year I was only affected by the replacement of $app->secret() with $app->secrets()). But the fact remains: backward compatibility gets broken, broken often, and without really good reasons. For example, in the case of secret(), absolutely nothing prevented adding this to the code:

sub secret { shift->secrets([shift]) }

or simply adding support for extra parameters to secret() instead of introducing the new secrets() function, which would have implemented the desired feature without breaking compatibility at all.
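To make the point concrete, here is a minimal, self-contained sketch of such a compatibility shim. The class My::App and the way it stores its data are hypothetical stand-ins, not Mojolicious code; only the idea (the old single-value accessor delegating to the new list-based one) comes from the one-liner above.

```perl
use strict;
use warnings;

# Hypothetical class, used only to illustrate the shim; not Mojolicious code.
package My::App;

sub new { return bless { secrets => ['default'] }, shift }

# New-style accessor: gets or sets a reference to a list of secrets.
sub secrets {
    my $self = shift;
    return $self->{secrets} unless @_;
    $self->{secrets} = shift;
    return $self;
}

# Old-style accessor kept for compatibility: it delegates to secrets().
sub secret {
    my $self = shift;
    return $self->secrets->[0] unless @_;
    return $self->secrets([shift]);
}

package main;

my $app = My::App->new;
$app->secret('old-style call');           # legacy code keeps working
print $app->secrets->[0], "\n";
$app->secrets(['new', 'rotated-out']);    # new code can pass several values
print $app->secret, "\n";
```

With a shim like this, callers of the old accessor and callers of the new one read and write the same underlying list.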

As for the documentation, many consider it excellent, even one of the major strengths of Mojolicious rather than a drawback. The problem with the documentation is that it is entirely example-oriented. This is really great when you start learning the framework. It saves a lot of time when you need to build a feature and can quickly google an example of a similar feature in the official guides. But as soon as you step outside the standard tasks and need to understand how something works conceptually or architecturally, exactly which parameters a function accepts and exactly what it can return in different situations, it turns out that for many Mojolicious modules such documentation simply does not exist. And not because this information concerns "undocumented features": almost all of it is mentioned briefly here and there in various examples, which means it counts as "documented". Often there are several ways to access the same data (request parameters, response body, etc.), but nothing describes how they differ or which one is the right choice in which situation. And finally: alphabetical order of functions in the docs, really?! I understand that people differ and that someone probably finds it convenient, but for many it is much easier to take in documentation where functions are grouped by task. (Although in the code, especially when reading it in a browser, where search is less convenient than in Vim, the alphabetical order of functions unexpectedly turned out to be quite handy, though new/DESTROY/AUTOLOAD are still better placed at the beginning.)

As a result, you have to read the code to figure things out (some prefer to read the tests instead!), which is not that simple. First, the code is no model of readability: the author likes Perl tricks that make the code compact (and such code often runs faster), but readability suffers. Second, the heavy use of both inheritance and event exchange between objects makes it harder to understand what is going on inside the 104 classes that make up Mojolicious-5.

We can do little about the backward compatibility problem (although one could probably write a Mojolicious plugin that emulates it where possible). But the second problem is easy to solve: you can write the missing documentation yourself. As I study Mojolicious, I plan to describe things that, by rights, should be in the official documentation, hence the title of this article.

$self

The Mojolicious documentation often uses $self, which hurts readability: there are too many classes in the framework, and it is far from always obvious from $self which class the object in a given example belongs to. Therefore, instead of $self, I will use the following in the examples:

$app    # YourApp → Mojolicious
$r      # Mojolicious::Routes ($app->routes)
$c      # YourApp::SomeController → Mojolicious::Controller
$ua     # Mojo::UserAgent

Routing: internals

The first thing to understand about how routing works in Mojolicious is that it is implemented as a tree of nodes, and the structure of this tree is (almost) unrelated to the path hierarchy in the URL. Consider an example:

$r->get("/a/b/c/d")     ->to(text=>"1");
$ab = $r->route("/a")->route("/b");
$ab->get("/c")          ->to(text=>"2-1");
$ab->get("/c/d")        ->to(text=>"2-2");
$r->get("/a/b/c/d/e")   ->to(text=>"3");

As a result, the following tree is built:

$r {}
 ├─/a/b/c/d {text=>"1"}
 ├─/a───/b─┬─/c {text=>"2-1"}
 │         └─/c/d {text=>"2-2"}
 └─/a/b/c/d/e {text=>"3"}

And here is how it will work:

GET /a/b => 404 Not Found
GET /a/b/c => "2-1"
GET /a/b/c/d => "1"
GET /a/b/c/d/e => "3"
GET /a/b/c/d/e/f => 404 Not Found

As you might guess, the tree is traversed sequentially (depth-first) until the first successful match with the current request. So if your routing definition has nodes that match the same requests, watch carefully where they sit in the tree: the order in which they are matched against requests may differ from the order in which they are written in the code.
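The behaviour described above can be checked against your installed version with a short script. This is only a sketch: it rebuilds the same tree in a bare Mojolicious object, uses any() for the two intermediate nodes on the assumption that it creates the same kind of method-agnostic node as route(), and sends the requests through the user agent's embedded server. The %seen hash is just a helper for this script.

```perl
use strict;
use warnings;
use Mojolicious;
use Mojo::UserAgent;

my $app = Mojolicious->new;
$app->log->level('fatal');    # keep log messages out of the output
my $r = $app->routes;

# Same tree as in the example above; any() stands in for route() here.
$r->get('/a/b/c/d')->to(text => '1');
my $ab = $r->any('/a')->any('/b');
$ab->get('/c')->to(text => '2-1');
$ab->get('/c/d')->to(text => '2-2');
$r->get('/a/b/c/d/e')->to(text => '3');

# Relative URLs are dispatched to $app via the embedded server.
my $ua = Mojo::UserAgent->new;
$ua->server->app($app);

my %seen;
for my $path (qw(/a/b /a/b/c /a/b/c/d /a/b/c/d/e /a/b/c/d/e/f)) {
    my $res = $ua->get($path)->res;
    # Remember the body for successful responses, the status code otherwise.
    $seen{$path} = $res->code == 200 ? $res->body : $res->code;
    print "GET $path => $seen{$path}\n";
}
```

Swapping the order of the route definitions and rerunning the script is a quick way to see which node wins when several could match.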

The second thing to understand is that only the leaves (terminal nodes) of the tree handle incoming requests. Intermediate (internal) nodes never handle requests, regardless of how they were created (via plain route(), get(), etc.) and whether a request handler ("controller#action", {cb=>\&handler}, etc.) was set for them.

For example, create a tree using get():

$r->get("a", {text=>"A"})->get("b", {text=>"B"});

GET /a => 404 Not Found
GET /a/b => "B"

Or we can create no nodes at all and instead just configure the existing root node:

$app->routes->to(text=>"wow");

GET / => "wow"

The only case when a handler set on an intermediate node is used is when that node is an under-node. Such nodes are created via under(), or you can turn an existing node into an under-node by calling inline(1). Once the terminal node that should handle the current request has been determined, the handlers of all under-nodes on the path from the root of the tree to that terminal node are called in order. These handlers must return true or false (this can even be done asynchronously); if one returns false, the subsequent handlers, including the terminal node's handler, are not called.
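A minimal sketch of an under-node guarding a terminal node might look like this. The path /admin/stats, the X-Secret header and its value are made up for the illustration; the handler renders its own response before returning false, since nothing further down the chain will run.

```perl
use strict;
use warnings;
use Mojolicious;
use Mojo::UserAgent;

my $app = Mojolicious->new;
$app->log->level('fatal');    # keep log messages out of the output
my $r = $app->routes;

# Under-node: its handler runs before any terminal node below it.
my $admin = $r->under('/admin' => sub {
    my $c = shift;
    return 1 if ($c->req->headers->header('X-Secret') // '') eq 'letmein';

    # Returning false stops the chain, so the response is rendered here.
    $c->render(text => 'Forbidden', status => 403);
    return undef;
});
$admin->get('/stats')->to(text => 'stats');

# Relative URLs are dispatched to $app via the embedded server.
my $ua = Mojo::UserAgent->new;
$ua->server->app($app);

my $denied  = $ua->get('/admin/stats')->res;
my $allowed = $ua->get('/admin/stats' => {'X-Secret' => 'letmein'})->res;
printf "without header: %d %s\n", $denied->code,  $denied->body;
printf "with header:    %d %s\n", $allowed->code, $allowed->body;
```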

Further, any tree node may contain the following information:

- HTTP method(s)
- path pattern, which may include placeholders
```
"/something/:name/:id"
```
- constraints on the allowed placeholder values
```
name => ["alex","nick"], id => qr/^\d+$/
```
  - special constraint: whether extensions may be appended to the path pattern, and which ones
```
format => 0 or format => ["json","xml"]
```
- conditions: arbitrary functions that are called after the path has matched the pattern in order to perform additional checks (e.g. of HTTP headers) and return true/false to allow or forbid using this node for the current request
```
agent => qr/Firefox/
```
- default parameters (defaults): here you can set control parameters (controller/action/cb/...), default values for placeholders (which makes those placeholders optional), and any other values that should be in $c->stash() while the request is processed
- explicit name of this node, e.g. for use in url_for
  - if it is not set, it is generated automatically

All this data (except the node name) is "inherited" by nested nodes (unless they explicitly override it), which lets you create intermediate nodes solely to set defaults for their nested nodes. By the way, a node can be created without any parameters at all, not even a path pattern; in that case its pattern is the same as the parent node's.

# Set defaults for everything on the root node.
$r->to("users#");
# Create a node for /resource with a format constraint.
$b = $r->route("/resource", format => 0);
# By adding a nested node to $b we make $b an intermediate node, so it
# can no longer handle requests to /resource itself.
# Since we call get() without a path pattern, it matches
# the same path as its parent, i.e. /resource.
$b->get()->over(agent=>qr/Firefox/)->to("#ff_only");
$b->get()->to("#not_ff");

As far as I understand, there is most likely no difference between setting default values via $app->defaults and via the root node with $app->routes->to (perhaps some hooks run before routing, in which case the values from $app->defaults would already be available to them while those from the root node might not be).

There are a few more nuances. For example, an under-node does not handle requests even if it is a terminal node, and the root node treats the format value slightly differently from all other nodes. But I don't think this matters for a general understanding, so I won't go into details.

I have not yet figured out how mounting a separate Mojolicious application into the current one via $r->any("/path")->to(app=>$otherapp) works; there may be additional nuances there.

Routing: setup

There is a difference between Mojolicious and Mojolicious::Lite: in Mojolicious::Lite, under (together with group) works slightly differently from under in Mojolicious. Here I describe the functions of Mojolicious (more precisely, of Mojolicious::Routes::Route).

All parameters of all functions are optional (except for over(), to() and via(), which return the current value when called without parameters).

Simple low-level functions:
- $r->route() creates a new node, parameters:
  - path pattern (the first scalar, when the number of parameters is odd)
  - constraints, including format (parameter pairs)
- $r->via() sets the HTTP method(s)
  - method(s) (a list or an array reference)
- $r->over() sets conditions
  - conditions (parameter pairs or a reference to an array of parameter pairs)
- $r->to() sets default parameters
  - handler (application or package, controller and/or action) (the first scalar, when the number of parameters is odd)
  - default parameters (parameter pairs or a hash reference)

Feature-packed all-in-one helpers for the lazy:
- $r->under() creates an under-node
  - all parameters like get()
- $r->any() creates a node for any HTTP methods
  - method (s) can be set as the first parameter (array reference)
  - other parameters like get()
- $r->get() creates a node for the HTTP GET method
  - path pattern (the first of the scalar parameters)
  - condition (a scalar plus the parameter that follows it)
  - node name (a scalar that is the last parameter)
  - handler function (code reference; stored as the default parameter "cb")
  - constraints, including format (array reference)
  - default parameters (hash reference)
- $r->post() creates a node for the HTTP POST method
  - all parameters like get()
- $r->put() creates a node for the HTTP PUT method
  - all parameters like get()
- $r->delete() creates a node for the HTTP DELETE method
  - all parameters like get()
- $r->patch() creates a node for the HTTP PATCH method
  - all parameters like get()
- $r->options() creates a node for the OPTIONS HTTP method
  - all parameters like get()

# here is the kind of mess Mojolicious can digest
$r->get("/users/:id",
    [ format => 0 ],
    agent => qr/Firefox/,
    { id => -1, controller => "users" },
    [ id => qr/^\d+$/ ],
    headers => { "X-Secret" => "letmeit" },
    \&cb,
    { action => "list" },
    "my_cool_route",
);
# and here is how the same thing can be written without get()
$r->route("/users/:id", id => qr/^\d+$/, format => 0)
    ->via("GET")
    ->over(agent => qr/Firefox/, headers => { "X-Secret" => "letmeit" })
    ->to(id => -1, controller => "users", action => "list", cb => \&cb)
    ->name("my_cool_route");

HTTP request parameters

There are not just many ways to get at the request parameters, but very many. Moreover, far from all of them are worth using: in some cases parameters obtained from different places are mixed together, and it becomes impossible to tell where a value came from.

Mojolicious has 4 types of parameters:

- GET: taken from the query string in the URL; the HTTP request method can be anything (GET, POST, etc.)
- POST: taken from the body of a POST request of type application/x-www-form-urlencoded or multipart/form-data (in the latter case only ordinary parameters are taken, not files)
- UPLOAD: files taken from the body of a POST request of type multipart/form-data
- ROUTE: values cut out of the URL path by placeholders in the routing, excluding those reserved for stash

Further, note that the same parameter can be passed several times; moreover, it can be passed several times in each of the ways (GET, POST, UPLOAD), and the same placeholder can be mentioned several times when defining the routing. For GET, POST and UPLOAD, all passed values of a parameter are kept, but for ROUTE only one value, the last, is used if a placeholder is specified several times.

Examples most often mention $c->param, so let's see where the values returned by this function come from (in Mojolicious up to 5.47):

- scalar $c->param() returns undef
- @names = $c->param() returns the names of all GET, POST, UPLOAD and ROUTE parameters
- $value = $c->param("name") returns:
  1. the last value of the placeholder "name", if there is one, otherwise
  2. the first UPLOAD value of "name", if there is one, otherwise
  3. the first POST value of "name", if there is one, otherwise
  4. the first GET value of "name", if there is one, otherwise
  5. undef
- @values = $c->param("name") returns:
  1. the last value of the placeholder "name", if there is one, otherwise
  2. all UPLOAD values of "name", if there are any, otherwise
  3. all POST values of "name" followed by all GET values of "name", if there are any, otherwise
  4. ()

Personally, I prefer to write code that is as clear as possible, and I don't like "magic" functions like this, which return a parameter from who knows where. Of course, in some cases you don't care whether a parameter was passed via GET or POST, but what $c->param does is beyond good and evil (for example, if you expected a GET/POST parameter but received an UPLOAD, you get a Mojo::Upload object instead of a string value). Everything is good in moderation, even the magic functions for the lazy that create the wow factor Mojolicious is so fond of.

Here is the list of functions I recommend limiting yourself to when accessing HTTP request parameters:

Get placeholder value from url path:

$value = $c->stash( "name" )              # ROUTE

Get the names and all values of all parameters:

# HASHREF: the hash values will be:
# - scalars (if the parameter has one value)
# - a reference to an array of scalars (if the parameter has several values)
$params = $c->req->params->to_hash        # POST first, then GET
$params = $c->req->query_params->to_hash  # GET
$params = $c->req->body_params->to_hash   # POST
# ARRAYREF: the array elements are objects, usually Mojo::Upload
$uploads = $c->req->uploads               # UPLOAD

Get the names of all parameters:

@names = @{ $c->req->params->names }           # POST and GET
@names = @{ $c->req->query_params->names }     # GET
@names = @{ $c->req->body_params->names }      # POST
@names = keys %{{ map { $_->name => 1 } @{ $c->req->uploads } }}  # UPLOAD

Get the last value of one parameter:

$c->req->params->param( "name" )          # POST first, then GET
$c->req->query_params->param( "name" )    # GET
$c->req->body_params->param( "name" )     # POST
$c->req->upload( "name" )                 # UPLOAD

Get all values of one parameter:

$c->req->params->every_param( "name" )          # POST first, then GET
$c->req->query_params->every_param( "name" )    # GET
$c->req->body_params->every_param( "name" )     # POST
$c->req->every_upload( "name" )                 # UPLOAD

This approach will ensure clarity and uniformity of code. But, for completeness, here is a list of the remaining functions:

Alternative ways of calling the functions above:
- $c->req->param is the same as $c->req->params->param
- $c->req->url->query is the same as $c->req->query_params
Instead of to_hash you can use pairs: it returns a reference to an array in which the names and values of all parameters follow one another; a name can occur several times, but all values are scalars.
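The difference between these accessors is easiest to see on a standalone Mojo::Parameters object, the class behind $c->req->params, $c->req->query_params and $c->req->body_params. A small sketch with a made-up query string in which "a" is passed twice:

```perl
use strict;
use warnings;
use Mojo::Parameters;

# Made-up query string: "a" is passed twice, "b" once.
my $params = Mojo::Parameters->new('a=1&a=2&b=3');

my $last  = $params->param('a');          # last value of "a"
my $all   = $params->every_param('a');    # array reference with every value
my $hash  = $params->to_hash;             # repeated names become array refs
my $pairs = $params->pairs;               # flat array: name, value, name, …

print "last:  $last\n";
print "all:   @$all\n";
print "hash:  a=[@{ $hash->{a} }] b=$hash->{b}\n";
print "pairs: @$pairs\n";
```

Note that to_hash gives a scalar for "b" but an array reference for "a", so code that consumes the hash has to handle both shapes; pairs and every_param avoid that.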

Parsing

This is not quite from my vocabulary, but I can't find another way to put it: when it comes to parsing downloaded pages, Mojo is just a sweetie! The documentation for this part of Mojo is noticeably better. Nevertheless, there is something to add here.

The thing is that Mojo is not just a web server framework but a web framework in general, for both the server and the client. So the modules that implement the HTTP message format are used by both the server and the client, and the two have very different needs when processing these messages. As a result, when you need to get something done and dive into the documentation, you care about either the server-side or the client-side functionality at that moment, yet what you see is both, carefully sorted in alphabetical order. Finding the function you need is therefore rather hard, and you end up wanting a visual cheat sheet that lists only the client or only the server functions, preferably grouped by some sensible criterion.

The result is the following cheat sheet. This is the first alpha version :) so if something is unclear or you have ideas for improvement, write to me and we'll polish it together. A few notes:

- Even if the download fails, $tx->res will be available (with a 404 stub).
- The tree in $dom has a root node, and the node with the html tag (if the downloaded page had one) is a child of the root node. Because of this, to_string and content return the same thing at the root node.
- When you access non-existent elements (or, for example, when a collection was expected but $dom->child_tag_name found a single element and a collection method is called on it), Perl throws a "no such method" exception; in other words, the parsing code almost always has to be wrapped in eval.
- It is very easy to accidentally end up with a collection of collections instead of a collection of nodes, or with a collection in which some elements are empty; the collection methods flatten and compact help here.
- The "*" parameter stands for a string with a CSS selector.

$tx = $ua->get($url);           # Mojo::Transaction::HTTP → Mojo::Transaction
$tx->error                      # undef or {message=>'…',…}
$tx->success                    # undef or $tx->res
$tx->req                        # Mojo::Message::Request  → Mojo::Message
$tx->res                        # Mojo::Message::Response → Mojo::Message
$tx->redirects                  # [ Mojo::Transaction::HTTP, … ]
$res = $tx->res;                # Mojo::Message::Response → Mojo::Message
$res->error                     # undef or {message=>'Parse error',…}
$res->to_string                 # "…" (headers+content)
$res->is_status_class(200);     # bool
$res->code                      # 404
$res->message                   # "Not Found"
$res->headers                   # Mojo::Headers
$res->cookies                   # [ Mojo::Cookie::Response, … ]
$res->cookie('name')            # Mojo::Cookie::Response → Mojo::Cookie
$res->body                      # "…"
$res->text                      # "…" (decoded body using charset)
$res->dom                       # Mojo::DOM
$res->json                      # Mojo::JSON
$headers = $res->headers;       # Mojo::Headers
$headers->names                 # [ "Content-Type", "Server", … ]
$headers->to_hash               # { "Content-Type" => "…", … }
$headers->header('Server')      # "…"
$headers->$standard_header_name # "…" (shortcuts for useful headers)
$dom = $res->dom;               # Mojo::DOM
$dom->to_string                 # "…" (this node, including its content)
$dom->content                   # "…" (the content of this node)
$dom->type                      # "…" (node type: root,tag,text,comment,…)
$dom->tag                       # "…" or "" (tag name)
$dom->attr                      # {name=>"val",…}
$dom->attr('name')              # "val"
$dom->{name}                    # alias for $dom->attr("name")
$dom->all_text                  # "…" (from all nodes)
$dom->all_text(0)               # "…" (from all nodes, whitespace untouched)
$dom->text                      # "…" (from this node)
$dom->text(0)                   # "…" (from this node, whitespace untouched)
$dom->root                      # Mojo::DOM (root node)
$dom->parent                    # Mojo::DOM or undef (parent node)
$dom->next                      # Mojo::DOM or undef (next sibling tag)
$dom->next_node                 # Mojo::DOM or undef (next sibling node)
$dom->previous                  # Mojo::DOM or undef (previous sibling tag)
$dom->previous_node             # Mojo::DOM or undef (previous sibling node)
$dom->matches('*')              # true/false (tests this node as a tag)
$dom->at('*')                   # Mojo::DOM or undef (first matching tag)
$dom->find('*')                 # Mojo::Collection (matching tags)
$dom->ancestors                 # Mojo::Collection (ancestor tags)
$dom->ancestors('*')            # Mojo::Collection (matching ancestor tags)
$dom->following                 # Mojo::Collection (following sibling tags)
$dom->following("*")            # Mojo::Collection (matching following sibling tags)
$dom->following_nodes           # Mojo::Collection (following sibling nodes)
$dom->preceding                 # Mojo::Collection (preceding sibling tags)
$dom->preceding("*")            # Mojo::Collection (matching preceding sibling tags)
$dom->preceding_nodes           # Mojo::Collection (preceding sibling nodes)
$dom->children                  # Mojo::Collection (child tags)
$dom->children('*')             # Mojo::Collection (matching child tags)
$dom->descendant_nodes          # Mojo::Collection (all nodes)
$dom->child_nodes               # Mojo::Collection (all child nodes)
$dom->[0]                       # alias for $dom->child_nodes->[0]
$res->dom('*')                  # alias for $dom->find('*')
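Two of the notes before the cheat sheet, the need for eval around lookups that may find nothing and the use of compact on collections with empty elements, can be sketched on a made-up page parsed directly with Mojo::DOM (no download involved):

```perl
use strict;
use warnings;
use Mojo::DOM;

# Made-up page, just for illustration.
my $dom = Mojo::DOM->new(<<'HTML');
<html><head><title>Hi</title></head>
<body><a href="/x">x</a><a name="anchor">no href</a></body></html>
HTML

# at() returns undef when nothing matches, so chaining a method call dies;
# eval turns that into an undef result.
my $title   = eval { $dom->at('title')->text };
my $heading = eval { $dom->at('h1')->text };

# map() yields undef for the link without href; compact drops it.
my @hrefs = $dom->find('a')->map(attr => 'href')->compact->each;

print "title:   $title\n";
print "heading: ", defined $heading ? $heading : '(none)', "\n";
print "hrefs:   @hrefs\n";
```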

Tips & Tricks

Support for non-blocking applications in CGI mode

If you slightly modify the script that launches the Mojolicious application, you can support non-blocking code even when running in CGI mode: https://gist.github.com/powerman/5456484

How Mojo::UserAgent works when testing your application

If you have ever wondered how $ua in tests sends requests to your application and receives responses without starting a TCP server on some port, the answer is simple: you can set any application $app globally, and all Mojo::UserAgent objects, no matter whether they were created before or after, will execute requests for URLs that specify no protocol and host through this $app.

Usually this "just works" because Mojolicious::Lite performs this operation. But if your test does not use that module, and your code also creates its own Mojo::UserAgent objects somewhere, everything that "worked by itself" suddenly stops working. You can fix it like this:

Mojo::UserAgent::Server->app($app);
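A slightly fuller sketch of the same fix, using a bare Mojolicious object in place of your application class (the single route is made up for the illustration):

```perl
use strict;
use warnings;
use Mojolicious;
use Mojo::UserAgent;
use Mojo::UserAgent::Server;

# A plain Mojolicious application (no Mojolicious::Lite involved).
my $app = Mojolicious->new;
$app->log->level('fatal');    # keep log messages out of the output
$app->routes->get('/')->to(text => 'wow');

# Register $app globally for all user agents.
Mojo::UserAgent::Server->app($app);

# A user agent created elsewhere uses it for URLs without scheme and host.
my $ua   = Mojo::UserAgent->new;
my $body = $ua->get('/')->res->body;
print "$body\n";
```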

ojo and Mojolicious::Lite

The ojo documentation fails to mention that, in addition to the documented ojo functions, the Mojolicious::Lite functions are also available, which simplifies experimenting with Mojolicious:

$ perl -Mojo -E 'get "/", {text=>"wow\n"}; app->start' get /
wow

Environment variables

Mojolicious supports dozens of environment variables with names starting with MOJO_. I see no point in describing them all, because I suspect the list varies between Mojolicious versions, but a couple are worth mentioning:

MOJO_MODE: most other environment variables affect a single thing, but this one affects several, each documented in a different place, so here it is all in one place:
- sets $app->mode
- determines the name of the log file
- determines which variant of the exception and not_found templates is used (the file name includes the MOJO_MODE value, e.g. "exception.$mode.html.ep")
- the default $app->log->level changes from "debug" to "info" if MOJO_MODE is not "development"
- if it exists, the function ${mode}_mode() is called before startup() (I'm not sure whether this feature counts as documented).
MOJO_LOG_LEVEL: in tests that use Mojolicious::Lite you often have to set it to "warn" manually so that the test output is not cluttered with log messages.
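A quick way to see what MOJO_MODE does on your installed version is to set it before the application object is created and inspect the result. This is only a sketch: the variable is set inside the script for convenience (normally it comes from the shell), and the log level is printed without any claim about its value, since defaults may differ between versions.

```perl
use strict;
use warnings;
use Mojolicious;

# Normally set in the shell; it must be in place before the app is created.
$ENV{MOJO_MODE} = 'production';

my $app = Mojolicious->new;
print "mode:      ", $app->mode, "\n";
print "log level: ", $app->log->level, "\n";
```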
