Alessandra August 12, 2014 at 15:56

Perl Functions

Tutorial

Perl has a huge number of features that look superfluous at first glance and that, in inexperienced hands, can easily lead to bugs. It goes so far that many programmers who write Perl regularly are not even aware of the full functionality of the language! The reason, it seems to us, is the poor quality and questionable content of much of the quick-start literature on Perl programming. The exceptions are the Llama, Alpaca and Camel books (Learning Perl, Intermediate Perl and Programming Perl), which we strongly recommend reading.

In this article we want to look in detail at some small tricks involving the unusual use of functions in Perl, which may be useful to anyone interested in the language.

How do Perl functions work?

In most programming languages, a function definition looks something like this:

function myFunction (a, b) {
	return a + b;
}

And the function is called like this:

myFunction(1, 2);

At first glance, everything is simple and clear. However, calling this function as follows:

myFunction(1, 2, 3);

... will in most cases lead to an error, because the wrong number of arguments is passed to the function.

A function in Perl can be written like this:

sub my_sub($$;$) : MyAttribute {
	my ($param) = @_;
}

Here $$;$ is the prototype and MyAttribute is the attribute. Prototypes and attributes are discussed later in the article. For now, consider a simpler way of writing a function:

sub my_sub {
	return 1;
}

Here we wrote a function that returns one.

But this definition does not say how many arguments the function takes. That is why nothing prevents us from calling it like this:

my_sub('Native', 'Beads', 'Sausage', 42);

And it runs without complaint! This is because Perl passes parameters to functions in an unusual way. Perl is famous for its many so-called “special” variables. Every function has access to the special variable @_, an array holding the arguments it was called with.

Therefore, inside the function, we can put the input parameters into variables like this:

my ($param) = @_;

This works in the case of several parameters:

my ($param1, $param2, $param3) = @_;
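
Since @_ is an ordinary array, a function can also simply walk over however many arguments it received. A minimal sketch (the name sum_all is ours, chosen only for illustration):

```perl
use strict;
use warnings;

# sum_all is a hypothetical helper: it accepts any number of arguments
sub sum_all {
    my $total = 0;
    $total += $_ for @_;    # visit every argument that was passed
    return $total;
}

print sum_all(1, 2), "\n";          # 3
print sum_all(1, 2, 3, 4), "\n";    # 10
```

Called with no arguments at all, this sketch returns 0, because the loop body never runs.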

Very often functions are written like this:

sub my_sub {
	my $param = shift;
	...
}

The point is that many Perl built-ins fall back on default variables when called without arguments. Inside a function, shift takes its data from the @_ array by default. Therefore, the statements:

my $param = shift;

... and

my $param = shift @_;

... are completely equivalent, but the first form is shorter and obvious to Perl programmers, so it is the one that gets used.

shift can also be used to fetch several parameters, for example by combining several calls in one list assignment:

my ($one, $two, $three) = (shift, shift, shift);

Another form:

my ($one, $two, $three) = @_;

... assigns the same values to the variables (although, unlike shift, it does not remove anything from @_).
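
The difference between the two forms shows up in what is left in @_ afterwards. A small sketch (both function names are made up for this example):

```perl
use strict;
use warnings;

# with_shift and with_list are illustrative names, not part of any module
sub with_shift {
    my ($one, $two) = (shift, shift);   # each shift removes one element from @_
    return scalar @_;                   # number of arguments still in @_
}

sub with_list {
    my ($one, $two) = @_;               # copies the values, @_ stays as it was
    return scalar @_;
}

print with_shift(1, 2, 3), "\n";   # 1
print with_list(1, 2, 3), "\n";    # 3
```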

Now pay attention! Here is a pitfall that every Perl programmer falls into sooner or later:

sub my_sub {
	my $var = @_;
	print $var;
}

If we call this function as my_sub(1, 2, 3), then $var unexpectedly contains 3 rather than 1. This is because the assignment puts the array in scalar context, and in Perl an array in scalar context returns its size, not its first element. To fix the error, it is enough to put $var in parentheses, which makes the context a list one. Like this:

sub my_sub {
	my ($var) = @_;
	print $var;
}

And now, as expected, when my_sub(1, 2, 3) is called, $var is 1.

In addition, Perl passes parameters by reference: the elements of @_ are aliases for the caller's variables. This means that inside the function we can modify the variables that were passed to it.

For instance:

my $var = 5;
my_sub($var);
print $var;
sub my_sub {
	# recall that array elements are accessed in scalar context,
	# i.e. element zero of the array @arr is written as $arr[0]; the same goes for
	# @_.
	$_[0]++; # $_[0] is the first element of the array @_.
}

The result will be 6. However, in Perl you can get a kind of “pass by value” by copying the argument into a local variable first:

my $var = 5;
my_sub($var);
print $var;
sub my_sub {
	my ($param) = @_;
	$param++;
}

And now the result will be 5.
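
The aliasing behaviour of @_ can also be put to deliberate use. The sketch below strips whitespace from every variable passed to it; trim_in_place is a hypothetical name of ours, not a built-in:

```perl
use strict;
use warnings;

# trim_in_place is an illustrative helper: it changes its arguments directly,
# because the elements of @_ are aliases for the caller's variables
sub trim_in_place {
    for my $arg (@_) {      # $arg is in turn an alias for an element of @_
        $arg =~ s/^\s+//;   # remove leading whitespace
        $arg =~ s/\s+$//;   # remove trailing whitespace
    }
    return;
}

my ($name, $city) = ('  Larry ', "\tPortland\n");
trim_in_place($name, $city);
print "[$name] [$city]\n";   # [Larry] [Portland]
```

Note that such a function must be given variables: passing a literal string would make Perl die with a "Modification of a read-only value" error.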

And the last two nuances, which are very important. Perl automatically returns the result of the last expression from the function.

Take the code from the previous example and modify it a bit:

my $var = 5;
my $result = my_sub($var);
print $result;
sub my_sub {
	my ($param) = @_;
	++$param;
}

This works exactly as if the last line of the function were an explicit return:

return ++$param;

The function will return 6.

And one more feature: if a function is called from the body of another function with an ampersand and without parentheses, the inner function receives the arguments of the function that called it. That is, the @_ array is automatically passed from the outer function to the inner one. This can lead to non-obvious bugs.

use strict;
use Data::Dumper;
my_sub(1, 2, 3);
sub my_sub {
	&inner;
}
sub inner {
	print Dumper \@_;
}

Result:
$VAR1 = [
    1,
    2,
    3
];

However, if you state explicitly (with empty parentheses) that the function is called without arguments, everything is fine:

sub my_sub {
	&inner();
}

And the output will look like this:
$VAR1 = [];

However, function calls using ampersand are used very rarely and are almost never found in the code.

Anonymous Functions

Anonymous functions are declared at the place of use and do not receive a unique identifier to access them. When created, they are either called directly, or a reference to a function is assigned to a variable, with which you can then indirectly call this function.

An elementary declaration of an anonymous function in Perl:

my $subroutine = sub {
	my $msg = shift;
	printf "I am called with message: %s\n", $msg;
	return 42;
};
# $subroutine now holds a reference to the anonymous function
$subroutine->("Oh, my message!");

Anonymous functions can and should be used both for creating blocks of code and for closures, which will be discussed later.
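
One common place where anonymous functions are stored in variables is a dispatch table, a hash that maps names to code. A minimal sketch (the table and its keys are invented for this example):

```perl
use strict;
use warnings;

# A dispatch table: operation names mapped to anonymous functions.
# %operation and its keys are assumptions made for illustration.
my %operation = (
    add => sub { return $_[0] + $_[1] },
    mul => sub { return $_[0] * $_[1] },
);

for my $name (sort keys %operation) {
    printf "%s(3, 4) = %d\n", $name, $operation{$name}->(3, 4);
}
# add(3, 4) = 7
# mul(3, 4) = 12
```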

Closures

A closure is a special kind of function whose body uses variables declared outside that body.

In code, it looks like, for example, a function located entirely inside the body of another function.

# returns a reference to an anonymous function
sub adder($) {
	my $x = shift;    # x is a free variable of the inner function,
	return sub ($) {
		my $y = shift;    # and y is its bound variable
		return $x + $y;
	};
}
my $add1 = adder(1);   # make a procedure that adds 1
print $add1->(10);     # prints 11
my $sub1 = adder(-1);  # make a procedure that subtracts 1
print $sub1->(10);     # prints 9

Closures are useful, for example, when you need a function with preset parameters stored inside it, or when generating parser functions or callbacks.
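
Closures can also keep private state between calls. A sketch of a counter factory follows; make_counter is a hypothetical name, and each closure it returns owns a separate $count:

```perl
use strict;
use warnings;

# make_counter is an illustrative factory: every call returns a new closure
# with its own private $count variable
sub make_counter {
    my $count = @_ ? $_[0] : 0;        # optional starting value, default 0
    return sub { return $count++ };    # returns the current value, then increments
}

my $first  = make_counter();
my $second = make_counter(10);

print $first->(), $first->(), "\n";   # 01
print $second->(), "\n";              # 10
print $first->(), "\n";               # 2
```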

Parenthesis-less functions

In our opinion, this is the most fitting name for functions that can be called without parentheses.

For example, print is often written and called without parentheses. The question is, can we also create such functions?

Of course. For this, Perl even has a special pragma - subs. Suppose we need a function that checks the value of a variable for truth.

use strict;
use subs qw/checkflag/;
my $flag = 1;
print "OK" if checkflag;
sub checkflag {
	return $flag;
}

This program will print OK.

But this is not the only way. Perl is well thought out, so if we restructure our program and bring it to this form:

use strict;
my $flag = 1;
sub checkflag {
	return $flag;
}
print "OK" if checkflag;

... then the result will be the same.

The pattern is as follows - we can call a function without parentheses in several cases:

using the pragma subs;
writing a function BEFORE calling it;
using function prototypes.

We turn to the last option.

Function Prototypes

Differing views on the purpose of this mechanism often lead to holy wars with adherents of other languages, who claim that "Perl has bad prototypes". Prototypes in Perl are not meant to strictly restrict the types of parameters passed to functions. They are a hint to the language about how to parse what is passed to the function.

The authors at PerlMonks described them as “parameter context templates”. The details are in the examples below.

There is, for example, an abstract function called my_sub:

sub my_sub {
	print join ', ', @_;
}

We call it as follows:

my_sub(1, 2, 3, 4, 5);

The function prints the following:
1, 2, 3, 4, 5

It turns out that any number of arguments can be passed to any Perl function. And let the function itself understand what we wanted from it.

It is supposed that there should be a mechanism for controlling the arguments passed to the function. This role is performed by prototypes.

A Perl function with a prototype looks like this:

sub my_sub($$;$) {
	my ($v1, $v2, $v3) = @_;
	$v3 ||= 'empty';
    printf("v1: %s, v2: %s, v3: %s\n", $v1, $v2, $v3);
}

Function prototypes are written in parentheses after the function name. The prototype $$;$ means that two scalar parameters are required and a third is optional; the “;” separates the required parameters from the optional ones.

If we try to call it like this:

my_sub();

... then we get an error of the form:
Not enough arguments for main::my_sub at pragmaticperl.pl line 7, near "()"
Execution of pragmaticperl.pl aborted due to compilation errors.

And if so:

&my_sub();

... then the prototype check is skipped.

To summarize, prototypes work in the following cases:

If the function is called without the ampersand (&) sigil. Incidentally, Perl::Critic (perlcritic, a static analysis tool for Perl code) complains about function calls written with an ampersand, so that form is not recommended anyway.
If the function is defined before the call. If we call the function first and define it afterwards, then with warnings enabled we get the following warning:
main::my_sub() called too early to check prototype at pragmaticperl.pl line 4

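Below is an example of a program in which the prototype is applied correctly (the function is predeclared and defined before the call), so the final my_sub() call, which passes no arguments, is rejected at compile time: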

use strict;
use warnings;
use subs qw/my_sub/;
sub my_sub($$;$) {
	my ($v1, $v2, $v3) = @_;
	$v3 ||= 'empty';
    printf("v1: %s, v2: %s, v3: %s\n", $v1, $v2, $v3);
}
my_sub();

Perl has the ability to find out which prototype a function has. For instance:

perl -e 'print prototype("CORE::read")'

Will produce:
*\$$;$
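
To see a prototype acting as a parsing hint rather than a type check, consider the & placeholder, which lets a function take a bare block as its first argument, much like the built-in grep. A minimal sketch (apply_to_each is a hypothetical name of ours):

```perl
use strict;
use warnings;

# The (&@) prototype tells the parser: the first argument is a code block,
# the rest is an ordinary list. apply_to_each is made up for illustration.
sub apply_to_each(&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# Thanks to the prototype, no "sub" keyword and no comma after the block
my @doubled = apply_to_each { $_[0] * 2 } 1, 2, 3;
print "@doubled\n";                          # 2 4 6
print prototype(\&apply_to_each), "\n";      # &@
```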

Overriding methods

Overriding is often quite useful. Suppose we have a module written by a certain N. Everything in it is fine except for one method, say call_me, which must always return 1 or there will be trouble, whereas the method shipped with the module always returns 0. And the module's code must not be touched.

Let the program look like this:

use strict;
use Data::Dumper;
my $obj = Top->new();
if ($obj->call_me()) {
	print "Purrrrfect\n";
}
else {
	print "OKAY :(\n";
}
package Top;
use strict;
sub new {
	my $class = shift;
	my $self = {};
	bless $self, $class;
	return $self;
}
sub call_me {
	print "call_me from TOP called!\n";
	return 0;
}
1;

It prints:
call_me from TOP called!
OKAY :(

And again, we have a solution.

Let's add the following before the call to $obj->call_me():

*Top::call_me = sub {
	print "Overrided subroutine called!\n";
	return 1;
};

Better yet, for a temporary override, use the local keyword:

local *Top::call_me = sub {
	print "Overrided subroutine called!\n";
	return 1;
};

This replaces the call_me function of the Top package only until the current block is exited (local works in dynamic scope), after which the original is restored.
Now the output looks like this:
Overrided subroutine called!
Purrrrfect

The module code has not been changed, yet the function now does what we need.

Note: if you find yourself using this technique often in your work, it points to an architectural flaw. A good use case is adding debugging output to functions.
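
The temporary nature of a local override can be checked with a small self-contained sketch. The Greeter package below is a stand-in we invented to play the role of the third-party module:

```perl
use strict;
use warnings;

package Greeter;                 # hypothetical stand-in for someone else's module
sub greet { return "hello" }

package main;

sub shout {
    no warnings 'redefine';      # the replacement below is intentional
    local *Greeter::greet = sub { return "HELLO" };
    return Greeter::greet();     # the override is visible here
}

print shout(), "\n";             # HELLO
print Greeter::greet(), "\n";    # hello (the original is back)
```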

Wantarray

Perl has a useful built-in that lets a function determine the context in which it was called. For example, suppose we want a function to return an array when a list is expected and a reference to the array otherwise. This is very simple to implement with wantarray. Here is a small demonstration program:

#!/usr/bin/env perl
use strict;
use Data::Dumper;
my @result = my_cool_sub();
print Dumper @result;
my $result = my_cool_sub();
print Dumper $result;
sub my_cool_sub {
    my @array = (1, 2, 3);
    if (wantarray) {
        print "ARRAY!\n";
        return @array;
    }
    else {
        print "REFERENCE!\n";
        return \@array;
    }
}

This will output:
ARRAY!
$VAR1 = 1;
$VAR2 = 2;
$VAR3 = 3;
REFERENCE!
$VAR1 = [
    1,
    2,
    3
];

It is also worth recalling an interesting feature of Perl: %hash = @array; In this case Perl builds a hash of the form ($array[0] => $array[1], $array[2] => $array[3]).

Therefore, if you write my %hash = my_cool_sub(), the wantarray (list) branch is taken. That is exactly why there is no wanthash.
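
wantarray actually distinguishes three cases: it returns true in list context, false but defined in scalar context, and undef in void context. A sketch that reports the context by name (context_name is a hypothetical helper):

```perl
use strict;
use warnings;

# context_name is an illustrative helper that reports how it was called
sub context_name {
    return wantarray          ? 'list'
         : defined(wantarray) ? 'scalar'
         :                      'void';
}

my @list   = context_name();
my $scalar = context_name();
my %hash   = (context => context_name());   # hash assignment is list context

print "$list[0] $scalar $hash{context}\n";  # list scalar list
```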

AUTOLOAD

Perl has one of the best module management systems. Not only can a programmer control all stages of module execution, there are also interesting features that make life easier. For example, AUTOLOAD.

The essence of AUTOLOAD is that when the called function does not exist in the module, Perl looks for a function named AUTOLOAD in that module, and only if it does not find one does it throw an exception about a call to a nonexistent function. This means we can write a handler for calls to nonexistent functions.

For instance:

#!/usr/bin/env perl
use strict;
Autoload::Demo::hello();
Autoload::Demo::asdfgh(1, 2, 3);
Autoload::Demo::qwerty();
package Autoload::Demo;
use strict;
use warnings;
our $AUTOLOAD;
sub AUTOLOAD {
    print $AUTOLOAD, " called with params: ", join (', ', @_), "\n";
}
sub hello {
    print "Hello!\n";
}
1;

Obviously, the qwerty and asdfgh functions do not exist in the Autoload::Demo package. Inside the AUTOLOAD function, the special package variable $AUTOLOAD holds the fully qualified name of the function that was not found.

The output of this program:
Hello!
Autoload::Demo::asdfgh called with params: 1, 2, 3
Autoload::Demo::qwerty called with params:
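
AUTOLOAD is often applied to method calls as well. The sketch below turns any unknown method into a read-only accessor for the field of the same name; the package name Lazy::Settings and its fields are assumptions made for this example. Object destruction also goes through AUTOLOAD when no DESTROY method is defined, so that name has to be filtered out:

```perl
use strict;
use warnings;

package Lazy::Settings;    # hypothetical package, invented for illustration

our $AUTOLOAD;

sub new {
    my ($class, %args) = @_;
    my $self = { %args };
    return bless $self, $class;
}

sub AUTOLOAD {
    my $self = shift;
    (my $name = $AUTOLOAD) =~ s/.*:://;   # strip the package prefix
    return if $name eq 'DESTROY';         # destruction lands here too
    die "No such field: $name\n" unless exists $self->{$name};
    return $self->{$name};
}

package main;

my $config = Lazy::Settings->new(host => 'localhost', port => 8080);
print $config->host, ':', $config->port, "\n";   # localhost:8080
```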

Function Generation on the fly

Suppose we need to write many functions that perform roughly the same actions. For example, a set of accessors for an object. Writing such code is unlikely to please anyone:

sub getName {
    my $self = shift;
    return $self->{name};
}
sub getAge {
    my $self = shift;
    return $self->{age};
}
sub getOther {
    my $self = shift;
    return $self->{other};   
}

This is Perl. “Laziness, impatience, hubris” (L. Wall).

Functions can be generated. Perl has a data type called a typeglob, which is best described as an entry in the symbol table. The typeglob has its own sigil: "*".

First, let's see the code:

#!/usr/bin/env perl
use strict;
use warnings;
package MyCoolPackage;
sub getName {
    my $self = shift;
    return $self->{name};
}
sub getAge {
    my $self = shift;
    return $self->{age};
}
sub getOther {
    my $self = shift;
    return $self->{other};   
}
foreach (keys %{*MyCoolPackage::}) {
        print $_." => ".$MyCoolPackage::{$_}."\n";
}

Output:
getOther => *MyCoolPackage::getOther
getName => *MyCoolPackage::getName
getAge => *MyCoolPackage::getAge

Basically, the symbol table of a package is a hash named after the package (with a trailing ::). Its keys are the names of the package's elements, including global variables (our), and its values are typeglobs. It is logical to assume that if we add our own key to this hash, it becomes available as a regular entity. We will use this to generate the getters.

And here is what we got:

#!/usr/bin/env perl
use strict;
use warnings;
$\ = "\n";
my $person = Person->new(
    name    =>  'justnoxx',
    age     =>  '25',
    other   =>  'perl programmer',
);
print "Name: ", $person->get_name();
print "Age: ", $person->get_age();
print "Other: ", $person->get_other();
package Person;
use strict;
use warnings;
sub new {
    my ($class, %params) = @_;
    my $self = {};
    no strict 'refs';
    for my $key (keys %params) {
        # __PACKAGE__ is the name of the current package; it is a built-in
        # magic token
        # the next line turns into, for example:
        # *Person::get_name = sub {...};
        *{__PACKAGE__ . '::' . "get_$key"} = sub {
            my $self = shift;
            return $self->{$key};
        };
        $self->{$key} = $params{$key};
    }
    bless $self, $class;
    return $self;
}
1;

This program will print:
Name: justnoxx
Age: 25
Other: perl programmer
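
The program above installs the getters on every call to new. A variation, sketched here under the assumption that the field names are known in advance, generates them once when the package is loaded. Person::Static is a hypothetical package, separate from the Person class above:

```perl
use strict;
use warnings;

package Person::Static;    # hypothetical package, invented for illustration

# Install one getter per field name when the package is loaded
for my $field (qw(name age other)) {
    no strict 'refs';                      # symbolic glob access below
    *{ __PACKAGE__ . "::get_$field" } = sub {
        my $self = shift;
        return $self->{$field};            # $field is captured by the closure
    };
}

sub new {
    my ($class, %params) = @_;
    my $self = { %params };
    return bless $self, $class;
}

package main;

my $person = Person::Static->new(name => 'justnoxx', age => 25);
print $person->get_name, ' is ', $person->get_age, "\n";   # justnoxx is 25
```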

Function Attributes

Python has the concept of a decorator: something that lets you "add extra behavior to an object".

Yes, there are no decorators in Perl, but there are function attributes. If we open perldoc perlsub and look at the description of the function, we will see a curious entry:

sub NAME(PROTO) : ATTRS BLOCK

Thus, a function with attributes may look like this:

sub my_sub($$;$) : MyAttr {
	print "Hello, I am sub with attributes and prototypes!";
}

Working with attributes in Perl is not a trivial matter, which is why the Attribute::Handlers module has been part of the standard Perl distribution for quite some time.

The fact is that attributes out of the box have quite a few limitations and subtleties, so if anyone is interested, we can discuss them in the comments.

Suppose we have a function that may only be called if the user is authorized. Whether the user is authorized is indicated by the variable $auth, which is 1 if the user is authorized and 0 if not. We can do the following:

my $auth = 1;
sub my_sub {
	if ($auth) {
        print "Okay!\n";
        return 1;
	}
	print "YOU SHALL NOT PASS!!!1111";
	return 0;
}

And this is an acceptable solution.

But the number of such functions may keep growing, and repeating the check in each of them becomes more and more tedious. The problem can be solved using attributes.

use strict;
use warnings;
use Attribute::Handlers;
use Data::Dumper;
my_sub();
sub new {
	return bless {}, shift;
}
sub isAuth : ATTR(CODE) {
	my ($package, $symbol, $referent, $attr, $data, $phase, $filename, $linenum) = @_;
	no warnings 'redefine';
	unless (is_auth()) {
        *{$symbol} = sub {
            require Carp;
            Carp::croak "YOU SHALL NOT PASS\n";
    	};
	}
}
sub my_sub : isAuth {
	print "I am called only for auth users!\n";
}
sub is_auth {
	return 0;
}

In this example, the output of the program will look like this:
YOU SHALL NOT PASS at myattr.pl line 18. main::__ANON__() called at myattr.pl line 6

And if we replace return 0 with return 1 in is_auth, then:
I am called only for auth users!

It is no accident that attributes come at the end of the article. To write this example, we used:

anonymous functions
function overriding
a special form of goto statement

Despite the rather cumbersome syntax, attributes are successfully and actively used, for example, in the Catalyst web framework. However, do not forget that the attributes, after all, are an experimental feature of Perl, and therefore their syntax may change in future versions of the language.

The article was written in collaboration and on technical material from Dmitry Shamatrin (@justnoxx) and with the assistance of REG.RU programmers: Timur Nozadze (@TimurN), Victor Efimov (@vsespb), Polina Shubina (@imagostorm), Andrew Nugged (@nugged)
