Pipes, the pythonic way

    Some Pythonistas love code that is readable, while others prefer code that is concise. Unfortunately, a balance between the two, where the solution is genuinely elegant, is rarely found in practice. Far more often we see lines like
    my_function(sum(filter(lambda x: x % 3 == 1, [x for x in range(100)])))
    
    Or four-line stanzas such as
    xs = [x for x in range(100)]
    xs_filtered = filter(lambda x: x % 3 == 1, xs)
    xs_sum = sum(xs_filtered)
    result = my_function(xs_sum)
    
    Idealists would like to write something like this
    result = [x for x in range(100)] \
        | where(lambda x: x % 3 == 1) \
        | sum \
        | my_function
    

    Not in Python?

    A simple implementation of such chains was recently proposed by Julien Palard in his Pipe library.

    Let's start right away with an example:
    from pipe import *   
    [1,2,3,4] | where(lambda x: x<=2)
    # <generator object at 0x88231e4>
    

    Oops, the intuitive first attempt did not quite fly. A pipe returns a generator, and the values still have to be extracted from it.
    [1,2,3,4] | where(lambda x: x<=2) | as_list
    #[1, 2]
    

    We could extract the values from the generator with the built-in list() constructor, but the author was consistent in his approach and offers us the as_list pipe instead.
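
    For comparison, a small sketch of the built-in cast mentioned above (imports as before):
    list([1,2,3,4] | where(lambda x: x<=2))
    #[1, 2]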

    As you can see, the data source for the pipes in this example is a plain list. Generally speaking, any Python iterable will do: tuples, for example, or, more interestingly, other generators:
    def fib():
        u"""
        Генератор чисел Фибоначчи
        """
        a, b = 0, 1
        while 1:
            yield a
            a, b = b, a + b
    fib() | take_while(lambda x: x<10) | as_list
    #[0, 1, 1, 2, 3, 5, 8]
    
    Several lessons can be drawn from this:
    1. Pipes accept lists, tuples, generators - any iterable.
    2. Chaining generators together yields another generator.
    3. Without an explicit request (a type conversion or a special pipe), the pipeline is "lazy": the chain itself is a generator and can even serve as an endless data source (see the sketch below).
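
    To make the third point concrete, here is a small sketch (using the same pipe imports as above) that feeds an endless source into a chain and only pulls out the handful of values we ask for:
    from itertools import count

    evens = count() | where(lambda x: x % 2 == 0)   # count() is endless; nothing has been computed yet
    evens | take_while(lambda x: x < 10) | as_list
    #[0, 2, 4, 6, 8]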

    Of course, the joy would be incomplete if we could not easily create our own pipes. Example:
    @Pipe
    def custom_add(x):
        return sum(x)
    [1,2,3,4] | custom_add
    #10
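
    Since @Pipe is an ordinary decorator, an existing function can also be wrapped directly (a sketch; it assumes the Pipe class itself is importable, which the decorator usage implies):
    total = Pipe(sum)   # wrap the built-in sum without defining a new function
    [1,2,3,4] | total
    #10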
    
    Arguments? Easy:
    @Pipe
    def sum_head(x, number_to_sum=2):
        return sum(x[:number_to_sum])
    [1,2,3,4] | sum_head(3)
    #6
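
    Note that sum_head as written relies on slicing, so it works for lists but would fail on a generator input. A possible generator-friendly variant (just a sketch, using itertools.islice) could look like this:
    from itertools import islice

    @Pipe
    def sum_head(x, number_to_sum=2):
        # islice works on any iterable, not only on sequences that support slicing
        return sum(islice(x, number_to_sum))
    (i for i in [1,2,3,4]) | sum_head(3)
    #6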
    
    The author has kindly provided many ready-made pipes. Some of them:
    • count - counts the number of elements in the incoming iterable.
    • take(n) - extracts the first n elements of the input iterable.
    • tail(n) - retrieves the last n elements.
    • skip(n) - skips the first n elements.
    • all(pred) - returns True if all elements of the iterable satisfy the predicate pred.
    • any(pred) - returns True if at least one element of the iterable satisfies the predicate pred.
    • as_list / as_dict - converts the iterable to a list / dictionary, if such a conversion is possible.
    • permutations(r=None) - generates all permutations of r elements of the input iterable. If r is not given, it defaults to len(iterable).
    • stdout - writes the iterable, converted to a string, to the standard output stream.
    • tee - writes each element to the standard output stream and passes it on for further processing.
    • select(selector) - applies the selector function to each element and passes the result on for further processing.
    • where(predicate) - passes on for further processing only those elements that satisfy the predicate.
    But these are more interesting:
    • netcat(host, port) - for each element of the iterable, opens a socket, sends the element itself (as a string, of course) and passes the host's response on for further processing.
    • netwrite(host, port) - the same, but does not read back from the socket after sending the data.
    These and other pipes for sorting, traversing and processing a data stream are included in the module by default, since they are really easy to create; a short example of combining them follows.
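
    A quick sketch combining a few of the pipes listed above (the exact set of built-in pipes may vary between versions of the library):
    result = (range(100)
              | where(lambda x: x % 3 == 1)   # keep numbers with remainder 1 modulo 3
              | select(lambda x: x * x)       # square them
              | take(5)                       # lazily take the first five
              | as_list)
    result
    #[1, 16, 49, 100, 169]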

    Under the hood of the Pipe decorator


    Honestly, it was amazing to see how laconic the core code of the module is. Judge for yourself:
    class Pipe:
        def __init__(self, function):
            self.function = function
        def __ror__(self, other):
            return self.function(other)
        def __call__(self, *args, **kwargs):
            return Pipe(lambda x: self.function(x, *args, **kwargs))
    
    That's all, actually. The usual decorator class.

    In the constructor, the decorator saves the function being decorated, turning it into an object of the Pipe class.

    When a pipe is called (the __call__ method), it returns a new pipe whose function is bound to the given arguments.

    The main subtlety is the __ror__ method. It is the reflected counterpart of the "or" operator | (__or__): Python calls it on the right operand, passing the left operand as the argument.

    As a result, the chain is evaluated from left to right: the first element is passed as an argument to the second, the result of the second goes to the third, and so on. Generators travel along the chain just as painlessly.
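
    To see the mechanics, here is a small illustrative sketch (the double pipe is made up for the example) showing that the | syntax is just sugar over __ror__:
    @Pipe
    def double(x):
        return [item * 2 for item in x]

    [1, 2, 3] | double            # Python resolves this as double.__ror__([1, 2, 3])
    #[2, 4, 6]
    double.__ror__([1, 2, 3])     # the same call, spelled out
    #[2, 4, 6]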

    In my opinion, very very elegant.

    Afterword


    The syntax of such pipes is really simple and convenient; I would like to see something like this in popular frameworks, say, for processing data streams or, in a declarative form, for lining up chains of callbacks.

    The only drawback of this implementation is the rather vague error tracebacks.

    Developments of the idea and alternative implementations will be covered in the following articles.
