LINQ for PHP. Part 2. If the mountain won't come to Muhammad, Muhammad must go to the mountain

    As you can see from my previous article comparing LINQ libraries for PHP, there are a lot of libraries, and their quality is low: none of them implements lazy evaluation, only half of them have tests, the supported callback types are limited, and sometimes it is completely unclear what the library has to do with LINQ at all. So I wrote my own library. Meet:

    YaLinqo - Yet Another LINQ to Objects for PHP

    Features:

    • The most comprehensive port of .NET LINQ to PHP, with many additional methods. Some methods are still missing, but work is ongoing. In total, more than 70 methods have been implemented.
    • Lazy evaluation, exception messages, and much more, as in the original LINQ.
    • Detailed PHPDoc documentation for each method. The text is adapted from MSDN articles.
    • 100% unit test coverage.
    • Callbacks can be defined as closures, as "function pointers" in the form of strings and arrays, or as string "lambdas" with support for several syntaxes.
    • Keys get as much attention as values: transformations can be applied to both; most callbacks accept both; where possible, keys are not lost during conversions.
    • Minimal reinventing of the wheel: Iterator, IteratorAggregate, etc. are used for iteration (and they can be used alongside Enumerable); native PHP exceptions are used wherever possible, etc.

    Code example:

    // Filter products with a non-zero quantity and put them into the matching categories,
    // sorted by name. Sort products first by quantity in descending order, then by name.
    from($categories)
        ->orderBy('$v["name"]')
        ->groupJoin(
            from($products)
                ->where('$v["quantity"] > 0')
                ->orderByDescending('$v["quantity"]')
                ->thenBy('$v["name"]'),
            '$v["id"]', '$v["catId"]', 'array("name" => $v["name"], "products" => $e)'
        );
    

    Implemented methods

    • Generation: cycle, emptyEnum (empty), from, generate, toInfinity, toNegativeInfinity, matches, returnEnum (return), range, rangeDown, rangeTo, repeat, split;
    • Projection, filtering: ofType, select, selectMany, where;
    • Grouping, joining: groupJoin, join, groupBy;
    • Aggregation: aggregate, aggregateOrDefault, average, count, max, maxBy, min, minBy, sum;
    • Sets: all, any, contains;
    • Paging: elementAt, elementAtOrDefault, first, firstOrDefault, firstOrFallback, last, lastOrDefault, lastOrFallback, single, singleOrDefault, singleOrFallback, indexOf, lastIndexOf, findIndex, findLastIndex, skip, skipWhile;
    • Conversion: toArray, toArrayDeep, toList, toListDeep, toDictionary, toJSON, toLookup, toKeys, toValues, toObject, toString;
    • Actions: call (do), each (forEach), write, writeLine.

    Example

    Now let's look at the example above in more detail. There are several ways to write this query: with closures or with string lambdas. Lambdas, in turn, have two syntaxes: you can use the default variable names (v and k for value and key, respectively), or you can give the arguments meaningful names.

    The source data (from a database, a JSON service, hard-coded constants, or any other source):

    $products = array(
        array('name' => 'Keyboard',    'catId' => 'hw', 'quantity' =>  10, 'id' => 1),
        array('name' => 'Mouse',       'catId' => 'hw', 'quantity' =>  20, 'id' => 2),
        array('name' => 'Monitor',     'catId' => 'hw', 'quantity' =>   0, 'id' => 3),
        array('name' => 'Joystick',    'catId' => 'hw', 'quantity' =>  15, 'id' => 4),
        array('name' => 'CPU',         'catId' => 'hw', 'quantity' =>  15, 'id' => 5),
        array('name' => 'Motherboard', 'catId' => 'hw', 'quantity' =>  11, 'id' => 6),
        array('name' => 'Windows',     'catId' => 'os', 'quantity' => 666, 'id' => 7),
        array('name' => 'Linux',       'catId' => 'os', 'quantity' => 666, 'id' => 8),
        array('name' => 'Mac',         'catId' => 'os', 'quantity' => 666, 'id' => 9),
    );
    $categories = array(
        array('name' => 'Hardware',          'id' => 'hw'),
        array('name' => 'Operating systems', 'id' => 'os'),
    );
    

    The task itself: filter products with a non-zero quantity and put them into the matching categories. Sort products first by quantity in descending order, then by name. Sort categories by name. The result should be the following (reformatted for brevity):

    Array (
        [hw] => Array (
            [name] => Hardware
            [products] => Array (
                [0] => Array ( [name] => Mouse       [catId] => hw [quantity] =>  20 [id] => 2 )
                [1] => Array ( [name] => CPU         [catId] => hw [quantity] =>  15 [id] => 5 )
                [2] => Array ( [name] => Joystick    [catId] => hw [quantity] =>  15 [id] => 4 )
                [3] => Array ( [name] => Motherboard [catId] => hw [quantity] =>  11 [id] => 6 )
                [4] => Array ( [name] => Keyboard    [catId] => hw [quantity] =>  10 [id] => 1 )
            )
        )
        [os] => Array (
            [name] => Operating systems
            [products] => Array (
                [0] => Array ( [name] => Linux       [catId] => os [quantity] => 666 [id] => 8 )
                [1] => Array ( [name] => Mac         [catId] => os [quantity] => 666 [id] => 9 )
                [2] => Array ( [name] => Windows     [catId] => os [quantity] => 666 [id] => 7 )
            )
        )
    )
    

    The following example uses closures, available since PHP 5.3. It is the most verbose form, but the one best supported by IDEs.

    from($categories)
        ->orderBy(function ($cat) { return $cat['name']; })
        ->groupJoin(
            from($products)
                ->where(function ($prod) { return $prod["quantity"] > 0; })
                ->orderByDescending(function ($prod) { return $prod["quantity"]; })
                ->thenBy(function ($prod) { return $prod["name"]; }),
            function ($cat) { return $cat["id"]; },
            function ($prod) { return $prod["catId"]; },
            function ($cat, $prods) { return array("name" => $cat["name"], "products" => $prods); }
        );
    

    The same query written with string lambdas. The argument names go to the left of the "==>" operator, the return value to the right.

    from($categories)
        ->orderBy('$cat ==> $cat["name"]')
        ->groupJoin(
            from($products)
                ->where('$prod ==> $prod["quantity"] > 0')
                ->orderByDescending('$prod ==> $prod["quantity"]')
                ->thenBy('$prod ==> $prod["name"]'),
            '$cat ==> $cat["id"]',
            '$prod ==> $prod["catId"]',
            '($cat, $prods) ==> array("name" => $cat["name"], "products" => $prods)'
        );
    

    And finally, the shortest form. If there is no "==>" operator, the default variable names are used: v for the value, k for the key, a and b for the values being compared, etc.

    from($categories)
        ->orderBy('$v["name"]')
        ->groupJoin(
            from($products)
                ->where('$v["quantity"] > 0')
                ->orderByDescending('$v["quantity"]')
                ->thenBy('$v["name"]'),
            '$v["id"]', '$v["catId"]', 'array("name" => $v["name"], "products" => $e)'
        );
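
    To make the two string syntaxes more concrete, here is a minimal sketch of how such strings could be turned into closures. It is not YaLinqo's actual implementation: the function name stringLambda and its $defaultArgs parameter are hypothetical, and the parsing is deliberately naive (it splits on the first "==>" and hands the rest to eval).

```php
<?php
// Hypothetical helper, not part of YaLinqo: builds a closure from a string lambda.
// $defaultArgs lists the argument names (without "$") used when "==>" is absent.
function stringLambda($code, array $defaultArgs)
{
    $parts = explode('==>', $code, 2);
    if (count($parts) === 2) {
        // Explicit form: '($cat, $prods) ==> expression' or '$cat ==> expression'.
        $args = trim($parts[0], " \t()");
        $body = trim($parts[1]);
    } else {
        // Short form: only the expression, with default names such as $v and $k.
        $args = '$' . implode(', $', $defaultArgs);
        $body = trim($code);
    }
    // eval() compiles the generated source; never pass untrusted strings here.
    return eval("return function ($args) { return $body; };");
}

$inStock = stringLambda('$v["quantity"] > 0', array('v', 'k'));
$getName = stringLambda('$prod ==> $prod["name"]', array('v', 'k'));

$mouse = array('name' => 'Mouse', 'quantity' => 20);
var_dump($inStock($mouse, 0)); // bool(true)
var_dump($getName($mouse, 0)); // string(5) "Mouse"
```

    Because the string is compiled with eval, the expression inside it stays ordinary PHP, which is the property the string syntax relies on.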
    

    (Questionable) architectural decisions

    You cannot simply copy the original LINQ one-to-one: the languages are different, and so are their features. So I often had to make a choice. Whether the choices are good or bad is for you to judge. Discussion is welcome.

    Keys

    Let's start with the most questionable one: keys are treated as an important piece of data. The reason: they are explicitly present in native PHP iterators, they matter in arrays, and they matter when converting to JSON. Keys are used everywhere in PHP, so I did not want to neglect them as some other libraries do.

    However, sequences in the original LINQ have no keys, so the number of arguments had to grow both for callbacks (now all of them can work with the key where possible) and for the LINQ methods themselves: resultSelector turns into resultSelectorValue + resultSelectorKey. In most cases, though, the developer does not need to think about it: callbacks can be passed with fewer arguments, and all LINQ methods have default values for arguments such as resultSelectorKey.

    Another problem with keys stems from the need to preserve them everywhere. This means that, by default, elements keep their original keys after sorting. PHP arrays are normally iterated in the order in which elements were added, so converting to an array should not be a problem, but you never know.

    If you do not need the keys, there are two simple ways to get rid of them:
    • As the final operation, call toList / toListDeep instead of toArray / toArrayDeep.
    • Call the toValues method, the equivalent of array_values, but lazy like select.
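
    The effect of preserved keys can be seen in plain PHP, without the library. The sketch below uses only built-in functions (arsort, array_values, json_encode); it is an analogy for the behaviour described above, not YaLinqo code.

```php
<?php
// Plain PHP only; this mirrors the behaviour described above, it does not call YaLinqo.
$quantities = array(10, 20, 15);            // keys 0, 1, 2

// arsort() sorts by value (descending) and keeps each element's original key.
arsort($quantities);
$keysAfterSort = array_keys($quantities);   // array(1, 2, 0)

// The keys are no longer 0..n-1 in order, so json_encode() emits an object.
$jsonWithKeys = json_encode($quantities);   // {"1":20,"2":15,"0":10}

// array_values() drops the keys and renumbers from 0, so the result is a JSON list.
$jsonAsList = json_encode(array_values($quantities)); // [20,15,10]

echo $jsonWithKeys, "\n", $jsonAsList, "\n";
```

    Iteration still follows the sorted order in both cases; only the keys, and therefore the JSON shape, differ.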

    Argument Order

    The second most questionable decision is the order of arguments in functions and callbacks. They always go from the (theoretically) most used to the least used. In callbacks, therefore, the value usually comes first and the key second, because the value is almost always needed and the key is not. The downside is that the order of arguments may be harder to remember. For example, select takes the value selector first, whereas toDictionary takes the key selector first.

    Then again, we PHP developers are no strangers to this kind of mess: the whole language is riddled with a completely random order of arguments (the notorious needle and haystack).

    Item Indices

    A non-obvious decision for those who have used the original LINQ: methods such as indexOf and elementAt work with keys, not with the numeric position of the element in the sequence. If you need the position, call toValues first, and the keys will become sequential: 0, 1, 2, 3, etc. Likewise, methods such as select have no overloads with callbacks that accept the element's position; here too, use toValues.
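
    The difference between a key and a position can be shown with built-in array functions. This is a plain PHP analogy (array_search and array_values), not a call into the library.

```php
<?php
// Plain PHP analogy, not YaLinqo code.
$categories = array('hw' => 'Hardware', 'os' => 'Operating systems');

// Searching the original array yields the element's key.
$key = array_search('Operating systems', $categories, true);                    // 'os'

// Re-indexing first (what array_values does) yields the numeric position.
$position = array_search('Operating systems', array_values($categories), true); // 1

var_dump($key, $position);
```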

    Lambda arguments

    In the linq.js library, which inspired me to write mine, all callbacks have arguments named $, $$, $$$, $$$$. That will not work in PHP. I could transform the string, but I would like the code to stay valid PHP even when it sits inside a string. I do not want to give the arguments meaningless names like $a, $b, $c either. So I decided to use names that match the contents:

    • Usually v for the value, k for the key
    • If there are several values, v1 and v2
    • If the value is a sequence, e
    • For the accumulator during aggregation, a
    • For comparison methods, a and b
    • (I may have forgotten something else)

    Disadvantage: you need to know the names. However, with detailed documentation this should not be a problem.

    “Questionable” collections

    There is no List class; the toList method returns the same thing as toArray, only with sequential keys (0, 1, 2, etc.).

    There is a Dictionary class. It was originally conceived solely as a base for Lookup, but has since become a full collection in its own right. Unlike ordinary arrays, it allows objects as keys (as is possible in the original LINQ). In practice, however, the LINQ methods themselves do not support object keys everywhere, because PHP does not allow objects as keys in foreach. All the loops could be rewritten, but whether it is worth the effort is an open question.
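
    For comparison, PHP's standard library already has one object-keyed collection, SplObjectStorage. The sketch below only illustrates how object keys surface in foreach there; it says nothing about how Dictionary is implemented.

```php
<?php
// Standard-library illustration; YaLinqo's Dictionary is a separate class.
$hardware = new stdClass();
$hardware->name = 'Hardware';
$software = new stdClass();
$software->name = 'Operating systems';

$counts = new SplObjectStorage();
$counts[$hardware] = 6;   // the object itself acts as the key
$counts[$software] = 3;

echo $counts[$hardware], "\n"; // 6

// foreach yields a numeric index and the key object; the data stored
// for that object has to be fetched separately.
$seen = array();
foreach ($counts as $index => $object) {
    $seen[] = $index . ':' . $object->name . '=' . $counts[$object];
}
print_r($seen);
```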

    There is a Lookup class. Given a key, it returns a list of values (or an empty array if the key is absent).

    Both collections support the toArray method, which returns an internal array.

    Documentation from MSDN

    The documentation for all methods was copied from MSDN and then adapted to the realities of the port. Some descriptions come from other projects, and some were written from scratch. If you find errors, please report them.

    Overall, the documentation turned out to be quite substantial. Some methods have fairly hefty articles.

    Method Names

    Some words in PHP are reserved by the language itself, and case-insensitively at that. Even empty cannot be used as a method name. So wherever there is a conflict, the methods are renamed (in the list of methods at the beginning of the article, the original method names are given in parentheses). In particular, run / forEach became call / each.

    Exception Names

    PHP does not have all of the built-in exceptions that .NET has. However, I tried to avoid creating unnecessary classes. So, instead of InvalidOperationException, UnexpectedValueException is used. After all, an operation becomes invalid when it meets unexpected values.

    Stable sorting

    Sorting is unstable. That is, when the array [[0,1], [1,0], [0,2]] is sorted by the first element of the nested arrays, nothing guarantees that [0,1] and [0,2] will come out in this order. The result can be either [[0,1], [0,2], [1,0]] or [[0,2], [0,1], [1,0]].

    Why? Because in PHP there are no functions for stable sorting, and usort is used inside the library. Theoretically, you can make the sort stable as in the original LINQ, but is it necessary? Everyone will have to pay for stability with runtime and memory usage. I decided that since we were following the “PHP path”, the instability should be the same as in PHP itself.
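
    For readers who do need stability, a common workaround is to remember each element's original position and use it as a tie-breaker. The sketch below is one way to do that on top of usort; the function name stableSortBy is hypothetical and is not part of YaLinqo. Note that PHP 8.0 and later made the built-in sort functions stable, so the workaround matters only on older versions.

```php
<?php
// Hypothetical helper, not part of YaLinqo: a stable sort by a selected key.
function stableSortBy(array $items, $keySelector)
{
    // Decorate: remember the sort key and the original position of each item.
    $decorated = array();
    $position = 0;
    foreach ($items as $item) {
        $decorated[] = array(call_user_func($keySelector, $item), $position++, $item);
    }

    // Compare by key first; equal keys fall back to the original position.
    usort($decorated, function ($a, $b) {
        if ($a[0] != $b[0]) {
            return $a[0] < $b[0] ? -1 : 1;
        }
        return $a[1] - $b[1];
    });

    // Undecorate: return only the items, re-indexed from 0.
    $result = array();
    foreach ($decorated as $entry) {
        $result[] = $entry[2];
    }
    return $result;
}

$sorted = stableSortBy(
    array(array(0, 1), array(1, 0), array(0, 2)),
    function ($pair) { return $pair[0]; }
);
echo json_encode($sorted), "\n"; // [[0,1],[0,2],[1,0]]
```

    The cost is the one mentioned above: an extra array of decorated entries and a slightly heavier comparison.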

    Other

    Unit test coverage is almost 100%.

    The license is Simplified BSD (2-clause).

    Requirements - PHP 5.3.

    Usage:

    require_once __DIR__ . '/lib/Linq.php'; // replace with your own path
    use \YaLinqo\Enumerable; // optionally shorten the name
    use \YaLinqo\Enumerable as E; // or like this
    // You can call either the global function from or the static method in Enumerable; there is no difference
    Enumerable::from(array(1, 2, 3));
    from(array(1, 2, 3));
    

    Comparison with other libraries

    For clarity, I also added JavaScript libraries to the table; they will be compared in a separate article. The legend is the same as on Wikipedia, but with an additional meaning:

    • red - no way
    • yellow - passable
    • green - just right
    • blue - awesome

    I apologize for the English in the table. In Russian it turned out too long.

    Thanking the author

    I worked for free, and nobody is going to pay me. If you feel a fit of generosity, you can simply vote for these features in PHP and PHPStorm. Perhaps they will get noticed, and using the library will become more pleasant.

    PHP

    1. Iterator::key() is allowed to return only numbers and strings.
      1. 45684 A request for foreach to be key-type agnostic
    2. There was a feature request for a shorter closure syntax, with patches, analysis, etc. attached; the developer who proposed it did his best. But the request was closed as "not needed". :-(

    PHPStorm IDE

    1. PHP code inside strings
      • WI-3477 Inject PHP language inside assert ('literal'), eval and similar
      • WI-2377 No autocompletion for php variables inside string with injected language
    2. PHP code analysis
      • WI-11110 Undefined method: Undefined method wrongly reported when using closures
    3. PHPDoc Comments
      • WI-8270 Error in PhpDoc quick documentation if {link} used twice in a line

    Link

    Download Yet Another LINQ to Objects for PHP from GitHub

    P.S. Please tell me where I could publish a similar article in English.
