
YaLinqo (LINQ to Objects for PHP) - Version 2.0

LINQ lets you write SQL-like queries directly in code. LINQ to Objects, specifically, lets you query objects, arrays, and anything else you work with in your code.
What is it for?
If you have a database, you have your favorite ORM (or your favorite plain SQL, whichever you prefer). But sometimes objects come from web services or from files, and a whole host of other objects may need non-trivial processing: transformation, filtering, sorting, grouping, aggregation... You would reach for the usual ORM or SQL, but there is no database. This is where LINQ to Objects comes to the rescue, in this case YaLinqo.
What can it do?
- The most complete port of .NET LINQ to PHP, with many additional methods. More than 70 methods are implemented in total.
- Lazy evaluation, exception messages, and more behave as in the original LINQ.
- Detailed PHPDoc documentation for each method. The documentation text is adapted from MSDN.
- 100% unit test coverage.
- Callbacks can be given as closures, as “function pointers” in the form of strings and arrays, or as string “lambdas” with several supported syntaxes.
- Keys get as much attention as values: transformations can be applied to both; most callbacks receive both; wherever possible, keys are not lost during transformations.
- Minimal reinvention of the wheel: Iterator, IteratorAggregate, etc. are used for iteration (and can be used alongside Enumerable); wherever possible, native PHP exceptions are used, and so on.
- Composer is supported; a package is available on Packagist.
- No external dependencies.
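As a rough illustration of the callback forms listed above, the sketch below uses plain PHP only, not YaLinqo itself. The helper applyToEach, the function shout, and the class Formatter are hypothetical names made up for this example; string "lambdas" are specific to the library and are not reproduced here.

```php
<?php
// Hypothetical helper (not part of YaLinqo): lazily applies $selector
// to every element, passing both value and key, and preserves the keys.
function applyToEach($source, callable $selector)
{
    foreach ($source as $key => $value) {
        yield $key => $selector($value, $key);
    }
}

// A user-defined function, referenced below by its name as a string.
function shout($value)
{
    return strtoupper($value);
}

// A class whose method is referenced below as an [object, 'method'] array.
class Formatter
{
    public function label($value, $key)
    {
        return $key . '=' . $value;
    }
}

$data = ['a' => 'x', 'b' => 'y'];

// 1. Closure.
$viaClosure = iterator_to_array(applyToEach($data, function ($v, $k) {
    return "$k:$v";
}));

// 2. "Function pointer" as a string.
$viaString = iterator_to_array(applyToEach($data, 'shout'));

// 3. "Function pointer" as an array.
$viaArray = iterator_to_array(applyToEach($data, [new Formatter(), 'label']));
```

All three calls go through the same code path because PHP treats closures, function names, and [object, method] pairs uniformly as callables.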
What happened?
A year has passed since PHP 5.5 came out with all sorts of goodies such as generators and iterator fixes. Since I am the one responsible for the most complete LINQ port for PHP, I decided it was time to update it and use the new language features.
What's new?
Speed, above all. A ton of code has been thrown out (about 800 lines by Git's count): the crutches like Enumerator for generating Iterators are gone; the useless collections, whose only real advantage was storing objects as keys, are gone; the call_user_func calls are gone... And most importantly, the heaps of convoluted, impossible-to-follow code for generating iterators are gone. What remains is foreach and yield. Taken together, this gave a big speed increase.
You have no idea with what pleasure I replaced this monster:
return new Enumerable(function () use ($self, $inner, $outerKeySelector, $innerKeySelector, $resultSelectorValue, $resultSelectorKey)
{
    /** @var $self Enumerable */
    /** @var $inner Enumerable */
    /** @var $arrIn array */
    $itOut = $self->getIterator();
    $itOut->rewind();
    $lookup = $inner->toLookup($innerKeySelector);
    $arrIn = null;
    $posIn = 0;
    $key = null;
    return new Enumerator(function ($yield) use ($itOut, $lookup, &$arrIn, &$posIn, &$key, $outerKeySelector, $resultSelectorValue, $resultSelectorKey)
    {
        /** @var $itOut \Iterator */
        /** @var $lookup \YaLinqo\collections\Lookup */
        while ($arrIn === null || $posIn >= count($arrIn)) {
            if ($arrIn !== null)
                $itOut->next();
            if (!$itOut->valid())
                return false;
            $key = call_user_func($outerKeySelector, $itOut->current(), $itOut->key());
            $arrIn = $lookup[$key];
            $posIn = 0;
        }
        $args = array($itOut->current(), $arrIn[$posIn], $key);
        $yield(call_user_func_array($resultSelectorValue, $args), call_user_func_array($resultSelectorKey, $args));
        $posIn++;
        return true;
    });
});
with this concise version:
return new Enumerable(function () use ($inner, $outerKeySelector, $innerKeySelector, $resultSelectorValue, $resultSelectorKey) {
    $lookup = $inner->toLookup($innerKeySelector);
    foreach ($this as $ok => $ov) {
        $key = $outerKeySelector($ov, $ok);
        if (isset($lookup[$key]))
            foreach ($lookup[$key] as $iv)
                yield $resultSelectorKey($ov, $iv, $key) => $resultSelectorValue($ov, $iv, $key);
    }
});
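To show why foreach and yield are enough on their own, here is a minimal standalone sketch of lazy, key-preserving operations written as generators. The names naturals, lazyWhere, and lazyTake are made up for illustration; this is not YaLinqo's implementation, only the same general technique.

```php
<?php
// Hypothetical infinite source: yields 1, 2, 3, ... with keys 0, 1, 2, ...
function naturals()
{
    for ($i = 1; ; $i++) {
        yield $i;
    }
}

// Hypothetical lazy filter: yields only matching elements, keeping their keys.
function lazyWhere($source, callable $predicate)
{
    foreach ($source as $key => $value) {
        if ($predicate($value, $key)) {
            yield $key => $value;
        }
    }
}

// Hypothetical lazy limit: stops pulling from the source after $count elements.
function lazyTake($source, $count)
{
    if ($count <= 0) {
        return;
    }
    foreach ($source as $key => $value) {
        yield $key => $value;
        if (--$count === 0) {
            return;
        }
    }
}

// Nothing is computed until iteration starts, so an infinite source is fine.
$evens = lazyTake(lazyWhere(naturals(), function ($v) {
    return $v % 2 === 0;
}), 3);

$result = iterator_to_array($evens); // [1 => 2, 3 => 4, 5 => 6]
```

Each function is a plain loop with a yield inside; PHP builds the iterator state machine itself, which is the part that previously had to be written by hand.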
In addition, I finally added proper version tags to the repository and defined branch aliases in composer.json, so working with Composer should now be less painful.
So what is it, after all?
Let's say you have arrays:
$products = array(
    array('name' => 'Keyboard',    'catId' => 'hw', 'quantity' =>  10, 'id' => 1),
    array('name' => 'Mouse',       'catId' => 'hw', 'quantity' =>  20, 'id' => 2),
    array('name' => 'Monitor',     'catId' => 'hw', 'quantity' =>   0, 'id' => 3),
    array('name' => 'Joystick',    'catId' => 'hw', 'quantity' =>  15, 'id' => 4),
    array('name' => 'CPU',         'catId' => 'hw', 'quantity' =>  15, 'id' => 5),
    array('name' => 'Motherboard', 'catId' => 'hw', 'quantity' =>  11, 'id' => 6),
    array('name' => 'Windows',     'catId' => 'os', 'quantity' => 666, 'id' => 7),
    array('name' => 'Linux',       'catId' => 'os', 'quantity' => 666, 'id' => 8),
    array('name' => 'Mac',         'catId' => 'os', 'quantity' => 666, 'id' => 9),
);
$categories = array(
    array('name' => 'Hardware',          'id' => 'hw'),
    array('name' => 'Operating systems', 'id' => 'os'),
);
Suppose you need to put the products with a non-zero quantity into their categories, with the categories sorted by name and the products inside each category sorted first by quantity in descending order, then by name. By now you are mentally building triply nested loops and array function calls, and trying to remember which prefix the right sort function has... Instead of all that, you can write:
$result = from($categories)
    ->orderBy('$cat ==> $cat["name"]')
    ->groupJoin(
        from($products)
            ->where('$prod ==> $prod["quantity"] > 0')
            ->orderByDescending('$prod ==> $prod["quantity"]')
            ->thenBy('$prod ==> $prod["name"]'),
        '$cat ==> $cat["id"]', '$prod ==> $prod["catId"]',
        '($cat, $prods) ==> [
            "name" => $cat["name"],
            "products" => $prods
        ]'
    );
If the creators of PHP were not stubborn donkeys and had not abandoned the pull request with lambdas, you could even write it like this:
$result = from($categories)
    ->orderBy($cat ==> $cat['name'])
    ->groupJoin(
        from($products)
            ->where($prod ==> $prod['quantity'] > 0)
            ->orderByDescending($prod ==> $prod['quantity'])
            ->thenBy($prod ==> $prod['name']),
        $cat ==> $cat['id'], $prod ==> $prod['catId'],
        ($cat, $prods) ==> [
            'name' => $cat['name'],
            'products' => $prods,
        ]
    );
Either way, the output is:
Array (
    [hw] => Array (
        [name] => Hardware
        [products] => Array (
            [0] => Array ( [name] => Mouse [catId] => hw [quantity] => 20 [id] => 2 )
            [1] => Array ( [name] => CPU [catId] => hw [quantity] => 15 [id] => 5 )
            [2] => Array ( [name] => Joystick [catId] => hw [quantity] => 15 [id] => 4 )
            [3] => Array ( [name] => Motherboard [catId] => hw [quantity] => 11 [id] => 6 )
            [4] => Array ( [name] => Keyboard [catId] => hw [quantity] => 10 [id] => 1 )
        )
    )
    [os] => Array (
        [name] => Operating systems
        [products] => Array (
            [0] => Array ( [name] => Linux [catId] => os [quantity] => 666 [id] => 8 )
            [1] => Array ( [name] => Mac [catId] => os [quantity] => 666 [id] => 9 )
            [2] => Array ( [name] => Windows [catId] => os [quantity] => 666 [id] => 7 )
        )
    )
)
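For comparison, here is roughly what the hand-written equivalent looks like in plain PHP without any library. This is only a sketch on a shortened data set (fewer products, no id field), and keying the result by category id is an assumption chosen to mirror the output above.

```php
<?php
// Shortened sample data, same shape as in the article.
$products = [
    ['name' => 'Keyboard', 'catId' => 'hw', 'quantity' => 10],
    ['name' => 'Mouse',    'catId' => 'hw', 'quantity' => 20],
    ['name' => 'Monitor',  'catId' => 'hw', 'quantity' => 0],
    ['name' => 'Joystick', 'catId' => 'hw', 'quantity' => 15],
    ['name' => 'CPU',      'catId' => 'hw', 'quantity' => 15],
    ['name' => 'Windows',  'catId' => 'os', 'quantity' => 666],
    ['name' => 'Linux',    'catId' => 'os', 'quantity' => 666],
];
$categories = [
    ['name' => 'Operating systems', 'id' => 'os'],
    ['name' => 'Hardware',          'id' => 'hw'],
];

// Keep only products with a non-zero quantity.
$inStock = array_filter($products, function ($p) {
    return $p['quantity'] > 0;
});

// Sort by quantity descending, then by name ascending.
usort($inStock, function ($a, $b) {
    if ($a['quantity'] !== $b['quantity']) {
        return $a['quantity'] < $b['quantity'] ? 1 : -1;
    }
    return strcmp($a['name'], $b['name']);
});

// Sort categories by name.
usort($categories, function ($a, $b) {
    return strcmp($a['name'], $b['name']);
});

// Attach the already-sorted products to their categories.
$result = [];
foreach ($categories as $cat) {
    $prods = [];
    foreach ($inStock as $p) {
        if ($p['catId'] === $cat['id']) {
            $prods[] = $p;
        }
    }
    $result[$cat['id']] = ['name' => $cat['name'], 'products' => $prods];
}
```

It works, but the intent (filter, sort, group) is spread across several statements and comparison functions, and everything is computed eagerly.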
For aesthetes and optimizers, anonymous functions can be used instead of "string lambdas": instead of '$prod ==> $prod["quantity"] > 0' you can write function ($prod) { return $prod['quantity'] > 0; }. For wild obfuscators, the default argument names can be used (v is the value, k is the key, etc.), so the same predicate can be written simply as '$v["quantity"] > 0' (not recommended for complex nested queries).
Where is LINQ to Database?
Yes, in the .NET world LINQ queries are also used in ORMs (some would say that is their main purpose), but this feature is deliberately absent from the library, because any attempt to implement it would result in a ton of crutches, slowdowns, and other ugliness due to the lack of language-level support (expression parsing in particular). LINQ to Objects already could not do without a crutch in the form of "string lambdas", but here you would need a full-fledged PHP-to-SQL translator with complete analysis and tons of optimizations, all built by hand, which is a hopelessly laborious undertaking.
Come on!
P.S. The old version supports PHP 5.3. It is functionally on par, but somewhat slower (iterators, after all).
P.P.S. A sane competitor (Ginq) has finally appeared. It has tons of SPL, Symfony, and other architecture, zero comments, and a lot of slowness (overhead from 2x to 50x relative to my version). Benchmarks are in progress; I will write about them next time.