Compound key WHERE clause in Entity Framework

More recently, when developing a project with versioning, I ran into the problem of selecting items from a list of composite keys.

Description of the problem:

When developing “in the conditions” of versioning, primary key tables consists of Id and Revision. It is necessary to get a selection from the database table from the transmitted list of composite keys (Id, Revision). Such an SQL query would look like this (for five items in a list):
/* Запрос 1 */
select *
from dbo.[Items] i
where  (i.Id = 1 and i.Revision = 2) 
	OR (i.Id = 1 and i.Revision = 4)
	OR (i.Id = 3 and i.Revision = 3)
	OR (i.Id = 3 and i.Revision = 4)
	OR (i.Id = 5 and i.Revision = 9)

But the Entity Framework does not allow you to write such a query for a list of composite keys. The maximum that can be done by standard means is:
context.Items.Where(i=> idList.Contains(i.Id) && revisionList.Contains(i.Revision))

what will turn into such a request (conceptually):
/* Запрос 2 */
select *
from dbo.[Items] i
where i.Id in (1, 3, 5) and i.Revision in (2, 3, 4, 9)

This query will return incorrect results if the Items table has elements with the following identifiers:
Id = 3, Revision = 2
Id = 3, Revision = 4
And in the list of composite keys there are such lines:
Id = 5, Revision = 4
Id = 3, Revision = 2

So what to do?

The following method is common on the Internet, for ORM:
You need to combine Id and Revision in the database and in the list and compare by the resulting value. By concatenation is meant the concatenation of strings, or, if Id and Revision are of type int, then offset and get of type long (bigint).
For the case of concatenation:
/* Запрос 3 */
select *
from dbo.[Items] i
where CONVERT(nvarchar(10), i.Id)+ ',' + CONVERT(nvarchar(10), i.Revision) in ('1,2', '1,4', '3,3', '3,4', '5,9')

If we compare “Request 1” and “Request 3”, then to execute the latter, we need to build an additional column (moreover, to construct it, we need to carry out 2 type conversion operations and 2 concatenation operations). And in “Request 1” only comparison operations are used. Based on this, I assume that "Request 1" is cheaper.
But MSSQL 2008 R2 gives exactly the same Execution Plan for both queries (for queries in the form in which they are presented here).

So, how do you get the Entity Framework to compose a query in the same form as "Request 1".

Entity Framework allows you to write such a request for a specific set of composite keys:
context.Items.Where(i=> (i.Id == 1 && i.Revision == 2) || (i.Id == 1 && i.Revision == 4) || (i.Id == 3 && i.Revision == 3) || (i.Id == 3 && i.Revision == 4) || (i.Id == 5 && i.Revision == 9))

But what to do with the list? Generate Expression Dynamically! .NET allows you to do this. To do this, use the Expression class with a bunch of static methods for all occasions. We can write the expression as follows:
i.Id == list.Id && i.Revision == list.Revision

Then copy it the required number of times, substituting the necessary values ​​from the list in the form of constants instead of list.Id and list.Revision and then collect them into one through an operation, for example, ||

How to do it:
Suppose our list of composite keys is List. Where Identifier is a wrapper for a pair of Id, Revision.
In the Entity Framework, Where expressions of the BinaryExpression type are used, each BinaryExpression expression consists of 3 main fields:
  • Left (type Expression) - left side
  • Right (type Expression) - the right part
  • NodeType (type ExpressionType) - operation (OR, AND, etc.)

list.Id and list.Revision are also expressions, only of type MemberExpression. Using its ReflectedType property, you can find out what type of list we have, and if it is Identifier, then we can replace MemberExpression (list.Id) with ConstantExpression (Id value for a specific instance can be obtained through Reflection or using a delegate)

After that, we get a list of expressions that need to be collected into one.
The easiest option is to collect them one at a time:
BinaryExpression BuildExpression(ExpressionType type, List expressions)
{
    if(expressions.Count == 0)
        return null;
    if(expressions.Count == 1)
        return expressions[0];
    var resExpression = expressions[0];
    for (int i = 1; i < expressions.Count; i++)
        resExpression = Expression.MakeBinary(type, resExpression, expressions[i]);
    return resExpression;
}

But there is one serious problem. Each expression substituted on the right is compared with the entire expression on the left, i.e. it turns out this expression:
((((((i.Id == 1 && i.Revision == 2) || (i.Id == 1 && i.Revision == 4)) || (i.Id == 3 && i.Revision == 3)) || (i.Id == 3 && i.Revision == 4)) || (i.Id == 5 && i.Revision == 9))

In this expression, the number of extra brackets at the beginning will be equal to the number of elements in the list. And all would be fine, when parsing this expression to build an SQL query, the Entity Framework uses recursion to go deeper into the expression, and, with ~ 1000 elements in the list (practical observations), a StackOverflowException flies. By the way, the quite interesting LINQ Dynamic Query Library project that I tried to use at home, but refused because of the above problem, suffers from the same problem .

But this problem can be defeated! To do this, you need to build the expression without substituting the elements on the right, and building it like a binary tree:
private static BinaryExpression BuildExpression(ExpressionType type, List expressions)
{
    if (expressions.Count == 0)
        return Expression.Equal(Expression.Constant(0), Expression.Constant(1)); //Если выражений нет, то и запрос должен вернуть пустой список
    else if (expressions.Count == 1)
        return expressions[0];
    var center = expressions.Count / 2;
    return Expression.MakeBinary(type,
        BuildExpression(type, expressions.Take(center).ToList()),
        BuildExpression(type, expressions.Skip(center).Take(expressions.Count - center).ToList()));
}

The expression obtained in this way does not raise a StackOverflowException even with 1,000,000 items in the list (I didn’t check it anymore, since with so many parameters SQL Server refuses to execute the query in an adequate amount of time).

Based on all this, I made extension methods that overload the Where method and use it in my projects.

To share these developments, I created a project on codeplex, where I uploaded the source code for EFCompoundkeyWhere.

I hope this comes in handy for someone.

Also popular now: