How IQueryable and LINQ Data Providers Work

LINQ allows .NET developers to work in a consistent way with both in-memory collections of objects and objects stored in a database or another remote source. For example, to request ten red apples from a list in memory and from a database through Entity Framework, we can use identical code:

List<Apple> appleList;
DbSet<Apple> appleDbSet;
var applesFromList = appleList.Where(apple => apple.Color == "red").Take(10);
var applesFromDb = appleDbSet.Where(apple => apple.Color == "red").Take(10);

However, these queries are executed in different ways. In the first case, when the result is enumerated with foreach, the apples are filtered using the specified predicate, and the first 10 matches are taken. In the second case, the expression tree describing the query is passed to a special LINQ provider, which translates it into an SQL query, executes it against the database, builds C# objects for the 10 records found, and returns them. This behavior is made possible by the IQueryable<T> interface, which is designed for building LINQ providers for external data sources. Below we will try to understand how this interface is organized and used.
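The difference can be observed without a database. The following is a minimal sketch, assuming a hypothetical Apple class and using AsQueryable() (the in-memory provider from System.Linq) as a stand-in for DbSet<Apple>; it is not how Entity Framework itself is wired up. The point is only that the IQueryable<T> version carries a description of the query in its Expression property, while the IEnumerable<T> version is just a chain of iterators.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity type, used only for illustration.
public class Apple
{
    public string Color { get; set; } = "";
}

public static class QueryShapeDemo
{
    public static List<Apple> CreateApples()
    {
        return new List<Apple>
        {
            new Apple { Color = "red" },
            new Apple { Color = "green" },
            new Apple { Color = "red" },
        };
    }

    // LINQ to Objects: the lambda is compiled to a delegate and run by iterators.
    public static IEnumerable<Apple> FromList(List<Apple> apples)
    {
        return apples.Where(apple => apple.Color == "red").Take(10);
    }

    // IQueryable<T>: the same source text is captured as an expression tree.
    // AsQueryable() is an assumption standing in for a real DbSet<Apple>.
    public static IQueryable<Apple> FromQueryable(List<Apple> apples)
    {
        return apples.AsQueryable().Where(apple => apple.Color == "red").Take(10);
    }

    public static void Run()
    {
        IQueryable<Apple> query = FromQueryable(CreateApples());
        // Prints a textual rendering of the query description, not the data.
        Console.WriteLine(query.Expression);
    }
}
```

Both methods return the same two apples when enumerated; only the second one exposes a query description that a provider could translate into another language.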

The IEnumerable and IQueryable Interfaces


At first glance, it might seem that LINQ is based on a set of extension methods for the IEnumerable<T> interface, such as Where(), Select(), First(), Count(), and so on, and that these give the developer the ability to write uniform queries against objects in memory (LINQ to Objects), databases (e.g. LINQ to SQL, LINQ to Entities), and remote services (e.g. LINQ to OData Services). But this is not so. The extension methods for IEnumerable<T> already contain the implementation of the corresponding sequence operations. For example, the First<TSource>(Func<TSource, bool> predicate) method is implemented in .NET Framework 4.5.2, whose source code is publicly available, as follows:

public static TSource First<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
    if (source == null) throw Error.ArgumentNull("source");
    if (predicate == null) throw Error.ArgumentNull("predicate");
    foreach (TSource element in source) {
        if (predicate(element)) return element;
    }
    throw Error.NoMatch();
}

It is clear that in the general case, such a method cannot be performed on data located in a database or service. To execute it, we can only preload the entire data set directly into the application, which for obvious reasons is unacceptable.
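To make the cost concrete, here is a small sketch in which a hypothetical iterator, RemoteRecords, plays the role of a remote source and counts every element it hands over. The names and the counter are illustrative assumptions; the only real API used is Enumerable.First.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FirstCostDemo
{
    // Number of records the "remote" source had to hand over to the application.
    public static int Pulled;

    // Hypothetical stand-in for a remote source: every yielded element
    // represents a record that had to be transferred into the application.
    public static IEnumerable<int> RemoteRecords(int count)
    {
        for (int id = 1; id <= count; id++)
        {
            Pulled++;
            yield return id;
        }
    }

    // Enumerable.First evaluates the predicate inside the application,
    // so it has to pull records one by one until it finds a match.
    public static int FindRecord(int wantedId)
    {
        Pulled = 0;
        return RemoteRecords(1000).First(id => id == wantedId);
    }
}
```

Calling FirstCostDemo.FindRecord(500) leaves Pulled at 500: every record up to the match had to be transferred before the predicate could be applied to it, whereas a database could have located the record on its own side.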

To implement LINQ providers for data external to the application, the IQueryable<T> interface (which inherits from IEnumerable<T>) is used, along with a set of extension methods that are almost identical to those written for IEnumerable<T>. The apple queries at the beginning of the article are executed differently precisely because List<T> implements IEnumerable<T>, whereas DbSet<T> from Entity Framework implements IQueryable<T>.

The distinguishing feature of the IQueryable<T> extension methods is that they contain no data processing logic. Instead, they simply build a syntactic structure describing the query, extending it with every new method call in the chain. When an aggregate method (Count(), etc.) is called, or when the query is enumerated with foreach, the query description is passed to the provider encapsulated inside the specific IQueryable<T> implementation, and the provider converts the query into the language of its data source and executes it. In the case of Entity Framework, this language is SQL; in the case of the .NET driver for MongoDB, it is a JSON search object, and so on.

Incidentally, some "interesting" characteristics of LINQ providers follow from this feature:
  • a query that one provider executes successfully may not be supported by another; moreover, we find this out not when the query is constructed, but only when the provider executes it;
  • before executing the query, the provider can modify it; for example, a limit on the number of returned objects, additional filters, etc. may be added to every query.
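The first point can be illustrated with a sketch of the kind of check a provider might run at execution time. SupportedMethodChecker and its allow-list are hypothetical and do not come from any real provider; ExpressionVisitor, Queryable.Reverse and NotSupportedException are standard .NET types and methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

// Hypothetical validation step of an imaginary provider that can only
// translate Where, Take and Count.
public sealed class SupportedMethodChecker : ExpressionVisitor
{
    private static readonly HashSet<string> Supported =
        new HashSet<string> { "Where", "Take", "Count" };

    public static void Validate(Expression query)
    {
        new SupportedMethodChecker().Visit(query);
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        // Only LINQ operators (static methods of Queryable) are checked here.
        if (node.Method.DeclaringType == typeof(Queryable) && !Supported.Contains(node.Method.Name))
            throw new NotSupportedException(
                "Operator '" + node.Method.Name + "' is not supported by this provider.");
        return base.VisitMethodCall(node);
    }
}

public static class UnsupportedOperatorDemo
{
    // Building the query always succeeds: it only extends the expression tree.
    public static IQueryable<int> BuildQuery()
    {
        return new[] { 3, 1, 2 }.AsQueryable().Where(n => n > 1).Reverse();
    }

    // The problem surfaces only when the provider inspects the tree.
    public static bool IsSupported(IQueryable query)
    {
        try { SupportedMethodChecker.Validate(query.Expression); return true; }
        catch (NotSupportedException) { return false; }
    }
}
```

The query returned by BuildQuery() compiles and is constructed without complaint; only IsSupported(), which stands in for the moment of execution, reveals that the imaginary provider cannot handle Reverse().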

Do-it-yourself LINQ: ISimpleQueryable


Before describing how the IQueryable<T> interface is organized, let's try to write a simple analogue of it ourselves: the ISimpleQueryable<TSource> interface, along with a couple of LINQ-style extension methods for it. This will let us demonstrate the basic principles of working with IQueryable<T> without going into the nuances of its implementation.

public interface ISimpleQueryable<TSource> : IEnumerable<TSource> {
    string QueryDescription { get; }
    ISimpleQueryable<TSource> CreateNewQueryable(string queryDescription);
    TResult Execute<TResult>();
}

In the interface we see the QueryDescription property, which contains the description of the query, and the Execute<TResult>() method, which should execute this query when necessary. The method is generic because the result of execution can be either an enumeration or the value of an aggregate function such as Count(). In addition, the interface has a CreateNewQueryable() method, which creates a new instance of ISimpleQueryable<TSource> with a new query description whenever another LINQ method is added. Note that the query description is represented here as a string, whereas LINQ uses expression trees for this purpose.

Now let's move on to extension methods:

public static class SimpleQueryableExtensions
{
    public static ISimpleQueryable<TSource> Where<TSource>(this ISimpleQueryable<TSource> queryable,
                                                           Expression<Func<TSource, bool>> predicate) {
        string newQueryDescription = queryable.QueryDescription + ".Where(" + predicate.ToString() + ")";
        return queryable.CreateNewQueryable(newQueryDescription);
    }
    public static int Count<TSource>(this ISimpleQueryable<TSource> queryable) {
        string newQueryDescription = queryable.QueryDescription + ".Count()";
        ISimpleQueryable<TSource> newQueryable = queryable.CreateNewQueryable(newQueryDescription);
        return newQueryable.Execute<int>();
    }
}

As we can see, these methods simply add information about themselves to the query description and create a new instance of ISimpleQueryable<TSource>. In addition, the Where() method, unlike its IEnumerable<T> counterpart, accepts not the Func<TSource, bool> predicate itself but the previously mentioned expression tree describing it, Expression<Func<TSource, bool>>. In this example, this merely lets us obtain a string with the predicate code; in real LINQ, it makes it possible to preserve all the details of the query in the form of an expression tree.

Finally, let's create a simple implementation of ISimpleQueryable<TSource> that contains everything needed to write LINQ queries except the code that executes them. To make it more realistic, we add a reference to the data source (_dataSource), which should be used when a query is executed by the Execute<TResult>() method.

public class FakeSimpleQueryable<TSource> : ISimpleQueryable<TSource>
{
    private readonly object _dataSource;
    public string QueryDescription { get; private set; }
    public FakeSimpleQueryable(string queryDescription, object dataSource) {
        _dataSource = dataSource;
        QueryDescription = queryDescription;
    }
    public ISimpleQueryable<TSource> CreateNewQueryable(string queryDescription) {
        return new FakeSimpleQueryable<TSource>(queryDescription, _dataSource);
    }
    public TResult Execute<TResult>() {
        // QueryDescription should be processed here and the query applied to _dataSource
        throw new NotImplementedException();
    }
    public IEnumerator<TSource> GetEnumerator() {
        return Execute<IEnumerator<TSource>>();
    }
    IEnumerator IEnumerable.GetEnumerator() {
        return GetEnumerator();
    }
}

Now consider a simple query for FakeSimpleQueryable:

var provider = new FakeSimpleQueryable<string>("", null);
int result = provider.Where(s => s.Contains("substring")).Where(s => s != "some string").Count();

Let's try to figure out what happens when the above code is executed (see also the figure below):
  • first, the first call to Where() takes the empty query description from the FakeSimpleQueryable instance created by the constructor, appends ".Where(s => s.Contains("substring"))" to it, and creates a second FakeSimpleQueryable instance with the new description;
  • then the second call to Where() takes the query description from the previously created FakeSimpleQueryable, appends ".Where(s => s != "some string")" to it, and again creates a new, third instance of FakeSimpleQueryable with the query description ".Where(s => s.Contains("substring")).Where(s => s != "some string")";
  • finally, the call to Count() takes the query description from the FakeSimpleQueryable instance created in the previous step, appends ".Count()" to it, and creates a fourth FakeSimpleQueryable instance, after which it calls the Execute<int>() method, since the query cannot be extended any further;
  • as a result, inside the Execute<int>() method, QueryDescription is equal to ".Where(s => s.Contains("substring")).Where(s => s != "some string").Count()", which then has to be processed.
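The accumulation of the description can be reproduced in isolation. The sketch below leaves out the interface and the fake implementation and only repeats the string concatenation that the extension methods perform; DescriptionDemo is a hypothetical name. Note that the predicate text is produced by Expression.ToString(), so its exact form (for example, extra parentheses around a comparison) may differ slightly from the C# source shown in the steps above.

```csharp
using System;
using System.Linq.Expressions;

public static class DescriptionDemo
{
    public static string BuildDescription()
    {
        // Assigning a lambda to Expression<...> makes the compiler emit
        // an expression tree instead of a delegate.
        Expression<Func<string, bool>> first = s => s.Contains("substring");
        Expression<Func<string, bool>> second = s => s != "some string";

        string description = "";                          // root queryable
        description += ".Where(" + first.ToString() + ")";  // first Where()
        description += ".Where(" + second.ToString() + ")"; // second Where()
        description += ".Count()";                         // Count(), then Execute
        return description;
    }

    public static void Run()
    {
        Console.WriteLine(BuildDescription());
    }
}
```

Each step corresponds to one FakeSimpleQueryable instance in the walkthrough: the string only grows, and nothing is evaluated until the final description reaches Execute.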



Real IQueryable... and IQueryProvider


Now let's see how the IQueryable interface is declared in .NET:

public interface IQueryable : IEnumerable {
    Expression Expression { get; }
    Type ElementType { get; }
    IQueryProvider Provider { get; }
}
public interface IQueryable<out T> : IEnumerable<T>, IQueryable {}
public interface IQueryProvider {
    IQueryable CreateQuery(Expression expression);
    IQueryable<TElement> CreateQuery<TElement>(Expression expression);
    object Execute(Expression expression);
    TResult Execute<TResult>(Expression expression);
}

Note that:
  • .NET has both a generic and a non-generic version of IQueryable;
  • the Expression property stores the tree describing the LINQ query (in our implementation, we used the string QueryDescription instead);
  • the ElementType property contains the type of the elements returned by the query and is used in LINQ provider implementations to check type compatibility;
  • the pair of methods for creating new IQueryable instances (CreateQuery() and CreateQuery<TElement>()) and the pair of methods for executing the query (Execute() and Execute<TResult>()) have been moved to a separate interface, IQueryProvider; presumably this separation was needed to keep the query itself, which is recreated on every extension method call, apart from the object that actually has access to the data source, does all the main work, and may be too "heavyweight" to recreate constantly;
  • the IQueryable.Provider property points to the associated IQueryProvider instance.
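These members can be inspected on any IQueryable. The sketch below uses the in-memory provider obtained through AsQueryable() as an assumption in place of a real external provider; QueryableAnatomyDemo and Describe are hypothetical names.

```csharp
using System;
using System.Linq;

public static class QueryableAnatomyDemo
{
    // Summarizes the three members declared by the non-generic IQueryable.
    public static string Describe(IQueryable query)
    {
        return "ElementType=" + query.ElementType.Name
             + ", Expression.NodeType=" + query.Expression.NodeType
             + ", Provider=" + query.Provider.GetType().Name;
    }

    public static void Run()
    {
        IQueryable<int> root = new[] { 1, 2, 3 }.AsQueryable();
        IQueryable<int> filtered = root.Where(n => n > 1);

        // The root query wraps the source in a constant node; after Where()
        // the expression becomes a method call node that contains the root.
        Console.WriteLine(Describe(root));
        Console.WriteLine(Describe(filtered));
    }
}
```

For the root query the expression is a constant node that wraps the source, and after Where() it becomes a method call node; ElementType stays Int32 in both cases.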

Now let's look at how the IQueryable<T> extension methods work, using the Where() method as an example:

public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate) {
    if (source == null) throw Error.ArgumentNull("source");
    if (predicate == null) throw Error.ArgumentNull("predicate");
    return source.Provider.CreateQuery<TSource>(
        Expression.Call(
            null,
            ((MethodInfo)MethodBase.GetCurrentMethod()).MakeGenericMethod(typeof(TSource)),
            new Expression[] { source.Expression, Expression.Quote(predicate) }
            ));
}

We see that the method constructs a new instance of IQueryable<TSource> by passing CreateQuery<TSource>() an expression in which a call to the Where() method itself, with the predicate as an argument, is wrapped around the original expression from source.Expression.
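The same construction can be performed by hand, which shows that nothing in it is specific to the extension method. This is a sketch under assumptions: it uses the in-memory provider behind AsQueryable(), and it looks up Queryable.Where by name through an Expression.Call overload rather than through MethodBase.GetCurrentMethod(); ManualWhereDemo is a hypothetical name.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;

public static class ManualWhereDemo
{
    public static IQueryable<int> EvenNumbers(IQueryable<int> source)
    {
        Expression<Func<int, bool>> predicate = n => n % 2 == 0;

        // Describe the call Queryable.Where<int>(source, predicate):
        // the first argument is the existing query expression, the second
        // is the predicate quoted so that it stays an expression tree.
        MethodCallExpression whereCall = Expression.Call(
            typeof(Queryable),
            nameof(Queryable.Where),
            new[] { typeof(int) },
            source.Expression,
            Expression.Quote(predicate));

        // Ask the provider for a new query wrapping the extended expression.
        return source.Provider.CreateQuery<int>(whereCall);
    }
}
```

Enumerating ManualWhereDemo.EvenNumbers(new[] { 1, 2, 3, 4 }.AsQueryable()) yields 2 and 4, the same result as calling Where() directly on the source.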

Thus, despite some differences between the IQueryable and IQueryProvider interfaces and the ISimpleQueryable interface created earlier, the principles of their use in LINQ are the same: each extension method added to the query supplements the expression tree with information about itself and then creates a new instance of IQueryable<T> using the CreateQuery<TElement>() method, while aggregate methods additionally initiate execution of the query by calling the Execute<TResult>() method.

A few words about developing LINQ providers


Since the mechanism for constructing LINQ queries is already implemented in .NET, developing a LINQ provider mostly comes down to implementing the Execute() and Execute<TResult>() methods. This is where the expression tree received for execution has to be parsed and converted into the language of the data source, the query executed, and the results wrapped in C# objects and returned. Unfortunately, this procedure involves handling a considerable number of nuances. Moreover, little information on developing LINQ providers is available.
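As a rough illustration of the translation step only, the sketch below walks an expression tree with an ExpressionVisitor and produces SQL text. Everything about it is an assumption made for the example: the Apple class, the WhereToSqlTranslator name, the table name, the T-SQL "TOP" syntax, and the decision to support nothing beyond Where() with an equality against a literal and Take(). A real provider would also use query parameters instead of inlined literals and would go on to execute the query and materialize objects.

```csharp
using System;
using System.Linq;
using System.Linq.Expressions;
using System.Text;

// Hypothetical entity type, used only for illustration.
public class Apple
{
    public string Color { get; set; } = "";
}

public sealed class WhereToSqlTranslator : ExpressionVisitor
{
    private readonly StringBuilder _where = new StringBuilder();
    private int? _take;

    public static string Translate(Expression query, string table)
    {
        var translator = new WhereToSqlTranslator();
        translator.Visit(query);
        var sql = new StringBuilder("SELECT ");
        if (translator._take.HasValue) sql.Append("TOP ").Append(translator._take.Value).Append(' ');
        sql.Append("* FROM ").Append(table);
        if (translator._where.Length > 0) sql.Append(" WHERE ").Append(translator._where);
        return sql.ToString();
    }

    protected override Expression VisitMethodCall(MethodCallExpression node)
    {
        if (node.Method.DeclaringType != typeof(Queryable))
            throw new NotSupportedException(node.Method.Name);
        Visit(node.Arguments[0]); // translate the inner part of the chain first
        switch (node.Method.Name)
        {
            case "Where":
                // The predicate arrives quoted; unwrap it to reach the lambda.
                var lambda = (LambdaExpression)((UnaryExpression)node.Arguments[1]).Operand;
                if (_where.Length > 0) _where.Append(" AND ");
                Visit(lambda.Body);
                break;
            case "Take":
                _take = (int)((ConstantExpression)node.Arguments[1]).Value;
                break;
            default:
                throw new NotSupportedException(node.Method.Name);
        }
        return node;
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        if (node.NodeType != ExpressionType.Equal)
            throw new NotSupportedException(node.NodeType.ToString());
        Visit(node.Left);
        _where.Append(" = ");
        Visit(node.Right);
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Only direct property access on the lambda parameter maps to a column.
        if (!(node.Expression is ParameterExpression))
            throw new NotSupportedException(node.ToString());
        _where.Append(node.Member.Name);
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        if (node.Value is IQueryable) return node; // the root of the query
        if (node.Value is string text) _where.Append('\'').Append(text.Replace("'", "''")).Append('\'');
        else _where.Append(node.Value);
        return node;
    }
}
```

For the query new Apple[0].AsQueryable().Where(apple => apple.Color == "red").Take(10), calling WhereToSqlTranslator.Translate(query.Expression, "Apples") returns "SELECT TOP 10 * FROM Apples WHERE Color = 'red'". Anything outside the narrow supported subset, such as a captured variable in the predicate or a Where() placed after Take(), is either rejected or translated too naively, which hints at how many cases a production provider has to cover.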

I hope this article will be useful to anyone who wants to understand how LINQ providers for remote data sources are organized, or who is considering writing such a provider but has not yet decided to start.
