3 worst things you can do with Linq to Database

    Disclaimer: By Linq to Database hereinafter we mean the most common .NET libraries for accessing relational DBMSs using Linq queries. Such as Linq2SQL, Entity Framework, NHibername, Linq2db and the like
    Since the advent of IQueryable providers for accessing databases, writing queries has become much easier. On the one hand, this is good, because applications have become much easier to make. On the other hand, the quality of requests has fallen dramatically. And the point is not that the providers generate bad SQL, but that people write queries without understanding how it works under the hood. For many C # programmers, it IQueryable ’s magic, and there is a natural desire to reduce the effect of magic on applications, isolate magic from the rest of the code. This approach not only does not correct the situation, but exacerbates it ...

    Code example

    In a recent holivar, one of the participants posted an amazing code example:
    public IEnumerable GetDocumentsToday()
    {
        foreach (var item in Db.Documents.Where(x => EntityFunctions.TruncateTime(x.CreatedOn) == DateTime.Today))
        {
            yield return Mapper.Map(item);
        }
    }
    

    It was assumed that this code could isolate the "complexity" of IQueryable providers from the consumer. But it is this seemingly simple code that contains 3 bad things when working with Linq to Database.

    So the hit parade:

    Third place

    Long processing of query results. Most providers are kept open DataReader while the listing is in progress IQueryable. This leads to the fact that all resources allocated by the server for this request are held until the end of the crawl and closure DataReader. In this example yield return, it is still used , that is, the processing of each element can take a very long time.

    In the case of using SQL Server (or another locking database) at the isolation level of Repeatable Read, the locks will also be held until the end of processing the query results, and if there is an escalation, then the whole table.

    How to fix
    Always clearly fulfill the materialization of the results, for example by means of .Load()or .ToList()or .ToArray(). If the query results are very large, but you need to draw everything in, then it is best to organize paging with blocks that will not cause escalation of locks.

    Second place

    Ineffective predicate. In this example, the predicate is converted in SQL to something like this depending on the provider:
    ...
    WHERE cast(CreatedOn as Date) = cast(GETDATE() as Date)
    
    Naturally, no index can optimize such a selection and there will be a full crawl of the table.

    Even if such a predicate would not be a problem in this particular query, the code tends to be copied to other places along with all the problems.

    How to fix
    Do not use functions and type conversions for column values ​​in predicates and joins. This example can be rewritten like this:
    Db.Documents.Where(x => x.CreatedOn >= DateTime.Today && x.CreatedOn < DateTime.Today.AddDays(1))
    
    Now it will be possible to make an index on the field CreatedOn.

    First place

    Lack of projection. It would seem a banal thing - the request pulls all the fields of the table, and then maps to the DTO, which can have much fewer fields ... If the table has LOB fields, this can have a very negative impact on the speed of the query. In some ORMs, you can configure deferred loading of LOB fields, but even in this case, having a projection is more efficient. If there is a projection in the query, you can build a covering index, that is, an index that includes all the columns in the projection. That is, projections can not only make the request faster, but also give the opportunity for further optimization at the database level.

    How to fix
    Always make projections only necessary for the fields, except when you need to get the object, and then change it and write it to the database.

    Corrected Example

    public IEnumerable GetDocumentsToday()
    {
        return Db.Documents
                 .Where(x => x.CreatedOn >= DateTime.Today && x.CreatedOn < DateTime.Today.AddDays(1))
                 .Select (x => new DocumentDto {...})
                 .ToList();
    }
    
    You can also adjust the class Mapperso that it generates projections.

    Conclusion

    This chart contains the three most common problems that arise when using Linq. I optimized many applications and almost always found these problems.

    As for a specific example, most likely in many cases all fields of DocumentDto will not be used, but only a part of them. Also, not all records will be displayed, but with pagination. Therefore, it is advantageous to return IQueryable from the GetDocumentsToday method, and apply filters and restrictions closer to where the query results are used. But more about that next time.

    Also popular now: