Alvaro July 28, 2012 at 08:32

MongoDB and C#: new features and non-obvious challenges

Introduction

In early July, the next version (1.5) of the official MongoDB driver for C# was released. Among the new features, support for typed queries stands out: you can now build queries from lambda functions that the driver receives as Expression trees.
In this article I will show examples of the new syntax, which I really like (as I do Expression in C# in general), and also examples of queries where, alas, Expression does not help and we have to fall back on plain strings. I will also discuss why this is so, and whether working with MongoDB from C# will ever be entirely painless.

On the good

Yes, now instead of:

// the id below is a placeholder; a real ObjectId is a 24-character hex string
ObjectId articleId = new ObjectId("dgdfg343ddfg");
IMongoQuery query = Query.EQ("_id", articleId);

you can write like this:

ObjectId articleId = new ObjectId("dgdfg343ddfg");
IMongoQuery query = Query<Article>.EQ(item => item.Id, articleId);

assuming, of course, that there is an Article class describing the shape of the document. It is worth noting that all the query methods of the typed QueryBuilder<TDocument> are completely analogous to those of the untyped QueryBuilder. True, to combine queries you still need Query.And or Query.Or. That is understandable, since both builders produce the same IMongoQuery values, so typed and untyped conditions can be combined however you like.
Alongside UpdateBuilder there is now a typed UpdateBuilder<TDocument> with the corresponding methods.
Before:

Article article = new Article();
IMongoUpdate update = Update.PushWrapped("Articles", article);

And now:

Article article = new Article();
// TDocument stands for the model class that owns the Articles list
IMongoUpdate update = Update<TDocument>.Push(item => item.Articles, article);

In my opinion this is much better, both in elegance and in compile-time control. In general, expression trees are a very powerful and beautiful thing. What is shown here is only the tip of the iceberg, and I have used them in many places. In many ways, this is Reflection with a human face.
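To illustrate the "Reflection with a human face" remark, here is a minimal sketch of how a lambda such as item => item.Id can be turned back into a member name using only System.Linq.Expressions. This is not the driver's implementation: the FieldName helper and the stand-in Article class are hypothetical, and the sketch returns the C# member name without consulting the driver's class maps (which is where a name like "_id" would come from).

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the article's Article model class.
public class Article
{
    public string Id { get; set; }
    public int Rating { get; set; }
}

// Hypothetical helper, not part of the MongoDB driver.
public static class FieldName
{
    // Returns the name of the member accessed in a lambda such as item => item.Id.
    public static string Of<TDocument, TMember>(Expression<Func<TDocument, TMember>> selector)
    {
        Expression body = selector.Body;

        // A member converted to another type (for example boxed to object)
        // is wrapped in a Convert node; unwrap it first.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException("Expected a simple member access such as item => item.Id.", "selector");
        }

        return member.Member.Name;
    }
}
```

For example, FieldName.Of<Article, string>(item => item.Id) returns "Id", and a typo in the member name becomes a compile-time error instead of a silent runtime mismatch.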
Anyway, everything is simple so far. Let's move on to less beautiful things.

On the sad

Simple queries, such as a search by value, work without a hitch. Slightly more complicated ones do too. For example, searching for an element in an array:

IMongoQuery query = Query<Article>.ElemMatch(item => item.Comments, builder => builder.EQ(item => item.Id, comment.Id));

But the strain is already noticeable.
Now let's turn up the heat. Consider a document like this:

{
  _id : "s3d4f5d6sf",
  array1 : [
    {
      _id : "cv434lfgd45",
      array2 : [
        {
          _id : "df4gd45g43f4",
          name : "Logic"
        },
        {
          ...
        }
      ]
    },
    {
      ...
    }
  ]
}

Now the task is to add one more element to the array2 array. In the program:

1) the main document is described by the Doc model class
2) array1 becomes a List<Item1>, where Item1 is the model class for each element of array1
3) array2 becomes a List<Item2>, where Item2 is the model class for each element of array2
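The three model classes might look roughly like the sketch below. It assumes the official driver's MongoDB.Bson package is referenced; the property names and the explicit [BsonElement] mappings to the lowercase element names are my assumptions, since the article does not show the classes themselves.

```csharp
using System.Collections.Generic;
using MongoDB.Bson;
using MongoDB.Bson.Serialization.Attributes;

// Model for each element of array2.
public class Item2
{
    [BsonId]
    public ObjectId Id { get; set; }

    [BsonElement("name")]
    public string Name { get; set; }
}

// Model for each element of array1.
public class Item1
{
    public Item1()
    {
        Array2 = new List<Item2>();
    }

    [BsonId]
    public ObjectId Id { get; set; }

    [BsonElement("array2")]
    public List<Item2> Array2 { get; set; }
}

// Model for the main document.
public class Doc
{
    public Doc()
    {
        Array1 = new List<Item1>();
    }

    [BsonId]
    public ObjectId Id { get; set; }

    [BsonElement("array1")]
    public List<Item1> Array1 { get; set; }
}
```

The lists are initialized in the constructors so that a freshly created document can be filled without null checks.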

(I deliberately made the document abstract so that there are no complaints about the data structure.)

The search query itself poses no problem:

IMongoQuery query = Query.And(
                       Query<Doc>.EQ(item => item.Id, new ObjectId("s3d4f5d6sf")),
                       // match the array1 element whose array2 is to be extended
                       Query<Doc>.ElemMatch(item => item.Array1, builder => builder.EQ(item => item.Id, new ObjectId("cv434lfgd45"))));

The update request, however, is where the difficulties began. I am deliberately not considering the option of fetching the document, adding data to it, and writing it back to the database: that is far too suboptimal, and I would like to update the data in one atomic operation rather than in a series of reads and writes. We need to reach the nested array array2. With the standard MongoDB tools, it can be done like this:

IMongoUpdate update = Update.PushWrapped("array1.$.array2", new Item2());

I could not figure out how to get rid of the magic strings, and the longer I looked at the expression "array1.$.array2", the more suspicious thoughts I had.
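For what it is worth, the path string itself can be assembled from lambdas. The sketch below is only an illustration of that idea: PositionalPath is a hypothetical helper of mine and not part of the driver, the stand-in model classes carry only the members needed here, and the element names are assumed to be the camel-cased member names rather than being looked up in the driver's class maps.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Minimal stand-ins for the model classes described above.
public class Item2 { public string Name { get; set; } }
public class Item1 { public List<Item2> Array2 { get; set; } }
public class Doc { public List<Item1> Array1 { get; set; } }

// Hypothetical helper, not part of the MongoDB driver.
public static class PositionalPath
{
    // Builds "outer.$.inner" from two member-access lambdas.
    public static string For<TDoc, TItem, TChild>(
        Expression<Func<TDoc, IEnumerable<TItem>>> outer,
        Expression<Func<TItem, IEnumerable<TChild>>> inner)
    {
        return ElementName(outer.Body) + ".$." + ElementName(inner.Body);
    }

    private static string ElementName(Expression body)
    {
        // Unwrap a Convert node if the compiler emitted one for the
        // List<T> to IEnumerable<T> conversion.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException("Expected a simple member access.", "body");
        }

        // Assumption: element names are the camel-cased member names
        // (Array1 -> array1); real code would ask the class map instead.
        string name = member.Member.Name;
        return char.ToLowerInvariant(name[0]) + name.Substring(1);
    }
}
```

With these classes, PositionalPath.For<Doc, Item1, Item2>(d => d.Array1, i => i.Array2) yields "array1.$.array2", which can then be passed to Update.PushWrapped. The string is still there; it is merely generated, and the positional "$" remains invisible in the lambdas.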

On loftier matters

Let's start from afar.

There are "static" languages, C# for example. In them, the structure of elements is known before compilation (most of the time), and it is these structures that we operate on. More specifically, we operate on classes, which act as schemas. On the data side, we have objects, that is, instances of classes.

There are "dynamic" languages, JavaScript for example. In them, the usual practice is to form the data structure at run time. In the general case, we take an "empty" object and add or remove methods and fields. Structures as such do not exist at all; there is only the starting point (the prototype) from which we build.

A similar analogy can be drawn for how data is organized.
Relational databases have rigid data schemas and dependencies between them.
Document-oriented databases have only documents, nesting, and lists.

Now I will speculate a little and draw some cruder analogies.
The problem with expressing the "array1.$.array2" case through Expression is that in C# we operate on structures (classes) in order to work with documents (objects). The "reach" of the request adds fuel to the fire. If we leave only the first condition in the selection query, then, if we wish (by setting the multi-update flag), we can add an element to each of the array2 arrays. And in the end, our beautiful query (even if we manage to write it) will still be converted to "array1.$.array2". For a larger document with deeper nesting, the situation only gets worse.

I am not saying that it is impossible to come up with an Expression syntax for such a task; it just seems to me that such a syntax would stop being comprehensible. That is, the link between the syntax and the semantics (what is intuitive and obvious) would dissolve. I am not sure that is what I want in my projects.
For me, the conclusions are fairly obvious: the beautiful syntax is great in relatively simple cases, but in more complex situations you may find that Expression does not simplify the work and only multiplies ambiguities.

Conclusion

I do not have much experience with MongoDB, and I will gladly add to the article a solution to the query problem described above, as well as well-reasoned arguments and debate on how to structure architecture and code in C# for MongoDB.
