
Microsoft DocumentDB, Article Two: Resources and Concepts
As mentioned in the first article, DocumentDB exposes its functionality through a RESTful programming model. Entities stored in the database are called resources, and each one is addressed by a URI. To work with these resources, you use standard HTTP verbs, headers, and status codes.
While we prepare a good DocumentDB example (which takes time and care) and answers to your questions about the first article, we suggest reading a little more about the resources and concepts that DocumentDB is built on.

The DocumentDB resource model is a set of resources arranged in a fixed hierarchy within an account, each of them reachable through a stable URI. Everything starts with a DocumentDB account. An account is a logical container for databases. Each database contains collections, which in turn contain documents, stored procedures, triggers, user-defined functions (UDFs), and so on. Each database also has users, and each user holds a set of permissions for manipulating resources. Permissions take the form of authorization tokens; collections are containers for JSON documents and for logic written in JavaScript.
System resources (accounts, databases, collections, users, stored procedures, triggers, and UDFs) have a fixed schema. Documents and attachments have no schema restrictions and are therefore called user resources. Resources of both kinds are represented as JSON.
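To make the hierarchy more tangible, here is a minimal sketch of how a resource's position in the tree maps to a path. The `resourcePath` helper and the ids (`library`, `books`, `book1`) are hypothetical and exist only for illustration; the segment names (`dbs`, `colls`, `docs`, `sprocs`) follow the REST path convention as we understand it. Links actually returned by the service (the `_self` property used in the examples in this article) may contain system-generated identifiers rather than readable ids.

```javascript
// Hypothetical helper: joins alternating (segment, id) pairs into a path.
// It is an illustration of the hierarchy, not part of any SDK.
function resourcePath(parts) {
    if (parts.length === 0 || parts.length % 2 !== 0) {
        throw new Error("expected a non-empty list of (segment, id) pairs");
    }
    return "/" + parts.join("/");
}

// account -> database -> collection -> document / stored procedure
var collectionPath = resourcePath(["dbs", "library", "colls", "books"]);
var documentPath = resourcePath(["dbs", "library", "colls", "books", "docs", "book1"]);
var sprocPath = resourcePath(["dbs", "library", "colls", "books", "sprocs", "CRUDProc"]);

console.log(collectionPath);
console.log(documentPath);
console.log(sprocPath);
```

Each level of nesting simply adds one more pair to the path, which is why a document can always be located from its database and collection.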

One Azure subscription can hold many accounts. Each account is provisioned as a set of capacity units, and each unit combines SSD storage with a fixed amount of throughput. Units can be added or removed at any time. You can create an account and change its settings in the Microsoft Azure management portal (portal.azure.com) or through the REST API; a good part of the platform's functionality is exposed for management over REST.
If the account is the top-level logical container, then a database is a container for collections and users. An account can have as many databases as you like.
A database can hold as much data as you need, from a few gigabytes to petabytes, and all of it is kept on SSD-backed storage with provisioned throughput. A database is not bound to a single machine, so it can be very large: thousands of collections and terabytes of documents.
A collection is the next level of nesting: a container for JSON documents. A collection is not only a grouping mechanism but also the unit of scale for transactions and queries. The easiest way to scale out is to add more collections and spread the SSD storage across them. Automatic scaling already works within a collection: it grows and shrinks as documents are added or deleted. While DocumentDB is in preview and has only one mode of operation (Standard Preview), the maximum size a collection can grow to is 10 GB.
Auto Indexing
DocumentDB does not require you to plan a schema at all. Documents are not assumed to have one, and as soon as you add them to a collection, DocumentDB indexes them automatically, which makes them available to queries. Automatic indexing of documents, with no need to think about schemas or secondary indexes, is one of the main features of DocumentDB. At the same time, the service sustains a steady volume of very fast writes.
Automatic indexing can be tuned by choosing an indexing policy, trading off performance against storage. You can turn automatic indexing off altogether, choose which documents are indexed and which are not, and choose between synchronous (consistent) and asynchronous (lazy) modes. By default the index is updated synchronously on every insert, replace, or delete. Switching to lazy mode may bring some performance benefit, for example on collections with a large number of read operations.
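The choices above can be sketched as a collection definition that carries an indexing policy. This is a minimal sketch under assumptions: the `collectionWithIndexing` helper is hypothetical, and the property names and values (`indexingPolicy`, `automatic`, `indexingMode`, `"consistent"`, `"lazy"`) reflect our reading of the service's JSON representation, so check the current reference before relying on the exact spelling.

```javascript
// Hypothetical helper: builds a collection definition with an indexing policy.
// options.automatic - set to false to turn automatic indexing off
// options.lazy      - set to true for asynchronous (lazy) index updates
function collectionWithIndexing(id, options) {
    options = options || {};
    return {
        id: id,
        indexingPolicy: {
            // assumed property names, see the note above
            automatic: options.automatic !== false,
            indexingMode: options.lazy ? "lazy" : "consistent"
        }
    };
}

// Default behaviour: automatic, synchronous indexing.
var defaultCollection = collectionWithIndexing("books");

// Lazy index updates for a collection that is mostly read.
var lazyCollection = collectionWithIndexing("archive", { lazy: true });

console.log(JSON.stringify(defaultCollection));
console.log(JSON.stringify(lazyCollection));
```

Keeping the policy inside the collection definition mirrors the idea that indexing is configured per collection rather than per account.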
Multi-document transactions
In an RDBMS, business logic is usually written as stored procedures and triggers that run as a transaction. This forces the developer to know two different languages: the language of the application itself (JS, Python, etc.) and T-SQL. In DocumentDB, a JavaScript execution model is available on collections in the form of stored procedures and triggers, which gives you efficient concurrency control and indexing without being distracted by an abundance of application tools.
DocumentDB itself wraps this logic in an ambient ACID transaction with snapshot isolation. If the JavaScript throws an exception while it runs, the whole transaction is rolled back. The JavaScript executes inside the engine, in the same address space as the buffer pool, which is good for performance.
function businessLogic(name, author) {
    var context = getContext();
    var collectionManager = context.getCollection();
    var collectionLink = collectionManager.getSelfLink();
    // create the document
    collectionManager.createDocument(collectionLink,
        {id: name, author: author},
        function(err, documentCreated) {
            if (err) throw new Error(err.message);
            // filter documents by author
            var filterQuery = "SELECT * from root r WHERE r.author = 'George R.'";
            collectionManager.queryDocuments(collectionLink,
                filterQuery,
                function(err, matchingDocuments) {
                    if (err) throw new Error(err.message);
                    context.getResponse().setBody(matchingDocuments.length);
                    // replace the author
                    for (var i = 0; i < matchingDocuments.length; i++) {
                        matchingDocuments[i].author = "George R. R. Martin";
                        // no callback is needed: the replaces run in parallel
                        collectionManager.replaceDocument(matchingDocuments[i]._self,
                            matchingDocuments[i]);
                    }
                });
        });
}
All of this runs as a single transactional execution triggered by an HTTP POST.
client.createStoredProcedureAsync(collection._self, {id: "CRUDProc", body: businessLogic})
    .then(function(createdStoredProcedure) {
        return client.executeStoredProcedureAsync(createdStoredProcedure.resource._self,
            "NoSQL Distilled",
            "Martin Fowler");
    })
    .then(function(result) {
        console.log(result);
    },
    function(error) {
        console.log(error);
    });
Our hero understands JSON and JS out of the box, so there is no type mismatch between the application and the database. Learn more: Azure DocumentDB REST APIs.
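The all-or-nothing behaviour described in this section can be illustrated with a toy in-memory model. This is only a sketch of the semantics, not of how the engine implements transactions: `runAtomically` and the `books` store are hypothetical names, and the working copy stands in very loosely for a snapshot.

```javascript
// Toy model: run `logic` against a working copy of the store and keep the
// copy only if no exception is thrown. On failure the original store is
// returned untouched, which mimics a rollback.
function runAtomically(store, logic) {
    var workingCopy = JSON.parse(JSON.stringify(store));
    try {
        logic(workingCopy);
    } catch (e) {
        return { committed: false, store: store, error: e.message };
    }
    return { committed: true, store: workingCopy };
}

var books = { docs: [{ id: "A", author: "George R." }] };

// Succeeds: the new document becomes visible.
var ok = runAtomically(books, function (s) {
    s.docs.push({ id: "B", author: "George R." });
});

// Fails halfway: the document written before the exception is discarded.
var failed = runAtomically(books, function (s) {
    s.docs.push({ id: "C", author: "Unknown" });
    throw new Error("validation failed");
});

console.log(ok.committed, ok.store.docs.length);         // true 2
console.log(failed.committed, failed.store.docs.length); // false 1
```

The second call shows the point of the section: a write that happened before the exception leaves no trace once the transaction is rolled back.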
Stored Procedures, Triggers, and UDFs
As already mentioned, business logic can be written entirely in JavaScript as a stored procedure, trigger, or UDF. JavaScript code can be registered for execution in any of these three forms. Triggers and stored procedures can create, read, update, and delete documents, while UDFs have no write access: they are limited to simple side-effect-free operations, such as enumerating the results of a previous operation and building a new result set from them. Each stored procedure, trigger, and UDF runs with a fixed quota of resources and cannot load external JavaScript libraries. When the allocated resources are exceeded, the operation is blocked.
You register a stored procedure, trigger, or UDF through the REST API. After registration it is precompiled and stored as bytecode, and that bytecode is what gets executed.
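For comparison with the stored procedure and trigger samples in this article, here is a minimal sketch of what a UDF definition might look like. It assumes a UDF is described by the same `id` plus `body` shape; the `wordCount` function is a hypothetical example, and registering it with the service and calling it from a query are deliberately not shown. Because a UDF is a pure function with no write access, its body can be exercised locally.

```javascript
// Hypothetical UDF definition: a read-only, side-effect-free computation.
var wordCountUdf = {
    id: "wordCount",
    body: function (text) {
        if (typeof text !== "string") return 0;
        // split on whitespace and ignore empty fragments
        return text.split(/\s+/).filter(Boolean).length;
    }
};

// Local stand-in for a query result set.
var documents = [
    { id: "NoSQL Distilled", author: "Martin Fowler" },
    { id: "One Hundred Years of Solitude", author: "G. G. Marquez" }
];

// Enumerate the results and build a new result set, without modifying
// the source documents.
var projected = documents.map(function (doc) {
    return { id: doc.id, titleWords: wordCountUdf.body(doc.id) };
});

console.log(projected);
```

The UDF never touches the collection; it only derives new values from what a query hands it, which is exactly the restriction described above.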
Register stored procedures
Registering a stored procedure means creating a new stored procedure resource on a collection with an HTTP POST.
var storedProc = {
    id: "validateAndCreate",
    body: function (documentToCreate) {
        // normalize the id before the document is created
        documentToCreate.id = documentToCreate.id.toUpperCase();
        var collectionManager = getContext().getCollection();
        collectionManager.createDocument(collectionManager.getSelfLink(),
            documentToCreate,
            function(err, documentCreated) {
                if (err) throw new Error('Error while creating document: ' + err.message);
                getContext().getResponse().setBody('success - created ' +
                    documentCreated.id);
            });
    }
};
client.createStoredProcedureAsync(collection._self, storedProc)
    .then(function (createdStoredProcedure) {
        console.log("Successfully created stored procedure");
    }, function(error) {
        console.log("Error");
    });
Executing a stored procedure
Executing a stored procedure is again an HTTP POST, with the required parameters passed in the request body.
var inputDocument = {id: "document1", author: "G. G. Marquez"};
client.executeStoredProcedureAsync(createdStoredProcedure.resource._self, inputDocument)
    .then(function(executionResult) {
        assert.equal(executionResult, "success - created DOCUMENT1");
    }, function(error) {
        console.log("Error");
    });
Trigger Registration
Registering a trigger means creating a new trigger resource on a collection with an HTTP POST. When you create it, you specify whether the trigger runs before or after the operation and which type of operation it applies to (for example create, replace, or delete).
var preTrigger = {
    id: "upperCaseId",
    body: function() {
        var item = getContext().getRequest().getBody();
        item.id = item.id.toUpperCase();
        getContext().getRequest().setBody(item);
    },
    triggerType: TriggerType.Pre,
    triggerOperation: TriggerOperation.All
};
client.createTriggerAsync(collection._self, preTrigger)
    .then(function (createdPreTrigger) {
        console.log("Successfully created trigger");
    }, function(error) {
        console.log("Error");
    });
You can unregister a trigger by performing an HTTP DELETE on the trigger resource.
client.deleteTriggerAsync(createdPreTrigger._self)
    .then(function(response) {
        return;
    }, function(error) {
        console.log("Error");
    });
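The body of the `upperCaseId` pre-trigger can also be tried out without a DocumentDB account. The sketch below is a local harness under stated assumptions: `fakeContext` is a hypothetical stand-in that implements only the two request methods the trigger uses, and the trigger logic receives its context through a parameter instead of the global `getContext()` provided by the service.

```javascript
// Same logic as the pre-trigger body, with the context passed in so that
// it can run outside the service.
function upperCaseId(getContext) {
    var item = getContext().getRequest().getBody();
    item.id = item.id.toUpperCase();
    getContext().getRequest().setBody(item);
}

// Hypothetical stand-in for the server-side context. Only getBody and
// setBody on the request are implemented.
function fakeContext(initialBody) {
    var body = initialBody;
    var request = {
        getBody: function () { return body; },
        setBody: function (newBody) { body = newBody; }
    };
    return function getContext() {
        return { getRequest: function () { return request; } };
    };
}

var getContext = fakeContext({ id: "document1", author: "G. G. Marquez" });
upperCaseId(getContext);

var processed = getContext().getRequest().getBody();
console.log(processed.id); // DOCUMENT1
```

A harness like this only checks the transformation itself; whether and when the service invokes the trigger is still decided by how it was registered.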
Attachments
In DocumentDB you can store binary files (blobs) as special entities called attachments. An attachment is a special JSON document that refers to the actual file. For example, the contents of a book may be kept in DocumentDB's own storage or in any other store.
An application can store each user's metadata as a separate document: for example, Alex's metadata for book1 would be available at /colls/alex/docs/book1.
Attachments point to the pages of the book, i.e. /colls/alex/docs/book1/chapter1, /colls/alex/docs/book1/chapter2, and so on.
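The book example can be sketched as plain JSON objects. The assumptions here are significant: the `contentType` and `media` property names reflect our understanding of how an attachment is represented, the `example.com` URLs and the `attachmentPath` helper are hypothetical, and the paths follow the simplified form used in this article rather than the full links the service returns.

```javascript
// Alex's metadata for book1, stored as an ordinary document.
var bookDocumentPath = "/colls/alex/docs/book1";
var bookDocument = { id: "book1", owner: "alex", lastPageRead: 42 };

// Attachments: small JSON documents that point at the real files.
// contentType and media are assumed property names, see the note above.
var chapters = [
    { id: "chapter1", contentType: "application/pdf", media: "https://example.com/book1/chapter1.pdf" },
    { id: "chapter2", contentType: "application/pdf", media: "https://example.com/book1/chapter2.pdf" }
];

// Hypothetical helper: an attachment lives under the document it belongs to.
function attachmentPath(documentPath, attachment) {
    return documentPath + "/" + attachment.id;
}

var chapterPaths = chapters.map(function (chapter) {
    return attachmentPath(bookDocumentPath, chapter);
});

console.log(chapterPaths);
```

The document stays small because the heavy content lives wherever `media` points; only the reference travels with the metadata.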
Summary
In this introductory article we covered the most basic principles and concepts of DocumentDB. The service is new, so we are still actively studying it ourselves and hope to present a nice usage example soon. Stay in touch :)