Configuring a MongoDB Sharded Cluster with X.509 Authentication
Hello everyone! I was recently handed the interesting task of deploying a MongoDB cluster with replication, sharding and x.509 certificate authentication. In this article I want to share my thoughts and experience. Since some things were not trivial and did not work on the first attempt, I think this step-by-step guide may be useful to those who are just getting acquainted with data sharding and with MongoDB in general.
I will also be glad to hear recommendations on extending or changing the cluster configuration, as well as questions or criticism about the article itself or its subject.
Introduction
The cluster was introduced for a service that collects statistics from customer devices and serves aggregated views of them on a website and through a REST API. For a long time the project ran steadily under low load, so a MongoDB server installed "out of the box" (without sharding or replication) did its job perfectly well, and our peace of mind was ensured by daily database backups run from cron. Trouble struck, as usual, at one fine moment, after several large clients arrived with a large number of devices, data and requests. The result was unacceptably slow queries against the old database, and the culmination was a server outage during which we almost lost data.
Thus, practically overnight, we needed to improve fault tolerance, data safety and performance, with the possibility of future scaling. We decided to use the existing potential of MongoDB to solve these problems, namely to build a sharded cluster with replication and migrate the existing data to it.
Bit of theory
To begin with, let's take a look at a MongoDB sharded cluster and its main components. Sharding is a method of horizontal scaling for systems that store and serve data. Unlike vertical scaling, where performance is increased by making a single server more powerful (a faster CPU, more RAM or disk space), sharding works by distributing the data set and the load across several servers, adding new servers as needed (which is exactly our case).
The advantage of such scaling is its almost unlimited potential for growth, whereas a vertically scaled system is inherently limited, for example, by the hardware available from the hosting provider.
What do we expect from switching to a MongoDB sharded cluster? First of all, to distribute the read/write load across the shards of the cluster, and secondly, to achieve high fault tolerance (constant data availability) and data safety through redundant copies (replication).
In MongoDB, sharding happens at the collection level, meaning we can explicitly specify which collections should be distributed across the existing shards of the cluster. It also means that the entire set of documents of a sharded collection is split into roughly equal-sized parts, chunks, which are then migrated more or less evenly between the shards of the cluster by the MongoDB balancer.
Sharding is disabled by default for all databases and collections, and the cluster system databases, such as admin and config, cannot be sharded at all. If we try, MongoDB refuses unequivocally:
mongos> sh.enableSharding("admin")
{ "ok" : 0, "errmsg" : "can't shard admin database" }
A sharded MongoDB cluster imposes three prerequisites: the presence of the shards themselves; mongos routers, through which all communication between the cluster and its clients must go; and a config server (either an additional mongod instance or, as recommended, a replica set).
The official MongoDB documentation says: "In production, all shards should be replica sets." Being a replica set, each shard keeps multiple copies of its data, which increases its fault tolerance (in terms of data availability on at least one member of the replica set) and, of course, improves data safety.
A replica set is a group of running mongod instances that store copies of the same data set. In the case of a shard, this data set is the set of chunks assigned to that shard by the MongoDB balancer.
One member of the replica set is designated PRIMARY and accepts all write operations (while also storing and serving data); the remaining mongod instances are declared SECONDARY and asynchronously update their copies of the data set from the PRIMARY. They are also available for reads. As soon as the PRIMARY becomes unavailable for some reason and stops communicating with the other members, an election is held among all available members of the replica set to choose a new PRIMARY. In addition to PRIMARY and SECONDARY there can be a third kind of replica set member: the arbiter (ARBITER).
The arbiter does not keep a copy of the data set; instead it takes part in elections and protects the replica set from a voting deadlock. Imagine a replica set with an even number of members that split their votes in half between two candidates with the same total, over and over again. Adding an arbiter to such an "even" replica set resolves the election: it casts the deciding vote for one of the candidates for the PRIMARY "position", without requiring resources to maintain yet another copy of the data set.
Note that a replica set is a union of mongod instances specifically, so nothing prevents you from assembling a replica set on a single server, pointing the members at directories on different physical drives and gaining some degree of data safety; still, the ideal option is to run the mongod instances on different servers. MongoDB is very flexible in this respect and lets us assemble whatever configuration suits our needs and capabilities without imposing strict restrictions. A replica set on its own, outside the context of a sharded cluster, is one of the typical MongoDB deployment schemes and provides a high degree of fault tolerance and data protection; in that case each member of the replica set stores a full copy of the entire data set rather than a part of it.
Infrastructure
The cluster configuration described below is built on three OpenVZ virtual containers. Each of the virtual machines is located on a separate dedicated server.
Two of the virtual machines (hereafter server1.cluster.com and server2.cluster.com) have more resources: they are responsible for replication, sharding and serving data to clients. The third machine (server3.cluster.com) has a weaker configuration: its purpose is to run the arbiter mongod instances.
The resulting cluster has three shards. In our scheme we followed the recommendation to build shards as replica sets, but with one caveat. Each shard replica set of our cluster has its own PRIMARY, SECONDARY and ARBITER, running on three different servers. There is also a config server, likewise built with data replication.
However, we only have three servers, one of which does not replicate data (except as part of the config replica set), so all three shards are effectively located on two servers.
In the diagrams in the MongoDB documentation, the mongos routers are shown running on the application servers. I decided to break this rule and place the mongos instances (we will have two) on the data servers, server1.cluster.com and server2.cluster.com, to avoid additional MongoDB configuration on the application servers and because of certain restrictions on those servers. The application servers can connect to either of the two mongos routers, so if one of them has problems they reconnect to the other after a short timeout. The application servers, in turn, sit behind a DNS with round-robin configured: it hands out one of the two addresses in turn, providing a primitive balancing of connections (client requests). There are plans to replace it with some "smarter" DNS or balancer (maybe someone can suggest a good solution in the comments, I would be grateful!).
For clarity, I give a general diagram of the formed cluster with server names and applications running on them. Colons indicate the designated application ports.

Initial setup
We go to server1.cluster.com and install the latest version of the MongoDB Community Edition package from the official repository; at the time the cluster was built this was version 3.2.8. In my case all cluster machines run Debian 8; detailed installation instructions for your OS can be found in the official documentation.
We import the repository public key, update the package list and install the MongoDB server with its set of utilities:
server1.cluster.com:~# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
server1.cluster.com:~# echo "deb http://repo.mongodb.org/apt/debian wheezy/mongodb-org/3.2 main" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
server1.cluster.com:~# apt-get update
server1.cluster.com:~# apt-get install -y mongodb-org
Done! As a result we now have a MongoDB server installed and already running on this machine. For now, stop the mongod service (we will come back to it later):
server1.cluster.com:~# service mongod stop
Next, create a directory that will store all the data of our cluster; in my case it lives at /root/mongodb. Inside it we create the following directory structure:
.
├── cfg
├── data
│   ├── config
│   ├── rs0
│   ├── rs1
│   └── rs2
├── keys
└── logs
The data subdirectory will hold the data of our replica sets (including the config replica set). In cfg we will keep the configuration files used to run the required mongo{d/s} instances. In keys we will copy the keys and certificates for x.509 authentication of cluster members. The purpose of the logs folder is, I think, clear to everyone.
Similarly, the procedure with installation and directories must be repeated on the remaining two servers.
Before proceeding to configure and wire up the components of our cluster, let's make sure everything works as expected. Run a mongod instance on port 27000, pointing it at the data directory /root/mongodb/data/rs0:
mongod --port 27000 --dbpath /root/mongodb/data/rs0
On the same server, open another terminal and connect to the running mongod:
mongo --port 27000
If everything went well, we end up in the mongodb shell and can run a couple of commands. By default, mongo switches us to the test database; we can verify this by entering the command:
> db.getName()
test
Delete the database that we do not need with the command:
> db.dropDatabase()
{ "ok" : 1 }
And we will initialize a new database to experiment with by switching to it:
> use analytics
switched to db analytics
Now let's try to insert some data. To keep things concrete, I propose to use, for all further operations in this article, the example of a system that collects statistics: data periodically arrives from remote devices, is processed on the application servers and is then stored in the database.
We add a couple of devices:
> db.sensors.insert({'s':1001, 'n': 'Sensor1001', 'o': true, 'ip': '192.168.88.20', 'a': ISODate('2016-07-20T20:34:16.001Z'), 'e': 0})
WriteResult({ "nInserted" : 1 })
> db.sensors.insert({'s':1002, 'n': 'Sensor1002', 'o': false, 'ip': '192.168.88.30', 'a': ISODate('2016-07-19T13:40:22.483Z'), 'e': 0})
WriteResult({ "nInserted" : 1 })
Here:
s - the serial number of the sensor;
n - its string identifier;
o - current status (online/offline);
ip - the sensor's IP address;
a - the time of its last activity;
e - an error flag.
And now a few statistics records of the form:
> db.statistics.insert({'s': 1001, 'ts': ISODate('2016-08-04T20:34:16.001Z'), 'param1': 123, 'param2': 23.45, 'param3': 'OK', 'param4': true, 'param5': '-1000', 'param6': [1,2,3,4,5]})
WriteResult({ "nInserted" : 1 })
where:
s - the sensor number;
ts - the timestamp;
param1..param6 - some statistics values.
Clients of the statistics service often run aggregation queries to obtain meaningful summaries of the statistics collected from their devices. Almost every query involves the sensor serial number (the s field), and sorting and grouping are frequently applied to it, so for optimization (and, later, for sharding) we add an index to the statistics collection:
mongos> db.statistics.ensureIndex({"s":1})
The selection and creation of the necessary indexes is a topic for a separate discussion, but for now I will limit myself to this.
Authentication using x.509 certificates
To understand the problem, let's look a little ahead and picture mongod instances running on different servers that need to be combined into replica sets, mongos routers connected to them, and clients connecting securely to the resulting cluster. Naturally, the parties exchanging data must be authenticated when connecting (i.e. trusted), and it is desirable that the data channel itself be protected as well. MongoDB supports TLS/SSL, as well as several authentication mechanisms. One of the options for establishing trust between the parties in the cluster is the use of keys and certificates. Regarding the choice of a mechanism based on this option, the MongoDB documentation gives the following recommendation:
“Keyfiles are bare-minimum forms of security and are best suited for testing or development environments. For production environments we recommend using x.509 certificates.”
X.509 is the ITU-T standard for public key infrastructure and privilege management. This standard defines the format and how public keys are distributed using signed digital certificates. The certificate associates the public key with some subject - the user of the certificate. The reliability of this relationship is achieved through a digital signature, which is performed by a trusted certification authority.
(Besides x.509, MongoDB also offers highly reliable Enterprise-level mechanisms, Kerberos Authentication and LDAP Proxy Authentication, but that is not our case; here we will look at configuring x.509 authentication.)
The x.509 authentication mechanism requires a secure TLS/SSL connection to the cluster, which is enabled by the corresponding mongod startup argument --sslMode, or by the net.ssl.mode parameter in the configuration file. Authenticating a client connecting to the server then comes down to verifying a certificate rather than a login and password.
In the context of this mechanism, the generated certificates fall into two types: cluster member certificates, which are tied to a specific server and are intended for internal authentication of the mongod instances on different machines, and client certificates, which are tied to an individual user and are intended for authenticating external clients of the cluster.
To fulfil the requirements of x.509 we need a single certificate authority (CA). Both the client certificates and the cluster member certificates will be issued from it, so first of all we create the private key of our CA. It would be more correct to perform all of the following actions, and to store the secret keys, on a separate machine, but in this article I perform everything on the first server (server1.cluster.com):
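A sketch of the key generation, assuming OpenSSL; the key file name and size here are my choice:
cd /root/mongodb/keys
openssl genrsa -aes256 -out mongodb-CA-key.key 4096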
When prompted for a passphrase, we enter and confirm some strong combination, for example "temporis$filia$veritas" (yours, of course, will be different and more complex). The passphrase must be remembered: we will need it to sign every new certificate.
Next, we create the CA certificate (right after running the command we will be asked for the passphrase of the key specified in the key parameter):
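Something along these lines; the file names match the ones used elsewhere in the article, while the exact set of options is my reconstruction:
openssl req -x509 -new -days 36500 -key mongodb-CA-key.key -out mongodb-CA-cert.crt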
Note the days parameter: it controls how long the certificate remains valid. I am not sure who will be working on this project, or for how long, so to rule out unpleasant surprises we give the certificate 36,500 days of life, which corresponds to 100 years (very optimistic, isn't it?).
After the passphrase is checked, we will be asked to enter information about the organization that owns the certificate. Imagine that our large organization is called "SomeSysyems" and is located in Moscow (the values are entered after the prompts).
Excellent! The CA is ready, and we can now use it to sign client certificates and cluster member certificates. I will add that the accuracy of the entered data does not affect the functioning of the CA certificate itself; however, the signed certificates will depend on these values, which will be discussed later.
The procedure for creating certificates for cluster members (certificates for external clients will be considered separately) is as follows:
We create a private key and a certificate signing request (CSR) for our first server (server1.cluster.com). Pay attention to an important detail: all fields are filled in the same way as for the root certificate, except CN (Common Name), which must be unique for each certificate. In our case its value is the FQDN (Fully Qualified Domain Name) of the particular server:
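A sketch of this step; the key and CSR file names are my assumptions:
openssl req -new -nodes -newkey rsa:2048 -keyout server1.key -out server1.csr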
I left the extra fields empty. If you decide to specify an additional password (A challenge password []:), you will have to provide the certificate password in the mongod configuration, via the net.ssl.PEMKeyPassword and net.ssl.clusterPassword parameters (details on these parameters are in the documentation).
Next, we will sign the CSR file with our CA certificate and get a public certificate (*.crt file):
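Presumably something like the following (we will be asked for the CA key passphrase):
openssl x509 -req -days 36500 -in server1.csr -CA mongodb-CA-cert.crt -CAkey mongodb-CA-key.key -CAcreateserial -out server1.crt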
Now we need to make a .pem file:
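The PEM file is simply the private key and the public certificate concatenated together:
cat server1.key server1.crt > server1.pem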
We will use the PEM file directly when launching mongod instances, and we will indicate it in the configuration.
Now the same procedure has to be repeated for the certificates of the remaining servers: for the second server we create a key and a CSR (the extra fields are again left empty), sign the CSR with our CA certificate to receive the public certificate (*.crt file) of the second server, and build a PEM file; and we do exactly the same for the third server's certificate:
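A condensed sketch of these steps for server2 and server3 (file names assumed, CN set to each server's FQDN):
openssl req -new -nodes -newkey rsa:2048 -keyout server2.key -out server2.csr
openssl x509 -req -days 36500 -in server2.csr -CA mongodb-CA-cert.crt -CAkey mongodb-CA-key.key -CAcreateserial -out server2.crt
cat server2.key server2.crt > server2.pem

openssl req -new -nodes -newkey rsa:2048 -keyout server3.key -out server3.csr
openssl x509 -req -days 36500 -in server3.csr -CA mongodb-CA-cert.crt -CAkey mongodb-CA-key.key -CAcreateserial -out server3.crt
cat server3.key server3.crt > server3.pem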
I repeat that I created all the keys and certificates on the first server and then copied them to the corresponding servers as needed. Thus, each of the three servers should end up with the public CA certificate (mongodb-CA-cert.crt) and the PEM file of that particular server (server<N>.pem).
For a correct launch we need to pass a number of parameters to the mongod instances. This can be done with a configuration file, or by passing all the necessary values as command-line arguments; almost every configuration option has a corresponding command-line argument. In my opinion the configuration file option is more justified, since a separate structured file is easier to read and extend, and launching an instance then comes down to passing it a single argument, the location of the configuration file:
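For example (the config file path and name are my assumption):
mongod --config /root/mongodb/cfg/rs0.conf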
So, create a configuration file for the mongod instance of the first shard replica (rs0) on the first server:
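A sketch of what such a configuration file might look like, based on the ports, paths and x.509 settings used throughout this article (the log file name and the clusterAuthMode choice are my assumptions):
net:
  port: 27000
  ssl:
    mode: requireSSL
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
security:
  authorization: enabled
  clusterAuthMode: x509
storage:
  dbPath: /root/mongodb/data/rs0
systemLog:
  destination: file
  path: /root/mongodb/logs/rs0.log
  logAppend: true
replication:
  replSetName: rs0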
We create a similar file for the second shard replica set (rs1), changing the port, the replica set name, the data directory and the log file, and, by analogy, for the third replica set (rs2).
In addition to the instances that make up the three shard replica sets, our cluster will also contain mongod instances serving the config server, which will likewise be built as a replica set (rscfg).
It is worth explaining that the role of the config server can be performed by a single mongod (just as with a shard), but for reliability and fault tolerance it is recommended to build the config server as a replica set as well.
The configuration file of the config replica set differs from those of the data replica sets by the presence of the sharding.clusterRole parameter, which tells the mongod instance its special purpose:
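A sketch of the relevant fragment (the rest of the file repeats the structure shown above, with its own port, data directory and log file):
sharding:
  clusterRole: configsvr
replication:
  replSetName: rscfg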
Now we need to copy all the created configuration files to the other servers. After copying, do not forget to change the values of the net.ssl.PEMKeyFile and net.ssl.clusterFile parameters so that they point to the certificates of the corresponding server (server2.pem, server3.pem).
On the first server, run mongod on port 27000 without the "production" configuration file, specifying only the port and the data directory. This way the launched mongod instance does not yet consider itself a member of a replica set and does not impose the strict connection and authentication requirements that we specified in the configuration files:
mongod --port 27000 --dbpath /root/mongodb/data/rs0
Next, we need to connect to the running mongod and add a superuser for the future replica set, so that later, once the authorization specified in our config file is enabled, we have the right to modify the replica set, including initializing it. As practice has shown, enabling x.509 authorization does not prevent us from adding traditional users (authenticated by login and password) to the database. Nevertheless, I decided not to use that possibility and to apply the x.509 mechanism everywhere, both at the cluster level and when forming the replica sets. To be clear: the user we are creating now exists only at the level of this replica set; it will not be available from other replica sets or at the cluster level.
For the new user we will need another certificate, created just as in the "Authentication using x.509 certificates" section. The difference is that this certificate is tied not to a cluster member (a mongod instance or a server) but to an account; in other words, we are creating a client certificate. It will be tied to the superuser (root role) of the replica set of the first shard (rs0). The built-in MongoDB roles are described in the corresponding section of the official documentation.
We go back to our CA machine, generate another key and certificate signing request, sign the certificate (we will again need the passphrase of the CA key) and create a PEM file:
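A sketch of these steps; the file names, the CN (rsroot) and the client OU are my assumptions:
openssl req -new -nodes -newkey rsa:2048 -keyout rsroot.key -out rsroot.csr
openssl x509 -req -days 36500 -in rsroot.csr -CA mongodb-CA-cert.crt -CAkey mongodb-CA-key.key -CAcreateserial -out rsroot.crt
cat rsroot.key rsroot.crt > rsroot.pem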
Pay attention to the Organizational Unit Name (OU) field: when generating client certificates it must be different from the one specified when generating the cluster member certificates. Otherwise, when you add a user to the cluster whose subject (explained below) has an OU equal to the one in the cluster members' certificates, MongoDB may refuse with an error.
A user authenticated with the x.509 mechanism is added in a somewhat unusual way: we specify not a name and password but the identifier (subject) of the certificate that corresponds to it. You can obtain the subject from the PEM file by running the command:
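Presumably the documented form of this command, applied to our client PEM file:
openssl x509 -in rsroot.pem -inform PEM -subject -nameopt RFC2253 -noout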
In the output we are interested in the line beginning with "subject=" (we take everything after "subject=" and the following space). Connect to the mongod and add the user:
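A sketch of adding the user; the subject string below is only an illustration, use the one printed for your certificate:
mongo --port 27000
> use $external
> db.createUser({
    user: "CN=rsroot,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU",
    roles: [ { role: "root", db: "admin" } ]
  })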
$external is the name of the virtual database used to create users whose credentials are stored outside MongoDB, as in our case (a certificate file is used for authentication).
Now we exit the mongo shell and restart mongod, this time with the proper configuration file. The same needs to be done on the second and third servers. After that, all mongod instances of the first replica set (rs0) are running.
We connect to the mongod using the certificate of the replica set superuser we created (rsroot) and authenticate, specifying the subject as the user name:
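A sketch of the connection and authentication (paths and the subject are illustrative):
mongo server1.cluster.com:27000 --ssl \
  --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt \
  --sslPEMKeyFile /root/mongodb/keys/rsroot.pem
> db.getSiblingDB("$external").auth({
    mechanism: "MONGODB-X509",
    user: "CN=rsroot,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU"
  })
1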
Initialize our replica:
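A sketch of the initialization; the assumption here is that every member of rs0 listens on port 27000 on its server:
> rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "server1.cluster.com:27000" },
      { _id: 1, host: "server2.cluster.com:27000" },
      { _id: 2, host: "server3.cluster.com:27000", arbiterOnly: true }
    ]
  })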
Pay attention to the arbiterOnly parameter for the third server, which we agreed at the very beginning to use as the "arbiter server".
Having reconnected to the mongod, we can tell by the "rs0" prefix in the shell prompt that it now belongs to the replica set of the same name:
rs0:PRIMARY>
(your current server may have been elected SECONDARY instead).
In a similar fashion, the two remaining data replica sets need to be assembled.
1. Run mongod without the config file on the first server (the port and the data directory change);
2. Connect to the running mongod and add the superuser of the replica set (rs1). I use the same certificate for all replica sets, so the subject is the same as for the first replica set;
3. Restart mongod on the first server with its configuration file. On the second and third servers we also bring up the mongod instances with their corresponding configs;
4. Connect to the mongod with the certificate, authenticate and initialize the rs1 replica set.
Repeat the procedure for the third replica (rs2).
1. Run mongod without the config file on the first server (do not forget to change the port and the data directory);
2. Connect to the mongod and add the superuser of the replica set (rs2);
3. Restart mongod on the first server with its configuration file. On the second and third servers we also bring up the mongod instances with their corresponding configs;
4. Connect to the mongod with the certificate, authenticate and initialize the rs2 replica set.
I decided to describe the replica set of the config server separately, since it has a couple of peculiarities that require additional steps. First, all users that we add to the config replica set become available at the cluster level through mongos, so for it I will create separate users tied to individual certificates. Second, MongoDB does not allow arbiters in a config replica set; an attempt to add one ends with an error message.
For this reason the config replica set will have two SECONDARY mongod instances. Let's create another certificate, for the superuser of the rscfg replica set, which, as I said, will also act as root at the cluster level.
1. Launch mongod without the config file on the first server;
2. Connect to the mongod and add the superuser of the replica set (rscfg);
3. Restart mongod on the first server with the config file. On the second and third servers we also bring up the mongod instances with their corresponding configuration files;
4. Connect to the mongod with the certificate, authenticate and initialize the config replica set (rscfg), as sketched below:
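A sketch of the initialization; the config server port (27019) is my assumption:
> rs.initiate({
    _id: "rscfg",
    configsvr: true,
    members: [
      { _id: 0, host: "server1.cluster.com:27019" },
      { _id: 1, host: "server2.cluster.com:27019" },
      { _id: 2, host: "server3.cluster.com:27019" }
    ]
  })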
Our replica-set-based config server is ready. Now we can start mongos and connect to the cluster.
The purpose of mongos is to provide an access point to the cluster's data (clients can access the data only through mongos). In the diagrams in the MongoDB documentation, mongos instances are shown running on the application servers. In the cluster layout presented here, two mongos instances run directly on server1.cluster.com and server2.cluster.com.
First of all, just as for mongod, we create a configuration file that we will pass to mongos at startup.
The main difference between the mongos settings and those of mongod is that mongos has no data directory: it does not store data, it only routes it. mongos obtains all the information it needs about the configuration and state of the cluster from the config database of the config server. It learns how to reach the config server from the sharding.configDB parameter. Since our config server is a replica set, we specify it in replica set notation: the name of the replica set, a slash, and then a comma-separated list of hosts with their ports. We will run mongos on the default MongoDB port, 27017.
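A sketch of such a mongos configuration file for the first server (the log file name and the config server port 27019 are my assumptions):
net:
  port: 27017
  ssl:
    mode: requireSSL
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
security:
  clusterAuthMode: x509
systemLog:
  destination: file
  path: /root/mongodb/logs/mongos.log
  logAppend: true
sharding:
  configDB: rscfg/server1.cluster.com:27019,server2.cluster.com:27019,server3.cluster.com:27019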
Copy the configuration file to both servers (specifying the corresponding PEM certificates) and run mongos with the command:
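Assuming the file is named as in my sketch above:
mongos --config /root/mongodb/cfg/mongos.conf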
Let's check the correctness of our actions: connect to mongos and authenticate with the certificate of the root user that we added to the config replica set (remember that a user of the config replica set is a cluster-level user). The "mongos>" prompt shows what we are connected to, and db.auth() should return an affirmative 1.
In general, MongoDB "does not like" it when you connect as root and will warn you that this should not be done for security reasons. Therefore, when working with a real cluster, I also recommend adding a user (naturally, with a separate certificate) granted the built-in userAdminAnyDatabase role. This role has almost all the rights necessary for administrative tasks.
I think it is worth giving an example of creating one more user certificate here. This user will have access only to the analytics database, and all applications of our service will connect to the cluster on its behalf.
So, we go to the directory where our Certificate Authority is located and create a key and a certificate signing request for a new user, which we will call analyticsuser; then we sign the certificate and create a PEM file, exactly as we did before.
Let's see what subject our certificate has, using the same openssl -subject command as before.
Connect to the cluster (mongos) as a user with administrative rights and add the new user:
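A sketch of adding the user (the subject is illustrative, taken from the analyticsuser certificate):
mongos> use $external
mongos> db.createUser({
    user: "CN=analyticsuser,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU",
    roles: [ { role: "readWrite", db: "analytics" } ]
  })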
Please note that we gave the analyticsuser user read and write permissions for the analytics database only. This protects the cluster from possible (careless or malicious) actions by external applications against the settings of the analytics database itself and of the cluster as a whole.
In our case, sharding will split the heavily loaded statistics collection by a chosen index, the shard key, across the shards we are about to add. When sharding is activated for a collection, the entire set of its documents is divided into n parts called chunks. How many chunks the collection is split into when sharding is enabled, and how often new chunks are created, depends on the amount of data in your collection and on the chunksize parameter, which controls the chunk size and defaults to 64 MB. If you want a different chunk size in your cluster, it must be set before activating sharding on the collections, because the new chunk size is applied only to newly created chunks.
To change the chunk size, we connect to mongos with the superuser certificate and authenticate. Authentication can be combined with the connection command by specifying the mechanism (the authenticationMechanism argument), the database responsible for certificate authentication (authenticationDatabase) and the user who owns the certificate (u). For our superuser (root) the "connect + authenticate" command takes the following form:
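A sketch of this command; the PEM file name and the subject are illustrative:
mongo --host server1.cluster.com --port 27017 --ssl \
  --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt \
  --sslPEMKeyFile /root/mongodb/keys/rootuser.pem \
  --authenticationMechanism MONGODB-X509 \
  --authenticationDatabase '$external' \
  -u "CN=rootuser,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU"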
After a successful login, switch to the config database and change the desired setting:
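The chunk size is stored in the settings collection of the config database:
mongos> use config
mongos> db.settings.save({ _id: "chunksize", value: 32 })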
We have just set the chunk size to a more modest 32 MB. You can check the current value of this setting with the command:
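Presumably a simple query against the same collection:
mongos> db.settings.find({ _id: "chunksize" })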
To manage shards (first of all, to add them), you need to connect as a user with the built-in clusterAdmin role. We create a certificate for the cluster administrator in the same way as before.
Nothing unusual here; just do not forget to specify an OU different from the OU used in the cluster member certificates.
Now connect to mongos again, authenticate as root, and add a new user, the cluster administrator:
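A sketch of adding the cluster administrator (the subject is illustrative):
mongos> use $external
mongos> db.createUser({
    user: "CN=clusteradmin,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU",
    roles: [ { role: "clusterAdmin", db: "admin" } ]
  })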
We reconnect to mongos as the cluster administrator (authentication is included in the connection command, as shown above).
Add the shards, specified in replica set notation and excluding the arbiter instances:
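A sketch of these commands; the ports of rs1 and rs2 (27001, 27002) are my assumptions:
mongos> sh.addShard("rs0/server1.cluster.com:27000,server2.cluster.com:27000")
mongos> sh.addShard("rs1/server1.cluster.com:27001,server2.cluster.com:27001")
mongos> sh.addShard("rs2/server1.cluster.com:27002,server2.cluster.com:27002")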
If everything went well with the addition of shards, we can see the current state with the sh.status() command:
Here we see our shards and the state of the balancer: it is enabled but currently idle, since there is no data yet whose chunks it could migrate between the available shards. This is what the empty "databases" list tells us. So, we have built a sharded cluster, but by default sharding is disabled for all databases and collections. It is enabled in two stages: first for a database, then for a specific collection of that database. The first stage looks like this:
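For our analytics database, presumably:
mongos> sh.enableSharding("analytics")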
Check the result with sh.status() again.
The analytics database should appear in the list of databases, and we also see that the rs2 shard was assigned to this database as its primary shard (not to be confused with the PRIMARY of a replica set). This means that all collections for which sharding is not enabled will be stored entirely on that primary shard (rs2).
As mentioned earlier, to split the documents of a sharded collection into chunks, MongoDB needs a key index, the shard key. Choosing it is a very responsible task that must be approached thoughtfully, guided by the requirements of your application and common sense. The index by which the collection is divided into chunks is either chosen from the existing indexes or added to the collection deliberately. One way or another, at the moment sharding is enabled, an index corresponding to the key must exist on the collection. The shard key does not impose special restrictions on the corresponding index; if necessary it can be made compound, for example {"s": 1, "ts": -1}.
Having decided on the index we need, we create it and specify it as the shard key for the statistics collection in the analytics database. As I said before, the most representative field of our statistics collection is the sensor identifier, the field s. If you have not yet created the corresponding index on the collection, now is the time:
mongos> db.statistics.ensureIndex({"s":1})
Turn on sharding of the collection, specifying the shard key index:
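Presumably, for our collection and key:
mongos> sh.shardCollection("analytics.statistics", {"s": 1})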
From this point on we can really talk about sharding the data in our cluster. After sharding is enabled for the collection, it is divided into chunks (the number depends on the amount of data and on the chunk size), which initially sit on the primary shard and are then distributed among the other shards by the balancing (migration) process. The balancing process is, in my opinion, very leisurely: in our case a collection of 3M records took more than a week to be distributed between the three shards.
After some time, run the sh.status() command again and see what has changed.
In the analytics database, for which we previously enabled sharding, the statistics collection has appeared, and we can see its current shard key. The output also shows the distribution of chunks across the shards and, if the collection has only a small number of chunks, a brief summary of them. The balancer section shows information about successful chunk migrations, or about errors, over the last day.
After installing the standard MongoDB Community package, a mongodb service appears in the system, representing the "boxed" version of the server. This service is started by default after MongoDB is installed.
The service is started by an init script located at /etc/init.d/mongod. As you may have noticed, on the data servers server1.cluster.com and server2.cluster.com we need to run several mongod instances and one mongos on the same machine.
At first glance there is a ready-made solution, using the /etc/init.d/mongod script as a template, but the option of using the supervisor utility seemed more convenient and transparent to me.
Supervisor also gives us a small bonus in the form of the ability to start and stop all of our mongo{d/s} instances at once:
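With the standard supervisorctl commands:
supervisorctl start all
supervisorctl stop all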
(provided that the machine no longer has other applications launched by the supervisor - as in our case).
The supervisor package is available in the standard repositories of most Linux distributions; in my case (Debian 8) the relevant command is:
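On Debian this is simply:
apt-get install supervisor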
To configure supervisor we need to create a configuration for each launched application, either in separate configuration files or combined into one.
Here is an example supervisor configuration for the mongod of the rs0 replica set:
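A sketch of such a section; the program name, paths and values are my assumptions, the parameters are the ones discussed below:
[program:mongod_rs0]
command=/usr/bin/mongod --config /root/mongodb/cfg/rs0.conf
user=root
stdout_logfile=/root/mongodb/logs/supervisor_mongod_rs0.log
redirect_stderr=true
autostart=true
autorestart=true
stopwaitsecs=300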
In square brackets we define the identifier of the application, which we will use to start and stop it. The command parameter sets the command that supervisor needs to execute: mongod receiving its configuration file. Next we indicate the user on whose behalf the process will be launched. The stdout_logfile parameter sets the path to the output file that supervisor will write to; this is useful when something goes wrong and you need to understand why supervisor does not start the application.
redirect_stderr tells supervisor to redirect the error stream to the same log file we specified above. Next, be sure to enable the autostart and autorestart options to survive an unplanned server restart or a crash of the process itself.
It is also useful to change the stopwaitsecs parameter, which makes supervisor wait the specified number of seconds when stopping the application. By default, when stopping an application, supervisor sends a TERM signal and then waits 10 seconds. If the application has not exited by then, it sends a KILL signal, which cannot be ignored by the application and can theoretically lead to data loss. It is therefore recommended to increase the default waiting interval.
The resulting configuration file must be placed in the appropriate supervisor directory; on most Linux systems this is /etc/supervisor/conf.d/.
When everything is ready, you need to update the supervisor configuration with the command:
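The standard pair of commands:
supervisorctl reread
supervisorctl update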
Stopping, starting and checking the state of a configured application is performed accordingly by the commands:
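For example, for the rs0 mongod from the sketch above:
supervisorctl stop mongod_rs0
supervisorctl start mongod_rs0
supervisorctl status mongod_rs0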
After switching to supervisor, it is important to prevent the standard mongodb service from starting, since it may occupy port 27017 (for example after a server reboot), the port on which we run mongos. The simplest way to do this is to remove the /etc/init.d/mongod script.
The most loaded collection of our database contained a little over 3M records at the time of migration, and during tests, enabling sharding for a collection of that size (the sh.shardCollection() command) worked fine. However, tests were also run on an artificially generated database with 100M similar records, and at that volume the sh.shardCollection() command eventually fails with a timeout error. The way out of this situation is the following procedure (a combined sketch of steps 2-6 is given after the list):
Step 1. Import the entire database onto the cluster;
Step 2. On the production server, or already on the cluster, create a dump of the separate "large" collection;
Step 3. Delete the "large" collection on the cluster;
Step 4. Create an empty "large" collection and add the index by which we will shard;
Step 5. Turn on sharding of the collection with the shard key;
Step 6. Import the collection data back.
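A sketch of steps 2-6 for our statistics collection (host, ports and file names assumed; the ssl/x.509 authentication flags are omitted for brevity, they are the same as in the mongodump example further below):
mongoexport --host server1.cluster.com --port 27017 --db analytics --collection statistics --out statistics.json

mongos> use analytics
mongos> db.statistics.drop()
mongos> db.createCollection("statistics")
mongos> db.statistics.ensureIndex({"s":1})
mongos> sh.shardCollection("analytics.statistics", {"s":1})

mongoimport --host server1.cluster.com --port 27017 --db analytics --collection statistics --file statistics.json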
This technique worked for me, but keep in mind that exporting and importing a large collection in JSON format is not a fast process.
Creating a backup of all components of a sharded cluster is a rather involved procedure: it requires the balancer to be off (it cannot be stopped forcibly in the middle of a migration) and the SECONDARY nodes of each shard to be locked for the duration of their backup. You can read more about performing a full backup in the official documentation.
For ourselves, we solved the backup problem by periodically creating an ordinary dump of the necessary databases, and I will describe that procedure here. We back up the analytics database with the mongodump utility, which is part of the MongoDB Community package.
MongoDB has a special built-in backup role with the minimal set of rights needed to back up data. To perform this procedure we create a dedicated user and, by tradition, first generate an x.509 certificate for it. I will not repeat the entire certificate generation procedure, it has been shown several times in this article; the resulting subject should look like the previous client certificates, with its own CN (for the backuper user).
Now connect to the cluster and create the backuper user with the built-in backup role:
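A sketch of adding the user (the subject is illustrative):
mongos> use $external
mongos> db.createUser({
    user: "CN=backuper,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU",
    roles: [ { role: "backup", db: "admin" } ]
  })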
After creating the user, we can try to back up our analytics database. The arguments of the mongodump utility are similar to connecting with authentication; in addition, we specify the database name (--db), the directory where the dump will be saved (-o), and the --gzip argument indicating that all dump files should be compressed:
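A sketch of the command (paths and the subject are illustrative):
mongodump --host server1.cluster.com --port 27017 --ssl \
  --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt \
  --sslPEMKeyFile /root/mongodb/keys/backuper.pem \
  --authenticationMechanism MONGODB-X509 \
  --authenticationDatabase '$external' \
  -u "CN=backuper,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU" \
  --db analytics -o /root/backup --gzip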
At the end of the article I want to share examples of code that connects to the cluster we have built. Since the parts of our service that work with the cluster are written in C++ and Python, the examples will be in these wonderful programming languages.
So, let's start with a C++ example. The connection example below is relevant for the official MongoDB legacy driver, mongodb-cxx-driver-legacy-1.1.1.
Before connecting to the database host, we need to initialize the client with the mongo::client::Options structure, setting the required SSL level (kSSLRequired), the public CA certificate (mongodb-CA-cert.crt) and the PEM file of the cluster user (in this case the analyticsuser we created earlier).
Next, we connect to the database and, if everything goes through, authenticate successfully. Pay attention to the name of the database used for authentication, $external; as the user name we pass the subject from the user certificate; and we must not forget to specify the authentication mechanism. We also see that no password is passed, since our authentication is external, via the certificate.
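A minimal sketch of what this looks like with the legacy driver; the exact setter names on mongo::client::Options are written from memory and may need adjusting, and the subject is illustrative:
#include <mongo/client/dbclient.h>
#include <iostream>

int main() {
    // SSL options for the driver: require SSL, trust our CA, present the client PEM.
    mongo::client::Options options;
    options.setSSLMode(mongo::client::Options::kSSLRequired);
    options.setSSLCAFile("mongodb-CA-cert.crt");
    options.setSSLPEMKeyFile("analyticsuser.pem");
    mongo::client::initialize(options);

    mongo::DBClientConnection conn;
    std::string err;
    if (!conn.connect("server1.cluster.com:27017", err)) {
        std::cerr << "connect failed: " << err << std::endl;
        return 1;
    }

    // x.509 authentication: user = certificate subject, db = $external, no password.
    conn.auth(BSON("user" << "CN=analyticsuser,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU"
                          << "db" << "$external"
                          << "mechanism" << "MONGODB-X509"));
    return 0;
}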
In the web part of the project, written in Python, the pure pymongo driver is used, and the object model is built with the mongoengine framework.
To get started, an example for pymongo:
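A sketch of such a connection (certificate paths and the subject are illustrative; the spelling of the read preference option depends on the pymongo version):
import pymongo

db_hosts = 'server1.cluster.com:27017,server2.cluster.com:27017'
db_port = 27017  # shown for clarity; the ports are already part of db_hosts

client = pymongo.MongoClient(
    db_hosts,
    port=db_port,
    ssl=True,
    ssl_ca_certs='mongodb-CA-cert.crt',
    ssl_certfile='analyticsuser.pem',
    readPreference='nearest')  # in older pymongo: read_preference=ReadPreference.NEAREST

# x.509 authentication against the $external virtual database, no password.
client['$external'].authenticate(
    'CN=analyticsuser,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU',
    mechanism='MONGODB-X509')

db = client['analytics']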
Nothing special here: we also pass the public CA certificate and the client PEM file. The db_hosts variable deserves attention: it is essentially the connection string, a comma-separated list of the addresses and ports on which the mongos routers are available. The port parameter (db_port) does not have to be specified in our case; I included it for clarity. A pymongo driver connected this way will automatically try to reconnect to the second address if the first one is unavailable, and vice versa. Practice shows that if both servers are reachable at the first connection, the addresses are tried in order, i.e. the first connection will be to server1.cluster.com:27017.
However, while testing this pymongo behaviour it was noticed that the automatic reconnection is preceded by a pymongo.errors.AutoReconnect exception. To handle this situation a small decorator was written that can wrap, for example, the functions that render a statistics page or serve an API read request:
from functools import wraps
from pymongo.errors import AutoReconnect
import time
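A sketch of such a decorator (the name and the retry delay are my choice):

def autoreconnect_retry(retries=5, delay=0.5):
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(retries):
                try:
                    return func(*args, **kwargs)
                except AutoReconnect:
                    # out of attempts: let the exception propagate to the caller
                    if attempt == retries - 1:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


@autoreconnect_retry()
def get_statistics_page(sensor_id):
    ...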
The decorator gives the function a number of attempts to execute (five in this case) and, once all attempts are exhausted, lets the exception propagate.
A few words should also be said about the read preference parameter from the connection example. It tells the driver which data-reading rule to use for this connection (writes always go to the PRIMARY, which is logical). The following values are available:
PRIMARY - always read from the primary member of the shard's replica set;
PRIMARY_PREFERRED - read from the primary member of the shard's replica set, but if that is impossible, from a secondary;
SECONDARY - read only from a secondary member of the shard;
SECONDARY_PREFERRED - read from a secondary whenever possible, otherwise from the primary;
NEAREST - read from whatever is available, as the pymongo documentation puts it; the MongoDB documentation explains in more detail that it is not simply the first member of the replica set, but the one with the lowest network latency (essentially just a ping, regardless of whether it is a primary or a secondary that serves the data).
Thus, on the one hand this parameter lets us offload read traffic from the PRIMARY instances, but on the other hand it can lead to stale or inconsistent reads, since SECONDARY instances lag behind the PRIMARY to some degree (depending on the configuration and load of your replica set). You should therefore choose this option with caution, based on the assumptions and constraints of your system.
It should also be noted that if the PRIMARY or SECONDARY preference cannot be satisfied, pymongo raises an OperationFailure exception, so this behaviour must be taken into account when using these options.
With the mongoengine package everything turned out to be sadder. The first thing I found in the project was the connection point to the database through mongoengine, something like this:
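An illustrative reconstruction of such a plain connection call:
from mongoengine import connect

connect('analytics', host='localhost', port=27017)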
OK, I thought: "I will just pass the remaining connection parameters to mongoengine.connect, as I did with pymongo, and that's it." But my hopes were in vain: mongoengine.connect did not have the parameters I needed; it is just a general wrapper around a function with a wider list of arguments, mongoengine.register_connection. Among the parameters of that function there was also nothing that would let the MONGODB-X509 authorization mechanism be passed to the connection. I made a couple of vain attempts in the hope that the framework would "understand" what was required of it, but after digging into the source code I was convinced that the support was simply missing, and that there was no way to "forward" the required mechanism from mongoengine down to pymongo (on which mongoengine is actually based), which does understand it.
It turned out that a similar ticket had already been raised on GitHub for this shortcoming but had not been brought to completion, so I decided to make my own fork and add everything I needed.
Thus, the connection with x.509 authentication took the following form:
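A sketch of what the call looks like with the extra parameter; in my fork (and in later official releases) it is called authentication_mechanism, and the certificate paths and subject are illustrative:
from mongoengine import connect

connect('analytics',
        host=db_hosts,
        ssl=True,
        ssl_ca_certs='mongodb-CA-cert.crt',
        ssl_certfile='analyticsuser.pem',
        authentication_source='$external',
        authentication_mechanism='MONGODB-X509',
        username='CN=analyticsuser,OU=MongoUsers,O=SomeSysyems,L=Moscow,ST=Moscow,C=RU')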
Unfortunately, I have not yet managed to get this merged into the main MongoEngine repository, since the tests fail on all python/pymongo combinations. In the latest pull requests of other developers I noticed similar problems with the same tests, so the thought creeps in that there may be a problem in the "stable" branch of the framework itself.
I hope that the situation will improve in the near future, the problem will be sorted out, and x.509 authentication support will appear in the main MongoEngine repository.
Update
Support for an authentication mechanism option has been added to the official version of mongoengine.
I will also be very happy to see recommendations on adding / changing the cluster configuration and just questions or criticism on the article itself or on the subject matter.
Introduction
The project within the framework of which the cluster was introduced is a service for collecting statistics on customer devices and aggregating its provision on the site or through the Rest-API. The project has been working stably under low load for a long time and, as a result, the MongoDB server installed as is “out of the box” (without sharding and data replication) did its job perfectly, and a “quiet sleep” was provided by daily backups of the database on the crown. The thunder struck as usual one fine moment after the arrival of several large clients with a large number of devices, data and requests. The consequence was an unacceptably long execution of queries to an older database, and the culmination was a disruption of the server when we almost lost data.
Thus, overnight, the need arose to carry out work to increase fault tolerance, data security, and productivity with the possibility of future scalability. It was decided to use the existing potential of MongoDB to eliminate the problems that have arisen, namely, to organize a sharded cluster with replication and migrate existing data to it.
Bit of theory
To begin with, we’ll take a look at ShardedCluster MongoDB and its main components. Sharding as such is a method of horizontal scaling of computing systems that store and provide access to data. Unlike vertical scaling , when system performance can be increased by increasing the performance of a single server, for example, by switching to a more powerful CPU, adding the amount of available RAM or disk space, sharding works by distributing the data set and load between several servers and adding new servers as needed (this is just our case).
The advantage of such scaling is its almost endless potential for expansion, while a vertically scalable system is deliberately limited, for example, by the available hardware from the hosting provider.
What is expected from switching to a MongoDB shard cluster? First of all, it is necessary to obtain the load distribution of read / write operations between the shards of the cluster, and secondly, to achieve high fault tolerance (constant data availability) and data safety due to excessive copying (replication).
In the case of MongoDB, data sharding occurs at the collection level, which means that we can explicitly specify which collection data should be distributed among existing cluster shards. This also means that the entire set of documents of the shardable collection will be divided into equal-sized parts - chunks (Chunk), which subsequently will be almost equally migrated between the charades of the cluster by the monga balancer.
Sharding for all databases and collections is disabled by default, and we cannot shard cluster system databases, such as admin and config . When we try to do this, we will get an unequivocal refusal from Monga:
mongos> sh.enableSharding("admin")
{ "ok" : 0, "errmsg" : "can't shard admin database" }
The shuffled MongoDB cluster presents us with three prerequisites: in fact, the presence of shards in it , all communication between the cluster and its clients should be carried out exclusively through mongos routers, and the server must have a config server (based on an additional mongod instance, or as recommended based on Replica Seta).
The official mongodb documentation says “In production, all shards should be replica sets.” . Being a replica set, each shard, due to multiple copying of data, increases its fault tolerance (in terms of data availability at any instance of the replica) and, of course, ensures its best preservation.
Replica(Replica Set) is a combination of several running mongod instances that store copies of the same data set. In the case of a shard replica, this will be a set of chunks passed to this shard by the monga balancer.
One of the replica instances is assigned as the main one ( PRIMARY ) and accepts all data write operations (while maintaining and reading), the remaining mongods are then declared SECONDARY and, in asynchronous communication with PRIMARY, update their copy of the data set. They are also available for reading data. As soon as PRIMARY for some reason becomes inaccessible, ceasing to interact with other members of the replica, a vote is announced among all available members of the replicafor the role of the new PRIMARY. In fact, in addition to PRIMARY and SECONDARY, there can be another third kind of participant in the Replica Set - this is the arbiter ( ARBITER ).
The arbiter in the replica does not fulfill the role of copying the data set; instead, he solves the important task of voting and is designed to protect the replica from the dead end of voting. Imagine a situation in which there is an even number of participants in the replica and they vote in half for two applicants with the same total number of votes, and so endlessly ... Adding an arbitrator to such an “even” remark, he will decide the voting outcome by casting his vote to one or another candidate for “ position ”PRIMARY, without requiring resources to service another copy of the data set.
I note that the Replica Set is the union of exactly mongod instances, that is, nothing prevents you from assembling a replica on one server, specifying folders located on different physical media as data storages and achieving some data security, but still an ideal option - This is a replica organization with the launch of mongod on different servers. In general, in this regard, the MongoDB system is very flexible, and allows us to assemble the configuration we need based on our needs and capabilities, without imposing strict restrictions. Replica Set as such, outside the context of Sharded Cluster, is one of the typical MongoDB server organization schemes, which provides a high degree of fault tolerance and data protection. In this case, each participant in the replica stores a full copy of the entire database data set, and not part of it,
Infrastructure
The cluster configuration below is built on three OpenVZ virtual containers (VBOs). Each of the virtual machines is located on a separate dedicated server.
Two virtual machines (hereinafter server1.cluster.com and server2.cluster.com ) have more resources - they will be responsible for the replication, sharding and provision of data to clients. The third machine ( server3.cluster.com ) has a weaker configuration - its purpose is to ensure the operation of instances of mongod arbiters.
In the constructed cluster, we now have three charades. In our scheme, we followed the recommendation of building shards based on replica sets, but with some assumption. Each shard-replica set of our cluster has its own PRIMARY, SECONDARY and ARBITER, running on three different servers. There is also a config server built also using data replication.
However, we have only three servers, one of which does not perform data replication functions (only in the case of a config replica) and therefore all three shards are actually located on two servers.
In the diagrams from the Mongi documentation, Mongos are depicted on application servers. I decided to break this rule and place the Mongos (we will have two) on the data servers: server1.cluster.com and server2.cluster.com, getting rid of the additional configuration of mongodb on application servers and due to certain restrictions associated with application servers. Application servers have the ability to connect to either of the two Mongos, so in case of problems with one of them, they will reconnect to the other after a short timeout. Application servers, in turn, sit behind the DNS-th on which Round Robin is configured. It alternately issues one of two addresses, providing a primitive balance of connections (client requests). There are plans to replace it with some “smart” DNS (maybe someone will tell you a good solution in the comments, I will be grateful!
For clarity, I give a general diagram of the formed cluster with server names and applications running on them. Colons indicate the designated application ports.

Initial setup
We go to server1.cluster.com and install the latest version of the MongoDB Community Edition package from the official repository. At the time of assembly, the cluster is version 3.2.8. In my case, the Debian 8 operating system is installed on all cluster machines, you can find detailed installation instructions on your OS in the official documentation .
We import the public key into the system, update the list of packages and install the mongodb server with a set of utilities:
server1.cluster.com:~# apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv EA312927
server1.cluster.com:~# echo "deb http://repo.mongodb.org/apt/debian wheezy/mongodb-org/3.2 main" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.2.list
server1.cluster.com:~# apt-get update
server1.cluster.com:~# apt-get install -y mongodb-org
Done! As a result of the actions performed, we obtain on our machine MongoDB a server that is already up and running. For now, disable the mongod service (we will return to it):
server1.cluster.com:~# service mongod stop
Next, create a directory in which we will store all the data of our cluster, I have it located on the path “/ root / mongodb”. Inside we form the following directory structure:
.
├── cfg
├── data
│ ├── config
│ ├── rs0
│ ├── rs1
│ └── rs2
├── keys
└── logs
In the data subdirectory, we will directly store the data of our replicas (including the config replica). In cfg, we will create configuration files to run the necessary mongo {d / s} instances. In keys, we copy the keys and certificates for x.509 authentication of cluster members. The purpose of the logs folder, I think everyone understands.
Similarly, the procedure with installation and directories must be repeated on the remaining two servers.
Before proceeding to the configuration and linking of the components of our cluster, make sure that everything works as we need. Run the mongod instance on port 27000, specifying the directory for the data in “/ root / mongodb / data / rs0”:
mongod --port 27000 --dbpath /root/mongodb/data/rs0
On the same server, open another terminal and connect to the running mongod:
mongo --port 27000
If everything went well, we will end up in shell mongodb and can execute a couple of commands. By default, mong will switch us to the test database, we can verify this by entering the command:
> db.getName()
test
Delete the database that we do not need with the command:
> db.dropDatabase()
{ "ok" : 1 }
And we will initialize a new database with which we will experiment by switching to it:
> use analytics
switched to db analytics
Now try to enter the data. In order not to be unfounded, I propose to consider all further operations in the article using the example of a system of collecting certain statistics, where data from remote devices periodically arrives that are processed on application servers and then stored in a database.
We add a couple of devices:
> db.sensors.insert({'s':1001, 'n': 'Sensor1001', 'o': true, 'ip': '192.168.88.20', 'a': ISODate('2016-07-20T20:34:16.001Z'), 'e': 0})
WriteResult({ "nInserted" : 1 })
> db.sensors.insert({'s':1002, 'n': 'Sensor1002', 'o': false, 'ip': '192.168.88.30', 'a': ISODate('2016-07-19T13:40:22.483Z'), 'e': 0})
WriteResult({ "nInserted" : 1 })
Here,
s - the serial number of the sensor;
n - its string identifier;
o - current status (online/offline);
ip - the sensor's IP address;
a - the time of the last activity;
e - an error flag.
And now a few statistics records of the form:
> db.statistics.insert({'s': 1001, 'ts': ISODate('2016-08-04T20:34:16.001Z'), 'param1': 123, 'param2': 23.45, 'param3': "OK", 'param4': true, 'param5': '-1000', 'param6': [1,2,3,4,5]})
WriteResult({ "nInserted" : 1 })
s - the sensor number;
ts - the timestamp;
param1..param6 - some statistics values.
Clients of the statistics service often perform aggregation queries to obtain representative data on the statistics collected from their devices. Almost all queries involve the sensor serial number (the s field); sorts and groupings are often applied to it, so for optimization (and later for sharding) we add an index to the statistics collection:
> db.statistics.ensureIndex({"s":1})
The selection and creation of the necessary indexes is a topic for a separate discussion, but for now I will limit myself to this.
Authentication using x.509 certificates
To understand the problem, let's run a little ahead and imagine the mongod instances running on different servers, which need to be combined into a replica set; we then need to connect mongos to them and provide for secure client connections to the resulting cluster. Of course, the participants in the data exchange must be authenticated (trusted) when connecting, and it is desirable that the data channel also be protected. MongoDB has TLS/SSL support as well as several authentication mechanisms. One option for establishing trust between the participants in the cluster is the use of keys and certificates. Regarding the choice of mechanism, the MongoDB documentation gives a recommendation:
"Keyfiles are bare-minimum forms of security and are best suited for testing or development environments. For production environments we recommend using x.509 certificates."
X.509 is the ITU-T standard for public key infrastructure and privilege management. This standard defines the format and how public keys are distributed using signed digital certificates. The certificate associates the public key with some subject - the user of the certificate. The reliability of this relationship is achieved through a digital signature, which is performed by a trusted certification authority.
(In addition to x.509, MongoDB also offers highly reliable Enterprise-level mechanisms: Kerberos Authentication and LDAP Proxy Authentication. But this is not our case, and here we will consider the configuration of x.509 authentication.)
The x.509 authentication mechanism requires a secure TLS/SSL connection to the cluster, which is enabled either by the corresponding mongod startup argument --sslMode, or by the net.ssl.mode parameter in the configuration file. Authentication of a client connecting to the server then comes down to verifying a certificate rather than a login and password.
In the context of this mechanism, the generated certificates are divided into two types: cluster member certificates, which are tied to a specific server and intended for internal authentication of mongod instances on different machines, and client certificates, which are tied to an individual user and intended for authenticating external clients of the cluster.
To fulfill the x.509 requirements we need a single key, the so-called Certificate Authority (CA). Based on it, both client and cluster member certificates will be issued, so first of all we create a private key for our CA. The correct way would be to perform all of the following actions and store the secret keys on a separate machine, but in this article I will do everything on the first server (server1.cluster.com):
server1.cluster.com:~/mongodb/keys# openssl genrsa -out mongodb-private.key -aes256
Generating RSA private key, 2048 bit long modulus
.....................+++
........................................................+++
e is 65537 (0x10001)
Enter pass phrase for mongodb-private.key:
Verifying - Enter pass phrase for mongodb-private.key:
When prompted for a pass phrase, we enter and confirm some reliable combination, for example "temporis$filia$veritas" (yours will of course be different and more complicated). The phrase must be remembered; we will need it to sign each new certificate.
Next, we create the CA certificate (right after launching the command we will be asked for the pass phrase of the key specified in the -key parameter):
server1.cluster.com:~/mongodb/keys# openssl req -x509 -new -extensions v3_ca -key mongodb-private.key -days 36500 -out mongodb-CA-cert.crt
Pay attention to the -days parameter, which is responsible for the certificate's lifetime. I am not sure who will be involved in the project I am currently working on, or for how long, so to avoid unpleasant surprises we give the certificate 36,500 days of life, which corresponds to 100 years (very optimistic, isn't it?).
After checking the phrase, we will be asked to enter information about the organization that owns the certificate. Imagine that our large organization is called "SomeSystems" and is located in Moscow (the entered values follow the colons):
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: Statistics
Common Name (e.g. server FQDN or YOUR name) []: CaServer
Email Address []: info@SomeSystems.com
Excellent! The CA is ready, and we can use it to sign client certificates and cluster member certificates. I will add that the accuracy of the entered data does not affect the functionality of the CA certificate itself; however, the signed certificates will depend on the entered values, as will be discussed later.
The procedure for creating cluster member certificates (certificates for external clients will be considered separately) is as follows:
- We generate a private key (*.key file) and a certificate signing request (*.csr file). A CSR (Certificate Signing Request) is a text file containing encoded information about the organization requesting the certificate and its public key.
- Using the private key and public certificate of our Certificate Authority, we sign the request and obtain a certificate for the current server.
- From the new key and the cluster member certificate, we form a PEM file, which we will use when connecting to the cluster.
We create a private key and a certificate request for our first server (server1.cluster.com). Pay attention to an important detail: all fields are filled in the same way as for the root certificate, with the exception of CN (Common Name), which must be unique for each certificate. In our case, the FQDN (Fully Qualified Domain Name) of the specific server is used as the value:
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout server1.key -out server1.csr
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: Statistics
Common Name (e.g. server FQDN or YOUR name) []: server1.cluster.com
Email Address []: info@SomeSystems.com
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
I left the extra fields empty. If you decide to specify an additional password (A challenge password []:), you will need to provide the password for this certificate in the mongod configuration, via the net.ssl.PEMKeyPassword and net.ssl.clusterPassword parameters (details on these parameters are in the documentation).
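For reference, if you did protect the keys with a password, the relevant fragment of the mongod configuration (shown in full later) might look roughly like this; the placeholder value is, of course, hypothetical:
net:
  ssl:
    PEMKeyFile: /root/mongodb/keys/server1.pem
    PEMKeyPassword: "your-challenge-password"
    clusterFile: /root/mongodb/keys/server1.pem
    clusterPassword: "your-challenge-password"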
Next, we sign the CSR file with our CA certificate and get a public certificate (*.crt file):
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in server1.csr -out server1.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=Statistics/CN=server1.cluster.com/emailAddress=info@SomeSystems.com
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
Now we need to make a .pem file:
server1.cluster.com:~/mongodb/keys# cat server1.key server1.crt > server1.pem
We will use the PEM file directly when launching mongod instances, and we will indicate it in the configuration.
Now you need to repeat the operation to create a certificate for the remaining servers. For a complete understanding, I quote all the commands:
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout server2.key -out server2.csr
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: Statistics
Common Name (e.g. server FQDN or YOUR name) []: server2.cluster.com
Email Address []: info@SomeSystems.com
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
(the extra fields were left empty)
We sign the CSR file with our CA certificate to receive the public certificate (*.crt file) of the second server:
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in server2.csr -out server2.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=Statistics/CN=server2.cluster.com/emailAddress=info@SomeSystems.com
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
Now we need to make a .pem file:
server1.cluster.com:~/mongodb/keys# cat server2.key server2.crt > server2.pem
And similarly for the third server certificate:
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout server3.key -out server3.csr
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: Statistics
Common Name (e.g. server FQDN or YOUR name) []: server3.cluster.com
Email Address []: info@SomeSystems.com
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
(the extra fields were left empty)
We sign the CSR file with our CA certificate to receive the public certificate (*.crt file) of the third server:
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in server3.csr -out server3.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=Statistics/CN=server3.cluster.com/emailAddress=info@SomeSystems.com
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
Create a PEM file:
server1.cluster.com:~/mongodb/keys# cat server3.key server3.crt > server3.pem
I repeat that I created all the keys and certificates on the first server and then moved them to the corresponding servers as needed. Thus, each of the three servers should have the public CA certificate (mongodb-CA-cert.crt) and its own PEM file (server<N>.pem).
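As a quick sanity check, each signed certificate can be verified against the CA, and the files can then be copied to the other servers over SSH (a sketch, assuming root SSH access between the servers; it is worth tightening permissions on the private keys):
server1.cluster.com:~/mongodb/keys# openssl verify -CAfile mongodb-CA-cert.crt server2.crt
server2.crt: OK
server1.cluster.com:~/mongodb/keys# scp server2.pem mongodb-CA-cert.crt root@server2.cluster.com:/root/mongodb/keys/
server1.cluster.com:~/mongodb/keys# ssh root@server2.cluster.com 'chmod 600 /root/mongodb/keys/server2.pem'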
Mongod instances configuration
For a correct launch we need to pass a number of parameters to the mongod instances. This can be done via a configuration file or by passing all the necessary values as command-line arguments; almost all configuration options have corresponding command-line arguments. In my opinion, the config-file option is preferable, since a separate structured file is easier to read and extend, and launching a program instance is then reduced to passing it a single argument, the location of the configuration file:
mongod --config <path to config file>
So, create a configuration file for the mongod instance of the first shard replica (rs0) on the first server:
#
# /root/mongodb/cfg/mongod-rs0.conf
#
replication:
  replSetName: "rs0"                       # replica set name
net:
  port: 27000
  ssl:
    mode: requireSSL                       # require a secure connection
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
    weakCertificateValidation: false       # forbid connections without a certificate
    allowInvalidCertificates: false        # forbid connections with invalid certificates
security:
  authorization: enabled                   # require authorization
  clusterAuthMode: x509                    # authentication method - MONGODB-X509
storage:
  dbPath: /root/mongodb/data/rs0           # data directory
systemLog:
  destination: file                        # write the log to a file
  path: /root/mongodb/logs/mongod-rs0.log  # log file path
  logAppend: true                          # append to the log file on restart
We create a similar file for the second shard replica set (rs1), changing the port, replica set name, data directory and log file:
#
# /root/mongodb/cfg/mongod-rs1.conf
#
replication:
  replSetName: "rs1"
net:
  port: 27001
  ssl:
    mode: requireSSL
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
    weakCertificateValidation: false
    allowInvalidCertificates: false
security:
  authorization: enabled
  clusterAuthMode: x509
storage:
  dbPath: /root/mongodb/data/rs1
systemLog:
  destination: file
  path: /root/mongodb/logs/mongod-rs1.log
  logAppend: true
And by analogy for the third replica (rs2):
#
# /root/mongodb/cfg/mongod-rs2.conf
#
replication:
  replSetName: "rs2"
net:
  port: 27002
  ssl:
    mode: requireSSL
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
    weakCertificateValidation: false
    allowInvalidCertificates: false
security:
  authorization: enabled
  clusterAuthMode: x509
storage:
  dbPath: /root/mongodb/data/rs2
systemLog:
  destination: file
  path: /root/mongodb/logs/mongod-rs2.log
  logAppend: true
In addition to the instances forming the three shard replica sets, our cluster will include mongod instances providing the operation of the config server, which will also be built as a replica set (rscfg).
It is worth explaining that the role of the config server can be performed by a single mongod (as with a shard), but to ensure reliability and fault tolerance it is recommended to build the config server on a Replica Set as well.
The config server's configuration file differs from those of the data replica sets by the presence of the sharding.clusterRole parameter, which tells the mongod instance its special purpose:
#
# /root/mongodb/cfg/mongod-rscfg.conf
#
sharding:
  clusterRole: configsvr                   # role in the cluster - config server
replication:
  replSetName: "rscfg"                     # replica set name
net:
  port: 27888
  ssl:
    mode: requireSSL
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
    weakCertificateValidation: false
    allowInvalidCertificates: false
security:
  authorization: enabled
  clusterAuthMode: x509
storage:
  dbPath: /root/mongodb/data/config
systemLog:
  destination: file
  path: /root/mongodb/logs/mongod-rscfg.log
  logAppend: true
Now we need to copy all the created configuration files to the other servers. After copying, do not forget to change the values of the net.ssl.PEMKeyFile and net.ssl.clusterFile parameters, which must point to the certificates of the corresponding server (server2.pem, server3.pem).
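One possible way to do this (a sketch, again assuming root SSH access; the sed call simply swaps the certificate name in the copied configs) is:
server1.cluster.com:~# scp /root/mongodb/cfg/*.conf root@server2.cluster.com:/root/mongodb/cfg/
server1.cluster.com:~# ssh root@server2.cluster.com "sed -i 's/server1.pem/server2.pem/g' /root/mongodb/cfg/*.conf"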
Setting up the Replica Set
On the first server, run mongod on port 27000 without the "production" configuration file, specifying only the port and the data directory. This is done so that the launched mongod instance does not yet consider itself a member of the replica set and does not impose the strict connection and authentication requirements that we specified in the configuration files:
mongod --port 27000 --dbpath /root/mongodb/data/rs0
Next, we need to connect to the running mongod and add the superuser of the future replica set, so that later, after enabling the authorization specified in our config file, we have the right to manage the replica set, including initializing it. As practice has shown, enabling x.509 authorization does not forbid adding traditional users (authenticated by login and password) to the database. Nevertheless, I decided not to use that possibility, but to apply the x.509 mechanism everywhere, both at the cluster level and when forming replica sets. To make it clear, I will note that the user we are creating now is a user at the level of this replica set only; it will not be available from other replica sets or at the cluster level.
For the new user we will need to create another certificate, just as we did in the x.509 authentication section. The difference is that this certificate is not tied to a cluster member (a mongod instance or a server) but to an account; in other words, we are creating a client certificate. This certificate will be tied to the superuser (root role) of the first shard's replica set (rs0). MongoDB built-in roles are described in the corresponding section of the official documentation.
We go to our CA server and generate another key and certificate signing request:
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout rsroot.key -out rsroot.csr
Generating a 2048 bit RSA private key
........................................................................+++
.........................+++
writing new private key to 'rsroot.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: StatisticsClient
Common Name (e.g. server FQDN or YOUR name) []: rsroot
Email Address []:
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
We sign the certificate (again we will need the secret phrase from the CA key):
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in rsroot.csr -out rsroot.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=StatisticsClient/CN=rsroot
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
Create a PEM file:
server1.cluster.com:~/mongodb/keys# cat rsroot.key rsroot.crt > rsroot.pem
I would like to draw your attention to the Organizational Unit Name (OU) parameter: when generating client certificates it must differ from the one we specified when generating the cluster member certificates. Otherwise, when adding a user whose certificate subject (explained below) has the same OU as the cluster members' certificates, MongoDB may refuse with an error:
{
"ok" : 0,
"errmsg" : "Cannot create an x.509 user with a subjectname that would be recognized as an internal cluster member.",
"code" : 2
}
A user for authentication via the x.509 mechanism is added in a somewhat unusual way: we specify not a name and password but the identifier (subject) of the corresponding certificate. You can get the subject from the PEM file by running the command:
server1.cluster.com:~/mongodb/keys# openssl x509 -in rsroot.pem -inform PEM -subject -nameopt RFC2253
subject= CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU
-----BEGIN CERTIFICATE-----
In the output we are interested in the contents of the line starting with "subject=" (without the "subject=" itself and the following space). Connect to the mongod and add the user:
mongo --port 27000
> db.getSiblingDB("$external").runCommand({createUser: "CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "root", db: "admin"}]
})
$external is the name of the virtual database used to create users whose credentials are stored outside MongoDB, as in our case (a certificate is used for authentication).
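To make sure the user was actually created, you can list the users of the $external database right away (a quick check; the output should contain the subject we just added together with its roles):
> db.getSiblingDB("$external").getUsers()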
Now we exit the mongo shell and restart mongod, this time with the appropriate configuration file. The same needs to be done on the second and third servers. After that, all mongod instances of the first replica set (rs0) should be running.
We connect to the mongod using the certificate of the replica set superuser we just created and authenticate by specifying the subject as the user name:
server1.cluster.com:~/mongodb/keys# mongo admin --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/rsroot.pem --host server1.cluster.com --port 27000
> db.getSiblingDB("$external").auth({
mechanism:"MONGODB-X509",
user: "CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
})
Initialize our replica:
rs.initiate(
{
_id: "rs0",
members: [
{ _id: 0, host : "server1.cluster.com:27000" },
{ _id: 1, host : "server2.cluster.com:27000" },
{ _id: 2, host : "server3.cluster.com:27000", arbiterOnly: true },
]
}
)
Pay attention to the arbiterOnly parameter for the third server, which we agreed at the very beginning to make an arbiter server.
Having reconnected to the mongod, we will see by the "rs0" prefix in the shell that it now belongs to the replica set of the same name:
rs0:PRIMARY> (your current server may turn out to be SECONDARY)
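To double-check the state of the replica set, you can list its members and their roles right from this shell (a quick check; the three hosts should report PRIMARY, SECONDARY and ARBITER):
rs0:PRIMARY> rs.status().members.forEach(function(m) { print(m.name + " : " + m.stateStr) })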
In a similar fashion, two more data replica sets need to be assembled.
1. Run mongod without the config file on the first server (the port and the data directory have changed):
mongod --port 27001 --dbpath /root/mongodb/data/rs1
2. Connect to the running mongod and add the replica set superuser (rs1). I will use the same certificate for all the replica sets, so the subject is the same as for the first replica set:
mongo --port 27001
> db.getSiblingDB("$external").runCommand({createUser: "CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "root", db: "admin"}]
})
3. Restart mongod on the first server, specifying the configuration file. On the second and third servers we also start mongod with the corresponding config:
root@server1.cluster.com# mongod --config /root/mongodb/cfg/mongod-rs1.conf
root@server2.cluster.com# mongod --config /root/mongodb/cfg/mongod-rs1.conf
root@server3.cluster.com# mongod --config /root/mongodb/cfg/mongod-rs1.conf
4. Connect to the Mongod with the certificate, pass authentication and initialize the rs1 replica:
root@server1.cluster.com# mongo admin --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/rsroot.pem --host server1.cluster.com --port 27001
> db.getSiblingDB("$external").auth({
mechanism:"MONGODB-X509",
user: "CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
})
> rs.initiate(
{
_id: "rs1",
members: [
{ _id: 0, host : "server1.cluster.com:27001" },
{ _id: 1, host : "server2.cluster.com:27001" },
{ _id: 2, host : "server3.cluster.com:27001", arbiterOnly: true },
]
}
)
Repeat the procedure for the third replica set (rs2).
1. Run mongod without the config file on the first server (do not forget to change the port and data directory):
mongod --port 27002 --dbpath /root/mongodb/data/rs2
2. Connect to Mongod and add the replica superuser (rs2):
mongo --port 27002
> db.getSiblingDB("$external").runCommand({createUser: "CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "root", db: "admin"}]
})
3. Restart mongod on the first server with the configuration file. On the second and third servers we also start mongod with the corresponding configs:
root@server1.cluster.com# mongod --config /root/mongodb/cfg/mongod-rs2.conf
root@server2.cluster.com# mongod --config /root/mongodb/cfg/mongod-rs2.conf
root@server3.cluster.com# mongod --config /root/mongodb/cfg/mongod-rs2.conf
4. Connect to the Mongod with the certificate, pass authentication and initialize the rs2 replica:
root@server1.cluster.com# mongo admin --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/rsroot.pem --host server1.cluster.com --port 27002
> db.getSiblingDB("$external").auth({
mechanism:"MONGODB-X509",
user: "CN=rsroot,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
})
> rs.initiate(
{
_id: "rs2",
members: [
{ _id: 0, host : "server1.cluster.com:27002" },
{ _id: 1, host : "server2.cluster.com:27002" },
{ _id: 2, host : "server3.cluster.com:27002", arbiterOnly: true },
]
}
)
Config server
I decided to cover the config server's replica set separately, since it has a couple of features that require additional steps. Firstly, all users that we add to the config replica set will be available at the cluster level through mongos, so for it I will create separate users tied to individual certificates. Secondly, MongoDB does not allow arbiters in a config replica set. If you try to add one, you will receive an error message:
{
"ok" : 0,
"errmsg" : "Arbiters are not allowed in replica set configurations being used for config servers",
"code" : 93
}
For this reason, the config replica set will have two SECONDARY mongod instances. Let's create another certificate, for the superuser of the rscfg replica set, which, as I said, will also be a root at the cluster level.
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout rootuser.key -out rootuser.csr
Generating a 2048 bit RSA private key
......................+++
.........................................+++
writing new private key to 'rootuser.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: StatisticsClient
Common Name (e.g. server FQDN or YOUR name) []: root
Email Address []:
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in rootuser.csr -out rootuser.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=StatisticsClient/CN=root
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
server1.cluster.com:~/mongodb/keys# cat rootuser.key rootuser.crt > rootuser.pem
server1.cluster.com:~/mongodb/keys# openssl x509 -in rootuser.pem -inform PEM -subject -nameopt RFC2253
subject= CN=root,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU
-----BEGIN CERTIFICATE-----
1. Launch mongod without config on the first server:
server1.cluster.com:~/mongodb/keys# mongod --port 27888 --dbpath /root/mongodb/data/config
2. Connect to mongod and add the replica set superuser (rscfg):
server1.cluster.com:~/mongodb/keys# mongo --port 27888
> db.getSiblingDB("$external").runCommand({createUser: "CN=root,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "root", db: "admin"}]
})
3. Restart mongod on the first server with the config file. On the second and third servers we also start mongod with the corresponding configuration file:
root@server1.cluster.com# mongod --config /root/mongodb/cfg/mongod-rscfg.conf
root@server2.cluster.com# mongod --config /root/mongodb/cfg/mongod-rscfg.conf
root@server3.cluster.com# mongod --config /root/mongodb/cfg/mongod-rscfg.conf
4. Connect to mongod with the certificate, authenticate and initialize the config replica set (rscfg):
root@server1.cluster.com# mongo admin --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/rootuser.pem --host server1.cluster.com --port 27888
> db.getSiblingDB("$external").auth({
mechanism:"MONGODB-X509",
user: "CN=root,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
})
> rs.initiate(
{
_id: "rscfg",
members: [
{ _id: 0, host : "server1.cluster.com:27888" },
{ _id: 1, host : "server2.cluster.com:27888" },
{ _id: 2, host : "server3.cluster.com:27888" }
]
}
)
Our replica-set-based config server is ready. Now we can start mongos and connect to the cluster.
Configuring and starting mongos
The purpose of mongos is to provide an access point to the cluster's data (clients can access cluster data only through mongos). In the diagrams from the MongoDB documentation, mongos instances are shown running on application servers. In the cluster layout described here, two mongos instances run directly on server1.cluster.com and server2.cluster.com.
First of all, as well as for mongod, we will create a configuration file that we will transfer to our Mongos at startup.
The main difference between the mongos settings and the mongod settings is that mongos has no data directory, since it does not store data but only proxies it. Mongos receives all the necessary information about the configuration and state of the cluster from the config database of the config server, and it learns how to connect to the config server through the sharding.configDB parameter. Since our config server is based on a replica set, we specify it in replica set notation: the name of the replica set, a slash, and then a comma-separated list of hosts with their ports. We will run mongos on the default MongoDB port, 27017.
#
# /root/mongodb/cfg/mongos.conf
#
sharding:
  configDB: "rscfg/server1.cluster.com:27888,server2.cluster.com:27888,server3.cluster.com:27888"
net:
  port: 27017
  ssl:
    mode: requireSSL
    PEMKeyFile: /root/mongodb/keys/server1.pem
    clusterFile: /root/mongodb/keys/server1.pem
    CAFile: /root/mongodb/keys/mongodb-CA-cert.crt
    weakCertificateValidation: false
    allowInvalidCertificates: false
security:
  clusterAuthMode: x509
systemLog:
  destination: file
  path: /root/mongodb/logs/mongos.log
  logAppend: true
Copy the configuration file to both servers (specifying the corresponding PEM certificate) and run with the command:
mongos --config /root/mongodb/cfg/mongos.conf
Let's check the correctness of our actions: connect to mongos and authenticate with the root user certificate, which we added to the config replica set (remember that a user of the config replica set is a cluster-level user).
mongo admin --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/rootuser.pem --host server1.cluster.com --port 27017
If the prompt shows "mongos>", we are connected to the right place and everything is OK.
mongos> db.getSiblingDB("$external").auth({
mechanism:"MONGODB-X509",
user: "CN=root,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
})
(we expect to see the affirmative “1” in the output)
In general, MongoDB "does not like" being connected to as root and in this case will warn you that this should not be done for security reasons. Therefore, when working with a real cluster, I also recommend adding a user (naturally, with a separate certificate) endowed with the built-in userAdminAnyDatabase role. This role has almost all the rights necessary for administrative tasks.
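For example, assuming one more client certificate has been generated in exactly the same way (say, with CN=useradmin and OU=StatisticsClient; the name here is purely illustrative), such a user could be added like this:
mongos> db.getSiblingDB("$external").runCommand({
createUser: "CN=useradmin,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "userAdminAnyDatabase", db: "admin"}]
})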
I think it is worth giving an example of creating one more user certificate here. This user will have access only to the analytics database, and all applications of our service will connect to the cluster on its behalf.
So, we’ll go to the directory where our Certificate Authority is located and create a key and a certificate signing request for a new user, which we will call analyticsuser :
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout analyticsuser.key -out analyticsuser.csr
Generating a 2048 bit RSA private key
......................+++
.........................................+++
writing new private key to 'analyticsuser.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: StatisticsClient
Common Name (e.g. server FQDN or YOUR name) []: analyticsuser
Email Address []:
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
We sign the certificate:
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in analyticsuser.csr -out analyticsuser.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=StatisticsClient/CN=analyticsuser
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
Create a PEM file:
server1.cluster.com:~/mongodb/keys# cat analyticsuser.key analyticsuser.crt > analyticsuser.pem
Let's see what subject our certificate has:
server1.cluster.com:~/mongodb/keys# openssl x509 -in analyticsuser.pem -inform PEM -subject -nameopt RFC2253
subject= CN=analyticsuser,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU
-----BEGIN CERTIFICATE-----
Connect to the cluster (Mongos) as a user with administrative rights and add a new user:
mongos> db.getSiblingDB("$external").runCommand({createUser: "CN=analyticsuser,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "readWrite", db: "analytics"}]
})
Please note that we gave the analyticsuser user read and write permissions for the analytics database only. This protects the cluster from possible (careless or malicious) actions by external applications against the settings of the analytics database itself and the cluster as a whole.
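To check the new account, you can connect to mongos with the analyticsuser certificate in the same way as before (a quick check using the same pattern as for the replica set superuser):
mongo analytics --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/analyticsuser.pem --host server1.cluster.com --port 27017
mongos> db.getSiblingDB("$external").auth({
mechanism: "MONGODB-X509",
user: "CN=analyticsuser,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
})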
Sharding
In our case sharding will split the heavily loaded statistics collection by a chosen index, the shard key, between the several shards we will add shortly. When sharding is activated for a collection, the entire set of its documents is divided into n parts called chunks. How many chunks the collection is split into when sharding is enabled, and how often new chunks are formed, depends on the amount of data in your collection and on the chunksize parameter, which determines the chunk size and defaults to 64 MB. If you want a different chunk size in your cluster, it must be set before activating sharding on the collections, because the new chunk size is only applied to newly formed chunks.
To change the chunk size, we connect to mongos with the superuser certificate and authenticate. Authentication can be combined with the connection by specifying the mechanism (the authenticationMechanism argument), the database responsible for certificate authentication (authenticationDatabase) and the user who owns the certificate (-u). For our superuser (root), the "connect + authenticate" command takes the following form:
mongo --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/rootuser.pem --host server1.cluster.com --port 27017 --authenticationMechanism "MONGODB-X509" --authenticationDatabase "$external" -u "CN=root,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
After a successful login, switch to the config database and change the desired parameter:
mongos> use config
mongos> db.settings.save({_id: "chunksize", value: NumberLong(32)})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
We have just set the chunk size to a more modest 32 MB. You can check the current value of this setting with the command:
mongos> db.settings.find({'_id':"chunksize" })
{ "_id" : "chunksize", "value" : NumberLong(32) }
To manage shards (first of all, to add them), we need to connect as a user with the built-in clusterAdmin role. Create a certificate for the cluster administrator:
server1.cluster.com:~/mongodb/keys# openssl req -new -nodes -newkey rsa:2048 -keyout clusterAdmin.key -out clusterAdmin.csr
Generating a 2048 bit RSA private key
................+++
.......................................+++
writing new private key to 'clusterAdmin.key'
-----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
-----
Country Name (2 letter code) [AU]: RU
State or Province Name (full name) [Some-State]: MoscowRegion
Locality Name (eg, city) []: Moscow
Organization Name (eg, company) [Internet Widgits Pty Ltd]: SomeSystems
Organizational Unit Name (eg, section) []: StatisticsClient
Common Name (e.g. server FQDN or YOUR name) []: clusteradmin
Email Address []:
Please enter the following 'extra' attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:
server1.cluster.com:~/mongodb/keys# openssl x509 -CA mongodb-CA-cert.crt -CAkey mongodb-private.key -CAcreateserial -req -days 36500 -in clusterAdmin.csr -out clusterAdmin.crt
Signature ok
subject=/C=RU/ST=MoscowRegion/L=Moscow/O=SomeSystems/OU=StatisticsClient/CN=clusteradmin
Getting CA Private Key
Enter pass phrase for mongodb-private.key:
server1.cluster.com:~/mongodb/keys# cat clusterAdmin.key clusterAdmin.crt > clusterAdmin.pem
server1.cluster.com:~/mongodb/keys# openssl x509 -in clusterAdmin.pem -inform PEM -subject -nameopt RFC2253
subject= CN=clusteradmin,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU
-----BEGIN CERTIFICATE-----
Nothing unusual here; just do not forget to specify an OU different from the one used for the cluster members.
Now connect to mongos again, authenticate as root, and add a new user, the cluster administrator:
mongos> db.getSiblingDB("$external").runCommand({
createUser: "CN=clusteradmin,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "clusterAdmin", db: "admin"}]
})
We reconnect to mongos as the cluster administrator (authentication is included in the connection command):
mongo --ssl --sslCAFile /root/mongodb/keys/mongodb-CA-cert.crt --sslPEMKeyFile /root/mongodb/keys/clusterAdmin.pem --host server1.cluster.com --port 27017 --authenticationMechanism "MONGODB-X509" --authenticationDatabase "$external" -u "CN=clusteradmin,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"
Add the shards, specifying them as replica sets and excluding the arbiter instances:
mongos> sh.addShard("rs0/server1.cluster.com:27000,server2.cluster.com:27000")
mongos> sh.addShard("rs1/server1.cluster.com:27001,server2.cluster.com:27001")
mongos> sh.addShard("rs2/server1.cluster.com:27002,server2.cluster.com:27002")
If the shards were added successfully, we can check the current state of sharding with the command:
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5795284cd589624d4e36b7d4")
}
shards:
{ "_id" : "rs0", "host" : "rs0/server1.cluster.com:27100,server2.cluster.com:27200" }
{ "_id" : "rs1", "host" : "rs1/server1.cluster.com:27101,server2.cluster.com:27201" }
{ "_id" : "rs2", "host" : "rs2/server1.cluster.com:27102,server2.cluster.com:27202" }
active mongoses:
"3.2.8" : 1
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
We see our shards and the state of the balancer: it is enabled but currently idle, because it does not yet have any data whose chunks it could distribute between the available shards. This is what the empty "databases" list tells us. So we have built a sharded cluster, but by default sharding is disabled for all databases and collections. It is enabled in two stages:
Step 1. Enable sharding for the desired database. In our case this is analytics:
mongos> sh.enableSharding("analytics")
Check the result:
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5795284cd589624d4e36b7d4")
}
shards:
{ "_id" : "rs0", "host" : "rs0/server1.cluster.com:27000,server2.cluster.com:27000" }
{ "_id" : "rs1", "host" : "rs1/server1.cluster.com:27001,server2.cluster.com:27001" }
{ "_id" : "rs2", "host" : "rs2/server1.cluster.com:27002,server2.cluster.com:27002" }
active mongoses:
"3.2.8" : 1
balancer:
Currently enabled: yes
Currently running: no
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "analytics", "primary" : "rs2", "partitioned" : true }
The analytics database should appear in the list of databases; we also see that the rs2 shard was assigned as the primary shard for this database (not to be confused with the PRIMARY member of a replica set). This means that all collections with sharding disabled will be stored entirely on this primary shard (rs2).
Step 2. Turn on sharding for the collection.
As mentioned earlier, to split the entire set of documents of a shardable collection into chunks, MongoDB needs a key index, the shard key. Its choice is a very responsible task that must be approached wisely, guided by the requirements of your application and common sense. The index by which the collection will be divided into chunks is either selected from the existing indexes or added to the collection specifically; one way or another, at the moment sharding is enabled an index corresponding to the key must exist in the collection. The shard key does not impose special restrictions on the corresponding index; if necessary, it can be made compound, for example {"s": 1, "ts": -1}.
Having decided on the index we need, we create it and specify it as the shard key for the statistics collection in the analytics database. As I said before, the most representative field of our statistics collection is the sensor identifier, the s field. If you have not yet created the corresponding index in the collection, now is the time to create it:
mongos> use analytics
mongos> db.statistics.ensureIndex({"s":1})
Enable sharding for the collection, specifying the shard key:
mongos> sh.shardCollection("analytics.statistics", {"s":1})
From now on we can really talk about sharding data in our cluster. After sharding is enabled for the collection, it is divided into chunks (how many depends on the amount of data and the chunk size), which initially reside on the primary shard and are then distributed to the other shards during the balancing (migration) process. The balancing process, in my opinion, is rather leisurely: in our case a collection of 3M records was distributed between the three shards over more than a week.
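While the migration is in progress, it can be useful to look at how the collection is currently spread across the shards; db.collection.getShardDistribution(), run through mongos, prints per-shard document and chunk counts:
mongos> use analytics
mongos> db.statistics.getShardDistribution()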
After some time, let's run the sh.status () command again and see what has changed:
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("5773899ee3456024f8ef4895")
}
shards:
{ "_id" : "rs0", "host" : "rs0/server1.cluster.com:27000,server2.cluster.com:27000" }
{ "_id" : "rs1", "host" : "rs1/server1.cluster.com:27001,server2.cluster.com:27001" }
{ "_id" : "rs2", "host" : "rs2/server1.cluster.com:27002,server2.cluster.com:27002" }
active mongoses:
"3.2.8" : 1
balancer:
Currently enabled: yes
Currently running: yes
Balancer lock taken at Sun Jul 29 2016 10:18:32 GMT+0000 (UTC) by MongoDB:27017:1468508127:-1574651753:Balancer
Collections with active migrations:
analytics.statistics started at Sun Jul 29 2016 10:18:32 GMT+0000 (UTC)
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
3 : Success
2 : Failed with error 'aborted', from rs2 to rs0
databases:
{ "_id" : "analytics", "primary" : "rs2", "partitioned" : true }
analytics.statistics
shard key: { "s" : 1 }
unique: false
balancing: true
chunks:
rs0 1
rs1 2
rs2 21
too many chunks to print, use verbose if you want to force print
In the analytics database, for which we previously enabled sharding, the statistics collection has appeared, and we can see its current shard key. The output also shows the distribution of chunks across shards, and if the collection has only a few chunks you will also see a brief summary of them. In the balancer section we can see information about successful chunk migrations, or about errors, over the last day.
Supervisor
After installing the standard MongoDB Community package, the mongod service appears on the system, representing the "boxed" version of the server. This service is started by default after installing MongoDB.
The service is started by an init script located at /etc/init.d/mongod. As you may have noticed, on the data servers server1.cluster.com and server2.cluster.com we need to run several mongod instances and one mongos on the same machine.
At first glance, there is a ready-made solution using the /etc/init.d/mongod script as an example, but the option using the supervisor utility seemed more convenient and transparent to me.
Supervisor also gives us a small bonus in the form of the ability to start and stop all our mongo{d/s} instances at once:
supervisorctl start all
supervisorctl stop all
(provided that the machine no longer has other applications launched by the supervisor - as in our case).
The supervisor package is available on most Linux distributions from the standard repository; in my case (Debian 8) the relevant command is:
# apt-get install supervisor
In order to configure the supervisor we need to create a configuration for each launched application in a separate configuration file, or combine all the configurations into one.
Here is an example mongod configuration for rs0 replica:
#
# /etc/supervisor/conf.d/mongod-rs0.conf
#
[program:mongod-rs0]
command=mongod --config /root/mongodb/cfg/mongod-rs0.conf
user=root
stdout_logfile=/root/mongodb/logs/supervisor/mongod-rs0-stdout.log
redirect_stderr=true
autostart=true
autorestart=true
stopwaitsecs=60
In square brackets we define the identifier of the application, which we will use to start and stop it. The command parameter sets the command supervisor needs to execute: mongod with its configuration file. Next we indicate the user on whose behalf the process is launched. The stdout_logfile parameter sets the path to the output file supervisor will write to; this is useful when something goes wrong and you need to understand why supervisor does not start the application.
redirect_stderr tells supervisor to redirect the error stream to the same log file specified above. Next, be sure to enable the autostart and autorestart options, in case of an unplanned server restart or a crash of the process itself.
It is also useful to change the stopwaitsecs parameter, which makes supervisor wait the specified number of seconds when stopping the application. By default, when stopping an application, supervisor sends a TERM signal and then waits 10 seconds; if the application has not finished by then, it sends a KILL signal, which cannot be ignored by the application and theoretically can lead to data loss. Therefore, it is recommended to increase the default wait interval.
The generated configuration file must be put in the appropriate supervisor directory, usually /etc/supervisor/conf.d/ on Linux.
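By analogy, the entry for mongos on the same server might look like this (a sketch that reuses the mongos config file created earlier):
#
# /etc/supervisor/conf.d/mongos.conf
#
[program:mongos]
command=mongos --config /root/mongodb/cfg/mongos.conf
user=root
stdout_logfile=/root/mongodb/logs/supervisor/mongos-stdout.log
redirect_stderr=true
autostart=true
autorestart=true
stopwaitsecs=60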
When everything is ready, you need to update the supervisor configuration with the command:
# supervisorctl reload
Stopping, starting and checking the state of a configured application is performed accordingly by the commands:
# supervisorctl stop mongod-rs0
# supervisorctl start mongod-rs0
# supervisorctl status mongod-rs0
After switching to supervisor, it is important to prevent the standard mongodb service from starting (for example, after a server reboot), since it may take port 27017, on which we run mongos. To do this, you can simply remove the /etc/init.d/mongod script.
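Alternatively, instead of deleting the script, the service can simply be disabled so that it no longer starts at boot (on Debian 8 with systemd, for example):
# systemctl disable mongod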
Helpful information
Enabling sharding for large collections
At the time of migration, the most loaded collection of our database contained a little more than 3M records, and during testing, enabling sharding for such a collection (the sh.shardCollection() command) worked fine. However, tests were also conducted on an artificially generated database with 100M similar records; at that volume, the sh.shardCollection() command fails after a while with a "timeout" error. The workaround is the following procedure:
Step 1. Import the entire database onto the cluster;
Step 2. On the production server or already on the cluster, create a dump of a separate “large” collection, for example:
mongoexport --db analytics --collection statistics --out statistics.json
Step 3. Delete the “large” collection on the cluster:
> use analytics
> db.statistics.drop()
Step 4. Create an empty "large" collection and add the index by which we will shard:
> db.statistics.ensureIndex({"s":1})
Step 5. Turn on the sharding of the collection with the sharding key:
> sh.shardCollection("analytics.statistics", {"s":1})
Step 6. And now we import the collection data:
mongoimport --db analytics --collection statistics --file statistics.json
This technique worked for me, but keep in mind that exporting / importing a large collection in json format is not a fast process.
Database backup
Creating a backup copy of all components of a sharded cluster is a rather complicated procedure, which requires the balancer to be off (it cannot be stopped forcibly during a migration) and SECONDARY nodes to be locked on each shard for their subsequent backup. You can read more about performing a full backup in the official documentation.
For ourselves, we solved the backup problem by periodically creating a regular dump of the necessary databases, and I will describe that procedure here.
We will back up the analytics database using the mongodump utility, which is part of the MongoDB Community package.
MongoDB has a special built-in backup role with the minimal set of rights needed to back up data. For this procedure we will create an individual user and, by tradition, first generate an x.509 certificate for it. I will not repeat the whole certificate generation procedure, as it has been shown several times in this article; I will only say that you should end up with the following subject:
CN=backuper,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU
Now connect to the cluster and create the backuper user with the built-in backup role:
mongos> db.getSiblingDB("$external").runCommand({
createUser: "CN=backuper,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
roles: [{role: "backup", db: "admin"}]
})
After creating the user, we can try to back up our analytics database. The mongodump arguments are similar to connecting with authentication; additionally we specify the database name (--db), the directory where the dump will be saved (-o), and the --gzip flag indicating that all dump files should be compressed:
mongodump --ssl --sslCAFile "/root/mongodb/keys/mongodb-CA-cert.crt" --sslPEMKeyFile "/root/mongodb/keys/backuper.pem" -u "CN=backuper,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU" --host server1.cluster.com --port 27017 --authenticationMechanism "MONGODB-X509" --authenticationDatabase "$external" --db analytics --gzip -o "/path/to/backup/"
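To run such a dump on a schedule (as mentioned, we do it periodically), a cron entry is the simplest option. A sketch: backup-analytics.sh here is a hypothetical wrapper script containing the mongodump command above; the entry runs it every night at 3:00 and appends its output to a log:
0 3 * * * root /root/mongodb/backup-analytics.sh >> /root/mongodb/logs/backup.log 2>&1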
A bit of code ...
At the end of the article I want to share examples of program code demonstrating connections to the created cluster. Since the parts of our service that work with the cluster are written in C++ and Python, the examples will be in these wonderful programming languages.
So, let's start with a C++ example. The connection example below is relevant for the official legacy MongoDB driver mongodb-cxx-driver-legacy-1.1.1.
#include <iostream>
#include "mongo/client/dbclient.h"
...
mongo::DBClientConnection client(true); // enable automatic reconnect
try {
    // fill in the SSL connection options
    mongo::client::Options options;
    options.setSSLMode(mongo::client::Options::SSLModes::kSSLRequired);
    options.setSSLCAFile("/path_to_certs/mongodb-CA-cert.crt");
    options.setSSLPEMKeyFile("/path_to_certs/analyticsuser.PEM");
    mongo::Status status = mongo::client::initialize(options);
    mongo::massertStatusOK(status); // check that everything is OK
    client.connect("server1.cluster.com:27017"); // address and port of the host running mongos
    // authentication parameters: database, user, mechanism
    mongo::BSONObjBuilder auth_params;
    auth_params.append("db", "$external");
    auth_params.append("user", "CN=analyticsuser,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU");
    auth_params.append("mechanism", "MONGODB-X509");
    client.auth(auth_params.obj()); // perform authentication
} catch (const mongo::DBException &e) {
    std::cout << "DBException : " << e.toString() << std::endl;
}
...
...
Before connecting to the database host, we need to initialize the client using the mongo::client::Options structure, specifying the required SSL level (kSSLRequired), the public CA certificate (mongodb-CA-cert.crt) and the PEM file of the cluster user (in this case the analyticsuser we created earlier).
Next, we connect to the database and, if everything goes well, authenticate successfully. Pay attention to the name of the database through which authentication is performed, "$external"; as the user name we pass the subject from the user certificate, and we do not forget to specify the authentication mechanism. Note also that no password is transmitted, since our authentication is external, via the certificate.
In the web part of the project written in Python, the pure pymongo driver is involved, and the object model is formed using the mongoengine framework.
To get started, an example for pymongo:
import ssl
from pymongo import MongoClient, ReadPreference

db_hosts = "server1.cluster.com:27017,server2.cluster.com:27017"
db_port = None
db_name = "analytics"  # the database we work with
db_user = "CN=analyticsuser,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU"

client = MongoClient(db_hosts,
                     db_port,
                     read_preference=ReadPreference.NEAREST,
                     ssl=True,
                     ssl_certfile="/path_to_certs/analyticsuser.PEM",
                     ssl_cert_reqs=ssl.CERT_REQUIRED,
                     ssl_ca_certs="/path_to_certs/mongodb-CA-cert.crt")
db = client[db_name]
db.authenticate(name=db_user, source="$external", mechanism="MONGODB-X509")
Nothing special: we also pass the public CA certificate and the client PEM file. The db_hosts variable deserves attention: it is effectively a connection string in which the addresses and ports of the mongos instances are separated by commas. The port parameter (db_port) does not need to be specified in our case; I included it for clarity. A pymongo driver connected this way will, if the first address is unavailable, automatically attempt to connect to the second address, and vice versa. Practice shows that if both servers are available at the time of the first connection, the addresses are tried in order, i.e. the first connection will be to server1.cluster.com:27017.
However, when testing this pymongo behavior, it was noticed that the automatic reconnection is preceded by a pymongo.errors.AutoReconnect exception. To handle this situation, a small decorator was written that can wrap, for example, the functions rendering a statistics page or serving an API read request:
from functools import wraps
from pymongo.errors import AutoReconnect
import time

def pymongo_reconnect(attempts=5):
    def decorator(f):
        @wraps(f)
        def decorated_function(*args, **kwargs):
            tries_reconnect = attempts
            if tries_reconnect <= 0:
                tries_reconnect = 1
            while tries_reconnect:
                try:
                    return f(*args, **kwargs)
                except AutoReconnect as ar:
                    tries_reconnect -= 1
                    print("Caught AutoReconnect exception.")
                    if tries_reconnect <= 0:
                        raise ar
                    time.sleep(0.1)
                    print("Attempt to reconnect (%d more)...\n" % tries_reconnect)
                    continue
        return decorated_function
    return decorator
The decorator gives the wrapped function a number of attempts to execute (five by default here) and, once all attempts are exhausted, re-raises the exception.
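Purely as an illustration, here is a minimal sketch of how such a decorator might be applied; the function name, the "devices" collection and the query below are hypothetical placeholders, not part of the original project:
@pymongo_reconnect(attempts=5)
def get_device_stats(device_id):
    # Placeholder collection name and query: transient AutoReconnect errors
    # during the read will trigger up to 5 retries before the exception is raised.
    return list(db["devices"].find({"device_id": device_id}).limit(100))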
A few words should also be said about the read_preference parameter from the connection example. read_preference tells the driver which read rule to use for this connection (writes always go to PRIMARY, which is logical). The following values are available:
PRIMARY - always read from the primary member of the shard's replica set;
PRIMARY_PREFERRED - read from the primary member of the replica set, but fall back to a secondary if the primary is unavailable;
SECONDARY - read only from secondary members of the replica set;
SECONDARY_PREFERRED - read from secondary members whenever possible, falling back to the primary if none are available;
NEAREST - read from whatever is available (as the pymongo documentation puts it); the MongoDB documentation explains in more detail that it is not simply the first replica set member to respond, but the one with the lowest network latency - essentially a ping - regardless of whether the data comes from a primary or a secondary.
Thus, on the one hand this parameter lets us offload read traffic from the PRIMARY instances, but on the other hand it can lead to stale or inconsistent data, because SECONDARY instances inevitably lag behind the PRIMARY during replication (how much depends on your replica set configuration and load). So choose this option carefully, based on the assumptions and constraints of your system.
It should also be noted that if the PRIMARY or SECONDARY preference cannot be satisfied, pymongo raises an OperationFailure exception, so this behavior has to be taken into account when using these options.
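If only some queries can tolerate reading from secondaries, the preference does not have to apply to the whole connection; it can be set per collection instead. A minimal sketch, assuming pymongo 3.x and a placeholder collection name:
from pymongo import ReadPreference

# "device_stats" is a placeholder collection name; read it from the nearest
# member, while the rest of the connection keeps its default preference.
stats = db.get_collection("device_stats", read_preference=ReadPreference.NEAREST)
latest = stats.find().sort("timestamp", -1).limit(10)
# Writes still go to the PRIMARY regardless of the read preference.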
With the mongoengine package things turned out to be less cheerful. The first thing I found in the project was the point where it connects to the database through mongoengine:
connect('default', host, port)
OK, I thought: "Now I will pass the remaining connection parameters to mongoengine.connect, just as with pymongo, and the problem is solved." But my hopes were in vain: mongoengine.connect does not expose the parameters I needed; it is just a thin wrapper around a function with a wider list of arguments, mongoengine.register_connection. Among that function's parameters there was also nothing that would let me pass the MONGODB-X509 authentication mechanism to the connection. I made a couple of futile attempts hoping the framework would "understand" what was required of it, but after digging into the source code I was convinced that the support was simply missing, and there was no way to forward the required mechanism to pymongo (on which mongoengine is actually built).
It turned out that a similar ticket for this shortcoming had already been opened on GitHub but never finished, so I decided to make my own fork and add everything I needed.
Thus, the connection with x.509 authentication took the following form:
import ssl
from mongoengine import DEFAULT_CONNECTION_NAME, register_connection
from pymongo import ReadPreference

db_hosts = "server1.cluster.com:27017,server2.cluster.com:27017"
db_port = None
ssl_config = {
    'ssl': True,
    'ssl_certfile': "/path_to_certs/analyticsuser.PEM",
    'ssl_cert_reqs': ssl.CERT_REQUIRED,
    'ssl_ca_certs': "/path_to_certs/mongodb-CA-cert.crt",
}
register_connection(alias=DEFAULT_CONNECTION_NAME,
                    name="statistic",
                    host=db_hosts,
                    port=db_port,
                    username="CN=username,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
                    password=None,
                    read_preference=ReadPreference.NEAREST,
                    authentication_source="$external",
                    authentication_mechanism="MONGODB-X509",
                    **ssl_config)
Unfortunately, I have not yet managed to get the change merged into the main MongoEngine repository, because the tests fail on every python/pymongo combination. I noticed the same tests failing in many other developers' recent pull requests, which makes me suspect a problem in the "stable" branch of the framework itself.
I hope the situation improves in the near future, the problem gets sorted out, and x.509 authentication support lands in the main MongoEngine repository.
Update
Support for the authentication mechanism option has been added to the official version of mongoengine.
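With that option in place, the fork described above should no longer be necessary. Below is a minimal sketch of what the connection might look like through mongoengine.connect, assuming a mongoengine version that supports authentication_mechanism and forwards these keyword arguments to register_connection; the parameter names mirror the register_connection example above:
import ssl
from mongoengine import connect
from pymongo import ReadPreference

# Sketch only: assumes connect() passes these kwargs through to register_connection().
connect(db="statistic",
        host="server1.cluster.com:27017,server2.cluster.com:27017",
        username="CN=username,OU=StatisticsClient,O=SomeSystems,L=Moscow,ST=MoscowRegion,C=RU",
        password=None,
        read_preference=ReadPreference.NEAREST,
        authentication_source="$external",
        authentication_mechanism="MONGODB-X509",
        ssl=True,
        ssl_certfile="/path_to_certs/analyticsuser.PEM",
        ssl_cert_reqs=ssl.CERT_REQUIRED,
        ssl_ca_certs="/path_to_certs/mongodb-CA-cert.crt")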