Alex1C July 22, 2015 at 12:32

Fault tolerance in MS SQL 2012 for 1C: Enterprise 8.2

From the sandbox

Many currently use fault tolerance technology in building information systems, and this topic is not new. Moreover, ensuring only fault tolerance is no longer considered the only and sufficient requirement. The ideal system, in my opinion, should be

Fault-tolerant (ensuring continuous operation of the system in case of failure of its constituent parts)
With load balancing and high utilization of resources (with the current operation, even load distribution and the use of all resources, including those reserved for failure)
Easily extensible (scalable)

For those who do not want to master a lot of letters and pictures, I suggest moving to the end of the article to conclusions, along the way you can still look into the experiment.

Fault tolerance in MS SQL

To get started, we’ll briefly look at existing Microsoft high availability database technologies. One fault tolerance solution is to use the Windows Server Failover Cluster (WSFC). Consider it with an example:

WSFC uses shared network storage (SAN). When you install SQL in a failover cluster, system and user databases are hosted on a shared repository. In the above diagram, two nodes (SQLCLU01NODE01 and SQLCLU01NODE02) are combined into a failover cluster SQLCLUSTER01A. In this case, SQL is installed by a named instance (instance), i.e. The clustered SQL server connection address looks like SQLCLUSTER01A \ SQL. During normal operation, one of the cluster nodes (server) is active (in the example - SQLCLU01NODE01), and the second is in hot standby. When the active server SQLCLU01NODE01 fails, the cluster service switches (transparent to users) to the standby node SQLCLU01NODE01.

The obvious advantages of WSFC follow from the foregoing: high reliability through full server redundancy, no downtime during server failure, as well as disadvantages: no disk array failure protection, high solution cost (an idle backup server similar to a working one).

The solution has long existed (in my opinion, starting with Windows Server 2000 and SQL Server 2000) and should not cause difficulties during deployment.

With the release of SQL2005SP1, the possibility of mirroring databases has appeared. The basic principle of mirroring is to create a hot backup of the database on another server (storage).

On the primary server (Primary) is the database (principal database), which users normally work with.
On the backup server (Mirror) is a mirror copy of the database (mirror database).

In the protected synchronous mode of mirroring, a witness server (Witness) can be added to the circuit, the task of which is to monitor the mirror image and, in the event of a failure of the main server or database, automatically transfer the mirror database to the operational state.

Advantages of mirroring: ensuring fault tolerance of disk arrays, the ability to use the resources of a backup server, the ability to create a geographically distributed cluster. Disadvantages: high cost of the solution (a backup server with a disk subsystem is required, as well as a witness server).

It should be noted that connecting to the mirror database by the client application will not work due to the fact that this database is always in recovery mode. In this case, you can use a snapshot of the mirror base. You can read about this and other Microsoft technologies, for example, here.

New features of MS SQL 2012

In addition to all of the above, a new functionality has appeared in SQL 2012 - AlwaysOn Availability Groups. What Microsoft writes to us:

Availability Groups is an enterprise-level high-availability and disaster recovery solution introduced in SQL Server 2012 to enable you to maximize availability for one or more user databases.

Interesting ... i.e. we are provided with a new technology of high availability at the enterprise level and it most likely should be cooler than what is already there.

Let's try to understand the features. At first glance, AlwaysOn Availability Groups is a development of mirroring. There are many instructions on the network for deploying and configuring AlwaysOn, so I’ll only talk about the essence. To deploy AlwaysOn, you need a WSFC failover cluster. To enable AlwaysOn support, you must install MS SQL on the failover cluster instance and enable the AlwaysOn high availability level in the SQL service properties (SQL Server Configuration Manager snap-in). Naturally, this must be done on all servers that we plan to use for AlwaysOn.

The following figure shows a WSFC cluster with Node01-Node05 nodes. Each node has an instance of SQL Server with AlwaysOn support installed. Within the framework of SQL servers, an MyAg availability group was created with the maximum possible number of replicas (the working base is Primary Replica and the 4th copy is Secondary Replica).

An availability group is a collection of availability replicas, their modes of operation, a collection of databases, and a group listener. The figure below shows a view of MS SQL Management Studio utility with an availability group.

Here we see a connection to an instance of SQL server named ARSHAD-PC. An AlwaysOnDemo-AG Availability Group has been created on this server. Two replicas (servers on which the replicas are located) ARSHAD-PC and ARSHAD-LP have been added. Moreover, at the moment, the main replica of the availability group is located on the ARSHAD-PC server and this group is managed from the same server. The availability group also includes two AdventureWorks and AdventureWorksDW databases. This availability group has a listener named AlwaysOnDemo-L. In fact, this is a fail-safe address for connecting to the availability group virtual server.

An unlimited number of availability groups can be created, and there can be many databases in one group. Creating a group is accompanied by a simple and clear wizard and should not cause any special problems. It is worth paying attention to two features.
When adding the database of the current server to the availability group, you must first make an archive of this database (the recovery model must be Full). To create a listener when creating an availability group, domain administrator privileges are required.
And when adding databases to an existing availability group, the administrator privileges of the SQL server will be sufficient.

It is also necessary to mention that the rest of the “binding” (users, service plans, agent tasks, etc.), which is not part of the availability group, remains local to the server. You should take care of synchronization between servers and coordination of this “binding” among yourself.

The advantages and disadvantages of AlwaysOn are basically the same as those of mirroring, but there are also features. As I understand it, the main differences between AlwaysOn and mirroring are the ability to create read-only replicas, the ability to create more than one copy, ease of configuration, great flexibility, the absence of a Whitness server (instead, the listener is in the availability group).

Due to the presence of some experience with 1C, there was a desire to check the work of AlwaysOn in conjunction with 1C: Enterprise. Moreover, when building WSFC for 1C, there should not be any problems or features. In fact, AlwaysOn should provide fault tolerance higher than WSFC (if you use different repositories), while it will be possible to offload the main database from report users (ReadOnly replica) and we can also distribute the load across the servers by creating several workgroups.

Work 1C: Enterprise and MS SQL 2012

Starting from version 8.2.17, the 1C: Enterprise platform officially supports working with MSSQL 2012 (before that, it was necessary to transfer the stored procedure sp_dboption from SQL2008).

Due to the appearance of AlwaysOn, the syntax of the ODBC connection string was supplemented in SQL, the following parameters appeared: ApplicationIntent - allows to determine the type of workload (ReadOnly / ReadWrite), MultiSubnetFailover - accelerated detection of the active server, etc. 1C does not support these new features. There is a way out - when connecting to the working database, specify the listener address, and connect to the database in read mode directly to the ReadOnly replica server (for more details, see the experiment).

Work with the base for reading. As you know, when working with 1C, even if the user is not going to change anything in the database, there are algorithms that try to call insert-s and update-s in the database. The easiest way is to rewrite the 1C configuration in which to remove (comment out) the parts of the code that lead to the changes. If this is not done, then the user will not be able to work with the ReadOnly database - 1C will fall off with an error about the impossibility of changing the database. The following are examples for typical configurations.

Scheduled tasks, obviously for the ReadOnly database they need to be disabled
When processing, the “Function Panel”, when opening, always tries to set the value of the parameter “OpenWhenStartPanelFunction” of the “User Settings” information register is set to “True”. Workarounds - disable the "OpenWhenPlayFunction Panel" setting for users, comment out the code or add a special ReadOnly role, and execute the code if the role is different
Processing the “Function Panel”, upon closing, writes the value of the parameter “Current Page of the Function Panel” of the “User Settings” information register. Workarounds - disable the "OpenWhenPlayFunction Panel" setting for the user, add a special ReadOnly role, and execute the code if the role is different
Internet user support (general form “InternetSupport for UsersError of Access to the Internet”). At the start of the configuration, 1C offers to connect to the 1C server and, if there is a connection, sets the setting in the general settings repository. Workarounds - reset the auto-connect settings to Internet support by creating external processing with the following button code:
When closing the report (for example, the balance sheet), the report settings are recorded in the "Saved Settings" directory. Workarounds: add a special ReadOnly role and in the Save procedure Configure the common module Standard Reports insert a check to save the settings, if the role is different

Experiment

So, the experiment itself. Earlier, we agreed that we will explore AlwaysOn SQL 2012 with reference to 1C. To begin, I will describe the configuration of the servers.

We have 4 servers at our disposal: one 1C application server and three SQL servers in the cluster.

The experiment will be carried out in two stages. At the first stage, we will test the possibility of working the 1C infobase using AlwaysOn SQL 2012 and check the failover. In the second stage, we examine the operation of the ReadOnly replica. For simplicity, we restrict ourselves to two databases.

As an information base 1C, we take the typical configuration "1C: Enterprise Accounting". We will carry out the preparatory work and configure the servers, availability groups and databases. So, we create the Dbtest2 database on the ServerSQL2 server, then on the same server we create the AG_test2 availability group, where we put the same database: We do the

same with the Dbtest1 database on ServerSQL1. Now, on the Server1C application server, we register two infobases:

Well, let's start the experiment itself by connecting from the client computer using 1C: Transfer to both infobases. In both sessions 1C: Enterprise, we will run the formation of some long-playing report, and without waiting for the completion of its formation, we model the server failure in the AGtest2 availability group:

And then we get an error in the infobase, which is located in the availability group, where the failure was modeled:

Connecting to another infobase (for obvious reasons) remains working. We restart the application 1C of the "emergency" information base on the client, the program works correctly. At the same time, in SQL Management Studio, we can observe that ServerSQL1 is now becoming the main replica in this availability group.

Probably, if we don’t read, but change something, for example, post a document, the result of the change will most likely be rolled back into the transaction by mistake . Those. fault tolerance does not apply to running requests.

Now let's try working with the ReadOnly database. In order to connect from 1C to such a database, you need to do two things: modify the 1C configuration so that it does not write to the database during operation, and configure the connection of the information base.

Examples of refinement of the 1C configuration are given above. As for connecting to the SQL database, then everything is much simpler. Because 1C: Enterprise platform does not support the new syntax for the connection string to MSSQL 2012, we will register the connection to the database in ReadOnly mode directly:

It is also necessary to specify the ability to connect to ReadOnly replica for all clients. To do this, in SQL Management Studio, open the properties of the availability group and for the replica change the property “Secondary replica for reading” from the value “Read only” to “Yes”:

After that, we connect from the client computer with the application 1C: Enterprise to the ReadOnly information base. Connection occurs without problems, reports are generated. But when you try to change something, SQL returns an error about the impossibility of changes and the 1C session closes.

In addition, I note that in large databases, it is absolutely possible to distinguish a significant category of users for working with reports (just this case). Having selected separate servers for them (1C and SQL), you can distribute the load among the hardware resources. It works.

Information in the ReadOnly database is updated “almost instantly”. Only a significant change in the data in the main replica can delay the update in the database for reading.

A curious effect was also noticed. In the main database, we change the structure (for example, add a new directory), having previously driven out users. At the same time, we do not expel users from the database for reading; report users continue to work quietly. At the same time, the main base changes the structure, full-fledged users begin to work in it, fill in a new directory. Changing the structure of the database and filling the database with new information is reloaded into the replica for reading. At the same time, users of reports (replicas for reading) will not see structure changes in the current session, i.e. a new directory (because the 1C client read the old metadata). But they will always have updated information from "known" objects. As soon as users of reports restart 1C (re-read metadata),

It is clear that this behavior also has a negative point - it is the integrity control of obtaining information (for example, a user can generate a report using an outdated algorithm).

conclusions

Unfortunately, there is fault tolerance in the Microsoft DBMS, but there is still no balancing. The failover in AlwaysOn is also upsetting: active connections to the database fall off. Expectations, based on a general description of the technology on Microsoft resources and on presentations at various conferences, were somewhat different. But rummaging through Microsoft resources, I found this .

Everything is honestly said, but in the general descriptions there is no such important feature, and it becomes obvious only after the experiment.
But at the same time, the flexibility and ease of configuration and management of AlwaysOn pleased us.

Also, the ReadOnly replica work “transparent” for 1C was pleased, although it is necessary to “finish” the 1C configuration a bit.
Well, it’s worth saying that the current AlwaysOn fault tolerance implementation may well be used if the loss of incomplete transactions and disconnection of active sessions at the time of switching is not critical for the business.

useful links

PS: An article from the report, which is 2.5 years old, but it seemed to me that it is relevant now (despite the release of SQL2014 and 1C 8.3).
PPS: And yet - do not forget to do BackUp, AlwaysOn does not replace them.

Tags: