JBoss 7 Cluster - Load Balancing Using Apache

In the previous article we configured replication of user sessions to all nodes of a JBoss cluster. On its own this does not improve fault tolerance: to benefit from it, you need a load balancer that distributes external requests among the cluster nodes. Here the load is distributed by the Apache web server, as the JBoss documentation recommends. All the information in this article is available from various sources, but it is scattered, and I have not come across a resource that collects everything in one place together with solutions to the problems that arise in practice (I ran into them myself, so I am sharing the recipes). The article does not claim to be complete; on the contrary, it describes a minimal configuration. Designed for both professionals

General principles of work

Professionals familiar with the principles of clustering may skip this section. For clarity, I will consider a specific situation that many people face: an existing system consisting of a single application server is moving to a clustered solution. The best approach in such cases is not to try to do everything at once but to build a minimal working configuration: two application servers in a cluster, with a web server in front of them as the frontend for clients.

The scheme is as follows: all requests that were previously sent directly to the JBoss HTTP/HTTPS connector now go to the web server (Apache). Apache is configured to “know” about the two application servers behind it. When a client first accesses the system, the web server selects one of the application servers (Node-1) and forwards the request to it. A session is created, and a cookie is added to the response; the web server then uses it to “stick” all subsequent requests from the same client to the selected application server. Whenever the session is created or modified, Node-1, which is processing the client's requests, replicates the session state to the other cluster nodes, where it sits as dead weight and brings no benefit for the time being.
Session Details
In simplified terms, a session is an object created on the JBoss side that holds various attributes. A session has an identifier that is sent to the browser when the session is created; on every subsequent request the browser sends it back to the server (usually as a cookie). The server finds the session by its identifier and processes the request in the context of the data stored in that session. The task of replication in a cluster is to copy session data to the other nodes, so that when a client reaches one of them, that node can find the session (and the data stored in it) by identifier and process the client's request in its context, just as Node-1 would.
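As a rough illustration of the replication idea, here is a toy Python model. The Node class, its method names and the in-memory dictionaries are hypothetical and exist only for this sketch; real JBoss replication goes through its clustering subsystem and is far more involved.

```python
import uuid


class Node:
    """Toy application server node (hypothetical, not JBoss code)."""

    def __init__(self, name):
        self.name = name
        self.sessions = {}  # session id -> dict of session attributes
        self.peers = []     # other nodes of the cluster

    def create_session(self):
        session_id = uuid.uuid4().hex  # random identifier with no meaning of its own
        self.sessions[session_id] = {}
        self._replicate(session_id)
        return session_id

    def set_attribute(self, session_id, key, value):
        self.sessions[session_id][key] = value
        self._replicate(session_id)

    def find_session(self, session_id):
        """Look a session up by the identifier the browser sent back."""
        return self.sessions.get(session_id)

    def _replicate(self, session_id):
        # Copy the current session state to every other node.
        for peer in self.peers:
            peer.sessions[session_id] = dict(self.sessions[session_id])
```

After node1 creates a session and sets an attribute, node2.find_session(session_id) returns a copy of the same data even though node2 never served the client.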
If Node-1 “crashes”, then on the client's next request Apache detects that the server is unavailable and forwards the request to another node (Node-2). This is where the “dead weight” comes into play: the session whose state is already present on all nodes of the cluster. The request is processed by Node-2, and Apache “sticks” the session to it, so that all subsequent requests from this user go straight to Node-2.
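The routing decision described in this section can be sketched in Python as follows. This is a simplified model under my own assumptions, not Apache's actual algorithm: the StickyBalancer class is hypothetical, nodes are picked round-robin, and liveness is tracked by hand through mark_down. Only the ROUTEID cookie name and the node1/node2 route names come from the configuration in this article.

```python
import itertools


class StickyBalancer:
    """Toy model of sticky routing with failover (not Apache's code)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)        # e.g. ["node1", "node2"]
        self.alive = set(self.nodes)    # nodes currently considered reachable
        self._round_robin = itertools.cycle(self.nodes)

    def mark_down(self, node):
        self.alive.discard(node)

    def route(self, cookies):
        """Return (chosen node, cookies to set on the response)."""
        sticky = cookies.get("ROUTEID", "").lstrip(".")
        if sticky in self.alive:
            # The client is already stuck to a live node: no new cookie needed.
            return sticky, {}
        # First request, or the sticky node is down: pick the next live node
        # and tell the client to stick to it from now on.
        for _ in range(len(self.nodes)):
            candidate = next(self._round_robin)
            if candidate in self.alive:
                return candidate, {"ROUTEID": "." + candidate}
        raise RuntimeError("no live nodes in the cluster")
```

A request without cookies is routed to some node and receives a ROUTEID cookie; requests carrying that cookie keep going to the same node until it is marked down, after which the client is re-stuck to another node.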

Basic configuration

If session replication is already configured on the JBoss side, no additional settings are required there: everything that is needed (the AJP connector) is already present in the default configuration. You need to download and install Apache HTTP Server (the installation process is beyond the scope of this article). All configuration is done by editing the conf/httpd.conf file.
  1. Loading modules. Uncomment or add the following lines:
    LoadModule proxy_module modules/mod_proxy.so
    LoadModule proxy_ajp_module modules/mod_proxy_ajp.so
    LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
    LoadModule headers_module modules/mod_headers.so

    The modules mod_proxy.so, mod_proxy_ajp.so and mod_proxy_balancer.so are required for the load balancer itself. If the system has no concept of a “session”, they are sufficient. For example, I worked with a system that handled requests through web services: these were single, unrelated calls, so no sessions were created. In our case there are users and their sessions, so you also need the mod_headers.so module, which provides the Header directive used below to set the cookie that ties a client to a node.
  2. Configuring the balancer. Add the following lines:
    NameVirtualHost 192.168.1.0:80
    <VirtualHost 192.168.1.0:80>
        Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
        <Proxy balancer://ajp-cluster>
            BalancerMember ajp://192.168.1.1:8009 route=node1
            BalancerMember ajp://192.168.1.2:8009 route=node2
            ProxySet stickysession=ROUTEID
        </Proxy>
        <Location />
            ProxyPass balancer://ajp-cluster/
        </Location>
    </VirtualHost>

    Replace 192.168.1.0 with the address of your Apache server.
    In the Proxy section, list the addresses of the AJP connectors of the cluster nodes, one BalancerMember entry per node. In the default JBoss 7 configuration the AJP connector listens on port 8009. The route parameter assigns each node a unique name that is used only by Apache itself, so it does not have to match the name given when JBoss is started (the jboss.node.name parameter). Nothing else needs to be changed; the cluster is ready to be started and tested.
    Configuration Details
    The Header add Set-Cookie directive works together with the ProxySet stickysession directive as follows. On the client's first request to Apache, once the balancer has chosen the node that will handle it, a cookie named "ROUTEID" is added to the response; its value carries the identifier of the chosen node (node1 or node2 in the configuration above). On subsequent requests Apache no longer has to "think" about which node was chosen earlier: the cookie tells the web server explicitly where to send the request. The Header directive is responsible for setting the cookie, and the ProxySet stickysession directive makes the balancer use it to "stick" the session to a specific node. If your server does not work with sessions (for example, only web services, as in my example above), the Header and ProxySet directives are not needed.
    JSESSIONID
    Many configuration examples suggest using the JSESSIONID session identifier as the node marker. This seems logical, because this identifier is used by most Java web applications. An example from apache.org:
    ProxyPass /test balancer://mycluster stickysession=JSESSIONID|jsessionid scolonpathdelim=On
    <Proxy balancer://mycluster>
        BalancerMember http://192.168.1.50:80 route=node1
        BalancerMember http://192.168.1.51:80 route=node2
    </Proxy>
    I must say right away: this does not work for JBoss, at least not in the default configuration. The method assumes that the value of the JSESSIONID cookie generated by the application server consists of two parts, the session identifier followed by the server identifier, separated by a dot. JBoss 7 generates a plain JSESSIONID cookie containing only the session identifier, a randomly generated value that carries no routing information.
    Thus, the main differences from the configuration I described are as follows:
    • In the configuration described in this article, the node identifier and the session identifier live in different cookies; in the apache.org example the two values are combined into one cookie.
    • In the configuration described in this article, the node identifier is generated by the balancer (which is logical: whoever uses it creates it) and the session identifier by the application server (the same principle: the session identifier is needed by JBoss, not by Apache). In the apache.org example both values are generated by the application server (or by the backend web server).
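To make the difference concrete, here is a small Python sketch of how a route could be read from a sticky cookie. It is a simplified model, assuming the route is whatever follows the first dot in the cookie value; the function names and the sample session identifier A1B2C3 are hypothetical.

```python
def extract_route(cookie_value):
    """Return the text after the first dot, or None if there is no dot
    or nothing follows it."""
    _, separator, route = cookie_value.partition(".")
    return route if separator and route else None


def sticky_route(cookies, sticky_name):
    """Read the route from the cookie chosen as the sticky session marker."""
    value = cookies.get(sticky_name)
    return extract_route(value) if value else None
```

With a combined cookie such as JSESSIONID=A1B2C3.node1 the route is node1. With a plain JSESSIONID=A1B2C3, as described above for JBoss 7, no route can be extracted, while the separate cookie ROUTEID=.node1 still yields node1.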



SSL Setup (HTTPS)

If you encrypted traffic between the server and the client before clustering, that is, served clients over HTTPS, the certificates were configured on the JBoss side, in the HTTPS connector configuration. After the move to a clustered solution, Apache must manage the certificates.

General principles
The process is as follows: HTTPS requests from the client arrive at the Apache web server, are decrypted there and are passed to the application server in clear text over the AJP protocol. The response from the application server also travels to the web server in clear text, is encrypted there and is sent to the client over HTTPS. Thus the segment between Apache and JBoss forms a so-called "demilitarized zone", i.e. an environment closed to unauthorized external access, in which the servers trust each other and need no additional application-level protection. Only Apache should be reachable from the outside, on the HTTPS port (usually 443); everything else should be tightly closed.

Web server settings in conf/httpd.conf:
  1. Listening on port 443. In its default configuration the Apache web server listens only on port 80. To make it listen on port 443 as well, add the line:
    Listen 443
  2. Loading modules. Uncomment or add the following line:
    LoadModule ssl_module modules/mod_ssl.so

  3. Configuring the balancer. Add the following lines:
    NameVirtualHost 192.168.1.0:443
    <VirtualHost 192.168.1.0:443>
        Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
        <Proxy balancer://ajp-cluster>
            BalancerMember ajp://192.168.1.1:8009 route=node1
            BalancerMember ajp://192.168.1.2:8009 route=node2
            ProxySet stickysession=ROUTEID
        </Proxy>
        <Location />
            ProxyPass balancer://ajp-cluster/
        </Location>
        ProxyRequests off
        SSLEngine on
        SSLCertificateFile c:/Apache2/conf/my.cer
        SSLCertificateKeyFile c:/Apache2/conf/my.key
    </VirtualHost>


Compared with the previous configuration, here you also need to specify the paths to the certificate and the private key that your server uses to encrypt traffic.

The configuration above is for Apache 2.2. Starting with version 2.4, also load the mod_lbmethod_byrequests.so and mod_slotmem_shm.so modules, and remove the NameVirtualHost declarations, which are no longer needed.
Problems with certificates and their solution
When encryption was set up on the JBoss side, a single file containing both the key and the certificate was required, for example in PKCS #12 format. Apache requires them in two separate files: the certificate in one and the private key in the other. In addition, Apache has two limitations:
  1. Only the PEM format is recognized. You can check whether your key is in this format by opening the key file in a text editor. The content should contain the lines "-----BEGIN PRIVATE KEY-----" and "-----END PRIVATE KEY-----", and the lines between them should be no longer than 64 characters. The same applies to the certificate, with the lines "-----BEGIN CERTIFICATE-----" and "-----END CERTIFICATE-----" respectively.
  2. The Windows build of Apache cannot open password-protected keys.
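The text-editor check from the first point can also be scripted. Below is a minimal Python sketch; the looks_like_pem function is a hypothetical helper that tests only the surface criteria listed above (marker lines and line length) and does not validate the key or the certificate itself.

```python
def looks_like_pem(text, label):
    """Rough check: BEGIN/END marker lines for the given label are present
    and every line between them is 1 to 64 characters long."""
    begin = f"-----BEGIN {label}-----"
    end = f"-----END {label}-----"
    lines = [line.strip() for line in text.splitlines()]
    if begin not in lines or end not in lines:
        return False
    start, stop = lines.index(begin), lines.index(end)
    if start >= stop:
        return False
    body = lines[start + 1:stop]
    return bool(body) and all(0 < len(line) <= 64 for line in body)
```

For example, looks_like_pem(open("my.key").read(), "PRIVATE KEY") checks a key file and looks_like_pem(open("my.cer").read(), "CERTIFICATE") checks a certificate file.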

All of the above problems can be solved with the openssl utility. From the key and certificate store file used in JBoss, extract the key and the certificate into separate files, converting them to the required format and removing the password protection along the way. Here is an example of conversion from PKCS #12 format (other formats are supported as well; a search engine will help you):
openssl pkcs12 -nocerts -nodes -in C:\key.p12 -out C:\Apache2\conf\my.key
openssl pkcs12 -clcerts -nokeys -nodes -in C:\key.p12 -out C:\Apache2\conf\my.cer

ATTENTION! Do not forget to delete the resulting my.key file from your computer after transferring it to the production server: it is not protected in any way and may be compromised.

Conclusion

After applying the solution described in this article, you can in principle remove the HTTP/HTTPS connector definitions from the JBoss configuration altogether, because all external requests now arrive through the AJP connector. This reduces the server's exposure, in particular to scanning tools that pay special attention to these protocols.

PS

The proposed clustering option solves the problems of load balancing and fault tolerance only partially. A full-fledged configuration still lacks DBMS clustering and duplication of the single entry point, Apache. The hardware infrastructure matters a great deal as well: redundant networks and routers, redundant server components, uninterruptible power supplies and more. Nevertheless, this article helps you develop the system according to the 80/20 principle, i.e. a significant gain in performance and fault tolerance at low cost. In my own practice I have not encountered solutions that went further, apart from clustering of the DBMS, but if I do, I will write about it. If your services are highly critical, the solution I proposed is only the first step.

There is a similar article on Habr, and I recommend reading it as well. Despite the similar topic, my article is noticeably different and, in my opinion, has every right to exist. I tried to explain clearly how and why everything works, whereas that article simply lists the contents of the configuration and batch files. In addition, that article uses mod_jk in Apache for balancing, while I use mod_proxy. According to Stack Overflow, the main disadvantage of mod_jk is "Need to build and maintain a separate module". This is exactly what our system administrators ran into in practice: they were unable to install this module on Apache under AIX. So if you have a similar problem, you can try mod_proxy, which ships with Apache and, in my opinion, is easier to set up.
