Jira Data Center - what is it? How does it work? How to deploy it?


Introduction


With the spread of the Agile philosophy, Russian IT specialists are gaining more and more expertise in customizing and managing products for development teams, the most popular of which is still Jira. However, working with its most powerful, productive, and highly available edition - Jira Data Center - still raises a lot of questions. In this post I will talk about some of the principles and mechanisms of Jira Data Center that we apply in practice. I'll start with the structure of a Jira cluster.

What is Jira Data Center?


Jira Data Center is, in fact, the Server edition, but with the ability to use a shared database and a shared index.

It is important to understand that Jira Data Center by itself, as a product and as an application, does NOT provide fault tolerance and load balancing. These are the responsibility of modules and systems that have nothing to do with the Atlassian product.

In other words, Atlassian provides support for working in a cluster, but the clustering itself is implemented by external means, and the choice of those is quite rich.

A detailed description of the product can be found on the Atlassian website.

There are several deployment options:

1. On your own infrastructure
2. In the Amazon cloud (AWS)
3. In the Microsoft cloud (Azure)

This article describes the solution for your own infrastructure.

What problems does Jira Data Center solve?


Jira Data Center helps to achieve the following goals:

  1. Implementation of fault tolerance.
  2. Ensuring stable operation under high load. High load means large / enterprise-scale instances, according to the Jira Sizing Guide.
  3. Ensuring continuous operation when maintenance is required. Let me dwell on this point separately. The application often needs to be updated, and not every company can do this quickly and invisibly to users. This problem is solved by clustering and the so-called Zero Downtime upgrade scheme.

These problems are solved through clustering and scalable architecture.

What are the components of Jira Data Center?


As you can see in the figure below, a Jira Data Center cluster is a set of several dedicated machines.

Figure 1. Architecture of the Jira Data Center

  1. Application nodes (cluster nodes). They accept and process all workload and requests. Each node is an ordinary server with identical content and an identical application installation, plus the shared file system mounted.
  2. Shared file system with standard capabilities for file import / export, plugins, caching, and so on. The file server is also a separate machine hosting a shared folder or resource, which is mounted on the nodes and used for shared files.
  3. Shared database. The database server in this case is also a separate machine and can be built on MS SQL Server, PostgreSQL, MySQL, or Oracle.
  4. Load balancer. It distributes user requests and delivers them to the nodes; if one of them fails, the balancer redirects its requests to the other nodes almost instantly. Thanks to its work, users do not even notice the failure of a single node. We will talk about the balancer's work separately below.

Jira Data Center cluster topology


Here are the basic principles on which a Jira Data Center cluster is built:

  • Jira instances share a common database;
  • the Lucene index is replicated in real time and stored locally for each instance;
  • attachments are stored in a shared storage;
  • Jira instances monitor cache consistency;
  • several instances can be active at any time;
  • cluster locks are available;
  • the balancer is configured to route requests only to active nodes; it must not route requests to inactive nodes, and it must not pin all sessions to a single node.

All nodes are divided into active and passive. Active nodes:

  • process requests;
  • perform background processes and tasks;
  • can have scheduled tasks configured on one or more of them;
  • behave, in all practical scenarios, the way a standard Jira Server would.

Passive nodes, accordingly, do not process requests and do not perform tasks; they serve to absorb short-term workload (for example, at system startup, while loading plugins, and / or during indexing).

The figure below shows the operation of the Jira cluster.

Figure 2. Simplified diagram of the architecture

About load balancers


Either a server with a reverse proxy or a physical device can act as a balancer. Below are the best-known examples of balancers.

1. Hardware balancers:

• Cisco
• Juniper
• F5

2. Software balancers:

• mod_proxy (Apache) is a proxy module for the Apache HTTP Server that supports most popular protocols and several different load-balancing algorithms.

• Varnish is a reverse HTTP proxy and accelerator designed for sites with high traffic. Unlike the others, it only proxies and load-balances HTTP traffic. Varnish is used by Wikipedia, The New York Times, The Guardian, and many other major projects.

• Nginx is the number-one web server by popularity among load-balancing and proxying solutions for high-traffic sites. It is actively developed, and the vendor offers both free and commercial versions. It is used on many of the most visited sites in the world, for example WordPress.com, Zynga, Airbnb, Hulu, MaxCDN.

• Nginx Plus is, in fact, the paid commercial version of Nginx mentioned above.

• HAProxy is a free, open-source tool that provides load balancing and proxying for TCP / HTTP. It is fast, consumes few system resources, and is compatible with Linux, Solaris, FreeBSD, and Windows.

A good comparison of proxy servers can be found at this link.

Direct and reverse proxies


Load balancers can work as forward proxies as well as reverse proxies. The difference is well described by the author of this comment on Stack Overflow:

1. Forward proxy. Here the proxy fetches data from another web site on behalf of the original requester. As an example, take three computers connected to the Internet.

X = your computer, or a client computer on the Internet
Y = the proxy web site, proxy.example.org
Z = the web site you want to visit, www.example.net

Normally you can connect directly X -> Z. In some scenarios, however, it is better for Y to fetch from Z on behalf of X, which chains up as: X -> Y -> Z.

2. "Reverse proxy" (Reverse proxy). Imagine the same situation, only on site Y is a reverse proxy configured. Usually, you can connect directly from X -> Z. However, in some scenarios, administrator Z is better to restrict or prohibit direct access and force visitors to first pass through Y. Thus, as before, we receive data received from Y -> Z on behalf of X, which follows: X -> Y -> Z.
This case differs from “direct proxy” in that user X does not know that he is accessing Z, because user X sees that he is communicating with Y. Server Z is invisible to clients and only the external proxy server Y is visible . Reverse proxy does not require client-side configuration. Client X believes that he only interacts with Y (X -> Y), but the reality is that Y redirects the entire connection (X -> Y -> Z again).

Next, we look at working with a software balancer.

Which software balancer to choose?


In our experience, the best choice among software balancers is Nginx: it supports sticky sessions and is one of the most widely used web servers, which means good documentation and wide recognition among IT specialists.

A sticky session is a load-balancing method in which a client's requests are always delivered to the same server in the group. Nginx has a sticky directive that uses a cookie for balancing, but only in the commercial version. There is a free way, however: using external modules.

The module creates a cookie, which makes each browser unique. The cookie is then used to route requests to the same server. If there is no cookie (for example, on the first request), the server is chosen randomly. We will use this method in the cluster setup below.
More information about the sticky method can be found at this link, as well as in this Habr post.

Now let's move on to the practical part...

Instructions for creating a Jira Data Center cluster


For clustering, you can use either an existing instance with Jira installed or a new one. In our example, we will install new instances on different operating systems (to demonstrate the versatility of the system).

1. Let's start with the database server. You can use an existing one or create a new one. Again, for illustration purposes, Windows Server 2016 + PostgreSQL 9.4 was chosen. Install the OS, install the PostgreSQL server, install pgAdmin, add a user and a database.
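
As a sketch, creating the user and database in psql could look like this (the names and password are examples; the encoding settings follow Atlassian's recommendations for PostgreSQL):
CREATE USER jiradbuser WITH PASSWORD 'jirapassword';
CREATE DATABASE jiradb WITH ENCODING 'UNICODE' LC_COLLATE 'C' LC_CTYPE 'C' TEMPLATE template0 OWNER jiradbuser;
GRANT ALL PRIVILEGES ON DATABASE jiradb TO jiradbuser;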

2. Create the first node on Ubuntu 16.04 LTS. Update the repositories and install the necessary packages.
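
On a fresh Ubuntu node this could be, for example (the exact package set depends on your environment; fontconfig is needed for Jira's font rendering):
sudo apt-get update && sudo apt-get upgrade -y
sudo apt-get install -y curl fontconfig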

3. Download and install Jira Data Center, launch it, and configure the database connection (just in case, Atlassian has a detailed guide).
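
A sketch of downloading and running the installer (the version and file name here are illustrative; take the actual link from the Atlassian download page):
wget https://www.atlassian.com/software/jira/downloads/binary/atlassian-jira-software-7.12.3-x64.bin
chmod +x atlassian-jira-software-7.12.3-x64.bin
sudo ./atlassian-jira-software-7.12.3-x64.bin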

4. Stop Jira and shut down the node.
service jira stop

5. For further manipulations, it is better to temporarily disable Jira's autorun:
update-rc.d -f jira remove

6. Clone the disabled node.

7. Start the first node and stop Jira (by default, Jira is set to start automatically after installation).

8. Start the second node and stop Jira.

9. Create a separate instance for the balancer. I chose Ubuntu 16.04 because it is fast, simple, and requires no additional licensing costs.

10. Install nginx (version 1.13.4 is used in the example).

11. Download and unpack nginx-sticky-module-ng:
git clone https://bitbucket.org/nginx-goodies/nginx-sticky-module-ng.git

12. Prepare nginx for recompilation with the added module (a sketch of the preparation follows below).
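
On Ubuntu, the preparation might look like this (the nginx version matches the one from step 10; the module from step 11 should be cloned into /usr/local/src so the --add-module path in the next step works):
sudo apt-get install -y build-essential libpcre3-dev zlib1g-dev libssl-dev
cd /usr/local/src
wget http://nginx.org/download/nginx-1.13.4.tar.gz
tar -xzvf nginx-1.13.4.tar.gz
cd nginx-1.13.4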

13. Compile nginx with the nginx-sticky-module-ng module. In my case, the configure line was:
./configure --prefix=/etc/nginx --sbin-path=/usr/sbin/nginx --modules-path=/usr/lib/nginx/modules --conf-path=/etc/nginx/nginx.conf --error-log-path=/var/log/nginx/error.log --http-log-path=/var/log/nginx/access.log --pid-path=/var/run/nginx.pid --lock-path=/var/run/nginx.lock --http-client-body-temp-path=/var/cache/nginx/client_temp --http-proxy-temp-path=/var/cache/nginx/proxy_temp --http-fastcgi-temp-path=/var/cache/nginx/fastcgi_temp --http-uwsgi-temp-path=/var/cache/nginx/uwsgi_temp --http-scgi-temp-path=/var/cache/nginx/scgi_temp --user=nginx --group=nginx --with-compat --with-file-aio --with-threads --with-http_addition_module --with-http_auth_request_module --with-http_dav_module --with-http_flv_module --with-http_gunzip_module --with-http_mp4_module --with-http_random_index_module --with-http_realip_module --with-mail --with-mail_ssl_module --with-stream --with-stream_realip_module --with-stream_ssl_module --with-stream_ssl_preread_module --with-cc-opt='-g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -fPIC' --with-ld-opt='-Wl,-Bsymbolic-functions -Wl,-z,relro -Wl,-z,now -Wl,--as-needed -pie' --add-module=/usr/local/src/nginx-sticky-module-ng
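
After configure completes successfully, build and install the binary:
make
sudo make install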

14. Back up /etc/nginx/nginx.conf (copy it to nginx.conf.bak) and configure nginx in reverse proxy mode (see the sketch below).
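
As a minimal sketch (the host name is an example; jira_cluster refers to the upstream we will define in step 22), the proxy part inside the http { ... } section might look like this:
server {
    listen 80;
    server_name jira.example.com;   # example host name
    location / {
        proxy_pass http://jira_cluster;   # upstream defined in step 22
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_read_timeout 120s;
    }
}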

15. Next, we need a file server (preferably also fault-tolerant). For example, I chose a Windows server on which I created an NFS share.

16. Install packages for NFS support on each node:
apt-get install nfs-common

17. Create the /media/jira folder and execute:
chmod -R 0777 /media/jira

18. Mount the NFS share as a common resource (be sure to mount it not in the root folder but, for example, in /media/jira) - ON EACH NODE.

19.1. You can either mount it manually (one-off):
sudo mount -t nfs -o uid=1000,iocharset=utf-8 xx.xx.xx.xx:/jira /media/jira
where xx.xx.xx.xx is the IP address of the server with the NFS share

19.2. Or mount it automatically at OS startup:
mcedit /etc/fstab
Add the following line at the end:
192.168.7.239:/jira /media/jira nfs user,rw 0 0
Then save and exit.
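
The new fstab entry can be applied without a reboot:
sudo mount -a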

20. Assign an ID to each node in the cluster.properties file in the Jira home directory: node1 for the first node, node2 for the second, and so on:
# This ID must be unique across the cluster
jira.node.id = node1
# The location of the shared home directory for all Jira nodes
jira.shared.home = /media/jira


21. Start Jira on the first node:
service jira start
Check: go to System -> System info and look for Cluster: ON and the node ID.

22. Set up nginx balancing (a configuration sketch follows below).
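
As a sketch (addresses and names are examples, assuming the two nodes from the steps above; the sticky directive is provided by the nginx-sticky-module-ng compiled in step 13), the balancing block in /etc/nginx/nginx.conf might look like this:
upstream jira_cluster {
    sticky expires=1h path=/;   # cookie-based sticky sessions
    server 192.168.7.241:8080 max_fails=3 fail_timeout=30s;   # node1 (example address)
    server 192.168.7.242:8080 max_fails=3 fail_timeout=30s;   # node2 (example address)
}

With this in place, a node that stops responding is temporarily taken out of rotation, while returning users keep landing on "their" node thanks to the cookie.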

23. Since we previously disabled Jira's autorun on the nodes, we can now re-enable it with the command:
update-rc.d jira enable

24. Check the cluster operation and, if necessary, add nodes.

Cluster Startup Order


1. Start the shared file system server
2. Start the load balancer
3. Start node1
4. Start node2
5. ...

Cluster stop order


1. Stop Jira on all nodes with the service jira stop command
2. Shut down node 2
3. Shut down node 1
4. Shut down the load balancer
5. Shut down the file system server

That's all…


Of course, the method described is not the only correct one; it is just one way of implementing it.

I express my gratitude to my colleagues for their help in preparing the material.
Comment, ask questions and thank you for your attention.
