Amazon Redshift Concurrency Scaling: Guide and Test Results

Original author: Stefan Gromoll

At Skyeng we use Amazon Redshift, including concurrency scaling, so this article by Stefan Gromoll caught our attention. After the translation, we add a little of our own experience from Skyeng engineer Daniyar Belkhodzhaev.

The Amazon Redshift architecture allows scaling by adding new nodes to a cluster. But sizing a cluster to cope with peak query load can lead to over-provisioning. Concurrency Scaling, in contrast to adding permanent nodes, increases computing capacity only when it is needed.

Amazon Redshift concurrency scaling gives a Redshift cluster additional capacity to handle bursts of queries. It works by routing eligible queries to new, transient clusters in the background. Queries are routed based on the WLM configuration and its rules.

Pricing for concurrency scaling is based on a free-credit model: beyond the standard free credits, you pay for the time the concurrency scaling clusters spend processing queries.

The author tested concurrency scaling on one of his internal clusters. In this post, he describes the test results and gives tips on how to get started.

Cluster requirements

To use concurrency scaling, an Amazon Redshift cluster must meet the following requirements:

  • platform: EC2-VPC;
  • node type: dc2.8xlarge, ds2.8xlarge, dc2.large or ds2.xlarge;
  • number of nodes: from 2 to 32 (clusters with one node are not supported).

Eligible query types

Concurrency scaling is not suitable for all queries. In its first version, it only handles read queries that satisfy three conditions:

  • the query is a read-only SELECT (support for more query types is planned);
  • the query does not reference a table with an interleaved sort key;
  • the query does not use Amazon Redshift Spectrum to reference external tables.

To be routed to a concurrency scaling cluster, a query must first be queued. In addition, queries eligible for the SQA (Short Query Acceleration) queue will not run on concurrency scaling clusters.

Queues and SQA require correct Redshift Workload Management (WLM) configuration. We recommend optimizing your WLM first: this will reduce the need for concurrency scaling. And that matters, because concurrency scaling is free only for a certain number of hours. AWS claims that concurrency scaling will be free for 97% of customers, which brings us to the question of pricing.

Concurrency Scaling Cost

For concurrency scaling, AWS uses a credit model. Each active Amazon Redshift cluster accrues credits on an hourly basis, up to one hour of free concurrency scaling credit per day.

You pay only when your use of concurrency scaling clusters exceeds the credits you have accrued.

The cost is calculated at the per-second on-demand rate for a concurrency scaling cluster used beyond the free credits. You are billed only while your queries are running, with a one-minute minimum charge each time a concurrency scaling cluster is activated. The per-second on-demand rate follows Amazon Redshift's general pricing, i.e. it depends on the node type and the number of nodes in your cluster.
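As a back-of-envelope illustration of this pricing model, here is a small sketch (function names are ours, not an AWS API; we also make the simplifying assumption that the one-minute minimum is applied per activation before free credits are subtracted):

```python
def billable_seconds(activation_runtimes, free_credit_seconds):
    """activation_runtimes: per-activation runtimes in seconds.
    Each activation is billed for at least 60 seconds; free credits
    are then subtracted from the total (our simplification)."""
    total = sum(max(60, s) for s in activation_runtimes)
    return max(0, total - free_credit_seconds)

def estimated_charge(activation_runtimes, free_credit_seconds, hourly_rate):
    # The per-second rate is derived from the cluster's hourly
    # on-demand price (node type x number of nodes).
    return billable_seconds(activation_runtimes, free_credit_seconds) * hourly_rate / 3600
```

For example, a 30-second burst and a 10-minute burst against 5 minutes of accrued credit, on a cluster billed at $0.50/hour, would bill 360 seconds (60 + 600 - 300), i.e. about $0.05.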

Enabling concurrency scaling

Concurrency scaling is enabled per WLM queue. Go to the AWS Redshift console and select “Workload Management” in the left navigation menu. Then select your cluster's WLM group from the drop-down menu.

You will see a new column called “Concurrency Scaling Mode” next to each queue. The default value is “Off.” Click “Edit”, and you can change the setting for each queue.
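The same setting can also be applied through the cluster's `wlm_json_configuration` parameter. A sketch of what that JSON might look like (a minimal example; memory allocation and other queue fields are omitted, so check your actual parameter group before applying anything like this):

```json
[
  {
    "query_group": [],
    "user_group": [],
    "query_concurrency": 5,
    "concurrency_scaling": "auto"
  },
  {
    "short_query_queue": true
  }
]
```

Here `"concurrency_scaling": "auto"` enables the feature for that queue; `"off"` is the default.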


Concurrency scaling works by routing eligible queries to new, dedicated clusters. The new clusters have the same size (node type and count) as the main cluster.

The default number of clusters used for concurrency scaling is one (1), with the ability to configure up to ten (10) in total.
The total number of concurrency scaling clusters is controlled by the max_concurrency_scaling_clusters parameter. Increasing this value makes additional standby clusters available.
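Since this parameter lives in the cluster's parameter group, it can be changed from the AWS CLI as well. A sketch, with a placeholder parameter group name:

```shell
# Raise the cap on concurrency scaling clusters to 3
# ("my-redshift-params" is a placeholder for your parameter group).
aws redshift modify-cluster-parameter-group \
  --parameter-group-name my-redshift-params \
  --parameters ParameterName=max_concurrency_scaling_clusters,ParameterValue=3
```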


The AWS Redshift console includes several new charts. The “Max Configured Concurrency Scaling Clusters” chart plots the value of max_concurrency_scaling_clusters over time.

The number of active scaling clusters is displayed in the “Concurrency Scaling Activity” section of the UI:

In the Queries tab, a column shows whether each query ran on the main cluster or on a concurrency scaling cluster:

Whether a given query ran on the main cluster or on a concurrency scaling cluster is recorded in stl_query.concurrency_scaling_status.

A value of 1 means the query ran on a concurrency scaling cluster; other values mean it ran on the main cluster.
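A quick way to see this in practice is a query like the following sketch against the stl_query system table (assumes you have access to system tables; the 24-hour window is arbitrary):

```sql
-- Count recent queries by where they ran:
-- concurrency_scaling_status = 1 means a concurrency scaling cluster.
SELECT CASE WHEN concurrency_scaling_status = 1
            THEN 'concurrency scaling cluster'
            ELSE 'main cluster'
       END AS executed_on,
       COUNT(*) AS query_count
FROM stl_query
WHERE starttime > DATEADD(hour, -24, GETDATE())
GROUP BY 1;
```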


Concurrency scaling information is also stored in some other tables and views, for example SVCS_CONCURRENCY_SCALING_USAGE. In addition, there are a number of catalog tables that store concurrency scaling information.
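For example, per-period usage can be pulled from that view with something like this sketch (column names are as we recall them from the AWS documentation; verify against your cluster's version):

```sql
-- Usage accrued on concurrency scaling clusters, most recent first.
SELECT start_time,
       end_time,
       usage_in_seconds
FROM svcs_concurrency_scaling_usage
ORDER BY start_time DESC;
```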


The author enabled concurrency scaling for one queue on an internal cluster at approximately 6:30 p.m. GMT on 03/29/2019, and changed the max_concurrency_scaling_clusters parameter to 3 at approximately 8:30 p.m. the same day.

To simulate a query queue, the slot count for this queue was reduced from 15 to 5.

Below is a dashboard chart showing the number of running and queued queries after the slot count was reduced.

We can see that queue wait times increased, with the maximum exceeding 5 minutes.

Here is the relevant information from the AWS console about what happened during this period:

Redshift launched three (3) concurrency scaling clusters, as configured. These clusters appear not to have been fully utilized, even though many queries in the cluster were queued.

The usage chart correlates with the scaling activity chart:

A few hours later, the author checked the queue; it appears that 6 queries had run with concurrency scaling. He also spot-checked two queries through the UI. How to interpret these values when several concurrency scaling clusters are active at once was not checked.


Concurrency scaling can reduce query queue times during peak load.

The basic test showed that the query-load situation partially improved. However, concurrency scaling alone did not solve all the concurrency problems.

This is due to the restrictions on which query types can use concurrency scaling. For example, the author has many tables with interleaved sort keys, and most of the workload consists of writes.

Although concurrency scaling is not a universal substitute for tuning WLM, the feature is in any case simple and straightforward to use.

Therefore, the author recommends enabling it for your WLM queues. Start with a single concurrency scaling cluster and monitor peak load through the console to determine whether the new clusters are being fully utilized.

As AWS adds support for more query and table types, concurrency scaling should gradually become more and more useful.
Comment from Daniyar Belkhodzhaev, an engineer at Skyeng.

We at Skyeng also immediately took note of the new concurrency scaling capability.
The functionality is very attractive, especially since AWS estimates that most users will not even have to pay extra for it.

It so happened that in mid-April we saw an unusual flurry of queries to our Redshift cluster. During that period we often relied on Concurrency Scaling; at times the additional cluster ran 24 hours a day without stopping.

This, if it did not completely solve the queuing problem, at least made the situation acceptable.

Our observations largely coincide with the impressions of the original author.

We also noticed that, despite queries waiting in the queue, not all of them were immediately redirected to a concurrency scaling cluster. Apparently this is because the concurrency scaling cluster takes some time to start. As a result, during short-lived peak loads we still get small queues, and the corresponding alarms still manage to fire.

Once the abnormal April load subsided, we, as AWS predicted, settled into occasional use within the free allowance.
You can track concurrency scaling costs in AWS Cost Explorer: select Service: Redshift, Usage Type: CS, for example USW2-CS:dc2.large.

