Should I use the SSAS tabular model?


    This question cannot be answered offhand; a number of factors have to be taken into account.

    Many people assume they have to switch to the tabular model because it is newer and therefore better. In practice, this is often impractical or even impossible, but more on that a bit later.

    Multidimensional model


    A multidimensional database has a well-defined structure and lets us generate reports very quickly. For a long time, the multidimensional model was the only option available in Analysis Services. It has hardly changed since SQL Server 2005: if you look at what's new in each release of Analysis Services, it becomes clear that most of the innovations concern the tabular model.

    Tabular model


    The tabular model appeared in SQL Server 2012 and is being actively developed; each subsequent version adds new features.

    The tabular model runs on a different engine (xVelocity). It is designed to execute columnar queries quickly because it uses column storage (multidimensional models use row storage) and compresses data well. The data is kept in RAM (in-memory mode), so it is very important that the server has plenty of memory and a very fast processor; disks are not that important for the tabular model. One of its main advantages is that some queries run faster (for example, measures based on distinct count are very quick), and it has a high compression ratio of roughly 1/10 (a link describing the compression principle is given below), whereas a multidimensional model achieves only about 1/3. The compression ratio is approximate, of course, and fluctuates depending on the data.
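    To make the column-storage idea more concrete, below is a minimal Python sketch of dictionary encoding, one of the basic techniques behind columnar compression. This is only a simplified analogy, not the actual xVelocity engine: each distinct value of a column is stored once, and rows keep only small integer indexes, which is why low-cardinality columns shrink so much.

```python
# Simplified illustration of dictionary encoding for a single column.
# This is an analogy only, not the real xVelocity/VertiPaq implementation.

def dictionary_encode(column_values):
    """Encode one column as (list of distinct values, per-row indexes)."""
    dictionary = []      # each distinct value is stored exactly once
    positions = {}       # value -> its index in the dictionary
    indexes = []         # per-row small-integer references into the dictionary
    for value in column_values:
        if value not in positions:
            positions[value] = len(dictionary)
            dictionary.append(value)
        indexes.append(positions[value])
    return dictionary, indexes

# A low-cardinality column (such as Country) with millions of rows keeps a tiny
# dictionary, and the index stream compresses further (e.g. run-length encoding).
print(dictionary_encode(["Russia", "Russia", "USA", "Russia", "USA", "Germany"]))
# (['Russia', 'USA', 'Germany'], [0, 0, 1, 0, 1, 2])
```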

    Hardware


    It should be noted that the hardware used for multidimensional databases in most cases cannot be used for a tabular model. The tabular model directly depends on the amount of RAM. The more memory, the higher the performance. If there is not enough memory, the tabular model will simply stop working without any warnings.

    CPU frequency is also very important for the tabular model.

    Once again: for a tabular model, disks are of secondary importance, but the amount of RAM and CPU speed are very important.

    So how much memory do you need? There is a saying: the more, the better! But that is too abstract to act on; you would like something more tangible, right? As a starting point there is a simple formula, <Relational database size> / 10 * 2, but do not forget that users will be connecting to the tabular database, which means SSAS needs extra memory for query caching, and memory is also needed for the OS and for the SQL Server engine's cache (if SSAS and the relational database are on the same machine). The tabular model also lets you create calculated tables and columns, which increase the size of the tabular database even though the relational database stays the same size.

    Why does the formula multiply the result of the division by two? Because, by default, processing runs in a buffer: a temporary copy of the tabular database is created alongside the main model, while the main model keeps existing and working unchanged. Once processing completes successfully, the main model's data is replaced with the data from the buffer; if an error occurs, everything remains as it was. So be careful when choosing the SQL Server edition. If the tabular database is larger than 5 GB, the Standard edition (with its 16 GB limit for SSAS, which includes the cache) most likely will not be enough, and when the cache runs short, performance degrades badly.
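    As a worked example of this rule of thumb, here is a small Python sketch that applies the ~1/10 compression estimate and the doubling for processing, then compares the result against the 16 GB Standard edition limit mentioned above. The numbers are illustrative assumptions, not an official sizing tool, and they ignore the extra memory needed for the query cache, the OS, and the relational engine.

```python
# Rough sizing sketch based on the rule of thumb from the text above.
# Illustrative assumptions only; real requirements depend on the data,
# calculated objects, concurrent users and the query cache.

STANDARD_EDITION_SSAS_LIMIT_GB = 16  # tabular memory limit in Standard edition

def estimate_tabular_memory_gb(relational_db_size_gb, compression_ratio=10):
    in_memory_size = relational_db_size_gb / compression_ratio  # ~1/10 compression
    processing_peak = in_memory_size * 2  # a shadow copy exists during processing
    return in_memory_size, processing_peak

size, peak = estimate_tabular_memory_gb(100)  # e.g. a 100 GB relational database
print(f"Steady-state model: ~{size:.0f} GB, peak during processing: ~{peak:.0f} GB")
if peak > STANDARD_EDITION_SSAS_LIMIT_GB:
    print("The 16 GB SSAS limit of Standard edition is likely too small here.")
```

    For a 100 GB relational database this gives roughly a 10 GB in-memory model and about 20 GB at processing time, which already exceeds the Standard edition limit even before the query cache, the OS, and the relational engine take their share.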

    More detailed articles on how much memory is needed can be found here and here.

    Statistics on the use of tabular and multidimensional models


    According to the data from this article, a survey was conducted among 440 participants of a webinar comparing the two models, and 212 of them (~ 48%) answered the question “Which model do you use - tabular or multidimensional?”:

    • Both - 61 (~ 29%)
    • Multidimensional - 75 (~ 35%)
    • Another - 43 (~ 20%)
    • Tabular - 33 (~ 16%)

    Transition


    If you are already using a multidimensional model and it suits you, it is better to stay on it. If you have reason to believe that a tabular model can meet your needs, it makes sense to consider a transition. But moving from a multidimensional to a tabular model is not an easy task: essentially, everything has to be done manually, and there is no magic converter to do the switch for you. You can build custom migration tools with SSIS, PowerShell, or other tools.

    It should also be understood that multidimensional models have functionality that is not supported in tabular models, for example Writeback.

    A full list of functionality that is unsupported in tabular models, compared with the multidimensional model, can be found in the official documentation. This article discusses some ways to emulate the missing functionality.

    Pay attention to this before starting the transition.

    New project


    For new projects, a tabular model is usually recommended, unless, of course, you need functionality that only a multidimensional model supports. For those who have never dealt with analytics before, the tabular model is easier to understand, because it looks like an ordinary relational database. Besides, the functionality it does support is, in most cases, enough for full-fledged work.

    Recommendations


    If you do decide to use a tabular model, let me offer some advice on how to avoid the common pitfalls.

    The tabular model has calculated columns. They let you add to any table a field that does not exist in the relational database and define its calculation formula, for example, adding to the Customers table columns with the country and region names that are stored in a related table. Calculated columns are stored in the tabular database and are populated at processing time.

    It sounds great: the logic is stored in the model metadata, and there is no need to modify views just to add a field. But there is a catch. To understand it, let's look at the processing stages:

    1. Data is retrieved from the relational database
    2. The data is compressed
    3. Calculated values and measures are computed

    Two problems follow from these processing stages:

    1. A calculated column increases processing time
    2. Calculated columns are not compressed

    In other words, you should not overuse calculated columns, despite their convenience.
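    To illustrate what a calculated column amounts to at processing time, here is a rough analogy in Python/pandas rather than SSAS/DAX (the Customers and Geography tables and their columns are made up for the example): the values from the related table are looked up row by row and stored as new, fully materialized columns, which is exactly why they lengthen processing and take extra space in the model.

```python
# A pandas analogy (not SSAS/DAX code) of a calculated column being materialized.
import pandas as pd

customers = pd.DataFrame({"CustomerID": [1, 2, 3],
                          "GeographyID": [10, 10, 20]})
geography = pd.DataFrame({"GeographyID": [10, 20],
                          "Country": ["Russia", "USA"],
                          "Region": ["Moscow", "California"]})

# The equivalent of adding calculated Country/Region columns to Customers:
# a per-row lookup into the related table whose result is stored in the model.
customers = customers.merge(geography, on="GeographyID", how="left")
print(customers)
```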

    This article explains in simple terms how the data compression works.

    A few more tips on increasing the processing speed of a tabular model
