Experience solving a problem by creating an OLAP cube using C #
I would like to share my little experience that I gained at work in one state institution. How I got there is not important, but it is important to know, because this imposes its specificity on the conditions in which it was necessary to solve the problem. It is also worth noting that my basic knowledge and meager programming experience are related to .Net technologies.
Description of the problem: to this day there is an information system created 10 years ago that collects reporting information on various periodically changing statistical indicators from different branches of the organization in the region. The infrastructure of information interaction of the system is shown in the figure below. Metrics in DW are described implicitly.
Short explanation to the figure:
0. DW. Oracle data warehouse with a very complicated model.
1. SI. Statistical information - indicators of accounting, financial, etc. reporting.
2. KEOI. Complexes of electronic processing of statistical information created locally.
3. IM. Interface array - a model describing which SI lists are in DW.
4. SP IM SI. The system for preparing front-end arrays of statistical information is designed to collect data from KEOI, convert them to the IM format, and transfer this array for upload to DW.
Problem: The end user using Java software receives information through direct requests to the DW. Of course, there are no source codes. Storage model cannot be changed. There is a lot of data in the database over 10 years, I worked with a 25 GB backup, but software requests were disgustingly long.
Task: It is necessary to make it so that you can see the necessary information but not waiting for an hour, well, that would be convenient, and "if possible, in Excel"
A pessimistic reader noted from the figure that the solution was to use OLAP cubes as data stalls, but the path to this solution was not very obvious ...
After a period of studying the problem in the subject of information systems, I came across the concept of “Analytical pyramid” and the associated OLAP .
I came to the following conclusion below:
Since the user needed data that successfully fit into the structure of the standard cube of the star model, the cube should contain aggregated information distributed by time, territory and reporting organization, depending on the branch of the state office, it itself begs to somehow automate the creation process cube, for individual branches, organizations and the necessary lists of indicators, which will reduce the amount of unnecessary data.
My .Net profile meant using Microsoft Analysis Services as an OLAP server, which, as you can see from the comparison of models , supports the Local OLAP cubes model , which work offline without a server and can be easily viewed in Excel if you specify them as a source of multidimensional data.
To automate the process of creating cubes, we need to do this programmatically. .Net has an AMO object model for OLAP .
This article served as an initial example of programmatically creating a server OLAP cube .
In my case, I set specific table names from my DW for a specific cube. Although, I agree that the logic turned out to be a little perverted. It should be noted that for programmatic work with data stored in a cube I used ADO MD. It is well described in a cycle of these Habrovsky articles.
It shows very clearly how OLAP server core entities, such as AMO and ADO MD, relate.
So, after we created the cube on the server and successfully displayed its contents in the console window, how to make it so desired to open it in Excel?
- You can standardly connect to the server version via Menu \ Data \ From other sources;
- You can immediately generate a local cube, i.e. A report file with the extension .cub is well described in this article . Actually, an interesting feature is that the server first turns the cube data into XML, which opens up another space for creativity.
The bureaucratic red tape with the reports of one official has become a little easier, because by creating a cube once, it was convenient to work with data before the next reporting period. Although from my point of view, this is a very budgetary solution to the problem.
In addition, there are still interesting possibilities for working with cubes programmatically, I don’t know about everyone, but I will be glad to learn something new, I suggest sharing in the comments, for example: use LINQ or conduct unit testing in C # using ADO MD and MDX, as described in this article .