How not to go crazy developing reference data management systems: stories from our projects

    While working on large-scale automation projects and building new information systems, we repeatedly faced the need to implement a subsystem for maintaining directories, classifiers, registers, and similar objects that make up the customer's normative reference information (NSI). Over 15 years of working with NSI management systems at LANIT, life has brought us customers with a wide variety of requirements, and, of course, all sorts of situations arose on those projects. I will tell you several instructive stories that happened to us. The examples should be useful to many people involved in software development, and those who work directly with NSI will find them even more interesting: as the saying goes, your own shirt is closest to the body.

    Special thanks to the wonderful artist Vasya Lozhkin for the illustrations.


    Case One. How to load a wagon and a small trolley


    Creation of a single counterparty management system for a large production company with many factories throughout the country and abroad.

    The goal of the project is to create a single counterparty database for all departments. Counterparties are maintained through applications that are assigned priorities from low to urgent. An urgent application must be processed by NSI experts within 2 hours, regardless of the time difference between divisions.

    A real-life story
    The project was agreed with all interested parties (the customer's management assured us of this) and was developed on time in accordance with the approved requirements.

    The presentation of the finished counterparty management system went smoothly until an imposing woman, the head of the Siberian branch, stood up. Very energetically, using Russian idiomatic expressions, she informed the audience that when a railway car arrived to be loaded with finished goods, she was not going to wait 2 hours while someone in Moscow reviewed the application to add a buyer.

    She was not going to pay for an idle railway car while the application was being approved. She would enter the buyer's data into the system as is and ship the goods, and the comrades in Moscow could then sort out the buyer's information for as long as they needed.

    Several other branch heads supported this statement, which almost completely destroyed the centralized, application-based methodology for maintaining a single counterparty directory.

    As a result, the project was modified so that all branches had access to the counterparty database and could change it directly. At the same time, the system automatically searched for similar records and displayed them to the branch employee, who decided whether the data needed to be adjusted. That decision was later checked by an expert group.
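
    The automatic search for similar records can be illustrated with a minimal sketch. It is not the implementation used in the project: the function names, the normalization rules, and the 0.8 threshold are assumptions made for illustration, and the comparison relies on the standard-library difflib module.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    cleaned = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(cleaned.split())


def find_similar(candidate: str, existing: List[str], threshold: float = 0.8) -> List[Tuple[str, float]]:
    """Return existing names whose similarity to the candidate reaches the threshold.

    The threshold is a hypothetical value; a real system would tune it on its own data.
    """
    target = normalize(candidate)
    matches = []
    for name in existing:
        score = SequenceMatcher(None, target, normalize(name)).ratio()
        if score >= threshold:
            matches.append((name, round(score, 2)))
    # Most similar records first, so the employee sees the likeliest duplicates on top.
    return sorted(matches, key=lambda m: m[1], reverse=True)
```

    A branch employee entering a new buyer would be shown the returned list and decide whether to reuse an existing record or create a new one.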

    What we remembered: do not trust assurances from the customer's managers and decision-makers that all decisions have been agreed, everyone is in the loop, and there are no objections. Identify all project stakeholders and try to learn the system requirements and constraints directly from them.


    Case Two. We use it however we want


    Creating a centralized customer management system for an insurance company with a large number of branches and agents throughout the country.

    The goal of the project is to create a consolidated customer base for use in analytical applications. The data was collected from all branches, verified, and supplemented, and duplicate objects were eliminated. A single branch has from a thousand to several million customers. At the same time, practically no customers overlap between branches.

    A real-life story

    After the consolidated customer base was created, it had to be periodically compared with the branch databases to identify differences, process them, and upload the changes to the consolidated database. Between reconciliations, the customer base grew by several thousand records.

    To perform the reconciliation, a special module was created. Its architecture was designed on the assumption that it should quickly compare a large number of records and generate a relatively small XML file with changes for loading. The XML format was chosen by the customer.

    After the system was deployed, we received a message from the customer that the reconciliation module worked extremely slowly and produced a huge file for loading into the consolidated database, one that they could not open with anything.

    What had happened? The customer was performing the initial load of branch data into the consolidated directory. The experts found this work tedious and time-consuming, so they simply took the reconciliation module and fed it the full data of a new branch that had never been loaded into the consolidated directory.

    The reconciliation module, which according to the terms of reference (ToR) was supposed to report differences amounting to several thousand records, received two million records as input, none of which existed in the consolidated directory.

    As a result, after several hours of superhuman effort, the reconciliation module did generate a file for loading, which included all the data from the branch. And yes, this file was huge.

    The customer was using the reconciliation module for a purpose it was not designed for, but he liked the fact that it made the initial data load possible and intended to keep working this way. He only asked us to speed the module up significantly and to do something about the generated file so that it could be opened in a text editor.


    To our objections that the reconciliation module was not intended for initial data loading, the customer cheerfully showed us the ToR and asked: where does it say that? We use it however we want!

    As a result, we had to change the architecture of the reconciliation module so that it could process large data sets and generate an output file in CSV format, as the customer absolutely did not want to give up such a convenient tool.
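
    One way to picture a reconciliation that copes with both a few thousand and a few million differences is to stream the branch records and write each difference to CSV as soon as it is found. The sketch below is only an illustration of that idea, not the actual module: the record shape, the column names, and the in-memory dictionary of consolidated data are assumptions.

```python
import csv
from typing import Dict, Iterable, TextIO, Tuple

# Hypothetical record shape: (customer_id, payload).
Record = Tuple[str, str]


def reconcile_to_csv(branch_records: Iterable[Record],
                     consolidated: Dict[str, str],
                     out: TextIO) -> int:
    """Stream branch records and write new or changed ones to CSV.

    Records are consumed one at a time, so the size of the output does not
    affect memory use. Returns the number of rows written (header excluded).
    """
    writer = csv.writer(out)
    writer.writerow(["customer_id", "payload", "change"])
    written = 0
    for customer_id, payload in branch_records:
        known = consolidated.get(customer_id)
        if known is None:
            change = "new"
        elif known != payload:
            change = "changed"
        else:
            continue  # identical record, nothing to report
        writer.writerow([customer_id, payload, change])
        written += 1
    return written
```

    Here the consolidated data is a plain dictionary for brevity. For a consolidated base of many millions of records, a database lookup or a merge over sorted inputs would be a more realistic replacement for it.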

    What we remembered: always describe the limitations in the ToR, that is, what your system should not do. Or else build solutions that account for all possible use cases, which is much more expensive.


    Case Three. Not just an elephant but a huge one, and it has to fly


    Creation of a centralized NSI management system for a financial organization.

    The aim of the project is to create a centralized system for maintaining directories and classifiers that distributes changes to interested systems and databases and gives external systems access to the directories through our system's web services.

    Typically, our customers' directories contain from several hundred to several thousand entries on average. Our recent record holder was a directory with 11 million entries. But this customer surprised us: his directory contained over 100 million entries. Loading it took more than a day, because a lot of data validation was performed during loading. This would not have been a big problem, but the customer demanded that the directory be loaded in a few minutes.

    As a result, we had to change how the system works with this directory. In effect, the directory is maintained outside the system, and we only provide an interface for using it. We are now developing new ways for our system to work with very large directories. We hope the customer will like them.

    What we remembered: in the modern world there is more and more data, and its growth rate keeps increasing. The system must be prepared for high loads, even where they were not originally expected. We are constantly developing our solution to keep up with current trends in data growth and with rising requirements for processing speed.

    Case Four. A tricky trick with files


    Creation of a centralized NSI management system in a large bank.

    The aim of the project is to create a centralized system for maintaining directories and classifiers that distributes changes to interested systems and databases. The peculiarity of the project is its very complex change-distribution processes, which affect many systems.

    Since I will have to mention our own NSI management solution further on, I will allow myself a small digression.

    More about the NORMA system
    Our customers' tasks are largely similar, so we decided to reduce the cost of software development and shorten project timelines by creating our own universal platform for maintaining NSI and master data (Reference Data Management & Master Data Management). The system has existed for more than 10 years, and all these years we at LANIT have been actively developing it.

    NORMA supports centralized and distributed NSI management. All data and metadata are stored with their change history, and the system lets you view and change the entire NSI array as of an arbitrary date in the past or future. Reconciliation and approval processes can be customized for directories. The system includes a dedicated change distribution server, which interacts with external systems through various interfaces and lets you build fairly complex integration business processes (a kind of mini BizTalk Server). Export/import packages can load directory data to and from databases and files of various formats. Recoding tables for external systems are supported.
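
    To show what viewing reference data as of an arbitrary date can look like, here is a minimal sketch based on validity intervals. It does not describe NORMA's actual data model: the EntryVersion class, its fields, and the half-open interval convention are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class EntryVersion:
    """One version of a directory entry (hypothetical structure)."""
    code: str
    value: str
    valid_from: date
    valid_to: Optional[date] = None  # None means the version is open-ended


def value_as_of(versions: List[EntryVersion], code: str, on: date) -> Optional[str]:
    """Return the value of an entry as of the given date, or None if it did not exist.

    Validity is treated as the half-open interval [valid_from, valid_to).
    """
    for version in versions:
        starts_in_time = version.valid_from <= on
        still_valid = version.valid_to is None or on < version.valid_to
        if version.code == code and starts_in_time and still_valid:
            return version.value
    return None
```

    Because a version may start in the future, the same lookup covers both historical queries and changes that are scheduled but not yet in effect.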

    NORMA includes a graphical query builder and a report designer. In addition to working with its own directories, the system lets you view and change directories stored in external databases through its interface, and use those directories in the query builder and in export/import packages.

    In response to various events in the system, for example changes to a directory, plug-in components written in C# can be launched. They can validate data and interact with external systems and with NORMA itself. Almost all functions of the system are available through web services.

    The system can be scaled both vertically, by increasing the capacity of the application server and the database, and horizontally by using a multi-node application server in which each node or group of nodes is responsible for performing a separate function. The system can use Microsoft SQL Server, Oracle or PostgreSQL to store NSI.

    Typically, when creating directories and change-distribution processes, the customer consults our analysts about which tool or set of tools provided by the system is best suited to a specific task. This time the customer said that he would create the directories and processes on his own.

    After some time, one of the customer's specialists contacted us with a complaint that data would not load into the system. As evidence, we were sent a data import package, a source file with the records to be loaded, and an error message stating that the loaded data was of the wrong type.

    We started investigating. We turned the package this way and that and tried different representations of the source data, but we could not reproduce the error. We asked the customer questions: maybe software components were attached to the import package, maybe the directory had some additional constraints, maybe the data came from a different process? The answer to everything was the same: nothing of the sort, everything should load easily, and it worked before.


    It turned out that this import package was just the tip of the iceberg. Briefly and greatly simplified, here is what happened. The import procedure loaded correct data from the source file into the directory. The source file was then deleted. Next, our system distributed the changes to several databases. In one of them, the database's own data was compared with our changes, and a file with discrepancies was generated and returned to our system for loading. To load this file, the customer used the same import procedure as for the source file. And it was this file, generated by the external system, that contained data of the wrong type. Naturally, we could not find any errors when analyzing the source file, and nobody had told us anything about the second file or the sprawling change-distribution process.

    What we remembered: always verify the information you receive, even when you are told, "We have a small problem here, and it is right in this spot, I swear on my mother!" Analyze the problem in context.

    Case Five. I'm getting used to the mismatches


    Creation of an NSI management system in a manufacturing company.

    The aim of the project is to create a system of NSI management in a management company with many branches, factories and design divisions.

    This time we did not get beyond several presentations. The techies really liked our NORMA system. It covered all their existing problems. Then it was time to show the system to management, and that is where it all fell apart. The senior executive looked, listened, and said: "We all work on Apple products, they have a certain style, and your system does not fit into this style. We won't even consider it."


    What we remembered: customers are different, and for some you are simply not a fit. The style is different.

    Similar stories happen in various projects. What was interesting in your project life? What was an unexpected lesson for you? Share in the comments.

