Introducing CUBRID, a DBMS Optimized for Web Applications
Greetings to all, dear Habr users!
We have not yet introduced our work to Habr users ourselves, but you have most likely already read about the CUBRID DBMS in Lev Khomich's Habr post. Some points in that article are not entirely accurate, and I would like to correct them here. So let us get better acquainted and look in more detail at why we present CUBRID as the DBMS most optimized for Web applications. I will also cover details that you will not (yet) find anywhere else, not even on the official project website, http://www.cubrid.org. I hope you will learn a lot and, in turn, share your ideas, advice, and opinions in the comments. I am sure the acquaintance will be a pleasant one.
First, when did the development of CUBRID begin?
Different sources cite different dates: 15 years ago, or 2006. In fact, the DBMS was being sold and was in great demand long before MySQL, let alone CUBRID itself, appeared. It was one of the first with an object-oriented architecture, which is widely used today in the gaming and multimedia industries. The DBMS became so popular that Oracle offered 1 billion US dollars for the source code and a license to develop and sell it further. The developers rejected the offer and instead found sponsors with assets of $2 billion. That was back in the early 90s, which is why Lev Khomich's post and some other sources speak of fifteen years or more.
Officially, however, we, the developers of the DBMS, date the start of CUBRID development to 2006. That was the year when NHN Corporation, a major player in the South Korean search market with a 74% share, brought its leading architects and programmers together into a team of 40 people and set up the CUBRID project. Ranked 13th in the global IT industry, NHN had sufficient human and financial resources to launch the project successfully. By then, NHN already provided over 100 Web services for users in South Korea, Japan, China, and the United States, including many online games, search engines, social and other services. We were sure that Web services, which were becoming ever more popular and diverse around the world, would determine the development of the IT industry. We therefore set ourselves the goal of developing the database management system most optimized for Web services and opening its code under the GPL version 2.0 license.
Thus, the company decided to create an object-relational DBMS that would combine the advantages of both the OODBMS, which is so often used in online games and multimedia services, and the RDBMS, which has become the most popular solution for all other industries. For this purpose, the company acquired a license for that same OODBMS with its 15 years of history, and took the open SQL-92 standard, already established by then, as the basis for the relational part. This was the beginning of the development of the CUBRID DBMS.
The first open-source release
For two years we developed CUBRID, and by October 2008 we had released version 1.0 of the new DBMS, oriented toward Web applications. The first stable release was put to use in NHN's own internal services. Then, over the following month, we finalized the DBMS and publicly announced CUBRID as the first open-source DBMS in South Korea.
CUBRID's popularity in the domestic market grew so quickly that within a year several thousand users had begun developing and adapting various applications to work with CUBRID, such as the LACP (Linux, Apache, CUBRID, PHP/Perl/Python) and LnCP (Linux, nginx, CUBRID, PHP/Perl/Python) stacks, Windows installers, and well-known CMSs (WordPress, phpBB, Joomla). During this first year, CUBRID was also deployed in the internal management systems of the White House of Korea, the National Tax Service of Korea, and many ministries and corporations.
Thus, the first year was considered very successful. However, most of the user developments supported only the Korean language, and this did not suit me or many others on the team. After all, we were developing a DBMS not only for users in Korea, but for the entire IT community. Therefore, in October 2009, exactly one year after the first release, we moved the project source code to a new home on SourceForge.net so that users around the world could follow the project's development. SF.net thus became the main SVN repository, and English became the main language of development and documentation.
Key Features and Capabilities
Today, CUBRID is developed for two major operating systems, Linux and Windows, for which the CUBRID server, all client applications, and the programming interfaces are available. For Mac OS X, only client applications are currently available; with them you can work fully with remote CUBRID servers. There are no plans to develop the CUBRID server itself for Mac OS.
The CUBRID server is written in C/C++ and distributed under the GPL license, version 2.0 or higher. Client applications are written in various languages and are usually distributed under the BSD license (I will say more about the CUBRID licensing policy in the next post). The main database administration tools, CUBRID Manager, Query Browser, and Migration Toolkit, are written in Java, and the programming interfaces are developed in C.
As I said earlier, the relational part of CUBRID is based on the open SQL-92 standard. Many DBMSs support it, but each implements it differently. Take the system tables that store metadata about all existing databases or about a specific one. In MySQL, MSSQL, and some other DBMSs, there are separate system databases for this, for example INFORMATION_SCHEMA, which only the system itself can edit directly. This is convenient in principle, except that when databases are transferred to another server, the system databases and tables on the new server (and on the old one too) must be updated. Usually this happens automatically when the databases are restored, which requires additional resources, especially if a database contains hundreds or thousands of tables. That much you can live with. The worst case is when the system tables are not updated at all, or access to them changes. Then direct administrator intervention or changes to client applications are required.
In CUBRID, system tables are implemented a little differently. Each database you create in CUBRID has its own system catalog classes and virtual catalog classes, which hold data about that database, including all indexes, columns, users, triggers, and so on. Database transfers go through without a headache. Personally, I like this implementation more.
There has been talk that CUBRID lacks tables, columns, and much else that is found in an ordinary relational DBMS. CUBRID has tables, columns, procedures, and everything else. Data in CUBRID can be accessed in several ways. You can work with tables (relational approach) or with classes (object approach); with rows (relational approach) or with class instances (object approach); with columns or attributes; with procedures or methods. This way you can use plain SQL (SELECT index_name FROM db_index) to extract, for example, all index names used throughout the database. There is no need to refer to an external database. You can also narrow the query to only primary, reverse, or unique indexes, or only foreign keys. If you are used to the relational concept, you will not notice any difference from any other RDBMS.
CUBRID implements ACID (Atomicity, Consistency, Isolation, Durability), so there is full transaction support. In CUBRID, you can perform data partitioning, replication, compression, validation, and recovery. It is also possible to make hot (online) backups and to create updatable views, triggers, and hierarchical and nested queries. CUBRID has no restrictions on the size of the database, the number of tables or rows, or even the size of certain data types, such as BLOB and CLOB. It has cursors, as well as built-in counter functions, which Lev described in detail. CUBRID also allows you to cache and plan queries. There are many ways to optimize queries on the spot using SQL hints. One of the main features of CUBRID is its native support for High Availability. This built-in feature is a pretty big topic in itself, so I will gladly tell you more about it in a separate Habr post.
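To make the catalog queries mentioned above a little more concrete, here is a minimal sketch in Python that only builds the SQL text for narrowing db_index to one kind of index. The helper name index_query, the mapping FLAG_COLUMNS, the flag column names (is_primary_key, is_unique, is_reverse, is_foreign_key), and the 'YES' value are assumptions for illustration, not taken from this article; please check them against the catalog documentation for your CUBRID version.

```python
# Sketch only: builds SQL text for CUBRID's db_index catalog view.
# The flag columns and their 'YES'/'NO' values are assumptions to verify
# against the catalog documentation for your CUBRID version.

# Hypothetical mapping from an index kind to a db_index flag column.
FLAG_COLUMNS = {
    "primary": "is_primary_key",
    "unique": "is_unique",
    "reverse": "is_reverse",
    "foreign": "is_foreign_key",
}


def index_query(kind=None):
    """Return SQL listing index names, optionally limited to one kind."""
    sql = "SELECT index_name FROM db_index"
    if kind is None:
        return sql
    if kind not in FLAG_COLUMNS:
        raise ValueError(f"unknown index kind: {kind!r}")
    # Only names from the fixed mapping above are ever interpolated.
    return f"{sql} WHERE {FLAG_COLUMNS[kind]} = 'YES'"


if __name__ == "__main__":
    print(index_query())
    print(index_query("unique"))
```

The resulting string can be run in any CUBRID client. Because the catalog lives inside each database, the same query works unchanged after the database is moved to another server.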
Where do we use CUBRID ourselves?
On the whole, CUBRID is a full-featured database management system that can keep data operations running without interruption under very high loads. For example, at NHN we use CUBRID on the servers of the NAVER search service, which receives requests from more than 17 million unique users per day. CUBRID is used in the search results monitoring system on NAVER.com and is directly responsible for storing data about the quality of the results. To improve the relevance of query results and to fight spam sites, we need to record the keywords used in searches and associate each of them with all the Web pages already indexed by the search server. Millions of records are inserted, updated, and, of course, retrieved from the database, and CUBRID handles this flawlessly.
You are most likely wondering how well CUBRID handles system failures. The causes can vary, as you know, but for us at NHN it is important that availability be nine nines, with six nines as the lower limit. Therefore, we always enable the CUBRID High Availability option on all servers. There was once a case when the master server of one of the services failed because of a physical problem. The failure of the master server could have taken down the entire service, but thanks to CUBRID's High Availability feature, the master role was automatically transferred to the primary slave server. This happened so quickly, within the timeout set in the notification system, that even the database administrators did not notice the hardware failure until they looked at the logs during a routine check. This was the first such case, and so far the only one.
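To put the "nine nines" and "six nines" targets into perspective, the short sketch below converts a number of nines into the maximum downtime per year. The function name allowed_downtime_seconds and the 365-day year are assumptions chosen for illustration; this is plain arithmetic, not a CUBRID tool.

```python
# Illustrative arithmetic: how much downtime per year an availability
# target of N nines allows. Assumes a 365-day (non-leap) year.

SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds


def allowed_downtime_seconds(nines):
    """Maximum yearly downtime, in seconds, for `nines` nines of availability."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    # N nines means an unavailable fraction of 10**-N,
    # e.g. six nines = 99.9999% available = 1e-6 unavailable.
    return SECONDS_PER_YEAR / 10 ** nines


if __name__ == "__main__":
    for n in (6, 9):
        print(f"{n} nines: {allowed_downtime_seconds(n):.6f} s per year")
```

Under these assumptions, six nines allows roughly 31.5 seconds of downtime per year and nine nines roughly 31.5 milliseconds, which is why automatic failover matters more than any manual recovery procedure.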
Current status
To date, we have developed a very large number of features in CUBRID, many of which are fully compatible with other RDBMSs such as MySQL or MSSQL, alongside many unique ones. For the convenience of users, we strive for maximum compatibility with MySQL so that users switching to CUBRID can easily adapt their applications. To this end, we have planned several phases of "MySQL Compatibility" at the level of SQL and the programming interfaces. The first phase, with a fairly wide package of MySQL-compatible functions, was completed and included in CUBRID version 8.3.0. The programming interfaces are being updated in parallel. Only a few PHP functions remain that are not yet fully compatible with their MySQL counterparts. At the beginning of next month (May 2011) we plan to release CUBRID 8.4.0 with the second phase, which will cover almost 90% of MySQL's SQL syntax. We have planned the third and final phase for the end of summer. Thus, by the beginning of autumn, I hope we will have ironed out all the differences between CUBRID and MySQL.
I will cover further details, development progress, plans, and other interesting stories from the life of CUBRID in upcoming posts. I hope this article has given you plenty of food for thought. Please download CUBRID, try it out, and share your impressions, comments, and wishes in the comments.