NewSQL: a new round in the evolution of Big Data, taking the best of SQL and NoSQL
Today the rapid growth of data on the Internet is easy to observe. According to one estimate, about 1,200 EB of data (1 EB = 10^18 bytes) was created in 2010, and this will grow to nearly 8,000 EB by 2015, with the Internet as the primary provider of consumer data.
This growth outpaces the growth of capacity, which has led to the emergence of information management systems in which data is stored in a distributed manner but accessed and analyzed as if it resided on a single machine.
While programmers around the world wage holy wars over “SQL vs NoSQL”, large companies such as Google and Facebook, with audiences in the billions, are struggling with insufficient capacity and with DBMSs running at their limits. NoSQL technology made it easy to scale data, but it did not solve the problem of making operations comply with ACID requirements (atomicity, consistency, isolation, durability), the standard that guarantees that the DBMS executes transactions correctly even if the system is interrupted. Against this background, VoltDB, with the support of several other companies, began developing from scratch a new open source project called NewSQL.
Currently, to cope with the load created by 1 billion users, Facebook operates four thousand MySQL instances (using sharding, i.e. distributing data across servers according to some attribute, for example the first letter of the login) and nine thousand memcached installations. Facebook even maintains a dedicated MySQL @ Facebook page, which follows the work of maintaining the company's databases.
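The sharding rule mentioned above (routing by the first letter of the login) can be sketched as follows. This is only an illustration: the function names, the shard count, and the hash-based alternative are assumptions made for the example and say nothing about how Facebook actually routes its data.

```python
import hashlib
import string

NUM_SHARDS = 4  # hypothetical shard count, chosen only for illustration


def shard_by_first_letter(login: str, num_shards: int = NUM_SHARDS) -> int:
    """Route a login to a shard by its first letter (range-style sharding)."""
    if not login:
        raise ValueError("login must not be empty")
    first = login[0].lower()
    if first in string.ascii_lowercase:
        position = string.ascii_lowercase.index(first)  # 0..25
    else:
        position = 0  # non-letters are sent to the first range
    # Split the alphabet into num_shards contiguous ranges.
    return position * num_shards // len(string.ascii_lowercase)


def shard_by_hash(login: str, num_shards: int = NUM_SHARDS) -> int:
    """Route a login by a stable hash, which spreads skewed keys more evenly."""
    digest = hashlib.sha256(login.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Routing by first letter is easy to reason about but can overload the shards that hold popular letters; hashing the key trades that readability for a more even spread.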
A widely known problem of MySQL is that this DBMS was never intended to handle huge amounts of data and a large number of transactions. Michael Stonebraker, a co-founder of VoltDB, adds that MySQL, like other SQL-based databases, spends too many resources on overhead operations (for example, supporting multithreading and maintaining correct query execution within the ACID framework). These costs go unnoticed with small amounts of data but quickly begin to interfere with normal operation as the data grows.
NoSQL systems such as MongoDB and Cassandra are becoming popular, and many see them as an alternative free of the limitations common to conventional relational DBMSs.
To solve these problems, large companies have adopted the NoSQL paradigm. However, NoSQL databases are poorly suited to storing regular structured data, and ACID logic has to be embedded in user code, which complicates the work. In addition, according to Stonebraker, NoSQL does not offer much of a performance gain over traditional SQL-oriented DBMSs.
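To illustrate what embedding ACID logic in user code can mean, here is a minimal sketch of a transfer between two accounts over a key-value store that offers no transactions. The `KeyValueStore` class is a hypothetical in-memory stand-in, not the API of MongoDB, Cassandra, or any other real system; the point is only that the application itself has to undo a half-finished update.

```python
class KeyValueStore:
    """Hypothetical in-memory stand-in for a store without multi-key transactions."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value


def transfer(store, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst, compensating by hand if a step fails."""
    src_before = store.get(src)
    if src_before < amount:
        raise ValueError("insufficient funds")
    store.put(src, src_before - amount)          # step 1: debit
    try:
        if fail_midway:
            raise RuntimeError("simulated failure between the two writes")
        store.put(dst, store.get(dst) + amount)  # step 2: credit
    except Exception:
        store.put(src, src_before)               # manual rollback of step 1
        raise
```

Even this sketch covers only atomicity within a single process. Isolation from concurrent clients and durability across crashes would require still more application code, which is the complication the paragraph above refers to.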
Technical Characteristics of NewSQL Solutions
- SQL as the main mechanism for interaction.
- ACID transaction support.
- Concurrency control without locks, so that real-time reads do not conflict with writes.
- An architecture that provides much higher per-node performance than traditional RDBMS solutions offer.
- Convenient scaling that can handle a large number of nodes without bottlenecks.
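One common way to let reads proceed without conflicting with writes is multi-version concurrency control (MVCC). The sketch below is an assumption made for illustration: the article does not say which mechanism any particular NewSQL product uses, and the `VersionedStore` class is hypothetical. It is a single-threaded model that shows only the visibility rule: each write appends a new version, and a reader sees the newest version committed at or before its snapshot.

```python
import bisect


class VersionedStore:
    """Toy multi-version store: writers append versions, readers use snapshots."""

    def __init__(self):
        self._versions = {}  # key -> (list of commit timestamps, list of values)
        self._clock = 0      # timestamp of the last committed write

    def write(self, key, value):
        """Commit a new version; older versions stay readable."""
        self._clock += 1
        stamps, values = self._versions.setdefault(key, ([], []))
        stamps.append(self._clock)
        values.append(value)
        return self._clock

    def snapshot(self):
        """Return the timestamp a reader should use for a consistent view."""
        return self._clock

    def read(self, key, snapshot_ts):
        """Return the newest value committed at or before snapshot_ts."""
        stamps, values = self._versions.get(key, ([], []))
        idx = bisect.bisect_right(stamps, snapshot_ts)
        return values[idx - 1] if idx else None
```

A reader that took its snapshot before a later write keeps seeing the old value, so the writer never has to wait for it. A real engine would additionally need an atomic commit counter and garbage collection of old versions.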
The project's developers claim that NewSQL systems are about 50 times faster than traditional OLTP RDBMSs.
An architectural example of one of the NewSQL solutions (dbShards).
NewSQL solutions can be classified by the approach they take to preserving the SQL interface while addressing scalability and performance, the main problems of traditional OLTP solutions.
- New Databases: These NewSQL systems are developed entirely from scratch to achieve scalability and performance. One key factor in improving performance is the use of RAM or new types of disks (flash memory / SSD) as the primary data store. This approach can be implemented in software (VoltDB, NuoDB) or at the hardware level (Clustrix, Translattice). Examples are Clustrix, NuoDB and Translattice (commercial) and VoltDB (open source).
- New MySQL database engines: MySQL is part of the LAMP stack and is widely used for OLTP. To overcome MySQL's scalability problems, a number of MySQL-based engines have been created. The advantage is that the MySQL interface is retained; the drawback is that data migration from other databases (including older MySQL) is not supported. Examples are Xeround, GenieDB and Tokutek (commercial) and Akiban, MySQL NDB Cluster and others (open source).
- Transparent clustering: These solutions keep OLTP databases in their original form but add transparent clustering to provide scalability. Another approach is transparent sharding, which also improves scalability. Schooner MySQL, Continuent Tungsten and ScalArc follow the first approach, while ScaleBase and dbShards follow the second. Both approaches allow existing skill sets and ecosystems to be reused and avoid the need to rewrite code or migrate data. Examples are ScalArc, Schooner MySQL, dbShards and ScaleBase (commercial) and Continuent Tungsten (open source).
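The transparent sharding approach from the last item can be sketched as a thin routing layer in front of several ordinary databases. This is a minimal sketch, assuming in-memory SQLite databases as stand-ins for MySQL servers; the `ShardRouter` class, its methods, and the `users` table are hypothetical and do not reflect the actual design of ScaleBase or dbShards.

```python
import sqlite3
import zlib


class ShardRouter:
    """Hypothetical routing layer that hides several databases behind one API."""

    def __init__(self, num_shards):
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for db in self.shards:
            db.execute("CREATE TABLE users (login TEXT PRIMARY KEY, city TEXT)")

    def _shard_for(self, login):
        # A stable hash of the shard key picks the database.
        return self.shards[zlib.crc32(login.encode("utf-8")) % len(self.shards)]

    def insert_user(self, login, city):
        db = self._shard_for(login)
        with db:  # commits on success, rolls back on error
            db.execute("INSERT INTO users VALUES (?, ?)", (login, city))

    def find_user(self, login):
        # Lookups by shard key touch exactly one database.
        cur = self._shard_for(login).execute(
            "SELECT login, city FROM users WHERE login = ?", (login,))
        return cur.fetchone()

    def users_in_city(self, city):
        # Queries without the shard key fan out to every shard and are merged.
        rows = []
        for db in self.shards:
            rows.extend(db.execute(
                "SELECT login FROM users WHERE city = ?", (city,)).fetchall())
        return sorted(login for (login,) in rows)
```

The application calls `insert_user` or `find_user` without knowing how many databases exist, which is what lets existing code keep working. The cost appears in queries such as `users_in_city`, which must visit every shard.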
The new generation of information management systems called NewSQL responds to this trend and these limitations. NewSQL suits firms planning to:
- migrate existing applications to adapt to new data growth trends
- develop new applications on highly scalable OLTP systems
- rely on existing knowledge of OLTP
According to the creators of NewSQL, traditional SQL is outdated, overly complicated and has many problems; moreover, object-oriented DBMSs are not the future but the present. To simplify migration, SQL-to-NewSQL and NewSQL-to-SQL converters will be developed that can translate queries on the fly, making it possible to run old applications without modification.
The NewSQL project is designed to solve the problems Facebook has encountered using MySQL.
NewSQL takes the best from the world of SQL and NoSQL
NoSQL is Out and NewSQL is In - Says Google (Google Spanner)
Translation in some places may not be correct.
We look forward to your comments on this introductory article.