youROCK September 20, 2009 at 01:50

PHP DBMS engine

Hello dear readers. My DBMS is not a port of any existing DBMS in PHP (unlike C # -SQLITE , for example), but is a unique development. The main difference from existing engines like TxtSQL is index support. If only PRIMARY INDEX is used, the insertion speed on my laptop reaches 5000 / sec. For PHP, it seems to me, this is quite good.

What for?

The main goal for me from the beginning was purely theoretical interest, there was always a desire to make a replacement for MySQL functions with PHP in order to make most scripts that are linked to MySQL work without supporting it.

Before writing an SQL parser, you need to create a platform on which this very DBMS should be supported. Initially, I used phpTable for these purposes, but since it does not have index support, processing large amounts of information was impossible. I also did not find other DBMSs with index support, so I had to create the kernel for the database myself. I’ve been developing it for quite some time, and I went through several evolutionary steps, for example, adding support for B-trees and organizing indexes based on them.

I post it here in the form in which it is now, because I'm looking for people who could help me with the development. For example, in my DBMS there is no SQL support, and I would like to add such support, while I myself would like to engage in further development of the database core.

What is it like

At this stage, the core of the database is at an early stage of development, and some pieces of debugging code have not even been removed from the version that I will present here. They, of course, can be removed and used by the DBMS right now, but I would not like to focus your attention on this. If you can’t wait to see the kernel “in battle”, I have a demo:

forum.dklab.ru forum database imported into a makeshift forum

Initially, the forum used MySQL, but it was modified to work with the kernel of my DBMS (this process didn’t take much a lot of time, literally a few hours). The forum.dklab.ru database contains about 150,000 messages, the message database takes about 60 MB (I can send the source to anyone).

Current database features

At the moment, the database supports AUTO_INCREMENT fields, which are both PRIMARY INDEX, as well as analogues of the INDEX and UNIQUE indexes in MySQL. There is support for multiple indexed fields. There is no support for composite indexes at this time, and so far is not planned. Also, only numeric fields (INT) are available for indexing.

The syntax of the method that is used to fetch strings currently provides for fetching only one condition of the form " field_name operator value ", where the operator can be one of the following: "> <= IN". Indexes are used for queries of the form " field_name "= value ”, and only for them. For PRIMARY INDEX, there is also sampling optimization using the IN operator (this is a complete analogue of the same in MySQL).

Despite the fact that these capabilities may seem quite small, this is quite enough to make, for example, a forum or CMS based on my kernel.

Technical details

For PRIMARY INDEX (and at the same time AUTO_INCREMENT fields), a separate file is used in which the offsets are written in the table with the data, and according to a very simple principle: at position id * 4 (4 = sizeof (int)) the same offset is written. It works very fast.

For UNIQUE indices, a B-tree is used. At the moment, its implementation does not shine with performance, but nevertheless, the use of a real B-tree allows you to make the minimum number of calls to the file system (~ log ( NUMBER OF RECORDS ) / log (70)), and accordingly - storing arbitrary values in large quantity without loss of performance when adding new records.

For INDEX fields, the same B-tree is used, but supplemented by an external list of duplicate values. Accordingly, INDEX performance drops linearly with the number of duplicate values.

All file offsets are 32-bit, which imposes a 2 GB limit for both the base file and the indexes. All indexes are stored in files, so the indexes need to be "warmed up" before use - for unix, do it

cat *.idx *.btr *.pri > /dev/null

before starting the database from a new location. This will put the indexes in the OS cache, which will significantly increase the performance of the database.

Using

To work, my DBMS requires specifying a data folder in which it will place its files.

$db = new YNDb('my_data');

Creating a table is a fairly simple operation. The second argument is an array of fields with their types, the third is an indication of additional attributes, mainly indices.

$db->create('test_table', array('id' => 'INT', 'name' => 'TINYTEXT'), array('AUTO_INCREMENT' => 'id') );

Insertion could not be simpler: Fetching values has the most complex syntax, but it allows you to sort the result, apply limits, etc. Here is one example:

for($i = 0; $i < 100; $i++)

{

$db->insert('test_table', array('name' => 'value'.$i));

}

$results = $db->select('test_table', array('cond' => 'id > 30', 'limit' => '30,50', 'order' => array('name', SORT_ASC) ) );


* This source code was highlighted with Source Code Highlighter.

Where can I download?

The project is hosted on Google Code:

code.google.com/p/yndb

There is a brief description of the project, documentation, and a couple of examples of using the DBMS, including “YNDbAdmin” - a script with 300 lines of code that gives basic functionality for working with base.

Waiting for your comments :)!

Tags: