pudge: an embedded database in 500 lines of Go

    pudge is an embeddable key/value database written using only the Go standard library.


    I will focus on the fundamental differences from existing solutions.

    Stateless

    pudge.Set("../test/test", "Hello", "World")
    

    Pudge will automatically create the test database, including any missing subdirectories, or open it if it already exists. There is no need to keep track of the database state yourself, and it is safe to store values from multi-threaded applications: pudge is thread-safe.
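    The "create it, including subdirectories, or open it" behaviour can be illustrated with the standard library alone. This is a minimal sketch, not pudge's own code; the helper name openOrCreate and the file permissions are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// openOrCreate is a hypothetical helper (not part of pudge's API). It creates
// any missing parent directories and then opens the file, creating it if it
// does not exist yet.
func openOrCreate(path string) (*os.File, error) {
	if err := os.MkdirAll(filepath.Dir(path), 0755); err != nil {
		return nil, err
	}
	return os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0644)
}

func main() {
	dir, err := os.MkdirTemp("", "sketch")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer os.RemoveAll(dir)

	// The nested directories do not exist yet; the helper creates them.
	f, err := openOrCreate(filepath.Join(dir, "test", "nested", "db"))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer f.Close()
	fmt.Println("opened", filepath.Base(f.Name()))
}
```

    Calling the helper a second time with the same path simply opens the existing file, which is what lets a one-line call such as pudge.Set work without a separate "create database" step.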

    Typefree

    Bytes, strings, numbers, or structures can be written to pudge. You do not need to worry about converting data into its binary representation.

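    type Point struct {
    	X int
    	Y int
    }

    // db is an open database, for example one returned by pudge.Open.
    for i := 100; i >= 0; i-- {
    	p := &Point{X: i, Y: i}
    	db.Set(i, p) // integer key, struct value
    }
    var point Point
    db.Get(8, &point) // decode the value stored under key 8
    log.Println(point)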
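    One way such a type-free interface can be built is a type switch that passes bytes and strings through, encodes fixed-size numbers as big-endian, and falls back to encoding/gob for everything else. The sketch below is an illustration under that assumption, not pudge's actual implementation; toBinary and fromGob are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/gob"
	"fmt"
)

// toBinary is a hypothetical helper, not pudge's actual code. It converts an
// arbitrary value to bytes: raw bytes and strings pass through, fixed-size
// numbers use big-endian encoding, and everything else goes through gob.
func toBinary(v interface{}) ([]byte, error) {
	var buf bytes.Buffer
	switch x := v.(type) {
	case []byte:
		return x, nil
	case string:
		return []byte(x), nil
	case int:
		// int has no fixed size, so widen it to int64 first.
		err := binary.Write(&buf, binary.BigEndian, int64(x))
		return buf.Bytes(), err
	case bool, int8, int16, int32, int64,
		uint8, uint16, uint32, uint64, float32, float64:
		err := binary.Write(&buf, binary.BigEndian, x)
		return buf.Bytes(), err
	default:
		err := gob.NewEncoder(&buf).Encode(v)
		return buf.Bytes(), err
	}
}

// fromGob decodes a gob-encoded value into dst, which must be a pointer.
func fromGob(data []byte, dst interface{}) error {
	return gob.NewDecoder(bytes.NewReader(data)).Decode(dst)
}

// Point mirrors the struct used in the article's example.
type Point struct {
	X int
	Y int
}

func main() {
	key, err := toBinary(8)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	val, err := toBinary(&Point{X: 8, Y: 8})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	var p Point
	if err := fromGob(val, &p); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(key), p)
}
```

    A side effect of big-endian encoding is that non-negative integer keys of the same width compare bytewise in numeric order, which is convenient when keys are later sorted as byte slices.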
    

    Query system

    Pudge can return keys in a specific order, with a limit, an offset, a sort direction, and selection by prefix.

       keys, _ := db.Keys(7, 2, 0, true)
    

    The above code is analogous to the SQL query:

    select keys from db where key > 7 order by keys asc limit 2 offset 0

    Note that the keys are sorted lazily, that is, only when a query needs them in order. On the other hand, the keys are kept in memory, so this runs quite quickly.
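    The combination of lazy sorting with from/limit/offset selection can be sketched as follows. This is an illustration, not pudge's code: the keyIndex type and its methods are hypothetical, and the "strictly greater than from" rule is taken from the SQL analogy above.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// keyIndex is a hypothetical in-memory key list that is sorted lazily:
// add just appends, and sorting happens on the first query after a change.
type keyIndex struct {
	keys   [][]byte
	sorted bool
}

func (ix *keyIndex) add(key []byte) {
	ix.keys = append(ix.keys, key)
	ix.sorted = false
}

// query returns up to limit keys strictly after from (strictly before it
// when asc is false), skipping the first offset matches. A nil from means
// "no bound" and limit <= 0 means "no limit".
func (ix *keyIndex) query(from []byte, limit, offset int, asc bool) [][]byte {
	if !ix.sorted {
		sort.Slice(ix.keys, func(i, j int) bool {
			return bytes.Compare(ix.keys[i], ix.keys[j]) < 0
		})
		ix.sorted = true
	}
	var out [][]byte
	n := len(ix.keys)
	for i := 0; i < n; i++ {
		k := ix.keys[i]
		if !asc {
			k = ix.keys[n-1-i] // walk the sorted slice backwards
		}
		if from != nil {
			c := bytes.Compare(k, from)
			if (asc && c <= 0) || (!asc && c >= 0) {
				continue
			}
		}
		if offset > 0 {
			offset--
			continue
		}
		out = append(out, k)
		if limit > 0 && len(out) == limit {
			break
		}
	}
	return out
}

func main() {
	ix := &keyIndex{}
	for i := 10; i >= 0; i-- {
		ix.add([]byte{byte(i)}) // inserted in reverse order on purpose
	}
	// Analogous to: where key > 7 order by keys asc limit 2 offset 0
	fmt.Println(ix.query([]byte{7}, 2, 0, true))
}
```

    The linear scan keeps the sketch short; with a sorted slice, a binary search (sort.Search) could find the starting position instead.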

    Parallelism

    Pudge, like most modern databases, uses a non-blocking read model, but a write to a file blocks all other operations on it. However, you can create or open files on the fly, which keeps the number of locks to a minimum. There is no "database already opened" error in pudge. An example using an HTTP router:

    func write(c *gin.Context) {
    	var err error
    	group := c.Param("group")
    	counter := c.Param("counter")
    	db, err := pudge.Open(group, cfg)
    	if err != nil {
    		renderError(c, err)
    		return
    	}
    	_, err = db.Counter(counter, 1)
    	if err != nil {
    		renderError(c, err)
    		return
    	}
    	c.String(http.StatusOK, "%s", "ok")
    }
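    Opening files on the fly without an "already opened" error is commonly done with a registry that hands out one shared handle per name. The sketch below shows that pattern with the standard library only; the registry and store types are hypothetical stand-ins and are not claimed to match pudge's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// store stands in for an opened database file. It has its own lock, so
// writes to one store do not block writes to another.
type store struct {
	mu       sync.Mutex
	counters map[string]int64
}

// counter adds delta to the named counter and returns the new value.
func (s *store) counter(key string, delta int64) int64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.counters[key] += delta
	return s.counters[key]
}

// registry hands out one shared *store per name.
type registry struct {
	mu     sync.RWMutex
	stores map[string]*store
}

// open returns the store for name, creating it on first use.
func (r *registry) open(name string) *store {
	r.mu.RLock()
	s, ok := r.stores[name] // reading a nil map is safe
	r.mu.RUnlock()
	if ok {
		return s
	}
	r.mu.Lock()
	defer r.mu.Unlock()
	if s, ok := r.stores[name]; ok {
		return s // another goroutine created it first
	}
	if r.stores == nil {
		r.stores = make(map[string]*store)
	}
	s = &store{counters: make(map[string]int64)}
	r.stores[name] = s
	return s
}

func main() {
	var reg registry
	var wg sync.WaitGroup
	for i := 0; i < 100; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Every goroutine "opens" the same group, as the handler does.
			reg.open("group").counter("hits", 1)
		}()
	}
	wg.Wait()
	fmt.Println(reg.open("group").counter("hits", 0))
}
```

    Because each store carries its own mutex, spreading counters over several groups reduces contention, which is the point the paragraph above makes about minimizing locks.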
    

    Engines

    Despite its small size, pudge supports two storage modes: in memory and on disk. By default, pudge stores data (values) only on disk. If you want, you can turn on in-memory storage; in that case the data is flushed to disk on request or when the database is closed.
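    The in-memory mode described above, where data reaches the disk only on request or on close, can be sketched like this. The memStore type, its method names, and the use of gob for the snapshot are assumptions made for the illustration; pudge's real file format and configuration options are not shown here.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
	"path/filepath"
	"sync"
)

// memStore is a hypothetical in-memory engine: values live in a map and are
// written to disk only when Flush or Close is called.
type memStore struct {
	mu   sync.RWMutex
	path string
	data map[string][]byte
}

// openMem loads a previous snapshot from path if one exists.
func openMem(path string) (*memStore, error) {
	s := &memStore{path: path, data: make(map[string][]byte)}
	f, err := os.Open(path)
	if os.IsNotExist(err) {
		return s, nil // nothing saved yet
	}
	if err != nil {
		return nil, err
	}
	defer f.Close()
	if err := gob.NewDecoder(f).Decode(&s.data); err != nil {
		return nil, err
	}
	return s, nil
}

func (s *memStore) Set(key string, val []byte) {
	s.mu.Lock()
	s.data[key] = val
	s.mu.Unlock()
}

func (s *memStore) Get(key string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	val, ok := s.data[key]
	return val, ok
}

// Flush writes a snapshot to a temporary file and renames it into place.
func (s *memStore) Flush() error {
	s.mu.RLock()
	defer s.mu.RUnlock()
	tmp := s.path + ".tmp"
	f, err := os.Create(tmp)
	if err != nil {
		return err
	}
	if err := gob.NewEncoder(f).Encode(s.data); err != nil {
		f.Close()
		return err
	}
	if err := f.Close(); err != nil {
		return err
	}
	return os.Rename(tmp, s.path)
}

// Close flushes the data; nothing is persisted before this or Flush.
func (s *memStore) Close() error { return s.Flush() }

func run() error {
	dir, err := os.MkdirTemp("", "memstore")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "db")

	s, err := openMem(path)
	if err != nil {
		return err
	}
	s.Set("Hello", []byte("World"))
	if err := s.Close(); err != nil {
		return err
	}
	reopened, err := openMem(path)
	if err != nil {
		return err
	}
	val, _ := reopened.Get("Hello")
	fmt.Println(string(val))
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Println("error:", err)
	}
}
```

    The trade-off is the usual one: writes are as fast as a map update, but anything set after the last flush is lost if the process dies before Close.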

    Status

    Pudge is used in hobby projects and in production. The chart below shows the number of requests to an HTTP server built on pudge, and the number of requests that took longer than 20 ms.

    [chart: total requests and requests slower than 20 ms]

    In this case pudge is running in full-synchronization mode, and fsync occasionally causes significant delays (more than 20 ms). Fortunately, such requests make up only a small percentage of the total.

    On the project page you can find more links with examples of integrating pudge into various projects.

    Speed

    In the benchmark repository you can compare pudge with other databases:

    Test 1


    Number of keys: 1000000
    Minimum key size: 16, maximum key size: 64
    Minimum value size: 128, maximum value size: 512
    Concurrency: 2
    
                              pogreb  goleveldb    bolt  badgerdb   pudge  slowpoke  pudge (mem)
    1M (Put + Get), seconds      187         38     126        34      23        23            2
    1M Put, ops/sec             5336      34743    8054     33539   47298     46789       439581
    1M Get, ops/sec          1782423      98406  499871    220597  499172    445783      1652069
    FileSize, MB                 568        357     552       487     358       358          358

    Pudge is very well balanced in terms of the ratio between write speed and read speed. That is, it is not a highly specialized database optimized only for reading or only for writing. Along with a high read speed, it maintains a fairly high write speed, which can be increased further by parallelizing writes across different files (as LSM-tree engines do).
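    One simple way to spread writes across different files is to pick the file from a hash of the key, so that writes to different shards contend on different per-file locks. The sketch below assumes that approach; the shardFor function and the "base.N" naming scheme are hypothetical and are not something pudge provides.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// shardFor maps a key to one of n file names (n must be greater than zero).
// The same key always maps to the same file. The naming scheme is an
// assumption made for this sketch.
func shardFor(base string, key []byte, n uint32) string {
	h := fnv.New32a()
	h.Write(key) // Write on a hash.Hash never returns an error
	return fmt.Sprintf("%s.%d", base, h.Sum32()%n)
}

func main() {
	for _, k := range []string{"alpha", "beta", "gamma", "delta"} {
		fmt.Println(k, "->", shardFor("db", []byte(k), 4))
	}
}
```

    Each resulting name would then be opened as its own database file. The cost is that ordered queries have to merge results from all shards.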

    Links to the databases used in the test:

    • pogreb: embedded key-value store for read-heavy workloads, written in Go
    • goleveldb: LevelDB key/value database in Go
    • bolt: an embedded key/value database for Go
    • badgerdb: fast key-value DB in Go
    • slowpoke: (based on pudge)
    • pudge: fast and simple key/value store written using Go's standard library

    I was asked to compare pudge with memcached and Redis, but since the lion's share of the time is spent on the network interface when talking to those databases, that would not be entirely fair. On the other hand, pudge benefits from multithreading even though it writes data to disk.

    Further development

    • Transactions. It would be convenient to group write requests into a batch with automatic rollback on error.
    • The ability to limit key lifetime (like TTL in memcached, Cassandra, etc.).
    • A server. There is none yet: it is convenient to embed pudge in existing microservices, but a standalone server will most likely appear, as a separate project.
    • A mobile version, for use on Android and iOS and as a plugin for Flutter.
