pudge embedded database in 500 lines on golang

pudge is an embeddable key / value database written in the standard Go library.

I will dwell on the fundamental differences from the existing solutions.

Stateless

pudge.Set("../test/test", "Hello", "World")

Pooj will automatically create a test database, including subdirectories, or open it. There is no need to store the state of the table and it is safe to store values in multi-threaded applications. Pooj is thread safe.

Typefree

Bytes, strings, numbers, or structures can be written to a pudge. Do not worry about converting data into their binary representation.

type Point struct {
		X int
		Y int
	}
	for i := 100; i >= 0; i-- {
		p := &Point{X: i, Y: i}
		db.Set(i, p)
	}
	var point Point
	db.Get(8, &point)
	log.Println(point)

QuerySystem Pooj

provides the ability to extract keys in a specific order, including a selection with the indication of a limit, an indent, a sort, and a selection by prefix.

   keys, _ := db.Keys(7, 2, 0, true)

The above code is analogous to the SQL query:

selectkeysfrom db wherekey>7orderbykeysasclimit2offset0

It should be noted that the sorting of keys - "lazy." On the other hand, the keys are stored in memory and it runs pretty quickly.

Parallelism

Pooj, like most modern databases, uses a non-blocking read model, but writing to a file blocks all operations. But you can create / open files on the fly, minimizing the number of locks. There is no database already opened error in puja. An example of using an http router:

funcwrite(c *gin.Context) {
	var err error
	group := c.Param("group")
	counter := c.Param("counter")
	db, err := pudge.Open(group, cfg)
	if err != nil {
		renderError(c, err)
		return
	}
	_, err = db.Counter(counter, 1)
	if err != nil {
		renderError(c, err)
		return
	}
	c.String(http.StatusOK, "%s", "ok")
}

Engines

Despite its small size, the pooj supports two modes of data storage. In memory and on disk. By default, the pooj stores data (values) only on disk. But if you want, you can turn on data storage in memory. In this case, they will be dropped to disk on request, or when closing the database.

Status

Puja used in home projects, and in Productions, on the chart below - the number of requests to the http server on the basis of pujas, and the number of requests for more than 20 ms

in this case pujas switched on, in complete synchronization, and when the fsync - cases significantly (more than 20 ms) delay. But fortunately, there are not so many of them as a percentage.

On the project pageYou can find more links with examples of integrating puja into various projects.

Speed

In the benchmark repository you can compare the pooj with other databases:

Test 1

Number of keys: 1000000
Minimum key size: 16, maximum key size: 64
Minimum value size: 128, maximum value size: 512
Concurrency: 2

^{pogreb
goleveldb
bolt
badgerdb
pudge
slowpoke
pudge (mem)
1M (Put + Get), seconds
187
38
126
34
23
23
2
1M Put, ops / sec
5336
34743
8054
33539
47298
46789
439581
1M Get, ops / sec
1782423
98406
499871
220597
499172
445783
1652069
FileSize, Mb
568
357
552
487
358
358
358}
The pooj is very well balanced in terms of the ratio between the writing speed and the reading speed. Those he is not a highly specialized database optimized for reading or writing. With a high reading speed, a rather high writing speed is maintained. Which however can be further increased due to the parallelization of the record in different files (as is done in the LSM Tree engines).

Links to the database used in the test:

Pogreb Embedded key-store of value for the read-heavy workloads Written in the Go
goleveldb LevelDB key / value database in Go.
bolt An embedded key / value database for Go.
badgerdb Fast key-value DB in Go
slowpoke (based on pudge)
pudge the Fast and simple key / store of value Written using the the Go's standard library

They asked to compare with memcache and redis, but since the lion's share of time is spent on network interfaces when interacting with data DB, this is not entirely fair. Although on the other hand, the pooj benefits from multithreading, even though it writes data to disk.

Further development

Transactions It would be convenient to combine pool write requests with an automatic rollback in case of an error.
Ability to limit key lifetime (like TTL in memcache / cassandra etc)
No server. It is convenient to build a pooj in existing microservices, but most likely there will be a separate server. In a separate project.
Mobile version. For use on Android, iOS and as a plugin for Flutter.

Tags:

	pogreb	goleveldb	bolt	badgerdb	pudge	slowpoke	pudge (mem)
1M (Put + Get), seconds	187	38	126	34	23	23	2
1M Put, ops / sec	5336	34743	8054	33539	47298	46789	439581
1M Get, ops / sec	1782423	98406	499871	220597	499172	445783	1652069
FileSize, Mb	568	357	552	487	358	358	358

pudge embedded database in 500 lines on golang

Test 1

Also popular now: