pudge embedded database in 500 lines on golang
pudge is an embeddable key / value database written in the standard Go library.
I will dwell on the fundamental differences from the existing solutions.
Stateless
Pooj will automatically create a test database, including subdirectories, or open it. There is no need to store the state of the table and it is safe to store values in multi-threaded applications. Pooj is thread safe.
Typefree
Bytes, strings, numbers, or structures can be written to a pudge. Do not worry about converting data into their binary representation.
QuerySystem Pooj
provides the ability to extract keys in a specific order, including a selection with the indication of a limit, an indent, a sort, and a selection by prefix.
The above code is analogous to the SQL query:
It should be noted that the sorting of keys - "lazy." On the other hand, the keys are stored in memory and it runs pretty quickly.
Parallelism
Pooj, like most modern databases, uses a non-blocking read model, but writing to a file blocks all operations. But you can create / open files on the fly, minimizing the number of locks. There is no database already opened error in puja. An example of using an http router:
Engines
Despite its small size, the pooj supports two modes of data storage. In memory and on disk. By default, the pooj stores data (values) only on disk. But if you want, you can turn on data storage in memory. In this case, they will be dropped to disk on request, or when closing the database.
Status
Puja used in home projects, and in Productions, on the chart below - the number of requests to the http server on the basis of pujas, and the number of requests for more than 20 ms
in this case pujas switched on, in complete synchronization, and when the fsync - cases significantly (more than 20 ms) delay. But fortunately, there are not so many of them as a percentage.
On the project pageYou can find more links with examples of integrating puja into various projects.
Speed
In the benchmark repository you can compare the pooj with other databases:
The pooj is very well balanced in terms of the ratio between the writing speed and the reading speed. Those he is not a highly specialized database optimized for reading or writing. With a high reading speed, a rather high writing speed is maintained. Which however can be further increased due to the parallelization of the record in different files (as is done in the LSM Tree engines).
Links to the database used in the test:
They asked to compare with memcache and redis, but since the lion's share of time is spent on network interfaces when interacting with data DB, this is not entirely fair. Although on the other hand, the pooj benefits from multithreading, even though it writes data to disk.
Further development
I will dwell on the fundamental differences from the existing solutions.
Stateless
pudge.Set("../test/test", "Hello", "World")
Pooj will automatically create a test database, including subdirectories, or open it. There is no need to store the state of the table and it is safe to store values in multi-threaded applications. Pooj is thread safe.
Typefree
Bytes, strings, numbers, or structures can be written to a pudge. Do not worry about converting data into their binary representation.
type Point struct {
X int
Y int
}
for i := 100; i >= 0; i-- {
p := &Point{X: i, Y: i}
db.Set(i, p)
}
var point Point
db.Get(8, &point)
log.Println(point)
QuerySystem Pooj
provides the ability to extract keys in a specific order, including a selection with the indication of a limit, an indent, a sort, and a selection by prefix.
keys, _ := db.Keys(7, 2, 0, true)
The above code is analogous to the SQL query:
selectkeysfrom db wherekey>7orderbykeysasclimit2offset0
It should be noted that the sorting of keys - "lazy." On the other hand, the keys are stored in memory and it runs pretty quickly.
Parallelism
Pooj, like most modern databases, uses a non-blocking read model, but writing to a file blocks all operations. But you can create / open files on the fly, minimizing the number of locks. There is no database already opened error in puja. An example of using an http router:
funcwrite(c *gin.Context) {
var err error
group := c.Param("group")
counter := c.Param("counter")
db, err := pudge.Open(group, cfg)
if err != nil {
renderError(c, err)
return
}
_, err = db.Counter(counter, 1)
if err != nil {
renderError(c, err)
return
}
c.String(http.StatusOK, "%s", "ok")
}
Engines
Despite its small size, the pooj supports two modes of data storage. In memory and on disk. By default, the pooj stores data (values) only on disk. But if you want, you can turn on data storage in memory. In this case, they will be dropped to disk on request, or when closing the database.
Status
Puja used in home projects, and in Productions, on the chart below - the number of requests to the http server on the basis of pujas, and the number of requests for more than 20 ms
in this case pujas switched on, in complete synchronization, and when the fsync - cases significantly (more than 20 ms) delay. But fortunately, there are not so many of them as a percentage.
On the project pageYou can find more links with examples of integrating puja into various projects.
Speed
In the benchmark repository you can compare the pooj with other databases:
Test 1
Number of keys: 1000000
Minimum key size: 16, maximum key size: 64
Minimum value size: 128, maximum value size: 512
Concurrency: 2
pogreb | goleveldb | bolt | badgerdb | pudge | slowpoke | pudge (mem) | |
1M (Put + Get), seconds | 187 | 38 | 126 | 34 | 23 | 23 | 2 |
1M Put, ops / sec | 5336 | 34743 | 8054 | 33539 | 47298 | 46789 | 439581 |
1M Get, ops / sec | 1782423 | 98406 | 499871 | 220597 | 499172 | 445783 | 1652069 |
FileSize, Mb | 568 | 357 | 552 | 487 | 358 | 358 | 358 |
The pooj is very well balanced in terms of the ratio between the writing speed and the reading speed. Those he is not a highly specialized database optimized for reading or writing. With a high reading speed, a rather high writing speed is maintained. Which however can be further increased due to the parallelization of the record in different files (as is done in the LSM Tree engines).
Links to the database used in the test:
- Pogreb Embedded key-store of value for the read-heavy workloads Written in the Go
- goleveldb LevelDB key / value database in Go.
- bolt An embedded key / value database for Go.
- badgerdb Fast key-value DB in Go
- slowpoke (based on pudge)
- pudge the Fast and simple key / store of value Written using the the Go's standard library
They asked to compare with memcache and redis, but since the lion's share of time is spent on network interfaces when interacting with data DB, this is not entirely fair. Although on the other hand, the pooj benefits from multithreading, even though it writes data to disk.
Further development
- Transactions It would be convenient to combine pool write requests with an automatic rollback in case of an error.
- Ability to limit key lifetime (like TTL in memcache / cassandra etc)
- No server. It is convenient to build a pooj in existing microservices, but most likely there will be a separate server. In a separate project.
- Mobile version. For use on Android, iOS and as a plugin for Flutter.