Glommio is a brand new Rust IO framework designed around the new io_uring interface in Linux, high performance NVMe drives, and "thread-per-core" execution. It is at the bleeding edge of high-performance IO-bound software, and will likely be a core component of the next generation of data-intensive applications.
Glommio is led by Glauber Costa, who previously worked on the Seastar C++ library, a similar IO framework that is the foundation of ScyllaDB, a blazing fast NoSQL database.
Rust developers have it pretty good when it comes to people building cool things in their language of choice, with rg being a prime example. But in the last few years, ScyllaDB and Seastar have been high-profile reminders that C++ has serious chops and an impressive legacy when it comes to ultra high performance software.
ScyllaDB is pretty bad ass. It has a userspace networking stack. It provides compatible front-end interfaces for multiple rival databases (Cassandra, AWS DynamoDB), each of which it obliterates in benchmarks. They also publish excellent content about the design of ScyllaDB.
So, when I heard that Costa had begun work on a Seastar-inspired Rust library, I was thrilled!
Putting the 'bleeding' into 'bleeding edge'
Now, the downside of bleeding edge is instability and software immaturity. ScyllaDB, for all its strengths, is a huge pain to build: I have spent multiple hours trying, and failing, on previous occasions.
Glommio is so bleeding edge that the first step of using it, for me, will be installing a newer Linux kernel on my workstation.
Here's what happens when you try to run the Glommio tests with a 5.4 kernel:
$ uname -r
5.4.0-72-generic
$ cargo test
# ...
...panicked at 'Failed to register a probe. The most likely reason is that your kernel witnessed Romulus killing Remus (too old!! kernel should be at least 5.8)', glommio/src/sys/uring.rs:214:13
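The panic message says the kernel should be at least 5.8, so a program could check that up front and fail with a friendlier error. Here is a minimal sketch of such a check using only the standard library. The function names `parse_kernel_version` and `kernel_is_new_enough` are hypothetical helpers invented for illustration, and comparing version strings is my assumption: it is not how Glommio detects support (the panic above comes from a failed io_uring probe registration).

```rust
/// Parse the leading "major.minor" out of a `uname -r` style string,
/// e.g. "5.4.0-72-generic" -> Some((5, 4)).
/// Hypothetical helper, not part of Glommio.
fn parse_kernel_version(release: &str) -> Option<(u32, u32)> {
    // Split on every non-digit character; the first two pieces are
    // expected to be the major and minor version numbers.
    let mut parts = release.split(|c: char| !c.is_ascii_digit());
    let major = parts.next()?.parse().ok()?;
    let minor = parts.next()?.parse().ok()?;
    Some((major, minor))
}

/// Return true if the release string is at least 5.8, the minimum
/// named in the panic message. Unparseable strings count as too old.
fn kernel_is_new_enough(release: &str) -> bool {
    match parse_kernel_version(release) {
        // Tuples compare lexicographically: major first, then minor.
        Some(version) => version >= (5, 8),
        None => false,
    }
}
```

Passing the output of `uname -r` from my workstation, `kernel_is_new_enough("5.4.0-72-generic")` returns `false`, which matches what the test run reported.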
But, before we get into all that, let's talk big picture.
Objectives and Expectations
In this article, we will walk through building a Rust library for paging S3 documents with keys that start with a given prefix. By paging, I mean downloading each file and performing some arbitrary operation on the data.
Our program will:
- retrieve a list of keys that begin with a given prefix
- retrieve the S3 files that correspond to a list of keys
- provide both sequential and non-sequential means of iterating over the downloaded file data, as it arrives
- utilize the Glommio library for network, file and other IO work, which uses io_uring under the hood
- use Rust's async and await syntaxes/idioms
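To make the shape of the first three bullets concrete, here is a rough, synchronous sketch that uses only the standard library. Everything in it is hypothetical and invented for illustration: `InMemoryBucket` stands in for S3, and `list_keys`, `get` and `page_prefix` are placeholder names. It does not use Glommio, io_uring, or async/await, and it only covers the sequential case.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for an S3 bucket: maps a key to the object's bytes.
struct InMemoryBucket {
    objects: BTreeMap<String, Vec<u8>>,
}

impl InMemoryBucket {
    /// Return every key that starts with `prefix`, in sorted order.
    fn list_keys(&self, prefix: &str) -> Vec<String> {
        self.objects
            .keys()
            .filter(|key| key.starts_with(prefix))
            .cloned()
            .collect()
    }

    /// Fetch the bytes stored under `key`, if the key exists.
    fn get(&self, key: &str) -> Option<&[u8]> {
        self.objects.get(key).map(|bytes| bytes.as_slice())
    }
}

/// "Page" a prefix: apply `op` to each matching object, one at a time
/// and in key order, collecting the results.
fn page_prefix<T>(
    bucket: &InMemoryBucket,
    prefix: &str,
    mut op: impl FnMut(&str, &[u8]) -> T,
) -> Vec<T> {
    bucket
        .list_keys(prefix)
        .iter()
        .filter_map(|key| bucket.get(key).map(|data| op(key.as_str(), data)))
        .collect()
}
```

A caller would build a bucket, then run something like `page_prefix(&bucket, "logs/", |key, data| data.len())` to get the size of every object under `logs/`. The real version has to do the listing and downloading over the network and hand data to the caller as it arrives, which is where Glommio comes in.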
My (our) goals are:
- test drive Glommio
- learn more about designing high performance software around io_uring
- give async/await a serious, open-minded look
This is not:
- the easiest way to page S3 files in your program
- something that will result in an open source library well-designed for general use
- guaranteed to succeed