Ha, "replicate" is certainly too strong a word. As I wrote in the proposal, and later when I clarified things, I chose to compare how `sia_vbd` structures data to how Git works because I expect most technical people, especially developers, to be familiar with Git. This should help convey the concept more clearly. A commit in Git is very similar to what I referred to as `State` earlier. I even thought about naming it `Commit`, but felt that might lead to confusion. I'm still on the lookout for a good name to capture this concept.
Just like a Git commit refers to a specific state of the repository, a `sia_vbd` `State` refers to a specific state of the virtual block device, complete with an ID and everything. It offers similar benefits, such as deduplication, snapshotting (tagging), and branching. Importantly, nothing is overwritten; we just clean up unused data later during garbage collection, which is ideal for how Sia's object storage works.
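To make the analogy a bit more concrete, here is a rough sketch of what such a commit-like `State` could look like. The type and field names are my own, purely for illustration; they are not `sia_vbd`'s actual data structures.

```rust
/// 32-byte, content-derived identifier (e.g. a Blake3 hash or Merkle root).
type Id = [u8; 32];

/// An immutable, content-addressed snapshot of the whole virtual block
/// device, playing the same role a commit plays in Git.
struct State {
    /// Derived from the content below, much like a commit hash.
    id: Id,
    /// The `State` this one was derived from, like a commit's parent.
    /// Two `State`s sharing the same parent is effectively a branch.
    parent: Option<Id>,
    /// Ordered references to the data making up the device at this point.
    clusters: Vec<Id>,
}

/// Snapshots ("tags") are then just named pointers to a `State`,
/// exactly like Git tags pointing at commits.
struct Tag {
    name: String,
    state: Id,
}
```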
Being aware of where data is located and ensuring quick access when needed is one of the core competencies of `sia_vbd`. For example, a block could be in any of the following places (a rough sketch follows the list):
- Already buffered in our heap, ready to use
- In the local disk cache
- In the local write-ahead log (WAL)
- Packed in one or more chunks, accessible via `renterd`
- Nonexistent
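Internally, this kind of knowledge could be modeled roughly like the enum below. The names and variants are illustrative assumptions on my part, not `sia_vbd`'s actual types.

```rust
use std::path::PathBuf;

/// Where a given block can currently be found, ordered roughly from
/// cheapest to most expensive to access.
enum BlockLocation {
    /// Already sitting in memory, ready to serve.
    Buffered,
    /// Present in the local on-disk cache.
    DiskCache { path: PathBuf },
    /// Recorded in the local write-ahead log but not yet uploaded.
    Wal { offset: u64 },
    /// Packed into one or more chunks stored as objects via renterd.
    Chunked { chunk_ids: Vec<[u8; 32]> },
    /// Never written; reads are served as all zeroes.
    Nonexistent,
}
```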
`sia_vbd` sees the big picture and does its best to retrieve blocks in the most efficient manner possible. This fairly comprehensive understanding is crucial for handling the biggest challenges in making `sia_vbd` usable in practice. Let me explain further:
A naive approach could look like this:
Reading:
- A read request comes in to read 1200 bytes at offset 47492749221.
- We calculate the block number(s) and relative offset(s), then request the data directly from `renterd` (the offset math is sketched after this list).
- The data is streamed directly to the `nbd` client.
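For illustration, here is the offset math from the second step, assuming the 256 KiB block size used in the scaling examples further down (the actual block size is a design parameter, not a given):

```rust
const BLOCK_SIZE: u64 = 256 * 1024; // assumed block size: 256 KiB

/// Maps a position on the virtual device to (block number, offset within
/// that block, number of requested bytes that fall into that block).
fn locate(offset: u64, len: u64) -> (u64, u64, u64) {
    let block = offset / BLOCK_SIZE;
    let rel = offset % BLOCK_SIZE;
    let in_block = len.min(BLOCK_SIZE - rel);
    (block, rel, in_block)
}

fn main() {
    // The read request from the example: 1200 bytes at offset 47492749221.
    let (block, rel, n) = locate(47_492_749_221, 1200);
    println!("block {block}, relative offset {rel}, {n} bytes in this block");
    // -> block 181170, relative offset 120741, 1200 bytes in this block
}
```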
Writing:
- A write request is received to write [0x4e, 0xab, 0x01 …] to offset 47492749221.
- We calculate the affected block(s) and download the associated object(s) from `renterd`.
- The block(s) are modified based on the data from the write request.
- We delete the object(s) downloaded in step 2 via `renterd`'s API, as they are now outdated.
- The new block(s) are uploaded and stored as new object(s) with the same name as the ones we just deleted (see the sketch after this list).
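And a deliberately naive sketch of the write path just described. `ObjectStore` is a hypothetical stand-in for renterd's object API (not a real interface), and the per-block object naming is made up; the point is simply to make the read-modify-write overhead visible: even a 3-byte write downloads, deletes, and re-uploads a whole block.

```rust
const BLOCK_SIZE: u64 = 256 * 1024; // assumed block size: 256 KiB

/// Hypothetical stand-in for renterd's object API.
trait ObjectStore {
    fn download(&self, name: &str) -> std::io::Result<Vec<u8>>;
    fn delete(&self, name: &str) -> std::io::Result<()>;
    fn upload(&self, name: &str, data: &[u8]) -> std::io::Result<()>;
}

fn naive_write(store: &dyn ObjectStore, offset: u64, data: &[u8]) -> std::io::Result<()> {
    let mut remaining = data;
    let mut pos = offset;
    while !remaining.is_empty() {
        // Steps 1 + 2: work out the affected block and download the whole object.
        let block_no = pos / BLOCK_SIZE;
        let rel = (pos % BLOCK_SIZE) as usize;
        let name = format!("block-{block_no}");
        let mut block = store.download(&name)?;

        // Step 3: patch in the new bytes.
        let n = remaining.len().min(BLOCK_SIZE as usize - rel);
        block[rel..rel + n].copy_from_slice(&remaining[..n]);

        // Steps 4 + 5: delete the outdated object and upload the replacement
        // under the same name.
        store.delete(&name)?;
        store.upload(&name, &block)?;

        remaining = &remaining[n..];
        pos += n as u64;
    }
    Ok(())
}
```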
This approach is certainly enticing—it’s easy to understand and straightforward to implement. However, while this method would work technically, it quickly collapses under real-world conditions.
Here’s why:
Latency and Throughput
Reading from (and, to a smaller extent, writing to) the Sia storage network is a high-latency affair that can vary wildly; it's the nature of the beast. This is especially pronounced when reading lots of small objects: the Time to First Byte can sometimes run into seconds if you're unlucky. We would end up with a block device whose throughput is measured in KiB per second, making it impractical for real use.
`sia_vbd` will do its best to avoid this by trading off implementation simplicity for lower latency and higher throughput. The main aspects to achieve this are:
- Blocks are not tightly coupled to their “location”; instead, they are identified by their content (hash). A sketch of this idea follows the list.
- Blocks are heavily cached locally.
- New, previously unknown blocks are first committed to the local Write-Ahead Log (WAL) before being batch-written to `renterd`, packed into `Chunk`s.
- Because `sia_vbd` has the full picture, it can anticipate the need for a certain block before it is requested and prepare it ahead of time (e.g. read-ahead).
- Again, because of this full understanding, `sia_vbd` can rearrange the read queue and serve requests for blocks we have available locally.
- Further, blocks can be prepared in the background while read requests are waiting in the queue and then served in order of availability.
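The first point, content addressing, underpins most of the others. A minimal sketch of the idea, using the `blake3` crate (the IDs further down are Blake3 hashes); the little cache around it is purely illustrative:

```rust
use std::collections::HashMap;

/// A block is identified purely by the hash of its content,
/// not by where it lives on the device or on the network.
type BlockId = [u8; 32];

fn block_id(content: &[u8]) -> BlockId {
    *blake3::hash(content).as_bytes()
}

/// Because IDs are derived from content, identical blocks collapse into a
/// single entry no matter how many device offsets reference them.
#[derive(Default)]
struct BlockCache {
    blocks: HashMap<BlockId, Vec<u8>>,
}

impl BlockCache {
    /// Stores a block and returns its content-derived ID; duplicates are free.
    fn insert(&mut self, content: Vec<u8>) -> BlockId {
        let id = block_id(&content);
        self.blocks.entry(id).or_insert(content);
        id
    }

    fn get(&self, id: &BlockId) -> Option<&[u8]> {
        self.blocks.get(id).map(|b| b.as_slice())
    }
}
```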
There are limits, of course, and we cannot make the latency-related limitations go away completely. A significant part of the development time will be dedicated to this; it will require a lot of testing and fine-tuning to get to the point where it works well enough for at least the most typical workloads. In the paper I linked to above, latency is specifically mentioned as the most time-consuming aspect of their implementation, and they mention “nearly 6 ms” when testing using S3. `sia_vbd` has to deal with latencies that are at least one or even two orders of magnitude higher, with much worse edge cases! Aggressive methods are required to get this to work.
Scaling
`sia_vbd` needs to be able to handle multi-TiB-sized block devices without breaking much of a sweat; a TiB is not as big as it used to be.
A naive 1-object-per-block approach would quickly lead to millions of tiny objects that need to be managed by `renterd`. The overhead would soon become overwhelming:
Example:
- Block Size: 256 KiB
- Blocks needed per TiB: 4,194,304
- Sia Objects per TiB: 4,194,304
An even more naive approach could use a block size identical to the advertised sector size of the virtual block device. This would make it even easier to implement because every read/write request would exactly map to a single block. However, it would look even more extreme on the backend:
- Block Size: 4 KiB
- Blocks needed per TiB: 268,435,456
- Sia Objects per TiB: 268,435,456
So, the most direct approach (`1 sector == 1 vbd block == 1 sia object`) would require a whopping 268 million objects to store a single TiB!
Clearly, this is not going to scale very far. That's why the design of `sia_vbd` stores multiple blocks packed together into `Chunk`s. Here is how the above looks with `Chunk`s:
- Block Size: 256 KiB
- Chunk Size: 256 blocks
- Blocks needed per TiB: 4,194,304
- Chunks needed per TiB: 16,384
Approximately 16,000 objects per TiB are much more manageable than the numbers we saw with the simpler approaches. This design trade-off allows `sia_vbd` to scale to actually usable block device sizes at the cost of needing the `Chunk` indirection.
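For reference, the arithmetic behind these figures, using the block size and packing factor from the examples above:

```rust
fn main() {
    const TIB: u64 = 1 << 40;            // 1 TiB in bytes
    const BLOCK_SIZE: u64 = 256 * 1024;  // 256 KiB blocks
    const SECTOR_SIZE: u64 = 4 * 1024;   // 4 KiB sectors
    const BLOCKS_PER_CHUNK: u64 = 256;   // packing factor

    let blocks_per_tib = TIB / BLOCK_SIZE;                   // 4_194_304
    let sectors_per_tib = TIB / SECTOR_SIZE;                 // 268_435_456
    let chunks_per_tib = blocks_per_tib / BLOCKS_PER_CHUNK;  // 16_384

    println!("objects per TiB, 1 object per 256 KiB block:  {blocks_per_tib}");
    println!("objects per TiB, 1 object per 4 KiB sector:   {sectors_per_tib}");
    println!("objects per TiB, 256 blocks packed per chunk: {chunks_per_tib}");
}
```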
Initial Storage Size
When creating a new virtual block device, `sia_vbd` needs to initialize the whole structure. Without the deduplication properties of `sia_vbd`'s design, a naive approach would require writing the full amount of data, even if it's all the same, like `0x00`. For instance, if we create a new 1 TiB device, we would need to write a full TiB of blocks containing nothing but `[0x00, 0x00, ...]`. This would not only be very slow but also very wasteful, as a full 1 TiB of data would need to be written to and stored on the Sia network.
By making blocks content-addressable, immutable, and not tightly bound to their location, `sia_vbd` gains deduplication ability. When creating a new device, we end up with a structure that looks somewhat like this for a 1 TiB device (a rough sketch in code follows the list):

- 1 `Block` (256 KiB of `0x00`) with ID `86bb2b521a10612d5a1d38204fac4fa632466d1866144d8a6a7e3afc050ce7ae` (Blake3 hash)
- 1 `Cluster` (256 references to the block ID above) with ID `cac35ec206d868b7d7cb0b55f31d9425b075082b` (Merkle root of block IDs)
- 1 `State` (16384 references to the cluster ID above) with ID `afe04867ec7a3845145579a95f72eca7` (Merkle root of cluster IDs)
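A rough sketch of how that empty device hangs together. The helper code is illustrative only; in particular, hashing the concatenated child IDs with Blake3 merely stands in for the actual Merkle root construction:

```rust
fn main() {
    const BLOCK_SIZE: usize = 256 * 1024; // 256 KiB
    const BLOCKS_PER_CLUSTER: usize = 256;
    const CLUSTERS_PER_TIB: usize = 16_384;

    // 1 Block: 256 KiB of zeroes, identified by the Blake3 hash of its content.
    let zero_block = vec![0u8; BLOCK_SIZE];
    let block_id = blake3::hash(&zero_block);

    // 1 Cluster: 256 references, all pointing at the same block ID.
    let cluster: Vec<blake3::Hash> = vec![block_id; BLOCKS_PER_CLUSTER];
    let mut hasher = blake3::Hasher::new();
    for id in &cluster {
        hasher.update(id.as_bytes());
    }
    let cluster_id = hasher.finalize();

    // 1 State: 16384 references, all pointing at the same cluster ID.
    let state: Vec<blake3::Hash> = vec![cluster_id; CLUSTERS_PER_TIB];

    // Metadata sizes: 32 bytes per referenced ID, plus headers.
    println!("cluster metadata ~ {} KiB", 32 * cluster.len() / 1024); // ~ 8 KiB
    println!("state metadata   ~ {} KiB", 32 * state.len() / 1024);   // ~ 512 KiB
}
```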
The `Block` will be stored in a single `Chunk`, taking up only a few bytes due to compression. There will be a single `Cluster` metadata object taking roughly 8 KiB of storage (32 bytes per `Block` ID * the number of blocks, plus headers). A single `State` metadata object will take about 512 KiB of storage (32 bytes per `Cluster` ID * the number of clusters, plus headers).
Compared to the naive approach, `sia_vbd` can initialize a new 1 TiB block device in a few milliseconds, and the empty device only requires about 530 KiB of active storage, compared to the full 1 TiB the naive approach would consume.
I hope this makes it clear that the approach `sia_vbd` takes was chosen with care and that the trade-offs are well worth it compared to a simpler approach. The naive approach would just not be very practical in real-world situations.