Project Name:
Sia Virtual Block Device (sia_vbd)
Name of the organization or individual submitting the proposal:
Roland Rauch
Describe your project:
sia_vbd implements virtual block devices on top of renterd. Essentially, it provides users with virtual disks that are location-independent, can grow to almost any size, support snapshots and branching, are deduplicated and compressed, and are fully backed by Sia objects.
How It Works
sia_vbd organizes data into Blocks, which are fixed-size units of data addressed by their cryptographic hash. These blocks are grouped into larger structures called Clusters, which are collections of block hashes forming a Merkle tree. Multiple clusters together form the state of the block device in a similar Merkle tree manner. This design makes sia_vbd virtual disks similar to Git repositories in nature.
Blocks are compressed and saved in Chunks, which are stored as regular Sia objects with additional user metadata indicating the contained blocks and their offsets.
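The content-addressed layout described above can be illustrated with a minimal sketch. It is written in Python purely for brevity (the project itself is planned in Rust), and everything specific in it is an assumption for illustration: the 4 KiB block size, four blocks per cluster, SHA-256 as the hash, zlib as the compressor, and the function names. The two-level hash list is also a simplification of the Merkle tree structure the proposal describes.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096        # assumed; the proposal does not state a block size
BLOCKS_PER_CLUSTER = 4   # assumed; kept tiny for illustration


def block_id(data: bytes) -> bytes:
    """Address a block by the hash of its content (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()


def cluster_id(block_ids: list) -> bytes:
    """Hash the ordered block hashes that make up one cluster."""
    return hashlib.sha256(b"".join(block_ids)).digest()


def snapshot(disk: bytes, store: dict) -> bytes:
    """Store each distinct block of `disk` once and return the root hash."""
    assert len(disk) % BLOCK_SIZE == 0
    ids = []
    for off in range(0, len(disk), BLOCK_SIZE):
        block = disk[off:off + BLOCK_SIZE]
        bid = block_id(block)
        # Identical blocks share one hash, so they are stored only once.
        store.setdefault(bid, zlib.compress(block))
        ids.append(bid)
    clusters = [
        cluster_id(ids[i:i + BLOCKS_PER_CLUSTER])
        for i in range(0, len(ids), BLOCKS_PER_CLUSTER)
    ]
    # The device state is identified by a hash over its cluster hashes.
    return hashlib.sha256(b"".join(clusters)).digest()
```

In this sketch, taking snapshots of two disk images that differ in a single block adds only one new entry to `store` while producing two different root hashes. That is the property that makes deduplication, snapshots, and branching cheap in a Git-like design.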
The virtual disks are exported to the user over the network, initially via nbd (Network Block Device), with the ultimate goal of also supporting iscsi (iSCSI). Once connected, the virtual disk looks like any regular disk to the user, allowing formatting, partitioning, and other standard disk operations.
Under The Hood
In the background, sia_vbd maintains a block cache and a Write-Ahead Log (WAL):
- Read Requests: These are mapped to the corresponding block and served either directly from the cache if available, or by fetching the block from renterd if not.
- Write Requests: These are handled by first updating the affected blocks locally, recalculating their hashes, and committing any new blocks to the local WAL. Once the WAL reaches a certain size, the contained blocks are compressed, written to a new Chunk, and uploaded to renterd, making the current state permanent.
- Garbage Collection: Periodically, a garbage collection task identifies Chunks with many unused Blocks. The task then consolidates the current Blocks into new Chunks and deletes the old, now obsolete Chunks.
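The read and write paths above can be sketched roughly as follows. This is a toy model in Python, not the planned Rust implementation: the `VirtualDisk` class, the in-memory `remote` dictionary standing in for renterd, the flush threshold of two blocks, and the chunk naming are all hypothetical. The cache never evicts, the WAL lives in memory rather than on disk, and garbage collection is left out.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096       # assumed block size
WAL_FLUSH_BLOCKS = 2    # assumed; the proposal only says "a certain size"


class VirtualDisk:
    def __init__(self, num_blocks, remote):
        zero = bytes(BLOCK_SIZE)
        zero_id = hashlib.sha256(zero).digest()
        self.block_map = [zero_id] * num_blocks  # block index -> content hash
        self.cache = {zero_id: zero}             # local block cache (no eviction)
        self.wal = {}                            # new blocks not yet uploaded
        self.remote = remote                     # stand-in for renterd: name -> bytes
        self.index = {}                          # hash -> (chunk name, offset, length)

    def read_block(self, idx):
        """Serve from the cache if possible, otherwise fetch from a chunk."""
        bid = self.block_map[idx]
        if bid not in self.cache:
            name, off, length = self.index[bid]
            self.cache[bid] = zlib.decompress(self.remote[name][off:off + length])
        return self.cache[bid]

    def write_block(self, idx, data):
        """Rehash the updated block, log it if new, flush once the WAL is full."""
        assert len(data) == BLOCK_SIZE
        bid = hashlib.sha256(data).digest()
        self.block_map[idx] = bid
        if bid not in self.index:     # blocks already uploaded are not logged again
            self.wal[bid] = data
        self.cache[bid] = data
        if len(self.wal) >= WAL_FLUSH_BLOCKS:
            self.flush()

    def flush(self):
        """Compress the WAL's blocks into one chunk and upload it."""
        if not self.wal:
            return
        payload, offsets = b"", {}
        for bid, data in self.wal.items():
            packed = zlib.compress(data)
            offsets[bid] = (len(payload), len(packed))
            payload += packed
        name = "chunk-%d" % len(self.remote)
        # In the proposal, the offsets would travel as object user metadata.
        self.remote[name] = payload
        for bid, (off, length) in offsets.items():
            self.index[bid] = (name, off, length)
        self.wal.clear()
```

For example, after `disk = VirtualDisk(4, {})`, a first write stays in the WAL, a second write with different content triggers an upload of one chunk, and a later write of content that has already been uploaded adds nothing to the WAL.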
Similar to my previous project, sia_vbd will be implemented in Rust and will be made available as a standalone binary and a Docker image with no other dependencies besides renterd and common system libraries.
This project proposal is in response to an RFP found at Sia - Grants.
How does the projected outcome serve the Foundation’s mission of user-owned data?
Sia natively provides an Object Storage interface. My previous project, sia_nfs, added a virtual file system accessible over NFS. Now, with sia_vbd, my aim is to implement a virtual block device on top of Sia’s object storage, providing the missing piece to make Sia a unified storage solution.
sia_vbd allows:
- Use cases that are not served by Object Storage or File System access
- Users to have fully decentralized, globally distributed virtual disks that they can attach, detach, and move around at will
- Virtual disks to be used as native disks for VMs
- A single sia_vbd server to serve an entire network
- Better enterprise integration with workloads that do not fit with the other two storage types
With all three storage types available, users have the flexibility to choose the most suitable storage type for their needs, whether it’s Object Storage, File System, or Block Storage.
Grant Specifics
Amount of money requested and justification with a reasonable breakdown of expenses:
The total amount requested is USD 8,000, which covers:
- 8 weeks of full-time work (320 hours @ USD 25/hour).
No additional equipment is required. Development will take place on the testnet, so no SC are required.
What are the goals of this small grant?
The goal of the grant is to provide sufficient funding for the development of sia_vbd. The time estimate is based on previous experience building sia_nfs, the work I can reuse from that project, specifically renterd_client, and my prior experience in creating a virtual block device with an nbd interface.
Development Timeline:
Two milestones are planned:
- Milestone 1: Version 0.1.0 at the end of week 4. This version will be very basic and largely untested. Basic functionality will be there, but performance is expected to be slow. I/O scheduling will not be optimized, and non-core functions, such as resizing and snapshots, will be absent. Only the basic functionality will undergo testing at this stage.
- Milestone 2: Version 0.2.0 at the end of week 8. I/O scheduling will be optimized, resulting in improved performance. Many use cases will have been tested, including on Windows. Missing functions, such as resizing and snapshots, will be included. Usage documentation and a Docker image will be available. This will be the first generally usable release.
Features & Scope:
- A single, standalone program that runs on every major platform where renterd is available.
- Block Cache & WAL (Write-Ahead Log)
- Basic functionality to create, resize, and delete a block device
- Snapshots and Branching
- nbd support
- Fully open source (Apache-2.0 & MIT licenses), with a public repository on GitHub.
- Basic usage documentation and example configurations.
- A small, standalone Docker image for a simplified user experience
Potential risks that may affect the outcome of the project:
- My previous experience building sia_nfs has shown that data access latencies can vary significantly when reading object data from renterd. I have observed latencies in the 400-500ms range, but also in the 5000ms range, and occasionally even higher. This is likely partly because I was working on the testnet, but it also reflects the inherent nature of a completely decentralized, globally distributed storage network. Many applications are not designed for these latencies, which can seriously limit the practicality of solutions like this one. Furthermore, access patterns such as out-of-order reading/writing, read-ahead, or frequent seeking (especially backwards) have caused major issues when implementing sia_nfs. I spent considerable time developing, implementing, and testing strategies to mitigate these issues and eventually came up with a solution that works well enough in most cases. I have incorporated these lessons into the design of sia_vbd and will implement similar strategies to work around these limitations. However, these fundamental issues exist, and not every use case will work well with sia_vbd.
- nbd does not have the same support as iscsi. nbd is natively available on Linux, can be installed on Windows (via the cloudbase/wnbd driver from the Ceph for Windows project) with some limitations (see cloudbase/wnbd issue #89, "Make the driver signed - to make NBD usable on Windows 11 and up without tons of hassle"), and has very limited macOS support (elsteveogrande/osx-nbd, an NBD client driver for OSX). Interestingly, Apple supports nbd natively in its Virtualization Framework (VZNetworkBlockDeviceStorageDeviceAttachment), but I don’t believe this is useful for most users.
- Compatibility: Although, in theory, any block device should work with any filesystem, this might not always be the case in practice. When I previously implemented a virtual block device several years ago, I developed it for a specific filesystem. The first time I tested it with a different filesystem, it caused an immediate kernel panic. Sometimes implementations rely on subtle details that should not matter in theory but do in practice. Additionally, users are free to use the virtual disk as they please. They can partition it in various ways, build a software RAID, use it with lvm, and much more. I cannot guarantee 100% compatibility in all cases. That said, I will certainly test it against what I believe to be the most common cases, as well as some uncommon ones, and I am fairly confident it will not have too many compatibility issues in practice. However, this is a risk that needs to be acknowledged.
A Word on iscsi
Initially, this project was supposed to be called sia_iscsi and was meant to include support for both iscsi and nbd as access protocols. However, I decided to change my proposal for two reasons:
- A project with a very similar name has been proposed recently. To avoid confusion, I decided to change the name of my project to sia_vbd.
- Risks and scope: The network protocol for iscsi is significantly more complex than nbd. Additionally, I need to emulate a virtual SCSI device (the scsi part in iscsi). SCSI is extensive: the command reference manual alone is over 500 pages long. This essentially constitutes its own project and will require a lot of testing against numerous iscsi initiators, operating systems, file systems, etc.
To keep the scope clear and the risks manageable, I decided to split my initial project into two parts. For now, I am focusing on implementing a working, Sia-backed virtual disk, as described above, and making it available via nbd. Once I have delivered on that, I will submit a proposal for part two: iscsi support. By then, I will also have a better idea of how this needs to be approached than I do now.
Development Information
Will all of your project’s code be open-source?
Yes, the code will be fully open source and will be made available on GitHub (Apache-2.0 & MIT licenses). Furthermore, all libraries used are also open source.
Leave a link where the code will be accessible for review.
A repository will be created on GitHub once the grant is approved.
Do you agree to submit monthly progress reports?
Of course
Contact Info
Email: [email protected]