I agree that this is an important feature to pursue. I’m not sure that it actually requires “small sectors,” though. What people really want is the ability to upload less than 4 MiB at a time, and I think it’s possible to achieve that without switching to a completely variable sector size.
First, some background on sectors (feel free to skip this if you are already familiar):
There are two basic units of storage on Sia: the segment and the sector. A segment is 64 bytes, and it is the smallest “addressable” unit: when you download from a host, you must download a multiple of 64 bytes, and those bytes have to be “aligned” to an offset that is a multiple of 64. That’s because our Merkle trees have a leaf size of 64 bytes.
Relatedly, when a host completes a storage proof, that proof is for a single (randomly-chosen) segment, and this proof must match the Merkle root of the most recent contract revision. But aside from that, consensus does not impose any restrictions on how the host’s data is laid out, or how renters read/write that data. To take this to an extreme, a host could be running a SQL server used by the renter; as long as the renter trusts the host to faithfully update their SQL database, they can make “revisions,” and the host can provide a storage proof to consensus when asked.
In practice, the “trust” part is not easy to solve. We don’t know how to create an efficient SQL database with verifiably-faithful reads and writes. But we do know how to do this with a big flat file made of segments. So that’s the structure that the host uses.
However, segments are kind of unwieldy. Consider storing a 64 MiB file; that’s over 1 million segments. In order to download the file later, we need to know which segments are in it; but if we store the hash of every segment, that’s 32 MiB of metadata! Alternatively, we could store just an offset and a length, identifying which part of the “big flat file” on the host contains our 64 MiB; but then we lose content-addressability, along with other important cryptographic properties.
This is where sectors come in. A sector is just 65536 contiguous segments, totaling 4 MiB of data. Like a segment, a sector can be addressed by its Merkle root. We can also construct Merkle proofs within a sector. Consequently, when we store file metadata, we can store a small number of sector roots rather than a much larger number of segment roots, without any loss of security. Our 64 MiB file now requires just 16 hashes instead of a million. (By the way, our choice of 4 MiB as the sector size was mostly arbitrary, attempting to strike a good balance between small and large files.)
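The metadata savings are easy to check with a bit of arithmetic. A quick sketch (the function names here are illustrative, not from the actual codebase):

```go
package main

import "fmt"

const (
	SegmentSize = 64      // Merkle leaf size, in bytes
	SectorSize  = 4 << 20 // 4 MiB, i.e. 65536 contiguous segments
	HashSize    = 32      // bytes per Merkle root
)

// segmentRootCount and sectorRootCount give the number of roots needed to
// address a file of the given size at each granularity.
func segmentRootCount(fileSize int) int { return fileSize / SegmentSize }
func sectorRootCount(fileSize int) int  { return fileSize / SectorSize }

func main() {
	fileSize := 64 << 20 // the 64 MiB file from above
	fmt.Println(segmentRootCount(fileSize), segmentRootCount(fileSize)*HashSize) // 1048576 roots = 32 MiB of metadata
	fmt.Println(sectorRootCount(fileSize), sectorRootCount(fileSize)*HashSize)   // 16 roots = 512 bytes of metadata
}
```

So sector roots shrink the metadata for this file from 32 MiB to 512 bytes, while Merkle proofs within a sector preserve the ability to verify any individual segment.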
Since sectors are the “unit of practical storage,” we built the host around sectors rather than segments. And the most important design decision here was to store contract data as a “big flat file” made of sectors. This means that sectors must be aligned to 4 MiB boundaries, and that the total size of a contract must be a multiple of 4 MiB. This is where the “minimum file size” comes from: since the host only deals in sectors, if you want to upload less than 4 MiB, you need to add padding.
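The padding requirement amounts to rounding every upload up to the next sector boundary. A minimal sketch (`paddedSize` is a hypothetical helper, not an actual host function):

```go
package main

import "fmt"

const SectorSize = 4 << 20 // 4 MiB

// paddedSize rounds an upload up to the next sector boundary, since the
// host only deals in whole sectors.
func paddedSize(n uint64) uint64 {
	return (n + SectorSize - 1) / SectorSize * SectorSize
}

func main() {
	fmt.Println(paddedSize(1 << 10)) // a 1 KiB upload still occupies 4194304 bytes
}
```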
With all that in mind, let’s go over some potential solutions to the problem.
Option 1: Overhaul the host to operate on variable-size sectors. This forfeits a lot of nice optimizations in the host code that rely on sectors being a fixed size. There’s a very good reason why “block storage” exists: it is vastly simpler and more efficient than variable-size storage. Another issue is that this approach reveals the true size of the renter’s files to the host, which can be a significant privacy concern. (Although I suppose in practice you would still pad to a multiple of 64 bytes, which is not quite as bad.) A further annoyance is that both the renter and the host need to track the size of every sector, which makes lightweight renters more challenging.
Option 2: Add padding on the host side. That is, the renter says “here’s 1 KiB; assume that the rest of the sector is zeros.” The main issue here is that the renter would probably expect to pay less to store this 1 KiB file; after all, 99.97% of the sector will be zeros. But this assumes that the host can store 1 KiB much more efficiently than a full sector, and in order for that to be true, the host would have to violate its nice, efficient, fixed-size storage model. So in practice, hosts might still charge full price for padded sectors, and store fully-padded sectors on their disks. This is obviously a problem if the goal is to store millions of tiny files; it greatly decreases the host’s efficiency. This approach also suffers from the same privacy problem as option 1.
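To put a number on that efficiency loss: if the host actually stores fully-padded sectors, each byte of a tiny file costs thousands of bytes of disk. A quick sketch (assuming, for simplicity, a file size that evenly divides the sector size; `overheadFactor` is an illustrative name):

```go
package main

import "fmt"

const SectorSize = 4 << 20 // 4 MiB

// overheadFactor returns how many bytes of disk each stored byte would cost
// if every file of the given size were padded to a full sector. Assumes
// fileSize evenly divides SectorSize.
func overheadFactor(fileSize uint64) uint64 {
	return SectorSize / fileSize
}

func main() {
	fmt.Println(overheadFactor(1 << 10)) // a 1 KiB file consumes 4096x its size
}
```

A million such 1 KiB files would hold under 1 GiB of real data while consuming nearly 4 TiB of disk, which is why naive host-side padding doesn't scale to lots of tiny files.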
Option 3: Pack multiple files into one sector. This option was described quite well by @meije-storewise, so I won’t reiterate it here. I will note, though, that this is the only option (that I’m aware of) that doesn’t leak filesize metadata to the host.
Option 4: Pack files on the host side. Sort of a fusion of options 2 and 3; the idea is that the host maintains a special “sector buffer”, which only becomes a “true sector” once the renter has uploaded 4 MiB. One upside of this is that it avoids the pricing problem of option 2. I suspect that there is a fair amount of complexity lurking here, though; for example, what happens if you upload half a sector, then upload a full sector? Is the full sector directly written as a “true sector,” or is the first half of it appended to the buffer, then flushed, then the remaining half stored in the buffer?
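For concreteness, here is a sketch of option 4's buffer, taking the "everything goes through the buffer" answer to the half-sector-then-full-sector question; a real host might instead bypass the buffer for aligned full-sector writes, and all names here are hypothetical:

```go
package main

import "fmt"

const SectorSize = 4 << 20 // 4 MiB

// sectorBuffer accumulates uploads of arbitrary size; once 4 MiB has
// accumulated, a complete "true sector" is flushed out.
type sectorBuffer struct {
	buf     []byte
	flushed int // number of true sectors committed so far
}

func (sb *sectorBuffer) upload(data []byte) {
	sb.buf = append(sb.buf, data...)
	for len(sb.buf) >= SectorSize {
		// Commit sb.buf[:SectorSize] as a true sector (storage omitted here).
		sb.buf = sb.buf[SectorSize:]
		sb.flushed++
	}
}

func main() {
	var sb sectorBuffer
	sb.upload(make([]byte, SectorSize/2)) // half a sector: buffered, nothing flushed
	sb.upload(make([]byte, SectorSize))   // full sector: one true sector flushes...
	fmt.Println(sb.flushed, len(sb.buf))  // ...but half a sector remains buffered
}
```

Note the cost this choice implies: the uploaded full sector gets split across two true sectors, so its Merkle root no longer corresponds to any single sector the host stores. That's the kind of lurking complexity I mean.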
I’m not sure which option is best (and there are certainly more I haven’t thought of), but I’d probably go with option 2 if I had to pick one today, in the hopes that an efficient implementation could be engineered.
Anyway, my intent here was to demonstrate that there are a number of ways we can tackle this problem that don’t involve fully-variable sector sizes. So I’d suggest that we call this feature “small file support” rather than “small sector support.”
Would be great for @Taek to weigh in here as well, since there’s probably something I missed.