Small file problem



  • I'm building an application to handle the small file problem. The idea is to pack small files into standard 4MB blocks (which is the smallest chunk handled by Sia, please correct me if I'm wrong) and upload them to the Sia network.

    The disadvantage is that you need this app again to download the files:
    the way they go in is the way they come out.

    Dev platform: .net core
    Initial OS: Windows later Linux & Mac

    I'd like feedback on:

    • how useful it is
    • what features would be nice to have


  • What you're describing is typically called a PC backup program.

    I.e. the use case is that you perform a local backup of your files, then upload them to Sia for offsite storage.

    IMHO your time may be better spent improving the integration of backup products with Sia.



  • Not really... You can think of it as an engine for a new file system on Sia.

    The Sia upload process is this (at least as I understand it):

    • every file is inflated to 4MB - the minimal sector size in Sia
    • then it's inflated 10x more to 40MB - and split back into 10 blocks of 4MB
    • then these 10 blocks are uploaded 3 times to the cloud

    So if you have 100 pictures of 1MB each:

    • 300MB wasted first step
    • 3GB wasted second step
    • 9GB wasted third step

    Final result:
    Storage: 12GB used -> 9GB wasted
    Download: 400MB -> 300MB wasted

    With this tool:

    • 100MB of pictures -> 25 file blocks of 4MB
    • 100MB first step
    • 1 GB second step
    • 3 GB third step

    Final result:
    Storage: 3GB -> no waste
    Download: 100MB -> no waste
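
    The arithmetic in this comparison can be checked with a short sketch. It encodes only the model described in this post (4MB sectors, a 10x inflation step, 3 uploaded copies); those three constants are the poster's assumptions, not confirmed Sia behavior, and the function name is made up for illustration.

```python
# Constants taken from the model described in the post (assumptions,
# not verified against Sia itself). All sizes are in MB.
SECTOR_MB = 4    # minimal sector size
EXPANSION = 10   # "inflated 10x" step
COPIES = 3       # "uploaded 3 times" step


def stored_mb(unit_count, unit_size_mb):
    """Storage used when every uploaded unit is padded to whole sectors."""
    sectors_per_unit = -(-unit_size_mb // SECTOR_MB)  # ceiling division
    padded_mb = unit_count * sectors_per_unit * SECTOR_MB
    return padded_mb * EXPANSION * COPIES


# 100 pictures of 1MB uploaded one by one.
unpacked = stored_mb(100, 1)
# The same 100MB packed into 25 blocks of 4MB.
packed = stored_mb(25, 4)

print(f"unpacked: {unpacked} MB, packed: {packed} MB, "
      f"wasted: {unpacked - packed} MB")
```

    Under these assumptions the sketch reproduces the figures above: 12GB stored without packing, 3GB with packing, so 9GB wasted.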

    You can achieve similar results with archiving, but later we can add features to this tool that will make it more useful:

    • keep track of files / blocks.
    • file labeling
    • file versioning

    This way we try to store and access small files on Sia at minimal cost.
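
    A minimal sketch of the packing idea, assuming the simplest possible layout: files are concatenated into one stream, the stream is cut into fixed-size blocks, and an index records where each file lives. The names `pack`, `unpack`, and `BLOCK_SIZE` are hypothetical and are not the data format of the actual tool; labeling and versioning are left out.

```python
BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB, the block size assumed in this thread


def pack(files, block_size=BLOCK_SIZE):
    """Pack a dict of name -> bytes into fixed-size blocks.

    Returns (blocks, index), where index maps each name to its
    (offset, length) in the concatenated stream.
    """
    stream = bytearray()
    index = {}
    for name, data in files.items():
        index[name] = (len(stream), len(data))
        stream += data
    # Zero-pad the tail so that every block has exactly block_size bytes.
    remainder = len(stream) % block_size
    if remainder:
        stream += bytes(block_size - remainder)
    blocks = [bytes(stream[i:i + block_size])
              for i in range(0, len(stream), block_size)]
    return blocks, index


def unpack(blocks, index, name, block_size=BLOCK_SIZE):
    """Read one file back, touching only the blocks that contain it."""
    offset, length = index[name]
    first = offset // block_size
    last = (offset + length - 1) // block_size
    span = b"".join(blocks[first:last + 1])
    start = offset - first * block_size
    return span[start:start + length]


# Tiny demonstration with an 8-byte block size instead of 4 MiB.
demo_blocks, demo_index = pack({"a": b"hello", "b": b"world!!"}, block_size=8)
print(len(demo_blocks), unpack(demo_blocks, demo_index, "b", block_size=8))
```

    In this layout the index has to be stored somewhere as well (locally or in a block of its own), since without it the blocks cannot be unpacked. That is the "way it goes in is the way it goes out" limitation mentioned at the start of the thread.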



  • I know exactly the process & improvements you have in mind. My use case for Sia is traditional backups, and ...

    @watcher said in Small file problem:

    a new file system in SIA

    ... yours appears to be an online file system.

    I'm sure there are a lot more use cases still that would benefit from your improvement. I just didn't consider them because I'm focused on my narrow use case. Apologies for trying to tell you what to do; go right ahead with whatever you feel is right!



  • Thanks for your response.
    Can you provide more details about the integrations?

    Also, I have two questions:

    1. If I understand correctly, the network is built more around long-term storage: upload large files once, host them, and MAYBE download them a couple of times per year. Do you have any idea whether it can handle a massive number of small file uploads followed by frequent downloads? The hosts' download price can also be a problem.

    2. Do I understand the upload process correctly? I pieced together several comments to understand what happens in the background. Is 4MB the ideal size for optimal storage?

    Thanks for your time.
    watcher


  • Global Moderator

    @watcher
    I think the optimal storage size for Sia is 40MB/40MiB (I forget the exact size).

    You may be interested in these articles:
    https://blog.spaceduck.io/load-test-1/
    https://blog.spaceduck.io/load-test-2/
    https://blog.spaceduck.io/load-test-3/



  • Thanks for the info. Next week I hope to have the new data format ready, plus the ability to pack and unpack files.

