DIY Sia storage farm



  • So first, disclaimer: I'm definitely not a PC builder. But, I am interested in ideas for how to design a storage farm for Sia, with low cost, convenience and the home-user in mind. With mining difficulty on the rise, it seems data farming could become a real second income for a lot of people.

    As for disks, the cheapest storage currently seems to be the 8 TB Seagate Archive disk, which costs $215/ea on Amazon.

    Next, you need a chassis to house these disks.

    You could opt for a Storinator storage pod, based on Backblaze specs. The entry model costs $3,000 and can accommodate 30 x 8 TB drives. So, roughly, your storage cost would work out to $39.38/TB. The initial investment would be about $9,500. At max utilization and $5/TB/mo, your potential income is $1,200/mo, meaning you've paid back your investment in 8 months.
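
    The arithmetic above can be sketched as a small calculator. This is only a rough model: the function name and the `utilization` parameter are made up for illustration, and it ignores power, CPU, RAM and drive failures.

```python
def payback(chassis_usd, drive_usd, drive_tb, n_drives,
            usd_per_tb_month, utilization=1.0):
    """Return (total cost, cost per TB, months to recoup the hardware cost).

    utilization is the fraction of capacity that is actually rented out
    (hypothetical parameter, 1.0 = the "max utilization" case above).
    """
    total_cost = chassis_usd + n_drives * drive_usd
    capacity_tb = n_drives * drive_tb
    monthly_income = capacity_tb * utilization * usd_per_tb_month
    return total_cost, total_cost / capacity_tb, total_cost / monthly_income


# Storinator entry model with 30 x 8 TB drives at $215 each, $5/TB/mo
cost, per_tb, months = payback(3000, 215, 8, 30, 5.0)
print(f"${cost:,.0f} total, ${per_tb:.2f}/TB, payback in {months:.1f} months")

# Same rig if only a quarter of the space is rented
_, _, slow_months = payback(3000, 215, 8, 30, 5.0, utilization=0.25)
print(f"at 25% utilization: {slow_months:.1f} months")
```

    Lowering `utilization` shows how quickly the payback period stretches out while the network is still filling up.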

    If you wanted to run something smaller, you could build your own "pod" using a standard PC chassis. With a couple of 4-port SATA controllers (usually about $15 each) you could perhaps fit 8 (?) HDDs at a cost of roughly $2,500 - $3,000 for 64 TB of storage. That's a start-up cost of about $45/TB, which you could potentially earn back in 9 months with max utilization.

    Most likely though, return on investment would kick in later than above, as utilization will be low initially as the Sia network picks up steam.

    Also, there's the element of competition on Sia. There may be large data farms with existing infrastructure ready to host data at a low price. Finally, some home-users may be on bandwidth limited data plans.

    Any better ideas of how to build a storage pod cheaper?



  • Very interesting, will be looking into this as would really like to build/run a storage farm. Will post my specs and findings when I've done some research.


  • admins

    I don't think that spending $3000 on a pod is necessary. A $50 tower can house 6 drives, though you will need a PSU, CPU, motherboard, and RAM. I'm guessing you can find a tower that's got a lot more than 6 slots for not much more money, and that you could pull the whole rig together for something closer to $1000 if you tried hard enough. Even better if you already have a bunch of spare parts lying around.

    You generally don't want to fill a machine with all the same type of drive, because drive failures don't tend to be random: often multiple drives of the same type will fail together. Instead, you'd probably want a variety of brands and models.

    I originally didn't think that RAID would be necessary for Sia storage, but given the requirement of 95% reliability, and given that hosts will actually be putting up collateral, I think you'd want RAID5 across 6 drives, so a 6x8TB rig would effectively become a 40TB rig. The overhead is pretty small, but you're protected against single drive failures and that should be enough to drive your reliability much higher as long as you can replace the drives within a week.
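
    A quick sketch of the RAID5 capacity math, using the standard rule that a RAID5 array gives up one drive's worth of space to parity and treats every member as if it were the size of the smallest one. The function name is just for illustration.

```python
def raid5_usable_tb(drive_sizes_tb):
    """Usable capacity of a classic RAID5 array: (n - 1) x smallest drive."""
    if len(drive_sizes_tb) < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (len(drive_sizes_tb) - 1) * min(drive_sizes_tb)


drives = [8] * 6                       # the 6 x 8TB rig from the post
usable = raid5_usable_tb(drives)       # 5 data drives' worth of space
overhead = 1 - usable / sum(drives)    # fraction of raw space spent on parity
print(f"{usable} TB usable of {sum(drives)} TB raw ({overhead:.0%} overhead)")

# Mixing models is fine, but mixing sizes wastes space on the larger drives
print(raid5_usable_tb([8, 8, 6, 8]), "TB usable from 30 TB raw")
```

    So if you follow the advice to mix brands and models, it's worth keeping the capacities the same.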

    I don't think bandwidth will be an issue for home users. Sia will be making use of parallelism to overcome hosts with small pipes, and home users often have what is essentially free bandwidth that they can provide. People on datacenter connections often can't offer bandwidth below a certain price because that's what they are getting charged by their providers.

    It does seem like, especially in the early days, storage is going to be super cheap. There's a lot of underutilized storage out there that people are just sitting on, waiting for something like Sia to turn it back into a profitable setup.


  • Global Moderator

    The aim here is to work out the specs of a low-cost yet profitable storage solution, suitable for the home-user.

    With that in mind, although I agree with @Taek on choosing different drive vendors/types, not much seems to beat an 8TB Archive drive on cost. At that price, you could probably still deal with the occasional failure better than you could by using higher-cost/lower-capacity drives, IMO.

    Here's a preliminary "shopping list". Some of the parts, like the motherboard, are perhaps not ideally suited or the cheapest, but they make up a relatively small portion of the total cost anyway.

    Feedback and suggestions welcome!

    Corsair Obsidian Series 750D Tower $115
    12 x Seagate 8TB Archive HDD $215/ea
    3 x HDD extra drive cage $10/ea
    Gigabyte AM3+ AMD 970 motherboard w/6 SATA connectors $95
    AMD FD6350FRHKBOX FX-6350 FX-Series 6-Core Black Edition $120
    2 x Kingston Value RAM 4GB 1600MHz PC3-12800 DDR3 $18/ea
    2 x 4-port SATA controller $16/ea
    Sentey MBP1000-HM 1000W power supply w/ 8 x SATA $99
    15 Pin SATA to 4 SATA Power Splitter Cable $10

    Did I forget anything?

    Total Cost: $3,117
    Total Storage: 96 TB
    Cost / TB: $32.47 (cheaper than a Storinator pod :-))
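
    For anyone who wants to swap parts in and out, here is the list above as a small script that recomputes the totals. Part names are shortened; quantities and prices are copied from the post. The SATA port check assumes every drive gets its own port.

```python
# (description, quantity, unit price in USD) copied from the shopping list
parts = [
    ("Corsair Obsidian 750D tower", 1, 115),
    ("Seagate 8TB Archive HDD", 12, 215),
    ("Extra HDD drive cage", 3, 10),
    ("Gigabyte AM3+ 970 motherboard (6 SATA)", 1, 95),
    ("AMD FX-6350 CPU", 1, 120),
    ("Kingston 4GB DDR3 RAM", 2, 18),
    ("4-port SATA controller", 2, 16),
    ("Sentey 1000W power supply", 1, 99),
    ("SATA power splitter cable", 1, 10),
]

total = sum(qty * price for _, qty, price in parts)
storage_tb = 12 * 8                 # 12 drives x 8 TB
sata_ports = 6 + 2 * 4              # on-board ports + two 4-port controllers

print(f"Total cost: ${total:,}")
print(f"Total storage: {storage_tb} TB")
print(f"Cost / TB: ${total / storage_tb:.2f}")
print(f"SATA ports: {sata_ports} for 12 drives")
```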


  • admins

    It's currently up for debate whether 8GB of RAM would be enough to support hosting 96TB of data. My napkin math (which is really just a wild swing in the dark) puts the estimated need between 5GB and 50GB of RAM. I'm guessing 32GB would be enough, 8GB would not, and 16GB might or might not be. (Assume 750,000 files per TB on the file system, so 96TB -> 72 million files. The filesystem may use anywhere from 50 bytes to 500 bytes per file, and that's all going to go in RAM. It depends on the filesystem, and I actually have no idea how compact they are.)

    Those are constraints we can work on later in the code. There are things we can do to reduce the requirements, but right now you're looking at needing a lot of RAM if you've got a lot of storage.
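
    The napkin math in the parentheses, written out. It only multiplies the figures given in the post (files per TB, bytes of filesystem metadata per file) and is no more reliable than they are. The raw products come out a little under the rounded 5GB to 50GB range, which leaves no room yet for the OS or the host software itself.

```python
FILES_PER_TB = 750_000          # figure from the post


def fs_metadata_gb(storage_tb, bytes_per_file):
    """Filesystem metadata held in RAM, in GB (1 GB = 1e9 bytes)."""
    files = storage_tb * FILES_PER_TB
    return files * bytes_per_file / 1e9


low = fs_metadata_gb(96, 50)     # compact filesystem, 50 bytes per file
high = fs_metadata_gb(96, 500)   # wasteful filesystem, 500 bytes per file
print(f"96 TB -> {96 * FILES_PER_TB:,} files")
print(f"metadata in RAM: {low:.1f} GB to {high:.1f} GB")
```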

    Do you need 1000W power supply? 1000W constantly running will add around $1 per TB per month depending on where you live. I'm assuming it wouldn't be in use the whole time. More napkin math suggests to me that you could get away with 600W.

    If we assume the average drive is going to last 24 months, and the average draw of the whole rig is going to be 500W (at 13 cents per kWh), you get a total cost of about $4200 for 96TB. That's $2 / TB / mo if you are breaking even. At $4 / TB / mo you're making a decent amount of cash.
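
    A rough sketch of that break-even estimate. It assumes the $3,117 hardware total from the shopping list earlier in the thread, 730 hours in an average month, full utilization, and that the whole rig is written off after 24 months. Those are assumptions for the sketch, not figures Sia guarantees.

```python
HOURS_PER_MONTH = 730  # assumption: average month length in hours


def break_even(hardware_usd, avg_watts, usd_per_kwh, lifetime_months, storage_tb):
    """Return (total cost over the lifetime, break-even price in $/TB/mo)."""
    energy_kwh = avg_watts / 1000 * HOURS_PER_MONTH * lifetime_months
    total_cost = hardware_usd + energy_kwh * usd_per_kwh
    return total_cost, total_cost / storage_tb / lifetime_months


total_cost, rate = break_even(3117, 500, 0.13, 24, 96)
print(f"total cost over 24 months: ${total_cost:,.0f}")
print(f"break-even price: ${rate:.2f} / TB / mo")
```

    Plugging in your own electricity price and expected drive lifetime is probably the most useful thing to do with it.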

    Right now redundancy is at 6, but as the network matures (so, a year or two from now), we should be able to bring that down to around 1.5. (we'll reduce it gradually). If hosts are charging $4 / TB / Mo, and redundancy is 1.5, renters are paying $6 / TB / Mo, which is not bad at all. And that's before we start doing crazier optimizations.
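
    The relationship between host price, redundancy and what a renter pays is just a multiplication: each TB a renter stores takes up `redundancy` TB of host space. A minimal sketch using the numbers from this thread (the function name is made up for illustration):

```python
def renter_price(host_usd_per_tb_month, redundancy):
    """Price a renter pays per TB of their own data, per month."""
    return host_usd_per_tb_month * redundancy


for host_price in (1, 4):
    for redundancy in (6, 1.5):
        price = renter_price(host_price, redundancy)
        print(f"host ${host_price}/TB/mo at redundancy {redundancy}: "
              f"renter pays ${price:.2f}/TB/mo")
```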

    It's worth pointing out that in your rig above you've only got 6 SATA connectors, but 12 drives, so that's not going to work. I also think that the price of storage is going to be closer to $1 / TB / Mo (for hosts, so $6 for renters) based on what we've seen from hosts on the Storj and Burstcoin networks. It seems that at payouts of $1 / TB / Mo, you can get thousands of TB on your network. Custom built rigs should probably wait until we've outgrown the supply of spare parts. I'm estimating that the spare parts economy will get us somewhere between 10,000TB and 100,000TB. We'll know that we're running out of spare parts though when the price starts to go up.



  • Started working on a 'spare parts' unit, will share spec when finished...


  • Global Moderator

    And just like that Backblaze announces a new storage pod: https://www.backblaze.com/blog/open-source-data-storage-server


  • Global Moderator

    @Fornax said:

    And just like that Backblaze announces a new storage pod

    at a cost of $36.86/TB, which is still higher than 8 TB drives in a regular full-tower chassis.

    The Backblaze pods have very high density, however, which is required for housing 500 of these pods.

    Bottom line seems to be a home-user could actually be competitive with enterprise storage providers.


  • Global Moderator

    @Taek said:

    Do you need 1000W power supply?

    No. With an estimate of 20W/drive, 600W total should be sufficient.

    It's worth pointing out that in your rig above you've only got 6 SATA connectors, but 12 drives, so that's not going to work.

    I also put in some 4-port SATA PCIe boards. These are cheap ($16/ea).



  • Any thoughts on using external USB hard drive enclosures?

    There are cheap USB hard drive enclosures. Here's one for ~$13, with free shipping. The cost for 30 is ~$400:
    http://www.amazon.com/inch-Silver-External-Drive-Enclosure/dp/B00S0UCEF6/

    You would also need powered USB hubs. Here's a 13-port hub for ~$20, also free shipping if you buy enough. You'd need 3 of them for 30 ports, which costs about $60:
    http://www.amazon.com/dp/B00HL7Z46K/ref=psdc_281413_t1_B0051PGX2I

    That's ~$460 for the hardware that can host 30 hard drives & connect them all to a computer. So it's cheaper than one of the enterprise-class storage enclosures. With the $215/drive cost from in-cred-u-lous you end up at $6910 for a farm fully loaded with drives. That's a cost of ~$29/TB before RAIDing. And a system like this can be added to gradually, as slowly as 1 hard drive at a time. So you can start out very small & grow at any speed.
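
    Since the appeal here is growing one drive at a time, a sketch of the cost at any farm size. It uses the exact per-item prices quoted above ($13 enclosure, $20 hub, $215 drive), so a full 30-drive farm comes to $6,900 rather than the rounded $6,910. It also assumes every hub plugs straight into the computer, which may not hold on a machine with few USB ports.

```python
import math


def usb_farm_cost(n_drives, drive_usd=215, enclosure_usd=13,
                  hub_usd=20, ports_per_hub=13):
    """Total cost in USD of n drives in single-drive USB enclosures plus hubs."""
    hubs = math.ceil(n_drives / ports_per_hub)
    return n_drives * (drive_usd + enclosure_usd) + hubs * hub_usd


for n in (1, 13, 14, 30):
    cost = usb_farm_cost(n)
    print(f"{n:2d} drives: ${cost:,} total, ${cost / (n * 8):.2f}/TB")
```

    The cost per TB barely moves with size, which is the point: there's no big up-front enclosure purchase to amortize.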

    What I don't know:

    • How much processing does a farm controller need to do, say per TB in a fully loaded farm? What kind of CPU is needed to handle it?
    • How many drives can reasonably be farmed before saturating the bandwidth of a single USB port?
    • Is there any cooling needed if you stack 30 USB hard drive enclosures in close proximity? It's surely not an anticipated use.
    • What's the total power usage of the USB hubs in a setup like this? This impacts monthly cost of the farm.
    • Would you need hubs that provide a certain amount of power? Or do all powered USB hubs provide the same amount of power per port?
    • USB drives would need to be software RAIDed; would this hurt performance too much?
    • Would it be better/cheaper to use a USB->SATA adapter? Are a set of towers purely used to mount drives cheaper than 30 USB hard drive enclosures?

    So tempting to just get a Raspberry Pi farming with an array of USB drives connected to it. Not sure if it's possible. I see from Taek the memory cost to host a farm is high; would it work if most of it was in virtual memory? You could set aside a sufficient portion of a large SD card as swap space.



  • Consider remanufactured brand-name hardware like this, it's dirt cheap:
    http://www.serverhome.nl/storage/nas-server/hp-proliant-dl180g6-14.html
    If you want to order in bulk, let me know



  • HP ProLiant DL180G6
    Total: € 692,90
    Incl. tax: € 838,42
    HP ProLiant DL180G6 mainboard / chassis
    HP ProLiant DL180G6 rackserver
    CPU
    1 x Heatsink HP ProLiant DL180G6 P/N: 507247-001
    1 x 2.26GHz / Quad Core / QPI 5.86 GT/s / Cache 8M / TDP 60W / Xeon L5520
    Memory
    6 x 8GB 2Rx4 PC3-10600R DDR3-1333 ECC, Samsung
    Remote Access - iLO
    No iLO
    Raid / Storage Controller
    No raid. Attention: this server has no Harddisk / RAID controller.
    HP SmartArray Memory
    No raid memory
    HP SmartArray BBWC Battery
    No raid battery
    Harddisk
    No harddisk
    Bracket / Caddy / Tray
    12 x Harddisk Bracket 3.5" SAS / SATA Type HP ProLiant G1 - G7 : ML110G7, ML150G5, DL320G4, DL360G5, DL360G7, DL380G6, DL380G7, etc. incl. 4 screws P/N : 373211-001, 373211-002, 335537-001 To mount your own harddisk. Delivery time: 7 days
    PSU / Power Supply Unit
    2 x HP HSTNS-PL18 Power Supply, 750W, P/N: 506822-201, 506821-001, 511778-001



  • Looks like the enclosure I linked actually powers itself. First one I looked at was powered just by the USB port. So with powered enclosures maybe you could use unpowered USB hubs. Then you'd need power strips to give you enough electric outlets to feed all the enclosures.


  • Global Moderator

    @coinmonkey Buying refurbished equipment is a good option, though potential buyers should know that the maximum storage/drive capacity of "legacy" equipment might be limited (up to 42 TB, or just 5 TB drives, for the model series you link to).

    PS: Could you please remove your second post, however, as it just adds noise to this thread. A link is sufficient. Thanks!


  • Global Moderator

    @HoteiLife said:

    Any thoughts on using external USB hard drive enclosures?

    For convenience, I would personally prefer a single enclosure, like a full-size tower or something. If you need to expand beyond what a tower or rack module can hold, however, external enclosures are a good idea. Are there enclosures that fit multiple drives but do not force RAID on you, i.e. 1 volume per drive?



  • @in-cred-u-lous said:

    For convenience, I would personally prefer a single enclosure, like a full-size tower or something. If you need to expand beyond what a tower or rack module can hold, however, external enclosures are a good idea. Are there enclosures that fit multiple drives but do not force RAID on you, i.e. 1 volume per drive?

    There are, but they all seem to be more expensive per drive. Even the USB 2.0 enclosures without RAID. Here's a 4 drive enclosure for $380, which is $95/drive:
    http://www.newegg.com/Product/Product.aspx?Item=9SIA0AJ35C4582

    There's an 8 drive enclosure for $300, which makes $37.50/drive, still over the $13/drive for single drive enclosures:
    http://www.newegg.com/Product/Product.aspx?Item=9SIA8T93RU6064

    For whatever reason the single drive enclosures are seemingly the cheapest option by a large margin.


  • Global Moderator

    @HoteiLife said:

    For whatever reason the single drive enclosures are seemingly the cheapest option by a large margin.

    An alternative to USB enclosures is external SATA enclosures, which are a lot faster:
    http://smile.amazon.com/Vantec-Inches-Aluminum-Mobile-MRK-M2512T/dp/B00IAUP3OK ($9 / drive)
    http://smile.amazon.com/Sans-Digital-HDDRACK5-5-Bay-Organizing/dp/B001LF40KE ($7.50 / drive)

    Edit: I read the reviews of these and the first option sounds really bad and probably best avoided. Maybe there are better options but I could find none.

    The first option is for 2.5" drives, which can be useful if you have a bunch of these around (from old laptops, external drives etc). You can't daisy-chain SATA drives though, so you still need controller boards with additional eSATA/SATA breakouts.



  • @in-cred-u-lous said:

    An alternative to USB enclosures is external SATA enclosures, which are a lot faster:
    http://smile.amazon.com/Vantec-Inches-Aluminum-Mobile-MRK-M2512T/dp/B00IAUP3OK ($9 / drive)
    http://smile.amazon.com/Sans-Digital-HDDRACK5-5-Bay-Organizing/dp/B001LF40KE ($7.50 / drive)

    The first option is for 2.5" drives, which can be useful if you have a bunch of these around (from old laptops, external drives etc). You can't daisy-chain SATA drives though, so you still need controller boards with additional eSATA/SATA breakouts.

    Nice, with cooling & designed to be stackable. I like it. I think you can get USB to SATA adapters. But what they would cost & how many you could get connected total I don't know.



  • Guess you can also get splitters to share single SATA ports among multiple drives. Again at what point the SATA channel becomes saturated is a consideration. Here's a 1-to-4 splitter:
    http://smile.amazon.com/Cable-Matters-Internal-Mini-SAS-Breakout/dp/B012BPLYJC/


  • Global Moderator

    @HoteiLife said:

    I think you can get USB to SATA adapters. But what they would cost & how many you could get connected total I don't know.

    The ones I've seen are all 1-to-1 and I'm not sure they're bidirectional either... In any case, you lose the speed advantage of SATA.



  • @HoteiLife said:

    Guess you can also get splitters to share single SATA ports among multiple drives. Again at what point the SATA channel becomes saturated is a consideration. Here's a 1-to-4 splitter:
    http://smile.amazon.com/Cable-Matters-Internal-Mini-SAS-Breakout/dp/B012BPLYJC/

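    Guess that's not right. It's a Mini-SAS to SATA breakout cable (SAS being Serial Attached SCSI), so it fans out a SAS controller port rather than splitting a single SATA port.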

