DIY Sia storage farm
-
I don't think that spending $3000 on a pod is necessary. A $50 tower can house 6 drives, though you will need a PSU, CPU, motherboard, and RAM. I'm guessing you can find a tower that's got a lot more than 6 slots for not much more money, and that you could pull the whole rig together for something closer to $1000 if you tried hard enough. Even better if you already have a bunch of spare parts lying around.
You generally don't want to fill a machine with all the same type of drive, because drive failures don't tend to be random: often multiple drives of the same type will fail together. Instead, you'd probably want a variety of brands and models.
I originally didn't think that RAID would be necessary for Sia storage, but given the requirement of 95% reliability, and given that hosts will actually be putting up collateral, I think you'd want RAID5 across 6 drives, so a 6x8TB rig would effectively become a 40TB rig. The overhead is pretty small, but you're protected against single drive failures and that should be enough to drive your reliability much higher as long as you can replace the drives within a week.
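For anyone who wants to check the RAID5 numbers, here is a minimal Python sketch. It assumes the standard RAID5 layout, where one drive's worth of capacity goes to parity; the function name is made up for illustration.

```python
def raid5_usable_tb(num_drives, drive_tb):
    """Usable capacity of a RAID5 array: one drive's worth of space holds parity."""
    if num_drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (num_drives - 1) * drive_tb


raw = 6 * 8                      # 6 x 8TB drives
usable = raid5_usable_tb(6, 8)   # 5 drives' worth of data
overhead = 1 - usable / raw      # fraction of raw capacity spent on parity

print(f"{usable} TB usable out of {raw} TB raw ({overhead:.1%} parity overhead)")
```

With 6 x 8TB that gives 40TB usable, so about a sixth of the raw space goes to parity.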
I don't think bandwidth will be an issue for home users. Sia will be making use of parallelism to overcome hosts with small pipes, and home users often have what is essentially free bandwidth that they can provide. People on datacenter connections often can't offer bandwidth below a certain price because that's what they are getting charged by their providers.
It does seem like, especially in the early days, storage is going to be super cheap. There's a lot of underutilized storage out there that people are just sitting on, waiting for something like Sia to turn it back into a profitable setup.
-
The aim here is to work out the specs of a low-cost yet profitable storage solution, suitable for the home-user.
With that in mind, although I agree with @Taek on choosing different drive vendors/types, not much seems to beat an 8TB Archive drive on cost. At that price, you could probably still deal with the occasional failure better than by using alternative higher-cost/lower-capacity drives, IMO.
Here's a preliminary "shopping list". Some of the parts, like the mobo, are perhaps not ideally suited or the cheapest, but they make up a relatively small portion of the total cost anyway.
Feedback and suggestions welcome!
Corsair Obsidian Series 750D Tower $115
12 x Seagate 8TB Archive HDD $215/ea
3 x HDD extra drive cage $10/ea
Gigabyte AM3+ AMD 970 motherboard w/6 SATA connectors $95
AMD FD6350FRHKBOX FX-6350 FX-Series 6-Core Black Edition $120
2 x Kingston Value RAM 4GB 1600MHz PC3-12800 DDR3 $18/ea
2 x 4-port SATA controller $16/ea
Sentey MBP1000-HM 1000W power supply w/ 8 x SATA $99
15 Pin SATA to 4 SATA Power Splitter Cable $10
Did I forget anything?
Total Cost: $3,117
Total Storage: 96 TB
Cost / TB: $32.47 (cheaper than a Storinator pod :-))
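The totals can be re-checked with a short Python sketch. Quantities and prices are copied from the list above; the shortened part names are only labels for illustration.

```python
# (label, quantity, unit price in USD), taken from the shopping list
parts = [
    ("Corsair Obsidian 750D tower", 1, 115),
    ("Seagate 8TB Archive HDD", 12, 215),
    ("HDD extra drive cage", 3, 10),
    ("Gigabyte AM3+ 970 motherboard", 1, 95),
    ("AMD FX-6350 CPU", 1, 120),
    ("Kingston 4GB DDR3 RAM", 2, 18),
    ("4-port SATA controller", 2, 16),
    ("Sentey 1000W power supply", 1, 99),
    ("SATA power splitter cable", 1, 10),
]

total_cost = sum(qty * price for _, qty, price in parts)
total_tb = 12 * 8  # twelve 8TB drives
cost_per_tb = total_cost / total_tb

print(f"Total cost: ${total_cost:,}")
print(f"Total storage: {total_tb} TB")
print(f"Cost / TB: ${cost_per_tb:.2f}")
```

Swapping in other drive prices or counts makes it easy to see how much the drives dominate the total.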
-
It's currently up for debate whether 8GB of RAM would be enough to support hosting 96TB of data. My napkin math (which is really just a wild swing in the dark) puts the estimated need between 5GB and 50GB of RAM. I'm guessing 32GB would be enough, 8GB would not, and 16GB might be enough but might not. (750,000 files per TB on the file system. 96TB -> 72 million files. The filesystem may use anywhere from 50 bytes to 500 bytes per file, and that's all going to go in RAM. It depends on the filesystem, and I actually have no idea how compact they are.)
Those are constraints we can work on later in the code. There are things we can do to reduce the requirements, but right now you're looking at needing a lot of RAM if you've got a lot of storage.
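The RAM napkin math can be written out as a quick Python sketch. The 750,000 files per TB and the 50 to 500 bytes per file are the guesses from this post, not measured values, and the function name is invented for illustration.

```python
FILES_PER_TB = 750_000  # guess from the post, not a measured value


def fs_metadata_gb(storage_tb, bytes_per_file):
    """Rough RAM needed if every file's filesystem metadata stays in memory."""
    files = storage_tb * FILES_PER_TB
    return files * bytes_per_file / 1e9


files = 96 * FILES_PER_TB
low_gb = fs_metadata_gb(96, 50)    # compact filesystem
high_gb = fs_metadata_gb(96, 500)  # bulky filesystem

print(f"{files:,} files -> between {low_gb:.1f} GB and {high_gb:.1f} GB of RAM")
```

Plugging in the raw numbers gives roughly 3.6GB to 36GB, the same ballpark as the range quoted above, and it leaves out whatever the OS and the host software need on top.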
Do you need a 1000W power supply? 1000W constantly running will add around $1 per TB per month depending on where you live. I'm assuming it wouldn't be in use the whole time. More napkin math suggests to me that you could get away with 600W.
If we assume the average drive is going to last 24 months, and the average draw of the whole rig is going to be 500W (at 13 cents per kWh), you get a total cost of about $4200 for 96TB. That's $2 / TB / mo if you are breaking even. At $4 / TB / mo you're making a decent amount of cash.
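Here is the same power and break-even arithmetic as a small Python sketch. Two assumptions are not stated in the thread: the hardware cost is taken to be the $3,117 from the shopping list, and a month is averaged to 730 hours (8,760 hours per year / 12).

```python
HOURS_PER_MONTH = 8760 / 12   # 730 hours, averaged over a year (assumption)
PRICE_PER_KWH = 0.13          # USD, from the post
STORAGE_TB = 96
HARDWARE_COST = 3117          # USD, assumed to be the shopping-list total
LIFETIME_MONTHS = 24          # assumed average drive lifetime, from the post


def electricity_cost(avg_watts, months):
    """Cost in USD of drawing avg_watts continuously for the given months."""
    kwh = avg_watts / 1000 * HOURS_PER_MONTH * months
    return kwh * PRICE_PER_KWH


# A 1000W supply running flat out, expressed per TB per month
full_load_per_tb = electricity_cost(1000, 1) / STORAGE_TB

# Hardware plus 24 months of a 500W average draw
total_cost = HARDWARE_COST + electricity_cost(500, LIFETIME_MONTHS)
break_even = total_cost / STORAGE_TB / LIFETIME_MONTHS

print(f"1000W flat out: ${full_load_per_tb:.2f} / TB / mo")
print(f"24-month total: ${total_cost:.0f}")
print(f"Break-even: ${break_even:.2f} / TB / mo")
```

Under those assumptions it comes out at just under $1 / TB / mo for a constant 1000W, a bit over $4200 total, and a break-even a little under $2 / TB / mo.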
Right now redundancy is at 6, but as the network matures (so, a year or two from now), we should be able to bring that down gradually to around 1.5. If hosts are charging $4 / TB / Mo, and redundancy is 1.5, renters are paying $6 / TB / Mo, which is not bad at all. And that's before we start doing crazier optimizations.
It's worth pointing out that in your rig above you've only got 6 SATA connectors, but 12 drives, so that's not going to work. I also think that the price of storage is going to be closer to $1 / TB / Mo (for hosts, so $6 for renters) based on what we've seen from hosts on the Storj and Burstcoin networks. It seems that at payouts of $1 / TB / Mo, you can get thousands of TB on your network. Custom built rigs should probably wait until we've outgrown the supply of spare parts. I'm estimating that the spare parts economy will get us somewhere between 10,000TB and 100,000TB. We'll know that we're running out of spare parts though when the price starts to go up.
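The renter-side prices follow from multiplying the host price by the redundancy factor. A tiny Python sketch, with a function name invented for illustration:

```python
def renter_price(host_price_per_tb_month, redundancy):
    """What a renter pays per TB per month when data is stored `redundancy` times over."""
    return host_price_per_tb_month * redundancy


future = renter_price(4, 1.5)  # hosts at $4 / TB / Mo, redundancy 1.5
today = renter_price(1, 6)     # hosts at $1 / TB / Mo, redundancy 6

print(f"Redundancy 1.5 at $4: ${future:.2f} / TB / Mo for renters")
print(f"Redundancy 6 at $1:   ${today:.2f} / TB / Mo for renters")
```

Both scenarios land on the same $6 for renters, which is why lower redundancy leaves room for higher host payouts.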
-
Started working on a 'spare parts' unit, will share spec when finished...
-
And just like that Backblaze announces a new storage pod: https://www.backblaze.com/blog/open-source-data-storage-server
-
@Fornax said:
And just like that Backblaze announces a new storage pod
at a cost of $36.86/TB, which is still higher than 8 TB drives in a regular full-tower chassis.
The Backblaze pods have very high density, however, which is required for housing 500 of these pods.
Bottom line seems to be a home-user could actually be competitive with enterprise storage providers.
-
@Taek said:
Do you need a 1000W power supply?
No. With an estimate of 20W/drive, 600W total should be sufficient.
It's worth pointing out that in your rig above you've only got 6 SATA connectors, but 12 drives, so that's not going to work.
I also put in some 4-port SATA PCIe boards. These are cheap ($16/ea).
-
Any thoughts on using external USB hard drive enclosures?
There are cheap USB hard drive enclosures. Here's one for ~$13, with free shipping. The cost for 30 is ~$400:
http://www.amazon.com/inch-Silver-External-Drive-Enclosure/dp/B00S0UCEF6/
You would also need powered USB hubs. Here's a 13-port hub for ~$20, also free shipping if you buy enough. You'd need 3 of them for 30 ports, which costs about $60:
http://www.amazon.com/dp/B00HL7Z46K/ref=psdc_281413_t1_B0051PGX2I
That's ~$460 for the hardware that can host 30 hard drives & connect them all to a computer. So it's cheaper than one of the enterprise-class storage enclosures. With the $215/drive cost from in-cred-u-lous you end up at $6910 for a farm fully loaded with drives. That's a cost of ~$29/TB before RAIDing. And a system like this can be added to gradually, as slowly as 1 hard drive at a time. So you can start out very small & grow at any speed.
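The USB farm numbers can be reproduced with a short Python sketch. It uses the rounded ~$400 enclosure figure from this post (30 x $13 is $390 before rounding) and assumes 8TB drives at $215 each, as in the earlier shopping list. It also assumes each hub plugs straight into the computer, so no hub ports are spent on chaining.

```python
import math

DRIVES = 30
DRIVE_TB = 8              # assumed: the 8TB Archive drives from the earlier list
DRIVE_PRICE = 215         # USD per drive
ENCLOSURES_TOTAL = 400    # USD, rounded figure for 30 enclosures at ~$13 each
HUB_PORTS = 13
HUB_PRICE = 20            # USD per hub

hubs_needed = math.ceil(DRIVES / HUB_PORTS)
hardware_cost = ENCLOSURES_TOTAL + hubs_needed * HUB_PRICE
total_cost = hardware_cost + DRIVES * DRIVE_PRICE
cost_per_tb = total_cost / (DRIVES * DRIVE_TB)

print(f"{hubs_needed} hubs, ${hardware_cost} for enclosures and hubs")
print(f"${total_cost} fully loaded, ${cost_per_tb:.2f} / TB before RAID")
```

Changing DRIVES shows how the cost scales when the farm is grown one drive at a time.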
What I don't know:
- How much processing does a farm controller need to do, say per TB in a fully loaded farm? What kind of CPU is needed to handle it?
- How many drives can reasonably be farmed before saturating the bandwidth of a single USB port?
- Is there any cooling needed if you stack 30 USB hard drive enclosures in close proximity? It's surely not an anticipated use.
- What's the total power usage of the USB hubs in a setup like this? This impacts monthly cost of the farm.
- Would you need hubs that provide a certain amount of power? Or do all powered USB hubs provide the same amount of power per port?
- USB drives would need to be software RAIDed; would this hurt performance too much?
- Would it be better/cheaper to use a USB->SATA adapter? Are a set of towers purely used to mount drives cheaper than 30 USB hard drive enclosures?
So tempting to just get a Raspberry Pi farming with an array of USB drives connected to it. Not sure if it's possible. I see from Taek the memory cost to host a farm is high; would it work if most of it was in virtual memory? You could set aside a sufficient portion of a large SD card as swap space.
-
Consider remanufactured brand-name gear like this, it's dirt cheap:
http://www.serverhome.nl/storage/nas-server/hp-proliant-dl180g6-14.html
If you want to order in bulk, let me know
-
HP ProLiant DL180G6
Total: € 692,90
Incl tax:€ 838,42
HP ProLiant DL180G6 mainboard / chassis
HP ProLiant DL180G6 rackserver
CPU
1 x Heatsink HP ProLiant DL180G6 P/N: 507247-001
1 x 2.26GHz / Quad Core / QPI 5.86 GTs / Cache 8M / TDP 60W Xeon L5520
Memory
6 x 8GB 2Rx4 PC3-10600R DDR3-1333 ECC, Samsung
Remote Access - iLO
No iLO
Raid / Storage Controller
No raid. Attention: this server has no Harddisk / RAID controller.
HP SmartArray Memory
No raid memory
HP SmartArray BBWC Battery
No raid battery
Harddisk
No harddisk
Bracket / Caddy / Tray
12 x Harddisk Bracket 3.5" SAS / SATA Type HP ProLiant G1 - G7 : ML110G7, ML150G5, DL320G4, DL360G5, DL360G7, DL380G6, DL380G7, etc. incl. 4 screws P/N : 373211-001, 373211-002, 335537-001 To mount your own harddisk. Delivery time: 7 days
PSU / Power Supply Unit
2 x HP HSTNS-PL18 Power Supply, 750W, P/N: 506822-201, 506821-001, 511778-001
-
Looks like the enclosure I linked actually powers itself. First one I looked at was powered just by the USB port. So with powered enclosures maybe you could use unpowered USB hubs. Then you'd need power strips to give you enough electric outlets to feed all the enclosures.
-
@coinmonkey Buying refurbished equipment is a good option, though potential buyers should know that the maximum storage/drive capacity of "legacy" equipment might be limited (up to 42 TB, or just 5 TB drives, for the model series you link to).
PS: Could you please remove your second post, however, as it just adds noise to this thread. A link is sufficient. Thanks!
-
@HoteiLife said:
Any thoughts on using external USB hard drive enclosures?
For convenience, I would personally prefer a single enclosure, like a full-size tower or something. If you need to expand what a tower or rack module can hold, however, external enclosures are a good idea. Are there enclosures that fit multiple drives but do not force RAID on you? i.e. 1 volume per drive.
-
@in-cred-u-lous said:
For convenience, I would personally prefer a single enclosure, like a full-size tower or something. If you need to expand what a tower or rack module can hold, however, external enclosures are a good idea. Are there enclosures that fit multiple drives but do not force RAID on you? i.e. 1 volume per drive.
There are, but they all seem to be more expensive per drive. Even the USB 2.0 enclosures without RAID. Here's a 4 drive enclosure for $380, which is $95/drive:
http://www.newegg.com/Product/Product.aspx?Item=9SIA0AJ35C4582
There's an 8 drive enclosure for $300, which makes $37.50/drive, still over the $13/drive for single drive enclosures:
http://www.newegg.com/Product/Product.aspx?Item=9SIA8T93RU6064
For whatever reason the single drive enclosures are seemingly the cheapest option by a large margin.
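A quick Python sketch of the per-drive comparison, using only the prices quoted in this thread (the labels are just for illustration):

```python
# label -> (price in USD, number of drive bays)
enclosures = {
    "single-drive USB enclosure": (13, 1),
    "4-drive enclosure": (380, 4),
    "8-drive enclosure": (300, 8),
}

per_drive = {name: price / bays for name, (price, bays) in enclosures.items()}
cheapest = min(per_drive, key=per_drive.get)

for name, cost in sorted(per_drive.items(), key=lambda item: item[1]):
    print(f"{name}: ${cost:.2f} / drive")
print(f"Cheapest per drive: {cheapest}")
```

This ignores the extra hubs, cables, and power strips that single-drive enclosures need, so the real gap is somewhat smaller.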
-
@HoteiLife said:
For whatever reason the single drive enclosures are seemingly the cheapest option by a large margin.
An alternative option to USB enclosures is external SATA enclosures, which are a lot faster:
http://smile.amazon.com/Vantec-Inches-Aluminum-Mobile-MRK-M2512T/dp/B00IAUP3OK ($9 / drive)
http://smile.amazon.com/Sans-Digital-HDDRACK5-5-Bay-Organizing/dp/B001LF40KE ($7.50 / drive)
Edit: I read the reviews of these and the first option sounds really bad and probably best avoided. Maybe there are better options but I could find none.
The first option is for 2.5" drives, which can be useful if you have a bunch of these around (from old laptops, external drives etc). You can't daisychain SATA drives though so you still need controller boards with additional eSATA/SATA breakouts.
-
@in-cred-u-lous said:
An alternative option to USB enclosures is external SATA enclosures, which are a lot faster:
http://smile.amazon.com/Vantec-Inches-Aluminum-Mobile-MRK-M2512T/dp/B00IAUP3OK ($9 / drive)
http://smile.amazon.com/Sans-Digital-HDDRACK5-5-Bay-Organizing/dp/B001LF40KE ($7.50 / drive)
The first option is for 2.5" drives, which can be useful if you have a bunch of these around (from old laptops, external drives etc). You can't daisychain SATA drives though so you still need controller boards with additional eSATA/SATA breakouts.
Nice, with cooling & designed to be stackable. I like it. I think you can get USB to SATA adapters. But what they would cost & how many you could get connected total I don't know.
-
Guess you can also get splitters to share single SATA ports among multiple drives. Again at what point the SATA channel becomes saturated is a consideration. Here's a 1-to-4 splitter:
http://smile.amazon.com/Cable-Matters-Internal-Mini-SAS-Breakout/dp/B012BPLYJC/
-
@HoteiLife said:
I think you can get USB to SATA adapters. But what they would cost & how many you could get connected total I don't know.
The ones I've seen are all 1-to-1 and I'm not sure they're bidirectional either... In any case, you lose the speed advantage of SATA.
-
@HoteiLife said:
Guess you can also get splitters to share single SATA ports among multiple drives. Again at what point the SATA channel becomes saturated is a consideration. Here's a 1-to-4 splitter:
http://smile.amazon.com/Cable-Matters-Internal-Mini-SAS-Breakout/dp/B012BPLYJC/
Guess that's not right. It's a SCSI-to-SATA splitter.
-
@HoteiLife said:
@HoteiLife said:
Guess you can also get splitters to share single SATA ports among multiple drives. Again at what point the SATA channel becomes saturated is a consideration. Here's a 1-to-4 splitter:
http://smile.amazon.com/Cable-Matters-Internal-Mini-SAS-Breakout/dp/B012BPLYJC/
Guess that's not right. It's a SCSI-to-SATA splitter.
Correct, SATA can't be split/daisychained. You need 1 port on the controller per drive. The cable you linked to is SAS to SATA. I think Intel makes 6-port SAS cards that you could use with these cables to get 4 x 6 = 24 SATA channels on a single board. These boards are ~$250 and are probably ideal for RAID arrangements. Otherwise, the 4-channel SATA controllers I mentioned in my first post would be cheaper.
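Since every drive needs its own port, the port budget is easy to sanity-check. A minimal Python sketch using the numbers from this thread; the ~$250 board price excludes the breakout cables, whose cost isn't given here, and the function name is made up.

```python
def sata_ports(onboard, cards, ports_per_card):
    """Total SATA ports: motherboard ports plus ports on add-in controller cards."""
    return onboard + cards * ports_per_card


drives = 12

# First-post rig: 6 onboard ports plus two 4-port PCIe controllers at $16 each
rig_ports = sata_ports(onboard=6, cards=2, ports_per_card=4)

# SAS option: a 6-port card with a 1-to-4 breakout cable on every port
sas_channels = 6 * 4

cheap_per_port = 16 / 4            # USD per port on the 4-port controllers
sas_per_port = 250 / sas_channels  # USD per channel, board only

print(f"Rig ports: {rig_ports} for {drives} drives (enough: {rig_ports >= drives})")
print(f"SAS board channels: {sas_channels}")
print(f"Per port: ${cheap_per_port:.2f} (4-port card) vs ${sas_per_port:.2f} (SAS board)")
```

So the 12-drive rig has 2 ports to spare, and the cheap 4-port controllers cost less per port than the SAS board.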