Decentralizing Data Ownership on Portals

One of the outstanding issues for Skynet portals is that at the end of the day, the portal owns the contracts that pin the data, which means that the user is ultimately trusting the portal not to delete their data.

We can fix this by having the user buy file contracts off of the portal. @nemo has already written some code which makes this possible, and we could potentially adapt it for the premium portals so that users can buy sovereignty over their data without ever needing to run a Sia node. The contracts / transactions could optionally be verified independently against other portals or against an online explorer, or verified locally using a Sia full node.

Once utreexo is in place, we can go even one step further by extending the Skynet SDK to have a Sia full node run in the background on the web browser, which could itself independently verify the user’s contracts, meaning the user gets to achieve full decentralization without ever touching siacoins, and without ever running a full node. They just need to be a Skynet user talking to a centralized portal, and they get full sovereignty and full decentralization.

Hopefully this once again showcases the incredible power unlocked by Skynet. We’ve only just scratched the surface of what is possible, and as we continue innovating we will keep finding major enhancements that can push the Internet to the next level. The future awaits.

Initially we’d probably approach this by having the user and the portal both pin the file at full redundancy. This would be a little more expensive, but gives the user full control over their data while still giving the portal the ability to maintain a safety net for the user. If the user disappears for too long and their contracts expire, their data will disappear. If the portal is also keeping a copy, the data is safe even if the user is negligent.

A more sophisticated option, which would likely take a significant amount of time to develop, would be to have the portal monitor the user contract. The portal wouldn’t keep any redundancy at all if the user had moved the data onto sovereign contracts, and would instead just monitor the blockchain for renewals. If the portal sees that no renewals occur, it could step in at the last minute and migrate all of the user data onto its contracts, giving the user the same level of protection as in the paragraph above, but at significantly reduced cost to the portal, as the data would only need to be double-replicated for users that fail to keep their data maintained.
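A rough sketch of the "step in at the last minute" decision described above. Everything here is hypothetical: the `WatchedContract` fields, the `needs_rescue` name, and the idea of expressing the safety margin as a number of blocks are assumptions for illustration, not anything the portal code is known to do.

```python
from dataclasses import dataclass


@dataclass
class WatchedContract:
    contract_id: str
    end_height: int  # block height at which the user's contract expires
    renewed: bool    # True once the portal has seen a renewal on-chain


def needs_rescue(contract: WatchedContract, current_height: int, rescue_window: int) -> bool:
    """Return True when the portal should migrate the data onto its own contracts.

    rescue_window is how many blocks before expiry the portal steps in
    (a hypothetical safety margin chosen by the portal operator).
    """
    if contract.renewed:
        return False
    return contract.end_height - current_height <= rescue_window
```

A portal would run a check like this once per block for every user contract it is watching, and would only pay for a second copy of the data when the check fires.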

I kind of touched on this in the first post but it’s worth calling out explicitly - if we update the Skynet SDK to be able to perform contract and full node maintenance in the background of a webapp, we could get to the point where a user satisfies their requirement of “running a node once a month” simply by having the user use any Skynet app which backgrounds a process to perform node maintenance. This creates a significantly smoother experience for the user compared to Sia today, because a user that checks Skynet social media on a daily or weekly basis doesn’t even have to know that all this consensus stuff is happening in the background. From their perspective, it just works.

Being able to use Skynet without ever having to touch SC once Utreexo is out would most definitely be valuable.

Though, I don’t see how this would be useful until Utreexo is out? Unless you have your own full node you can’t maintain or renew contracts; it doesn’t really add much if you can “take control” of your own contracts if they become useless in a matter of weeks.

Maybe I just interpreted your timeframe on this wrong. Fill me in if I did.

You’d be in a position of trusting your portal that the contracts are real. You could also verify those contracts against other portals and against block explorers, which means you can expand your trust set from just trusting the portal to trusting a combination of sources that generally have trust from the community.
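To make "expanding your trust set" a bit more concrete, here is a minimal sketch of checking one contract against several independent sources and requiring a quorum. The lookup callables stand in for whatever portal or explorer queries would actually be used; no real API is assumed, and the function names are made up for this example.

```python
from typing import Callable, Iterable


def confirmations(contract_id: str, sources: Iterable[Callable[[str], bool]]) -> int:
    """Count how many sources report that the contract exists on-chain."""
    count = 0
    for lookup in sources:
        try:
            if lookup(contract_id):
                count += 1
        except Exception:
            # A source that is down or erroring simply does not count as a confirmation.
            continue
    return count


def is_trusted(contract_id: str, sources: Iterable[Callable[[str], bool]], required: int) -> bool:
    """Accept the contract only if at least `required` sources confirm it."""
    return confirmations(contract_id, sources) >= required
```

With three sources and `required=2`, a single lying or compromised source cannot convince you that a fake contract is real.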

Even if you aren’t going and verifying those contracts, you also get protection against a portal that changes its mind later. If a portal grants you data in your own contract and then later decides it wants to revoke that control, it has no power to do so. That closes off attack vectors where a trustworthy resource goes rogue.

So there is significant value to doing this even before utreexo is out. Utreexo would enable full in-browser decentralization; before then, the main advantage you get is the ability to double check your safety with other centralized sources and to protect yourself against good portals that get hacked or otherwise go rogue.

Okay, okay, so you buy the contracts off the portal, check them against siastats or whatever, and then do what? Without a sia node, you can’t do anything with them. And if you do move them to another portal, what do you get out of that?

Is this strictly for migration? Because as a migration feature this could be useful. Though it would be far more useful to just have a reg entry that links to a json file with a list of all your skylinks and the datastructure, so you can pin that recursively instead.

Basically, I still don’t understand the importance of developing something like this without Utreexo.

Once you have the contracts, you can use APIs on the portal (either happening automatically in the background from the SDK or explicitly using a skapp) to have the portal pay for storage as it moves sectors to your contracts. Except for the “do my contracts actually exist?” question, you can do all the basic operations like appending sectors in the web browser today with a minimal amount of code.

Basically, when you want a sector, you would sign an update to the contract, hand that update to the portal, and then the portal would pay to move the sectors around on the host such that they are now owned by your contract. This is a lot simpler than having a full renter. Because you control the signature, the portal can either choose not to update your contract at all, or must update the contract according to your instructions. The portal would be paying your host, and then charging your centralized account with the portal for the expense.

Make sense?

Aaaaaaaaa that hurts my brain. Okay okay

So you send your copy of your contracts with the newly appended addition (alongside the signature) describing what you want done; then the portal takes this contract and does the revision you request. Once this is complete it passes back the new set of contracts.

So for example, you have your original contracts (set a), the data you want to append to your contracts, and the revised contracts (set b) alongside the signature for them. You pass all of this to the portal (because if the portal doesn’t have control over your contracts, it has no reason to have the previous version of your contracts), then the portal does the operation that is required to make this change happen. It then uses its own funds to pay for changes? It then sends back proof of what has been completed.

Is that right? I may be alone on this, but I’m still a tad confused.

I get buying pre-funded contracts for Utreexo so you don’t have to pay in SC. And I also get the advantage to this idea in decentralization, you can use your contracts with any portal. But it seems like a huge amount of overhead. Correct me if I’m wrong.

You have the general idea correct, yes. A user would store the most recent version of their contracts on the Registry (raw, not as skylinks), and then when they want to migrate data onto those contracts they would call an API endpoint on the portal, passing in:

  • the old contract revisions
  • the new contract revisions, signed by the owner but not the hosts

And then the portal would run the updates with the hosts, so the API endpoint would return:

  • the new contract revisions signed by both the owner and the hosts

Overall this should have pretty low overhead: you are basically doing some signing and some hashing, and then most of the rest of the work is being done by the portal. Updates are a single round trip, which means the protocol overall isn’t too bad to implement.
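As a sketch of what that single round trip could look like from the user's side, here is a hypothetical shape for a revision plus a sanity check on what the portal hands back. The `Revision` fields and `check_response` are invented for illustration, and the check only compares fields; a real client would also verify both signatures cryptographically.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Revision:
    contract_id: str
    revision_number: int
    merkle_root: bytes
    owner_sig: Optional[bytes] = None  # set by the user before calling the portal
    host_sig: Optional[bytes] = None   # set once the host has accepted the revision


def check_response(old: Revision, proposed: Revision, returned: Revision) -> bool:
    """Client-side check that the portal applied exactly the revision we signed."""
    return (
        proposed.revision_number > old.revision_number
        and proposed.owner_sig is not None
        and returned.contract_id == proposed.contract_id
        and returned.revision_number == proposed.revision_number
        and returned.merkle_root == proposed.merkle_root
        and returned.owner_sig == proposed.owner_sig
        and returned.host_sig is not None
    )
```

If the check passes, the user would write the returned revision back to the Registry as the new "most recent version" of that contract.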

The hardest part is probably the piece that allows the user to manage what stuff is stored in their contract. You’d need some way for the user to know what they are pinning in their contracts, as well as some way for the user to add and remove things. Specifically, you need a system that tracks “which files correspond with which sectors?”
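One possible shape for that tracking system is a small index with reference counts, so a sector is only dropped from the contract when no pinned file uses it anymore. This is purely a sketch: `PinIndex` is a made-up name, and whether files would ever share sectors in practice is an assumption.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


class PinIndex:
    """Tracks which files (by skylink) map to which sector roots."""

    def __init__(self) -> None:
        self._files: Dict[str, List[str]] = {}
        self._refs: Dict[str, int] = defaultdict(int)

    def add(self, skylink: str, sector_roots: Iterable[str]) -> List[str]:
        """Pin a file; returns the sectors that must be newly added to the contract."""
        if skylink in self._files:
            return []
        roots = list(sector_roots)
        self._files[skylink] = roots
        new_sectors = []
        for root in roots:
            if self._refs[root] == 0:
                new_sectors.append(root)
            self._refs[root] += 1
        return new_sectors

    def remove(self, skylink: str) -> List[str]:
        """Unpin a file; returns the sectors that no remaining file references."""
        freed = []
        for root in self._files.pop(skylink, []):
            self._refs[root] -= 1
            if self._refs[root] == 0:
                del self._refs[root]
                freed.append(root)
        return freed

    def pinned(self) -> List[str]:
        return sorted(self._files)
```

The lists returned by `add` and `remove` are what would drive the next contract revision the user signs.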

I think I am having trouble grasping the fundamental concept.

As I understand it, in order to persist files to Skynet, you need to use a portal. This can be a public portal you trust or one you operate yourself. But a portal is the means of generating and executing the contracts.

How does a user get control over their contracts without operating their own node? Does the user negotiate with hosts directly? Again, how do you do this without a node? And how would this be feasible without Utreexo?

A file contract is composed of several parts. There’s:

  • The Merkle root
  • The siacoin payouts
  • The duration and storage proof window
  • The signatures that control the contract

You need a node to handle the siacoins, but you do not need a node to handle everything else. If you are using a portal’s node but maintaining control of the contract yourself, what this means is that the portal is the one adding siacoins and paying for the operations, but you the user are the one that controls the signatures.

The portal can only update the file contract if it has a valid signature for the changes it wants to make. By keeping control of the private key and ensuring the portal never learns the private key, a user can maintain control of the contracts even though they aren’t running a node, and even though they aren’t the ones supplying the siacoins.

Does that help it make more sense?

That makes more sense. I suppose right now, the portal is signing all the files in addition to remitting payment to hosts, but in this other arrangement, control is held by the user. And this wouldn’t need anything more than a small program to sign contracts, analogous to what every Skynet Registry-powered app already does.

So, under the current arrangement where the portal is both contract-owner and payor, the portal can decide, “you know what, I don’t want this file anymore” and not only would they stop serving it, they could take the extra step of deleting it off of Sia altogether. And unless some other node has pinned the file, the user is basically screwed. But by splitting contract ownership from payment-making responsibilities, a portal can stop serving a file but it would have no power to purge it from Sia. This gives you the option to seamlessly take up business with another portal instead, so long as you do it before the contract expires. Do I have it right?

this is exactly correct

Okay, I get it now.

But does this pose a scalability issue to the portals? If you have to create a fresh set of contracts for each user (let’s say 50 contracts, because you don’t need the full 300 of a portal when it’s just for storage, not for fetching from hosts) and each siad node is limited to 10 contracts a block, that means you can onboard a whole user per hour per node… that seems a tad slow.

That’s a good question. I’m not that worried about portal speed - 10 contracts per block per server isn’t that slow, and the servers could make the contracts in advance, signing over the ownership to a new user later. That means on our current setup, we could do as many as 200 sets of contracts per day, which would mean anywhere from 5,000 to 25,000 paying customers (well over the cost of running these portals) on the current set. So you could just scale out horizontally.
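For anyone who wants to check the arithmetic, here is a small worked calculation. It uses the 50-contracts-per-user and 10-contracts-per-block figures from the posts above and assumes blocks arrive exactly every 10 minutes (Sia's target block time). The node count behind the 200-sets-per-day figure is not stated in the thread, so the last function just shows what that figure would imply under these assumptions.

```python
import math

BLOCK_MINUTES = 10        # assumed: blocks arrive exactly on Sia's target time
CONTRACTS_PER_BLOCK = 10  # per-node limit quoted in the thread
CONTRACTS_PER_USER = 50   # contract set size suggested in the question

BLOCKS_PER_DAY = 24 * 60 // BLOCK_MINUTES  # 144


def minutes_per_user() -> float:
    """Time for one node to form a full contract set for one user."""
    return CONTRACTS_PER_USER / CONTRACTS_PER_BLOCK * BLOCK_MINUTES


def user_sets_per_day(nodes: int) -> float:
    """Contract sets that `nodes` servers can form in a day."""
    return nodes * BLOCKS_PER_DAY * CONTRACTS_PER_BLOCK / CONTRACTS_PER_USER


def nodes_for(target_sets_per_day: float) -> int:
    """Smallest number of nodes that reaches the target rate."""
    return math.ceil(target_sets_per_day / user_sets_per_day(1))
```

Under these assumptions one node forms a contract set in 50 minutes (the "user per hour" figure), which is about 28.8 sets per day, and 200 sets per day works out to roughly seven nodes.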

But if you scale this too far, the Sia network as a whole starts to have trouble and you run into high transaction fee territory. Utreexo will help that some, but on today’s consensus code we’d probably cap out at somewhere between 100,000 and 1,000,000 sovereign users total.

But at that many sovereign users, Siafunds and Skynet monetization will be driving enough revenue to dedicate multiple engineers to on-chain scaling, without needing to take engineers away from other goals.

Right ya makes sense.

I’m not really worried about scalability in regards to contract forming at this point, just wondering if you had any further insights.