Core Development: Utreexo

Deliverable: Utreexo hardfork code implemented, tested, and released
Manpower: 1-3 Foundation engineers, 3-4 contracted engineers
Budget: 300 MS (does not include Foundation engineer salaries)
Timeline: Jan 2021 — Jan 2022

This is a formal proposal for implementing a new consensus algorithm for Sia based on Utreexo. Utreexo is a proposal for Bitcoin that replaces the UTXO set with a compact, hash-based accumulator. This greatly lowers the resource requirements for operating a full node, at the cost of increased bandwidth. Another significant feature of Utreexo is that, due to the small size of the accumulator, fully-validating nodes can be “bootstrapped” using a short message or even a QR code. (Okay, technically they’re more like “forward-validating” nodes, but you get the idea.) Thanks to this bootstrapping, you’ll be able to run a node on a phone, Raspberry Pi, or other low-power device. You’ll also be able to help your friends bootstrap: instead of telling them to start siad and wait a day or two for it to download the entire blockchain, you can just give them your accumulator, and they’ll be instantly synced with no loss of security (as long as they trust you, of course).
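To make the bandwidth trade-off concrete, here is a minimal Go sketch of how a node that holds only a Merkle root can check that an output exists: the spender supplies the output plus a short list of sibling hashes, and the node recomputes the root. The hash function (SHA-256), the leaf/node prefixes, and all names (`leafHash`, `nodeHash`, `verifyProof`) are assumptions for illustration and are not taken from the Utreexo paper or the Sia codebase.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Hash is a 32-byte digest. SHA-256 is an assumption made for this sketch.
type Hash [32]byte

// leafHash and nodeHash use different prefixes so that a leaf can never be
// confused with an interior node.
func leafHash(data []byte) Hash {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

func nodeHash(l, r Hash) Hash {
	buf := make([]byte, 0, 65)
	buf = append(buf, 0x01)
	buf = append(buf, l[:]...)
	buf = append(buf, r[:]...)
	return sha256.Sum256(buf)
}

// verifyProof recomputes the root from a leaf, the leaf's index within its
// tree, and the sibling hashes ordered from the bottom of the tree to the top.
func verifyProof(root Hash, leaf []byte, index uint64, siblings []Hash) bool {
	h := leafHash(leaf)
	for _, s := range siblings {
		if index&1 == 0 {
			h = nodeHash(h, s) // we are the left child
		} else {
			h = nodeHash(s, h) // we are the right child
		}
		index >>= 1
	}
	return h == root
}

func main() {
	// Build a 4-leaf tree by hand.
	outs := [][]byte{[]byte("utxo-0"), []byte("utxo-1"), []byte("utxo-2"), []byte("utxo-3")}
	l0, l1, l2, l3 := leafHash(outs[0]), leafHash(outs[1]), leafHash(outs[2]), leafHash(outs[3])
	n01, n23 := nodeHash(l0, l1), nodeHash(l2, l3)
	root := nodeHash(n01, n23)
	_ = l2 // l2 is recomputed by the verifier from the leaf data

	// Proof for utxo-2 (index 2): its sibling l3, then its uncle n01.
	proof := []Hash{l3, n01}
	fmt.Println(verifyProof(root, outs[2], 2, proof))         // true
	fmt.Println(verifyProof(root, []byte("forged"), 2, proof)) // false
}
```

The verifier stores 32 bytes per tree instead of the full UTXO set; the sibling hashes travelling with each spend are where the extra bandwidth comes from.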

Adopting the Utreexo model in Sia is a major undertaking: I expect it to be the largest single project that the Foundation pursues in 2021. Much of the existing consensus and networking code will need to be heavily modified or replaced entirely. This code has served us well for years now, but we have learned many lessons during that time. Utreexo offers a chance for us to start fresh on an extremely robust foundation. My hope is that the resulting codebase not only meets our own needs, but also serves as a template and an inspiration for the wider crypto community—something that everyone can point to as an example of excellent blockchain engineering.

Work on integrating Utreexo has already begun. I have been making slow progress throughout 2020 towards “minimum viable Utreexo” – that is, the simplest possible blockchain that contains all the core Utreexo functionality. As a result of this work, we’ve learned a lot about what Utreexo consensus looks like in practice, and how to adapt it to Sia’s requirements. The next step in the process is the development of a comprehensive specification, laying out in detail every aspect of Utreexo-based Sia. This includes the accumulator’s Merkle tree structure, the transaction format, new consensus rules, changes to the transaction pool and wallet, changes to the gateway protocol, the hardfork upgrade procedure, and much more. The spec will be developed in cooperation with Skynet Labs.

I expect the specification to be complete in Q1 2021. At this time, we will publish the spec and announce a timeline for the implementation, consisting of anticipated deadlines for various milestones. During the implementation phase, the Foundation will work closely with Skynet engineers to turn the agreed-upon spec into actual code. The Foundation will be publishing monthly updates as to the status of Utreexo (among other projects), allowing the community to track our progress towards future milestones. Once all of the major pieces are in place, we will need to begin a massive testing effort; after all, this will constitute the most significant hardfork in Sia’s history. At this time, we may also solicit professional security audit(s) for the new consensus code. When the code has been tested and audited to our satisfaction, we will deploy it to a public testnet for community testing. Assuming the testnet operates without serious bugs or failures for a reasonable period of time, we will set an activation height for the hardfork and release the hardfork binaries.

I am allotting a period of 1 year for this effort. Software deadlines are notoriously over-optimistic, and so despite having made substantial progress already, I anticipate that there will be many unforeseen challenges and setbacks during the integration process. It’s possible that the work will be complete well before December 2021, but it’s also possible that we barely squeak by and the hardfork does not actually activate until 2022.

I will lead the overall effort, with the assistance of any in-house devs that can be found and trained in time. Our salaries will fall under the larger Core Development budget, so are not counted here. I am budgeting a further 300 MS for contracting, security audits, and other associated costs. This is based on an estimate of $100k for a high-quality security audit and $500k for contracting 3-4 Skynet engineers part-time for one year.


This sounds good to me. Utreexo and actual improvements to the host code have been the allure of the Foundation from the get-go (at least for me), so I see no reason not to put my support behind this.

I know some will take issue with contracting Skynet Labs, but it doesn’t matter much to me.


If you know anyone who is better equipped to build Utreexo for the Sia network, feel free to propose them as an alternative.

I feel they would say this out of principle more than because the team is incompetent. Once again though, I’m 100% cool with it.

What would they say is the more principled way to go about it?

With regards to the actual proposal, I see this as an excellent opportunity to upgrade Sia’s consensus as a whole. Sia was built in 2014 using 2014 tech, and the blockchain space has come a long way since then.

I believe that if the Sia consensus code is re-written carefully, we can achieve the following objectives:

  • Any device can fully validate the blockchain.
  • Blockchain state stored on Skynet instead of locally means a user’s wallet/account/node is accessible from any device. You only have to sync once, ever.
  • Any user can sign a fully validated blockchain, asserting correctness. This can allow other users to bootstrap instantly if enough people they trust have signed / endorsed a particular state.
  • Once bootstrapped, any device can continue to follow along with the blockchain as a full validator.
  • After bootstrapping, a user can do backwards-verification, eventually verifying everything for themselves and becoming a full node.

We should strive to get as much L1 throughput as possible, which means selecting data structures and algorithms intelligently, providing as much room for parallelism as possible, and ensuring that the p2p network for transactions and blocks is efficient.


huh, I hadn’t considered that. The ability to start using the network immediately, while still eventually validating the whole chain, is a powerful idea.


Is this public? I would be pretty interested in seeing the work so far even if it’s in a non-working state.

from a technical perspective it shouldn’t even be that difficult. Utreexo makes it possible to verify each block independently, and then you just need to check that they all link together correctly. If you use SkyDB to store how far backwards you’ve checked, it’s super easy to resume from there and keep going further backwards.
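A minimal Go sketch of that resumable backwards walk, under several assumptions: the `Block` type is a toy, `validate` stands in for whatever independent per-block check Utreexo proofs make possible, and the checkpoint is just an integer held in memory here (in practice it might live in SkyDB, whose API is not shown or assumed).

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

type Hash [32]byte

// Block is a toy block: a link to its parent plus an opaque payload.
type Block struct {
	ParentID Hash
	Payload  []byte
}

// ID hashes the parent link together with the payload.
func (b Block) ID() Hash {
	return sha256.Sum256(append(b.ParentID[:], b.Payload...))
}

// verifyBackwards starts at a height the caller already trusts (checkpoint)
// and walks towards genesis for at most `steps` blocks. For each step it
// checks that the parent links correctly and passes the per-block check.
// It returns the new, lower checkpoint so that work can be resumed later.
func verifyBackwards(chain []Block, checkpoint, steps int, validate func(Block) bool) (int, error) {
	for i := 0; i < steps && checkpoint > 0; i++ {
		parent := chain[checkpoint-1]
		if parent.ID() != chain[checkpoint].ParentID {
			return checkpoint, fmt.Errorf("block %d does not link to block %d", checkpoint, checkpoint-1)
		}
		if !validate(parent) {
			return checkpoint, fmt.Errorf("block %d is invalid", checkpoint-1)
		}
		checkpoint--
	}
	return checkpoint, nil
}

// buildChain creates n correctly linked toy blocks for the demo.
func buildChain(n int) []Block {
	chain := make([]Block, n)
	for i := range chain {
		if i > 0 {
			chain[i].ParentID = chain[i-1].ID()
		}
		chain[i].Payload = []byte(fmt.Sprintf("block %d", i))
	}
	return chain
}

func main() {
	chain := buildChain(5)
	alwaysValid := func(Block) bool { return true } // placeholder check

	// Bootstrapped at the tip (height 4); verify two blocks, then stop.
	cp, err := verifyBackwards(chain, 4, 2, alwaysValid)
	fmt.Println(cp, err) // 2 <nil>

	// Later session: resume from the stored checkpoint and finish.
	cp, err = verifyBackwards(chain, cp, 10, alwaysValid)
	fmt.Println(cp, err) // 0 <nil>
}
```

Once the checkpoint reaches 0, every block back to genesis has been checked and the node has verified the whole chain for itself.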

I thought at length about potential privacy support and hyperscaling support (e.g. STARKs), but ultimately decided that these technologies as of 2020 are still not mature enough to match the Sia engineering philosophy, which is to only use very well understood and very stable primitives (with a potential exception for proof-of-work itself).

I feel like STARKs are within 5 years of being stable enough and advanced enough for our needs, but they’re not quite there yet. This means that Sia will continue to not have any L1 privacy or L1 hyperscaling.

To give a quick definition, hyperscaling means that users no longer need to validate every transaction in every block in order to be full nodes. The easiest way to achieve this currently is to use a STARK which proves that the block is valid. “Blockchain sharding” is another common approach to achieving hyperscaling. There are some unsolved problems with all hyperscaling schemes today, and given that we are a storage community, not a blockchain scalability community, I think we should let other teams tackle those issues for the time being.

Not currently, but I intend to publish it on my personal GitHub as soon as it’s “finished,” which for me means you can run a testnet on it, mine blocks, relay transactions, and send/receive coins using a basic wallet. I expect that there will be a DoS vector or two in the p2p code, but writing perfect p2p code is a non-goal for minimum-viable-Utreexo; writing perfect consensus code is. I’ll also be including a “design doc” with rationale for various decisions, for those interested in cryptocurrency design. For the actual Sia Utreexo implementation, we’ll be reusing or building upon a lot of this code (in particular the consensus code), but elements like the p2p, tpool, and wallet will be completely different and far more robust.

Strong agree; the only reason I’m comfortable migrating to Utreexo is because its accumulator design uses a very old and well-understood structure – a Merkle tree. In fact, it uses the exact Merkle tree variant (a set of unique perfect trees) that Sia has used for years. I don’t boast often, but in this case I believe that we are more qualified to implement Utreexo than just about any team in the world.
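For readers unfamiliar with the "set of unique perfect trees" structure, here is a small Go sketch of appending leaves to such a forest. It works like a binary counter: adding a leaf merges equal-height trees until no two trees share a height. The `Accumulator` type, its field names, and the hashing details are assumptions for illustration, and only additions are shown; deleting spent outputs, which Utreexo also needs, is omitted.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"math/bits"
)

type Hash [32]byte

func leafHash(data []byte) Hash {
	return sha256.Sum256(append([]byte{0x00}, data...))
}

func nodeHash(l, r Hash) Hash {
	buf := make([]byte, 0, 65)
	buf = append(buf, 0x01)
	buf = append(buf, l[:]...)
	buf = append(buf, r[:]...)
	return sha256.Sum256(buf)
}

// Accumulator holds at most one perfect tree per height. Bit h of NumLeaves
// is set exactly when a tree with 2^h leaves exists; its root is Roots[h].
// Entries of Roots whose bit is clear are stale and must be ignored.
type Accumulator struct {
	Roots     [64]Hash
	NumLeaves uint64
}

// Add appends one leaf. While a tree of the current height already exists,
// the two are merged into a tree one level taller, like a carry in binary
// addition.
func (a *Accumulator) Add(data []byte) {
	h := leafHash(data)
	height := uint(0)
	for a.NumLeaves&(uint64(1)<<height) != 0 {
		h = nodeHash(a.Roots[height], h)
		height++
	}
	a.Roots[height] = h
	a.NumLeaves++ // the increment clears the merged bits and sets bit `height`
}

// NumTrees is the number of roots a node must actually store.
func (a *Accumulator) NumTrees() int {
	return bits.OnesCount64(a.NumLeaves)
}

func main() {
	var acc Accumulator
	for i := 0; i < 5; i++ {
		acc.Add([]byte(fmt.Sprintf("output-%d", i)))
	}
	// 5 = 0b101, so there is one 4-leaf tree and one 1-leaf tree.
	fmt.Printf("%d leaves -> %d trees\n", acc.NumLeaves, acc.NumTrees()) // 5 leaves -> 2 trees
}
```

Because the number of trees equals the number of set bits in the leaf count, the stored state grows only logarithmically with the number of outputs.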
