Grant Proposal: Log Store

Project Name: Log Store

Organization: Usher Labs Pty Ltd (ABN 35658656332)

Primary Contact & Lead Developer: Ryan Soury (ryan@usher.so)

Overview

Log Store is a decentralised event log storage solution built on existing, actively used networks such as Streamr, Kyve and, ideally, the Sia Network. Log Store is analogous to Elastic’s Logstash but architected for decentralisation.

Log Store nodes are designed to receive real-time data via the Streamr Network, behave as a cache for this data in order to serve query requests, and then perform Kyve’s storage, proposal and validation mechanisms in order to move data to the Sia Network in a fully decentralised manner.

Purpose

The Usher team is interested in realising the Log Store to enable open and collaborative telemetry for dApps, such that events triggered within websites and mobile apps can have cumulative effects on other dApps or even blockchain Smart Contracts.

There currently does not exist a schema-less storage mechanism for arbitrary log data on a decentralised data network. Log Store aims to become this solution through integration with Sia, whether it be for storage of dApp web events, decentralised network node health checks or IoT device data.

Sia’s very cost-effective storage makes this log storage solution applicable to Web2 platforms and enterprises looking for more cost-effective log management solutions.

The Log Store solution is a component of a much larger decentralised data platform being designed with the support of the Usher team.

Solution

This solution leverages four technologies:

  • Streamr: Necessary for transporting data anywhere. Streamr is a pub/sub network capable of transporting data messages from just about any compute environment, whether a website, mobile app, IoT device or centralised server.
  • Kyve: A two-layer decentralised network with the purpose of archiving data. On the Chain layer, validator nodes secure a Tendermint-based blockchain managing consensus on whether the stored data is valid. On the Protocol layer, up to 50 nodes (that form a Pool), cooperate to submit votes to the chain on whether a bundle proposed by a single node is valid. Each validator can be an independent entity securely participating by staking via Kyve’s platform.
  • Sia: A network to facilitate the pay-as-you-go decentralised data storage.
  • EVM Smart Contract — logstore.sol: A decentralised authority over which data streams are stored by the validator nodes. A value-based stake (in ETH, MATIC, DATA, SIA, etc.) is required to be included in the parameterised Log Store to finance the compute and storage costs.

By integrating these technologies, we form a decentralised data pipeline governed by an EVM Smart Contract, such that Streamr data is moved to Sia for decentralised storage by a Kyve Pool (a decentralised network of nodes).

A Streamr Governance Proposal (SIP-13) for the Log Store solution has already been approved by the Streamr community, with the potential of making this solution the default storage mechanism for the Streamr Network.

Processes

Log Store Node

Each node follows the same deterministic process:

  1. Pull/Refresh data streams from logstore.sol Smart Contract.
  2. Listen to data streams
  3. Move events into a local “bundle”
  4. Kyve Blockchain selects the Node as a Data Bundle “Proposer” or “Validator”
  5. If Proposer
    1. Propose the bundle of data, comprised of events from different streams by uploading it to Sia Storage
    2. Use the Storage ID to submit a proposal for a new bundle to the Kyve Blockchain
  6. If Validator
    1. Download the proposed bundle of data from Sia Storage
    2. Compare foreign proposed bundle with local bundle
    3. Prepare a vote on whether the data bundle is valid
    4. Submit the vote to the Kyve Blockchain
  7. Clear local “bundle”
  8. Repeat Step 1
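
The propose and validate branches (steps 5 and 6) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Log Store implementation: `InMemoryStorage` is a made-up stand-in for Sia storage, and `LogEvent`, `serialiseBundle`, `propose` and `validate` are hypothetical names that do not come from the Kyve, Streamr or Sia libraries. Role selection and vote submission to the Kyve Blockchain are left out.

```typescript
// Hypothetical event shape; not taken from any Streamr or Kyve SDK.
type LogEvent = { streamId: string; timestamp: number; payload: string };

// In-memory stand-in for Sia storage, keyed by a made-up storage ID.
class InMemoryStorage {
  private blobs = new Map<string, string>();
  private nextId = 0;

  upload(data: string): string {
    const id = `storage-${this.nextId++}`;
    this.blobs.set(id, data);
    return id;
  }

  download(id: string): string {
    const data = this.blobs.get(id);
    if (data === undefined) throw new Error(`unknown storage id: ${id}`);
    return data;
  }
}

// Serialise a bundle deterministically so that nodes holding the same events
// produce identical output regardless of arrival order or object key order.
function serialiseBundle(events: LogEvent[]): string {
  const sorted = [...events].sort((a, b) => {
    if (a.timestamp !== b.timestamp) return a.timestamp - b.timestamp;
    if (a.streamId !== b.streamId) return a.streamId < b.streamId ? -1 : 1;
    return 0;
  });
  return JSON.stringify(sorted.map((e) => [e.streamId, e.timestamp, e.payload]));
}

// Step 5: the proposer uploads its local bundle and gets back the storage ID
// that would be submitted with the bundle proposal.
function propose(local: LogEvent[], storage: InMemoryStorage): string {
  return storage.upload(serialiseBundle(local));
}

// Step 6: a validator downloads the proposed bundle and compares it with its
// own local bundle; the boolean is the vote it would submit.
function validate(storageId: string, local: LogEvent[], storage: InMemoryStorage): boolean {
  return storage.download(storageId) === serialiseBundle(local);
}
```

In this sketch a validator votes "valid" only when the proposed bundle matches its local bundle exactly, which is why the serialisation sorts events before comparing. How the real node treats events that arrive late or are missing on one side is not specified in this proposal.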

In the circumstance where a Log Store configuration is insufficiently funded, all Bundles of data on Sia will be scanned, and if unfunded data exists, it will be removed. New Bundles that exclude the unfunded data will be re-uploaded to Sia, and a mapping of Storage (Transaction) IDs will be persisted inside of a Log Store configured especially for the Log Store Network. This means that the Log Store nodes will effectively re-use their own data storage interface to store a history of events indicating data changes, as well as persisting this mapping of updated data identifiers locally.
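
A rough sketch of this pruning step is shown below, assuming each stored event records which Log Store it belongs to. The names `StoredEvent`, `Bundle` and `pruneUnfunded`, and the `isFunded` and `upload` callbacks, are hypothetical and only illustrate the flow described above.

```typescript
// Hypothetical shapes: each event is tagged with the Log Store that pays for it.
type StoredEvent = { logStoreId: string; payload: string };
type Bundle = { storageId: string; events: StoredEvent[] };

// Scan all bundles, drop events belonging to unfunded Log Stores, re-upload the
// affected bundles, and return a mapping of old storage IDs to new ones.
function pruneUnfunded(
  bundles: Bundle[],
  isFunded: (logStoreId: string) => boolean,
  upload: (events: StoredEvent[]) => string
): { bundles: Bundle[]; idMapping: Map<string, string> } {
  const idMapping = new Map<string, string>();
  const result: Bundle[] = [];

  for (const bundle of bundles) {
    const kept = bundle.events.filter((e) => isFunded(e.logStoreId));
    if (kept.length === bundle.events.length) {
      result.push(bundle); // fully funded, left untouched
      continue;
    }
    // Assumption: a bundle with no funded events left is re-uploaded empty.
    // The proposal does not say how that case is handled.
    const newId = upload(kept);
    idMapping.set(bundle.storageId, newId);
    result.push({ storageId: newId, events: kept });
  }

  return { bundles: result, idMapping };
}
```

The returned `idMapping` corresponds to the mapping of Storage IDs that the proposal says would be persisted in a dedicated Log Store.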

logstore.sol

This Smart Contract will allow anyone to create a new decentralised Log Store.

The user experience is like so:

  1. Create a Log Store by submitting a transaction with Streamr identifiers and a financial stake

  2. Receive an NFT representing ownership over the new log store.

  3. When log store funds deplete, top up your funds to continue log storage.

    Funding Log Stores is open to anyone with a vested interest in maintaining a log store.

  4. Burn the log store NFT to delete the log store.

  5. Use the log store NFT to amend the log store.
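
The lifecycle above can be modelled in a few lines of TypeScript. This is an off-chain, in-memory illustration only: the real logstore.sol would be an EVM contract, and `LogStoreRegistry` with its method names, the numeric token ID standing in for the NFT, and balances held as plain integers are all assumptions made for the sketch.

```typescript
// In-memory model only; every name here is hypothetical.
type LogStore = { owner: string; streamIds: string[]; balance: number };

class LogStoreRegistry {
  private stores = new Map<number, LogStore>();
  private nextTokenId = 1;

  // Steps 1 and 2: create a store from stream identifiers and a stake.
  // The returned token ID stands in for the ownership NFT.
  create(owner: string, streamIds: string[], stake: number): number {
    if (stake <= 0) throw new Error("a positive stake is required");
    const tokenId = this.nextTokenId++;
    this.stores.set(tokenId, { owner, streamIds: [...streamIds], balance: stake });
    return tokenId;
  }

  // Step 3: funding is open to anyone, so there is no ownership check.
  topUp(tokenId: number, amount: number): void {
    if (amount <= 0) throw new Error("top-up must be positive");
    this.mustGet(tokenId).balance += amount;
  }

  // Step 4: burning the NFT deletes the store.
  burn(caller: string, tokenId: number): void {
    this.mustOwn(caller, tokenId);
    this.stores.delete(tokenId);
  }

  // Step 5: only the NFT holder may amend the store.
  amend(caller: string, tokenId: number, streamIds: string[]): void {
    this.mustOwn(caller, tokenId).streamIds = [...streamIds];
  }

  get(tokenId: number): LogStore | undefined {
    return this.stores.get(tokenId);
  }

  private mustGet(tokenId: number): LogStore {
    const store = this.stores.get(tokenId);
    if (store === undefined) throw new Error(`no log store for token ${tokenId}`);
    return store;
  }

  private mustOwn(caller: string, tokenId: number): LogStore {
    const store = this.mustGet(tokenId);
    if (store.owner !== caller) throw new Error("caller does not hold the log store NFT");
    return store;
  }
}
```

The asymmetry is the point of the sketch: `topUp` accepts any caller, while `burn` and `amend` check ownership.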

To ensure funds are being re-allocated continuously, the following processes will be included:

  1. A validator node will be selected to notify the Smart Contract when a new dataset/bundle is validated and uploaded.

  2. Within this transaction, an Oracle Network will be tasked to bring validated metadata about the latest dataset onto the Smart Contract.

    This may or may not use Kyve’s GraphQL interface depending on Oracle compatibility.

  3. This data is used to reallocate funds to compensate the validators accordingly. Price Feeds provided by Oracles, such as Redstone, will ensure fee accuracy between disparate cryptocurrencies staked and expensed.

Storage Costs and Kyve Staking will be managed by Node Operators participating in the Kyve Pool. Node Operators will earn funds within Kyve and a proportional amount will be reflected as fees in the EVM Smart Contract.

Fee Calculation

The stake requirement to facilitate a decentralised log store compensates the Kyve Pool Validators. Fees are calculated for each log store. Fees are reallocated after a given dataset/bundle has been stored.

The total fee is the sum of the:

  • fees incurred from data stored
  • validator rewards for facilitating storage
  • treasury fees to fund the maintenance & development of the Log Store solution
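
As a worked illustration of this sum, the sketch below computes a per-bundle fee. The proposal does not define how each component is derived, so the parameters here are assumptions: a flat storage cost per byte, and validator rewards and treasury fees expressed in basis points of the storage fee. Amounts are integers in the smallest unit of the staked currency.

```typescript
// Hypothetical parameters; the proposal leaves the actual rates to governance.
type FeeParams = {
  storageCostPerByte: number; // smallest currency unit per byte stored
  validatorRewardBps: number; // basis points of the storage fee
  treasuryFeeBps: number;     // basis points of the storage fee
};

type FeeBreakdown = {
  storageFee: number;
  validatorReward: number;
  treasuryFee: number;
  total: number;
};

// Total fee = storage fee + validator reward + treasury fee.
function calculateFee(bytesStored: number, params: FeeParams): FeeBreakdown {
  const storageFee = bytesStored * params.storageCostPerByte;
  const validatorReward = Math.floor((storageFee * params.validatorRewardBps) / 10000);
  const treasuryFee = Math.floor((storageFee * params.treasuryFeeBps) / 10000);
  return {
    storageFee,
    validatorReward,
    treasuryFee,
    total: storageFee + validatorReward + treasuryFee,
  };
}

// Example: 1000 bytes at 2 units per byte, 5% validator reward, 1% treasury fee.
const example = calculateFee(1000, {
  storageCostPerByte: 2,
  validatorRewardBps: 500,
  treasuryFeeBps: 100,
});
console.log(example.total); // 2000 + 100 + 20 = 2120
```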

Validator rewards and treasury fees should & will be governable in a decentralised manner. While a new governance token offers a breadth of options for scale and purpose, Kyve already offers a governance platform, and therefore, the first version of the Log Store platform will leverage Kyve’s existing governance technology. This enables validators staking in Kyve’s platform to participate and vote.

Implementation

The objectives of this technology are to:

  • be compatible with Streamr interfaces
  • determine which Streamr streams to store as defined in a Smart Contract
  • ingest real-time data from the Streamr network
  • store ingested data ephemerally – i.e. cache data
  • respond to queries rapidly using this stored data
  • use stored data to produce bundles for validation
  • perform validation and movement of data bundles to Sia as per Kyve’s process
  • request and load data from Sia when incoming Streamr queries request data that has expired from the ephemeral cache
  • re-upload bundles to Sia and persist a mapping of updated identifiers where insufficient funds are detected for a given log store configuration.
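
The cache-then-Sia read path in the objectives above can be sketched as a read-through cache with a time-to-live. This is an illustration under assumptions, not the planned storage plugin: `EphemeralCache` and the `loadFromColdStorage` callback are hypothetical names, the callback stands in for a Sia download, and it is synchronous here only to keep the example short (a real download would be asynchronous).

```typescript
type CachedEntry = { value: string; expiresAt: number };

// Read-through cache: serve fresh entries locally, and fall back to cold
// storage (a stand-in for Sia) once an entry has expired or is missing.
class EphemeralCache {
  private entries = new Map<string, CachedEntry>();
  private ttlMs: number;
  private loadFromColdStorage: (key: string) => string | undefined;

  constructor(ttlMs: number, loadFromColdStorage: (key: string) => string | undefined) {
    this.ttlMs = ttlMs;
    this.loadFromColdStorage = loadFromColdStorage;
  }

  // Cache a freshly ingested value; `now` is passed in to keep the sketch testable.
  put(key: string, value: string, now: number): void {
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
  }

  get(key: string, now: number): string | undefined {
    const entry = this.entries.get(key);
    if (entry !== undefined && entry.expiresAt > now) return entry.value;

    // Expired or never cached: drop the stale entry and load from cold storage.
    this.entries.delete(key);
    const loaded = this.loadFromColdStorage(key);
    if (loaded !== undefined) this.put(key, loaded, now);
    return loaded;
  }
}
```

Re-caching the value loaded from cold storage is a design choice made for this sketch so that repeated queries for old data do not each trigger a download.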

These objectives are achieved by developing a hybridised node comprising the Streamr Broker Node and the Kyve Protocol Node integrated with a new Sia Storage Provider, such that each of these disparate processes is embedded into a single executable binary.

The independent nodes are each developed in Typescript which makes the integration simpler with regard to inter-library dependency. The unification of logic into a single executable keeps Node Operation simple and ensures that Kyve’s KYSOR tool can be used to manage runtime upgradability.

Despite the integration of logic, an effort will be made to retain the disparate libraries’ frameworks. This way any upgrades made by the Kyve or Streamr teams to their corresponding libraries can be applied to the hybrid Log Store node, ideally through dependency versioning.

As the data ingested in the Node is now purposefully ephemeral, Streamr’s default Storage Plugin which depends on the management and uptime of a Cassandra DB will be modified to work with an embedded time-series database while keeping all of the interfacing intact.

The architecture of the Log Store Node can be found here:

Budget & Timeline

The project is requesting $40,000 USD for the delivery of the scope detailed in this grant proposal.

Milestone 1 — Concept to Production

  • Estimated Duration: 6 - 9 weeks
  • Costs: $25,000 USD

This phase includes developing just enough of the solution to demonstrate that it can work within production parameters such that ingestion of data into the node network represents Streamr’s live environment. This demonstration will be emulated through benchmarks against an operable test/staging version of the Node Network.

The objectives within this scope are to develop the:

  1. Log Store Node — which operates the Kyve Pool and includes integration of Sia storage.
  2. logstore.sol Smart Contract
    1. An EVM Compatible Smart Contract that requires an arbitrary financial stake in MATIC, ETH or DATA
    2. A list of Streamr data stream identifiers

| Number | Deliverable | Specification |
| --- | --- | --- |
| 1 | Documentation | We will provide documentation of the code in the form of a README.md and a video tutorial that explains how a Node Operator can engage the Kyve Blockchain to participate, and spin up one of the Log Store nodes. Once the node is up, it will be possible to receive data over the Streamr Network to show how the nodes cooperate to move data to Sia in a decentralised manner. |
| 2 | Unit Testing Suite | Core Node functions and Smart Contract methods will be fully covered by unit tests to ensure functionality and robustness. The documentation will outline the steps required to run unit tests locally. |
| 3 | Architecture Planning and Testing | Preparing an architecture for the node that delivers on the short and long-term objectives of the project. |
| 4 | Sia x Kyve Storage Provider Module | We will create a module compatible with any Kyve Node that allows Sia to be a Storage Provider for the archiving network. |
| 5 | Log Store Node Development | We will develop a binary that incorporates the Kyve, Streamr and Sia processes. |
| 6 | logstore.sol Smart Contract | We will develop an EVM Smart Contract to govern which Streamr data streams are stored on Sia. |
| 7 | Local Node Testing | Testing the Node in a local capacity to ensure a working prototype. While this includes Unit Tests, it also involves the operation of the Node to ensure networking capabilities. |
| 8 | End-to-end Testing & Improvements | End-to-end testing of the Node; operation of the Node in a private network that conforms to a Kyve Pool’s security measures and processes. |
| 9 | Benchmarks & Optimisations | Benchmarks will be captured throughout this testing process and included in the open-source repository. |

The outcome of this development will be a stable Log Store Node capable of operation within a Kyve Pool to deliver on the objectives outlined in the scope. The EVM Smart Contract will be purely programmatically accessible.

Milestone 2 — Improvements, Tooling and Integrations

  • Estimated Duration: 4 weeks
  • Costs: $15,000 USD

This phase will primarily focus on developer tooling, deeper integration into the Streamr tools/UI, integration of Sia technology into the Log Store Node for storage cost optimisation, and further production-grade improvements to the Node and Smart Contract.

This phase will allow all new Streamr developers to simply add Sia as a storage provider for their data through the Streamr UI. A simple UI will also be included to demonstrate activity specifically for the Log Store network. Among other Smart Contract improvements, SIA (ERC20 compatible version) can be added as a payment currency to power storage using the Log Store.

| Number | Deliverable | Specification |
| --- | --- | --- |
| 1 | Node Improvements | Production data gathered from node operation will open dialogue around how to better improve the Node Operation experience, as well as highlight performance optimisations. This line item encapsulates the developments aligned with this objective. |
| 2 | Smart Contract Improvements | While the Smart Contract will be designed to minimise upgrade requirements, these improvements can include new currencies or additional Smart Contracts that extend the logic of the original. |
| 3 | Streamr Tools & UI Integration | Considering the node interfaces with the Streamr Network, it’s important to allow Streamr developers to publish and pull data from the network using the native client libraries and interfaces. |
| 4 | Log Store UI | While integration into Streamr UI can simplify the process of Log Store creation, and Kyve’s UI can indicate the state of the Log Store’s data bundle creation, a Log Store explorer and dedicated UI can add more visibility into how data is managed and the network’s traffic and performance. |

This project is effectively a one-off, bounty-style effort, as monetisation is embedded into the project to keep it maintained once it is in use. Despite this, ongoing funding, ideally from the foundation and other community grant programs or community donors, would accelerate marketing and user acquisition efforts, which in turn will increase data stored on Sia.

As this project is planned to be a component of a much larger decentralised data platform, we hope for this project to serve as a demonstration of Usher’s commitment to the development of decentralised technology, and for this project to catalyse a much deeper partnership.

While it would be ideal to receive the project budget as a lump sum to ensure the most effective delivery of development items, Usher can accept the budget split by milestone, such that the second milestone only proceeds once the first milestone has been delivered and approved.

Future Plans

Log Store is designed as a standalone solution that any network or dApp can use to transport logs and events to a decentralised storage network like Sia. The immediate strategy upon production deployment is to:

  1. Contact teams already using the Streamr Network and offer managed decentralised storage.
  2. Reach out to large dApps and offer open and transparent log-based telemetry that can be safely integrated into open-source projects. This way integrated dApp communities, and the teams behind these dApps have product analytics that is not siloed, centralised and obscured from stakeholders.

Log Store is also designed to be incorporated into a larger decentralised data platform that Usher is supporting the creation of. Once fully developed, this data platform will be capable of using data stored on Sia, through Log Store, as input data for decentralised computation capable of yielding fully verifiable data served to Oracle Networks. This way custom data ingested via Log Store can be combined and cross-verified against Blockchain data in a fully decentralised manner to yield verifiable data Smart Contracts can use to manage digital assets. The purpose of the data platform is to remove any centralised computational components and related security risks involved in using custom data in Smart Contracts to manage assets.

Open Source Commitment

Our philosophy around open-source technology is quite simple. If the software is being operated by others, it should be open source.

Usher has already released open-source libraries to support its existing software services:

  • Github: usherlabs/programs
  • Github: usherlabs/usher.js

The Log Store Node and associated Smart Contracts will also be completely open-sourced, either licensed under MIT, Apache 2, LGPL-3.0 or GPL-2.0/3.0. Suggestions on licensing are also appreciated.

The codebase for the Log Store: GitHub - usherlabs/logstore: Log Store Node

Risks

The risks of this project lie primarily in the fact that it is effectively a composition of independent networks. Each of the foundations maintaining these networks could deviate in development direction from what would be ideal to maintain the Log Store network.

Usher has a direct line of communication with the leadership teams at Streamr and Kyve, which are the two primary networks utilised by this project. Discussions around support, funding and market distribution have already been held with these respective teams. Streamr has already agreed that the Log Store can become the de-facto storage mechanism for all Streamr developers. Kyve has already provided the development resources necessary to expedite the development and is ready to approve a new Pool of Nodes that will operate the Log Store.

Usher is confident that this risk is negated through coordinated partnerships and transparent communication.

Reporting

The project will include benchmarks and unit tests within the open-source code repository to allow the community and committee to evaluate the effectiveness and resilience of the project.

A key objective of the first milestone is to demonstrate through benchmarks that an operable network of nodes can handle Streamr’s production data ingress/egress — which can vary quite drastically on a daily basis.

The second milestone primarily involves UI developments, deeper integrations, and optimisations. Benchmarks will demonstrate how optimisations have improved performance. All other visually observable developments can be demonstrated over a video guide produced by the Usher team.

Conclusion

Usher is healthy and has raised capital to enable research and development. Usher is developing a community coordination software service that will allow anyone to share links and Smart Contracts to collect data, leverage insights, manage rewards and even earn rewards. This development’s focus on data has led the team to explore how dApps can be as transparent in their event emissions as Smart Contracts and Blockchains. Services depend on Smart Contract events to determine when a specific piece of logic has been executed, e.g. NFT transfers. Log Store is a solution to enable this event emission from any software environment, especially user-facing websites and mobile apps, so that a deeply integrated application ecosystem can be built on top. Usher aims to be the first to utilise and integrate the Log Store, enabling dApps to emit events related to a user’s engagement, and subsequently allow anyone to learn how their referred community members actually interact with the integrated dApp.

Hi Ryan, thanks for your proposal! The committee meets every other week - and this was submitted the day before their latest meeting - so they’ll review during their next session on January 24th.

We’ll respond in this thread with their response. Thanks again!

Hello Ryan,

The Committee reviewed your proposal and overall liked how well written it is, but has decided that more information is needed. This is a thorough proposal, and log storage is an interesting use case, but we have some questions and concerns, namely:

  • 6-9 weeks does not seem like enough time for a single developer to implement the described functionality. Is some of the work already complete, and if so, how much?
  • Your Streamr grant was funded for essentially the same bill of work, but with Arweave instead of Sia. Was the Arweave integration completed? Also, given that the Streamr grant was in November, shouldn’t some of the milestones (e.g. developing the Log Store Node) already be complete?

We recommend modifying the scope of the proposal to focus on the Sia x Kyve Storage Provider Module. It is harder to justify providing grant funding for work that does not directly relate to, or benefit, the Sia ecosystem.

If the Sia Provider Module is successfully implemented, however, a follow-up grant to integrate it with the rest of the described system is likely to be approved.

Regards,
Kino on behalf of the Foundation and Grants Committee

Hi @Kinomora,

The full scope of work was proposed to ensure Sia is completely aware of what is being developed, the timeline, and to be an initial sponsor and storage module when realising the very first proof-of-concept (alpha) release.

The request applies only to development that is applicable to Sia (i.e. the Storage module), despite the proposal covering the full scope.
Therefore, the suggestion to amend the line items so that each item references work that impacts Sia directly is very reasonable, and we shall proceed with this approach.

The scope of work is indeed underway. The final architecture has been planned to include Arweave, and the code is being developed. The 6 - 9 week scope covers this development – despite being a very optimistic timeline, the outcome will simply be a usable proof-of-concept (alpha). Improvements will follow as per the second milestone stipulated.

I shall proceed to create a new grant proposal exclusively for the Sia Storage Module after the alpha has been released.

This will also give you some assurance that the outcome is usable.

Otherwise, if this reply gives you enough context to participate as an initial sponsor, we’d be happy to accommodate development time so that the initial release includes a Sia Storage Module.

Keen for your response.

Thanks,

Can we see the code that you are developing now? Is it open-source?

Hello @mike76,

Yes it is.

Please see https://github.com/usherlabs/logstore/tree/feature/two-layer

Thanks,