Small Grant - ObsidianLog

Introduction

Project Name: ObsidianLog

Name of the organization or individual submitting the proposal: Emmanuel GloryPraise

Software engineer with 5+ years of experience building full-stack and blockchain systems, specializing in TypeScript and Rust. At INTMAX (L2 privacy protocol), engineered and maintained SDKs powering the INTMAX Wallet integration layer, enabling developers to build privacy-preserving applications using client-side proving. Experienced in designing end-to-end systems across backend services, data models, and APIs, with a focus on scalability and developer usability. Built and implemented modular smart contract architectures leveraging standards like EIP-2535 (Diamond Standard). Contributed to developer ecosystem growth at Web3Bridge and 0xShip HQ through workshops, conference talks, and technical content, onboarding developers into Web3. Hands-on experience with SDKs and smart contract development across blockchains including EVM chains (Solidity), Polkadot (ink!), Stellar (Soroban), and Arbitrum (Stylus).

GitHub: emmaglorypraise (Glory Praise Emmanuel)

Describe your project.

ObsidianLog is a long-term log archival system built natively on Sia using the official developer SDKs. It sits alongside existing observability stacks - Datadog, Grafana, self-hosted ELK - as a cold-tier destination. Logs flow into hot tools for active monitoring, then automatically archive to Sia: client-side encrypted, zstd compressed, hash-chained for tamper-evidence, and queryable at a fraction of the cost.

The problem it solves is structural: compliance regulations (SOC2, HIPAA, GDPR, PCI-DSS) mandate log retention for 1–7 years, yet AWS CloudWatch costs over $54,000 for three years of storage at moderate scale, and Datadog exceeds $86,000 over the same period. Most engineering teams respond by deleting logs after 30–90 days and hoping auditors never ask. ObsidianLog makes compliance-grade long-term retention affordable: under $1,000 over three years for the same volume, after zstd compression.

This MVP grant funds the foundational layer: a core Sia SDK storage library in Rust, a Vector-compatible HTTP ingest server, a CLI query tool, an interactive obsidianlog init setup wizard that reduces onboarding to under 15 minutes, and a live GitHub Actions demo any developer can fork immediately. ObsidianLog is among the first purpose-built decentralized log archival systems designed for production observability pipelines. There is no comparable tool in the Sia ecosystem, and very few decentralized solutions today address production-grade log archival with compliance-focused features in a developer-friendly way.

By leveraging indexd, ObsidianLog also reduces the operational complexity traditionally required to interact with Sia, enabling developers to archive logs without managing low-level blockchain concerns such as contract formation or storage coordination.

How does the projected outcome serve the Foundation’s mission of user-owned data? What problem does your project solve?

Today, every engineering team’s operational logs containing sensitive system behaviour, security events, user actions, and infrastructure telemetry are owned and controlled by AWS, Datadog, or Elastic. Teams cannot verify tamper-evidence, cannot guarantee retention, and have no meaningful access controls independent of the vendor. When you archive logs to AWS Glacier, Amazon controls the keys, the retention, and the access.

ObsidianLog returns that ownership entirely to the team that generated the data. Encryption keys are generated locally and never transmitted; they exist only on the user’s infrastructure. Storage contracts are formed directly between the user’s Siacoin wallet and Sia hosts, with no intermediary. The builder is entirely out of the payment flow. Every log chunk is append-only and hash-chained: SHA-256(previous chunk) is embedded in each new chunk, creating a cryptographic tamper-evident chain that any auditor can independently verify. If a chunk is altered or deleted, the chain breaks, detectably, provably, without relying on any third party’s word.

This is user-owned data in its most operationally practical form. It is not a theoretical privacy benefit; it directly addresses the compliance exposure that forces engineering teams to choose between deleting evidence and paying enterprise cloud bills. ObsidianLog makes the Sia mission concrete for the many engineering teams that interact with log data every single day.

Are you a resident of any jurisdiction on that list? No

Will your payment bank account be located in any jurisdiction on that list? No


Grant Specifics

Amount of money requested and justification with a reasonable breakdown of expenses:

Total requested: $10,000 for a 3-month MVP development period (May – July 2026).

| Line Item | Detail | Amount |
| --- | --- | --- |
| Developer fees | 3 months work | $10,000 |

Grant payments will be received monthly via ACH/wire in USD.

  • Sia storage costs excluded: ObsidianLog is architecturally designed so that each user provisions and funds their own Sia node with indexd support. The builder incurs zero ongoing storage or hosting costs. The grant funds engineering time only.

What is the high-level architecture overview for the grant? What security best practices are you following?

ObsidianLog is a pipeline with four layers: ingestion, processing, storage and indexing, and query. Every log batch passes through the first three layers on its way into Sia and is read back through the query layer.

The system is designed to align with Sia’s modular architecture, leveraging indexd to simplify interaction with the network while maintaining full user control over data.

1. Ingestion Layer

The primary ingestion integration for the MVP is a local HTTP ingest endpoint. Vector, a fast-growing modern log agent that is increasingly the default choice for new infrastructure builds, routes logs to this endpoint using its built-in HTTP sink. Developers add a single sink block to their vector.toml:

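[sinks.obsidian]
type = "http"
inputs = ["your_source"]               # replace with your own source or transform IDs
uri = "http://localhost:7080/ingest"   # local ObsidianLog ingest endpoint
encoding.codec = "json"
batch.timeout_secs = 300               # flush a batch at least every 5 minutes
batch.max_bytes = 10485760             # or once the batch reaches 10 MiB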

ObsidianLog runs a lightweight local ingest server that receives log batches from Vector at this endpoint. No application code changes are required. Vector handles buffering, backpressure, and retry. If Sia is temporarily unavailable, logs buffer locally and flush when connectivity resumes; no data is lost. A native compiled Vector sink plugin is scoped into Phase 2 once the plugin architecture has been explored in depth during Month 1.
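As a rough illustration of the receiving side, the sketch below shows how a local endpoint might decode a batch posted by Vector. It is written in Python for brevity, whereas the MVP itself is planned in Rust. The names parse_batch and IngestHandler, the in-memory received list, and the decision to accept both a JSON array and newline-delimited JSON are assumptions made for this example, not part of the proposal or of Vector.

```python
import json
from http.server import BaseHTTPRequestHandler


def parse_batch(body: bytes) -> list[dict]:
    """Decode one request body into a list of log events.

    Assumption: the body is either a JSON array or newline-delimited JSON,
    so both shapes are accepted here.
    """
    text = body.decode("utf-8").strip()
    if not text:
        return []
    if text.startswith("["):
        events = json.loads(text)
    else:
        events = [json.loads(line) for line in text.splitlines() if line.strip()]
    # Wrap non-object events so later stages can treat every event as a dict.
    return [e if isinstance(e, dict) else {"message": e} for e in events]


class IngestHandler(BaseHTTPRequestHandler):
    """Hypothetical handler for POST /ingest; not started by this snippet."""

    received: list[dict] = []  # stand-in for the hand-off to the processing layer

    def do_POST(self):
        if self.path != "/ingest":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            events = parse_batch(self.rfile.read(length))
        except ValueError:  # covers malformed JSON and invalid UTF-8
            self.send_error(400, "malformed batch")
            return
        self.received.extend(events)
        self.send_response(200)
        self.end_headers()
```

To try it locally, the handler could be served with http.server.HTTPServer(("127.0.0.1", 7080), IngestHandler).serve_forever(), which matches the uri in the sink configuration above.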

2. Processing Layer

Each log batch passes through a deterministic, ordered pipeline before any data leaves the user’s infrastructure:

  • Parsing: Structured fields (timestamp, service, level, host, trace_id) are extracted into a lightweight metadata index. The index is stored separately and is less than 1% of the full log size. All queries hit the index first; full log chunks are fetched only when needed.
  • Compression: zstd compression is applied to the full log batch. Log text is among the most compressible data types because of its repetitive structure, repeated field names, and predictable patterns. In practice, zstd typically achieves 90–97% reduction on log data, so 100 GB/day of raw logs becomes approximately 3–10 GB stored (100 GB × 0.03 = 3 GB at 97% reduction; 100 GB × 0.10 = 10 GB at 90%).
  • Encryption: The compressed batch is encrypted with AES-256-GCM using a user-generated key. Keys are created during obsidianlog init, stored locally in the user’s keychain or a local secrets file, and never transmitted to any external system. The encryption is authenticated: the GCM tag ensures ciphertext integrity in addition to confidentiality.
  • Hash chaining: Each encrypted chunk is assigned a SHA-256 hash. That hash is embedded in the next chunk’s header: chunk_n.prev_hash = SHA-256(chunk_{n-1}). This creates an append-only tamper-evident chain. Any deletion, reordering, or modification of a chunk breaks the chain at a detectable position. The obsidianlog verify command traverses the full chain and reports any integrity violations.
  • Chunking: Logs are grouped into configurable time windows (default: 1 hour) and written as discrete chunk files. Chunks are named by time window for efficient range retrieval.
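The hash-chaining step above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the planned Rust implementation. It assumes an all-zero prev_hash for the first chunk and a digest computed over the header together with the payload. The Chunk, append, and verify names are hypothetical. The payload is treated as opaque bytes, so compression and encryption are deliberately left out.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

GENESIS = bytes(32)  # assumed prev_hash value for the very first chunk


@dataclass(frozen=True)
class Chunk:
    prev_hash: bytes  # SHA-256 digest of the preceding chunk
    payload: bytes    # compressed-then-encrypted log data, opaque here

    def digest(self) -> bytes:
        # Hash header and payload together so every link also commits to
        # everything that came before it.
        return hashlib.sha256(self.prev_hash + self.payload).digest()


def append(chain: list[Chunk], payload: bytes) -> None:
    """Append a new chunk whose header points at the current chain head."""
    prev = chain[-1].digest() if chain else GENESIS
    chain.append(Chunk(prev, payload))


def verify(chain: list[Chunk], head: Optional[bytes] = None) -> Optional[int]:
    """Return the index of the first broken link, or None if the chain is intact.

    `head` is the chain head recorded elsewhere (for example in a manifest).
    Without it, a change to the final chunk cannot be detected.
    """
    expected = GENESIS
    for i, chunk in enumerate(chain):
        if chunk.prev_hash != expected:
            return i
        expected = chunk.digest()
    if head is not None and expected != head:
        return len(chain)
    return None
```

One consequence of this design is that altering chunk i is reported at chunk i+1, because that is where the stored prev_hash stops matching. The final chunk has no successor, so its integrity depends on comparing against a separately recorded chain head.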

3. Sia Storage & Indexing Layer (via indexd)

Processed chunks are then written to the Sia network. During this write, metadata references are registered and resolved via indexd, which acts as both an indexing layer and a gateway for interacting with the Sia network. This enables efficient discovery and retrieval of stored log chunks without scanning full archives, while also abstracting away lower-level complexities such as contract management and storage coordination.

This aligns naturally with the system’s architecture, where lightweight metadata indexes are queried first before fetching encrypted log data.

The system separates responsibilities clearly:

  • Sia network: durable, decentralized storage of encrypted log chunks

  • indexd: structured indexing and retrieval coordination

  • ObsidianLog: ingestion, processing, encryption, and query logic

The on-Sia storage layout is:

/<bucket>/
  index/
    <service>/<YYYY-MM-DD-HH>.idx   ← metadata index (lightweight, always fetched first)
  chunks/
    <service>/<YYYY-MM-DD-HH>.bin   ← encrypted + compressed log data
  manifest.json                     ← root manifest linking all chunks with hash chain values

All writes are append-only. Chunks are never modified or deleted post-write. The manifest tracks the current chain head and all chunk references.
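To make the layout concrete, the sketch below derives index and chunk paths from a service name and a timestamp, and lists the hourly windows a time-range query would need to touch. It is a Python illustration only. Using UTC for window boundaries is an assumption (the layout above does not specify a timezone), and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def window_key(ts: datetime) -> str:
    """Label of the 1-hour window containing ts, as YYYY-MM-DD-HH (UTC assumed)."""
    if ts.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d-%H")


def chunk_paths(bucket: str, service: str, ts: datetime) -> dict[str, str]:
    """Paths of the index file and chunk file for the window containing ts."""
    key = window_key(ts)
    return {
        "index": f"/{bucket}/index/{service}/{key}.idx",
        "chunk": f"/{bucket}/chunks/{service}/{key}.bin",
    }


def windows_in_range(start: datetime, end: datetime) -> list[str]:
    """Labels of every hourly window overlapping the half-open range [start, end)."""
    if start.tzinfo is None or end.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    # Round the start down to the top of its hour, then step hour by hour.
    cur = start.astimezone(timezone.utc).replace(minute=0, second=0, microsecond=0)
    keys = []
    while cur < end:
        keys.append(window_key(cur))
        cur += timedelta(hours=1)
    return keys
```

With this naming, a query from 22:30 to 01:00 the next day resolves to the 22:00, 23:00, and 00:00 windows, so only those index files need to be fetched.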

Month 1 of the grant is dedicated to hands-on exploration of indexd, including indexing behavior, retrieval performance, and integration patterns under realistic log workloads. Architecture decisions (chunk sizing, index structure, manifest format) will be finalized based on observed system behavior.

4. Query Layer

The CLI queries the metadata index to identify relevant chunks (fast, no large downloads), then fetches and decrypts only the matching chunks. Filters available in the MVP: time range, service name, log level, host, keyword. Output formats: human-readable terminal, JSON, raw (for piping to jq). The obsidianlog verify subcommand traverses and validates the full hash chain.
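The index-first lookup can be sketched as below. This is a Python illustration under assumptions: the index format is not yet finalized in this proposal, so the per-chunk IndexEntry summary and the select_chunks function are hypothetical. Keyword filtering is omitted because it would run only after the selected chunks are fetched and decrypted.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, Optional


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical per-chunk summary kept in the metadata index."""
    chunk: str          # path of the encrypted chunk this entry describes
    service: str
    start: datetime     # earliest event timestamp in the chunk
    end: datetime       # end of the chunk's time window (exclusive)
    levels: frozenset   # log levels present in the chunk
    hosts: frozenset    # hosts present in the chunk


def select_chunks(
    index: Iterable[IndexEntry],
    *,
    since: datetime,
    until: datetime,
    service: Optional[str] = None,
    level: Optional[str] = None,
    host: Optional[str] = None,
) -> list[str]:
    """Return the paths of chunks that may contain matching events."""
    hits = []
    for entry in index:
        # Skip chunks whose window does not overlap [since, until).
        if entry.end <= since or entry.start >= until:
            continue
        if service is not None and entry.service != service:
            continue
        if level is not None and level not in entry.levels:
            continue
        if host is not None and host not in entry.hosts:
            continue
        hits.append(entry.chunk)
    return hits
```

Only the chunks returned here would be downloaded and decrypted, which is what keeps queries from pulling full archives.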

Security practices followed:

  • Client-side encryption only: No plaintext log data ever leaves the user’s infrastructure. Encryption occurs before data is written to the Sia network and registered via indexd.
  • User-controlled keys: Key generation happens locally during obsidianlog init. Keys are stored in the user’s OS keychain (via the keyring crate on Linux/macOS/Windows) or an explicit local secrets file with 0600 permissions. No key escrow, no key transmission.
  • Authenticated encryption: AES-256-GCM provides both confidentiality and ciphertext integrity. Tampered ciphertext is rejected at decryption time.
  • Append-only storage model: Chunks are write-once. The storage model is designed to prevent in-place modification, relying on append-only writes combined with indexd-coordinated retrieval and hash chaining for tamper evidence. This provides strong tamper-evidence without requiring a blockchain.
  • No intermediary in the storage path: ObsidianLog connects to the user’s Sia node with indexd support for indexing and retrieval coordination, without any centralized intermediary. There is no ObsidianLog-operated proxy, gateway, or relay for the self-hosted MVP. The builder has zero access to user data.
  • Dependency audit: The Rust dependency tree will be audited with cargo audit in CI. Critical dependencies (zstd, aes-gcm, sha2) are maintained by the RustCrypto and zstd-rs communities with strong security track records.
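For the local secrets file option, a minimal Python sketch of key generation with 0600 permissions is shown below. It is illustrative only: the MVP is planned in Rust, the function names are hypothetical, and the permission check assumes a POSIX system. The OS keychain path via the keyring crate is not covered here.

```python
import os
import secrets

KEY_LEN = 32  # 256-bit key, as required for AES-256-GCM


def write_key_file(path: str) -> bytes:
    """Generate a random key and store it in a new file readable only by its owner."""
    key = secrets.token_bytes(KEY_LEN)
    # O_EXCL refuses to overwrite an existing key; the 0o600 mode is applied
    # at creation time, so the file is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(key)
    return key


def read_key_file(path: str) -> bytes:
    """Load the key, refusing files that group or other users can access."""
    mode = os.stat(path).st_mode & 0o777
    if mode & 0o077:
        raise PermissionError(f"{path} has mode {mode:o}; expected 600")
    with open(path, "rb") as f:
        key = f.read()
    if len(key) != KEY_LEN:
        raise ValueError("key file has unexpected length")
    return key
```

Refusing to load a key file with loose permissions follows the same convention that SSH clients apply to private keys, and it surfaces misconfiguration early instead of silently using an exposed key.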

What are the goals of this small grant? Please provide a general timeline for completion.

The MVP answers one core question: can real logs flow from a production-grade log agent into Sia (encrypted, compressed, and tamper-evident) and be retrieved accurately through a query interface? Every deliverable in the MVP exists to answer that question with evidence. Everything else is explicitly deferred to Phase 2.

Month 1 - Foundation & Exploration (May 2026)

Goals: Hands-on exploration of indexd APIs and integration patterns. Map failure modes, write latency, host selection behaviour, and contract lifecycle under realistic conditions. Prototype the full compression → encryption → hash-chain pipeline end-to-end against Sia testnet. Finalize storage architecture and chunk layout decisions based on real observations, not assumptions.

Deliverables:

  • Architecture Decision Record (ADR) documenting all storage design choices with rationale
  • Spike code: end-to-end write/read round-trip against Sia testnet (not production-ready, but functionally correct)
  • Documented edge cases and failure scenarios discovered during indexd exploration
  • Month 1 progress report submitted to the Sia Foundation forum

Month 2 - Core Build (June 2026)

Goals: Build the production core storage library based on Month 1 architecture. Build the HTTP ingest server. Achieve 1 GB/hour sustained ingest in load testing. Publish to crates.io.

Deliverables:

  • Public GitHub repository (MIT licensed)
  • obsidianlog-store crate: core storage library with compression, encryption, hash-chain (published to crates.io)
  • obsidianlog-ingest crate: local HTTP ingest server receiving log batches from Vector, with batching, backpressure handling, and structured + unstructured log support (published to crates.io)
  • Integration test suite with CI passing against Sia testnet
  • Month 2 progress report submitted to the Sia Foundation forum

Month 3 - Ship & Launch (July 2026)

Goals: Build the CLI query tool and hash-chain verification command. Build the obsidianlog init wizard. Build and publish the GitHub Actions reusable workflow. Ship cross-platform static binaries. Write full documentation. Record live demo. Public launch.

Deliverables:

  • obsidianlog CLI: static binaries for Linux, macOS, and Windows (published via GitHub Releases)
  • obsidianlog init interactive setup wizard (indexd detection, wallet guidance, key generation, connection test)
  • GitHub Actions reusable workflow (published to GitHub Marketplace)
  • Documentation site
  • Docker Compose quickstart
  • Live demo recording showing the full end-to-end flow
  • Final MVP report with usage metrics

Success criteria (end of Month 3):

  • Logs flow from Vector to Sia end-to-end, typically within 60 seconds from setup completion under normal conditions
  • Archived logs are retrievable via CLI with correct filtering and intact content
  • obsidianlog verify hash-chain check passes on all stored chunks
  • obsidianlog init completes in under 15 minutes on a clean machine
  • GitHub Actions demo is publicly available and forkable
  • At least 10 external developers have tested the tool and provided feedback. Outreach will target the Sia Discord, developer communities, and the Vector community Slack.

Who is the target user for your project?

The MVP is targeted at technically capable developers and small engineering teams. The defining characteristic of this cohort is that they are comfortable with CLIs, config files, and self-hosted infrastructure, including configuring Sia nodes with indexd support.

Primary audience (MVP):

| Segment | Why They Adopt |
| --- | --- |
| Solo developers and indie hackers | Zero existing log retention. ObsidianLog provides compliance-grade archival at effectively zero cost. |
| Web3 / crypto-native teams | Already comfortable with wallet infrastructure. Philosophically aligned with user-owned data. Will trial early. |
| Sia ecosystem developers | Already using Sia. ObsidianLog is a natural addition to their existing workflow. |
| Privacy-first startups | Legal tech, health tech adjacent. Want “we don’t store your logs with Big Tech” as a genuine claim, not marketing. |
| Open source project maintainers | Long-term, redundant archival for CI logs, release artifacts, and project telemetry. GitHub deletes Actions logs after 90 days. |

Secondary audience (Phase 2, post-grant):

Mid-size engineering teams at growth-stage companies. Their primary barrier is not cost or technical capability — it is vendor trust. They need to see ObsidianLog running in production, evidence from other teams using it for compliance, and a track record of Sia network reliability. The grant period is explicitly about earning that trust and building that evidence base.

The positioning is additive, not competitive: ObsidianLog is not competing with Datadog or Grafana for hot log storage and active incident response. It is the cold-tier drain at the bottom of the log pipeline. Teams keep their existing hot tools; ObsidianLog is one additional destination in their Vector config. There is nothing to switch, nothing to migrate, no dashboards to rebuild.


What are your plans for this project following the grant?

ObsidianLog is planned as a three-phase project. The MVP (Phase 1) establishes the foundational layer. Subsequent phases extend it into a complete product and a self-sustaining commercial offering.

Phase 2 - Full Product

Phase 2 extends ObsidianLog from a developer proof-of-concept to a complete product:

  • Fluentd output plugin (enterprise Kubernetes/GCP/AWS standard)
  • Logstash output plugin (legacy enterprise ELK environments)
  • HTTP ingestion endpoint (custom applications and greenfield integrations)
  • Web UI with visual query interface, date range picker, faceted filters, full-text search
  • Compliance export packages: timestamped, signed audit bundles for SOC2, HIPAA, GDPR
  • Retention policy configuration (automatic pruning with policy records)
  • Access audit log (who queried what and when — required for SOC2 CC6.1)
  • Grafana datasource plugin (query archived logs directly within existing Grafana dashboards)
  • Batch import CLI (migrate historical logs from S3, CloudWatch, or local files)

Phase 3 - Commercial

Phase 3 introduces the commercial tier that makes ObsidianLog accessible to non-technical teams and enterprises while remaining self-sustaining:

  • Managed hosted gateway: developers pay in USD via card; Siacoin complexity is fully abstracted; the gateway operator handles SC conversion and absorbs price exposure
  • Enterprise tier: SLA guarantees, SSO integration, dedicated compliance support, volume pricing
  • Stable USD pricing to eliminate Siacoin volatility for budget-sensitive enterprise teams
  • Dashboard: usage analytics, cost projections, storage health monitoring
  • SIEM integrations: Splunk and IBM QRadar forwarding from archived logs

Sustainability model:

The open-source core (storage library, CLI, ingestion plugins) will remain MIT-licensed permanently. Community contributions keep the project alive independent of commercial outcomes. Revenue from the managed gateway and enterprise tier funds ongoing development. Because the builder incurs zero infrastructure costs in the self-hosted model, the project is sustainable even at zero commercial revenue during the post-grant open-source phase.


Potential risks that will affect the outcome of the project:

1. Cross-platform static binary shipping
Distributing a single Rust binary that works reliably on Linux, macOS, and Windows — including ARM variants — requires careful toolchain setup, especially around OpenSSL and system library linking.

Mitigation: Static linking via musl (Linux), x86_64-apple-darwin + aarch64-apple-darwin (macOS), and x86_64-pc-windows-gnu (Windows) targets will be established in CI during Month 2, not left to Month 3. The release pipeline is built before it is needed.

2. Onboarding friction
The self-hosted Sia + indexd setup requires more steps than traditional SaaS tools like Datadog. This is the biggest adoption risk for the initial cohort.

Mitigation: The obsidianlog init wizard (Month 3 deliverable) is designed to reduce setup from a multi-hour process to under 15 minutes on a clean machine. A Docker Compose quickstart that spins up indexd alongside ObsidianLog will be published as an alternative path for developers who prefer containerized workflows.

3. Sia network availability at query time
Users querying archived logs depend on Sia hosts being available.

Mitigation: ObsidianLog uses 3x redundancy via indexd’s default replication. The obsidianlog verify command provides a health check that confirms archive integrity and host availability on demand. Compliance use cases — the primary driver — require that archived evidence is retrievable at audit time, not necessarily in real-time. Sia’s durability model (redundant storage across independent hosts) is well-suited to this access pattern.

4. Indexd maturity and integration risk
Indexd is an evolving component of the Sia ecosystem, and its behavior under sustained log ingestion workloads is not yet fully characterized.

Mitigation: Month 1 is dedicated to validating indexd’s capabilities and constraints under realistic workloads. The system will be designed with modular storage abstractions to allow flexibility if underlying integration patterns evolve.


Development Information

Will all of your project’s code be open-source?

Yes. All code produced under this grant will be released under the MIT license.

Leave a link where code will be accessible for review.

Do you agree to submit monthly progress reports?

Yes. Monthly progress reports will be submitted to the Sia Foundation forum at the end of each month.


Contact

Email: [email protected]

LinkedIn: https://linkedin.com/in/emmaglorypraise

Hi @giipee - welcome to the Sia community! Thank you for your proposal.

I see multiple mentions of renterd in your proposal but we have shifted to focus on indexd.

Please update by 5pm ET today to have this proposal at next week’s Grants Committee meeting.

Hi @mecsbecs, thank you for the feedback!

I’ve updated the proposal to align with indexd, replacing the previous renterd-based approach and refining the architecture to reflect indexd’s role in indexing and retrieval coordination.

I’ve also reviewed the proposal for consistency and made additional improvements based on this change.

The updated version has now been submitted. Please let me know if there’s anything else you’d like me to clarify or adjust.

Thanks again. I appreciate the guidance!

Hi @giipee - thanks for these edits. This will be presented to the Committee next week.

Hello @giipee,

The Committee was unfortunately unable to get to your proposal during today’s meeting. Your proposal has now been slotted for review during the next meeting on May 12th.

Thank you for your patience.

Hi @mecsbecs. Thanks for the update.

Thanks for your proposal to The Sia Foundation Grants Program.

After review, the Committee has decided to reject your proposal citing the following reasons:

  • The Committee has concerns with the compliance components of the project. As outlined in the proposal, The Sia Foundation would be required to have relationships with storage providers (hosts) to ensure they are compliant, which is not possible on a decentralized network.

  • Milestone 1 should not include testnet work to validate indexd and its functionality, as this should be accomplished ahead of the grant work.

  • Additional consideration of features beyond log ingestion and archival would be interesting to see included in the main proposal or post-grant plans to make the proposed solution more robust and to better compare to other potential services.

We’ll be moving this to the Rejected section of the Forum. Thanks again for your proposal, and you’re always welcome to submit new requests if you feel you can address the Committee’s concerns.