Introduction
Project Name: Sia Migration / ETL Tooling
Name of the organization or individual submitting the proposal: Mert Köklü
Describe your project:
This project delivers two open-source integration tools that make Sia a first-class destination in the two most widely adopted data movement ecosystems: rclone (file synchronization) and Airbyte (data integration / ELT). The first tool is a native rclone backend for renterd, replacing the deprecated siad-based backend that currently ships with rclone and is failing integration tests. The second is a dedicated Airbyte destination connector, enabling automated ELT pipelines from any of Airbyte’s 600+ source connectors into Sia storage — a category where no decentralized storage project currently has representation.
Both tools adopt a hybrid architecture: renterd’s S3-compatible gateway handles data plane operations (uploads, downloads, object management) through well-tested S3 protocols, while renterd’s REST API provides control plane capabilities — wallet health, contract status, blockchain sync state, server-side copy/move — that generic S3 clients cannot access. The rclone backend is written in Go, registers Sia as a named provider discoverable via rclone config, and supports core file operations, server-side copy/move, multipart uploads for large files, and storage diagnostics via rclone about. The Airbyte connector is a Python-based destination built on the Airbyte Python CDK that consumes structured records from any Airbyte source, serializes them into compressed JSON Lines files optimized for Sia’s slab architecture, and uploads them with full state checkpointing for reliable incremental syncs.
Who benefits from your project?
The rclone backend serves the broadest audience: individual users backing up personal files, system administrators scripting automated backup jobs with rclone sync, data archivists orchestrating cloud-to-Sia transfers from rclone’s 70+ source backends, and media server operators mounting Sia storage via rclone’s FUSE layer. It also provides a functioning upgrade path for legacy Sia users whose existing rclone setup targets the deprecated siad daemon.
The Airbyte integration targets data engineers building ELT pipelines — enabling database-to-Sia archival, SaaS-to-Sia compliance backups, and cross-platform data lake architectures where Sia serves as cost-effective cold storage. No Sia destination currently exists in Airbyte’s 600+ connector catalog, and no decentralized storage project has one either, making this a new category entry.
Both tools ultimately benefit the Sia network itself by lowering the barrier to data ingestion, increasing stored data volume, and expanding the pool of hosts with active contracts.
How does the project serve the Foundation’s mission of user-owned data?
The Sia Foundation’s mission centers on data privacy and ownership, and its 2026 roadmap explicitly targets simplified user onboarding, developer onboarding, and full S3 protocol compatibility — making this the optimal window for migration tooling investment.
Currently, no purpose-built migration tool exists for Sia. Users who want to move data from centralized providers face a fragmented process: manually configuring S3 endpoints, generating access keys, and navigating generic error messages. This friction keeps data locked in centralized silos and directly contradicts the Foundation’s mission. A native rclone backend reduces migration to rclone config → select Sia → start syncing. An Airbyte connector reduces it to selecting “Sia” from a dropdown in a visual pipeline builder. Storj’s experience validates this approach — their rclone integration and S3 compatibility are consistently cited as the primary reasons enterprises find them easy to evaluate and adopt. Meanwhile, Walrus — a well-funded decentralized storage protocol backed by $140M from a16z and Standard Crypto — has an RFP specifically for migration and ETL tooling, signaling that competing networks already recognize this category as a critical adoption driver. The window for Sia to establish presence in these ecosystems is narrowing.
Beyond onboarding, both tools ensure that data remains user-owned throughout: files are encrypted and erasure-coded by renterd across independent hosts with 3× redundancy, and no centralized intermediary holds the data or the keys. By embedding Sia into tools that engineers already use daily, this project converts the Foundation’s infrastructure investments into accessible, discoverable migration paths — turning user-owned data from an aspirational goal into a practical default.
Are you a resident of any jurisdiction on that list? No
Will your payment bank account be located in any jurisdiction on that list? No
Grant Specifics
Amount of money requested and justification with a reasonable breakdown of expenses:
Total requested: $20,000 USD over a 3-month development period, paid monthly.
This budget covers a single developer working across two sequential workstreams (rclone backend in Go, Airbyte connector in Python). The breakdown reflects development effort only — no marketing, community engagement, or promotional expenses are included.
| Category | Amount | Description |
|---|---|---|
| rclone Backend Development | $8,000 | Native renterd backend in Go: core fs.Fs/fs.Object implementation, S3 gateway data operations, REST API control plane integration, server-side copy/move, multipart uploads, About/ListR/Purge features, modification time and hash support, integration test suite, rclone documentation page, and upstream PR submission |
| Airbyte Connector Development | $8,000 | Python CDK destination connector: spec/check/write implementation, boto3 S3 client integration, JSONL and CSV output formatters with GZIP compression, overwrite/append sync modes with state checkpointing, configurable file sizing, Sia-specific error reporting, unit and integration tests, Dockerfile, metadata.yaml, and Airbyte Marketplace submission |
| Infrastructure, Testing and Documentation | $4,000 | Renterd instances on Zen testnet and local Docker environments for integration testing, Siacoin for contract formation and upload/download verification across both tools, mainnet validation testing before upstream submission, documentation |
Monthly payment schedule:
- Month 1: $8,000 (Milestone 1 delivery)
- Month 2: $8,000 (Milestone 2 delivery)
- Month 3: $4,000 (Milestone 3 delivery)
What is the high-level architecture overview for the grant? What security best practices are you following?
The architecture follows a hybrid approach: both integrations use renterd’s S3-compatible gateway for data plane operations (uploads, downloads, object management) and renterd’s REST API for control plane features (listings, server-side copy/move, wallet health, contract status). This design maximizes compatibility with established S3 protocols while enabling Sia-specific capabilities unavailable through S3 alone.
Rclone Backend (Go)
The backend registers as sia in rclone’s backend system, replacing the existing deprecated siad implementation. It implements the required fs.Fs and fs.Object interfaces, plus optional capabilities: Copier, Mover, Purger, ListRer, Abouter, and PutStreamer.
- Data plane (S3 gateway): Handles uploads, downloads, and multipart uploads for large files, with chunk sizes aligned to Sia’s 120 MiB slab boundaries.
- Control plane (REST API): Handles directory listings with prefix/delimiter support, server-side copy and rename operations, and the `About` command (wallet balance, contract count, storage usage), diagnostics impossible through S3 alone.
- Authentication: S3 operations use AWS Signature V4 with key pairs from `renterd.yml`. REST API operations use HTTP Basic Auth with the renterd API password.
Configuration options include renterd API URL, API password, S3 gateway endpoint, S3 access/secret keys, and bucket name.
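As an illustration of the control plane authentication described above, the following minimal sketch builds (but does not send) a Basic Auth request for the REST API. It is written in Python for brevity, although the backend itself will be in Go. The function name `build_control_plane_request`, the empty Basic Auth username, and the `http://localhost:9980` address are assumptions made for this example, not confirmed details of the final implementation.

```python
import base64
import urllib.request


def build_control_plane_request(api_url: str, api_password: str, path: str) -> urllib.request.Request:
    """Build an authenticated GET request for the renterd REST API without sending it."""
    # Assumption: the username is left empty and the API password is used
    # as the Basic Auth password.
    token = base64.b64encode(f":{api_password}".encode("utf-8")).decode("ascii")
    url = api_url.rstrip("/") + "/" + path.lstrip("/")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}"},
        method="GET",
    )


# Hypothetical local address and password, used only for illustration.
req = build_control_plane_request("http://localhost:9980", "example-password", "/api/bus/wallet")
print(req.full_url)
```

Keeping request construction separate from sending makes the authentication logic easy to unit test against mocked responses.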
Airbyte Connector (Python)
The connector (destination-sia) extends the Airbyte Python CDK’s Destination base class and implements the three required operations: spec(), check(), and write(). It connects to renterd’s S3 gateway using boto3 configured with a custom endpoint.
- Write flow: Records are buffered per-stream into local temporary files. When a buffer exceeds the configurable threshold (default 200 MB, aligned with renterd’s ~120 MiB slab size), the file is uploaded to Sia via boto3. State checkpoints are emitted only after successful uploads, ensuring reliable incremental syncs.
- Output formats: JSON Lines with GZIP compression (default) and CSV. JSONL is the primary format because Sia’s storage model optimizes for file-level access, and JSONL achieves 70–80% compression with GZIP.
- File naming: `<bucket_path>/<namespace>/<stream_name>/<YYYY>_<MM>_<DD>_<epoch>_<part>.jsonl.gz`
- Sync modes: `overwrite` (full refresh: previous generation deleted after successful sync, tagged with generation ID metadata) and `append` (incremental: new files alongside existing).
- Connection check: Verifies S3 connectivity, confirms the target bucket exists (creating it if needed), and writes/deletes a test object. Optionally queries the REST API to verify blockchain sync status and wallet health.
The connector is packaged as a Docker image using Airbyte’s standard Python connector base.
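The write flow above (buffer per stream, flush at a threshold, emit state only after a successful upload) can be sketched roughly as follows. This is a simplified illustration, not the planned implementation: it buffers in memory rather than in temporary files, represents messages as plain dictionaries instead of Airbyte CDK types, and takes a hypothetical `upload(key, body)` callable in place of a boto3 client. The `part_<n>` key format is also a placeholder.

```python
import gzip
import json
from typing import Callable, Dict, Iterable, Iterator, List


class StreamBuffer:
    """Collects records for one stream and flushes them as gzip-compressed JSON Lines."""

    def __init__(self, stream: str, threshold_bytes: int, upload: Callable[[str, bytes], None]):
        self.stream = stream
        self.threshold_bytes = threshold_bytes
        self.upload = upload
        self.lines: List[bytes] = []
        self.size = 0
        self.part = 0

    def add(self, record: dict) -> None:
        line = (json.dumps(record, separators=(",", ":")) + "\n").encode("utf-8")
        self.lines.append(line)
        self.size += len(line)
        if self.size >= self.threshold_bytes:
            self.flush()

    def flush(self) -> None:
        if not self.lines:
            return
        key = f"{self.stream}/part_{self.part}.jsonl.gz"  # placeholder key format
        # If the upload raises, the buffer is left intact and no state is emitted.
        self.upload(key, gzip.compress(b"".join(self.lines)))
        self.lines, self.size = [], 0
        self.part += 1


def write(messages: Iterable[dict], threshold_bytes: int,
          upload: Callable[[str, bytes], None]) -> Iterator[dict]:
    """Yield STATE messages only after every buffered record before them is uploaded."""
    buffers: Dict[str, StreamBuffer] = {}
    for msg in messages:
        if msg["type"] == "RECORD":
            stream = msg["stream"]
            if stream not in buffers:
                buffers[stream] = StreamBuffer(stream, threshold_bytes, upload)
            buffers[stream].add(msg["data"])
        elif msg["type"] == "STATE":
            for buf in buffers.values():
                buf.flush()
            yield msg
    for buf in buffers.values():
        buf.flush()


# Usage with an in-memory dict standing in for the S3 gateway.
uploaded: Dict[str, bytes] = {}
messages = [
    {"type": "RECORD", "stream": "users", "data": {"id": 1}},
    {"type": "RECORD", "stream": "users", "data": {"id": 2}},
    {"type": "STATE", "data": {"cursor": 2}},
]
emitted = list(write(messages, 1_000_000, uploaded.__setitem__))
print(sorted(uploaded), emitted)
```

Because a failed upload raises before the `yield`, a checkpoint is never emitted for data that did not reach storage.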
Security Best Practices:
- All credentials are handled through rclone’s built-in obscured password storage and Airbyte’s `airbyte_secret` field annotation, and are never logged or exposed in plaintext. Both tools operate exclusively against the user’s own local renterd instance, with no data passing through third-party servers.
- Both integrations validate renterd health (blockchain sync status, wallet balance, contract availability) before attempting data operations, and implement retry logic with exponential backoff and incomplete upload cleanup on failure.
- All code is open-source under Apache 2.0, enabling community security review.
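The retry behavior mentioned above could look roughly like the sketch below. The helper name `retry_with_backoff`, its default values, and the `cleanup` hook (standing in for aborting an incomplete upload) are illustrative assumptions. A real implementation would likely retry only on errors known to be transient rather than on every exception.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    op: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    cleanup: Optional[Callable[[], None]] = None,
) -> T:
    """Run op, doubling the wait after each failure, and re-raise after the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return op()
        except Exception:
            # Give the caller a chance to abort a partially completed upload.
            if cleanup is not None:
                cleanup()
            if attempt == attempts - 1:
                raise
            sleep(min(max_delay, base_delay * (2 ** attempt)))
    raise AssertionError("unreachable")


# Usage: an operation that fails twice before succeeding. The sleep function
# is replaced with list.append so the example runs instantly.
calls = {"n": 0}


def flaky_upload() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


delays = []
cleanups = []
result = retry_with_backoff(flaky_upload, sleep=delays.append,
                            cleanup=lambda: cleanups.append("aborted"))
print(result, delays)
```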
Timeline with measurable objectives and goals.
Milestone 1 — Rclone Native Backend: Core Operations (Month 1)
Objective: Deliver a functional rclone backend for renterd that supports essential file operations, enabling rclone copy, rclone sync, and rclone ls to work end-to-end against a live renterd instance.
Deliverables:
- Backend registered as `sia` in rclone’s backend registry, replacing the deprecated siad implementation. Configuration via `rclone config` prompting for renterd API URL, API password, S3 gateway endpoint, S3 credentials, and bucket name.
- Core `fs.Fs` interface implemented:
  - `List`: directory listing via `POST /api/bus/objects/list` with prefix and delimiter support, returning files and virtual directories with proper pagination for large buckets.
  - `NewObject` (stat): object metadata retrieval via `HEAD /api/worker/objects/*key`, returning file size, modification time, and ETag.
  - `Put` / `PutStream`: file upload via `PUT /api/worker/objects/*key`, streaming request body to the renterd worker. Support for unknown-size streams from pipes and standard input.
  - `Open`: file download via `GET /api/worker/objects/*key` with HTTP Range header support for partial reads.
  - `Remove`: single object deletion via `DELETE /api/worker/objects/*key`.
  - `Mkdir` / `Rmdir`: bucket creation via the bus API and directory removal via prefix-based batch delete.
- Unit test suite covering configuration parsing, API response mapping, error handling, and metadata extraction, using Go’s `httptest.NewServer` for mocked renterd responses.
- Integration test suite runnable against a Zen testnet renterd instance.
- End-to-end verification: `rclone ls`, `rclone copy` (local → Sia), `rclone copy` (Sia → local), and `rclone sync` demonstrated functional against a Zen testnet renterd instance.
- Public repository with Apache 2.0 license containing the backend code.
Evaluation Criteria: A reviewer can build the rclone binary with the Sia backend, configure it against a running renterd instance, and successfully copy files to and from Sia using standard rclone commands. Unit tests pass.
Cost: $8,000
Milestone 2 — Rclone Advanced Features + Airbyte Connector MVP (Month 2)
Objective: Complete the rclone backend with all feasible optional features and prepare it for upstream review. Simultaneously deliver a functional Airbyte destination connector supporting JSONL output with overwrite and append sync modes.
Rclone Deliverables:
- `Copy`: server-side copy via `POST /api/bus/objects/copy`, avoiding re-upload for duplicate operations.
- `Move`: rename via `POST /api/bus/objects/rename`, falling back to copy+delete where necessary.
- `Purge`: recursive directory deletion via batch delete API.
- `ListR`: recursive listing in a single API call for `--fast-list` support, improving performance on large directory trees.
- `About`: storage diagnostics showing wallet balance (SC), active contract count, total data stored, and remaining allowance, queried from `/api/bus/wallet` and `/api/bus/stats/objects`.
- Multipart upload support for large files using renterd’s multipart API with 120 MiB chunk alignment.
- `SetModTime`: modification time preservation via `x-amz-meta-mtime` custom metadata on the S3 gateway, enabling accurate sync comparisons.
- Hash support: MD5 checksums via S3 ETag for integrity verification.
- `rclone mount` and `rclone serve` verified functional for FUSE-based filesystem access.
- Integration test suite passing against Zen testnet.
- Documentation page authored for `docs/content/sia.md` following rclone’s documentation format.
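To make the 120 MiB chunk alignment concrete, the sketch below shows one way a large file could be split into multipart parts. It is written in Python for brevity even though the backend is in Go, and the `plan_parts` helper is hypothetical. It only computes offsets and lengths and does not call any renterd or S3 API.

```python
from typing import List, Tuple

MIB = 1024 * 1024
SLAB_ALIGNED_PART = 120 * MIB  # part size taken from the 120 MiB figure above


def plan_parts(total_size: int, part_size: int = SLAB_ALIGNED_PART) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples covering total_size bytes.

    Part numbers start at 1, as in S3 multipart uploads. Every part except
    possibly the last is exactly part_size bytes long.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    parts = []
    offset = 0
    number = 1
    while offset < total_size:
        length = min(part_size, total_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


# Usage: a 300 MiB file yields two full parts and one 60 MiB remainder.
plan = plan_parts(300 * MIB)
for number, offset, length in plan:
    print(number, offset // MIB, length // MIB)
```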
Airbyte Deliverables:
- Connector specification (`spec.yaml`) defining configuration fields: renterd S3 endpoint, S3 access/secret keys, optional REST API URL and password, bucket name, bucket path, output format, compression, and max file size.
- `check` implementation: connectivity verification by listing S3 buckets, confirming the target bucket exists (creating it if needed), and writing/deleting a test object to confirm read/write permissions. Optional REST API health check verifying blockchain sync status and wallet balance.
- `write` implementation: consuming `AirbyteMessage` records, serializing to JSON Lines with GZIP compression, buffering per-stream to configurable file size (default 200 MB), uploading via boto3 `put_object()`, and emitting STATE messages only after durable upload confirmation.
- `overwrite` sync mode (full refresh: previous generation files deleted after successful sync, identified by `x-amz-meta-ab-generation-id` metadata).
- `append` sync mode (incremental: new files added alongside existing ones).
- File naming convention: `<bucket_path>/<namespace>/<stream_name>/<YYYY>_<MM>_<DD>_<epoch>_<part>.jsonl.gz`.
- Unit tests covering configuration validation, record serialization, buffer flushing, and S3 client interaction using `unittest.mock`.
- Dockerfile using `airbyte/python-connector-base:2.0.0`, `metadata.yaml`, and `pyproject.toml`.
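The file naming convention listed above could be produced by a small helper along these lines. The function name `object_key` is hypothetical, and two details are assumptions made for this sketch: the `<epoch>` component is rendered as whole seconds in UTC, and empty path components (such as a missing namespace) are skipped.

```python
from datetime import datetime, timezone


def object_key(bucket_path: str, namespace: str, stream_name: str,
               when: datetime, part: int) -> str:
    """Build an object key of the form <path>/<ns>/<stream>/<YYYY>_<MM>_<DD>_<epoch>_<part>.jsonl.gz."""
    when = when.astimezone(timezone.utc)
    epoch_seconds = int(when.timestamp())  # assumption: epoch expressed in seconds
    # Assumption: empty components are dropped rather than leaving "//" in the key.
    components = [p.strip("/") for p in (bucket_path, namespace, stream_name) if p]
    prefix = "/".join(components)
    return f"{prefix}/{when:%Y_%m_%d}_{epoch_seconds}_{part}.jsonl.gz"


# Usage with a fixed timestamp so the result is reproducible.
stamp = datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc)
key = object_key("archive", "public", "users", stamp, 0)
print(key)
```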
Evaluation Criteria: The rclone backend passes its integration test suite on Zen testnet with all optional features functional. The Airbyte connector successfully syncs records from a test source to a Zen testnet renterd instance in both overwrite and append modes, with data verifiable through renterd’s web UI or API.
Cost: $8,000
Milestone 3 — Airbyte Completion, Testing, and Upstream Submissions (Month 3)
Objective: Complete the Airbyte connector with additional output formats and enhanced diagnostics. Pass Connector Acceptance Tests (CAT) and submit to the Airbyte Marketplace. Submit the rclone backend upstream and finalize both contributions.
Airbyte Deliverables:
- CSV output format support with configurable delimiter and header options.
- Configurable compression options (GZIP, none) across all output formats.
- Enhanced error reporting with Sia-specific diagnostics: if REST API credentials are provided, failed uploads surface context from renterd (insufficient contracts, low wallet balance, blockchain not synced) rather than opaque S3 errors.
- Custom path format templates supporting `${NAMESPACE}`, `${STREAM_NAME}`, `${YEAR}`, `${MONTH}`, `${DAY}` variables for flexible output directory structures.
- Integration test suite running against Zen testnet, covering full-refresh and incremental syncs with JSONL and CSV formats.
- Connector Acceptance Tests (CAT) passing: `spec` validation and `connection` testing suites as required by Airbyte’s contribution process.
- Submission to the Airbyte Marketplace as a community connector (`releaseStage: alpha`, `supportLevel: community`, `license: Apache-2.0`).
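The custom path format templates use the same `${NAME}` placeholder syntax as Python's standard `string.Template`, so one possible way to expand them is sketched below. The `render_path` helper and the zero-padded formatting of month and day are assumptions for illustration, and the final connector may handle templates differently.

```python
from datetime import date
from string import Template


def render_path(template: str, namespace: str, stream_name: str, day: date) -> str:
    """Expand ${NAMESPACE}, ${STREAM_NAME}, ${YEAR}, ${MONTH}, ${DAY} in a path template."""
    values = {
        "NAMESPACE": namespace,
        "STREAM_NAME": stream_name,
        "YEAR": f"{day.year:04d}",
        "MONTH": f"{day.month:02d}",  # assumption: zero-padded
        "DAY": f"{day.day:02d}",      # assumption: zero-padded
    }
    # substitute() raises KeyError for unknown variables, so typos in a
    # user-supplied template fail loudly instead of producing odd paths.
    return Template(template).substitute(values)


rendered = render_path("${NAMESPACE}/${STREAM_NAME}/${YEAR}/${MONTH}/${DAY}",
                       "public", "users", date(2025, 1, 2))
print(rendered)
```

Rejecting unknown variables at configuration time would let the connection check report a bad template before any sync starts.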
Rclone Deliverables:
- Pull request submitted to `rclone/rclone` to replace `backend/sia/`, following rclone’s contribution requirements: vendor commit with `go.mod`/`go.sum`, backend code in separate commits, full documentation, and integration tests.
- GitHub issue filed in advance per rclone contribution guidelines, engaging with maintainers on acceptance strategy.
- Maintainer review feedback addressed and iterated toward merge acceptance.
Shared Deliverables:
- README documentation for both projects covering installation, configuration, usage examples, and known limitations.
- Both repositories published as standalone open-source projects under Apache 2.0, usable independently of upstream merging.
Evaluation Criteria: The Airbyte connector passes CAT and the submission PR is open on airbytehq/airbyte. Both output formats (JSONL, CSV) produce correct output verified against Zen testnet. The rclone PR is open on rclone/rclone with passing CI checks and maintainer feedback addressed. Both repositories are publicly accessible with complete documentation.
Cost: $4,000
Potential risks that will affect the outcome of the project:
Airbyte CDK evolution (Low impact). The Python CDK’s Destination base class has remained stable across recent versions. I will pin to a compatible version range, and the connector’s thin integration surface makes any future migration straightforward.
Performance at scale (Low impact). Sia’s 3× erasure coding overhead limits upload throughput. Both tools include built-in throttling mechanisms (--transfers for rclone, buffered writes for Airbyte) and will document expected throughput and recommended renterd configuration.
Zen testnet reliability (Low impact). Testnet host availability can be intermittent. Development uses a local Docker-based renterd instance for rapid iteration, reserving testnet for milestone validation.
Development Information
Will all of your project’s code be open-source?
Yes. All code produced under this grant will be released under the Apache 2.0 license. Both the rclone backend and the Airbyte connector will be developed in public repositories from day one. The rclone backend will be submitted as a PR to the rclone/rclone repository (which uses the MIT license — Apache 2.0 is compatible). The Airbyte connector will be submitted to airbytehq/airbyte as a community connector, where Apache 2.0 is an accepted contributor license.
Leave a link where code will be accessible for review.
A GitHub link will be provided upon proposal approval.
Do you agree to submit monthly progress reports?
Yes. I will submit monthly progress reports on the Sia Foundation grants forum, aligned with each milestone completion.
Contact info
Mert Köklü
Mert is an experienced software engineer with over four years of blockchain development experience, specializing in tooling, integrations, and backend infrastructure. He holds a Bachelor’s degree in Computer Science and has led engineering teams focused on both AI and blockchain technologies.
He previously worked with ApeWorX, where he developed StarkNet plugins and compiler tools for the Cairo language. Before that, he served as AI Video Analytics Team Lead at an NVIDIA partner company, managing large-scale intelligent vision projects.
He is also a long-term Sia ecosystem contributor, having developed multiple grant-funded projects for the network.
Recent Notable Projects:
- Alerts dYdX: Monitor positions, alerts, and account status on dYdX in real time (web)
- Cosmos: Local Starknet block explorer (repo)
- Kurtosis-Orbit: Tool for spinning up complete Arbitrum Orbit rollup environments (repo)
Email: [email protected]
Any other preferred contact methods:
- GitHub: justmert (Mert Köklü)
- LinkedIn: https://www.linkedin.com/in/mertkoklu/
- Discord: mertkkl
- Telegram: @mertkklu