NFS/iSCSI protocols abstracted over Sia network for data storage, security and governance

Project Name: Data Guardian

Name of the organization or individual submitting the proposal: Emmanuel Damilare Adediji

Project Description

Background

We’re surrounded by data. It’s everywhere: our lifestyles, choices and core values are all expressed as data and influenced by it. From personal data to video files, pictures, designs and schematics to numerical metrics - a whole lot depends on data.

In fact, it’s safe to say that every company is a data company. Many individuals, companies and governmental parastatals store, use and dispense both processed and unprocessed data at petabyte scale.

In this age of perilous data breaches, cyber warfare, hacking, hijacking and other often radical cyber-attacks, it has been shown that the adversary’s main aim is almost always data: gaining access to or insight into it, stealing it, tampering with it, or denying legitimate access to it.

This has made for unauthorized and unauthenticated access to data, data theft and piracy, libel, slander and blackmail due to access to confidential, classified or personal data, cyber-terrorism, cyber-bullying, the list goes on.

For individuals, this could mean embarrassment, depression and suicidal inclinations. For enterprises, this could mean loss of profit and revenue, a dip in the bottom line, lawsuits from customers, and closure by governmental regulatory and compliance bodies.

Admittedly, the governmental bodies of various countries have put various compliance laws and regulations in place, but what is a law without the means to enforce it?

The Response

To forestall the aforesaid and even worse, individuals and enterprises try to find different ways to protect their data - one of them being cyber-security, others include policy enforcement. Aside from general protection, they also want the protection to cover aspects that are unique to their use-cases - for instance, “Can I set this data’s expiration?”, “Will this data be accessible outside France - we don’t want it to be!”, “This design is a business secret, we don’t want it accessed outside a 10 meter radius of our office!”, “How can I meet digital security compliance and regulations in my country regarding my business?”

In short, they want not only to secure and protect their data in ways unique to them, but also holistically govern their data even when it is sent outside their domain.

Networks and Protocols

As hitherto mentioned, individuals’ and enterprises’ data usually finds itself in various computing scenarios, and hence on various networks and protocols. Although a true cyber-security approach via cryptography focuses primarily on safeguarding at the data layer, there’s a truly intricate relationship between data and the networks and protocols through which it travels. There are diverse networks and protocols available for individuals, enterprises, governmental parastatals and organizations to move, share and store data.

We discuss these under the following headings:

  • High Level Networks

These include high-level networks and protocols designed to move, share and retrieve data. The most common of these is HTTP(s)/REST. However, networks and protocols are diverse and varied -

  • Alternative Web: SOAP, XML-RPC, gRPC, JSON-RPC, GraphQL, WebSocket, WebRTC, etc.

  • Messaging: AMQP, MQTT, CoAP, Kafka, etc.

In our case, we focus on HTTP(s)/REST.

For Storage and Retrieval -

IPFS, BitTorrent, Swarm, NFS, iSCSI, Fibre Channel, Sia, Storj, etc.

In this case, we focus on both NFS/iSCSI and Sia.

  • Low Level Networks/Protocols:

Low level Networks and Protocols - TCP, UDP, IP MULTICAST, IRC, FTP, SSH, NNTP, RTP

Mail and Email Networks - XMPP, IMAP, POP3

The majority of the aforementioned low level networks/protocols are used in building the higher level networks/protocols.

How do we securely store, protect, govern, retrieve and share data moving through or resting in these kinds of networks?

As one can easily see, not all data movement and storage is done via REST or HTTP - especially in enterprises and big organizations, where data moves across a myriad of networking infrastructures.

Technology Exegesis

NFS (Network File System) and iSCSI (Internet Small Computer Systems Interface) are protocols for storage and retrieval over a network.

NFS is a distributed file system protocol that allows users to access files over a network as if they were stored locally. It operates on top of TCP/IP, making it readily available on most network infrastructures.

iSCSI is a block-level storage networking protocol that allows storage devices to be accessed over a standard IP network. It encapsulates SCSI commands within TCP/IP packets, enabling block-level storage access over standard Ethernet networks.

Whereas these two protocols allow for a seamless storage and retrieval experience, they are usually built on centralized storage systems.

Sia’s decentralized storage, however, offers the major advantage of the storage of the future - a truly decentralized storage solution.

Project Specifics

The system is proposed to be a highly robust and scalable NFS/iSCSI infrastructure, but with the Sia storage engine handling persistence underneath.

This will be accomplished through the following architectural and algorithmic exegesis:

  • The Client Layer:

This layer acts as the user-interface frontend for NFS/iSCSI. The user can interact with it as if it were a true NFS/iSCSI interface, with traditional and common functionalities such as mounting and un-mounting shares, discovering targets, read/write, managing/enforcing permissions/policies, etc.

  • The Abstraction Layer:

This serves as a middleware between the client layer and the server gateway infrastructure. It sits between the traditional protocol implementation and the actual communication with the gateway.

This layer translates the traditional NFS/iSCSI protocol commands into appropriate API calls to the gateway service, effectively hiding the underlying complexity of the decentralized storage network from the frontend application.

On the flip side, it also acts as an intermediary for the responses coming from the server gateway, translating them into forms compatible with the client layer (as afore-discussed).
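To make the translation step concrete, here is a minimal Python sketch of how an abstraction layer might map NFS-style operations onto gateway API requests. The operation names loosely follow NFS conventions, but the `GatewayRequest` type, the `/v1/...` routes and the `translate` function are all hypothetical; they are illustrative assumptions and not part of any existing gateway or Sia API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GatewayRequest:
    """A hypothetical HTTP-style request destined for the server gateway."""
    method: str
    path: str
    body: Optional[bytes] = None


# Hypothetical mapping from NFS-style operations to gateway routes.
_ROUTES = {
    "READ": ("GET", "/v1/objects/{path}"),
    "WRITE": ("PUT", "/v1/objects/{path}"),
    "REMOVE": ("DELETE", "/v1/objects/{path}"),
    "READDIR": ("GET", "/v1/listings/{path}"),
}


def translate(op: str, path: str, data: Optional[bytes] = None) -> GatewayRequest:
    """Translate one NFS-style operation into a gateway request."""
    try:
        method, template = _ROUTES[op.upper()]
    except KeyError:
        raise ValueError(f"unsupported operation: {op}") from None
    if method == "PUT" and data is None:
        raise ValueError("WRITE requires a payload")
    clean = path.strip("/")
    # Only writes carry a body; other operations ignore any payload passed in.
    body = data if method == "PUT" else None
    return GatewayRequest(method, template.format(path=clean), body)
```

A call such as `translate("READ", "/docs/a.txt")` would yield a `GET` request for `/v1/objects/docs/a.txt`, which the client layer never has to see.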

  • The Server Gateway:

This layer is the custom server infrastructure through which well-defined APIs are exposed. These APIs will expose functionalities similar to the chosen traditional storage protocol (NFS or iSCSI) - such as storage, retrieval and even more (governance parameters).

Implicitly, the gateway handles both compute and memory intensive tasks such as concurrency, asynchronous strategies, cryptography, key management and rotation, (de-)compression, caching, authentication, authorization, error-handling, policy enforcement and data governance operations. It also handles both persistence and retrieval operations by interacting with the underlying distributed storage engine(Sia), using different optimization approaches and strategies e.g chunking, content-addressing, data corruption prevention, etc. for speed and scalability.

Optionally, transactions involving storage space procurement and allocation via Siacoin are also exposed in this layer.

Furthermore, aside the API, the Gateway layer also exposes a Web UI, for the admin user to easily set configuration, procure storage space, set and enforce general policies, permissions and governance parameters, and manage stored data and such other operations as could determine and control the access and usability of the expected data in motion or at rest.
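The chunking and content-addressing strategies named for the gateway can be sketched as follows. This is a minimal illustration under stated assumptions: a plain dictionary stands in for the Sia storage engine, the chunk size is deliberately tiny, and the `store` and `retrieve` helpers are hypothetical names rather than part of any Sia API.

```python
import hashlib
from typing import Dict, List

CHUNK_SIZE = 4  # tiny for demonstration; a real gateway would use a much larger size


def store(data: bytes, backend: Dict[str, bytes], chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split data into chunks, store each under its SHA-256 digest, return the manifest."""
    manifest = []
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        address = hashlib.sha256(chunk).hexdigest()
        backend[address] = chunk  # identical chunks share one address (deduplication)
        manifest.append(address)
    return manifest


def retrieve(manifest: List[str], backend: Dict[str, bytes]) -> bytes:
    """Reassemble data from its manifest, verifying each chunk against its address."""
    parts = []
    for address in manifest:
        chunk = backend[address]
        if hashlib.sha256(chunk).hexdigest() != address:
            raise ValueError(f"chunk {address} failed integrity check")
        parts.append(chunk)
    return b"".join(parts)
```

Because each chunk's address is derived from its content, a corrupted chunk is detected at read time, and repeated chunks are stored only once.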

  • The Storage Layer:

The persistence layer is a highly robust decentralized layer; in this specific case, Sia storage.

The server gateway layer utilizes various optimization strategies such as chunking, content-addressing, etc. to ensure effective storage and retrieval over this layer.

Furthermore, the portion of storage available to each user on the storage engine is mapped by the server gateway based on the storage engine’s API.

Moreover, transactions involving storage space procurement and allocation via Siacoin are handled in this layer.

How the projected outcome serves the Foundation’s mission of user-owned data

The impact and service of the proposed system is that of a combined approach:

By integrating the various aforementioned technology stacks, the projected outcome facilitates user-owned data in the following ways:

Decentralized Control: Users control where their data resides, eliminating reliance on centralized storage providers.

Data Privacy: Fast and strong encryption protects user data from unauthorized access, even by Sia network operators.

Familiar Interface: Traditional protocols like NFS/iSCSI offer a comfortable way to interact with decentralized storage.

Future-Proof System: A blend of the distributed system and post-quantum cryptography, exposed through the ease of use of the NFS/iSCSI layer, helps ensure a future-proof system against quantum attacks and breaches regarding user data.

Optional Interface: Exposing settings and configuration over the web also serves as a plus regarding flexibility and ease of use.

Granular Access Control: Users define who can gain access to data, how their data would be accessed, where their data could be accessed and when their data would be accessed through various policy enforcements, permission controls and data governance rules. In addition, they can also revoke access at any time.

Zero Trust Protection: In a multi-connected environment, this can add to a layer of security where security travels with data even outside the domain of the original data owner.

Transparency: Open-source code allows users to understand and verify data storage and management.
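The granular access control described above (who, where, when, plus revocation) could be modelled along the following lines. This is only a sketch: the `AccessPolicy` class, its field names and the use of country codes for location are assumptions made for illustration, not a description of the final policy engine.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, Optional


@dataclass
class AccessPolicy:
    """Hypothetical per-object policy: who may read, from where, and until when."""
    allowed_users: FrozenSet[str]
    allowed_countries: FrozenSet[str]
    expires_at: Optional[datetime] = None
    revoked: bool = False

    def permits(self, user: str, country: str, now: datetime) -> bool:
        """Return True only if every rule in the policy is satisfied."""
        if self.revoked:
            return False  # the owner withdrew access
        if self.expires_at is not None and now >= self.expires_at:
            return False  # the data has passed its expiration
        return user in self.allowed_users and country in self.allowed_countries


# Example: data readable only by "ada", only from France, until 2030.
policy = AccessPolicy(
    allowed_users=frozenset({"ada"}),
    allowed_countries=frozenset({"FR"}),
    expires_at=datetime(2030, 1, 1, tzinfo=timezone.utc),
)
```

Setting `policy.revoked = True` would make every later `permits` call return `False`, which corresponds to revoking access at any time.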

Conclusion:

The Sia Foundation’s mission of user-owned data is realized through a well-designed and robust system that leverages decentralized storage, traditional protocols, abstraction layers, secure gateways, and data privacy and governance principles.

This combined approach not only empowers data owners (viz individuals or enterprises), but also gives them back control over their valuable data, hence, aligning with the core values of the internet of the future - a truly secured, decentralized and distributed web where, no matter their location, user data can be truly said to be OWNED.

Grant Specifics

Amount of money requested and justification with a reasonable breakdown of expenses:

Budget Breakdown:

Development (3 Months): $6,912 (48 hrs/week, $12/hr)

Justification: This hourly rate reflects my experience in Software development and expertise with relevant technology stacks. Working 48 hours per week for 3 months allows for focused and efficient development.

Deployment: $900

Justification:

Cloud Resources: $500 (Estimated cost for a CPU/IO-bound virtual machine instance on a cloud platform, leveraging the free tier as much as possible)

CI/CD: $200 (For automated builds and deployment)

Hosting/Domain Name: $200 (Cost of domain registration and general hosting for the gateway service)

Hardware: $1,400

Justification:

1 MacBook Pro: ($1300) This budget allows for a used MacBook Pro with sufficient performance for development. Exploring used options helps maximize budget allocation for essential project components.

Alternative: Had existing hardware been available, I would have utilized it, but I am in a resource-constrained environment.

1 5G Router for Internet Connectivity: ($100)

Justification: Having a standby gadget for internet access is a boost to my productivity and efficiency in getting the job done. As I live in a developing country with unstable internet access, this will boost my productivity and workflow.

Workspace Rental: $500

Justification: Renting a dedicated workspace with constant electricity is necessary for the fast and effective completion of the project. As I live in a developing country with unsteady power supply, this will boost my productivity and work-flow.

Contingencies: $200

Justification: Unforeseen and unplanned situations and circumstances

Total Project Budget: $9,912
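The budget figures can be cross-checked with a few lines of arithmetic. The only assumption added here is that the 3-month development period is counted as 12 weeks, which is the value that reproduces the stated development cost.

```python
HOURS_PER_WEEK = 48
HOURLY_RATE = 12  # USD
WEEKS = 12        # assumption: 3 months counted as 12 weeks

development = HOURS_PER_WEEK * HOURLY_RATE * WEEKS
deployment = 500 + 200 + 200   # cloud resources + CI/CD + hosting/domain
hardware = 1300 + 100          # MacBook Pro + 5G router
workspace = 500
contingencies = 200

total = development + deployment + hardware + workspace + contingencies
print(f"Development: ${development:,}")
print(f"Total: ${total:,}")
```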

Project Timeline:

Month 1:

Project setup, environment configuration, and system design.

Development of core functionalities: Abstraction Layer, Server Gateway service, infrastructure and functionalities, data chunking/distribution.

Month 2:

Client Layer: NFS/iSCSI client interface and experience.

Persistence Layer: Integration with Sia API, security and optimization implementations.

Unit testing and core functionality verification.

Month 3:

Implementation of advanced features within all layers (progressive).

Modification and Refinement.

System documentation and user testing.

Success Metrics:

Functional Server Gateway enabling data access via NFS/iSCSI protocols.

Functional Abstraction Layer enabling call translation and communication.

Integration with Sia network for decentralized storage.

Functional Client Layer - interface for the NFS/iSCSI protocol.

Secure data storage with various blends of fast, strong and post-quantum encryption at rest and in transit.

Permissions, Governance and Policy enforcement configurations.

User testing and positive feedback on usability.

Additional Considerations:

This proposal prioritizes core functionalities within budget constraints. Other advanced features can be implemented later.

Open-source libraries and tools will be used whenever possible to minimize costs.

The project will prioritize collaboration with the Sia community for feedback and potential integration into existing Sia tools.

Summarily, I have outlined a cost-effective and well-defined plan to develop a secured decentralized storage gateway on top of traditional protocols. By utilizing my skills, exploring cost-saving measures, and focusing on core user needs, this project aims to deliver a valuable contribution to the Sia ecosystem.

What are the goals of this small grant?

The goals of the small grant map onto the following project scope:

Development Environment Setup:

Purchase a laptop suitable for software development.

Set up necessary development tools and software licenses.

Workspace and Living Expenses:

Secure a conducive temporary workspace with stable electricity during the development period.

Allocate funds for living expenses during development.

System Development:

Design and Implement the aforesaid NFS/iSCSI Frontend Client layer (see description above)

Develop and Implement the aforesaid Abstraction (Middleware) layer within the Frontend application for NFS/iSCSI compatibility(see description above)

Design and Implement the aforesaid Core Server Gateway Layer, focusing on essential functionalities - concurrency, encryption, key management, policy enforcements, permissions configurations, data governance, etc…(see description above)

Design and Implement essential Data Chunking, Storage, Distribution, Corruption prevention, content addressability and retrieval functionalities for interacting with the Sia network.

Integrate with core Sia API functionalities for file management.

Security:

Implement encryption of data chunks at rest (balancing strong and fast algorithms, non-deterministic, post-quantum, key management and rotations);

Authorization and Authentication;

Implement Secure Communication Protocols for critical interactions within the system (e.g. by utilizing HTTPS for API calls).

Monitoring and Logging:

Design and Implement monitoring and logging of algorithmic level operations

Design and Implement monitoring and logging for system level operations

Testing:

Conduct both unit and integration testing to ensure core functionalities work as intended.

Documentation:

Develop essential and robust user and developer documentation, especially for the server gateway and the various User Interfaces, to ensure a seamless flow for both technical and non-technical users.

Potential risks that will affect the outcome of the project:

A holistic treatment of potential risks that might affect the outcome of the project to build a decentralized storage gateway for Sia with NFS/iSCSI access:

(1) Technical Risks:

Security Vulnerabilities: Security vulnerabilities in the custom server gateway, the Sia API integration or underlying libraries could expose user data or compromise the system’s integrity. It is expedient to stay updated with security patches and conduct thorough security audits.

Integration Challenges: Integrating Sia’s functionalities with traditional protocols like NFS/iSCSI could be complex. Compatibility issues or unexpected behavior might arise during integration and require a very thorough development and testing effort.

Data Chunking and Distribution: Implementing efficient data chunking and distribution algorithms, especially for large files, could be challenging. Inefficiencies could lead to performance issues, data corruption or wasted storage space on the Sia network.

Sia Network Immaturity: Being a relatively young project compared to other established battle-tested centralized storage solutions, the network could experience technical glitches or rather unexpected behaviors when accessed that could impact and compromise the server gateway’s functionality and hence, the NFS/iSCSI’s subsequent functionalities.

Scaling, Fault Tolerance & Latency: Developing a multi-tiered system like this could be demanding in both compute and memory resources. Both vertical and horizontal scaling concerns and issues might not be well addressed.

Also, error handling, retries, restart and supervision might not be well handled which could lead to system crashes and irrecoverable failures.

Moreover, the speed of storage, retrieval, and myriads of other operations and tasks to be performed both by the core server gateway and the persistence layer could cause higher latency and in turn very low speed of execution and crashes.
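One common way to address the retry concern raised above is exponential backoff. The sketch below is a generic illustration rather than the project's chosen mechanism: the `retry` helper, its parameters and the choice of `ConnectionError` as the retryable failure are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 4, base_delay: float = 0.5,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`, retrying on ConnectionError with exponential backoff."""
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == attempts - 1:
                raise  # out of attempts: let the caller (or a supervisor) decide
            sleep(base_delay * 2 ** attempt)  # waits 0.5s, 1s, 2s, ... by default
    raise ValueError("attempts must be at least 1")
```

The `sleep` parameter is injectable so that the backoff schedule can be unit-tested without actually waiting.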

Testing Limitations: Implementing comprehensive testing for a system interacting with a decentralized network like Sia could be difficult. Inefficient testing could lead to bugs or unforeseen issues surfacing after deployment.

(2) Project Management Risks:

Limited Resources: Developing a project as a single engineer presents resource constraints. Furthermore, time constraints might necessitate compromises in functionality depth or security compared to a larger team approach.

Scope Creep: As the project progresses, new features or functionalities might seem wanted and desirable. When not carefully managed however, scope creep can lead to delays, budget overruns, and a departure from the core focus - the user-owned data goals.

Unexpected Delays: Unexpected technical hurdles, dependency issues with libraries, or unforeseen personal challenges could lead to project delays. Maintaining a flexible schedule and clear communication - such as the monthly reporting - could serve as essential strategies for mitigating such risks.

(3) External Risks:

Sia Network Disruption: While unlikely, major disruptions within the Sia network, for instance - a large-scale outage or security breach, could impact not only the server gateway’s functionality, but also the user experience and access to data.

Fluctuations in Siacoin Price: The price of Siacoin could fluctuate significantly. This could affect the cost of storing data on the Sia network, hence, impacting user adoption of the proposed infrastructure.

Proposed Risk Mitigation Strategies:

Continuous Logging and Monitoring: Continuously monitoring the Server Gateway and the Sia network for stability, performance and error issues through logging and a robust monitoring strategy helps mitigate poor oversight and hence undetected errors and crashes. Regularly reviewing and updating libraries and dependencies would also help to address known vulnerabilities and application-level bugs.

Phased Development: Implementing project features in phases while focusing on core functionalities first makes for quick testing and iteration before expanding to more complex adjunct features later.

Robust Architecting, Designing and Development: A well architected, designed and developed system and infrastructure mitigates the risk of scaling, fault-tolerance and latency issues. Technical concerns like concurrency, asynchronous solutions, optimization strategies, design patterns, caching, etc. could be built into the very core of the system.

Thorough and Holistic Testing: While comprehensive and holistic test coverage might be challenging, it could be approached by prioritizing both unit and feature testing to ensure that all parts of the system operate as intended. Also, using community testing tools or soliciting community feedback during development would be taken into consideration.

Clear Communication: Maintaining clear communication with potential users and the community regarding the project’s potential strengths, weaknesses, limitations, and other risk metrics associated with using such a system would help set accurate expectations.

Contingency Planning: Developing contingency plans for potential network disruptions or unexpected resource limitations could be taken into consideration. Scope creep and unexpected delays could be controlled by allotting another month to the project duration (hence, by making the duration four (4) months, all edge cases could be covered).

All these however, would depend on the timeline and constraints finally imposed on the scope of the project.

Development Information

Will all of your project’s code be open-source? Of course, all will be open source.

Leave a link where code will be accessible for review.

Monthly progress reports submission:

Project’s progress reports will be submitted monthly on the forum

Contact info

Email: [email protected]

Other preferred contact methods:

[email protected]

https://linkedIn.com/in/emmanuel-damilare-adediji

https://x.com/EmmanuelDa17210

I came up with the schematics and architectural overview of the proposed system. I have attached relevant images below.

P.S. - Apologies for any inconvenience regarding the hand-drawn images - I am in a resource-constrained environment.










Hello @Youngemmy,

Thanks for your new proposal to the Sia Foundation Grants Program. The committee is requesting some additional information:

  • Your proposal mentions both NFS and iSCSI, which are distinct protocols. We would like to see you focus on one protocol for the time being, and in the future we can explore expanding the project.
  • We would like to see a revised timeline for the project. The committee is concerned that the stated goals would not be achievable in a 3 months timeframe and would like you to re-evaluate this.
  • The committee would like to see some examples of your previous development work to support your application, given the scope and proposed timeline for the project.

Once the above is provided the committee can re-evaluate this grant.

Regards,
Kino on behalf of the Sia Foundation and Grants Committee

Hello.

It’s great hearing back from you.

  • First, I agree that NFS and iSCSI are distinct protocols. I would be focusing on the NFS protocol first.

Request:
We would like to see a revised timeline for the project. The committee is concerned that the stated goals would not be achievable in a 3 months timeframe and would like you to re-evaluate this.

Response:

  • It’s true - the project completion could be impacted if not properly planned, implemented, tracked and managed. However, here are some strategies that could speed me up:

  • I would be using a language suited for rapid prototyping and gluing systems together (Python)

  • I will not be re-inventing the wheel, as my solution will consist of utilizing and re-using well-established open-source libraries, frameworks and toolsets (e.g. Celery for async queues, Thespian for actor-based concurrency, PyNaCl and PyCryptodome for encryption, dogpile.cache for caching, Cosmian KMS for key management, nfsv4 for client etc.) and existing standards (e.g. Dublin Core, ISO)

  • To speed up Sia network connectivity and interactivity, I will be using the S5 channel (developed by Redsolver) at the Storage layer.

  • Agile development through horizontal cross-development and quick Unit Testing.

Nevertheless, despite the aforementioned and as previously explicated in my proposal, scope creep could occur indeed. I have come up with a revised timeline and more detailed corresponding project breakdown accordingly, which I think is much more suited for the execution:

Week 1:

  • Project/Infrastructure setup
  • Environment installation and Configuration of Docker, Poetry, Celery, Redis, etc.
  • Setting up project directory structure
  • Initializing abstract classes (interfaces), base design templates and dependencies for all layers (Server Gateway, Storage Layer, Abstraction Layer and Client Layer)

Week 2 - 3:
MAJOR

  • Server Gateway Core (1)
  • Non-Deterministic Crypto Strategy
  • (As-)Symmetric Crypto Algos
  • Block/Stream Crypto Algos
  • Post-Quantum Crypto Algos
  • Compression
  • Caching
MINOR
  • Planning, Strategizing & Design-Thinking on NFS Client & Abstraction Layer.

Week 4 - 6:
MAJOR

  • Server Gateway Core (2)
  • Data Governance
  • Policy Enforcement
  • Data Standardization
  • Data Provenance, Fingerprinting & Logging
MINOR
  • Beginning the RESTful API planning and setup
  • Python’s FastAPI, custom Gateway’s DB setup & basic interactions

Week 7 - 8:
MAJOR

  • Server Gateway Core (3)
  • Key Storage, Management & Rotation
  • Authorization & Authentication
  • Permission Provisioning
  • RESTful API exposure (1)
MINOR
  • Implementing & unit-testing a few Client & Abstraction layer classes
  • Implementing & unit-testing a few Persistence layer classes

Week 9 - 10:
MAJOR

  • Persistence Layer Implementations
  • Transactioning & STM
  • Sia Storage/Retrieval interaction via S5
  • Dynamic Space Allocation & Query
  • Persistence Time- & Key- Stamping
MINOR
  • RESTful API exposure (2)
  • Implementing & unit-testing a few Abstraction layer plugs (interaction with both Client and Backend exposed API)

Week 11 - 12:
MAJOR

  • Thorough focus on NFS client & Abstraction Layer implementations.
  • Abstraction Layer Plugs and Wrappers
  • Client Layer Plugs
  • Thorough Integration Testing of Client, Abstraction, & Server Gateway Layers.
MINOR
  • Finalizing the persistence layer
  • Thorough Integration Testing of Persistence and Server Gateway Layers.
  • System Deployment (1)

Week 13 - 14: (If Approved)
MAJOR

  • System Deployment (2)
  • Performance metrics gathering
  • System Documentation
  • Holistic, thorough & system-wide feature and Integration Testing
MINOR
  • Holistic Debugging and optimization
  • Alternate Web UI Dashboard - Designing & Envisioning

Week 15 - 16: (If Approved)
MAJOR

  • Alternate Web UI Dashboard - Full Implementation & Backend connection.
  • System Deployment & Documentation (Enhancements, Improvements & Finalization)

CAVEATS.
Some development feature concerns are cross-cutting and may involve going back and forth in an overlapping manner…

Other measures to ensure speed, quality, agility and effectiveness:

  • A hybrid of both Defensive Programming (Exception Handling) and DBC (Design By Contract) practices coupled with aforesaid unit-testing will be used to detect bugs quickly & early, speedily ensure class-level integrity, while not sacrificing on quality and effective development, moving forward.

  • SOLID, DRY and other software engineering principles will be strictly adhered to.

  • DI/IOC, Python’s Poetry & Docker container will be utilized to streamline and manage cross-dependencies, pre-requisites and complex operational requirements at the code, system and infrastructure levels respectively.
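As an illustration of the Defensive Programming and Design by Contract combination listed above, the following sketch attaches a precondition and a postcondition to a function through a decorator. The `contract` decorator and the `split` example are hypothetical and written only for this illustration; dedicated contract libraries exist and may be a better fit in practice.

```python
import functools
from typing import Callable


def contract(pre: Callable[..., bool], post: Callable[[object], bool]):
    """Attach a precondition (on arguments) and a postcondition (on the result)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            if not pre(*args, **kwargs):
                raise ValueError(f"precondition failed for {func.__name__}")
            result = func(*args, **kwargs)
            if not post(result):
                raise RuntimeError(f"postcondition failed for {func.__name__}")
            return result
        return wrapper
    return decorator


@contract(pre=lambda data, size: size > 0, post=lambda chunks: all(chunks))
def split(data: bytes, size: int) -> list:
    """Split data into non-empty chunks of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]
```

Calling `split(b"x", 0)` fails fast with a `ValueError` from the precondition instead of surfacing as a less obvious error deeper in the call.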

Request:
The committee would like to see some examples of your previous development work to support your application, given the scope and proposed timeline for the project.

Response:
I am a polyglot software engineer with 6+ years of experience. While my repository (emmYgd (Adediji Emmanuel Damilare) · GitHub) showcases lots of projects that I have been a part of, for the sake of this application I will be specific about some allied examples:

(1)
A fault-tolerant system for an IoT infrastructure written in both Java and Groovy.

Role: My part was to implement the networking and encryption layer - I utilized the Java Networking API and Java Cryptography Extension(JCE). Concurrency was handled using the Groovy GPARS library.

Code Patches and Samples can be found here:

(2)
A cloud infrastructure whose core is in the similitude of a message queue.

Role: My part was to implement the event-bus layer using the “subscribe-consume” model - I utilized Python’s Twisted networking and Thespian for actor-based concurrency.

Code Patches and Samples can be found here:

(3)
An encryption and base layer components implemented as part of an academic research for some client’s freelance post-grad project on Data Governance.

Role: My part was to implement the base layer core for the research project in Python. I utilized PyNaCl, PyCryptoDome and documented their Data Standards.

Code Patches and Samples can be found here:

Aside the aforementioned, I have participated in various projects involving different programming languages, architectural paradigms, technology stacks and frameworks.

These range from a backend API for an e-commerce shopping project (GitHub - WiCartitOrg/WiCartIt-Laravel-BE: WiCartIt Backend infrastructure with Laravel Web and Eloquent ORM frameworks in PHP) to a real estate app that connects house owners to prospective tenants (GitHub - emmYgd/Rentium: Full Stack Web Application. Backend in Laravel and Frontend in Next.js), and from TypeScript projects at peiges.co (Peiges · GitHub) to various other research projects in my career.

For a more holistic capture and a broader perspective into my professional background, career trajectory exegesis and projects implemented, you could visit and read through my LinkedIn profile at: https://linkedin.com/in/emmanuel-damilare-adediji

Best Regards

Hello, wanting to drop in and make a comment about something to be clear. You seem to want to make use of S5, and that could be possible through its HTTP APIs… though they require a bit more trust.

S5 as a whole though currently does not have a Python port yet (if not using HTTP), so I would recommend you verify how you plan to make use of S5 in your architecture.

Kudos.

1 Like

Hello.

Thank you for the recommendation. It is duly noted.

As you rightly pointed out, the HTTP API exposed by S5 cannot be used as is, given the nature and scope of my project: plain HTTP would leave the system open to man-in-the-middle and many other attacks and breaches. It is highly likely that my Server Gateway Layer and Storage Layer would be deployed on separate servers, hence the need for security at the network layer between these two components.

There were two design choices that I was thinking about and planning for:

  • Implement a key exchange protocol like Elliptic-Curve Diffie-Hellman (ECDH) with ephemeral key designs between the Server Gateway and the Storage Layer. A signature layer could also be implemented to strengthen system integrity. This could be accomplished via the PyNaCl, PyCryptodome and Cryptography libraries.

  • Implement a Wrapper Layer in the similitude of a reverse proxy on top of the existing S5 HTTP server. This proxy will wrap the existing S5 HTTP service in an HTTPS context. This could be accomplished using open source libraries and services like pyOpenSSL, Let’s Encrypt certificates and custom network implementations.

Pros

The first option could present more flexibility in algorithmic implementations, ad hoc & dynamic network layer protection, and likely less protocol overhead.

The second option could present more speed of implementation, re-use (HTTPS already contains a key exchange protocol), robust execution and in-built performance optimization.

Cons

The first option could require more development time and effort and complicate the implementation, increasing the risk of undetected bugs and vulnerabilities.

The second option could present protocol overhead due to its robust nature.

Considerations

To keep to the timeline, and in line with the re-use principle I stated earlier (avoiding re-inventing the wheel as much as possible), I am strongly leaning towards the second option.

In addition, I will put certain measures in place to further improve the security and integrity of the Gateway-Storage interaction and of the Storage layer itself:

Transactions / STM: As mentioned in my previous explanations, I’ll be making the system ACID-compliant by using transaction-oriented concurrency approaches. To this end, Python software transactional memory and Thespian actor libraries will be utilized.

Asynchronous Requests: I would use asynchronous libraries such as aiohttp with Python’s asyncio to make non-blocking HTTP requests. This helps handle multiple requests concurrently and hence reduces wait times.

Request Batching: I would combine multiple small, disparate requests into a single batch request to reduce the number of network round trips.

Connection Pooling: I would re-use existing connections instead of creating new ones for each request. The aforementioned Python libraries provide this out of the box.

HTTP/2: I would also use HTTP/2, which allows multiplexing multiple requests over a single connection, reducing latency and overhead.

Compression: Just as in Server Gateway data-level compression, I would enable compression for request and response payloads to reduce data transfer size.

Caching: Just as in Server Gateway data-level caching, I would be caching responses for repeated requests to avoid making the same HTTP call multiple times.
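As a rough illustration of the asynchronous request strategy, the sketch below issues several requests concurrently with asyncio. The `fetch_object` coroutine is a hypothetical stand-in that only simulates latency; in the real gateway it would be replaced by an actual HTTP call to the storage layer.

```python
import asyncio


async def fetch_object(key):
    """Hypothetical stand-in for one non-blocking request to the storage layer."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"payload-for-{key}"


async def fetch_many(keys):
    """Run all requests concurrently and return the results keyed by object name."""
    payloads = await asyncio.gather(*(fetch_object(key) for key in keys))
    return dict(zip(keys, payloads))


results = asyncio.run(fetch_many(["a", "b", "c"]))
```

The three simulated requests overlap instead of running back to back, so the total wait is close to that of a single request.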

With a blend of the strategies above, I believe the storage layer will be built in a secure and robust manner, improving the overall performance of the proposed system.

Regards

Thanks for the updates to your proposal. Please include an updated budget section to reflect your new timeline. The only budget in this thread is the one for your three months of work.

Hello. It’s great hearing from you.

Included below is an updated budget section to reflect my newly proposed timeline.

Amount of money requested and justification with a reasonable breakdown of expenses:

Budget Breakdown:

Development (4 Months - if approved): $9,216 (48 hrs/week, $12/hr)

Justification: This hourly rate reflects my experience in Software development and expertise with relevant technology stacks. Working 48 hours per week for 4 months (if approved) allows for focused and efficient development.

The other parts of the budget remain as they were and have been restated here for emphasis:

Deployment: $900

Justification:

Cloud Resources: $500 (Estimated cost for a CPU/IO-bound virtual machine instance on a cloud platform, leveraging the free tier as much as possible)

CI/CD: $200 (For automated builds and deployment)

Hosting/Domain Name: $200 (Cost of domain registration and general hosting for the gateway service)

Hardware: $1,400

Justification:

1 MacBook Pro: ($1300) This budget allows for a used MacBook Pro with sufficient performance for development. Exploring used options helps maximize budget allocation for essential project components.

1 5G Router for Internet Connectivity: ($100)
Having a standby gadget for internet access is a boost to my productivity and efficiency in getting the job done. As I live in a developing country with unstable internet access, this will boost my productivity and workflow.

Alternatives: Had existing hardware been available, it would have been used; however, I am in a resource-constrained environment.

Workspace Rental: $500

Justification: Renting a dedicated workspace with constant electricity is necessary for the fast and effective completion of the project. As I live in a developing country with unsteady power supply, this will boost my productivity and workflow.

Contingencies: $200

Justification: Unforeseen and unplanned situations and circumstances

Total Project Budget: $12,216
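For anyone who wants to double-check the figures, the total can be reproduced with a few lines of arithmetic. This assumes 4 working weeks per month, so 16 weeks for the 4-month timeline.

```python
HOURLY_RATE = 12     # USD per hour
HOURS_PER_WEEK = 48
WEEKS = 16           # assumption: 4 months x 4 weeks

development = HOURLY_RATE * HOURS_PER_WEEK * WEEKS

line_items = {
    "Development": development,
    "Deployment": 500 + 200 + 200,  # cloud + CI/CD + hosting/domain
    "Hardware": 1300 + 100,         # MacBook Pro + 5G router
    "Workspace": 500,
    "Contingencies": 200,
}
total = sum(line_items.values())
```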

Thanks for the edits to your proposal and the additional info on your budget! After review, the committee has decided to reject this proposal. They cited the complexity and focus of the proposal. Highlighting elements like post-quantum cryptography seemed out of place as well. Finally, your linked Github repos didn’t include enough recent development work to ease delivery concerns.

We wish you the best with your future development work!

Hello.
Great hearing from you.

Concerning post-quantum cryptography, it is well established in Python, with libraries that have NIST-certified algorithms. In addition, the crypto key management system that I proposed (Cosmian KMS) already has a post-quantum algorithmic flow…

The flow could have been: a secret key encrypts the data, an ephemeral EC public key encrypts the secret key, and an ephemeral PQ public key encrypts the EC private key. The respective private keys are then stored, managed and rotated by the Cosmian KMS. This isn’t a complex flow.

Also, concerning my linked GitHub repos: although my most recent development work hasn’t been in this area, I am confident that the experience I have gained over the years would be valuable and would speed up delivery.

Concerning the complexity and focus of the proposal, can I re-apply if I cut down on the proposed gateway requirements or change the proposed protocol?

Regards…

Project Name:
Sia iSCSI Server Gateway (sia-iscsi)

Name of the organization or individual submitting the proposal:
Emmanuel Damilare Adediji

Describe your project:
Sia-iSCSI gives access to the Sia storage network through a normal iSCSI interface, allowing any iSCSI-compatible client to access the hosted files and directories.

It could be used for local access or for network-wide deployment.

The project will be implemented in Python, a robust, scalable programming language, and will be available on major platforms - Linux is guaranteed first, then clients on macOS and then Windows should work.

To avoid overhead, the final deliverable will be shipped with no additional dependency requirements save renterd and common system libraries.

How does the projected outcome serve the Foundation’s mission of user-owned data?
The access through iSCSI offers several key advantages:

  • iSCSI clients are well established and widely deployed on diverse platforms.

  • True plug-and-play, transparent access to Sia-hosted content from iSCSI applications.

  • High data uniformity and discoverability with open metadata standards and file formats (Dublin Core and H5)

  • The proposed gateway can make the Sia storage network, and hence its hosted contents, available to an entire network, which encourages adoption.

  • Data security is guaranteed through simple encryption workflows.

  • Monitoring through simple file logging interface.

  • Seamless and easy enterprise integration of the gateway into existing IT strategies.

Grant Specifics
Amount of money requested and justification with a reasonable breakdown of expenses:


Budget Breakdown:

Development (3 Months): $6,912 (48 hrs/week, $12/hr)

Justification: Working 48 hours per week for 3 months allows for focused and efficient development.

Hardware: $1,400
Justifications:

1 MacBook Pro: ($1300)
Justification: This budget allows for a used MacBook Pro with sufficient performance for development. Exploring used options helps maximize budget allocation for essential project components.

1 5G Router for Internet Connectivity: ($100)
Justification: Having a standby gadget for internet access is a boost to my productivity and efficiency in getting the job done.

Workspace Rental: $500
Justification: Renting a dedicated workspace with constant electricity is necessary for the fast and effective completion of the project.

Contingencies: $200
Justification: Unforeseen and unplanned situations and circumstances

Total Project Budget: $9,012

Throughout the development stage, the testnet will be used; hence, Siacoin isn’t required for storage.

What are the goals of this small grant?

The grant’s goal is to secure enough funding to make the development of the proposed Sia iSCSI Gateway possible.

The time estimate is based on a workload analysis drawn from similar experience building enterprise systems, and on re-using existing Python iSCSI server libraries where appropriate.

Development Timeline:

Three fundamental milestone deliverables are envisioned:

  • VERSION 1.0:

  • At the end of the 4th week, a basic version will be released which contains basic functionalities.

  • This version will already be fast and performant, as concurrency is built in from the start using an actor-based architecture.

  • However, other features like caching, compression, encryption and logging will not be present.

  • This version will be for testing and Proof of Concept scenarios.

  • VERSION 2.0:

  • At the end of the 8th week, there will be very significant improvements on VERSION 1.0. Now, more optimization and performance gains will be incorporated and realized through fast Metadata Caching and File Compression implementations.

  • Deployment through a Docker image will be available.

  • Client tests for Linux and macOS will have been performed.

  • Documentation with basic usage content will be available.

  • This version will be for general usability and Minimum Viable Product scenarios.

  • VERSION 3.0:

  • At the end of the 12th week, there will be security and monitoring integrated into VERSION 2.0. Now, fast encryption and logging will be incorporated.

  • Task call architecture will be made asynchronous, squeezing out further performance.

  • VERSION 3.0 Docker image will be available.

  • Client tests for Windows will have been performed.

  • Documentation with more complete usage content will be available.
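To give an idea of what the actor-based concurrency mentioned for VERSION 1.0 could look like, here is a minimal sketch using only the standard library. The class name, message format and in-memory block store are hypothetical and chosen for illustration; the actual gateway may well use an existing actor library instead.

```python
import queue
import threading


class BlockStoreActor:
    """Minimal actor: private state, one mailbox, one worker thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._blocks = {}  # only ever touched by the worker thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while True:
            message = self._mailbox.get()
            if message is None:  # stop sentinel
                break
            op, block_id, data, reply = message
            if op == "write":
                self._blocks[block_id] = data
                reply.put(True)
            elif op == "read":
                reply.put(self._blocks.get(block_id))
            else:
                reply.put(None)  # unknown operation

    def ask(self, op, block_id, data=None):
        """Send a message and wait for the actor's reply."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((op, block_id, data, reply))
        return reply.get()

    def stop(self):
        self._mailbox.put(None)
        self._thread.join()


store = BlockStoreActor()
store.ask("write", 0, b"hello")
block = store.ask("read", 0)
missing = store.ask("read", 99)
store.stop()
```

Since all state changes go through the mailbox and are handled by a single thread, callers never need locks around the block store.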

Features & Scope:

  • Fully open source (Apache-2.0 license) via public repository on GitHub.

  • A stand-alone, simple, containerized and performant server-gateway that is cross-platform, highly adaptable and iSCSI/renterd-compliant.

  • Fast out-of-the-box caching and compression to improve speed, access performance, reduce latency, and lower entry barrier and usage costs.

  • Fast/Efficient out-of-the-box encryption and logging to facilitate security, monitoring and integration into existing enterprise IT strategies and requirements (such as compliance)

  • Holistic documentation, sample configurations and use-case tutorials.
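As a small illustration of the compression feature, the sketch below round-trips a block of data through zlib from the standard library. The choice of zlib, the helper names and the compression level are assumptions; the final codec has not been decided in this proposal.

```python
import zlib


def compress_block(data, level=6):
    """Compress one block of bytes before it is uploaded."""
    return zlib.compress(data, level)


def decompress_block(blob):
    """Restore the original bytes after download."""
    return zlib.decompress(blob)


original = b"sia-iscsi block " * 256  # highly repetitive sample data
packed = compress_block(original)
restored = decompress_block(packed)
```

How much is saved depends heavily on the data; already-compressed files such as videos or archives will shrink very little.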

Potential risks that may affect the outcome of the project:

  • Integration Challenges: Integrating Sia’s functionalities with traditional protocols like iSCSI could be complex. Compatibility issues or unexpected behavior might arise during integration and require a very thorough development and testing effort.

  • Security Vulnerabilities: Security vulnerabilities in the custom server gateway, the Sia API, or underlying libraries could expose user data or compromise the system’s integrity. It is important to stay updated with security patches and conduct thorough security audits.

  • Excessive metadata caching could lead to memory overruns if not managed effectively.

  • Compression, encryption and logging could add overhead if the system is not well architected for scaling, latency, concurrency and other high-impact concerns.

  • Cross-platform challenges could arise. While Linux and Unix-variant iSCSI clients exist and are stable, deployment on Windows could pose a serious challenge due to various incompatibilities.

  • Sia Network Disruption: While unlikely, major disruptions within the Sia network, for instance - a large-scale outage or security breach, could impact not only the server gateway’s functionality, but also the user experience and access to data.
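On the metadata caching risk listed above, one common way to keep memory bounded is a least-recently-used (LRU) cache with a fixed entry limit. The sketch below is an assumption about how that could be handled, with hypothetical names; it is not a committed design.

```python
from collections import OrderedDict


class BoundedMetadataCache:
    """LRU cache that never holds more than max_entries items."""

    def __init__(self, max_entries):
        self._max_entries = max_entries
        self._entries = OrderedDict()

    def get(self, path):
        if path not in self._entries:
            return None
        self._entries.move_to_end(path)  # mark as most recently used
        return self._entries[path]

    def put(self, path, metadata):
        self._entries[path] = metadata
        self._entries.move_to_end(path)
        if len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used

    def __len__(self):
        return len(self._entries)


cache = BoundedMetadataCache(max_entries=2)
cache.put("/a", {"size": 1})
cache.put("/b", {"size": 2})
cache.get("/a")               # "/a" becomes most recently used
cache.put("/c", {"size": 3})  # evicts "/b", the least recently used
```

The limit could equally be expressed in bytes instead of entries; the eviction logic stays the same.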

Development Information
Will all of your project’s code be open-source?

Certainly.

Under the Apache-2.0 license, the code will be fully open source and will be made available on GitHub.

Coupled with that, all deployed libraries will also be open source.

Leave a link where the code will be accessible for review.
A GitHub repository will be created once the grant is approved.

Do you agree to submit monthly progress reports?
Of course.

Contact Info
Email: [email protected]