Grant Proposal: Positioning Sia as the leader of newfound research data sharing interest

Introduction

Project Name: Expanding the Horizons of Research Data with Blockchain: A Project Proposal

Name of the organization or individual submitting the proposal: V Raja, affiliated with the Center for Translational Research in Neuroimaging and Data Science (TReNDS) at Georgia State University (GSU), the Georgia Institute of Technology, and Emory University.

Description of your project, who benefits, and how the project will serve the Foundation’s mission of user-owned data:

This project focuses on researching and developing secure, efficient, and scalable mechanisms for handling and sharing large-scale research data. Leveraging the Sia network’s blockchain technology, it aims to enhance data security, reduce storage and egress costs, and potentially contribute improvements to Sia. Primary beneficiaries are academic researchers and institutions that require secure, cost-effective data-sharing solutions, particularly in neuroinformatics; the work is also intended to pave the way for other research fields to adopt the same data-sharing mechanism later. The project aligns with the Foundation’s mission of user-owned data by ensuring data control resides with users while maintaining stringent standards of data security and privacy.

Here is a more detailed breakdown of the specific objectives in line with the project’s vision:

  1. Secure Research Data Sharing: Research data, particularly in the realm of neuroinformatics, contains sensitive information that mandates stringent measures to ensure its security during storage and transmission. The proposed project aims to leverage blockchain technology’s immutable and tamper-proof nature to provide a secure platform for research data sharing. This will enable researchers to share data with the assurance that the integrity of their information is maintained. It also intends to provide a mechanism for direct data transmission from the scanner to the blockchain platform, further enhancing data security by minimizing points of vulnerability.
  2. Large-Scale Data Management: As research data grows exponentially, the efficient management of such large volumes of data becomes increasingly challenging. The project aims to address this by leveraging the Sia network’s scalability to handle and store large-scale data effectively, thus facilitating seamless data retrieval and sharing among researchers.
  3. Cost Reduction: Traditional cloud storage services often come with significant costs, particularly for large-scale data. By utilizing the Sia network’s cost-effective data hosting capabilities, the project aspires to significantly reduce storage and egress costs for research institutions.
  4. Compliance with Legal and Ethical Standards: A key objective of the project is to ensure that data sharing complies with prevailing legal and ethical standards, such as the Health Insurance Portability and Accountability Act (HIPAA) and NIH data-sharing requirements. To this end, the project will investigate and develop protocols to anonymize data, thereby safeguarding the privacy of individuals whose data forms part of the shared research data.
  5. Enhancements to the Sia Network: While the Sia network will form the foundation of the project, it’s anticipated that the project could lead to the identification of potential enhancements to the Sia network. This could further strengthen Sia’s position as a leading solution for decentralized data storage.

This project has the potential to revolutionize the way research data is stored and shared. It addresses many of the pressing issues faced by the research community, such as the need for secure, efficient, and cost-effective data sharing. In doing so, it can serve as a pioneering initiative that paves the way for the broader adoption of blockchain technology in academic research and beyond.

By successfully completing this project, Sia can position itself as a data leader in the research domain, offering solutions that not only cater to the present needs but are also capable of evolving in line with the future requirements of the ever-growing research data landscape.

The budgeting and funding aspects will be handled by Georgia State University, and invoicing will be carried out accordingly, thus maintaining financial transparency and accountability throughout the project lifecycle.

Lastly, the results and findings of this project will be integral to my PhD research, further contributing to the academic value and scholarly relevance of the project.

The journey ahead will surely be challenging, but it offers potentially high rewards in reshaping how research data is handled, shared, and secured. We look forward to the opportunity to make significant strides in this exciting and crucially important endeavor.

Grant Specifics

Amount of money requested and justification with a reasonable breakdown of expenses:

The project requests $3 million over a period of six years for successful execution. The budget breakdown is as follows:

  1. Research & Development: $1.2 million
  2. Product Prototyping & Development: $500,000
  3. Testing, Onboarding Researchers, Universities, and Labs: $600,000
  4. Conferences, Marketing & Travel: $400,000
  5. Administrative & Miscellaneous: $300,000

The budget reflects the needs for platform development, user interface design, rigorous testing for scalability and performance, marketing and promotion to onboard users, and administrative expenses for seamless project execution.

Timeline with measurable objectives and goals:

Year 1: Initial research, development of project framework, and creation of partnerships with academic institutions.
Year 2: Further research and development, beginning of the development phase for the platform.
Year 3: Continuation of platform development, initiation of pilot testing with select research data.
Year 4: Completion of platform development, full-scale testing, and refinement.
Year 5: Project completion, full launch of the platform, and expansion to accommodate additional types of research data.
Year 6: Evaluation of project outcomes, concluding the research, and planning for future developments.

Potential risks that will affect the outcome of the project:

Risks include technical complexities related to handling large-scale data on the blockchain, ensuring high levels of performance, and compliance with legal and ethical standards. To mitigate these, the project will leverage the technical expertise within TReNDS and the affiliated universities.

Development Information

Will all of your project’s code be open-source?
Yes, in line with the principle of transparency in research and development, all developed code for the project will be open-source.

Leave a link where code will be accessible for review.

To be provided once the project has commenced and the repository has been established.

Do you agree to submit monthly progress reports?
Yes, I agree to provide regular updates on the progress of the project via monthly reports in the forum.

Contact info

Email: MY NAME@gsu.edu
LinkedIn: https://www.linkedin.com/in/amrvignesh/

More info: https://neurophd.notion.site/sia-e283626b07a94dac83c1d48b8bc7cb00

Hello Mr Raja,

Thanks for the proposal!

Just for some context, this is the largest grant proposal we’ve received by a large margin. We’re not opposed to it strictly on those grounds, but have a few recommendations for you that will help the committee properly evaluate a project at this scale.

  1. Reduce the scope and timespan. Our grants program has only been around for nine months, so our current largest grants are still ongoing. We’d much prefer to see this grant broken down into six- to 12-month intervals, with a new proposal submitted for the first period of time. This new proposal for the first leg of the project should have its own goals and milestones, risk assessment, etc. Please refer to the template when submitting your proposal (shown when you go to create a new thread in the Grants category).
  2. We’ll need a significantly more detailed budget. On the current proposal for example, we’d want a number of line items under each of your five categories detailing where estimated expenses will go. Since you’ll be submitting a new proposal, please consider things like estimated hires as well.

We’re excited to see these updates. Thanks again!

Regards,
Kino on behalf of the Sia Foundation and Grants Committee

Dear Sia Grants Committee,

Thank you for the thoughtful feedback. We will submit distinct proposals for each annual phase with details on milestones, budgets, and metrics.

However, we suggest considering the full multi-year vision and funding, as research data needs are expanding exponentially. Labs participating as Sia hosts and renters are essential to supply and utilize this capacity. This ecosystem requires an enduring commitment beyond short-term phases.

While early traction is achievable in months, realizing Sia’s full potential to transform research data management needs sustained effort over years to drive adoption, integrations, and network growth. The long-term impact will not happen overnight.

Approving the 6-year strategic roadmap upfront would empower more holistic planning and execution towards the ultimate goals, which require patience and continuity. In addition to monthly reports, we suggest billing the approved grant amounts on a quarterly or monthly basis tied to the completion of work items in each period. This incremental approach provides regular oversight.

We aim to balance demonstrating phase progress with pursuing the greater vision. This substantial initiative can reinvent research data management, but needs commensurate support. We welcome suggestions on conveying this effectively.

Sincerely,
V Raja

Here is an updated outline that incorporates research-data use cases for Sia, especially in neuroinformatics, into the 6-year proposal timeline:

Year 1:

  • Set up core Sia infrastructure for decentralized storage and data sharing
  • Develop apps for researchers to manage, organize and share data on Sia
  • Begin developing integration for MRI scanners to transmit scans directly to Sia
  • Initial proof of concept with sample DICOM dataset streaming
  • Onboard 5 pilot neuroinformatics labs, quantify storage needs
  • Support computational pipelines by partitioning tasks across lab datasets on Sia
  • Have 2 labs allocate servers as Sia hosts to earn storage revenue
  • Publish papers introducing Sia research data management platform

Year 2:

  • Expand storage capacity as more researchers onboard
  • Enable streaming EEG data directly from devices to Sia
  • Add compliance features like HIPAA access controls tied to Sia identities
  • Develop federated learning capabilities using data partitioned on Sia
  • Launch data marketplace for researchers to sell access to datasets
  • Enable 5 more labs to become Sia hosts to grow capacity
  • Support advanced analytics by leveraging computational resources on Sia

Year 3:

  • Offer features like versioning, replication, and backup for stored data
  • Create data provenance tracking by leveraging Sia’s immutable ledger
  • Develop identity management for patients to control data access
  • Create reference architectures for using Sia in neuroinformatics workflows
  • Incentivize Sia usage

Year 4:

  • Expand scope beyond neuroscience into other research domains
  • Conduct pilots with labs in genomics, physics, astronomy, etc.
  • Identify cross-discipline decentralized data management needs
  • Develop modular solutions on Sia tailored to different fields
  • Create domain-specific apps and custom integrations as needed

Year 5:

  • Focus on performance, stability, and optimization for diverse data types
  • Implement data lifecycle policies for long-term research datasets
  • Develop advanced analytics and monitoring capabilities on Sia
  • Promote interdisciplinary usage of shared tools and data on Sia

Year 6:

  • Evaluate sustainability plan for decentralized governance
  • Transfer ownership to research consortiums if needed
  • Create comprehensive documentation and training programs
  • Prepare to open source platform components and learnings
  • Position Sia for commercial research usage beyond academia

Updated grant proposal for the first year:

Organization: Georgia State University/Georgia Institute of Technology/Emory University Center for Translational Research in Neuroimaging and Data Science (TReNDS)

Purpose: This project will benefit neuroscience researchers by providing a decentralized platform for affordable, secure, private data storage and sharing powered by Sia. It aligns with Sia’s mission of user-owned data by giving researchers control over storage and dissemination of their data.

Open Source: All code will be open source and contributed back to the Sia ecosystem.

Timeline and Goals:

  • Months 1-2: Set up core Sia infrastructure, renter/host nodes, and onboard initial researchers.
  • Months 3-5: Develop web apps with COINS (Collaborative Informatics and Neuroimaging Suite) integration for researchers to manage data on Sia.
  • Months 6-8: Integrate streaming of sample MRI DICOM datasets from scanners to Sia by means of IoT and COINS.
  • Months 9-12: Support 5 pilot labs, publish at least 2 papers on the platform, and present at a conference.

Risks: Researchers may hesitate to adopt new data management practices; we will mitigate this through extensive support, training, and user champions. The project also depends on the stability of the Sia platform.

Budget: Total request: $555,000

| Expense | Details | Price (Single/Monthly) | Quantity | Total (Annual) |
| --- | --- | --- | --- | --- |
| Research | Half Neuroinformatics Researcher Billing | $8,000 | 2 | $192,000 |
| Engineering | Neuroinformatics/Blockchain Senior Engineer Billing | $10,000 | 1 | $120,000 |
| | Neuroinformatics/Blockchain Engineer Billing | $8,000 | 1 | $96,000 |
| Sia Infrastructure | Storage (TB) - Procurement & Maintenance | $10 | 200 | $24,000 |
| | Renter nodes | $2,000 | 1 | $2,000 |
| | Host nodes | $4,000 | 2 | $8,000 |
| | Siacoin for contracts | $0.005 | 5,000,000 | $25,000 |
| Conference presentations | | $5,000 | 2 | $10,000 |
| Marketing | Content creation | $1,000 | 1 | $12,000 |
| | Website updates | $1,000 | 1 | $12,000 |
| | Print collateral | $5,000 | 1 | $5,000 |
| Onboarding | Lab onboarding | $4,000 | 5 | $20,000 |
| | Researcher/Lab staff training | $2,000 | 10 | $20,000 |
| Operations | Administration | $50 | 100 hrs | $5,000 |
| Reserves | Contingency buffer | - | - | $4,000 |
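
As a quick sanity check on the figures above, the line items can be summed programmatically. The sketch below is illustrative only: the labels are shortened, and the split between rows billed monthly (12 periods) and one-off rows (1 period) is an assumption inferred from the stated annual totals, since the table does not mark it explicitly.

```python
from decimal import Decimal

# (label, unit price in USD, quantity, billing periods)
# ASSUMPTION: 12 periods for rows that appear to be priced monthly, 1 for
# one-off rows; this split is inferred from the stated totals in the table.
LINE_ITEMS = [
    ("Half neuroinformatics researcher", "8000", 2, 12),
    ("Senior engineer", "10000", 1, 12),
    ("Engineer", "8000", 1, 12),
    ("Storage (TB)", "10", 200, 12),
    ("Renter nodes", "2000", 1, 1),
    ("Host nodes", "4000", 2, 1),
    ("Siacoin for contracts", "0.005", 5_000_000, 1),
    ("Conference presentations", "5000", 2, 1),
    ("Content creation", "1000", 1, 12),
    ("Website updates", "1000", 1, 12),
    ("Print collateral", "5000", 1, 1),
    ("Lab onboarding", "4000", 5, 1),
    ("Researcher/lab staff training", "2000", 10, 1),
    ("Administration (hours)", "50", 100, 1),
    ("Contingency buffer", "4000", 1, 1),
]


def line_total(price, quantity, periods):
    """Return price x quantity x periods, using Decimal to avoid float drift."""
    return Decimal(price) * quantity * periods


def total_budget(items):
    """Sum the totals of all line items."""
    return sum((line_total(p, q, n) for _, p, q, n in items), Decimal(0))


if __name__ == "__main__":
    for label, price, quantity, periods in LINE_ITEMS:
        print(f"{label:32s} ${line_total(price, quantity, periods):>12,.2f}")
    print(f"{'Total':32s} ${total_budget(LINE_ITEMS):>12,.2f}")
```

Under these assumptions the line items add up to the $555,000 requested.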

Reporting: Monthly progress reports to Sia Grants Committee. Quarterly summaries of milestones and learning published to the community.

This initial 12-month phase will focus on building foundational integration with Sia, developing initial tooling tailored for neuroinformatics researchers, conducting pilots with select labs, and raising awareness through publications. We welcome any feedback from the Sia Grants Committee on this first-year plan and budget.

Hello @dreds3,

Thanks for refining this proposal. The committee has found that after review, a proposal like this is still difficult for us to evaluate. Sia Foundation grants should primarily be for the specific work of integrating Sia into a project and furthering the development of user-owned data on Sia. While we’re not fully opposed to some other foundational work, like bringing Sia to a new industry/userbase, committing to the largest grant we’ve ever given based on the information at hand is proving difficult.

A couple of specific concerns in this review were:

  • A large portion of the budget is for a “Neuroinformatics Researcher”. Is this research into the viability of the project, or conducting research on behalf of the project? If the former, a feasibility study would be far more appropriate as a first step. If the latter, it’s not in our scope for the Foundation to be funding research in an unrelated industry.
  • A second, smaller concern was the hosting nodes. We’re not sure why you’d need to run host nodes as part of the project when a full hosting community already exists.

A common theme in discussion, and a basic question we’d like answered before committing half a million dollars, was “Can you store your data on Sia and will it work the way you want/need?”. Getting this question answered first would greatly help the committee’s decision-making. Under your Timeline and Goals, seeing the first 1-2 and 3-5 month goals succeed would certainly boost confidence in the endeavor.

Regards,
Kino on behalf of the Sia Foundation and Grants Committee

Dear Grants Committee,

We appreciate your feedback identifying areas needing adjustment. You have raised reasonable concerns that we aim to address responsibly.

Regarding the neuroinformatics researcher role, we believe they could provide value during the initial feasibility study and pilot by lending domain expertise to:

  • Evaluate and suggest improvements to better utilize Sia’s capabilities for neuroscience/medical imaging use cases. Their expertise would guide technical validation.
  • Collaborate with engineers on requirements, real data testing, and quantifying benefits.
  • Optimize the platform for neuroscience workflows as an end user representative.
  • Assist in researcher/lab/org onboarding and training on the platform.
  • Help quantify pilot results and co-author publications to promote adoption in the neuroinformatics community.
  • Accelerate real-world validation, user feedback, and community building beyond implementation.

However, if you still have reservations, we are open to alternatives for including relevant research field input.

On the matter of hosting nodes, we agree with your point. As such, we will remove hosting nodes from the current proposal as advised.

That said, looking further ahead, Sia’s value proposition for research relies on unlimited scalable storage capacity. To prepare for exponential data growth with greater adoption, we felt research labs as hosts (utilizing their available storage setup) could help us:

  • Experiment with optimizing hardware and configurations to scale.
  • Identify required network tuning and resiliency measures as storage expands.
  • Develop easy onboarding for labs’ clusters and servers to come online as hosts, expanding capacity.
  • Test hosting economics and incentivization models.

New hosts could also help us plan for data growth in later stages, as we foresee petabytes of data migrating to Sia, which will require expanded capacity beyond the current ~3PB available. However, for the initial years, removing hosting is reasonable. We welcome suggestions on ensuring sufficient capacity over time as adoption scales.

In summary, we aim to address your concerns by:

  • Removing hosting nodes as advised
  • Making the researcher role explicitly about feasibility evaluation and pilot optimization
  • Focusing first on a lean pilot tailored to proving core Sia capabilities

We believe this alignment will position us to gain your confidence by delivering tangible milestones and value. Please share any additional feedback as we revise the proposal accordingly. We greatly appreciate your guidance to strengthen this important initiative.

Can you store your data on Sia and will it work the way you want/need?

We strongly believe Sia has the technical capabilities and architecture to securely store, share, and stream the neuroimaging and other research data types required for this initiative to be successful.

While further real-world validation through an initial feasibility study and pilot is prudent, we are confident in Sia’s core value proposition after extensive evaluation:

  • Decentralization provides resilience and removal of central points of failure critical for research data.
  • Granular access controls allow selectively sharing datasets in a compliant manner.
  • Client-side encryption protects sensitive data and intellectual property before it is uploaded to hosts.
  • Integration with university identity management enables access tied to approved researchers. (Still pending exploration; may be part of the future roadmap.)
  • The decentralized host ecosystem offers scalable capacity to handle growing data volumes cost-effectively.
  • Backup support safeguards data integrity over long time horizons.
  • Programmatic APIs enable piping data directly from instruments like MRI scanners.

Through the pilot, we will quantify benefits and optimize integrations. But at its core, Sia offers the right substrate of security, control, affordability, and scalability needed for modern research data requirements. The pilot will prove capabilities we are already confident in based on robust technical analysis. We welcome the opportunity to demonstrate Sia’s readiness through measured milestones.

Thanks for your response and attention to addressing the committee’s concerns. In order to consider the changes you’ve committed to, the committee would like to see a separate proposal for the lean pilot you mentioned. We believe this is your intention based on your bullet point of “Focusing first on a lean pilot tailored to proving core Sia capabilities”, but wanted to explicitly confirm that a dedicated proposal with reduced scope (e.g. those first five months) and reduced budget will be necessary.

Dear Grants Committee,

We value your input and the chance to enhance our approach. As advised, we’ll create a focused proposal for a 6-month lean pilot to demonstrate Sia’s core capabilities for research data use.

This proposal aims to address prerequisites while keeping scope and budget contained. This pilot will show essential functionality, validate with sample data, and assess benefits. We’ll streamline the budget for crucial personnel, resources, and costs.

Our plan is to proceed with a longer-term proposal if the pilot succeeds. This would allow gradually scaling impact in research data management aligned with Sia’s strengths.

We’ll submit the revised pilot proposal soon. Please share any clarifications or recommendations. Demonstrating Sia’s abilities is our priority.

Thank you!

Project Name: Sia Neuroimaging Research Data Management Pilot

Organization: TReNDS

Describe your project: This pilot project aims to demonstrate Sia’s capabilities for securely storing and accessing neuroimaging research datasets. It will integrate Sia with our center’s COINS platform and MRI scanners to handle real datasets. The goal is to establish product-market fit before further investment.

Proposed Solution

  • Integrate the COINS platform with Sia to enable pushing neuroinformatics datasets into Sia storage.
  • Develop a connector to receive data directly from MRI scanners and transmit it to Sia storage.
  • Build a simple web widget for COINS to allow researchers to view and access datasets stored on Sia.
  • Onboard 5 pilot researchers from our lab to use the Sia integration for handling their neuroimaging data.
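
To make the first two items more concrete, here is a minimal sketch of how a connector might prepare an upload of a single scan file to a renter node over HTTP. Everything named here is an assumption for illustration: the function `build_upload_request` is hypothetical, the URL shape (`/api/worker/objects/<path>`) and the blank-username Basic auth are modeled on a renterd-style worker API and should be checked against the documentation of whichever renter software and version the pilot uses, and the host, port, and object path are placeholders. The sketch only builds the request; it does not send it.

```python
import base64
import urllib.request
from urllib.parse import quote


def build_upload_request(base_url, object_path, payload, api_password):
    """Build (but do not send) a PUT request that would upload `payload`.

    ASSUMPTION: the endpoint path and auth scheme below are illustrative and
    modeled on a renterd-style worker API; verify them before real use.
    """
    url = f"{base_url.rstrip('/')}/api/worker/objects/{quote(object_path.lstrip('/'))}"
    # Basic auth with an empty username and the API password.
    token = base64.b64encode(f":{api_password}".encode()).decode()
    request = urllib.request.Request(url, data=payload, method="PUT")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Content-Type", "application/octet-stream")
    return request


if __name__ == "__main__":
    # Placeholder values; a real connector would read bytes from the scanner export.
    req = build_upload_request(
        "http://localhost:9980", "pilot/scan-001.dcm", b"example-bytes", "change-me"
    )
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Separating request construction from sending keeps the connector logic testable without a running node, which may be useful during the early feasibility work.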

Success Criteria:

  • Storing 10TB+ of sample datasets with positive researcher feedback.
  • COINS able to push and retrieve data from Sia.
  • Documented metrics showing resource savings versus alternatives.
  • 5 pilot researchers successfully managing data on Sia.

Who benefits? Neuroinformatics researchers at our center will benefit from improved data management. The open sourced integrations can benefit the broader Sia community.

Mission: This serves Sia’s mission of user-owned data by empowering researchers to control the data they produce.

Grant Specifics
Requested Amount: $150,000 over 6 months

Budget Breakdown:
$100,000 Engineering: Two developers
$10,000 Project Management: Part-time coordinator
$5,000 Sia Usage: Pilot data storage
$30,000 Researcher Involvement: Consultation
$5,000 Marketing & Operations: Collateral and legal

Timeline:
Month 1: Core Sia setup, initial datasets
Month 2: Prototype researcher apps
Month 3: Integrate streaming from scanners
Month 4: Quantify metrics and benefits
Months 5-6: Researcher onboarding, feedback

Risks: Technical obstacles may block milestones, and researcher adoption may be slower than expected.

Development Information

Open Source: The Sia integrations and connectors will be open source. Some COINS components are proprietary.

Code Access: Repository URLs will be provided.

Monthly Reports: Yes, we agree to provide monthly progress reports.

Note: The key contrast is that this pilot is intended to establish basic functionality and validate Sia’s capabilities on a small scale first, rather than directly pursuing the full vision. This targeted approach allows incremental progress while addressing the Committee’s reasonable concerns and feedback on the original proposal.

Contact info: Via Discord

Hello @dreds3,

Thanks again for this proposal, we really appreciate your flexibility with us during this process. This proposal, being the largest one we’ve had submitted to date as well as being outside our areas of expertise, has continuously proved difficult to evaluate.

The committee’s first comment was that, though far removed from the original grant, the $150,000 budget request is still considered very large by our committee’s standards and isn’t sufficiently itemized for its size.

Beyond that is the issue of feasibility: committing $150,000 to determine whether Sia is actually usable for your use case is a large ask. We’re currently developing a process to help us better evaluate grants of this size and to ensure proper use of Foundation funds. We hope to reveal details of this process in the coming weeks and hope to see you resubmit then. For now, the committee will be rejecting this proposal.

Regards,
Kino on behalf of the Sia Foundation and Grants Committee

Thank you for the feedback. I’ll try to conduct a smaller scale feasibility study and potentially re-apply later once we have initial results and the new grant process is in place. Please delete/hide this thread since the proposal was rejected and the stakes are high.