https://github.com/darkstarstrix/nexapod

A Distributed Compute Fabric for Scientific Problems

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.9%) to scientific vocabulary

Keywords

backend-service compute distrubted-systems gpu infrastructure mesh-networks platform-engineering protocol scientific-computing
Last synced: 5 months ago

Repository

A Distributed Compute Fabric for Scientific Problems

Basic Info
Statistics
  • Stars: 9
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Topics
backend-service compute distrubted-systems gpu infrastructure mesh-networks platform-engineering protocol scientific-computing
Created 8 months ago · Last pushed 6 months ago
Metadata Files
Readme · Contributing · Funding · License

README.md

NEXAPod: Distributed Compute Fabric for Scientific Problems

[Badges: CI/CD Pipeline · License: Apache 2.0 · Python · Uses uv · Issues · Pull Requests · Last Commit · Contributors · Repo Size · Docker Image · Docker Pulls]

NEXAPod seamlessly unites diverse computing resources, from consumer GPUs to high-end clusters, to tackle large-scale scientific challenges.


1. Mission

NEXAPod is a distributed computing system designed to coordinate heterogeneous compute resources—ranging from consumer GPUs to high-performance clusters—to solve large-scale scientific problems. It is, in essence, Folding@home for the AI era.


2. Frequently Asked Questions (FAQ)

What is NexaPod?
NexaPod is a modern take on distributed scientific computing. It allows anyone to contribute their computer's idle processing power to help solve complex scientific problems, starting with molecular science and protein structure prediction.

What was the inspiration for NexaPod?
NexaPod is a synthesis of three ideas:
  1. Decentralized Compute at Scale (e.g., Prime Intellect): Inspired by the vision of training large AI models on a decentralized network of nodes.
  2. Mesh Networking (e.g., Meshtastic): Built on the concept of a resilient, decentralized network of peers.
  3. Scientific Mission (e.g., Folding@home): Focused on applying this compute power to solve real-world scientific challenges.

Is this project affiliated with Prime Intellect?
No. NexaPod is an independent, open-source project. While inspired by the ambitious goals of projects like Prime Intellect, it is not formally associated with them. NexaPod's focus is on scientific computing and inference, not general-purpose LLM training.

How is NexaPod different from Folding@home?
NexaPod aims to be a modern successor. Key differences include:
  • AI-Native: Designed for modern machine learning inference tasks.
  • Heterogeneous Compute: Built from the ground up to support diverse hardware (CPU, GPU).
  • Job Agnostic: The architecture can be adapted to any scientific problem, not just a single one.
  • Modern Tooling: Uses containers, modern CI/CD, and robust orchestration for security and scalability.


3. Project Roadmap

Phase 1: Alpha (Launched)

  • Goal: Ship a working proof-of-concept. Test the core system and validate that the distributed mesh works in the wild.
  • Actions: Launched the first public alpha running a secondary structure prediction job. Onboarding technical users to gather feedback, observe bugs, and fix obvious blockers.

Phase 2: Beta (Next 2–4 Weeks)

  • Goal: Iterate on user feedback, harden the system, and expand the network.
  • Actions: Bugfixes and infrastructure upgrades (better logging, validation, robust VPS). Refine onboarding and documentation. Begin groundwork for ZK proofs, incentives, and improved scheduling.

Phase 3: Full Launch (Post-Beta, ~1–2 Months Out)

  • Goal: A production-grade, incentivized scientific compute mesh ready to tackle a "grand challenge" problem.
  • Actions: Implement ZK proofs for trustless validation. Roll out more robust job scheduling. Launch incentive mechanisms (token/reputation). Target a large-scale challenge like DreamMS (inference on 201 million molecular datapoints).

4. DevOps & Containers

This project uses a robust GitHub Actions workflow for continuous integration and delivery. The pipeline includes:
  • Test Stage: Runs automated tests for quality and reliability.
  • Build & Push Stage: Builds server and client Docker images and pushes them to GitHub Container Registry (GHCR).
  • Artifact Storage: Stores build artifacts for traceability.

Deployment to production/staging is currently manual, allowing for controlled and verified rollouts.


5. Getting Started

There are two main ways to run the project: using the installed package or running the script directly.

Method 1: Install and Use nexapod-cli (Recommended)

This method installs the project as a command-line tool, nexapod-cli, which you can run from anywhere.

1. Installation

Open a terminal in the project's root directory and run:

```bash
pip install -e .
```

2. Usage

After installation, you can use the nexapod-cli command with one of the available subcommands:

To set up the client:

```bash
nexapod-cli setup
```

To run a specific job:

```bash
nexapod-cli run <your-job-id>
```

Example:

```bash
nexapod-cli run job-12345
```

To start the client worker:

```bash
nexapod-cli start
```

Method 2: Run the Script Directly (For Development)

This method is useful for development as it runs the CLI without installation.

For Bash users (Linux, macOS, Git Bash on Windows):

  1. Make the script executable (only needs to be done once):

     ```bash
     chmod +x scripts/NexaPod.sh
     ```

  2. Run the script from the project root:

     ```bash
     ./scripts/NexaPod.sh setup
     ./scripts/NexaPod.sh run my-job-123
     ```

For PowerShell users (Windows):

  1. Open a PowerShell terminal. You may need to allow script execution for the current session:

     ```powershell
     Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass
     ```

  2. Run the script from the project root:

     ```powershell
     .\scripts\NexaPod.ps1 setup
     .\scripts\NexaPod.ps1 run my-job-123
     ```

For Developers (Running Locally with Docker Compose)

To run the entire stack (server, client, monitoring) locally using the pre-built images from the registry:

  1. Prerequisites: Docker and Docker Compose.
  2. Log in to GHCR (first time only):

     ```bash
     # Use a Personal Access Token (classic) with read:packages scope.
     echo "YOUR_PAT_TOKEN" | docker login ghcr.io -u YOUR_GITHUB_USERNAME --password-stdin
     ```

  3. Pull the latest images:

     ```bash
     docker-compose pull
     ```

  4. Launch Services:

     ```bash
     docker-compose up -d
     ```

This will start the server, a local client, Prometheus, and Grafana. To stop the services, run docker-compose down.

Monitoring Dashboard (Local)

Once the services are running, you can access the monitoring stack:
  • Prometheus: http://localhost:9090 (View metrics and service discovery)
  • Grafana: http://localhost:3000 (Create dashboards; default user/pass: admin/admin)
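After docker-compose up -d, a quick way to confirm the stack is being scraped is to query the Prometheus HTTP API directly. The sketch below is only a convenience check, not part of NexaPod itself; it assumes the default localhost:9090 port shown above and uses the standard `up` metric that Prometheus exposes for its scrape targets (the requests dependency is already in the project).

```python
# health_check.py — minimal sketch: poll the local Prometheus API for scrape-target status.
# Assumes Prometheus is reachable on localhost:9090, as in the docker-compose setup above.
import requests

PROMETHEUS_URL = "http://localhost:9090"


def scrape_targets_up() -> dict:
    """Query the built-in `up` metric and report which targets Prometheus sees as healthy."""
    resp = requests.get(f"{PROMETHEUS_URL}/api/v1/query", params={"query": "up"}, timeout=5)
    resp.raise_for_status()
    results = resp.json()["data"]["result"]
    # Each result carries the target labels and a [timestamp, value] sample; "1" means up.
    return {r["metric"].get("job", "unknown"): r["value"][1] == "1" for r in results}


if __name__ == "__main__":
    for job, healthy in scrape_targets_up().items():
        print(f"{job}: {'UP' if healthy else 'DOWN'}")
```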


6. Project Structure

```
nexapod/
├── NexaPod_CLI/           # The user-facing CLI tool
├── client/                # Worker node agent code
├── Infrastructure/        # Dockerfiles and Kubernetes manifests
├── docs/                  # Architecture, API, and Onboarding documentation
├── scripts/               # Utility scripts
├── tests/                 # Unit and integration tests
├── .github/               # CI/CD workflows
└── docker-compose.yaml    # Local development setup
```


7. Core Components & Tech Stack

| Layer         | Component                | Tech / Libs                                   |
|---------------|--------------------------|-----------------------------------------------|
| Comms         | HTTP API                 | FastAPI (server) + requests (client)          |
| Profiling     | Hardware detection       | psutil, nvidia-ml-py, subprocess (nvidia-smi) |
| Execution     | Container runtime        | Docker (nexapod CLI)                          |
| Scheduling    | Job queue & matching     | In-memory queue (Alpha)                       |
| Data storage  | Metadata & logs          | SQLite (Alpha) → Postgres                     |
| Security      | Cryptographic signatures | cryptography (Ed25519)                        |
| Orchestration | Single-node MVP          | Python scripts + Docker                       |
|               | Multi-node (v2)          | Kubernetes (k8s) manifests                    |
| Monitoring    | Metrics & logs           | Prometheus / Grafana                          |
| Testing       | Unit & integration tests | pytest                                        |
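To make the Profiling and Security rows concrete, here is a minimal sketch of how a node could gather a hardware profile with psutil and sign it with an Ed25519 key via the cryptography package. The function names and payload shape are assumptions for illustration only, not NexaPod's actual client code.

```python
# profile_and_sign.py — illustrative sketch of the Profiling + Security layers above.
# Names and payload shape are assumptions, not NexaPod's real API.
import json
import platform

import psutil
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey


def collect_profile() -> dict:
    """Collect a minimal node profile (OS, CPU, memory)."""
    return {
        "os": platform.system(),
        "cpu_logical": psutil.cpu_count(logical=True),
        "cpu_physical": psutil.cpu_count(logical=False),
        "ram_gb": round(psutil.virtual_memory().total / 1e9, 1),
    }


def sign_profile(profile: dict, key: Ed25519PrivateKey) -> bytes:
    """Sign the canonical JSON encoding of the profile."""
    payload = json.dumps(profile, sort_keys=True).encode()
    return key.sign(payload)


if __name__ == "__main__":
    key = Ed25519PrivateKey.generate()  # in practice a node would persist its key
    profile = collect_profile()
    signature = sign_profile(profile, key)
    # A server holding the node's public key could verify the report like this
    # (verify() raises InvalidSignature if the payload was tampered with):
    key.public_key().verify(signature, json.dumps(profile, sort_keys=True).encode())
    print(profile)
```

Ed25519 keeps signatures small and fast to verify, which suits frequent, lightweight node-to-server reports; GPU details could be added the same way via nvidia-ml-py or nvidia-smi.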


8. Contributing

PRs and issues are welcome! See Docs/CONTRIBUTING.md for detailed guidelines.


Owner

  • Name: Allan Murimi Wandia
  • Login: DarkStarStrix
  • Kind: user
  • Location: U.S.A
  • Company: Freelance

Full-stack dev turning ideas into projects

GitHub Events

Total
  • Watch event: 8
  • Push event: 34
  • Pull request event: 1
  • Pull request review event: 1
  • Create event: 2
Last Year
  • Watch event: 8
  • Push event: 34
  • Pull request event: 1
  • Pull request review event: 1
  • Create event: 2

Dependencies

pyproject.toml pypi
  • cryptography >=45.0.5
  • docker-py >=1.10.6
  • fastapi ==0.115.14
  • flask >=3.0.3
  • networkx >=3.1
  • psutil >=7.0.0
  • pytest >=8.3.5
  • pyvis >=0.3.2
  • pyyaml >=6.0.2
  • requests >=2.25.1
  • streamlit >=1.40.1
  • uvicorn >=0.33.0
requirements.txt pypi
  • PyYAML *
  • boto3 *
  • cryptography *
  • docker *
  • docker-py *
  • fastapi *
  • flask *
  • flask-limiter *
  • ipfshttpclient *
  • networkx *
  • psutil *
  • pydantic *
  • pytest *
  • pyvis *
  • qiskit *
  • requests *
  • streamlit *
  • uvicorn *
uv.lock pypi
  • 135 dependencies
.github/workflows/ci.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • docker/build-push-action v4 composite
  • docker/login-action v2 composite
  • docker/setup-buildx-action v2 composite
.github/workflows/static.yml actions
  • actions/checkout v4 composite
  • actions/configure-pages v5 composite
  • actions/deploy-pages v4 composite
  • actions/upload-pages-artifact v3 composite
Client/requirements.txt pypi
  • PyYAML ==6.0.2
  • cryptography >=41.0.0
  • networkx >=3.1
  • numpy >=1.24.0
  • pandas >=2.1.0
  • plotly >=5.17.0
  • psutil >=5.9.0
  • requests >=2.31.0
  • safetensors ==0.5.3
  • streamlit >=1.28.0
  • torch ==2.7.1
Server/requirements.txt pypi
  • PyYAML ==6.0.2
  • cryptography >=41.0.0
  • fastapi >=0.104.0
  • prometheus_client ==0.22.1
  • pydantic ==2.11.7
  • requests >=2.31.0
  • uvicorn >=0.24.0