dnanalyzer

Precision genomics for everyone, everywhere. Powered by private AI.

https://github.com/verisimilitudex/dnanalyzer

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.1%) to scientific vocabulary

Keywords

artificial-intelligence bioinformatics biotechnology cli computational-biology dna-analysis dna-sequencing fasta genomics machine-learning on-device open-source precision-medicine privacy regulatory-elements
Last synced: 6 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: VerisimilitudeX
  • License: other
  • Language: HTML
  • Default Branch: main
  • Homepage: http://dnanalyzer.org/
  • Size: 164 MB
Statistics
  • Stars: 160
  • Watchers: 8
  • Forks: 69
  • Open Issues: 0
  • Releases: 15
Created over 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing Funding License Code of conduct Citation Security

README.md

DNAnalyzer-modified

Next-Generation On-Device DNA Insights

Private. Precise. Powered by AI.

[![Copyright](https://img.shields.io/badge/copyright-2025-blue?style=for-the-badge)](https://github.com/VERISIMILITUDEX/DNAnalyzer) [![Release](https://img.shields.io/github/v/release/VERISIMILITUDEX/DNAnalyzer?style=for-the-badge&color=green)](https://github.com/VERISIMILITUDEX/DNAnalyzer/releases) [![Build Status](https://img.shields.io/github/actions/workflow/status/VerisimilitudeX/DNAnalyzer/gradle.yml?style=for-the-badge)](https://github.com/VerisimilitudeX/DNAnalyzer/actions/workflows/gradle.yml) [![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.14556578-blue?style=for-the-badge)](https://zenodo.org/records/14556578)

About DNAnalyzer

DNAnalyzer is a biotechnology research and deployment company revolutionizing genomic analysis through AI-powered, privacy-first technology. Supported by Anthropic for Startups, our mission is to democratize DNA analysis by delivering enterprise-grade genomic insights through secure on-device computation.

Founded by Piyush Acharya, DNAnalyzer brings together 46 leading computational biologists and computer scientists from Microsoft Research, the University of Macedonia, and Northeastern University.

Our groundbreaking work has been presented at Y Combinator's Mini YC, starred by the organizer of the AI World's Fair Expo, and recognized by the CEO of DEV.to.


Why DNAnalyzer Matters

| Industry Standard | DNAnalyzer's Innovation |
|---|---|
| **$100** average cost for DNA sequencing | Completely **Free** analysis |
| Up to **$600** for basic health insights | **Universally accessible**, empowering underserved communities worldwide* |
| **78%** of companies share genetic data with third parties | **100% Private**: All computation happens locally on your device |
| Data breaches compromise millions (23andMe: 6.9M users in 2023) | **Zero central storage**: Your genetic data never leaves your device |

"Unlike a password, compromised genetic data is permanently exposed. You cannot change it."

*Excluding testing costs. We're developing an affordable in-house testing kit to eliminate this final barrier.


Core Capabilities

- **Codon & Protein Detection:** Identifies protein-coding regions, amino acid chains, and key genomic indicators.
- **GC-rich Region Analysis:** Flags GC-rich regions (45-60% GC content), which often mark promoter areas.
- **Neurological Genomics:** Detects genetic markers associated with neurological conditions, including autism, ADHD, and schizophrenia.
- **Promoter Element Identification:** Locates key transcription initiation sequences (BRE, TATA, INR, DPE).
- **Multi-format FASTA Integration:** Supports DNA database analysis from uploaded files or external sources.
- **CLI Automation:** Provides a CLI for scripting, automation, and large-scale analysis tasks.
- **Privacy-First Ancestry Insights:** Estimates continental origin using on-device reference panels, so genetic data stays on your device.
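
To make the GC-rich region analysis listed above concrete, here is a minimal Python sketch of a sliding-window GC scan. It is not DNAnalyzer's implementation: the function names and the window size are hypothetical, and only the 45-60% band comes from the description above.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq: str, window: int = 10, low: float = 0.45, high: float = 0.60):
    """Yield (start, gc) for each window whose GC fraction lies in [low, high].

    The window size is an illustrative assumption; the default band mirrors
    the 45-60% range quoted in the capability list.
    """
    for start in range(len(seq) - window + 1):
        gc = gc_fraction(seq[start:start + window])
        if low <= gc <= high:
            yield start, gc


if __name__ == "__main__":
    for start, gc in gc_rich_windows("ATGCATGCATTTTTTTTT", window=4):
        print(f"window at {start}: GC = {gc:.2f}")
```

A fixed-size window keeps the scan simple; a real promoter finder would likely merge adjacent hits and combine the GC signal with motif matches such as the TATA box.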

See the [Ancestry Snapshot guide](docs/usage/ancestry-snapshot.md) for detailed usage instructions.

> **New:** Interactive web dashboard for real-time visualization is now available under `web/dashboard`, seamlessly communicating with the local REST API at `/api`.

### Intelligent Natural Language Reports

Following each CLI analysis, DNAnalyzer automatically generates two AI-powered summaries via the OpenAI API:

- **Researcher Report:** Technical analysis featuring detailed statistics and professional terminology
- **Layperson Report:** Clear, accessible overview highlighting actionable insights

Both reports are displayed in the console upon analysis completion when an `OPENAI_API_KEY` is configured.

## Quickstart Guide

Ready to unlock your genomic insights? Begin precision DNA analysis in seconds:

```bash
# Clone the repository
git clone https://github.com/VerisimilitudeX/DNAnalyzer.git

# Navigate to project directory
cd DNAnalyzer

# Install dependencies
./gradlew build
```

### **NEW: Intuitive Launch Script**

We've transformed DNAnalyzer's user experience! Say goodbye to complex command-line options:

```bash
# Simple preset modes
./easy_dna.sh your_file.fa basic      # Standard analysis
./easy_dna.sh your_file.fa detailed   # Comprehensive analysis
./easy_dna.sh your_file.fa mutations  # Generate mutations
./easy_dna.sh your_file.fa all        # Complete suite
./easy_dna.sh your_file.fa custom     # Interactive mode

# Or use the traditional Java method
java -jar build/libs/DNAnalyzer-1.2.1.jar your_file.fa
```

### **NEW: Intelligent Output Organization**

All generated files are automatically organized in a clean, intuitive directory structure:

```
output/dnanalyzer_output_{filename}_{timestamp}/
    charts/      # Quality control and analysis visualizations (PNG)
    sequences/   # Generated mutations and processed sequences (FASTA)
    reports/     # Comprehensive analysis reports and summaries (HTML)
```

### **NEW: Smart Analysis Profiles**

Leverage predefined profiles tailored to your workflow:

```bash
# Select analysis profiles optimized for common use cases
java -jar build/libs/DNAnalyzer-1.2.1.jar --profile research your_file.fa
java -jar build/libs/DNAnalyzer-1.2.1.jar --profile clinical your_file.fa
java -jar build/libs/DNAnalyzer-1.2.1.jar --profile mutation your_file.fa

# Available profiles: basic, detailed, quick, research, mutation, clinical
```

### Documentation

- [Getting Started Guide](docs/getting-started.md) - Essential setup and configuration
- [Enhanced Features Guide](docs/usage/enhanced-features.md) - **NEW!** Comprehensive guide to all user experience improvements
- [Command Reference](docs/usage/) - Complete command-line options and examples
- [Changelog](CHANGELOG.md) - **NEW!** Detailed release notes and version history

## Polygenic Health-Risk Scores

DNAnalyzer now features an advanced polygenic risk score calculator alongside engaging trait predictions. Simply provide your 23andMe data file with a CSV of SNP weights to compute personalized scores:

```bash
./gradlew run --args='--23andme my_data.txt --prs assets/risk/heart_disease_prs.csv sample.fa'
```

Trait predictions and risk scores are displayed following standard DNA analysis.

**Disclaimer:** Trait predictions are provided for educational purposes only and should not be used for medical or health decisions.

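
As a rough illustration of what a polygenic score computes, the Python sketch below sums each SNP's weight multiplied by the number of effect-allele copies in the genotype. It is not DNAnalyzer's calculator. The weights CSV layout (`rsid,effect_allele,weight`) and both function names are assumptions made for this example; the genotype parser assumes the tab-separated 23andMe raw-data layout (rsid, chromosome, position, genotype, with `#` comment lines).

```python
import csv
import io


def parse_23andme(text: str) -> dict:
    """Map rsid -> genotype from 23andMe-style raw data (tab-separated, '#' comments)."""
    genotypes = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 4:
            continue  # skip malformed rows
        rsid, _chrom, _pos, genotype = fields[:4]
        genotypes[rsid] = genotype.strip().upper()
    return genotypes


def polygenic_score(genotypes: dict, weights_csv: str) -> float:
    """Sum weight * (copies of the effect allele) over SNPs present in genotypes.

    Assumes hypothetical CSV columns: rsid, effect_allele, weight.
    """
    score = 0.0
    for row in csv.DictReader(io.StringIO(weights_csv)):
        genotype = genotypes.get(row["rsid"])
        if genotype is None:
            continue  # SNP not genotyped; skipped rather than imputed
        dosage = genotype.count(row["effect_allele"].upper())
        score += float(row["weight"]) * dosage
    return score


if __name__ == "__main__":
    raw = "# sample export\nrs1\t1\t100\tAG\nrs2\t2\t200\tTT\n"
    weights = "rsid,effect_allele,weight\nrs1,A,0.5\nrs2,T,0.25\nrs3,C,1.0\n"
    print(polygenic_score(parse_23andme(raw), weights))
```

SNPs missing from the genotype file are skipped here for simplicity; a production calculator would also need to handle strand orientation and no-call genotypes.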
### REST API

For seamless automated workflows, DNAnalyzer exposes a robust REST endpoint. Launch the Spring Boot application and send your FASTA file to `/server/analyze`:

```bash
curl -F file=@sample.fa http://localhost:8080/server/analyze
```

The response delivers core pipeline output as JSON, enabling effortless scripting from Python, R, or any preferred language without GUI dependencies. Additionally, the `/api/file/parse` endpoint enables straightforward FASTA or FASTQ file upload and sequence parsing.

## GPU-Accelerated Smith-Waterman

Our optional PyOpenCL module delivers GPU acceleration for local sequence alignment. When no compatible GPU is detected, the implementation gracefully falls back to optimized Python execution. Execute the module directly or via CLI:

```bash
python -m src.python.gpu_smith_waterman SEQ1 SEQ2
```

From the DNAnalyzer CLI, request Smith-Waterman alignment by combining `--sw-align` with `--align`:

```bash
java -jar dnanalyzer.jar --align reference.fa --sw-align
```

See [GPU_Smith_Waterman.md](docs/developer/GPU_Smith_Waterman.md) for comprehensive technical details.

### Packaging Analysis Sessions

After completing your DNAnalyzer run, archive inputs, logs, and interactive HTML reports using `package-session.sh`:

```bash
./scripts/package-session.sh sample.fa
```

This generates a time-stamped ZIP archive containing the FASTA file, console output, generated reports, and all QC visualizations.

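
For readers unfamiliar with the Smith-Waterman alignment mentioned above, here is a minimal CPU-only Python sketch of the algorithm with a linear gap penalty. It is not the repository's `gpu_smith_waterman` module: the function name and the scoring values (match 2, mismatch -1, gap -2) are assumptions chosen for illustration.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("TTACGTTT", "GGACGGG"))
```

The matrix fill is O(len(a) × len(b)) and each anti-diagonal can be computed independently, which is the kind of parallelism a GPU implementation can exploit.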
## Development Roadmap

| Upcoming Innovation | Description |
|---|---|
| **Optimized SQL Database** | Scalable architecture supporting genomic datasets across diverse species |
| **Enhanced Neural Network** | Seamless integration with third-party genotype datasets (23andMe, AncestryDNA) |
| **DIAMOND Implementation** | Harmonizing DIAMOND's speed with BLAST's accuracy for next-generation analyses |
| **AI Trait Predictor Suite** | Engaging, shareable predictions, from cilantro taste to chronotype, validated by peer-reviewed SNP studies |
| **Secure Share & Compare** | Offline-generated QR summaries enable selective insight sharing with healthcare providers; the raw genome remains private |

## Contribute to DNAnalyzer

We enthusiastically welcome contributions from all experience levels:

- [Guidelines for Contribution](./docs/contributing/Contribution_Guidelines.md)
- [Git Usage Instructions](./docs/contributing/CONTRIBUTING.md)
- [Development Environment](./docs/Development_Environment.md)

## Academic Citations

When referencing DNAnalyzer in academic work, please cite:

```bibtex
@software{Acharya_DNAnalyzer_ML-Powered_DNA_2022,
  author = {Acharya, Piyush},
  doi = {10.5281/zenodo.14556577},
  month = oct,
  title = {{DNAnalyzer: ML-Powered DNA Analysis Platform}},
  url = {https://github.com/VerisimilitudeX/DNAnalyzer},
  version = {3.5.0-beta.0},
  year = {2022}
}
```

## Terms of Use

DNAnalyzer is provided "as-is." Usage of this software implies acceptance of all associated risks and liabilities. DNAnalyzer disclaims responsibility for any loss or damage arising from its use.

For assistance or inquiries: help@dnanalyzer.org

DNAnalyzer, Piyush Acharya 2025. A fiscally sponsored 501(c)(3) nonprofit (EIN: 81-2908499), licensed under MIT License.

---

## Impact Metrics

| Metric | Current Value |
|--------|---------------|
| GitHub Stars | **147** |
| Forks | **62** |
| Contributors | **46** |
| Monthly FASTA files analyzed* | **5,000+** |
| Total downloads (Gradle/CLI) | **4,042** |
| Deployments via GitHub Pages | **485** |

---

## Community Engagement

- **Discord:** Active `#genomics-ai` channel (80+ members)
- **Open Issues for First-Timers:** Labeled `good-first-issue` to mentor newcomers
- **Monthly Release Notes:** Transparent changelogs with contributor recognition

---

\*Monthly FASTA throughput calculated from anonymized CLI telemetry and public workflow logs.

Project Growth

Star History Chart

Support DNAnalyzer

23andMe

Get 10% off your order
DNAnalyzer earns $20 per referral

23andMe Referral

Ancestry Membership

Get up to 24% off membership
DNAnalyzer earns $10 per referral

Ancestry Referral

Owner

  • Name: Piyush Acharya
  • Login: VerisimilitudeX
  • Kind: user
  • Location: Redmond, Washington, United States
  • Company: @mightycrayon @hackclub

Science enthusiast, developer, and continuous learner. Founder of DNAnalyzer & Hack Club Chapter Leader.

GitHub Events

Total
  • Create event: 67
  • Release event: 1
  • Issues event: 27
  • Watch event: 24
  • Delete event: 36
  • Issue comment event: 115
  • Push event: 290
  • Pull request review comment event: 5
  • Pull request review event: 10
  • Pull request event: 129
  • Fork event: 9
Last Year
  • Create event: 67
  • Release event: 1
  • Issues event: 27
  • Watch event: 24
  • Delete event: 36
  • Issue comment event: 115
  • Push event: 290
  • Pull request review comment event: 5
  • Pull request review event: 10
  • Pull request event: 129
  • Fork event: 9

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 14
  • Total pull requests: 71
  • Average time to close issues: 3 months
  • Average time to close pull requests: 5 days
  • Total issue authors: 4
  • Total pull request authors: 4
  • Average comments per issue: 2.79
  • Average comments per pull request: 0.69
  • Merged pull requests: 42
  • Bot issues: 0
  • Bot pull requests: 19
Past Year
  • Issues: 13
  • Pull requests: 71
  • Average time to close issues: 2 months
  • Average time to close pull requests: 5 days
  • Issue authors: 4
  • Pull request authors: 4
  • Average comments per issue: 2.15
  • Average comments per pull request: 0.69
  • Merged pull requests: 42
  • Bot issues: 0
  • Bot pull requests: 19
Top Authors
Issue Authors
  • VerisimilitudeX (15)
  • Mrigankkh (1)
  • maxwofford (1)
  • Karzinisierung (1)
  • Paucey (1)
Pull Request Authors
  • VerisimilitudeX (47)
  • deepsource-autofix[bot] (28)
  • Paucey (3)
  • Nowimprint (1)
  • restyled-io[bot] (1)
  • imgbot[bot] (1)
Top Labels
Issue Labels
no-issue-activity (10) good first issue (4) enhancement (3) help wanted (3) bug (3) hacktoberfest-accepted (2) research (2) software (2) Stale (1)
Pull Request Labels
codex (31) Stale (4)