https://github.com/learningcircuit/local-deep-research
Local Deep Research achieves ~95% on SimpleQA benchmark (tested with GPT-4.1-mini). Supports local and cloud LLMs (Ollama, Google, Anthropic, ...). Searches 10+ sources - arXiv, PubMed, web, and your private documents. Everything Local.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ✓ Academic publication links (links to: arxiv.org)
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity: 13.8%)
Keywords
Keywords from Contributors
Repository
Basic Info
Statistics
- Stars: 3,349
- Watchers: 25
- Forks: 326
- Open Issues: 54
- Releases: 49
Topics
Metadata Files
README.md
Local Deep Research
🚀 What is Local Deep Research?
LDR is an AI research assistant that performs systematic research by:
- Breaking down complex questions into focused sub-queries
- Searching multiple sources in parallel (web, academic papers, local documents)
- Verifying information across sources for accuracy
- Creating comprehensive reports with proper citations
It aims to help researchers, students, and professionals find accurate information quickly while maintaining transparency about sources.
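The steps above amount to a fan-out/fan-in pipeline: decompose, search in parallel, then aggregate. A minimal sketch of that shape (the function names and fake sub-queries here are illustrative, not LDR's actual internals):

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative stand-ins for LDR's real components.
def decompose(question):
    # A real system would use an LLM; here we fake two focused sub-queries.
    return [f"{question} (background)", f"{question} (recent results)"]

def search(sub_query):
    # Stand-in for querying web, academic, or local sources.
    return [{"query": sub_query, "source": "example.org", "text": "..."}]

def research(question):
    sub_queries = decompose(question)
    with ThreadPoolExecutor() as pool:  # search all sub-queries in parallel
        result_lists = pool.map(search, sub_queries)
    findings = [hit for hits in result_lists for hit in hits]
    # A real system would cross-verify findings and synthesize a cited report.
    return findings

hits = research("What is CRISPR?")
```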
🎯 Why Choose LDR?
- Privacy-Focused: Run entirely locally with Ollama + SearXNG
- Flexible: Use any LLM, any search engine, any vector store
- Comprehensive: Multiple research modes from quick summaries to detailed reports
- Transparent: Track costs and performance with built-in analytics
- Open Source: MIT licensed with an active community
📊 Performance
~95% accuracy on SimpleQA benchmark (preliminary results)
- Tested with GPT-4.1-mini + SearXNG + focused-iteration strategy
- Comparable to state-of-the-art AI research systems
- Local models can achieve similar performance with proper configuration
- Join our community benchmarking effort →
✨ Key Features
🔍 Research Modes
- Quick Summary - Get answers in 30 seconds to 3 minutes with citations
- Detailed Research - Comprehensive analysis with structured findings
- Report Generation - Professional reports with sections and table of contents
- Document Analysis - Search your private documents with AI
🛠️ Advanced Capabilities
- LangChain Integration - Use any vector store as a search engine
- REST API - Authenticated HTTP access with per-user databases
- Benchmarking - Test and optimize your configuration
- Analytics Dashboard - Track costs, performance, and usage metrics
- Real-time Updates - WebSocket support for live research progress
- Export Options - Download results as PDF or Markdown
- Research History - Save, search, and revisit past research
- Adaptive Rate Limiting - Intelligent retry system that learns optimal wait times
- Keyboard Shortcuts - Navigate efficiently (ESC, Ctrl+Shift+1-5)
- Per-User Encrypted Databases - Secure, isolated data storage for each user
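The "learns optimal wait times" idea behind adaptive rate limiting can be sketched as a per-engine wait estimate that shrinks on success and grows on rate-limit errors. This is a generic sketch of the technique, not LDR's implementation:

```python
import random

class AdaptiveRateLimiter:
    """Sketch of a retry helper that adapts its wait time per search engine.

    On success the estimated wait decays; on a rate-limit error it grows
    multiplicatively. Illustrative only; LDR's actual logic may differ.
    """

    def __init__(self, initial_wait=1.0, min_wait=0.1, max_wait=60.0):
        self.wait = initial_wait
        self.min_wait = min_wait
        self.max_wait = max_wait

    def on_success(self):
        # Gently shrink the wait back toward the minimum.
        self.wait = max(self.min_wait, self.wait * 0.9)

    def on_rate_limited(self):
        # Back off multiplicatively, capped at max_wait.
        self.wait = min(self.max_wait, self.wait * 2.0)

    def next_delay(self):
        # Jitter avoids many clients retrying in lockstep.
        return self.wait * random.uniform(0.8, 1.2)

limiter = AdaptiveRateLimiter()
limiter.on_rate_limited()  # wait: 1.0 -> 2.0
limiter.on_rate_limited()  # wait: 2.0 -> 4.0
limiter.on_success()       # wait: 4.0 -> 3.6
```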
🌐 Search Sources
Free Search Engines
- Academic: arXiv, PubMed, Semantic Scholar
- General: Wikipedia, SearXNG
- Technical: GitHub, Elasticsearch
- Historical: Wayback Machine
- News: The Guardian
Premium Search Engines
- Tavily - AI-powered search
- Google - Via SerpAPI or Programmable Search Engine
- Brave Search - Privacy-focused web search
Custom Sources
- Local Documents - Search your files with AI
- LangChain Retrievers - Any vector store or database
- Meta Search - Combine multiple engines intelligently
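"Combining multiple engines intelligently" can be as simple as interleaving each engine's ranked results while deduplicating by URL. A minimal sketch of that merge step (illustrative; LDR's actual meta-search logic may differ):

```python
from itertools import zip_longest

def meta_search(results_by_engine):
    """Interleave ranked result lists from several engines, dropping duplicate URLs."""
    seen, merged = set(), []
    # Take rank-1 results from every engine, then rank-2, and so on.
    for tier in zip_longest(*results_by_engine.values()):
        for hit in tier:
            if hit and hit["url"] not in seen:
                seen.add(hit["url"])
                merged.append(hit)
    return merged

merged = meta_search({
    "wikipedia": [{"url": "https://a"}, {"url": "https://b"}],
    "arxiv":     [{"url": "https://b"}, {"url": "https://c"}],
})
```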
⚡ Quick Start
Option 1: Docker (Quickstart on Mac/ARM)
```bash
# Step 1: Pull and run SearXNG for optimal search results
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 2: Pull and run Local Deep Research (on ARM, please build your own image)
docker run -d -p 5000:5000 --network host --name local-deep-research \
  --volume 'deep-research:/data' \
  -e LDR_DATA_DIR=/data \
  localdeepresearch/local-deep-research
```
Option 2: Docker Compose (Recommended)
LDR uses Docker Compose to bundle the web app and all of its dependencies so you can get up and running quickly.
Option 2a: Quick Start (One Command)
Linux:
```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && docker compose up -d
```
Windows:
```bash
curl.exe -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml; docker compose up -d
```
Use with a different model:
```bash
curl -O https://raw.githubusercontent.com/LearningCircuit/local-deep-research/main/docker-compose.yml && MODEL=gemma:1b docker compose up -d
```
Open http://localhost:5000 after ~30 seconds. This starts LDR with SearXNG and all dependencies.
Option 2b: DIY docker-compose
See docker-compose.yml for a Compose file with reasonable defaults that gets Ollama, SearXNG, and Local Deep Research all running locally.
Things you may want or need to configure:
- Ollama GPU driver
- Ollama context length (depends on available VRAM)
- Ollama keep-alive (how long a model stays loaded in VRAM while idle before being unloaded automatically)
- Deep Research model (depends on available VRAM and preference)
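Those settings map onto Ollama environment variables and Compose options roughly as below. This is a sketch with assumed values; the variable names follow Ollama's documentation, but verify them against your Ollama version and the repository's actual docker-compose.yml:

```yaml
services:
  ollama:
    image: ollama/ollama
    environment:
      - OLLAMA_KEEP_ALIVE=10m        # idle time before a model is unloaded from VRAM
      - OLLAMA_CONTEXT_LENGTH=8192   # context window; lower this if VRAM is tight
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia         # GPU passthrough (NVIDIA shown; adjust for your driver)
              count: all
              capabilities: [gpu]
```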
Option 2c: Use Cookiecutter to tailor a docker-compose file to your needs
Prerequisites
- Docker
- Docker Compose
- Cookiecutter: run `pip install --user cookiecutter`
Clone the repository:
```bash
git clone https://github.com/LearningCircuit/local-deep-research.git
cd local-deep-research
```
Configuring with Docker Compose
Cookiecutter will interactively guide you through the process of creating a
docker-compose configuration that meets your specific needs. This is the
recommended approach if you are not very familiar with Docker.
In the LDR repository, run the following command to generate the compose file:
```bash
cookiecutter cookiecutter-docker/
docker compose -f docker-compose.default.yml up
```
Option 3: Python Package
```bash
# Step 1: Install the package
pip install local-deep-research

# Step 2: Set up SearXNG for best results
docker pull searxng/searxng
docker run -d -p 8080:8080 --name searxng searxng/searxng

# Step 3: Install Ollama from https://ollama.ai

# Step 4: Download a model
ollama pull gemma3:12b

# Step 5: Build frontend assets (required for the Web UI)
# Note: if installed via pip and using the Web UI, you need to build assets.
# Navigate to the installation directory first (find it with: pip show local-deep-research)
npm install
npm run build

# Step 6: Start the web interface
python -m local_deep_research.web.app
```
Important for pip users: If you installed via pip and want to use the web UI, you must run `npm install` and `npm run build` in the package installation directory to generate the frontend assets (icons, styles). Without this step, the UI will have missing icons and styling issues. For programmatic API usage only, these steps can be skipped.
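Locating the installation directory can be scripted by parsing the `Location:` line that `pip show` prints. The helper below is a small illustration; the sample output string is made up, so run `pip show local-deep-research` yourself for the real path:

```python
def parse_location(pip_show_output: str) -> str:
    """Extract the site-packages path from `pip show <pkg>` output."""
    for line in pip_show_output.splitlines():
        if line.startswith("Location:"):
            return line.split(":", 1)[1].strip()
    raise ValueError("no Location: line in pip show output")

# Illustrative sample output; the path is not real.
sample = "Name: local-deep-research\nLocation: /opt/venv/lib/python3.11/site-packages"
package_dir = parse_location(sample)
```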
💻 Usage Examples
Python API
```python
from local_deep_research.api import quick_summary
from local_deep_research.settings import CachedSettingsManager
from local_deep_research.database.session_context import get_user_db_session

# Authentication required - use with a user session
with get_user_db_session(username="your_username", password="your_password") as session:
    settings_manager = CachedSettingsManager(session, "your_username")
    settings_snapshot = settings_manager.get_all_settings()

    # Simple usage with settings
    result = quick_summary(
        query="What are the latest advances in quantum computing?",
        settings_snapshot=settings_snapshot,
    )
    print(result["summary"])
```
HTTP API
```python
import requests

# Create session and authenticate
session = requests.Session()
session.post(
    "http://localhost:5000/auth/login",
    json={"username": "user", "password": "pass"},
)

# Get CSRF token
csrf = session.get("http://localhost:5000/auth/csrf-token").json()["csrf_token"]

# Make API request
response = session.post(
    "http://localhost:5000/research/api/start",
    json={"query": "Explain CRISPR gene editing"},
    headers={"X-CSRF-Token": csrf},
)
```
Command Line Tools
```bash
# Run benchmarks from the CLI
python -m local_deep_research.benchmarks --dataset simpleqa --examples 50

# Manage rate limiting
python -m local_deep_research.web_search_engines.rate_limiting status
python -m local_deep_research.web_search_engines.rate_limiting reset
```
🔗 Enterprise Integration
Connect LDR to your existing knowledge base:
```python
from local_deep_research.api import quick_summary

# Use your existing LangChain retriever
result = quick_summary(
    query="What are our deployment procedures?",
    retrievers={"company_kb": your_retriever},
    search_tool="company_kb",
)
```
Works with: FAISS, Chroma, Pinecone, Weaviate, Elasticsearch, and any LangChain-compatible retriever.
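Any object exposing the LangChain retriever interface should plug in. A toy duck-typed retriever (shown without the langchain dependency, assuming the classic `get_relevant_documents` method) makes the contract concrete; real deployments would wrap FAISS, Chroma, etc.:

```python
class KeywordRetriever:
    """Toy retriever over an in-memory corpus, for illustration only.

    Duck-types the classic LangChain retriever method so it can stand in
    wherever a retriever object is expected.
    """

    def __init__(self, docs):
        self.docs = docs

    def get_relevant_documents(self, query):
        # Naive keyword match; a real retriever would use embeddings.
        terms = query.lower().split()
        return [d for d in self.docs if any(t in d.lower() for t in terms)]

kb = KeywordRetriever([
    "Deployments go through the staging cluster first.",
    "Lunch is at noon.",
])
matches = kb.get_relevant_documents("deployment staging")
```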
📊 Performance & Analytics
Benchmark Results
Early experiments on small SimpleQA dataset samples:
| Configuration | Accuracy | Notes |
|---------------|----------|-------|
| gpt-4.1-mini + SearXNG + focused-iteration | 90-95% | Limited sample size |
| gpt-4.1-mini + Tavily + focused-iteration | 90-95% | Limited sample size |
| gemini-2.0-flash-001 + SearXNG | 82% | Single test run |
Note: These are preliminary results from initial testing. Performance varies significantly based on query types, model versions, and configurations. Run your own benchmarks →
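The "limited sample size" caveat is quantifiable: with standard statistics (not an LDR utility), a 95% Wilson score interval around, say, 19 correct out of 20 shows how wide the plausible accuracy range really is:

```python
import math

def wilson_interval(correct, n, z=1.96):
    """95% Wilson score confidence interval for a binomial proportion."""
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# "95% accuracy" observed on only 20 questions
lo, hi = wilson_interval(19, 20)
```

With n = 20, the interval spans roughly 0.76 to 0.99, which is why larger community benchmarks matter before comparing configurations.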
Built-in Analytics Dashboard
Track costs, performance, and usage with detailed metrics. Learn more →
🤖 Supported LLMs
Local Models (via Ollama)
- Llama 3, Mistral, Gemma, DeepSeek
- LLM processing stays local (search queries still go to web)
- No API costs
Cloud Models
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude 3)
- Google (Gemini)
- 100+ models via OpenRouter
📚 Documentation
Getting Started
Core Features
Advanced Features
Development
Examples & Tutorials
🤝 Community & Support
- Discord - Get help and share research techniques
- Reddit - Updates and showcases
- GitHub Issues - Bug reports
🚀 Contributing
We welcome contributions! See our Contributing Guide to get started.
📄 License
MIT License - see LICENSE file.
Built with: LangChain, Ollama, SearXNG, FAISS
Support Free Knowledge: Consider donating to Wikipedia, arXiv, or PubMed.
Owner
- Login: LearningCircuit
- Kind: user
- Repositories: 1
- Profile: https://github.com/LearningCircuit
Committers
Last synced: 9 months ago
Top Committers
| Name | Email/Login | Commits |
|---|---|---|
| LearningCircuit | 1****t | 347 |
| hashedviking | 6****g | 118 |
| Daniel Petti | d****i@g****m | 98 |
| dim-tsoukalas | d****g@g****m | 5 |
| ScottVR | s****r@g****m | 4 |
| Davit Mnatobishvili | s****s@g****m | 3 |
| JayLiu | 1****7@q****m | 3 |
| Sam | s****j | 2 |
| Nikhil Dev Goyal | n****l@g****m | 2 |
| kabachuha | a****1@y****u | 1 |
| Ikko Eltociear Ashimine | e****r@g****m | 1 |
| Chris Cowley | 1****y | 1 |
| Dominik Witczak | d****o@g****m | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 159
- Total pull requests: 744
- Average time to close issues: 8 days
- Average time to close pull requests: about 23 hours
- Total issue authors: 75
- Total pull request authors: 21
- Average comments per issue: 2.22
- Average comments per pull request: 0.32
- Merged pull requests: 519
- Bot issues: 0
- Bot pull requests: 22
Past Year
- Issues: 159
- Pull requests: 744
- Average time to close issues: 8 days
- Average time to close pull requests: about 23 hours
- Issue authors: 75
- Pull request authors: 21
- Average comments per issue: 2.22
- Average comments per pull request: 0.32
- Merged pull requests: 519
- Bot issues: 0
- Bot pull requests: 22
Top Authors
Issue Authors
- LearningCircuit (42)
- djpetti (16)
- MicahZoltu (5)
- theodorevo (5)
- taoeffect (3)
- EmmanuelROGER (3)
- StatusQuo209 (3)
- EggzYy (3)
- xybernaut (2)
- kendonB (2)
- i-d-lytvynenko (2)
- lixy910915 (2)
- AhaZsy (2)
- Penner10000 (2)
- kabachuha (2)
Pull Request Authors
- LearningCircuit (489)
- djpetti (156)
- HashedViking (28)
- dependabot[bot] (19)
- scottvr (10)
- MicahZoltu (7)
- dim-tsoukalas (5)
- Drswagzz (4)
- sammcj (4)
- Nikhil0250 (4)
- github-actions[bot] (3)
- mehmetcanfarsak (2)
- wutzebaer (2)
- catsudon (2)
- JayLiu7319 (2)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads: 1,777 last month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 58
- Total maintainers: 2
- Total advisories: 1
pypi.org: local-deep-research
AI-powered research assistant with deep, iterative analysis using LLMs and web searches
- Homepage: https://github.com/LearningCircuit/local-deep-research
- Documentation: https://local-deep-research.readthedocs.io/
- License: MIT License, Copyright (c) 2025 LearningCircuit (standard MIT terms; see the LICENSE file for the full text)
- Latest release: 1.1.11 (published 6 months ago)