profhai-deep-research

https://github.com/minhthai1995/profhai-deep-research

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: found
✓ codemeta.json file: found
✓ .zenodo.json file: found
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (12.8%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: minhthai1995
License: apache-2.0
Language: Python
Default Branch: master
Size: 15.4 MB

Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 1
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme Contributing License Code of conduct Citation

🔬 AI Researcher - User Guide

Welcome to AI Researcher! This guide will walk you through everything you need to know to get the most out of your AI-powered research assistant.

Main interface of AI Researcher

🚀 Getting Started

Prerequisites

Python 3.11+ (pyproject.toml declares python >=3.11, <4)
Node.js 18+
Ollama (for local AI models)
OpenAI API key (for cloud models)

Quick Setup

Clone and install dependencies:

```bash
git clone https://github.com/assafelovic/gpt-researcher
cd gpt-researcher
pip install -r requirements.txt
cd frontend/nextjs && npm install
```

Configure environment:

```bash
# Copy and edit your .env file
cp .env.example .env

# Add your API keys and settings
```

Start the services:

```bash
# Terminal 1: Start backend (from the repository root)
uvicorn backend.server.server:app --host=0.0.0.0 --port=8000 --reload

# Terminal 2: Start frontend (from the repository root)
cd frontend/nextjs && npm run dev
```

Access the application: Open http://localhost:3000 in your browser
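If the page does not load, it can help to confirm that both services are actually listening. The following is a minimal sketch, not part of the project: it only assumes the ports used in the commands above (8000 for the backend, 3000 for the frontend), and the names `is_port_open` and `service_status` are made up for illustration.

```python
import socket

# Ports taken from the setup commands: uvicorn on 8000, the Next.js dev server on 3000.
SERVICES = {"backend": 8000, "frontend": 3000}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False


def service_status(host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICES.items()}
```

Calling `service_status()` returns a dictionary such as `{"backend": True, "frontend": False}`, which points to the terminal whose process needs restarting.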

🎯 Understanding the Interface

Main components of the AI Researcher interface

Key Components

🏠 Header - Navigation and branding
📚 Mode Selection - Choose between Local Chat and Deep Research
📁 File Upload Area - Drag and drop your documents
⚙️ Provider Toggle - Switch between Ollama (local) and OpenAI (cloud)
💬 Chat Interface - Input area and conversation history
📊 Results Display - Research outputs and analysis

💬 Local Chat Mode

Local Chat mode lets you have conversations with your uploaded documents using either local or cloud AI models.

Step 1: Upload Documents

File Upload Drag and drop files or click to select

Drag and drop files into the upload area, or click to browse
Supported formats: PDF, DOCX, TXT, MD, CSV, XLS, XLSX
Multiple files can be uploaded at once
Files are automatically processed and ready for chat

Step 2: Select Local Chat Mode

Mode Selection Choose Local Chat for document-only conversations

Click the "Local Chat" button (blue)
This mode focuses only on your uploaded documents
No internet search is performed

Step 3: Choose Your AI Provider

Provider Selection Select between Ollama (private) and OpenAI (cloud)

Ollama (Local & Private):
- ✅ 100% privacy - data never leaves your machine
- ✅ No API costs
- ✅ Works offline
- ⚠️ Requires Ollama to be running locally
- ⚠️ May be slower than cloud models

OpenAI (Cloud):
- ✅ Latest AI models (GPT-4o, etc.)
- ✅ Faster responses
- ✅ Advanced reasoning capabilities
- ⚠️ Requires internet connection
- ⚠️ Uses API credits

Step 4: Start Chatting

Local Chat Interface Chat interface with conversation history

Type your question about the documents
Press Enter or click the "Chat" button
Watch the loading indicator while AI processes your request
View the response based on your document content

Example Questions:

"What are the main findings in this research paper?"
"Summarize the key points from these meeting notes"
"What budget items are mentioned in the financial report?"
"Compare the approaches described in these documents"

🔍 Deep Research Mode

Deep Research mode combines your documents with internet search for comprehensive analysis.

Step 1: Select Deep Research Mode

Deep Research mode for comprehensive analysis

Click the "Deep Research" button (teal)
This mode uses both your documents AND internet sources
Provides more comprehensive and current information

Step 2: Configure Research Settings

Research Settings Advanced provider and model selection

Choose Provider: OpenAI or Ollama
Select Model: Different models for different needs
- GPT-4o: Most advanced, best for complex analysis
- GPT-4o Mini: Fast and efficient for most tasks
- Mistral 7B: Local alternative, good balance
Optional: Add custom research questions
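The provider and model guidance above can be written down as a small lookup. This is only an illustrative sketch: `pick_model` is a hypothetical helper, not a function of this project, and the identifiers are assumptions (`gpt-4o` and `gpt-4o-mini` as OpenAI API model names, `mistral:7b` as the Ollama tag used elsewhere in this guide).

```python
# (provider, needs_complex_analysis) -> model identifier. All values are assumptions
# chosen to mirror the list above, not settings read from the application.
MODELS = {
    ("openai", True): "gpt-4o",        # most advanced, best for complex analysis
    ("openai", False): "gpt-4o-mini",  # fast and efficient for most tasks
    ("ollama", True): "mistral:7b",    # local alternative
    ("ollama", False): "mistral:7b",
}


def pick_model(provider: str, complex_analysis: bool) -> str:
    """Return a model identifier for the given provider and task complexity."""
    key = (provider.lower(), complex_analysis)
    if key not in MODELS:
        raise ValueError(f"unknown provider: {provider!r}")
    return MODELS[key]
```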

Step 3: Run Research

Research Process Real-time research progress and results

Enter your research topic
Click "Research" to start
Watch real-time progress as sources are found and analyzed
Review comprehensive report with citations and sources

Research Output Includes:

Executive Summary
Detailed Analysis
Source Citations
Related Documents from your uploads
Downloadable Reports (PDF, DOCX)

⚙️ Provider Selection

Ollama (Local AI)

Ollama local setup and model selection

Setup Requirements:
1. Install Ollama: `brew install ollama` (macOS) or visit ollama.ai
2. Start Ollama: `ollama serve`
3. Download models: `ollama pull mistral:7b`

Available Models:
- Mistral 7B - Balanced performance and speed
- Llama 3.2 3B - Fast and lightweight
- Llama 3.3 70B - Highest quality (requires more RAM)
- Phi-3 Mini - Microsoft's efficient model
- DeepSeek R1 - Great for reasoning tasks

Best For:
- 🔒 Privacy-sensitive documents
- 💰 Cost-conscious usage
- 🌐 Offline environments
- 🏠 Personal projects

OpenAI (Cloud AI)

OpenAI API configuration and model options

Setup Requirements:
1. Get API Key from platform.openai.com
2. Add to .env file: `OPENAI_API_KEY=your_key_here`
3. Monitor usage on OpenAI dashboard
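A quick way to confirm that the key actually made it into the .env file is to parse the file and look for `OPENAI_API_KEY`. The sketch below uses only the standard library and a deliberately simple `KEY=VALUE` parser; `parse_env`, `has_openai_key`, and `check_env_file` are hypothetical helper names, and the application itself may load its settings differently.

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(env: dict) -> bool:
    """True if OPENAI_API_KEY is set to something other than the placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    return bool(key) and key != "your_key_here"


def check_env_file(path: str = ".env") -> bool:
    """Read the given .env file and report whether a usable key is present."""
    return has_openai_key(parse_env(Path(path).read_text(encoding="utf-8")))
```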

Available Models:
- GPT-4o - Most advanced, best reasoning
- GPT-4o Mini - Fast, cost-effective
- GPT-4 Turbo - Good balance of speed and capability

Best For:
- 🧠 Complex analysis and reasoning
- 🚀 Fastest response times
- 📈 Professional/business use
- 🌍 Latest information synthesis

📁 File Management

Supported File Types

Supported document formats

| Format | Extension | Best For |
|--------|-----------|----------|
| PDF | .pdf | Research papers, reports, presentations |
| Word | .docx, .doc | Documents, proposals, notes |
| Text | .txt, .md | Code, documentation, plain text |
| Spreadsheet | .xlsx, .xls, .csv | Data, financial reports, logs |
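The table above can be turned into a pre-upload filter that sorts a batch of files into supported and rejected ones. This is a sketch under the assumption that the extension alone decides support; `classify` and `split_supported` are hypothetical names, and the backend may apply additional checks.

```python
from pathlib import Path

# Extension -> format name, copied from the table above.
FORMATS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".doc": "Word",
    ".txt": "Text",
    ".md": "Text",
    ".xlsx": "Spreadsheet",
    ".xls": "Spreadsheet",
    ".csv": "Spreadsheet",
}


def classify(filename: str):
    """Return the format name for a file, or None if the extension is not listed."""
    return FORMATS.get(Path(filename).suffix.lower())


def split_supported(filenames):
    """Split filenames into (supported, rejected) lists, preserving order."""
    supported = [f for f in filenames if classify(f) is not None]
    rejected = [f for f in filenames if classify(f) is None]
    return supported, rejected
```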

File Operations

File upload, preview, and management interface

Upload:
- Drag & Drop multiple files at once
- Click to browse for traditional file selection
- Automatic processing extracts text content
- Real-time feedback shows upload progress

Manage:
- View uploaded files with size and type info
- Remove individual files using the X button
- Clear all files with the "Clear All" button
- Auto-refresh checks for new files every 5 seconds

Session Persistence

Current session documents display

Files persist between browser sessions
Document indicator shows current files on startup
Ready status confirms documents are processed
File count displays total available documents

🔧 Advanced Features

Custom Research Questions

Custom Questions Define specific research questions for targeted analysis

In Deep Research mode, you can add custom questions:
1. Click "+ Add Question"
2. Enter specific research angles
3. Click "Research These Questions"
4. Get targeted analysis for each question

Chat History Management

Chat History Conversation history and context management

Persistent conversations within sessions
Context awareness - AI remembers previous questions
Visual indicators for user vs assistant messages
Scrollable history for long conversations

Research History Sidebar

Access previous research sessions

Saved research automatically stored
Quick access to previous analyses
Delete options for cleanup
Export functionality for reports

Real-time Progress Tracking

Live research progress and source discovery

Source discovery shows websites being analyzed
Processing steps display current AI operations
Time estimates help manage expectations
Cancel option to stop long-running research

🛠️ Troubleshooting

Common Issues

1. "Provider toggle not showing"

Provider toggle requires uploaded documents

Solution:
- ✅ Upload at least one document first
- ✅ Select "Local Chat" mode
- ✅ Toggle will appear below uploaded files

2. "Ollama not responding"

Ollama connection troubleshooting

Solutions:

```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama service
ollama serve

# Pull required models
ollama pull mistral:7b
ollama pull mxbai-embed-large
```
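The same check can be scripted. The sketch below queries the `/api/tags` endpoint from the curl command above and reports which of the two models named in this section are not installed yet. It assumes the default Ollama address and a response of the form `{"models": [{"name": ...}]}`; `installed_models` and `missing_models` are hypothetical helper names.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default local address, as in the curl command
REQUIRED = ["mistral:7b", "mxbai-embed-large"]  # models pulled in the steps above


def installed_models(base_url: str = OLLAMA_URL, timeout: float = 3.0) -> list:
    """Return the model names reported by the local Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        payload = json.load(resp)
    return [m["name"] for m in payload.get("models", [])]


def missing_models(installed, required=REQUIRED) -> list:
    """Return the required models that do not appear in the installed list."""
    def matches(want: str, have: str) -> bool:
        # An untagged requirement such as "mxbai-embed-large" also matches
        # "mxbai-embed-large:latest"; a tagged one must match exactly.
        return have == want or have.split(":")[0] == want

    return [w for w in required if not any(matches(w, h) for h in installed)]
```

`installed_models()` raises `urllib.error.URLError` when nothing is listening on port 11434, which is itself a sign that `ollama serve` is not running.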

3. "OpenAI API errors"

OpenAI API troubleshooting

Solutions:
- ✅ Check API key in .env file
- ✅ Verify billing and credits on OpenAI dashboard
- ✅ Test API key: `curl https://api.openai.com/v1/models -H "Authorization: Bearer YOUR_KEY"`
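For reference, the curl test above can be reproduced in Python with the standard library. This is a minimal sketch, assuming that a 401 response means the key was rejected; `build_models_request` and `key_is_valid` are hypothetical names, and other failures (network errors, rate limits) are left to propagate.

```python
import urllib.error
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"  # same endpoint as the curl command


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build a GET request carrying the key as a Bearer token."""
    return urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )


def key_is_valid(api_key: str, timeout: float = 10.0) -> bool:
    """Return True on HTTP 200, False on 401; re-raise any other HTTP error."""
    try:
        with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 401:
            return False
        raise
```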

4. "File upload failures"

Upload Error File upload troubleshooting

Solutions:
- ✅ Check file size (max 50MB per file)
- ✅ Verify file format is supported
- ✅ Clear browser cache and cookies
- ✅ Check backend is running on port 8000
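The size limit can be checked before uploading. The sketch below flags files above 50 MB; it assumes the limit means 50 × 1024 × 1024 bytes (the guide does not say whether MB is binary or decimal), and `size_ok` and `oversized` are hypothetical helper names.

```python
import os

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 50 * 1024 * 1024


def size_ok(num_bytes: int, limit: int = MAX_UPLOAD_BYTES) -> bool:
    """True if a file of num_bytes fits within the upload limit."""
    return num_bytes <= limit


def oversized(paths, limit: int = MAX_UPLOAD_BYTES) -> list:
    """Return the paths whose size on disk exceeds the limit."""
    return [p for p in paths if not size_ok(os.path.getsize(p), limit)]
```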

5. "Chat button stuck loading"

Chat loading state troubleshooting

Solutions:
- ✅ Check browser console for errors
- ✅ Verify backend connection
- ✅ Refresh the page and try again
- ✅ Check if Ollama/OpenAI service is responding

Performance Tips

For Best Results:
- 📄 Document Quality: Clean, well-formatted documents work best
- 🧠 Model Selection: Use GPT-4o for complex analysis, Mistral for speed
- 💾 File Management: Keep document count reasonable (< 50 files)
- 🔄 Regular Cleanup: Clear old files to improve performance
- 🌐 Network: Stable internet connection for cloud providers

Getting Help

Debug Information:
- Check browser console (F12) for error messages
- View backend logs in the terminal
- Test API endpoints directly with curl
- Verify environment variables are loaded

Support Resources:
- 📖 Project Documentation
- 💬 GitHub Issues
- 🌟 Community Discord

🎉 Tips for Success

Effective Questioning

Good Questions:
- ✅ "What are the main recommendations in these strategy documents?"
- ✅ "Compare the financial performance across these quarterly reports"
- ✅ "Summarize the key technical specifications mentioned"

Less Effective:
- ❌ "Tell me everything about these files"
- ❌ "What do you think?"
- ❌ Very broad questions without context

Document Organization

Best Practices:
- 📁 Group related documents for coherent analysis
- 📝 Use descriptive filenames
- 🗂️ Clear out old files when switching topics
- 📊 Mix document types for comprehensive insights

Provider Strategy

Ollama When:
- 🔒 Privacy is paramount
- 💰 Controlling costs
- 🏠 Working offline
- 📚 Simple document Q&A

OpenAI When:
- 🧠 Complex reasoning needed
- 🚀 Speed is important
- 🌍 Need latest information
- 💼 Professional analysis

Enjoy using AI Researcher! 🚀

Last updated: January 2025

Owner

Login: minhthai1995
Kind: user

Repositories: 2
Profile: https://github.com/minhthai1995

Citation (citation.cff)

cff-version: 1.0.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Elovic
    given-names: Assaf
title: gpt-researcher
version: 0.5.4
date-released: 2023-07-23
repository-code: https://github.com/assafelovic/gpt-researcher
url: https://gptr.dev

GitHub Events

Total

Push event: 4
Pull request event: 3
Create event: 2

Last Year

Push event: 4
Pull request event: 3
Create event: 2

Dependencies

.github/workflows/docker-build.yml actions

actions/checkout v3 composite
docker/setup-buildx-action v2 composite
docker/setup-qemu-action v2 composite

.github/workflows/docker-push.yml actions

actions/checkout v4 composite
docker/build-push-action v6 composite
docker/login-action v3 composite
docker/metadata-action v5 composite
docker/setup-buildx-action v3 composite

Dockerfile docker

gpt-researcher-install latest build
install-browser latest build
python 3.11.4-slim-bullseye build

docker-compose.yml docker

gptresearcher/gpt-researcher latest
gptresearcher/gpt-researcher-tests latest
gptresearcher/gptr-nextjs latest

docs/discord-bot/Dockerfile docker

node 18.17.0-alpine build

frontend/nextjs/Dockerfile docker

nginx latest build
node 18.17.0-alpine build

docs/discord-bot/package.json npm

discord.js ^14.16.1
dotenv ^16.4.5
express ^4.17.1
jsonrepair ^3.8.0
nodemon ^3.1.4
ws ^8.18.0

docs/npm/package.json npm

ws ^8.18.0

docs/package.json npm

@docusaurus/core 3.7.0
@docusaurus/preset-classic 3.7.0
@easyops-cn/docusaurus-search-local ^0.49.2
@mdx-js/react ^3.1.0
@svgr/webpack ^8.1.0
clsx ^1.1.1
file-loader ^6.2.0
hast-util-is-element 1.1.0
minimatch 3.0.5
react ^18.0.1
react-dom ^18.0.1
rehype-katex ^7.0.1
remark-math 3
trim ^0.0.3
url-loader ^4.1.1

frontend/nextjs/package.json npm

@babel/core ^7.26.9 development
@babel/plugin-syntax-flow ^7.26.0 development
@babel/plugin-transform-typescript ^7.26.8 development
@babel/preset-env ^7.26.9 development
@babel/preset-react ^7.26.3 development
@babel/preset-typescript ^7.26.0 development
@rollup/plugin-alias ^5.1.1 development
@rollup/plugin-babel ^6.0.4 development
@rollup/plugin-commonjs ^28.0.2 development
@rollup/plugin-json ^6.1.0 development
@rollup/plugin-node-resolve ^16.0.0 development
@rollup/plugin-replace ^6.0.2 development
@rollup/plugin-typescript ^12.1.2 development
@types/jsdom ^21.1.6 development
@types/node ^20 development
@types/react ^18 development
@types/react-dom ^18 development
autoprefixer ^10.4.20 development
eslint ^8 development
eslint-config-next 14.2.3 development
postcss ^8 development
prettier ^3.2.5 development
prettier-plugin-tailwindcss ^0.6.0 development
react-ga4 ^2.1.0 development
rollup ^2.79.2 development
rollup-plugin-peer-deps-external ^2.2.4 development
rollup-plugin-postcss ^4.0.2 development
rollup-plugin-terser ^7.0.2 development
rollup-plugin-typescript2 ^0.31.2 development
tailwindcss ^3.4.1 development
typescript ^5 development
@emotion/react ^11.10.5
@emotion/styled ^11.10.5
@langchain/langgraph-sdk ^0.0.1-rc.12
@mozilla/readability ^0.5.0
@next/third-parties ^15.1.6
axios ^1.3.2
date-fns ^4.1.0
eventsource-parser ^1.1.2
framer-motion ^9.0.2
next ^14.2.0
next-plausible ^3.12.0
react ^18.0.0
react-dom ^18.0.0
react-dropzone ^14.2.3
react-ga4 ^2.1.0
react-hot-toast ^2.4.1
rehype-prism-plus ^2.0.0
remark ^15.0.1
remark-gfm ^4.0.1
remark-html ^16.0.1
remark-parse ^11.0.0
zod ^3.0.0
zod-to-json-schema ^3.23.0

multi_agents/package.json npm

@langchain/langgraph-sdk ^0.0.1-rc.13

evals/simple_evals/requirements.txt pypi

pandas >=1.5.0
tqdm >=4.65.0

multi_agents/requirements.txt pypi

json5 *
langgraph *
langgraph-cli *
loguru *
python-dotenv *
weasyprint *

pyproject.toml pypi

PyMuPDF >=1.23.6
SQLAlchemy >=2.0.28
aiofiles >=23.2.1
aiohappyeyeballs >=2.6.1
aiohttp >=3.12.0
aiosignal >=1.3.2
annotated-types >=0.7.0
anyio >=4.9.0
arxiv >=2.0.0
attrs >=25.3.0
backoff >=2.2.1
beautifulsoup4 >=4.12.2
brotli >=1.1.0
certifi >=2025.4.26
cffi >=1.17.1
chardet >=5.2.0
charset-normalizer >=3.4.2
click >=8.2.1
colorama >=0.4.6
cryptography >=45.0.2
cssselect2 >=0.8.0
dataclasses-json >=0.6.7
distro >=1.9.0
docopt >=0.6.2
duckduckgo-search >=4.1.1
duckduckgo_search >=4.1.1
emoji >=2.14.1
fastapi >=0.104.1
feedparser >=6.0.11
filelock >=3.18.0
filetype >=1.2.0
fonttools >=4.58.0
frozenlist >=1.6.0
fsspec >=2025.5.1
greenlet >=3.2.2
h11 >=0.16.0
html5lib >=1.1
htmldocx >=0.0.6
htmldocx ^0.0.6
httpcore >=1.0.9
httpx >=0.28.1
httpx-aiohttp >=0.1.4
httpx-sse >=0.4.0
huggingface-hub >=0.32.0
idna >=3.10
importlib-metadata >=8.7.0
jinja2 >=3.1.6
jinja2 >=3.1.2
jiter >=0.10.0
joblib >=1.5.1
json-repair >=0.44.0
json-repair ^0.29.8
json5 >=0.12.0
json5 ^0.9.25
jsonpatch >=1.33
jsonpointer >=3.0.0
jsonschema >=4.23.0
jsonschema-specifications >=2025.4.1
kiwisolver >=1.4.8
langchain ^0.3.18
langchain-community >=0.3.17
langchain-core >=0.3.61
langchain-ollama >=0.3.3
langchain-openai >=0.3.6
langchain-openai ^0.3.6
langchain-text-splitters >=0.3.8
langchain_community ^0.3.17
langdetect >=1.0.9
langgraph >=0.2.76
langgraph >=0.2.73,<0.3
langgraph-checkpoint >=2.0.26
langgraph-cli >=0.2.10
langgraph-sdk >=0.1.70
langsmith >=0.3.42
litellm >=1.71.0
loguru >=0.7.3
loguru ^0.7.2
lxml >=5.4.0
lxml >=4.9.2
markdown >=3.8
markdown >=3.5.1
markdown2 >=2.5.3
markupsafe >=3.0.2
marshmallow >=3.26.1
mcp >=1.9.1
md2pdf >=1.0.1
mistune >=3.1.3
mistune ^3.0.2
multidict >=6.4.4
mypy-extensions >=1.1.0
nest-asyncio >=1.6.0
nltk >=3.9.1
numpy >=2.2.6
olefile >=0.47
ollama >=0.4.8
openai >=1.82.0
openai >=1.3.3
orjson >=3.10.18
ormsgpack >=1.10.0
packaging >=24.2
pillow >=11.2.1
primp >=0.15.0
propcache >=0.3.1
psutil >=7.0.0
pycparser >=2.22
pydantic >=2.11.5
pydantic >=2.5.1
pydantic-core >=2.33.2
pydantic-settings >=2.9.1
pydyf >=0.11.0
pymupdf >=1.26.0
pypdf >=5.5.0
pyphen >=0.17.2
python >=3.11, <4
python-docx >=1.1.2
python-docx ^1.1.0
python-dotenv >=1.1.0
python-dotenv >=1.0.0
python-iso639 >=2025.2.18
python-magic >=0.4.27
python-multipart >=0.0.6
python-multipart >=0.0.20
python-oxmsg >=0.0.2
pyyaml >=6.0.1
pyyaml >=6.0.2
rapidfuzz >=3.13.0
referencing >=0.36.2
regex >=2024.11.6
requests >=2.31.0
requests >=2.32.3
requests-toolbelt >=1.0.0
rpds-py >=0.25.1
sgmllib3k >=1.0.0
six >=1.17.0
sniffio >=1.3.1
soupsieve >=2.7
sqlalchemy >=2.0.41
sse-starlette >=2.3.5
starlette >=0.46.2
tenacity >=9.1.2
tiktoken >=0.7.0
tiktoken >=0.9.0
tinycss2 >=1.4.0
tinyhtml5 >=2.0.0
tokenizers >=0.21.1
tqdm >=4.67.1
typing-extensions >=4.13.2
typing-inspect >=0.9.0
typing-inspection >=0.4.1
unstructured >=0.13
unstructured >=0.17.2
unstructured-client >=0.35.0
urllib3 >=2.4.0
uvicorn >=0.24.0.post1
uvicorn >=0.34.2
weasyprint >=65.1 ; sys_platform != 'win32'
webencodings >=0.5.1
websockets ^13.1
websockets >=15.0.1
win32-setctime >=1.2.0
wrapt >=1.17.2
yarl >=1.20.0
zipp >=3.21.0
zopfli >=0.2.3.post1
zstandard >=0.23.0

requirements.txt pypi

aiofiles >=23.2.1
aiohappyeyeballs >=2.6.1
aiohttp >=3.12.0
aiosignal >=1.3.2
annotated-types >=0.7.0
anyio >=4.9.0
arxiv >=2.0.0
attrs >=25.3.0
beautifulsoup4 >=4.12.2
brotli >=1.1.0
certifi >=2025.4.26
cffi >=1.17.1
chardet >=5.2.0
charset-normalizer >=3.4.2
click >=8.2.1
colorama >=0.4.6
cryptography >=45.0.2
cssselect2 >=0.8.0
dataclasses-json >=0.6.7
distro >=1.9.0
docopt >=0.6.2
duckduckgo-search >=4.1.1
fastapi >=0.104.1
feedparser >=6.0.11
filelock >=3.18.0
filetype >=1.2.0
fonttools >=4.58.0
frozenlist >=1.6.0
fsspec >=2025.5.1
greenlet >=3.2.2
h11 >=0.16.0
htmldocx >=0.0.6
httpcore >=1.0.9
httpx >=0.28.1
httpx-aiohttp >=0.1.4
httpx-sse >=0.4.0
huggingface-hub >=0.32.0
idna >=3.10
importlib-metadata >=8.7.0
jinja2 >=3.1.6
jiter >=0.10.0
joblib >=1.5.1
json-repair >=0.29.8
json5 >=0.9.25
jsonpatch >=1.33
jsonpointer >=3.0.0
jsonschema >=4.23.0
jsonschema-specifications >=2025.4.1
kiwisolver >=1.4.8
langchain-community >=0.3.17
langchain-core >=0.3.61
langchain-ollama >=0.3.3
langchain-openai >=0.3.6
langchain-text-splitters >=0.3.8
langgraph >=0.2.76
langgraph-checkpoint >=2.0.26
langgraph-cli >=0.2.10
langgraph-sdk >=0.1.70
langsmith >=0.3.42
litellm >=1.71.0
loguru >=0.7.2
lxml >=4.9.2
markdown >=3.5.1
markdown2 >=2.5.3
markupsafe >=3.0.2
marshmallow >=3.26.1
mcp >=1.9.1
md2pdf >=1.0.1
mistune >=3.0.2
multidict >=6.4.4
mypy-extensions >=1.1.0
nest-asyncio >=1.6.0
numpy >=2.2.6
olefile >=0.47
ollama >=0.4.8
openai >=1.3.3
orjson >=3.10.18
ormsgpack >=1.10.0
packaging >=24.2
pillow >=11.2.1
primp >=0.15.0
propcache >=0.3.1
psutil >=7.0.0
pycparser >=2.22
pydantic >=2.5.1
pydantic-core >=2.33.2
pydantic-settings >=2.9.1
pydyf >=0.11.0
pymupdf >=1.23.6
python-docx >=1.1.0
python-dotenv >=1.0.0
python-multipart >=0.0.6
pyyaml >=6.0.1
rapidfuzz >=3.13.0
referencing >=0.36.2
regex >=2024.11.6
requests >=2.31.0
requests-toolbelt >=1.0.0
rpds-py >=0.25.1
sgmllib3k >=1.0.0
six >=1.17.0
sniffio >=1.3.1
soupsieve >=2.7
sqlalchemy >=2.0.28
sse-starlette >=2.3.5
starlette >=0.46.2
tenacity >=9.1.2
tiktoken >=0.7.0
tinycss2 >=1.4.0
tinyhtml5 >=2.0.0
tokenizers >=0.21.1
tqdm >=4.67.1
typing-extensions >=4.13.2
typing-inspect >=0.9.0
typing-inspection >=0.4.1
unstructured >=0.13
unstructured-client >=0.35.0
urllib3 >=2.4.0
uvicorn >=0.24.0.post1
webencodings >=0.5.1
websockets >=13.1
win32-setctime >=1.2.0
wrapt >=1.17.2
yarl >=1.20.0
zipp >=3.21.0
zopfli >=0.2.3.post1
zstandard >=0.23.0

requirements_minimal.txt pypi

aiofiles >=23.2.1
aiohappyeyeballs >=2.6.1
aiohttp >=3.12.0
aiosignal >=1.3.2
annotated-types >=0.7.0
anyio >=4.9.0
arxiv >=2.0.0
attrs >=25.3.0
backoff >=2.2.1
beautifulsoup4 >=4.12.2
brotli >=1.1.0
certifi >=2025.4.26
cffi >=1.17.1
chardet >=5.2.0
charset-normalizer >=3.4.2
click >=8.2.1
colorama >=0.4.6
cryptography >=45.0.2
cssselect2 >=0.8.0
dataclasses-json >=0.6.7
distro >=1.9.0
docopt >=0.6.2
duckduckgo_search >=4.1.1
emoji >=2.14.1
fastapi >=0.104.1
feedparser >=6.0.11
filelock >=3.18.0
filetype >=1.2.0
fonttools >=4.58.0
frozenlist >=1.6.0
fsspec >=2025.5.1
greenlet >=3.2.2
h11 >=0.16.0
html5lib >=1.1
htmldocx >=0.0.6
httpcore >=1.0.9
httpx >=0.28.1
httpx-aiohttp >=0.1.4
httpx-sse >=0.4.0
huggingface-hub >=0.32.0
idna >=3.10
importlib-metadata >=8.7.0
jinja2 >=3.1.2
jiter >=0.10.0
joblib >=1.5.1
json-repair >=0.29.8
json5 >=0.9.25
jsonpatch >=1.33
jsonpointer >=3.0.0
jsonschema >=4.23.0
jsonschema-specifications >=2025.4.1
kiwisolver >=1.4.8
langchain-core >=0.3.60
langchain-ollama >=0.3.3
langchain-openai >=0.3.6
langchain-text-splitters >=0.3.8
langchain_community >=0.3.17
langdetect >=1.0.9
langgraph >=0.2.73,<0.3
langgraph-checkpoint >=2.0.26
langgraph-cli >=0.2.10
langgraph-sdk >=0.1.70
langsmith >=0.3.42
litellm >=1.71.0
loguru >=0.7.3
lxml >=5.4.0
markdown >=3.8
markdown2 >=2.5.3
markupsafe >=3.0.2
marshmallow >=3.26.1
mcp >=1.9.1
md2pdf >=1.0.1
mistune >=3.1.3
multidict >=6.4.4
mypy-extensions >=1.1.0
nest-asyncio >=1.6.0
nltk >=3.9.1
numpy >=2.2.6
olefile >=0.47
ollama >=0.4.8
openai >=1.82.0
orjson >=3.10.18
ormsgpack >=1.10.0
packaging >=24.2
pillow >=11.2.1
primp >=0.15.0
propcache >=0.3.1
psutil >=7.0.0
pycparser >=2.22
pydantic >=2.11.5
pydantic-core >=2.33.2
pydantic-settings >=2.9.1
pydyf >=0.11.0
pymupdf >=1.26.0
pypdf >=5.5.0
pyphen >=0.17.2
python-docx >=1.1.2
python-dotenv >=1.1.0
python-iso639 >=2025.2.18
python-magic >=0.4.27
python-multipart >=0.0.20
python-oxmsg >=0.0.2
pyyaml >=6.0.2
rapidfuzz >=3.13.0
referencing >=0.36.2
regex >=2024.11.6
requests >=2.32.3
requests-toolbelt >=1.0.0
rpds-py >=0.25.1
sgmllib3k >=1.0.0
six >=1.17.0
sniffio >=1.3.1
soupsieve >=2.7
sqlalchemy >=2.0.41
sse-starlette >=2.3.5
starlette >=0.46.2
tenacity >=9.1.2
tiktoken >=0.9.0
tinycss2 >=1.4.0
tinyhtml5 >=2.0.0
tokenizers >=0.21.1
tqdm >=4.67.1
typing-extensions >=4.13.2
typing-inspect >=0.9.0
typing-inspection >=0.4.1
unstructured >=0.17.2
unstructured-client >=0.35.0
urllib3 >=2.4.0
uvicorn >=0.34.2
weasyprint >=65.1
webencodings >=0.5.1
websockets >=15.0.1
win32-setctime >=1.2.0
wrapt >=1.17.2
yarl >=1.20.0
zipp >=3.21.0
zopfli >=0.2.3.post1
zstandard >=0.23.0

setup.py pypi