Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.8%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: minhthai1995
  • License: apache-2.0
  • Language: Python
  • Default Branch: master
  • Size: 15.4 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 1
  • Releases: 0
Created 9 months ago · Last pushed 9 months ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

🔬 AI Researcher - User Guide

Welcome to AI Researcher! This guide will walk you through everything you need to know to get the most out of your AI-powered research assistant.

Main interface of AI Researcher

📋 Table of Contents

  1. Getting Started
  2. Understanding the Interface
  3. Local Chat Mode
  4. Deep Research Mode
  5. Provider Selection (Ollama vs OpenAI)
  6. File Management
  7. Advanced Features
  8. Troubleshooting

🚀 Getting Started

Prerequisites

  • Python 3.11+ (pyproject.toml declares python >=3.11)
  • Node.js 18+
  • Ollama (for local AI models)
  • OpenAI API key (for cloud models)

Quick Setup

  1. Clone and install dependencies:

    ```bash
    git clone https://github.com/assafelovic/gpt-researcher
    cd gpt-researcher
    pip install -r requirements.txt
    cd frontend/nextjs && npm install
    ```

  2. Configure environment:

    ```bash
    # Copy and edit your .env file
    cp .env.example .env
    # Add your API keys and settings
    ```

  3. Start the services:

    ```bash
    # Terminal 1: Start backend
    uvicorn backend.server.server:app --host=0.0.0.0 --port=8000 --reload

    # Terminal 2: Start frontend
    cd frontend/nextjs && npm run dev
    ```

  4. Access the application: Open http://localhost:3000 in your browser
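
Once both services are started, a quick reachability check can confirm that the ports named in the steps above (8000 for the backend, 3000 for the frontend) accept connections. The following is a minimal sketch using only the Python standard library; the helper names `port_is_open` and `service_status` are hypothetical and not part of the project.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def service_status(services: dict, host: str = "localhost") -> dict:
    """Map each service name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    # Ports taken from the Quick Setup steps: backend on 8000, frontend on 3000.
    for name, up in service_status({"backend": 8000, "frontend": 3000}).items():
        print(f"{name}: {'reachable' if up else 'not reachable'}")
```

A port that accepts connections only shows that something is listening there; it does not prove the application itself is healthy.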

🎯 Understanding the Interface

Main components of the AI Researcher interface

Key Components

  1. 🏠 Header - Navigation and branding
  2. 📚 Mode Selection - Choose between Local Chat and Deep Research
  3. 📁 File Upload Area - Drag and drop your documents
  4. ⚙️ Provider Toggle - Switch between Ollama (local) and OpenAI (cloud)
  5. 💬 Chat Interface - Input area and conversation history
  6. 📊 Results Display - Research outputs and analysis

💬 Local Chat Mode

Local Chat mode lets you have conversations with your uploaded documents using either local or cloud AI models.

Step 1: Upload Documents

File Upload: Drag and drop files or click to select

  1. Drag and drop files into the upload area, or click to browse
  2. Supported formats: PDF, DOCX, TXT, MD, CSV, XLS, XLSX
  3. Multiple files can be uploaded at once
  4. Files are automatically processed and ready for chat

Step 2: Select Local Chat Mode

Mode Selection: Choose Local Chat for document-only conversations

  • Click the "Local Chat" button (blue)
  • This mode focuses only on your uploaded documents
  • No internet search is performed

Step 3: Choose Your AI Provider

Provider Selection: Select between Ollama (private) and OpenAI (cloud)

Ollama (Local & Private):

  • ✅ 100% privacy - data never leaves your machine
  • ✅ No API costs
  • ✅ Works offline
  • ⚠️ Requires Ollama to be running locally
  • ⚠️ May be slower than cloud models

OpenAI (Cloud):

  • ✅ Latest AI models (GPT-4o, etc.)
  • ✅ Faster responses
  • ✅ Advanced reasoning capabilities
  • ⚠️ Requires internet connection
  • ⚠️ Uses API credits

Step 4: Start Chatting

Local Chat Interface: Chat interface with conversation history

  1. Type your question about the documents
  2. Press Enter or click the "Chat" button
  3. Watch the loading indicator while the AI processes your request
  4. View the response based on your document content

Example Questions:

  • "What are the main findings in this research paper?"
  • "Summarize the key points from these meeting notes"
  • "What budget items are mentioned in the financial report?"
  • "Compare the approaches described in these documents"

🔍 Deep Research Mode

Deep Research mode combines your documents with internet search for comprehensive analysis.

Step 1: Select Deep Research Mode

Deep Research Mode: Deep Research mode for comprehensive analysis

  • Click the "Deep Research" button (teal)
  • This mode uses both your documents AND internet sources
  • Provides more comprehensive and current information

Step 2: Configure Research Settings

Research Settings: Advanced provider and model selection

  1. Choose Provider: OpenAI or Ollama
  2. Select Model: Different models for different needs
    • GPT-4o: Most advanced, best for complex analysis
    • GPT-4o Mini: Fast and efficient for most tasks
    • Mistral 7B: Local alternative, good balance
  3. Optional: Add custom research questions

Step 3: Run Research

Research Process: Real-time research progress and results

  1. Enter your research topic
  2. Click "Research" to start
  3. Watch real-time progress as sources are found and analyzed
  4. Review the comprehensive report with citations and sources

Research Output Includes:

  • Executive Summary
  • Detailed Analysis
  • Source Citations
  • Related Documents from your uploads
  • Downloadable Reports (PDF, DOCX)

⚙️ Provider Selection

Ollama (Local AI)

Ollama local setup and model selection

Setup Requirements:

  1. Install Ollama: brew install ollama (macOS) or visit ollama.ai
  2. Start Ollama: ollama serve
  3. Download models: ollama pull mistral:7b

Available Models:

  • Mistral 7B - Balanced performance and speed
  • Llama 3.2 3B - Fast and lightweight
  • Llama 3.3 70B - Highest quality (requires more RAM)
  • Phi-3 Mini - Microsoft's efficient model
  • DeepSeek R1 - Great for reasoning tasks

Best For:

  • 🔒 Privacy-sensitive documents
  • 💰 Cost-conscious usage
  • 🌐 Offline environments
  • 🏠 Personal projects

OpenAI (Cloud AI)

OpenAI API configuration and model options

Setup Requirements:

  1. Get API Key from platform.openai.com
  2. Add to .env file: OPENAI_API_KEY=your_key_here
  3. Monitor usage on OpenAI dashboard
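
To confirm that the key was actually written to the .env file and is not still the placeholder, a small check can help. This is a minimal standard-library sketch, not the project's own loader (the dependency list includes python-dotenv, which is the more likely mechanism in practice); `parse_env` and `has_real_key` are hypothetical helper names.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_real_key(env: dict, name: str = "OPENAI_API_KEY") -> bool:
    """True when the key is present and is not the placeholder from this guide."""
    value = env.get(name, "")
    return bool(value) and value != "your_key_here"
```

For example, `has_real_key(parse_env(open(".env").read()))` returns False while the file still contains `OPENAI_API_KEY=your_key_here`.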

Available Models:

  • GPT-4o - Most advanced, best reasoning
  • GPT-4o Mini - Fast, cost-effective
  • GPT-4 Turbo - Good balance of speed and capability

Best For:

  • 🧠 Complex analysis and reasoning
  • 🚀 Fastest response times
  • 📈 Professional/business use
  • 🌍 Latest information synthesis


📁 File Management

Supported File Types

Supported document formats

| Format | Extension | Best For |
|--------|-----------|----------|
| PDF | .pdf | Research papers, reports, presentations |
| Word | .docx, .doc | Documents, proposals, notes |
| Text | .txt, .md | Code, documentation, plain text |
| Spreadsheet | .xlsx, .xls, .csv | Data, financial reports, logs |
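
The format table can be expressed as a lookup that groups a batch of files by category before upload. This is an illustrative sketch only: the mapping is copied from the table, while `classify` and `group_by_format` are hypothetical names, and the application's real detection logic may differ (for instance, it may inspect file contents rather than extensions).

```python
from pathlib import Path
from typing import Optional

# Extension-to-format mapping copied from the table above.
FORMATS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".doc": "Word",
    ".txt": "Text",
    ".md": "Text",
    ".xlsx": "Spreadsheet",
    ".xls": "Spreadsheet",
    ".csv": "Spreadsheet",
}


def classify(filename: str) -> Optional[str]:
    """Return the format name for a file, or None if the extension is not listed."""
    return FORMATS.get(Path(filename).suffix.lower())


def group_by_format(filenames: list) -> dict:
    """Group filenames by format; unlisted extensions go under 'Unsupported'."""
    groups = {}
    for name in filenames:
        groups.setdefault(classify(name) or "Unsupported", []).append(name)
    return groups
```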

File Operations

File upload, preview, and management interface

Upload:

  • Drag & Drop multiple files at once
  • Click to browse for traditional file selection
  • Automatic processing extracts text content
  • Real-time feedback shows upload progress

Manage:

  • View uploaded files with size and type info
  • Remove individual files using the X button
  • Clear all files with the "Clear All" button
  • Auto-refresh checks for new files every 5 seconds

Session Persistence

Current session documents display

  • Files persist between browser sessions
  • Document indicator shows current files on startup
  • Ready status confirms documents are processed
  • File count displays total available documents

🔧 Advanced Features

Custom Research Questions

Custom Questions: Define specific research questions for targeted analysis

In Deep Research mode, you can add custom questions:

  1. Click "+ Add Question"
  2. Enter specific research angles
  3. Click "Research These Questions"
  4. Get targeted analysis for each question

Chat History Management

Chat History: Conversation history and context management

  • Persistent conversations within sessions
  • Context awareness - AI remembers previous questions
  • Visual indicators for user vs assistant messages
  • Scrollable history for long conversations
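
One common way to give a chat model "memory" of earlier questions is to resend a bounded window of recent messages with each request. The sketch below illustrates that general idea under stated assumptions; it is not taken from the project's code, and the `ChatHistory` class and its 20-message default are hypothetical.

```python
from collections import deque


class ChatHistory:
    """Keep the most recent messages so each request can carry prior context."""

    def __init__(self, max_messages: int = 20):
        # A bounded deque silently drops the oldest message once the limit is reached.
        self._messages = deque(maxlen=max_messages)

    def add(self, role: str, content: str) -> None:
        if role not in ("user", "assistant"):
            raise ValueError(f"unknown role: {role}")
        self._messages.append({"role": role, "content": content})

    def context(self) -> list:
        """Return the retained messages, oldest first."""
        return list(self._messages)
```

Bounding the window keeps request size predictable in long conversations, at the cost of forgetting the earliest turns.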

Research History Sidebar

Access previous research sessions

  • Saved research automatically stored
  • Quick access to previous analyses
  • Delete options for cleanup
  • Export functionality for reports

Real-time Progress Tracking

Live research progress and source discovery

  • Source discovery shows websites being analyzed
  • Processing steps display current AI operations
  • Time estimates help manage expectations
  • Cancel option to stop long-running research

🛠️ Troubleshooting

Common Issues

1. "Provider toggle not showing"

Provider toggle requires uploaded documents

Solution:

  • ✅ Upload at least one document first
  • ✅ Select "Local Chat" mode
  • ✅ Toggle will appear below uploaded files

2. "Ollama not responding"

Ollama connection troubleshooting

Solutions:

```bash
# Check if Ollama is running
curl http://localhost:11434/api/tags

# Start Ollama service
ollama serve

# Pull required models
ollama pull mistral:7b
ollama pull mxbai-embed-large
```
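
The same check can be scripted to report which of the required models are still missing. This is a minimal sketch: the endpoint comes from the curl command above, but the response shape (a "models" list whose entries carry a "name" field, with untagged pulls stored as ":latest") is an assumption about the Ollama API, and all function names are hypothetical.

```python
import json
import urllib.error
import urllib.request

OLLAMA_TAGS_URL = "http://localhost:11434/api/tags"  # endpoint from the curl check


def _with_tag(name: str) -> str:
    """Assumption: a model pulled without a tag is listed as '<name>:latest'."""
    return name if ":" in name else f"{name}:latest"


def model_names(payload: dict) -> list:
    """Extract model names from an /api/tags response body (assumed shape)."""
    return [m["name"] for m in payload.get("models", []) if "name" in m]


def missing_models(payload: dict, required: list) -> list:
    """Return the required models that are not in the installed list."""
    installed = {_with_tag(n) for n in model_names(payload)}
    return [name for name in required if _with_tag(name) not in installed]


def fetch_tags(url: str = OLLAMA_TAGS_URL, timeout: float = 3.0):
    """Return the parsed response, or None if Ollama cannot be reached."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return None
```

A `None` result from `fetch_tags()` suggests the service is not running; otherwise `missing_models(payload, ["mistral:7b", "mxbai-embed-large"])` lists what still needs to be pulled.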

3. "OpenAI API errors"

OpenAI API troubleshooting

Solutions:

  • ✅ Check API key in .env file
  • ✅ Verify billing and credits on OpenAI dashboard
  • ✅ Test API key: curl https://api.openai.com/v1/models -H "Authorization: Bearer YOUR_KEY"
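
The curl test can also be run from Python, which makes it easy to catch a key that is empty or carries stray whitespace before any request is sent. This is a standard-library sketch mirroring the curl command; `build_models_request` and `key_is_accepted` are hypothetical names, and treating any HTTP error as "not accepted" is a simplifying assumption.

```python
import urllib.error
import urllib.request

MODELS_URL = "https://api.openai.com/v1/models"  # same endpoint as the curl test


def build_models_request(api_key: str) -> urllib.request.Request:
    """Build the GET request that the curl command sends."""
    if not api_key or api_key.strip() != api_key:
        raise ValueError("API key is empty or has surrounding whitespace")
    return urllib.request.Request(
        MODELS_URL, headers={"Authorization": f"Bearer {api_key}"}
    )


def key_is_accepted(api_key: str, timeout: float = 10.0) -> bool:
    """Return True on HTTP 200 and False on an HTTP error such as 401.

    Network failures (no connection, DNS errors) are not caught here and
    propagate as urllib.error.URLError.
    """
    try:
        with urllib.request.urlopen(build_models_request(api_key), timeout=timeout) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        return False
```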

4. "File upload failures"

Upload Error: File upload troubleshooting

Solutions:

  • ✅ Check file size (max 50MB per file)
  • ✅ Verify file format is supported
  • ✅ Clear browser cache and cookies
  • ✅ Check backend is running on port 8000
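
The first two checks can be automated before a file is uploaded. The sketch below assumes the 50MB limit means 50 × 1024 × 1024 bytes and uses the extension list from "Step 1: Upload Documents"; `upload_problems` is a hypothetical helper, and the backend's actual validation may differ.

```python
from pathlib import Path

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 50 * 1024 * 1024
# Formats listed under "Step 1: Upload Documents".
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv", ".xls", ".xlsx"}


def upload_problems(filename: str, size_bytes: int) -> list:
    """Return human-readable reasons the upload may fail; empty if none are found."""
    problems = []
    if size_bytes > MAX_FILE_BYTES:
        size_mb = size_bytes / (1024 * 1024)
        problems.append(f"{filename} is {size_mb:.1f} MB; the limit is 50 MB")
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        problems.append(f"{filename} has unsupported extension {suffix or '(none)'}")
    return problems
```

For a file on disk, the size can be taken from `Path(filename).stat().st_size`.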

5. "Chat button stuck loading"

Loading Issue: Chat loading state troubleshooting

Solutions:

  • ✅ Check browser console for errors
  • ✅ Verify backend connection
  • ✅ Refresh the page and try again
  • ✅ Check if Ollama/OpenAI service is responding

Performance Tips

For Best Results:

  • 📄 Document Quality: Clean, well-formatted documents work best
  • 🧠 Model Selection: Use GPT-4o for complex analysis, Mistral for speed
  • 💾 File Management: Keep document count reasonable (< 50 files)
  • 🔄 Regular Cleanup: Clear old files to improve performance
  • 🌐 Network: Stable internet connection for cloud providers

Getting Help

Debug Information:

  • Check browser console (F12) for error messages
  • View backend logs in the terminal
  • Test API endpoints directly with curl
  • Verify environment variables are loaded

Support Resources:

  • 📖 Project Documentation
  • 💬 GitHub Issues
  • 🌟 Community Discord


🎉 Tips for Success

Effective Questioning

Good Questions:

  • ✅ "What are the main recommendations in these strategy documents?"
  • ✅ "Compare the financial performance across these quarterly reports"
  • ✅ "Summarize the key technical specifications mentioned"

Less Effective:

  • ❌ "Tell me everything about these files"
  • ❌ "What do you think?"
  • ❌ Very broad questions without context

Document Organization

Best Practices:

  • 📁 Group related documents for coherent analysis
  • 📝 Use descriptive filenames
  • 🗂️ Clear out old files when switching topics
  • 📊 Mix document types for comprehensive insights

Provider Strategy

Ollama When:

  • 🔒 Privacy is paramount
  • 💰 Controlling costs
  • 🏠 Working offline
  • 📚 Simple document Q&A

OpenAI When:

  • 🧠 Complex reasoning needed
  • 🚀 Speed is important
  • 🌍 Need latest information
  • 💼 Professional analysis
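
The guidance above can be read as a simple decision rule. The sketch below is one possible encoding and nothing more: the `Needs` fields, the priority given to privacy and offline use, and the default to the local provider are all assumptions made for illustration, not behavior of the application.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical description of what a task requires."""
    private: bool = False
    offline: bool = False
    complex_reasoning: bool = False
    speed: bool = False


def choose_provider(needs: Needs) -> str:
    """Pick 'ollama' or 'openai' following the strategy lists above."""
    # Privacy and offline use are treated as hard constraints:
    # only the local provider can satisfy them.
    if needs.private or needs.offline:
        return "ollama"
    if needs.complex_reasoning or needs.speed:
        return "openai"
    # Simple document Q&A: default to the local option with no API cost.
    return "ollama"
```

For instance, `choose_provider(Needs(private=True, complex_reasoning=True))` returns "ollama", because the sketch ranks privacy above reasoning quality.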


Enjoy using AI Researcher! 🚀

Last updated: January 2025

Owner

  • Login: minhthai1995
  • Kind: user

Citation (citation.cff)

cff-version: 1.0.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Elovic
    given-names: Assaf
title: gpt-researcher
version: 0.5.4
date-released: 2023-07-23
repository-code: https://github.com/assafelovic/gpt-researcher
url: https://gptr.dev

GitHub Events

Total
  • Push event: 4
  • Pull request event: 3
  • Create event: 2
Last Year
  • Push event: 4
  • Pull request event: 3
  • Create event: 2

Dependencies

.github/workflows/docker-build.yml actions
  • actions/checkout v3 composite
  • docker/setup-buildx-action v2 composite
  • docker/setup-qemu-action v2 composite
.github/workflows/docker-push.yml actions
  • actions/checkout v4 composite
  • docker/build-push-action v6 composite
  • docker/login-action v3 composite
  • docker/metadata-action v5 composite
  • docker/setup-buildx-action v3 composite
Dockerfile docker
  • gpt-researcher-install latest build
  • install-browser latest build
  • python 3.11.4-slim-bullseye build
docker-compose.yml docker
  • gptresearcher/gpt-researcher latest
  • gptresearcher/gpt-researcher-tests latest
  • gptresearcher/gptr-nextjs latest
docs/discord-bot/Dockerfile docker
  • node 18.17.0-alpine build
frontend/nextjs/Dockerfile docker
  • nginx latest build
  • node 18.17.0-alpine build
docs/discord-bot/package.json npm
  • discord.js ^14.16.1
  • dotenv ^16.4.5
  • express ^4.17.1
  • jsonrepair ^3.8.0
  • nodemon ^3.1.4
  • ws ^8.18.0
docs/npm/package.json npm
  • ws ^8.18.0
docs/package.json npm
  • @docusaurus/core 3.7.0
  • @docusaurus/preset-classic 3.7.0
  • @easyops-cn/docusaurus-search-local ^0.49.2
  • @mdx-js/react ^3.1.0
  • @svgr/webpack ^8.1.0
  • clsx ^1.1.1
  • file-loader ^6.2.0
  • hast-util-is-element 1.1.0
  • minimatch 3.0.5
  • react ^18.0.1
  • react-dom ^18.0.1
  • rehype-katex ^7.0.1
  • remark-math 3
  • trim ^0.0.3
  • url-loader ^4.1.1
frontend/nextjs/package.json npm
  • @babel/core ^7.26.9 development
  • @babel/plugin-syntax-flow ^7.26.0 development
  • @babel/plugin-transform-typescript ^7.26.8 development
  • @babel/preset-env ^7.26.9 development
  • @babel/preset-react ^7.26.3 development
  • @babel/preset-typescript ^7.26.0 development
  • @rollup/plugin-alias ^5.1.1 development
  • @rollup/plugin-babel ^6.0.4 development
  • @rollup/plugin-commonjs ^28.0.2 development
  • @rollup/plugin-json ^6.1.0 development
  • @rollup/plugin-node-resolve ^16.0.0 development
  • @rollup/plugin-replace ^6.0.2 development
  • @rollup/plugin-typescript ^12.1.2 development
  • @types/jsdom ^21.1.6 development
  • @types/node ^20 development
  • @types/react ^18 development
  • @types/react-dom ^18 development
  • autoprefixer ^10.4.20 development
  • eslint ^8 development
  • eslint-config-next 14.2.3 development
  • postcss ^8 development
  • prettier ^3.2.5 development
  • prettier-plugin-tailwindcss ^0.6.0 development
  • react-ga4 ^2.1.0 development
  • rollup ^2.79.2 development
  • rollup-plugin-peer-deps-external ^2.2.4 development
  • rollup-plugin-postcss ^4.0.2 development
  • rollup-plugin-terser ^7.0.2 development
  • rollup-plugin-typescript2 ^0.31.2 development
  • tailwindcss ^3.4.1 development
  • typescript ^5 development
  • @emotion/react ^11.10.5
  • @emotion/styled ^11.10.5
  • @langchain/langgraph-sdk ^0.0.1-rc.12
  • @mozilla/readability ^0.5.0
  • @next/third-parties ^15.1.6
  • axios ^1.3.2
  • date-fns ^4.1.0
  • eventsource-parser ^1.1.2
  • framer-motion ^9.0.2
  • next ^14.2.0
  • next-plausible ^3.12.0
  • react ^18.0.0
  • react-dom ^18.0.0
  • react-dropzone ^14.2.3
  • react-ga4 ^2.1.0
  • react-hot-toast ^2.4.1
  • rehype-prism-plus ^2.0.0
  • remark ^15.0.1
  • remark-gfm ^4.0.1
  • remark-html ^16.0.1
  • remark-parse ^11.0.0
  • zod ^3.0.0
  • zod-to-json-schema ^3.23.0
multi_agents/package.json npm
  • @langchain/langgraph-sdk ^0.0.1-rc.13
evals/simple_evals/requirements.txt pypi
  • pandas >=1.5.0
  • tqdm >=4.65.0
multi_agents/requirements.txt pypi
  • json5 *
  • langgraph *
  • langgraph-cli *
  • loguru *
  • python-dotenv *
  • weasyprint *
pyproject.toml pypi
  • PyMuPDF >=1.23.6
  • SQLAlchemy >=2.0.28
  • aiofiles >=23.2.1
  • aiohappyeyeballs >=2.6.1
  • aiohttp >=3.12.0
  • aiosignal >=1.3.2
  • annotated-types >=0.7.0
  • anyio >=4.9.0
  • arxiv >=2.0.0
  • attrs >=25.3.0
  • backoff >=2.2.1
  • beautifulsoup4 >=4.12.2
  • brotli >=1.1.0
  • certifi >=2025.4.26
  • cffi >=1.17.1
  • chardet >=5.2.0
  • charset-normalizer >=3.4.2
  • click >=8.2.1
  • colorama >=0.4.6
  • cryptography >=45.0.2
  • cssselect2 >=0.8.0
  • dataclasses-json >=0.6.7
  • distro >=1.9.0
  • docopt >=0.6.2
  • duckduckgo-search >=4.1.1
  • duckduckgo_search >=4.1.1
  • emoji >=2.14.1
  • fastapi >=0.104.1
  • feedparser >=6.0.11
  • filelock >=3.18.0
  • filetype >=1.2.0
  • fonttools >=4.58.0
  • frozenlist >=1.6.0
  • fsspec >=2025.5.1
  • greenlet >=3.2.2
  • h11 >=0.16.0
  • html5lib >=1.1
  • htmldocx >=0.0.6
  • htmldocx ^0.0.6
  • httpcore >=1.0.9
  • httpx >=0.28.1
  • httpx-aiohttp >=0.1.4
  • httpx-sse >=0.4.0
  • huggingface-hub >=0.32.0
  • idna >=3.10
  • importlib-metadata >=8.7.0
  • jinja2 >=3.1.6
  • jinja2 >=3.1.2
  • jiter >=0.10.0
  • joblib >=1.5.1
  • json-repair >=0.44.0
  • json-repair ^0.29.8
  • json5 >=0.12.0
  • json5 ^0.9.25
  • jsonpatch >=1.33
  • jsonpointer >=3.0.0
  • jsonschema >=4.23.0
  • jsonschema-specifications >=2025.4.1
  • kiwisolver >=1.4.8
  • langchain ^0.3.18
  • langchain-community >=0.3.17
  • langchain-core >=0.3.61
  • langchain-ollama >=0.3.3
  • langchain-openai >=0.3.6
  • langchain-openai ^0.3.6
  • langchain-text-splitters >=0.3.8
  • langchain_community ^0.3.17
  • langdetect >=1.0.9
  • langgraph >=0.2.76
  • langgraph >=0.2.73,<0.3
  • langgraph-checkpoint >=2.0.26
  • langgraph-cli >=0.2.10
  • langgraph-sdk >=0.1.70
  • langsmith >=0.3.42
  • litellm >=1.71.0
  • loguru >=0.7.3
  • loguru ^0.7.2
  • lxml >=5.4.0
  • lxml >=4.9.2
  • markdown >=3.8
  • markdown >=3.5.1
  • markdown2 >=2.5.3
  • markupsafe >=3.0.2
  • marshmallow >=3.26.1
  • mcp >=1.9.1
  • md2pdf >=1.0.1
  • mistune >=3.1.3
  • mistune ^3.0.2
  • multidict >=6.4.4
  • mypy-extensions >=1.1.0
  • nest-asyncio >=1.6.0
  • nltk >=3.9.1
  • numpy >=2.2.6
  • olefile >=0.47
  • ollama >=0.4.8
  • openai >=1.82.0
  • openai >=1.3.3
  • orjson >=3.10.18
  • ormsgpack >=1.10.0
  • packaging >=24.2
  • pillow >=11.2.1
  • primp >=0.15.0
  • propcache >=0.3.1
  • psutil >=7.0.0
  • pycparser >=2.22
  • pydantic >=2.11.5
  • pydantic >=2.5.1
  • pydantic-core >=2.33.2
  • pydantic-settings >=2.9.1
  • pydyf >=0.11.0
  • pymupdf >=1.26.0
  • pypdf >=5.5.0
  • pyphen >=0.17.2
  • python >=3.11, <4
  • python-docx >=1.1.2
  • python-docx ^1.1.0
  • python-dotenv >=1.1.0
  • python-dotenv >=1.0.0
  • python-iso639 >=2025.2.18
  • python-magic >=0.4.27
  • python-multipart >=0.0.6
  • python-multipart >=0.0.20
  • python-oxmsg >=0.0.2
  • pyyaml >=6.0.1
  • pyyaml >=6.0.2
  • rapidfuzz >=3.13.0
  • referencing >=0.36.2
  • regex >=2024.11.6
  • requests >=2.31.0
  • requests >=2.32.3
  • requests-toolbelt >=1.0.0
  • rpds-py >=0.25.1
  • sgmllib3k >=1.0.0
  • six >=1.17.0
  • sniffio >=1.3.1
  • soupsieve >=2.7
  • sqlalchemy >=2.0.41
  • sse-starlette >=2.3.5
  • starlette >=0.46.2
  • tenacity >=9.1.2
  • tiktoken >=0.7.0
  • tiktoken >=0.9.0
  • tinycss2 >=1.4.0
  • tinyhtml5 >=2.0.0
  • tokenizers >=0.21.1
  • tqdm >=4.67.1
  • typing-extensions >=4.13.2
  • typing-inspect >=0.9.0
  • typing-inspection >=0.4.1
  • unstructured >=0.13
  • unstructured >=0.17.2
  • unstructured-client >=0.35.0
  • urllib3 >=2.4.0
  • uvicorn >=0.24.0.post1
  • uvicorn >=0.34.2
  • weasyprint >=65.1 ; sys_platform != 'win32'
  • webencodings >=0.5.1
  • websockets ^13.1
  • websockets >=15.0.1
  • win32-setctime >=1.2.0
  • wrapt >=1.17.2
  • yarl >=1.20.0
  • zipp >=3.21.0
  • zopfli >=0.2.3.post1
  • zstandard >=0.23.0
requirements.txt pypi
  • aiofiles >=23.2.1
  • aiohappyeyeballs >=2.6.1
  • aiohttp >=3.12.0
  • aiosignal >=1.3.2
  • annotated-types >=0.7.0
  • anyio >=4.9.0
  • arxiv >=2.0.0
  • attrs >=25.3.0
  • beautifulsoup4 >=4.12.2
  • brotli >=1.1.0
  • certifi >=2025.4.26
  • cffi >=1.17.1
  • chardet >=5.2.0
  • charset-normalizer >=3.4.2
  • click >=8.2.1
  • colorama >=0.4.6
  • cryptography >=45.0.2
  • cssselect2 >=0.8.0
  • dataclasses-json >=0.6.7
  • distro >=1.9.0
  • docopt >=0.6.2
  • duckduckgo-search >=4.1.1
  • fastapi >=0.104.1
  • feedparser >=6.0.11
  • filelock >=3.18.0
  • filetype >=1.2.0
  • fonttools >=4.58.0
  • frozenlist >=1.6.0
  • fsspec >=2025.5.1
  • greenlet >=3.2.2
  • h11 >=0.16.0
  • htmldocx >=0.0.6
  • httpcore >=1.0.9
  • httpx >=0.28.1
  • httpx-aiohttp >=0.1.4
  • httpx-sse >=0.4.0
  • huggingface-hub >=0.32.0
  • idna >=3.10
  • importlib-metadata >=8.7.0
  • jinja2 >=3.1.6
  • jiter >=0.10.0
  • joblib >=1.5.1
  • json-repair >=0.29.8
  • json5 >=0.9.25
  • jsonpatch >=1.33
  • jsonpointer >=3.0.0
  • jsonschema >=4.23.0
  • jsonschema-specifications >=2025.4.1
  • kiwisolver >=1.4.8
  • langchain-community >=0.3.17
  • langchain-core >=0.3.61
  • langchain-ollama >=0.3.3
  • langchain-openai >=0.3.6
  • langchain-text-splitters >=0.3.8
  • langgraph >=0.2.76
  • langgraph-checkpoint >=2.0.26
  • langgraph-cli >=0.2.10
  • langgraph-sdk >=0.1.70
  • langsmith >=0.3.42
  • litellm >=1.71.0
  • loguru >=0.7.2
  • lxml >=4.9.2
  • markdown >=3.5.1
  • markdown2 >=2.5.3
  • markupsafe >=3.0.2
  • marshmallow >=3.26.1
  • mcp >=1.9.1
  • md2pdf >=1.0.1
  • mistune >=3.0.2
  • multidict >=6.4.4
  • mypy-extensions >=1.1.0
  • nest-asyncio >=1.6.0
  • numpy >=2.2.6
  • olefile >=0.47
  • ollama >=0.4.8
  • openai >=1.3.3
  • orjson >=3.10.18
  • ormsgpack >=1.10.0
  • packaging >=24.2
  • pillow >=11.2.1
  • primp >=0.15.0
  • propcache >=0.3.1
  • psutil >=7.0.0
  • pycparser >=2.22
  • pydantic >=2.5.1
  • pydantic-core >=2.33.2
  • pydantic-settings >=2.9.1
  • pydyf >=0.11.0
  • pymupdf >=1.23.6
  • python-docx >=1.1.0
  • python-dotenv >=1.0.0
  • python-multipart >=0.0.6
  • pyyaml >=6.0.1
  • rapidfuzz >=3.13.0
  • referencing >=0.36.2
  • regex >=2024.11.6
  • requests >=2.31.0
  • requests-toolbelt >=1.0.0
  • rpds-py >=0.25.1
  • sgmllib3k >=1.0.0
  • six >=1.17.0
  • sniffio >=1.3.1
  • soupsieve >=2.7
  • sqlalchemy >=2.0.28
  • sse-starlette >=2.3.5
  • starlette >=0.46.2
  • tenacity >=9.1.2
  • tiktoken >=0.7.0
  • tinycss2 >=1.4.0
  • tinyhtml5 >=2.0.0
  • tokenizers >=0.21.1
  • tqdm >=4.67.1
  • typing-extensions >=4.13.2
  • typing-inspect >=0.9.0
  • typing-inspection >=0.4.1
  • unstructured >=0.13
  • unstructured-client >=0.35.0
  • urllib3 >=2.4.0
  • uvicorn >=0.24.0.post1
  • webencodings >=0.5.1
  • websockets >=13.1
  • win32-setctime >=1.2.0
  • wrapt >=1.17.2
  • yarl >=1.20.0
  • zipp >=3.21.0
  • zopfli >=0.2.3.post1
  • zstandard >=0.23.0
requirements_minimal.txt pypi
  • aiofiles >=23.2.1
  • aiohappyeyeballs >=2.6.1
  • aiohttp >=3.12.0
  • aiosignal >=1.3.2
  • annotated-types >=0.7.0
  • anyio >=4.9.0
  • arxiv >=2.0.0
  • attrs >=25.3.0
  • backoff >=2.2.1
  • beautifulsoup4 >=4.12.2
  • brotli >=1.1.0
  • certifi >=2025.4.26
  • cffi >=1.17.1
  • chardet >=5.2.0
  • charset-normalizer >=3.4.2
  • click >=8.2.1
  • colorama >=0.4.6
  • cryptography >=45.0.2
  • cssselect2 >=0.8.0
  • dataclasses-json >=0.6.7
  • distro >=1.9.0
  • docopt >=0.6.2
  • duckduckgo_search >=4.1.1
  • emoji >=2.14.1
  • fastapi >=0.104.1
  • feedparser >=6.0.11
  • filelock >=3.18.0
  • filetype >=1.2.0
  • fonttools >=4.58.0
  • frozenlist >=1.6.0
  • fsspec >=2025.5.1
  • greenlet >=3.2.2
  • h11 >=0.16.0
  • html5lib >=1.1
  • htmldocx >=0.0.6
  • httpcore >=1.0.9
  • httpx >=0.28.1
  • httpx-aiohttp >=0.1.4
  • httpx-sse >=0.4.0
  • huggingface-hub >=0.32.0
  • idna >=3.10
  • importlib-metadata >=8.7.0
  • jinja2 >=3.1.2
  • jiter >=0.10.0
  • joblib >=1.5.1
  • json-repair >=0.29.8
  • json5 >=0.9.25
  • jsonpatch >=1.33
  • jsonpointer >=3.0.0
  • jsonschema >=4.23.0
  • jsonschema-specifications >=2025.4.1
  • kiwisolver >=1.4.8
  • langchain-core >=0.3.60
  • langchain-ollama >=0.3.3
  • langchain-openai >=0.3.6
  • langchain-text-splitters >=0.3.8
  • langchain_community >=0.3.17
  • langdetect >=1.0.9
  • langgraph >=0.2.73,<0.3
  • langgraph-checkpoint >=2.0.26
  • langgraph-cli >=0.2.10
  • langgraph-sdk >=0.1.70
  • langsmith >=0.3.42
  • litellm >=1.71.0
  • loguru >=0.7.3
  • lxml >=5.4.0
  • markdown >=3.8
  • markdown2 >=2.5.3
  • markupsafe >=3.0.2
  • marshmallow >=3.26.1
  • mcp >=1.9.1
  • md2pdf >=1.0.1
  • mistune >=3.1.3
  • multidict >=6.4.4
  • mypy-extensions >=1.1.0
  • nest-asyncio >=1.6.0
  • nltk >=3.9.1
  • numpy >=2.2.6
  • olefile >=0.47
  • ollama >=0.4.8
  • openai >=1.82.0
  • orjson >=3.10.18
  • ormsgpack >=1.10.0
  • packaging >=24.2
  • pillow >=11.2.1
  • primp >=0.15.0
  • propcache >=0.3.1
  • psutil >=7.0.0
  • pycparser >=2.22
  • pydantic >=2.11.5
  • pydantic-core >=2.33.2
  • pydantic-settings >=2.9.1
  • pydyf >=0.11.0
  • pymupdf >=1.26.0
  • pypdf >=5.5.0
  • pyphen >=0.17.2
  • python-docx >=1.1.2
  • python-dotenv >=1.1.0
  • python-iso639 >=2025.2.18
  • python-magic >=0.4.27
  • python-multipart >=0.0.20
  • python-oxmsg >=0.0.2
  • pyyaml >=6.0.2
  • rapidfuzz >=3.13.0
  • referencing >=0.36.2
  • regex >=2024.11.6
  • requests >=2.32.3
  • requests-toolbelt >=1.0.0
  • rpds-py >=0.25.1
  • sgmllib3k >=1.0.0
  • six >=1.17.0
  • sniffio >=1.3.1
  • soupsieve >=2.7
  • sqlalchemy >=2.0.41
  • sse-starlette >=2.3.5
  • starlette >=0.46.2
  • tenacity >=9.1.2
  • tiktoken >=0.9.0
  • tinycss2 >=1.4.0
  • tinyhtml5 >=2.0.0
  • tokenizers >=0.21.1
  • tqdm >=4.67.1
  • typing-extensions >=4.13.2
  • typing-inspect >=0.9.0
  • typing-inspection >=0.4.1
  • unstructured >=0.17.2
  • unstructured-client >=0.35.0
  • urllib3 >=2.4.0
  • uvicorn >=0.34.2
  • weasyprint >=65.1
  • webencodings >=0.5.1
  • websockets >=15.0.1
  • win32-setctime >=1.2.0
  • wrapt >=1.17.2
  • yarl >=1.20.0
  • zipp >=3.21.0
  • zopfli >=0.2.3.post1
  • zstandard >=0.23.0
setup.py pypi