deep_research_final
Science Score: 23.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ○ .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.6%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: saiteja12-g
- Language: Python
- Default Branch: main
- Size: 309 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Research Paper Assistant
A comprehensive system for extracting, analyzing, and generating review papers from academic research using AI.
Overview
This application provides a complete workflow for academic research paper processing:
- Research Paper Extraction: Automatically fetch papers from arXiv based on your query and follow citation networks
- Knowledge Base Integration: Load extracted papers into a Neo4j graph database and vector database
- Contextual Analysis: Process papers to extract key themes, methodologies, strengths, and limitations
- Review Paper Generation: Generate comprehensive review papers using AI agents
Features
- Intelligent Paper Discovery: BFS traversal of citation networks starting from initial query results
- Graph-based Knowledge Representation: Store papers and their relationships in Neo4j
- Semantic Search: Find related papers using vector embeddings in ChromaDB
- Image Processing: Extract and analyze figures from research papers
- Citation Mapping: Identify and map citations between papers
- AI-Powered Review Generation: Generate structured review papers with proper citations using LLM agents
- Interactive UI: Streamlit-based frontend for easy interaction
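The breadth-first traversal of citation networks listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in papers_extractor_bfs.py: the `bfs_citations` function, the `get_citations` callback, and the toy graph are hypothetical, and the `max_depth` and `max_papers` limits only mirror the names of the `--max-depth` and `--max-papers` options.

```python
from collections import deque
from typing import Callable, Iterable


def bfs_citations(
    seeds: Iterable[str],
    get_citations: Callable[[str], Iterable[str]],
    max_depth: int = 2,
    max_papers: int = 5,
) -> list[str]:
    """Return paper IDs in breadth-first order, starting from the seed papers."""
    visited: list[str] = []
    seen: set[str] = set()
    queue: deque[tuple[str, int]] = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append((seed, 0))

    while queue and len(visited) < max_papers:
        paper_id, depth = queue.popleft()
        visited.append(paper_id)
        if depth >= max_depth:
            continue  # do not follow citations beyond the depth limit
        for cited in get_citations(paper_id):
            if cited not in seen:
                seen.add(cited)
                queue.append((cited, depth + 1))
    return visited


# Toy citation graph with hypothetical IDs: A cites B and C, B cites D, and so on.
toy_graph = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"], "F": []}
order = bfs_citations(["A"], lambda pid: toy_graph.get(pid, []), max_depth=2, max_papers=5)
print(order)  # ['A', 'B', 'C', 'D', 'E']
```

In a real extractor, `get_citations` would presumably resolve the references of a downloaded paper; here it is just a dictionary lookup so the traversal order is easy to follow.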
Setup and Installation
Prerequisites
- Python 3.9+
- Docker (for running Neo4j)
- OpenAI API key
Installation
Clone this repository:
    git clone <your-repository-url>
    cd <repository-directory>
Create a virtual environment (optional but recommended):
    python -m venv .venv
    source .venv/bin/activate  # On Windows: .venv\Scripts\activate
Install the required Python packages:
    pip install -r requirements.txt
Create a .env file in the project root with your OpenAI API key:
    OPENAI_API_KEY=your-api-key-here
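To show how a script might pick up the key from the .env file, here is a dependency-free sketch. The README does not say how the project itself reads this file, so the `load_env_file` helper is hypothetical and handles only plain `KEY=VALUE` lines.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Parse simple KEY=VALUE lines and add them to os.environ if not already set."""
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        key, value = key.strip(), value.strip().strip("'\"")
        loaded[key] = value
        os.environ.setdefault(key, value)  # variables already exported take precedence
    return loaded
```

Calling `load_env_file()` at startup and then checking `os.environ.get("OPENAI_API_KEY")` gives an early, readable failure when the key is missing, instead of an authentication error later in the pipeline.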
Running the App
Using the Workflow Manager (Recommended)
The workflow.py script provides a simplified way to run the complete pipeline or individual steps:
Run the complete workflow:
    python workflow.py --query "Single image to 3D" --full-workflow
Extract papers only:
    python workflow.py --query "Single image to 3D" --extract-only --max-depth 2 --max-papers 5
Load extracted papers to database:
    python workflow.py --load-only
Generate a review paper:
    python workflow.py --query "Single image to 3D" --generate-review
Continue a previously started paper:
    python workflow.py --generate-review --continue
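For readers who want to see how the flags above fit together, the following is a hypothetical `argparse` sketch. It is not the parser in workflow.py: the defaults, the help strings, and the choice to make the mode flags mutually exclusive are all assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Research Paper Assistant workflow (sketch)")
    parser.add_argument("--query", help="research topic, e.g. 'Single image to 3D'")
    parser.add_argument("--max-depth", type=int, default=2, help="citation depth (assumed default)")
    parser.add_argument("--max-papers", type=int, default=5, help="paper limit (assumed default)")

    # Assumption: only one mode is selected per invocation.
    mode = parser.add_mutually_exclusive_group()
    mode.add_argument("--full-workflow", action="store_true")
    mode.add_argument("--extract-only", action="store_true")
    mode.add_argument("--load-only", action="store_true")
    mode.add_argument("--generate-review", action="store_true")

    # "continue" is a Python keyword, so the value is stored under another name.
    parser.add_argument("--continue", dest="continue_paper", action="store_true")
    return parser


args = build_parser().parse_args(
    ["--query", "Single image to 3D", "--extract-only", "--max-depth", "2", "--max-papers", "5"]
)
print(args.query, args.extract_only, args.max_depth, args.max_papers)
```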
Using the Streamlit UI
Start the Streamlit app:
    streamlit run frontend.py
Open your browser and navigate to http://localhost:8501
In the Streamlit interface:
- Configure your environment settings
- Start the Neo4j database
- Run paper extraction, loading, or review generation processes
Running Components Individually
Alternatively, you can run each component separately:
Start Neo4j (required for knowledge storage):
    docker run -p 7474:7474 -p 7687:7687 --env NEO4J_AUTH=neo4j/research123 neo4j:latest
Extract papers from arXiv:
    python papers_extractor_bfs.py
Process and load papers into knowledge base:
    python knowledge_base.py
Generate a review paper:
    python main.py --query "Your research topic"
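Because the knowledge base step depends on the Neo4j container, it can help to wait until the Bolt port published by the `docker run` command (7687) accepts connections before running it. The sketch below uses only the standard library; the `wait_for_port` helper is hypothetical and not part of the project. An open port shows that the container is listening, not necessarily that the database has finished starting.

```python
import socket
import time


def wait_for_port(host: str = "localhost", port: int = 7687, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            time.sleep(0.5)  # not accepting connections yet; retry shortly
    return False
```

A script could call `wait_for_port()` and exit with a clear message when it returns `False`, rather than failing later with a driver connection error.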
Running with Docker (Not Tested)
The application includes Docker support for easy deployment:
Build the Docker image:
    docker build -t research-paper-assistant .
Run using docker-compose (handles both the app and Neo4j):
    docker-compose up
Access the Streamlit interface at http://localhost:8501
Project Structure
- frontend.py: Streamlit application
- workflow.py: Complete workflow manager
- papers_extractor_bfs.py: arXiv paper extraction with BFS traversal
- knowledge_base.py: Database and knowledge storage integrations
- citation_mapper.py: Handles paper citations
- processing_pipeline.py: Text and image processing
- review_writer.py: AI-powered paper generation
- main.py: Command-line interface for review generation
Folder Structure
- /papers: Downloaded PDF files
- /papers_summary: Extracted metadata in JSON format
- /output: Generated review papers and figures
- /chroma_db: Vector embeddings database
- /neo4j: Graph database files
Troubleshooting
- Docker Issues: Ensure Docker is running and you have permission to create containers
- API Rate Limits: If you hit OpenAI API rate limits, add a delay between requests or retry failed calls with increasing wait times
- Memory Issues: Reduce batch sizes in the extraction and processing pipelines for lower memory usage
- Neo4j Connection: Ensure the Neo4j container is running before running knowledge base operations
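The retry advice for API rate limits can be sketched as exponential backoff with jitter. This is a generic illustration rather than code from the project: `retry_with_backoff` and its parameters are hypothetical, and in practice `retry_on` would be narrowed to the rate-limit exception raised by the client library in use.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    retry_on: tuple[type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `call` until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay * 2^attempt, plus random jitter to avoid synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter exists so the waiting behaviour can be replaced in tests; normal callers can leave it at its default.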
Video Demo
Flowcharts
Model Architecture

Agent Workflow

Click the image above to watch the demo video of the Research Paper Assistant in action.
Acknowledgements
Owner
- Name: SAI TEJA GILUKARA
- Login: saiteja12-g
- Kind: user
- Repositories: 1
- Profile: https://github.com/saiteja12-g
Coding maniac
GitHub Events
Total
- Push event: 20
- Create event: 1
Last Year
- Push event: 20
- Create event: 1
