research-assistant-ai

Research Assistant AI is a powerful application built with Streamlit that enables users to analyze research papers efficiently.

https://github.com/anubhab-m02/research-assistant-ai

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (15.7%) to scientific vocabulary

Keywords

ai ai-assistant artificial-intelligence generative-ai research streamlit
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: anubhab-m02
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 18.6 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Citation

readme.md

Research Assistant AI

Overview

This application is a Research Assistant AI built using Streamlit. It allows users to upload research papers in PDF format or paste their content, and then perform various analyses such as extracting citations, performing semantic searches, asking questions, and summarizing the papers. The application leverages Google Generative AI for natural language processing tasks.

Features

  • Upload and Analyze Research Papers: Upload PDF files or paste text content for analysis.
  • Extract Citations: Extract and format citations from the research papers.
  • Semantic Search: Perform semantic searches across the uploaded papers.
  • Ask Questions: Ask specific questions about the content of the papers and get AI-generated answers.
  • Summarize Papers: Generate concise summaries of the research papers.
  • Find Related Papers: Suggest related research papers based on the analysis.

Prerequisites

  • Python 3.10 or higher
  • Google API Key for accessing Google Generative AI

Installation

  1. Clone the repository and change into the directory that `git clone` creates:
       git clone https://github.com/anubhab-m02/Research-Assistant-AI.git
       cd Research-Assistant-AI

  2. Create and activate a virtual environment:
       python -m venv venv
       source venv/bin/activate   # On Windows, use `venv\Scripts\activate`

  3. Install the dependencies:
       pip install -r requirements.txt

Configuration

  1. Set up your Google API Key:
    • Obtain an API key from Google Cloud Platform.
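The README does not say how the application reads the key once you have it. One common approach, shown here as a minimal sketch and not as the repository's actual mechanism, is to keep the key in an environment variable and fail early when it is missing. The variable name `GOOGLE_API_KEY` and the helper `load_google_api_key` are assumptions made for illustration.

```python
import os


def load_google_api_key(env_var: str = "GOOGLE_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    The variable name is an assumption; the project may use a different one.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before starting the app."
        )
    return key
```

The returned value could then be passed to `google.generativeai.configure(api_key=...)`. Keeping the key out of the source tree also addresses the API key exposure concern listed under Potential Vulnerabilities.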

Running the Application

  1. Start the Streamlit application:
       streamlit run app.py

  2. Open your web browser and navigate to http://localhost:8501 to access the application.

Code Structure

  • app.py: Main entry point of the application. Configures the Streamlit page and handles the main logic.

  • ui_layout.py: Contains functions to render the sidebar and main content layout.

  • analysis.py: Functions for analyzing research papers, comparing papers, and summarizing content.

  • pdf_utils.py: Utility functions for extracting text from PDF files.

  • citation.py: Functions for extracting and formatting citations.

  • semantic_search.py: Functions for performing semantic searches and highlighting text.

  • reference_management.py: Functions for managing references using Zotero.
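The README does not describe how semantic_search.py ranks text. As a rough illustration of the general idea only, the sketch below ranks passages against a query by cosine similarity over word counts. This is a simple term-overlap stand-in, not a true semantic model and not the repository's implementation; the function names `tokenize`, `cosine`, and `rank_passages` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_passages(query: str, passages: list[str]) -> list[tuple[float, str]]:
    """Return (score, passage) pairs sorted from most to least similar."""
    query_counts = Counter(tokenize(query))
    scored = [(cosine(query_counts, Counter(tokenize(p))), p) for p in passages]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    passages = [
        "Neural networks learn representations.",
        "The weather was cold.",
    ]
    for score, passage in rank_passages("neural network representations", passages):
        print(f"{score:.2f}  {passage}")
```

Because it matches exact tokens, this sketch treats "network" and "networks" as unrelated words; an embedding-based or TF-IDF approach would handle such cases better.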

Potential Vulnerabilities

  1. API Key Exposure: Ensure that the Google API key is kept secure and not exposed in the code or version control.

  2. Input Validation: The application does not perform extensive validation on user inputs, which could lead to issues such as injection attacks or processing of malformed data.

  3. Error Handling: While the application includes basic error handling, it could be improved to handle more specific cases and provide more informative error messages.

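  4. PDF Extraction: Text extraction relies on PyPDF2, which may not handle all PDFs correctly. Scanned PDFs usually contain page images without a text layer, so extraction can return little or no text, and protected (encrypted) PDFs must be decrypted before their text can be read.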

  5. Session State Management: The use of Streamlit's session state is convenient but may lead to issues if not managed properly, especially with concurrent users.
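The input validation concern above can be partly addressed by checking uploaded bytes before they reach the PDF parser. The following is a minimal sketch under stated assumptions: the helper `validate_pdf_upload` and the 20 MiB limit are hypothetical and do not come from the repository.

```python
MAX_PDF_BYTES = 20 * 1024 * 1024  # hypothetical size limit (20 MiB)


def validate_pdf_upload(data: bytes, max_bytes: int = MAX_PDF_BYTES) -> None:
    """Raise ValueError if `data` is empty, too large, or lacks a PDF header.

    This is a strict check: it requires the file to begin with "%PDF-",
    so files with leading junk bytes before the header are rejected.
    """
    if not data:
        raise ValueError("The uploaded file is empty.")
    if len(data) > max_bytes:
        raise ValueError(f"The uploaded file exceeds {max_bytes} bytes.")
    if not data.startswith(b"%PDF-"):
        raise ValueError("The uploaded file does not start with a PDF header.")
```

In a Streamlit app, the bytes would typically come from the uploaded file object's `getvalue()` method, and a raised `ValueError` could be shown to the user with `st.error`. A header check does not prove the file is a well-formed PDF, so the parser call should still be wrapped in error handling.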

Contributing

  1. Fork the repository.
  2. Create a new branch for your feature or bugfix:
       git checkout -b feature-name
  3. Commit your changes:
       git commit -m "Description of your changes"
  4. Push to the branch:
       git push origin feature-name
  5. Create a pull request.

For any issues or feature requests, please open an issue on the GitHub repository.

Owner

  • Name: Anubhab Mishra
  • Login: anubhab-m02
  • Kind: user

Secretary @ NSCC SRM | 3rd Yr BTech CSE @ SRMIST, KTR | Alumni of Sai International School, BBSR | 5⭐C++ @ Hackerrank | C, Python, UI/UX

Citation (citation.py)

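from typing import List
import re

def extract_citations(text: str) -> List[dict]:
    """Return parenthesized spans that contain four consecutive digits, e.g. "(Smith, 2020)"."""
    # Simple regex-based citation extraction: matches "(...)" with no nested
    # parentheses and at least one run of four digits (assumed to be a year).
    citations = re.findall(r'\((?:[^()]*\d{4}[^()]*)\)', text)

    # Convert extracted citations to a list of dictionaries
    return [{"raw": citation} for citation in citations]

def format_citation(citation: dict, style: str = 'apa') -> str:
    """Prefix the raw citation with a style label; the citation text itself is not reformatted."""
    raw_citation = citation.get("raw", "")
    style = style.lower()  # Normalize once so the comparison is case-insensitive

    if style == 'apa':
        return f"APA Style: {raw_citation}"
    elif style == 'mla':
        return f"MLA Style: {raw_citation}"
    elif style == 'chicago':
        return f"Chicago Style: {raw_citation}"
    else:
        return f"Unknown Style: {raw_citation}"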
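To illustrate how the extraction pattern behaves, the self-contained sketch below reuses the same regular expression on a sample sentence. It also shows a limitation: any parenthesized text containing four digits is matched, not only author-year citations. The filter `looks_like_citation`, which keeps only matches containing a plausible year (1900 to 2099), is a hypothetical addition and not part of citation.py.

```python
import re

# Same pattern as extract_citations in citation.py.
CITATION_PATTERN = re.compile(r"\((?:[^()]*\d{4}[^()]*)\)")

# Hypothetical stricter check: a standalone year between 1900 and 2099,
# optionally followed by a letter suffix such as "2020a".
YEAR_PATTERN = re.compile(r"\b(?:19|20)\d{2}[a-z]?\b")


def extract_citations(text: str) -> list[dict]:
    """Return every parenthesized span that contains four consecutive digits."""
    return [{"raw": match} for match in CITATION_PATTERN.findall(text)]


def looks_like_citation(raw: str) -> bool:
    """Return True if the span contains a plausible publication year."""
    return YEAR_PATTERN.search(raw) is not None


if __name__ == "__main__":
    sample = (
        "Prior work (Smith, 2020) and (Doe et al., 2019) "
        "evaluated a large corpus (n = 1500 samples)."
    )
    for citation in extract_citations(sample):
        label = "citation" if looks_like_citation(citation["raw"]) else "false positive"
        print(f"{label}: {citation['raw']}")
```

The year filter is still a heuristic: a span such as "(see Table 2, updated 2021)" would pass it, so downstream code should treat the results as candidates and not as verified citations.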

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

requirements.txt pypi
  • PyPDF2 *
  • google-generativeai *
  • pandas *
  • pyzotero *
  • scikit-learn *
  • streamlit *