research-tool

https://github.com/niclashart/research-tool

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓
CITATION.cff file
Found CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
○
DOI references
○
Academic publication links
○
Academic email domains
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (7.3%) to scientific vocabulary

Last synced: 11 months ago · JSON representation ·

Repository

Basic Info

Host: GitHub
Owner: niclashart
License: mit
Language: HTML
Default Branch: main
Size: 7.91 MB

Statistics

Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Releases: 0

Created about 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

AI Research Hub

Ein intelligentes Forschungs-Tool, das automatisch KI-relevante Artikel aus verschiedenen Quellen sammelt, zusammenfasst und analysiert. Mit integriertem AI-Chat-Assistenten fr erweiterte Recherche und Analyse.

Features

Dashboard

Echtzeit-bersicht ber aktuelle KI-Forschung und News
Interaktive Diagramme fr Trends und Quellen-Statistiken
Top-Keywords und Trending-Topics-Analyse
Responsive Design mit Dark/Light Mode

Multi-Source Datensammlung

ArXiv: Aktuelle KI-Forschungsarbeiten mit Zitationsstilen
TechCrunch: KI-relevante Tech-News
VentureBeat: Startup- und Business-KI-News
Stanford AI Blog: Akademische KI-Insights
The Verge: Tech-Journalismus zu KI-Themen
The Hacker News: Cybersecurity mit KI-Fokus

AI-Chat-Assistent

OpenAI-Integration fr intelligente Artikel-Analyse
Kontextuelle Antworten basierend auf gesammelten Artikeln
Mehrsprachige Untersttzung (DE/EN/FR/ZH)
Persistente Chat-Historie

Erweiterte Funktionen

Automatische Relevanz-Bewertung mit KI-Keywords
Flexible Export-Optionen (Word, JSON, Text)
Caching-System fr Performance-Optimierung
Benutzerfreundliche Einstellungen

Installation

Voraussetzungen

Python 3.8+
Google Chrome (fr Web-Scraping)
OpenAI API Key

1. Repository klonen

bash git clone https://github.com/yourusername/ai-research-hub.git cd ai-research-hub

2. Virtual Environment erstellen

```bash python -m venv venv source venv/bin/activate # Linux/Mac

oder

venv\Scripts\activate # Windows ```

3. Dependencies installieren

bash pip install -r requirements.txt

4. Umgebungsvariablen konfigurieren

Erstellen Sie eine .env-Datei im Projektverzeichnis: ```env

OpenAI API Key (erforderlich)

OPENAIAPIKEY=youropenaiapikeyhere

Optional: Weitere Einstellungen

FLASKSECRETKEY=yoursecretkeyhere OPENAIMODEL=gpt-3.5-turbo THEME=light DEFAULTSOURCES=arxiv,techcrunch,venturebeat MAXRESULTS=10 DBPATH=researchdata.db ```

5. Anwendung starten

bash python app.py

Die Anwendung ist dann unter http://localhost:5000 verfgbar.

Verwendung

1. Quellen konfigurieren

Navigieren Sie zur Sources-Seite
Whlen Sie gewnschte Datenquellen aus
Konfigurieren Sie Parameter (max. Artikel, ArXiv-Kategorien, etc.)
Klicken Sie auf "Daten abrufen"

2. Dashboard analysieren

Aktuelle Artikel durchstbern
Trend-Diagramme analysieren
Top-Keywords verfolgen
Quellen-Statistiken einsehen

3. AI-Chat nutzen

Stellen Sie Fragen zu gesammelten Artikeln
Lassen Sie sich Trends erklren
Vergleichen Sie verschiedene Forschungsanstze
Exportieren Sie Chat-Verlufe

4. Berichte exportieren

Word-Dokumente mit Zusammenfassungen
JSON-Daten fr weitere Analyse
Verschiedene Zitationsstile (APA, MLA, Chicago, BibTeX)

Technische Details

Architektur

```

Frontend Backend Data Layer
(HTML/JS) (Flask) (SQLite)

                      AI Services   
                      (OpenAI)

```

Hauptkomponenten

app.py: Haupt-Flask-Anwendung
scrapers/: Datensammlung von verschiedenen Quellen
templates/: Frontend-Templates
static/: CSS, JavaScript, Assets
db_manager.py: Datenbankoperationen
relevance.py: KI-Relevanz-Analyse

Datenbank Schema

sql -- Artikel-Tabelle CREATE TABLE articles ( id INTEGER PRIMARY KEY, title TEXT NOT NULL, source TEXT NOT NULL, url TEXT, summary TEXT, content TEXT, pub_date TEXT, keywords TEXT, relevance_score REAL, created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP );

Projektstruktur

``` ai-research-hub/ app.py # Haupt-Flask-Anwendung web_app.py # Alternative Flask-App requirements.txt # Python-Dependencies .env # Umgebungsvariablen .gitignore # Git-Ignore-Datei README.md # Diese Datei LICENSE # MIT-Lizenz

scrapers/ # Datensammlung arxivscraper.py techcrunch.py venturebeat.py stanford_ai.py theverge.py thn.py

templates/ # HTML-Templates layout.html index.html dashboard.html sources.html chat.html settings.html

static/ # Frontend-Assets css/ js/ images/

cache/ # Zwischenspeicher summary/ # Export-Dateien logs/ # Log-Dateien ```

Konfiguration

Scraper-Einstellungen

```python

Maximale Artikel pro Quelle

MAX_ARTICLES = 10

ArXiv-Kategorien

ARXIV_CATEGORIES = ['cs.AI', 'cs.LG', 'cs.CL', 'cs.CV']

Cache-Verhalten

CACHE_DURATION = 3600 # 1 Stunde ```

OpenAI-Einstellungen

```python

Verwendetes Modell

OPENAI_MODEL = 'gpt-3.5-turbo'

Max. Tokens pro Anfrage

MAX_TOKENS = 500

Temperatur (Kreativitt)

TEMPERATURE = 0.7 ```

Troubleshooting

Hufige Probleme

1. OpenAI API-Fehler

Error: OpenAI API key not set Lsung: Setzen Sie OPENAI_API_KEY in der .env-Datei

2. Chrome-Driver-Fehler

Error: ChromeDriver not found Lsung: Installieren Sie Chrome und aktualisieren Sie den WebDriver: bash pip install --upgrade webdriver-manager

3. Keine Daten vom Dashboard

Error: No sources selected Lsung: - Gehen Sie zur Sources-Seite - Whlen Sie mindestens eine Quelle aus - Klicken Sie auf "Daten abrufen"

4. Langsame Performance

Lsung: - Reduzieren Sie MAX_ARTICLES in den Einstellungen - Leeren Sie den Cache-Ordner - Prfen Sie Ihre Internetverbindung

Performance-Optimierung

Caching-Strategien

Artikel-Cache: 1 Stunde fr bereits verarbeitete Artikel
Database-Cache: Lokale SQLite-Datenbank fr schnelle Abfragen
API-Cache: Rate-Limiting fr externe APIs

Monitoring

```python

Logging-Konfiguration

logging.basicConfig( level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s', handlers=[ logging.StreamHandler(), logging.FileHandler('app.log') ] ) ```

Contributing

Beitrge sind willkommen! Bitte beachten Sie:

Fork das Repository
Erstellen Sie einen Feature-Branch (git checkout -b feature/amazing-feature)
Committen Sie Ihre nderungen (git commit -m 'Add amazing feature')
Push zum Branch (git push origin feature/amazing-feature)
Erstellen Sie eine Pull Request

Entwicklungsrichtlinien

Verwenden Sie aussagekrftige Commit-Messages
Fgen Sie Tests fr neue Features hinzu
Dokumentieren Sie Ihre nderungen
Befolgen Sie PEP 8 fr Python-Code

Lizenz

Dieses Projekt ist unter der MIT-Lizenz lizenziert - siehe LICENSE fr Details.

Dieses Tool wurde entwickelt, um Forschern und KI-Enthusiasten zu helfen, mit den neuesten Entwicklungen in der Knstlichen Intelligenz Schritt zu halten.

Owner

Name: Niclas
Login: niclashart
Kind: user
Location: Germany

Repositories: 1
Profile: https://github.com/niclashart

AI student

Citation (citation.py)

from datetime import datetime

def format_arxiv_citation(paper_data, citation_style="apa"):
    """
    Generate formatted citation for arXiv papers
    Args:
        paper_data: Dictionary containing paper data
        citation_style: "apa", "mla", "chicago" or "bibtex"
    Returns:
        Formatted citation string
    """
    if citation_style.lower() == "apa":
        return generate_apa_citation(paper_data)
    elif citation_style.lower() == "mla":
        return generate_mla_citation(paper_data)
    elif citation_style.lower() == "chicago":
        return generate_chicago_citation(paper_data)
    elif citation_style.lower() == "bibtex":
        return generate_bibtex_citation(paper_data)
    else:
        return generate_apa_citation(paper_data)

def generate_apa_citation(paper):
    """Generate APA style citation"""
    # Format authors
    authors = paper.get('authors', [])
    if isinstance(authors, str):
        authors = [authors]
    
    if len(authors) == 0:
        author_text = "Anonymous"
    elif len(authors) == 1:
        author_text = authors[0]
    elif len(authors) == 2:
        author_text = f"{authors[0]} & {authors[1]}"
    else:
        author_text = f"{authors[0]} et al."
    
    # Format year
    published = paper.get('published', datetime.now())
    if isinstance(published, str):
        try:
            year = published.split('-')[0]
        except:
            year = str(datetime.now().year)
    else:
        year = str(published.year)
    
    # Format title
    title = paper.get('title', '').strip()
    if not title.endswith('.'):
        title += '.'
    
    # Get arXiv ID
    link = paper.get('link', '')
    arxiv_id = link.split('/')[-1].replace('.pdf', '')
    
    # Format citation
    citation = f"{author_text} ({year}). {title} arXiv preprint arXiv:{arxiv_id}"
    
    return citation

def generate_mla_citation(paper):
    """Generate MLA style citation"""
    # Format authors
    authors = paper.get('authors', [])
    if isinstance(authors, str):
        authors = [authors]
    
    if len(authors) == 0:
        author_text = "Anonymous"
    elif len(authors) == 1:
        last_first = authors[0].split()
        if len(last_first) > 1:
            author_text = f"{last_first[-1]}, {' '.join(last_first[:-1])}"
        else:
            author_text = last_first[0]
    else:
        last_first = authors[0].split()
        if len(last_first) > 1:
            author_text = f"{last_first[-1]}, {' '.join(last_first[:-1])}, et al"
        else:
            author_text = f"{last_first[0]}, et al"
    
    # Format title
    title = f"\"{paper.get('title', '').strip()}\""
    
    # Format date
    published = paper.get('published', datetime.now())
    if isinstance(published, str):
        try:
            date_parts = published.split('-')
            date = f"{date_parts[0]}"
        except:
            date = str(datetime.now().year)
    else:
        date = str(published.year)
    
    # Get arXiv ID
    link = paper.get('link', '')
    arxiv_id = link.split('/')[-1].replace('.pdf', '')
    
    # Format citation
    citation = f"{author_text}. {title}. arXiv:{arxiv_id}, {date}."
    
    return citation

def generate_chicago_citation(paper):
    """Generate Chicago style citation"""
    # Format authors
    authors = paper.get('authors', [])
    if isinstance(authors, str):
        authors = [authors]
    
    if len(authors) == 0:
        author_text = "Anonymous"
    elif len(authors) == 1:
        author_text = authors[0]
    else:
        author_text = f"{authors[0]}, and {authors[1]}" if len(authors) == 2 else f"{authors[0]} et al."
    
    # Format title
    title = f"\"{paper.get('title', '').strip()}\""
    
    # Format date
    published = paper.get('published', datetime.now())
    if isinstance(published, str):
        try:
            date = published.split('-')[0]
        except:
            date = str(datetime.now().year)
    else:
        date = str(published.year)
    
    # Get arXiv ID
    link = paper.get('link', '')
    arxiv_id = link.split('/')[-1].replace('.pdf', '')
    
    # Format citation
    citation = f"{author_text}. {title} arXiv:{arxiv_id} ({date})."
    
    return citation

def generate_bibtex_citation(paper):
    """Generate BibTeX citation"""
    # Get arXiv ID
    link = paper.get('link', '')
    arxiv_id = link.split('/')[-1].replace('.pdf', '')
    
    # Format authors for BibTeX
    authors = paper.get('authors', [])
    if isinstance(authors, str):
        authors = [authors]
    
    if len(authors) == 0:
        author_text = "Anonymous"
    else:
        author_text = " and ".join(authors)
    
    # Format title
    title = paper.get('title', '').strip()
    
    # Format date
    published = paper.get('published', datetime.now())
    if isinstance(published, str):
        try:
            year = published.split('-')[0]
        except:
            year = str(datetime.now().year)
    else:
        year = str(published.year)
    
    # Create BibTeX entry
    bibtex = (
        f"@article{{{arxiv_id},\n"
        f"  author = {{{author_text}}},\n"
        f"  title = {{{title}}},\n"
        f"  journal = {{arXiv preprint arXiv:{arxiv_id}}},\n"
        f"  year = {{{year}}},\n"
        f"  url = {{https://arxiv.org/abs/{arxiv_id}}}\n"
        f"}}"
    )
    
    return bibtex

GitHub Events

Total

Public event: 1
Push event: 20

Last Year

Public event: 1
Push event: 20

Dependencies

requirements.txt pypi

PyPDF2 *
arxiv *
beautifulsoup4 *
feedparser *
lxml *
openai *
pandas *
pdfplumber *
python-docx *
python-dotenv *
requests *
selenium *
streamlit *