et-ai-hybrid

Hybrid Scientific Review Assistant - Intelligent Summaries and AI-Powered Citation Generation.

https://github.com/marcosgoncaf/et-ai-hybrid

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Unable to calculate vocabulary similarity
Last synced: 6 months ago

Repository

Hybrid Scientific Review Assistant - Intelligent Summaries and AI-Powered Citation Generation.

Basic Info
  • Host: GitHub
  • Owner: marcosgoncaf
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Size: 9.77 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created 10 months ago · Last pushed 10 months ago
Metadata Files
  • Readme
  • License
  • Citation

README.md

et-ai-hybrid

Hybrid Scientific Review Assistant - Intelligent Summaries and AI-Powered Citation Generation.

Owner

  • Login: marcosgoncaf
  • Kind: user

Citation (citation_engine.py)

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np

class EnhancedCitationGenerator:
    def __init__(self, pdf_data, num_matches=3):
        # pdf_data: mapping of document key -> {"nome": display name, "full_text": extracted text}
        self.pdf_data = pdf_data
        self.num_matches = num_matches
        # Lightweight SBERT model for sentence embeddings, forced onto CPU.
        self.sbert = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")

    def _chunk_sentences(self, text):
        # Naive sentence splitting on periods; empty fragments are dropped.
        return [s.strip() for s in text.split(".") if s.strip()]

    def generate(self, user_text):
        # User segments are delimited by "/" in the input text.
        segments = [seg.strip() for seg in user_text.split("/") if seg.strip()]
        emb_u = self.sbert.encode(segments)
        results = {}
        for seg, u in zip(segments, emb_u):
            scores = {}
            for key, info in self.pdf_data.items():
                sents = self._chunk_sentences(info["full_text"])
                if not sents:  # skip documents with no extractable sentences
                    continue
                emb_s = self.sbert.encode(sents, show_progress_bar=False)
                # Cosine similarity between this segment and every sentence in the document.
                sims = cosine_similarity([u], emb_s)[0]
                best_idx = int(np.argmax(sims))
                scores[key] = (float(sims[best_idx]), sents[best_idx])
            # Rank documents by their best-sentence similarity and keep the top matches.
            topk = sorted(scores.items(), key=lambda x: x[1][0], reverse=True)[:self.num_matches]
            refs = [{"source": self.pdf_data[k]["nome"], "score": sc, "excerpt": ex, "page": "N/D"}
                    for k, (sc, ex) in topk]
            results[seg] = refs
        return results
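The core of `generate` is a top-k cosine-similarity match between each user segment and every candidate sentence. A minimal numpy-only sketch of that step, with toy 3-dimensional vectors standing in for the SBERT embeddings (the helper name `top_k_matches` is illustrative, not from the repository):

```python
import numpy as np

def top_k_matches(query_emb, sent_embs, sentences, k=3):
    # Cosine similarity = dot product of L2-normalized vectors.
    q = query_emb / np.linalg.norm(query_emb)
    s = sent_embs / np.linalg.norm(sent_embs, axis=1, keepdims=True)
    sims = s @ q
    # Indices of the k highest similarities, best first.
    order = np.argsort(sims)[::-1][:k]
    return [(sentences[i], float(sims[i])) for i in order]

# Toy embeddings: "gamma" points almost the same way as "alpha".
sents = ["alpha", "beta", "gamma"]
embs = np.array([[1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.9, 0.1, 0.0]])
query = np.array([1.0, 0.0, 0.0])
result = top_k_matches(query, embs, sents, k=2)
print(result)  # "alpha" ranks first, then "gamma"
```

In the repository's version, the query vector comes from encoding one `/`-separated segment and the sentence matrix from encoding a document's period-split sentences.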

GitHub Events

Total
  • Push event: 3
  • Create event: 2
Last Year
  • Push event: 3
  • Create event: 2

Dependencies

Dockerfile docker
  • python 3.11-slim build
requirements.txt pypi
  • PyMuPDF *
  • bibtexparser *
  • faiss-cpu *
  • huggingface_hub *
  • llama-cpp-python *
  • pandas *
  • sentence-transformers *
  • streamlit *
  • sumy *
  • torch *
  • transformers *