et-ai-hybrid
Hybrid Scientific Review Assistant - Intelligent Summaries and AI-Powered Citation Generation.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Unable to calculate vocabulary similarity
Last synced: 6 months ago
Repository
Hybrid Scientific Review Assistant - Intelligent Summaries and AI-Powered Citation Generation.
Basic Info
- Host: GitHub
- Owner: marcosgoncaf
- License: MIT
- Language: Python
- Default Branch: main
- Size: 9.77 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Created 10 months ago · Last pushed 10 months ago
Metadata Files
Readme
License
Citation
README.md
et-ai-hybrid
Hybrid Scientific Review Assistant - Intelligent Summaries and AI-Powered Citation Generation.
Owner
- Login: marcosgoncaf
- Kind: user
- Repositories: 1
- Profile: https://github.com/marcosgoncaf
Citation (citation_engine.py)
from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np


class EnhancedCitationGenerator:
    """Match "/"-separated segments of user text to the closest sentence in each PDF."""

    def __init__(self, pdf_data, num_matches=3):
        # pdf_data maps a document key to a dict with at least "nome" and "full_text".
        self.pdf_data = pdf_data
        self.num_matches = num_matches
        self.sbert = SentenceTransformer("all-MiniLM-L6-v2", device="cpu")

    def _chunk_sentences(self, text):
        # Naive sentence split on periods; empty fragments are dropped.
        return [s.strip() for s in text.split(".") if s.strip()]

    def generate(self, user_text):
        segments = [seg.strip() for seg in user_text.split("/") if seg.strip()]
        if not segments:
            return {}
        emb_u = self.sbert.encode(segments)
        results = {}
        for seg, u in zip(segments, emb_u):
            scores = {}
            for key, info in self.pdf_data.items():
                sents = self._chunk_sentences(info["full_text"])
                if not sents:
                    continue  # skip documents with no sentences; argmax would fail on them
                emb_s = self.sbert.encode(sents, show_progress_bar=False)
                sims = cosine_similarity([u], emb_s)[0]
                best_idx = int(np.argmax(sims))
                # Keep only the best-matching sentence per document.
                scores[key] = (float(sims[best_idx]), sents[best_idx])
            topk = sorted(scores.items(), key=lambda x: x[1][0], reverse=True)[:self.num_matches]
            # "N/D" (not available): page numbers are not tracked here.
            refs = [{"source": self.pdf_data[k]["nome"], "score": sc, "excerpt": ex, "page": "N/D"}
                    for k, (sc, ex) in topk]
            results[seg] = refs
        return results
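The ranking logic in `generate` can be tried out without downloading a model. The following is a minimal sketch, not the repository's code: it assumes a toy bag-of-words `embed` function as a stand-in for the SBERT encoder, and the names `embed`, `cosine`, `best_matches`, and the sample `pdf_data` are all hypothetical. It keeps the same shape as the original: split the user text on "/", find the best sentence per document, then keep the top `num_matches` documents.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a stand-in for the SBERT encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_matches(user_text, pdf_data, num_matches=3):
    """For each "/"-separated segment, rank documents by their best-matching sentence."""
    results = {}
    for seg in (s.strip() for s in user_text.split("/")):
        if not seg:
            continue
        scores = {}
        for key, info in pdf_data.items():
            sents = [s.strip() for s in info["full_text"].split(".") if s.strip()]
            if not sents:
                continue  # nothing to compare against in this document
            # Best (score, sentence) pair for this document.
            scores[key] = max((cosine(embed(seg), embed(s)), s) for s in sents)
        topk = sorted(scores.items(), key=lambda kv: kv[1][0], reverse=True)[:num_matches]
        results[seg] = [
            {"source": pdf_data[k]["nome"], "score": sc, "excerpt": ex}
            for k, (sc, ex) in topk
        ]
    return results


# Hypothetical sample data using the same keys as the original ("nome", "full_text").
pdf_data = {
    "a": {"nome": "paper_a.pdf", "full_text": "Graph neural networks model molecules. Training used Adam."},
    "b": {"nome": "paper_b.pdf", "full_text": "Soil samples were collected in spring. Rainfall was low."},
}
out = best_matches("neural networks model molecules / soil samples in spring", pdf_data, num_matches=1)
for segment, refs in out.items():
    print(segment, "->", refs[0]["source"], round(refs[0]["score"], 3))
```

With a real sentence encoder the scores would reflect semantic similarity rather than word overlap, but the per-document best-sentence selection and top-k cut work the same way.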
GitHub Events
Total
- Push event: 3
- Create event: 2
Last Year
- Push event: 3
- Create event: 2
Dependencies
Dockerfile
docker
- python 3.11-slim build
requirements.txt
pypi
- PyMuPDF *
- bibtexparser *
- faiss-cpu *
- huggingface_hub *
- llama-cpp-python *
- pandas *
- sentence-transformers *
- streamlit *
- sumy *
- torch *
- transformers *