semantic_dl
Science Score: 57.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 1 DOI reference(s) in README -
○Academic publication links
-
○Committers with academic emails
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (7.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: mariolpantunes
- License: mit
- Language: Python
- Default Branch: main
- Size: 6.32 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 1
Created almost 5 years ago
· Last pushed over 1 year ago
README.md
semantic_dl
Installation
- These commands are run from the git root folder.
- Make sure a C++11 compiler is installed, as fastText needs it.
- Compile fastText:
wget https://github.com/facebookresearch/fastText/archive/v0.9.2.zip
unzip v0.9.2.zip
cd fastText-0.9.2
make
cd ..
- Move the binary file fasttext to the fasttext folder:
mv fastText-0.9.2/fasttext fasttext/
- Install python3-dev:
sudo apt install python3-dev (for Debian-based systems)
- Download dataset for similarity training https://raw.githubusercontent.com/AlexGrinch/ro_sgns/master/datasets/rg65.csv
  - put it in dataset/train_sim/
- Download constrained corpus https://www.kaggle.com/datasets/mantunes/semantic-corpus-from-web-search-snippets
  - put the uncompressed files (.csv format) in dataset/train/
- Download dataset for similarity evaluation (IoT) https://www.kaggle.com/datasets/mantunes/semantic-iot
  - put it in dataset/test with the name en-iot-30.csv
- Download dataset for similarity evaluation (MC) https://www.kaggle.com/datasets/mantunes/millercharles
  - put it in dataset/test with the name en-mc-30.csv
- Download pretrained fastText vectors https://dl.fbaipublicfiles.com/fasttext/vectors-english/wiki-news-300d-1M.vec.zip
  - put it in fasttext/pre-trained/
  - rename it pretrained.vec
- Download pretrained GloVe vectors https://nlp.stanford.edu/data/glove.6B.zip
  - put it in glove/pre-trained/
- Install python libraries
pip install -r requirements.txt
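Before training, it can help to confirm that the files ended up where the steps above put them. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, the file name `rg65.csv` is assumed from the download URL, and the folders `dataset/train/` and `glove/pre-trained/` are only checked for being non-empty because the steps do not name the files inside them.

```python
from pathlib import Path

# Paths taken from the installation steps; rg65.csv is assumed from the URL.
EXPECTED_FILES = [
    "fasttext/fasttext",
    "dataset/train_sim/rg65.csv",
    "dataset/test/en-mc-30.csv",
    "dataset/test/en-iot-30.csv",
    "fasttext/pre-trained/pretrained.vec",
]
# File names inside these folders are not stated, so only the folders are checked.
EXPECTED_DIRS = ["dataset/train", "glove/pre-trained"]


def missing_paths(root="."):
    """Return expected files, and expected non-empty folders, absent under root."""
    root = Path(root)
    missing = [p for p in EXPECTED_FILES if not (root / p).is_file()]
    for d in EXPECTED_DIRS:
        folder = root / d
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(d + "/")
    return missing


if __name__ == "__main__":
    # Run from the git root folder; prints nothing when everything is in place.
    for path in missing_paths():
        print("missing:", path)
```

Run from the git root folder, the script prints one line per missing path and stays silent when the layout matches the steps.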
Authors
- Mário Antunes - mariolpantunes
- Rafael Teixeira - rgtzths
License
This project is licensed under the MIT License - see the LICENSE file for details
Citation
Teixeira, Rafael & Antunes, Mário & Gomes, Diogo & Aguiar, Rui. (2022). Comparison of Semantic Similarity Models on Constrained Scenarios. Information Systems Frontiers. doi:10.1007/s10796-022-10350-w.
Owner
- Name: Mário Antunes
- Login: mariolpantunes
- Kind: user
- Location: Aveiro
- Company: @ATNoG
- Repositories: 12
- Profile: https://github.com/mariolpantunes
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Antunes"
    given-names: "Mário"
    orcid: "https://orcid.org/0000-0002-6504-9441"
  - family-names: "Teixeira"
    given-names: "Rafael"
    orcid: "https://orcid.org/0000-0001-7211-382X"
title: "Comparison of Semantic Similarity Models on Constrained Scenarios"
version: 1.0.0
doi: 10.1007/s10796-022-10350-w
date-released: 2022-11-10
url: "https://github.com/mariolpantunes/semantic_dl/"
preferred-citation:
  type: journal
  authors:
    - family-names: "Teixeira"
      given-names: "Rafael"
      orcid: "https://orcid.org/0000-0001-7211-382X"
    - family-names: "Antunes"
      given-names: "Mário"
      orcid: "https://orcid.org/0000-0002-6504-9441"
    - family-names: "Gomes"
      given-names: "Diogo"
      orcid: "https://orcid.org/0000-0002-5848-2802"
    - family-names: "Aguiar"
      given-names: "Rui L."
      orcid: "https://orcid.org/0000-0003-0107-6253"
  title: "Comparison of Semantic Similarity Models on Constrained Scenarios"
  doi: 10.1007/s10796-022-10350-w
  journal: "Information Systems Frontiers"
  month: 11
  year: 2022
Committers
Top Committers
| Name | Email | Commits |
|---|---|---|
| Rafael Teixeira | r****a@u****t | 39 |
| Rafael Teixeira | r****a@u****t | 5 |
| Mario Antunes | m****s@g****m | 3 |
Committer Domains (Top 20 + Academic)
ua.pt: 2
Issues and Pull Requests
All Time
- Total issues: 0
- Total pull requests: 1
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- rgtzths (1)