attentionranklib
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: oeg-upm
- License: apache-2.0
- Language: Python
- Default Branch: main
- Size: 4.56 MB
Statistics
- Stars: 1
- Watchers: 6
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
AttentionRankLib
Repository for developing the AttentionRank algorithm as a library.
Based on the work: https://github.com/hd10-iupui/AttentionRank
Install
Using Python 3.9
pip install -r requirements.txt
pip install -e .
```
python -m spacy download en_core_web_sm
python -m spacy download es_core_news_sm
```
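To confirm that both spaCy models are available before running the library, a quick check along these lines may help. It is a minimal sketch: the helper `missing_models` is hypothetical and not part of AttentionRankLib, and it only relies on the spaCy models being installed as importable packages named `en_core_web_sm` and `es_core_news_sm`.

```python
import importlib.util

# spaCy models are installed as regular Python packages, so they can be
# located with the standard import machinery without loading them.
REQUIRED_MODELS = ("en_core_web_sm", "es_core_news_sm")


def missing_models(names=REQUIRED_MODELS):
    """Return the package names that cannot be found in this environment.

    Hypothetical helper for illustration; not part of AttentionRankLib.
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing spaCy models:", ", ".join(missing))
    else:
        print("All required spaCy models are installed.")
```

If a model is reported as missing, rerunning the corresponding `python -m spacy download` command should install it.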
Execution
python main.py --dataset_name example --model_name_or_path PlanTL-GOB-ES/roberta-base-bne --model_type roberta --lang es --type_execution exec --k_value 15
Important: The dataset folder (with any name) must be placed in this folder (the repository root), and all documents must be inside a subfolder of it named docsutf8.
Results are written inside the dataset folder, in a subfolder named res followed by the k_value (res + k_value).
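The layout and command described above can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the library: the helpers `prepare_dataset`, `expected_results_dir`, and `build_command` are hypothetical, the `.txt` file name is only an example, and the results folder name assumes that "res+k_value" means `res` immediately followed by the number (for example `res15`).

```python
import shlex
import sys
import tempfile
from pathlib import Path


def prepare_dataset(root, dataset_name, documents):
    """Create <root>/<dataset_name>/docsutf8 and write one UTF-8 file per document.

    Hypothetical helper; `documents` maps file names to their text.
    """
    docs_dir = Path(root) / dataset_name / "docsutf8"
    docs_dir.mkdir(parents=True, exist_ok=True)
    for filename, text in documents.items():
        (docs_dir / filename).write_text(text, encoding="utf-8")
    return docs_dir


def expected_results_dir(root, dataset_name, k_value):
    """Assumed results location: 'res' followed by k_value, inside the dataset folder."""
    return Path(root) / dataset_name / f"res{k_value}"


def build_command(dataset_name, model_name_or_path, model_type, lang,
                  type_execution, k_value):
    """Assemble the argument list for main.py using only the flags shown in the README."""
    return [
        sys.executable, "main.py",
        "--dataset_name", dataset_name,
        "--model_name_or_path", model_name_or_path,
        "--model_type", model_type,
        "--lang", lang,
        "--type_execution", type_execution,
        "--k_value", str(k_value),
    ]


if __name__ == "__main__":
    # Demonstrate the folder layout in a throwaway directory.
    with tempfile.TemporaryDirectory() as tmp:
        docs = prepare_dataset(tmp, "example", {"doc1.txt": "Texto de ejemplo."})
        print("Documents folder:", docs)
        print("Assumed results folder:", expected_results_dir(tmp, "example", 15))

    # Print the command from the Execution section without running it.
    cmd = build_command("example", "PlanTL-GOB-ES/roberta-base-bne",
                        "roberta", "es", "exec", 15)
    print(shlex.join(cmd))
```

To actually launch the run, the list returned by `build_command` could be passed to `subprocess.run(cmd, check=True)` from the repository root, with the dataset folder created there instead of in a temporary directory.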
Docker run
For a quick run, use the Dockerfile and these two commands:
```
docker build -t attentionranklib .
docker run --rm -v ./example:/app/example attentionranklib --dataset_name example --model_name_or_path PlanTL-GOB-ES/roberta-base-bne --model_type roberta --lang es --type_execution exec --k_value 15
```
Acknowledgments
The development of this code has received funding from the INESData project (Infraestructura para la INvestigación de ESpacios de DAtos distribuidos en UPM), a project funded under the UNICO I+D CLOUD call of the Ministerio para la Transformación Digital y de la Función Pública, within the framework of the PRTR, funded by the European Union (NextGenerationEU).
Paper Citation
@inproceedings{Calleja2024,
author = {Pablo Calleja and Patricia Martín-Chozas and Elena Montiel-Ponsoda},
title = {Benchmark for Automatic Keyword Extraction in Spanish: Datasets and Methods},
booktitle = {Poster Proceedings of the 40th Annual Conference of the Spanish Association for Natural Language Processing 2024 (SEPLN-P 2024)},
series = {CEUR Workshop Proceedings},
volume = {3846},
pages = {132--141},
year = {2024},
publisher = {CEUR-WS.org},
address = {Valladolid, Spain},
month = {September 24-27},
urn = {urn:nbn:de:0074-3846-7},
url = {https://ceur-ws.org/Vol-3846/}
}
Owner
- Name: Ontology Engineering Group (UPM)
- Login: oeg-upm
- Kind: organization
- Email: oeg-dev@delicias.dia.fi.upm.es
- Location: Boadilla del Monte, Madrid, Spain
- Website: https://oeg.fi.upm.es/
- Repositories: 294
- Profile: https://github.com/oeg-upm
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this code, please cite the following article:"
authors:
- name: "Pablo Calleja"
- name: "Patricia Martín-Chozas"
- name: "Elena Montiel-Ponsoda"
title: "Benchmark for Automatic Keyword Extraction in Spanish: Datasets and Methods"
booktitle: "Poster Proceedings of the 40th Annual Conference of the Spanish Association for Natural Language Processing 2024 (SEPLN-P 2024)"
series: "CEUR Workshop Proceedings"
volume: "3846"
pages: "132-141"
year: "2024"
publisher: "CEUR-WS.org"
conference:
  name: "40th International Conference of the Spanish Society for Natural Language Processing (SEPLN 2024)"
  place: "Valladolid, Spain"
  date-start: "2024-09-24"
  date-end: "2024-09-27"
url: "https://ceur-ws.org/Vol-3846/"
identifiers:
- type: "urn"
  value: "urn:nbn:de:0074-3846-7"
date-released: "2024-09-24"
GitHub Events
Total
- Push event: 3
Last Year
- Push event: 3
Dependencies
- Babel ==2.11.0
- CacheControl ==0.12.11
- Cython ==0.29.33
- Flask ==1.1.4
- GDAL ==3.0.4
- HeapDict ==1.0.1
- Jinja2 ==2.11.3
- Keras-Preprocessing ==1.1.2
- LunarCalendar ==0.0.9
- Markdown ==3.4.1
- MarkupSafe ==2.0.1
- Pillow ==7.1.2
- PyDrive ==1.3.1
- PyGObject ==3.36.0
- PyMeeus ==0.5.12
- PyOpenGL ==3.1.6
- PySocks ==1.7.1
- PyWavelets ==1.4.1
- PyYAML ==6.0
- Pygments ==2.6.1
- SQLAlchemy ==1.4.46
- Send2Trash ==1.8.0
- Sphinx ==3.5.4
- Werkzeug ==1.0.1
- absl-py ==1.3.0
- aeppl ==0.0.33
- aesara ==2.7.9
- aiohttp ==3.8.3
- aiosignal ==1.3.1
- alabaster ==0.7.12
- albumentations ==1.2.1
- altair ==4.2.0
- appdirs ==1.4.4
- arviz ==0.12.1
- astor ==0.8.1
- astropy ==4.3.1
- astunparse ==1.6.3
- async-timeout ==4.0.2
- atari-py ==0.2.9
- atomicwrites ==1.4.1
- attrs ==22.2.0
- audioread ==3.0.0
- autograd ==1.5
- backcall ==0.2.0
- beautifulsoup4 ==4.6.3
- bert-embedding ==1.0.1
- bleach ==5.0.1
- blis ==0.7.9
- bokeh ==2.3.3
- branca ==0.6.0
- bs4 ==0.0.1
- cachetools ==5.2.1
- catalogue ==2.0.8
- certifi ==2022.12.7
- cffi ==1.15.1
- cftime ==1.6.2
- chardet ==4.0.0
- charset-normalizer ==2.1.1
- click ==7.1.2
- clikit ==0.6.2
- cloudpickle ==2.2.0
- cmake ==3.22.6
- cmdstanpy ==1.0.8
- colorcet ==3.0.1
- colorlover ==0.3.0
- community ==1.0.0b1
- confection ==0.0.3
- cons ==0.4.5
- contextlib2 ==0.5.5
- convertdate ==2.4.0
- crashtest ==0.3.1
- crcmod ==1.7
- cufflinks ==0.17.3
- cupy-cuda11x ==11.0.0
- cvxopt ==1.3.0
- cvxpy ==1.2.3
- cycler ==0.11.0
- cymem ==2.0.7
- daft ==0.0.4
- dask ==2022.2.1
- datascience ==0.17.5
- datasets ==2.8.0
- db-dtypes ==1.0.5
- dbus-python ==1.2.16
- debugpy ==1.0.0
- decorator ==4.4.2
- defusedxml ==0.7.1
- descartes ==1.1.0
- dill ==0.3.6
- distributed ==2022.2.1
- dlib ==19.24.0
- dm-tree ==0.1.8
- dnspython ==2.2.1
- docopt ==0.6.2
- docutils ==0.16
- dopamine-rl ==1.0.5
- earthengine-api ==0.1.335
- easydict ==1.10
- ecos ==2.0.12
- editdistance ==0.5.3
- entrypoints ==0.4
- ephem ==4.1.4
- et-xmlfile ==1.1.0
- etils ==1.0.0
- etuples ==0.3.8
- fa2 ==0.3.5
- fastai ==2.7.10
- fastcore ==1.5.27
- fastdownload ==0.0.7
- fastdtw ==0.3.4
- fastjsonschema ==2.16.2
- fastprogress ==1.0.3
- fastrlock ==0.8.1
- feather-format ==0.4.1
- filelock ==3.9.0
- firebase-admin ==5.3.0
- fix-yahoo-finance ==0.0.22
- flatbuffers ==1.12
- folium ==0.12.1.post1
- frozenlist ==1.3.3
- fsspec ==2022.11.0
- future ==0.16.0
- gast ==0.4.0
- gdown ==4.4.0
- gensim ==3.6.0
- geographiclib ==1.52
- geopy ==1.17.0
- gin-config ==0.5.0
- glob2 ==0.7
- gluonnlp ==0.6.0
- graphviz ==0.8.4
- greenlet ==2.0.1
- grpcio ==1.51.1
- grpcio-status ==1.48.2
- gspread ==3.4.2
- gspread-dataframe ==3.0.8
- gym ==0.25.2
- gym-notices ==0.0.8
- h5py ==3.1.0
- hijri-converter ==2.2.4
- holidays ==0.18
- holoviews ==1.14.9
- html5lib ==1.0.1
- httpimport ==0.5.18
- httplib2 ==0.17.4
- httpstan ==4.6.1
- huggingface-hub ==0.11.1
- humanize ==0.5.1
- hyperopt ==0.1.2
- idna ==2.10
- imageio ==2.9.0
- imagesize ==1.4.1
- imbalanced-learn ==0.8.1
- imblearn ==0.0
- imgaug ==0.4.0
- importlib-metadata ==6.0.0
- importlib-resources ==5.10.2
- imutils ==0.5.4
- inflect ==2.1.0
- intel-openmp ==2023.0.0
- intervaltree ==2.1.0
- ipykernel ==5.3.4
- ipython ==7.9.0
- ipython-genutils ==0.2.0
- ipython-sql ==0.3.9
- ipywidgets ==7.7.1
- itsdangerous ==1.1.0
- jax ==0.3.25
- jieba ==0.42.1
- joblib ==1.2.0
- jpeg4py ==0.1.4
- jsonschema ==4.3.3
- jupyter-client ==6.1.12
- jupyter-console ==6.1.0
- jupyter_core ==5.1.3
- jupyterlab-widgets ==3.0.5
- kaggle ==1.5.12
- kapre ==0.3.7
- keras ==2.9.0
- keras-vis ==0.4.1
- kiwisolver ==1.4.4
- korean-lunar-calendar ==0.3.1
- langcodes ==3.3.0
- libclang ==15.0.6.1
- librosa ==0.8.1
- lightgbm ==2.2.3
- llvmlite ==0.39.1
- lmdb ==0.99
- locket ==1.0.0
- logical-unification ==0.4.5
- lxml ==4.9.2
- marshmallow ==3.19.0
- matplotlib ==3.2.2
- matplotlib-venn ==0.11.7
- miniKanren ==1.0.3
- missingno ==0.5.1
- mistune ==0.8.4
- mizani ==0.7.3
- mkl ==2019.0
- mlxtend ==0.14.0
- more-itertools ==9.0.0
- moviepy ==0.2.3.5
- mpmath ==1.2.1
- msgpack ==1.0.4
- multidict ==6.0.4
- multipledispatch ==0.6.0
- multiprocess ==0.70.14
- multitasking ==0.0.11
- murmurhash ==1.0.9
- music21 ==5.5.0
- mxnet ==1.4.0
- natsort ==5.5.0
- nbconvert ==5.6.1
- nbformat ==5.7.1
- netCDF4 ==1.6.2
- networkx ==3.0
- nibabel ==3.0.2
- nltk ==3.7
- notebook ==5.7.16
- numba ==0.56.4
- numexpr ==2.8.4
- numpy ==1.14.6
- oauth2client ==4.1.3
- oauthlib ==3.2.2
- okgrade ==0.4.3
- opencv-contrib-python ==4.6.0.66
- opencv-python ==4.6.0.66
- opencv-python-headless ==4.7.0.68
- openpyxl ==3.0.10
- opt-einsum ==3.3.0
- osqp ==0.6.2.post0
- packaging ==21.3
- palettable ==3.3.0
- pandas ==1.3.5
- pandas-datareader ==0.9.0
- pandas-gbq ==0.17.9
- pandas-profiling ==1.4.1
- pandocfilters ==1.5.0
- panel ==0.12.1
- param ==1.12.3
- parso ==0.8.3
- partd ==1.3.0
- pastel ==0.2.1
- pathlib ==1.0.1
- pathy ==0.10.1
- patsy ==0.5.3
- pep517 ==0.13.0
- pexpect ==4.8.0
- pickleshare ==0.7.5
- pip-tools ==6.6.2
- platformdirs ==2.6.2
- plotly ==5.5.0
- plotnine ==0.8.0
- pluggy ==0.7.1
- pooch ==1.6.0
- portpicker ==1.3.9
- prefetch-generator ==1.0.3
- preshed ==3.0.8
- prettytable ==3.6.0
- progressbar2 ==3.38.0
- prometheus-client ==0.15.0
- promise ==2.3
- prompt-toolkit ==2.0.10
- prophet ==1.1.1
- proto-plus ==1.22.2
- protobuf ==3.19.6
- psutil ==5.4.8
- psycopg2 ==2.9.5
- ptyprocess ==0.7.0
- py ==1.11.0
- pyarrow ==9.0.0
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pycocotools ==2.0.6
- pycparser ==2.21
- pyct ==0.4.8
- pydantic ==1.10.4
- pydata-google-auth ==1.5.0
- pydot ==1.3.0
- pydot-ng ==2.0.0
- pydotplus ==2.0.2
- pyemd ==0.5.1
- pyerfa ==2.0.0.1
- pylev ==1.4.0
- pymc ==4.1.4
- pymongo ==4.3.3
- pymystem3 ==0.2.0
- pyparsing ==3.0.9
- pyrsistent ==0.19.3
- pysimdjson ==3.2.0
- pystan ==3.3.0
- pytest ==3.6.4
- python-apt ==2.0.0
- python-dateutil ==2.8.2
- python-louvain ==0.16
- python-slugify ==7.0.0
- python-utils ==3.4.5
- pytz ==2022.7
- pyviz-comms ==2.2.1
- pyzmq ==23.2.1
- qdldl ==0.1.5.post2
- qudida ==0.0.4
- regex ==2022.6.2
- requests ==2.25.1
- requests-oauthlib ==1.3.1
- requests-unixsocket ==0.2.0
- resampy ==0.4.2
- responses ==0.18.0
- rpy2 ==3.5.5
- rsa ==4.9
- scikit-image ==0.18.3
- scikit-learn ==1.0.2
- scipy ==1.7.3
- screen-resolution-extra ==0.0.0
- scs ==3.2.2
- seaborn ==0.11.2
- seqeval ==1.2.2
- setuptools-git ==1.2
- shapely ==2.0.0
- six ==1.15.0
- sklearn-pandas ==1.8.0
- smart-open ==6.3.0
- snowballstemmer ==2.2.0
- sortedcontainers ==2.4.0
- soundfile ==0.11.0
- spacy ==3.4.4
- spacy-legacy ==3.0.11
- spacy-loggers ==1.0.4
- sphinxcontrib-devhelp ==1.0.2
- sphinxcontrib-htmlhelp ==2.0.0
- sphinxcontrib-jsmath ==1.0.1
- sphinxcontrib-qthelp ==1.0.3
- sphinxcontrib-serializinghtml ==1.1.5
- sphinxcontrib.applehelp ==1.0.3
- sqlparse ==0.4.3
- srsly ==2.4.5
- statsmodels ==0.12.2
- sympy ==1.7.1
- tables ==3.7.0
- tabulate ==0.8.10
- tblib ==1.7.0
- tenacity ==8.1.0
- tensorboard ==2.9.1
- tensorboard-data-server ==0.6.1
- tensorboard-plugin-wit ==1.8.1
- tensorflow ==2.9.2
- tensorflow-datasets ==4.8.1
- tensorflow-estimator ==2.9.0
- tensorflow-gcs-config ==2.9.1
- tensorflow-hub ==0.12.0
- tensorflow-io-gcs-filesystem ==0.29.0
- tensorflow-metadata ==1.12.0
- tensorflow-probability ==0.17.0
- termcolor ==2.2.0
- terminado ==0.13.3
- testpath ==0.6.0
- text-unidecode ==1.3
- textblob ==0.15.3
- thinc ==8.1.6
- threadpoolctl ==3.1.0
- tifffile ==2022.10.10
- tokenizers ==0.13.2
- toml ==0.10.2
- tomli ==2.0.1
- toolz ==0.12.0
- torchsummary ==1.5.1
- torchtext ==0.14.1
- tornado ==6.0.4
- tqdm ==4.64.1
- traitlets ==5.7.1
- transformers ==4.25.1
- tweepy ==3.10.0
- typeguard ==2.7.1
- typer ==0.7.0
- typing ==3.6.6
- typing_extensions ==4.4.0
- tzlocal ==1.5.1
- uritemplate ==4.1.1
- urllib3 ==1.26.14
- vega-datasets ==0.9.0
- wasabi ==0.10.1
- wcwidth ==0.2.5
- webargs ==8.2.0
- webencodings ==0.5.1
- widgetsnbextension ==3.6.1
- wordcloud ==1.8.2.2
- wrapt ==1.14.1
- xarray ==2022.12.0
- xarray-einstats ==0.4.0
- xgboost ==0.90
- xkit ==0.0.0
- xlrd ==1.2.0
- xlwt ==1.3.0
- xxhash ==3.2.0
- yarg ==0.1.9
- yarl ==1.8.2
- yellowbrick ==1.5
- zict ==2.2.0
- zipp ==3.11.0
- List *
- python 3.9.6 build