reassign
hieRarchical multi-labEl clAsSification to diScover mIssinG aNnotations (REASSIGN)
Repository
Basic Info
- Host: GitHub
- Owner: migueleci
- License: gpl-3.0
- Language: Python
- Default Branch: master
- Size: 9.13 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
hieRarchical multi-labEl clAsSification to diScover mIssinG aNnotations (REASSIGN)
by Miguel Romero, Felipe Kenji Nakano, Jorge Finke, Camilo Rocha, and Celine Vens
Instructions
Run "compare_data.py" to process the data and create the datasets required by the method. The script creates new folders within the data folder; each one contains the data for one sub-hierarchy of the Gene Ontology and a subgraph of the gene co-expression network for rice (Oryza sativa Japonica).
Run "reassign.py" to apply the method to each dataset independently. The method creates the folder "pred", where the results are stored. A subfolder is created for each sub-hierarchy and contains five files:
- "precision.csv" summarizes the predictive performance of the method (measured using precision) for different numbers of paths (and pairwise associations between genes and functions) to be selected.
- "precision.pdf" plots the predictive performance of the three variations of the method (average, sum, and minimum).
- "top_{mean,sum,min}.csv" contain the selected associations for each variation of the method. Each file lists the gene and function identifiers, followed by the probability of association, the probability computed for the path containing the association, and a boolean value indicating whether the association is present in the newer version of the database.
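As a rough illustration of how the output files relate to each other, the sketch below combines per-association probabilities along a path using the three variations (average, sum, minimum) and computes precision over the top-k selected associations. It is not the code from "reassign.py": the function names, the in-memory row layout (which follows the column order described above), the ranking by path probability, and the toy values are all assumptions made for the example.

```python
from statistics import mean

# Hypothetical aggregators for the three variations named in the README.
AGGREGATORS = {"mean": mean, "sum": sum, "min": min}


def path_score(probabilities, variation="mean"):
    """Combine the association probabilities along one path into one score."""
    return AGGREGATORS[variation](probabilities)


def precision_at_k(rows, k):
    """Fraction of the k highest-scoring associations that are confirmed.

    Each row is assumed to be a tuple:
    (gene, function, probability, path_probability, confirmed).
    Rows are ranked by path_probability (an assumption of this sketch).
    """
    ranked = sorted(rows, key=lambda row: row[3], reverse=True)
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for row in top if row[4]) / len(top)


# Toy rows with placeholder identifiers; these are not real results.
rows = [
    ("gene_a", "term_1", 0.9, 0.8, True),
    ("gene_b", "term_2", 0.7, 0.6, False),
    ("gene_c", "term_3", 0.6, 0.5, True),
    ("gene_d", "term_4", 0.4, 0.2, False),
]

if __name__ == "__main__":
    for variation in AGGREGATORS:
        print(variation, path_score([0.5, 0.25, 0.75], variation))
    for k in (1, 2, 3, 4):
        print(k, precision_at_k(rows, k))
```

To apply this to an actual "top_mean.csv", the rows would first have to be loaded (for example with the csv module) and the boolean column parsed; whether the files include a header row is not stated above, so that should be checked against the generated output.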
Reference
Romero, M., Nakano, F. K., Finke, J., Rocha, C., & Vens, C. (2022). Hierarchy exploitation to detect missing annotations on hierarchical multi-label classification. arXiv preprint arXiv:2207.06237.
Owner
- Name: Miguel Romero
- Login: migueleci
- Kind: user
- Repositories: 9
- Profile: https://github.com/migueleci
Citation (CITATION.cff)
cff-version: 1.2.0
title: "REASSIGN"
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Miguel
    family-names: Romero
    email: miguelangel.romero@javerianacali.edu.co
    affiliation: Pontificia Universidad Javeriana
    orcid: 'https://orcid.org/0000-0002-7068-970X'
  - given-names: Felipe Kenji
    family-names: Nakano
  - given-names: Jorge
    family-names: Finke
  - given-names: Camilo
    family-names: Rocha
  - given-names: Celine
    family-names: Vens
doi: 10.48550/arXiv.2207.06237
date-released: 2022-07-12
url: "https://github.com/migueleci/reassign"