reassign

hieRarchical multi-labEl clAsSification to diScover mIssinG aNnotations (REASSIGN)

https://github.com/migueleci/reassign

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.8%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: migueleci
  • License: gpl-3.0
  • Language: Python
  • Default Branch: master
  • Size: 9.13 MB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 3 years ago · Last pushed over 3 years ago

README.md

hieRarchical multi-labEl clAsSification to diScover mIssinG aNnotations (REASSIGN)

by Miguel Romero, Felipe Kenji Nakano, Jorge Finke, Camilo Rocha, and Celine Vens

Instructions

  1. Execute "compare_data.py" to process the data and create the datasets required by the method. New folders are created within the data folder; each contains the data for one sub-hierarchy of the Gene Ontology and a subgraph of the gene co-expression network for rice (Oryza sativa Japonica).

  2. Execute "reassign.py" to apply the method to each dataset independently. The method creates the folder "pred", where the results are stored. A subfolder is created for each sub-hierarchy and contains five files:

- "precision.csv" summarizes the predictive performance of the method (measured using precision) for different numbers of paths (and pairwise associations between genes and functions) to be selected.

- "precision.pdf" illustrates the predictive performance of the method in a plot containing the three variations of the method (average, sum and minimum).

- "top_{mean,sum,min}.csv" contain the selected associations for each variation of the method. Each file contains the gene and function identifiers, followed by the probability of association, the probability computed for the path containing the association, and a Boolean value that indicates whether the association is present in the newer version of the database.
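
The relationship between the path scores and the reported precision can be illustrated with a minimal sketch. This is not the code in "reassign.py": the names `score_path` and `precision_at_k`, the row layout, and the toy data are all hypothetical, and it assumes that each variation combines the association probabilities along a path with the mean, sum, or minimum, and that precision is the fraction of selected associations confirmed in the newer version of the database.

```python
from statistics import mean

# Hypothetical aggregation functions, one per variation named above.
AGGREGATORS = {"mean": mean, "sum": sum, "min": min}


def score_path(probabilities, variant):
    """Combine the association probabilities along one path into a single score."""
    return AGGREGATORS[variant](probabilities)


def precision_at_k(rows, k):
    """Fraction of the k highest-scoring associations that are confirmed.

    Each row is assumed to be (gene, function, path_score, confirmed).
    """
    ranked = sorted(rows, key=lambda row: row[2], reverse=True)[:k]
    if not ranked:
        return 0.0
    return sum(1 for row in ranked if row[3]) / len(ranked)


# Toy data, invented for illustration only.
rows = [
    ("gene_a", "func_1", 0.9, True),
    ("gene_b", "func_2", 0.8, False),
    ("gene_c", "func_1", 0.7, True),
    ("gene_d", "func_3", 0.4, False),
]

print(score_path([0.9, 0.6], "min"))  # 0.6
print(precision_at_k(rows, 2))        # 0.5
```

Evaluating `precision_at_k` over a range of `k` values would give a curve comparable in spirit to the one stored in "precision.csv", under the assumptions stated above.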

Reference

Romero, M., Nakano, F. K., Finke, J., Rocha, C., & Vens, C. (2022). Hierarchy exploitation to detect missing annotations on hierarchical multi-label classification. arXiv preprint arXiv:2207.06237.

https://doi.org/10.48550/arxiv.2207.06237

Owner

  • Name: Miguel Romero
  • Login: migueleci
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
title: "REASSIGN"
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Miguel
    family-names: Romero
    email: miguelangel.romero@javerianacali.edu.co
    affiliation: Pontificia Universidad Javeriana
    orcid: 'https://orcid.org/0000-0002-7068-970X'
  - given-names: Felipe Kenji
    family-names: Nakano
  - given-names: Jorge
    family-names: Finke
  - given-names: Camilo
    family-names: Rocha
  - given-names: Celine
    family-names: Vens
doi: 10.48550/arXiv.2207.06237
date-released: 2022-07-12
url: "https://github.com/migueleci/reassign"
