memetic-geneci

Software package derived from Single-GENECI that incorporates an additional local search phase to guide the evolution of individuals based on known interactions

https://github.com/adrianseguraortiz/memetic-geneci

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.4%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: AdrianSeguraOrtiz
  • License: mit
  • Language: Java
  • Default Branch: main
  • Homepage:
  • Size: 74.6 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License Citation

README.md

Memetic-GENECI

Memetic-GENECI is a software package derived from Single-GENECI (GEne NEtwork Consensus Inference) that incorporates an additional local search phase to guide the evolution of individuals based on known interactions. Injection of domain expert knowledge has been shown to improve the accuracy with which Single-GENECI optimises consensus between different gene regulatory network inference techniques.

Prerequisites

  • Python >= 3.9
  • Docker

Installation

    pip install geneci==1.5.2

Integrated techniques

The same as those contemplated in Single-GENECI

Example procedure

  1. Download simulated expression data and their respective gold standards, as in Single-GENECI.

  2. Infer networks for the selected expression data and build their consensus. Unlike Single-GENECI, Memetic-GENECI accepts a list of known interactions through the --known-interactions parameter, given as a txt or csv file with the following structure:

    G1,G2,1
    G1,G3,1
    G3,G4,1
    ...
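The known-interactions structure shown above can be produced and checked with a few lines of Python. This is a minimal sketch, not part of GENECI: the helper names `write_known_interactions` and `read_known_interactions` are hypothetical, and it assumes that every row has exactly three comma-separated fields with an integer in the third, as in the example rows.

```python
import csv
from pathlib import Path


def write_known_interactions(pairs, path):
    """Write gene pairs as 'G1,G2,1' rows, one interaction per line."""
    path = Path(path)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle, lineterminator="\n")
        for first_gene, second_gene in pairs:
            # The trailing 1 mirrors the third column of the README example.
            writer.writerow([first_gene, second_gene, 1])
    return path


def read_known_interactions(path):
    """Read the file back, rejecting rows that do not have three fields."""
    rows = []
    with Path(path).open(newline="") as handle:
        for line_number, row in enumerate(csv.reader(handle), start=1):
            if len(row) != 3:
                raise ValueError(
                    f"line {line_number}: expected 3 fields, got {len(row)}"
                )
            rows.append((row[0], row[1], int(row[2])))
    return rows
```

Reading the file back before passing it to --known-interactions is a cheap way to catch malformed rows early.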

Two further parameters control the local search: --memetic-distance-type selects whether one, some or all of the known interactions are considered in each iteration of the local search phase, and --memetic-probability sets the probability with which an individual is subjected to it.
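How these two parameters might interact can be sketched in Python. This is an illustration under stated assumptions, not the package's implementation (which lives in its Java component): the function names are hypothetical, "some" is assumed to mean a random non-empty subset, and the probability is assumed to be tested once per individual.

```python
import random


def select_known_interactions(known, distance_type, rng):
    """Pick the known interactions one local-search iteration considers."""
    if not known:
        return []
    if distance_type == "all":
        return list(known)
    if distance_type == "one":
        return [rng.choice(known)]
    if distance_type == "some":
        # Assumption: "some" is a random subset of random, non-zero size.
        size = rng.randint(1, len(known))
        return rng.sample(known, size)
    raise ValueError(f"unknown memetic distance type: {distance_type!r}")


def undergoes_local_search(memetic_probability, rng):
    """Decide whether an individual enters the local search phase."""
    # rng.random() is in [0, 1), so 0.0 never triggers and 1.0 always does.
    return rng.random() < memetic_probability
```

Passing an explicit `random.Random` instance keeps the sketch reproducible when a seed is fixed.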

  • Form 1: Procedure prefixed by the command run

    geneci run --expression-data input_data/DREAM4/EXP/dream4_010_01_exp.csv \
               --technique aracne --technique bc3net --technique c3net \
               --technique clr --technique genie3_rf --technique genie3_gbm \
               --technique genie3_et --technique mrnet --technique mrnetb \
               --technique pcit --technique tigress --technique kboost \
               --known-interactions known_interactions.txt --memetic-distance-type all \
               --memetic-probability 0.5

  • Form 2: Division of the procedure into several commands

```sh
# 1. Inference using individual techniques
geneci infer-network --expression-data input_data/DREAM4/EXP/dream4_010_01_exp.csv \
    --technique aracne --technique bc3net --technique c3net --technique clr --technique mrnet \
    --technique mrnetb --technique genie3_rf --technique genie3_gbm --technique genie3_et \
    --technique pcit --technique tigress --technique kboost

# 2. Optimize the assembly of the trust lists resulting from the above command
geneci optimize-ensemble --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_ARACNE.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_BC3NET.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_C3NET.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_CLR.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_GENIE3_RF.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_GENIE3_GBM.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_GENIE3_ET.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_MRNET.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_MRNETB.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_PCIT.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_TIGRESS.csv \
    --confidence-list inferred_networks/dream4_010_01_exp/lists/GRN_KBOOST.csv \
    --gene-names inferred_networks/dream4_010_01_exp/gene_names.txt \
    --known-interactions known_interactions.txt --memetic-distance-type all \
    --memetic-probability 0.5
```

  3. Representation of inferred networks, as in Single-GENECI.
  4. Evaluation of the quality of the inferred gene network with respect to the gold standard, as in Single-GENECI.
  5. Binarization of the inferred gene network, as in Single-GENECI.

CLI

Usage:

    $ geneci [OPTIONS] COMMAND [ARGS]...

Options:

  • --install-completion: Install completion for the current shell.
  • --show-completion: Show completion for the current shell, to copy it or customize the installation.
  • --help: Show this message and exit.

Commands:

  • apply-cut: Converts a list of confidence values into a binary matrix that represents the final gene network.
  • draw-network: Draw gene regulatory networks from confidence lists.
  • evaluate: Evaluate the accuracy of the inferred network with respect to its gold standard.
  • extract-data: Extract data from different simulators and known challenges. These include DREAM3, DREAM4, DREAM5, SynTReN, Rogers, GeneNetWeaver and IRMA.
  • infer-network: Infer gene regulatory networks from expression data. Several techniques are available: ARACNE, BC3NET, C3NET, CLR, GENIE3, MRNET, MRNETB and PCIT.
  • optimize-ensemble: Analyzes several trust lists and builds a consensus network by applying an evolutionary algorithm.
  • run: Infer gene regulatory network from expression data by employing multiple unsupervised learning techniques and applying a genetic algorithm for consensus optimization.
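The apply-cut entry above describes turning a list of confidence values into a binary matrix. The Python sketch below illustrates two of the criteria named by the --cut-off-criteria option, MinConfidence and MaxNumLinksBestConf. It rests on assumptions: the function names and the (source, target, confidence) tuple layout are hypothetical, MinConfidence is assumed to keep links at or above the threshold, and MaxNumLinksBestConf is assumed to keep the highest-confidence links up to a fixed count. MinConfDist is left out because its definition is not given here.

```python
def cut_min_confidence(links, threshold):
    """Keep links whose confidence is at least `threshold`."""
    return [(source, target) for source, target, conf in links if conf >= threshold]


def cut_max_num_links(links, max_links):
    """Keep the `max_links` links with the highest confidence."""
    ranked = sorted(links, key=lambda link: link[2], reverse=True)
    return [(source, target) for source, target, _ in ranked[:max_links]]


def to_binary_matrix(genes, kept_links):
    """Build a square 0/1 matrix with a 1 for every kept (source, target) link."""
    index = {gene: position for position, gene in enumerate(genes)}
    matrix = [[0] * len(genes) for _ in genes]
    for source, target in kept_links:
        matrix[index[source]][index[target]] = 1
    return matrix
```

Rows correspond to the first gene of each link and columns to the second, which is a convention chosen for this sketch only.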

Only the commands whose specifications differ from Single-GENECI are documented below.

run

Infer gene regulatory network from expression data by employing multiple unsupervised learning techniques and applying a genetic algorithm for consensus optimization.

Usage:

    $ geneci run [OPTIONS]

Options:

  • --expression-data PATH: Path to the CSV file with the expression data. Genes are distributed in rows and experimental conditions (time series) in columns. [required]
  • --known-interactions PATH: Path to the CSV file with the known interactions between genes. If specified, a local search process will be performed during the repair [default: None].
  • --technique [ARACNE|BC3NET|C3NET|CLR|GENIE3_RF|GENIE3_GBM|GENIE3_ET|MRNET|MRNETB|PCIT|TIGRESS|KBOOST]: Inference techniques to be performed. [required]
  • --crossover [SBXCrossover|BLXAlphaCrossover|DifferentialEvolutionCrossover|NPointCrossover|NullCrossover|WholeArithmeticCrossover]: Crossover operator [default: SBXCrossover]
  • --crossover-probability FLOAT: Crossover probability [default: 0.9]
  • --mutation [PolynomialMutation|CDGMutation|GroupedAndLinkedPolynomialMutation|GroupedPolynomialMutation|LinkedPolynomialMutation|NonUniformMutation|NullMutation|SimpleRandomMutation|UniformMutation]: Mutation operator [default: PolynomialMutation]
  • --mutation-probability FLOAT: Mutation probability. [default: 1/len(files)]
  • --repairer [StandardizationRepairer|GreedyRepair]: Solution repairer to keep the sum of weights equal to 1 [default: StandardizationRepairer]
  • --memetic-distance-type [all|some|one]: Memetic distance type [default: MemeticDistanceType.all]
  • --memetic-probability FLOAT: Memetic probability [default: 0.55]
  • --population-size INTEGER: Population size [default: 100]
  • --num-evaluations INTEGER: Number of evaluations [default: 25000]
  • --cut-off-criteria [MinConfidence|MaxNumLinksBestConf|MinConfDist]: Criteria for determining which links will be part of the final binary matrix. [default: MinConfDist]
  • --cut-off-value FLOAT: Numeric value associated with the selected criterion. Ex: MinConfidence = 0.5, MaxNumLinksBestConf = 10, MinConfDist = 0.2 [default: 0.5]
  • --quality-weight FLOAT: Weight associated with the first term of the fitness function. This term tries to maximize the quality of good links (improve trust and frequency of appearance) while minimizing their quantity. It tries to establish some contrast between good and bad links so that the links finally reported are of high reliability. [default: 0.75]
  • --topology-weight FLOAT: Weight associated with the second term of the fitness function. This term tries to increase the degree (number of links) of those genes with a high potential to be considered as hubs. At the same time, it is intended that the number of genes that meet this condition should be relatively low, since this is what is usually observed in real gene networks. The objective is to promote the approximation of the network to a scale-free configuration and to move away from random structure. [default: 0.25]
  • --threads INTEGER: Number of threads to be used during parallelization. By default, the maximum number of threads available in the system is used. [default: 8]
  • --graphics / --no-graphics: Indicate if you want to represent the evolution of the fitness value. [default: True]
  • --output-dir PATH: Path to the output folder. [default: inferred_networks]
  • --help: Show this message and exit.
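Two of the options above lend themselves to a short illustration: the repairer, which keeps the sum of weights equal to 1, and the default mutation probability of 1/len(files). The Python sketch below shows one way such a standardization could work. It is an assumption-laden sketch rather than the package's StandardizationRepairer: the function names are hypothetical, weights are assumed to be non-negative, and the fallback to uniform weights when they sum to zero is a choice made here.

```python
def standardize_weights(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        # Assumption: with no usable signal, fall back to uniform weights.
        return [1.0 / len(weights)] * len(weights)
    return [weight / total for weight in weights]


def default_mutation_probability(num_files):
    """Default quoted in the option list: 1 / number of confidence lists."""
    return 1.0 / num_files
```

With the twelve techniques used in the example procedure, the default mutation probability under this reading would be 1/12.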

optimize-ensemble

Analyzes several trust lists and builds a consensus network by applying an evolutionary algorithm

Usage:

    $ geneci optimize-ensemble [OPTIONS]

Options:

  • --confidence-list TEXT: Paths of the CSV files with the confidence lists to be agreed upon. [required]
  • --known-interactions PATH: Path to the CSV file with the known interactions between genes. If specified, a local search process will be performed during the repair [default: None].
  • --gene-names PATH: Path to the TXT file with the names of the contemplated genes, separated by commas and without spaces. If not specified, only the genes that appear in the confidence lists will be considered.
  • --crossover [SBXCrossover|BLXAlphaCrossover|DifferentialEvolutionCrossover|NPointCrossover|NullCrossover|WholeArithmeticCrossover]: Crossover operator [default: SBXCrossover]
  • --crossover-probability FLOAT: Crossover probability [default: 0.9]
  • --mutation [PolynomialMutation|CDGMutation|GroupedAndLinkedPolynomialMutation|GroupedPolynomialMutation|LinkedPolynomialMutation|NonUniformMutation|NullMutation|SimpleRandomMutation|UniformMutation]: Mutation operator [default: PolynomialMutation]
  • --mutation-probability FLOAT: Mutation probability. [default: 1/len(files)]
  • --repairer [StandardizationRepairer|GreedyRepair]: Solution repairer to keep the sum of weights equal to 1 [default: StandardizationRepairer]
  • --memetic-distance-type [all|some|one]: Memetic distance type [default: MemeticDistanceType.all]
  • --memetic-probability FLOAT: Memetic probability [default: 0.55]
  • --population-size INTEGER: Population size [default: 100]
  • --num-evaluations INTEGER: Number of evaluations [default: 25000]
  • --cut-off-criteria [MinConfidence|MaxNumLinksBestConf|MinConfDist]: Criteria for determining which links will be part of the final binary matrix. [default: MinConfDist]
  • --cut-off-value FLOAT: Numeric value associated with the selected criterion. Ex: MinConfidence = 0.5, MaxNumLinksBestConf = 10, MinConfDist = 0.2 [default: 0.5]
  • --quality-weight FLOAT: Weight associated with the first term of the fitness function. This term tries to maximize the quality of good links (improve trust and frequency of appearance) while minimizing their quantity. It tries to establish some contrast between good and bad links so that the links finally reported are of high reliability. [default: 0.75]
  • --topology-weight FLOAT: Weight associated with the second term of the fitness function. This term tries to increase the degree (number of links) of those genes with a high potential to be considered as hubs. At the same time, it is intended that the number of genes that meet this condition should be relatively low, since this is what is usually observed in real gene networks. The objective is to promote the approximation of the network to a scale-free configuration and to move away from random structure. [default: 0.25]
  • --threads INTEGER: Number of threads to be used during parallelization. By default, the maximum number of threads available in the system is used. [default: 8]
  • --graphics / --no-graphics: Indicate if you want to represent the evolution of the fitness value. [default: True]
  • --output-dir PATH: Path to the output folder. [default: <>/../ea_consensus]
  • --help: Show this message and exit.
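To make the idea of a consensus built from several confidence lists concrete, the Python sketch below combines lists with one weight per technique. This is an assumption, not a description of the package's algorithm: the README only states that weights are kept summing to 1, so the weighted sum, the dictionary layout keyed by (source, target), and the function name are all hypothetical.

```python
def weighted_consensus(confidence_lists, weights):
    """Combine per-technique confidences into one value per link.

    Each confidence list maps a (source, target) pair to a confidence,
    and `weights` holds one weight per list (assumed to sum to 1).
    """
    if len(confidence_lists) != len(weights):
        raise ValueError("one weight is required per confidence list")
    consensus = {}
    for confidences, weight in zip(confidence_lists, weights):
        for link, value in confidences.items():
            # A link missing from a list contributes nothing for that technique.
            consensus[link] = consensus.get(link, 0.0) + weight * value
    return consensus
```

Under this reading, the evolutionary algorithm would search over the weights while the combination rule itself stays fixed.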

Owner

  • Name: Adrián Segura
  • Login: AdrianSeguraOrtiz
  • Kind: user
  • Location: Málaga, Spain

Bioinformatics engineer, teacher, and researcher at the University of Malaga.

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Memetic-GENECI (Memetic GEne NEtwork Consensus Inference)
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Adrián
    family-names: Segura Ortiz
    email: adrianseor.99@uma.es
    affiliation: University of Malaga
    orcid: 'https://orcid.org/0000-0003-2149-5754'
repository-code: 'https://github.com/AdrianSeguraOrtiz/Memetic-GENECI'
url: 'https://pypi.org/project/GENECI/1.5.2/'
abstract: >-
  Memetic-GENECI is a software package derived from Single-GENECI (GEne NEtwork Consensus Inference) 
  that incorporates an additional local search phase to guide the evolution of individuals 
  based on known interactions. Injection of domain expert knowledge has been shown to improve 
  the accuracy with which Single-GENECI optimises consensus between different gene regulatory 
  network inference techniques.
keywords:
  - Memetic Algorithm
  - Gene Regulatory Networks
  - Optimization
  - Bioinformatics
license: MIT
preferred-citation:
  type: conference-paper
  authors:
  - family-names: "Segura Ortiz"
    given-names: "Adrián"
    orcid: "https://orcid.org/0000-0003-2149-5754"
  - family-names: "García Nieto"
    given-names: "José Manuel"
  - family-names: "Aldana Montes"
    given-names: "José Francisco"
  doi: "10.1007/978-3-031-63772-8_1"
  book: "International Conference on Computational Science"
  publisher: "Springer Nature Switzerland"
  month: 6
  title: "Exploiting medical-expert knowledge via a novel memetic algorithm for the inference of gene regulatory networks"
  year: 2024

GitHub Events

Total
  • Push event: 2
Last Year
  • Push event: 2

Dependencies

.github/workflows/ci.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
  • snok/install-poetry v1 composite
.github/workflows/release.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
  • ncipollo/release-action v1 composite
  • snok/install-poetry v1 composite
.github/workflows/sync_repo.yml actions
  • actions/checkout v3 composite
components/apply_cut/Dockerfile docker
  • openjdk 17-buster build
components/draw_network/Dockerfile docker
  • python 3.10.2 build
components/evaluate/dream_prediction/Dockerfile docker
  • python 3.5.6 build
components/evaluate/generic_prediction/Dockerfile docker
  • r-base 4.1.2 build
components/extract_data/DREAM3/Dockerfile docker
  • python 3.10.2 build
components/extract_data/DREAM4/EVAL/Dockerfile docker
  • python 3.10.2 build
components/extract_data/DREAM4/EXPGS/Dockerfile docker
  • r-base 4.1.2 build
components/extract_data/DREAM5/Dockerfile docker
  • python 3.10.2 build
components/extract_data/GRNDATA/Dockerfile docker
  • r-base 4.1.2 build
components/extract_data/IRMA/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/ARACNE/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/BC3NET/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/C3NET/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/CLR/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/GENIE3/Dockerfile docker
  • python 3.10.2 build
components/infer_network/JUMP3/Dockerfile docker
  • mathworks/matlab r2020b build
components/infer_network/KBOOST/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/MRNET/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/MRNETB/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/PCIT/Dockerfile docker
  • r-base 4.1.2 build
components/infer_network/TIGRESS/Dockerfile docker
  • r-base 4.1.2 build
components/optimize_ensemble/Dockerfile docker
  • openjdk 17-buster build
EAGRN-JMetal/pom.xml maven
  • org.junit.jupiter:junit-jupiter-api 5.8.2 compile
  • org.testng:testng RELEASE compile
  • org.slf4j:slf4j-api 1.7.36
  • org.slf4j:slf4j-simple 1.7.36
  • org.uma.jmetal:jmetal-algorithm 5.11
  • org.uma.jmetal:jmetal-core 5.11
  • org.uma.jmetal:jmetal-example 5.11
  • org.uma.jmetal:jmetal-experimental 5.11
  • org.uma.jmetal:jmetal-parallel 5.11
  • org.junit.jupiter:junit-jupiter-api 5.8.2 test
  • org.mockito:mockito-all 1.10.19 test
poetry.lock pypi
  • certifi 2022.12.7
  • charset-normalizer 3.1.0
  • click 8.1.3
  • colorama 0.4.6
  • contourpy 1.0.7
  • cycler 0.11.0
  • docker 5.0.3
  • fonttools 4.39.3
  • idna 3.4
  • importlib-resources 5.12.0
  • kiwisolver 1.4.4
  • matplotlib 3.7.1
  • numpy 1.24.2
  • packaging 23.0
  • pillow 9.4.0
  • pyparsing 3.0.9
  • python-dateutil 2.8.2
  • pywin32 227
  • requests 2.28.2
  • setuptools 67.6.1
  • setuptools-scm 7.1.0
  • shellingham 1.5.0.post1
  • six 1.16.0
  • tomli 2.0.1
  • typer 0.4.2
  • typing-extensions 4.5.0
  • urllib3 1.26.15
  • websocket-client 1.5.1
  • zipp 3.15.0
pyproject.toml pypi
  • docker ^5.0.3
  • matplotlib ^3.5.2
  • python ^3.9
  • typer ^0.4.1