PhenoFeatureFinder

PhenoFeatureFinder: a python package for linking developmental phenotypes to omics features - Published in JOSS (2024)

https://github.com/bleekerlab/phenofeaturefinder

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 4 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    4 of 6 committers (66.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Mathematics Computer Science - 84% confidence
Artificial Intelligence and Machine Learning Computer Science - 62% confidence
Engineering Computer Science - 60% confidence
Last synced: 4 months ago

Repository

A Python package dedicated to identifying plant metabolite features related to insect resistance

Basic Info
  • Host: GitHub
  • Owner: BleekerLab
  • License: other
  • Language: Jupyter Notebook
  • Default Branch: master
  • Homepage:
  • Size: 12.8 MB
Statistics
  • Stars: 0
  • Watchers: 4
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created about 3 years ago · Last pushed about 1 year ago
Metadata Files
  • Readme
  • Changelog
  • Contributing
  • License
  • Code of conduct

README.md

PhenoFeatureFinder

Linking developmental phenotypes to metabolic features

PhenoFeatureFinder is divided into three classes:

  • PhenotypeAnalysis
  • OmicsAnalysis
  • FeatureSelection

Overview of the package

PhenotypeAnalysis was designed to analyse development over time through progressive stages in multiple groups or treatments. Examples include the development of insects through their larval stages in different environments, or disease scores of fungal infections in multiple host plants. These phenotyping analyses can be challenging because of the many variables involved (e.g. time, developmental stages, replicates, treatments), especially for researchers whose strength does not lie in data analysis. PhenotypeAnalysis offers a set of functions to visualise development while taking those variables into account, and to perform the necessary data preprocessing steps. From the output, you can manually assign binary phenotypes to your groups (treatments, genotypes, etc.) if you want to use them as input for FeatureSelection.
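To make the shape of such data concrete, here is a minimal sketch in plain Python. It does not use the PhenotypeAnalysis API; the records, the stage names, the helper functions `fraction_in_stage` and `binary_phenotype`, and the 0.5 threshold are all hypothetical and only illustrate how stage counts over time could be turned into a binary phenotype per group.

```python
from collections import defaultdict

# Hypothetical counts: (treatment, day, developmental stage, number of individuals).
records = [
    ("control", 5, "larva", 12), ("control", 5, "adult", 0),
    ("control", 10, "larva", 3), ("control", 10, "adult", 9),
    ("resistant", 5, "larva", 12), ("resistant", 5, "adult", 0),
    ("resistant", 10, "larva", 10), ("resistant", 10, "adult", 2),
]

def fraction_in_stage(records, stage):
    """Return {treatment: {day: fraction of counted individuals in `stage`}}."""
    totals = defaultdict(lambda: defaultdict(int))
    in_stage = defaultdict(lambda: defaultdict(int))
    for treatment, day, record_stage, count in records:
        totals[treatment][day] += count
        if record_stage == stage:
            in_stage[treatment][day] += count
    return {
        treatment: {
            day: in_stage[treatment][day] / total
            for day, total in sorted(days.items())
            if total > 0
        }
        for treatment, days in totals.items()
    }

def binary_phenotype(fractions, threshold=0.5):
    """Label a treatment 1 if its last-day fraction reaches `threshold`, else 0."""
    return {
        treatment: int(by_day[max(by_day)] >= threshold)
        for treatment, by_day in fractions.items()
    }

adult_fraction = fraction_in_stage(records, "adult")
phenotypes = binary_phenotype(adult_fraction)
print(adult_fraction)
print(phenotypes)
```

In practice the threshold would be chosen by inspecting the development curves rather than fixed in advance, which is why the package leaves the phenotype assignment to the user.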

With OmicsAnalysis, you can filter large untargeted metabolomics datasets and visualise the structure of the data. The filtered data and a corresponding set of binary phenotypes can then be used as input for FeatureSelection. With only a few lines of code, the best-fitting pipeline for linking the phenotypes to metabolic features is created using automated machine learning with TPOT and scikit-learn.
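As an illustration of what filtering an untargeted dataset can involve, the sketch below drops features that are detected in too few samples. This is one common filtering step, not necessarily the one OmicsAnalysis applies; the feature table, the function `filter_by_prevalence` and the `min_fraction` parameter are assumptions made for the example.

```python
# Hypothetical feature table: feature name -> intensity per sample (0 = not detected).
intensities = {
    "feature_1": [0.0, 0.0, 0.0, 150.0],
    "feature_2": [820.0, 790.0, 0.0, 805.0],
    "feature_3": [0.0, 0.0, 0.0, 0.0],
}

def filter_by_prevalence(table, min_fraction=0.5):
    """Keep features detected (intensity > 0) in at least `min_fraction` of samples."""
    kept = {}
    for feature, values in table.items():
        detected = sum(1 for value in values if value > 0)
        if values and detected / len(values) >= min_fraction:
            kept[feature] = values
    return kept

filtered = filter_by_prevalence(intensities, min_fraction=0.5)
print(sorted(filtered))
```

Only the features that pass the filter would then be paired with the binary phenotypes for feature selection.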

Although OmicsAnalysis and FeatureSelection are designed for metabolomics data, they might also be used for other types of omics data. The user would have to keep in mind that the functions were written for the specifics of metabolomics data (high sparsity, strongly correlated features) and first assess the fit for other types of data.
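One way to assess that fit is to measure the two properties named above on your own data before running the package. The following sketch is not part of PhenoFeatureFinder; the helper names, the toy data and the 0.9 correlation threshold are hypothetical. It computes the fraction of zero entries and lists strongly correlated feature pairs.

```python
import math

def sparsity(table):
    """Fraction of zero entries across all features and samples."""
    values = [value for row in table.values() for value in row]
    return sum(1 for value in values if value == 0) / len(values)

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return math.nan
    return cov / math.sqrt(var_x * var_y)

def correlated_pairs(table, threshold=0.9):
    """Return feature pairs whose absolute Pearson correlation is >= `threshold`."""
    names = sorted(table)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            r = pearson(table[first], table[second])
            if abs(r) >= threshold:  # comparisons with nan are False, so constant features are skipped
                pairs.append((first, second))
    return pairs

# Hypothetical data: "a" and "b" rise together, "c" is mostly zero.
data = {
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, 6.0, 8.5],
    "c": [0.0, 5.0, 0.0, 0.0],
}
print(sparsity(data))
print(correlated_pairs(data))
```

If a dataset shows little sparsity and few correlated features, the defaults tuned for metabolomics data may not be the best choice for it.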

Installation

$ pip install PhenoFeatureFinder

At this moment, PhenoFeatureFinder requires Python 3.9.
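If you are unsure which interpreter your environment uses, a small check like the one below can be run before installing. It is a sketch only: the `REQUIRED` constant and the `is_supported` helper are hypothetical names, and the version is taken from the requirement stated above.

```python
import sys

REQUIRED = (3, 9)  # major and minor version, as stated in the requirement above

def is_supported(version_info=sys.version_info):
    """Return True if the interpreter's major.minor version equals REQUIRED."""
    return tuple(version_info[:2]) == REQUIRED

if not is_supported():
    print(
        f"Python {sys.version_info.major}.{sys.version_info.minor} detected; "
        f"PhenoFeatureFinder expects {REQUIRED[0]}.{REQUIRED[1]}."
    )
```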

Usage

For each of the classes, you can find a manual with an explanation for all of their functions in the manuals folder. Alternatively, you can find the documentation of the classes and their functions on Read the Docs.

If you want to see an example of how PhenoFeatureFinder can be used for real-world data, you can take a look at one of the two examples. The first example showcases the use of the PhenotypeAnalysis class for the analysis of the development of caddisfly larvae in four freshwater streams. In the second example, the OmicsAnalysis and FeatureSelection classes are used to analyse and select interesting features from a mass spectrometry dataset of a panel of bacterial species.

Dependencies

Required for all classes:

  • NumPy
  • pandas
  • Matplotlib
  • seaborn

Additionally required for PhenotypeAnalysis:

  • SciPy

Additionally required for OmicsAnalysis:

  • scikit-learn
  • UpSetPlot

Additionally required for FeatureSelection:

  • scikit-learn
  • TPOT
  • auto-sklearn (auto-sklearn is made for Linux operating systems. On macOS it needs to be installed manually with brew and pip. You can do this by following these instructions.)
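To see which of these dependencies are available in your environment, you can look up their import names with the standard library. The sketch below is not shipped with PhenoFeatureFinder; the `DEPENDENCIES` mapping and the `missing_modules` helper are illustrative. Note that scikit-learn is imported as `sklearn` and auto-sklearn as `autosklearn`.

```python
from importlib.util import find_spec

# Import names of the dependencies listed above, grouped by class.
DEPENDENCIES = {
    "all classes": ["numpy", "pandas", "matplotlib", "seaborn"],
    "PhenotypeAnalysis": ["scipy"],
    "OmicsAnalysis": ["sklearn", "upsetplot"],
    "FeatureSelection": ["sklearn", "tpot", "autosklearn"],
}

def missing_modules(modules):
    """Return the names in `modules` that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]

for group, modules in DEPENDENCIES.items():
    missing = missing_modules(modules)
    status = "ok" if not missing else "missing: " + ", ".join(missing)
    print(f"{group}: {status}")
```

A missing `autosklearn` on macOS is expected until the manual installation mentioned above has been done.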

Testing

Before using PhenoFeatureFinder to analyse your data, follow the manuals using the accompanying data to test the functionality. The obtained results should be identical to those in the manual. If you run into any errors, please contact the authors.

Citation

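If you use PhenoFeatureFinder, please cite:

Denkers, L.-A. M., Galland, M. D., Dekker, A., Bianchi, V., & Bleeker, P. M. (2024). PhenoFeatureFinder: a python package for linking developmental phenotypes to omics features. Journal of Open Source Software, 9(103), 7264.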

Contributing

Interested in contributing? Check out the contributing guidelines. Please note that this project is released with a Code of Conduct. By contributing to this project, you agree to abide by its terms.

Author contributions

PhenoFeatureFinder was created by Lissy-Anne Denkers and Marc Galland, with input from Annabel Dekker, Valerio Bianchi and Petra Bleeker.

License

This package is licensed under the terms of the Apache License 2.0.

Credits

PhenoFeatureFinder was created with cookiecutter and the py-pkgs-cookiecutter template.

Owner

  • Name: Petra Bleeker laboratory
  • Login: BleekerLab
  • Kind: organization
  • Email: P.M.Bleeker@uva.nl
  • Location: University of Amsterdam

Laboratory of Petra Bleeker at University of Amsterdam

JOSS Publication

PhenoFeatureFinder: a python package for linking developmental phenotypes to omics features
Published
November 23, 2024
Volume 9, Issue 103, Page 7264
Authors
Lissy-Anne M. Denkers ORCID
University of Amsterdam, Department of Plant Physiology, Green Life Science Research Theme, Swammerdam Institute for Life Sciences, Amsterdam, The Netherlands
Marc D. Galland ORCID
INRAE, Institute of Genetics, Environment and Plant Protection (IGEPP—Joint Research Unit 1349), Le Rheu, France
Annabel Dekker
Enza Zaden R&D B.V., BTR-BM Bioinformatics, Enkhuizen, The Netherlands
Valerio Bianchi ORCID
Enza Zaden R&D B.V., BTR-BM Bioinformatics, Enkhuizen, The Netherlands; Wageningen Bioveterinary Research, Wageningen University & Research, Lelystad, The Netherlands
Petra M. Bleeker ORCID
University of Amsterdam, Department of Plant Physiology, Green Life Science Research Theme, Swammerdam Institute for Life Sciences, Amsterdam, The Netherlands
Editor
Julia Romanowska ORCID
Tags
insect development phenotyping metabolomics omics feature selection preprocessing

GitHub Events

Total
  • Release event: 1
  • Push event: 10
Last Year
  • Release event: 1
  • Push event: 10

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 185
  • Total Committers: 6
  • Avg Commits per committer: 30.833
  • Development Distribution Score (DDS): 0.238
Past Year
  • Commits: 55
  • Committers: 1
  • Avg Commits per committer: 55.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
LissyDenkers l****s@u****l 141
Marc Galland m****d@u****l 32
Lissy Denkers l****s@f****l 9
Petra Bleeker 5****r 1
semantic-release s****e 1
Lissy Denkers l****s@w****l 1

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 14 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 5
  • Total maintainers: 2
pypi.org: phenofeaturefinder

Find metabolic features linked to insect development phenotypes

  • Versions: 5
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 14 Last month
Rankings
Dependent packages count: 10.8%
Average: 36.0%
Dependent repos count: 61.1%
Maintainers (2)
Last synced: 4 months ago

Dependencies

.github/workflows/draft-pdf.yml actions
  • actions/checkout v4 composite
  • actions/upload-artifact v1 composite
  • openjournals/openjournals-draft-action master composite
docs/requirements.txt pypi
  • alabaster ==0.7.16
  • cookiecutter >=2.6.0
  • myst-nb *
  • pytest ==8.1.1
  • pytest-cookies ==0.7.0
  • ruff ==0.3.5
  • sphinx-autoapi *
  • sphinx-rtd-theme *
  • tox ==4.14.2
  • watchdog ==4.0.0
poetry.lock pypi
  • 136 dependencies
pyproject.toml pypi
  • myst-nb 0.17.1 develop
  • pytest >=7.2.0 develop
  • python-semantic-release >=7.32.2 develop
  • sphinx-autoapi >=2.0.0 develop
  • sphinx-markdown-tables >=0.0.17 develop
  • sphinx-rtd-theme >=1.1.1 develop
  • TPOT >=0.11.7
  • matplotlib >=3.4.3
  • numpy 1.25.0
  • pandas >=1.5.1
  • python 3.9
  • scikit-learn >=0.24.1
  • scipy >=1.10.1
  • seaborn >=0.12.1
  • upsetplot >=0.8.0