pocpbenchmark_manuscript
Code and manuscript for benchmarking protein alignment tools for improved genus delineation using the Percentage Of Conserved Proteins (POCP)
Basic Info
- Host: GitHub
- Owner: ClavelLab
- License: gpl-3.0
- Language: TeX
- Default Branch: main
- Size: 2.11 MB
Metadata Files
README.md
Analyses workflow for POCP benchmark manuscript
pocpbenchmark_manuscript contains the workflow for analyzing data produced by our benchmark ClavelLab/pocpbenchmark. We set out to compare protein alignment tools for improved genus delineation using the Percentage Of Conserved Proteins (POCP).
A preprint of our work is available at bioRxiv:
Robust genome-based delineation of bacterial genera. Charlie Pauvert, Thomas C.A. Hitch, Thomas Clavel. bioRxiv 2025.03.17.643616; doi: https://doi.org/10.1101/2025.03.17.643616
Set up the environment for the workflow
These analyses were conducted in R 4.3.1 and in RStudio. We recommend setting up R and specific versions using rig, and getting RStudio from Posit. We also use renv for a reproducible environment, which can be installed in R with `install.packages("renv")`.
- Open RStudio and create a new project via "File > New Project..."
- Select "Version Control" and then "Git"
- Type `https://github.com/ClavelLab/pocpbenchmark_manuscript` in Repository URL.
- Make sure the project is going to be created in the correct subdirectory on your computer, or else edit accordingly
- Click on "Create project"

If you are comfortable with the command line and git, clone the repository either with SSH or HTTPS in a suitable location.

- RStudio warns you that `One or more packages recorded in the lockfile are not installed` because a couple of R packages and dependencies are needed.
  - Install the dependencies by typing `renv::restore()` in the Console and agree to the installation of the packages.
  - Check that all dependencies are set by typing `renv::status()` in the Console, where you should see `No issues found`
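The dependency steps above can also be run from a script. The following is a minimal sketch, not part of the repository: `restore_project_library()` is a hypothetical helper name, and it assumes the project root contains the `renv.lock` lockfile that renv uses by default.

```r
# Hypothetical helper (not part of the repository): restore the renv library
# for a project directory and report its status.
restore_project_library <- function(project = ".") {
  # Assumption: the lockfile sits at the project root under renv's default name.
  lockfile <- file.path(project, "renv.lock")
  if (!file.exists(lockfile)) {
    stop("No renv.lock found in ", normalizePath(project, mustWork = FALSE))
  }
  # Install renv itself if it is missing, as suggested in the setup section.
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv")
  }
  # Install the packages recorded in the lockfile (asks for confirmation
  # when run interactively), then check that nothing is out of sync.
  renv::restore(project = project)
  renv::status(project = project)
}
```

Calling `restore_project_library()` from the project root should be equivalent to typing the two renv commands in the Console.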
Our analysis workflow is orchestrated by targets and is composed of two subworkflows.
Prepare the data for the analysis
> [!NOTE]
> You can skip to the next section if you want to start the workflow from already prepared files!
- Download the raw output files from the workflow using the "Download all" button: https://doi.org/10.5281/zenodo.14974869
- Uncompress the zip archive within your project
- Create a `data_benchmark` folder within your project.
- Move all the zip files downloaded from Zenodo (`benchmark-gtdb-f__*.zip`) to `data_benchmark`.
- Ensure the two csv files are at the root of your project.
- Run the workflow with the following command:
```r
Sys.setenv(TAR_PROJECT = "prepare_pocpbenchmark_data")
targets::tar_make()
```
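The file staging steps that precede this command can be sketched in base R. This is an illustrative helper under stated assumptions, not code from the repository: `stage_benchmark_archives()` is a hypothetical name, and it assumes the Zenodo archive has already been uncompressed at the project root.

```r
# Hypothetical helper (not part of the repository): move the per-family
# archives into data_benchmark/ and list the csv files left at the root.
stage_benchmark_archives <- function(project = ".") {
  target_dir <- file.path(project, "data_benchmark")
  dir.create(target_dir, showWarnings = FALSE)

  # Archives named benchmark-gtdb-f__*.zip, as described in the steps above.
  archives <- list.files(
    project,
    pattern = "^benchmark-gtdb-f__.*\\.zip$",
    full.names = TRUE
  )
  moved <- file.rename(archives, file.path(target_dir, basename(archives)))

  # The workflow expects two csv files at the project root; report what is there.
  csv_files <- list.files(project, pattern = "\\.csv$")

  list(moved = basename(archives)[moved], csv_at_root = csv_files)
}
```

Inspecting the returned list before calling `targets::tar_make()` gives a quick check that the archives were moved and that two csv files are present at the root.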
Analyze the data and build the manuscript
If you skipped the first workflow, you need to download the cleaned and formatted POCP/POCPu values and metadata tables for analysis from https://doi.org/10.5281/zenodo.14975029. These are the files you would have generated with the previous section.
- Run the workflow with the following command:
```r
Sys.setenv(TAR_PROJECT = "analyze_pocpbenchmark_data")
targets::tar_make()
```
The manuscript is then available in the `_manuscript` folder, both as an HTML document (`index.html`) and as a docx document. The figures are generated in the `figures` folder.
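Once the workflow has finished, the presence of these outputs can be checked with a few lines of base R. This is a minimal sketch: `check_manuscript_outputs()` is a hypothetical helper, and since the name of the docx file is not stated here, it only looks for any `.docx` file in `_manuscript`.

```r
# Hypothetical helper (not part of the repository): report which of the
# expected outputs exist after the analysis workflow has run.
check_manuscript_outputs <- function(project = ".") {
  manuscript_dir <- file.path(project, "_manuscript")
  c(
    html = file.exists(file.path(manuscript_dir, "index.html")),
    # The docx file name is not specified, so accept any .docx file.
    docx = length(list.files(manuscript_dir, pattern = "\\.docx$")) > 0,
    # At least one file in the figures folder.
    figures = length(list.files(file.path(project, "figures"))) > 0
  )
}
```

A named logical vector with all values `TRUE` suggests that the manuscript and figures were produced.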
Owner
- Name: The Clavel lab
- Login: ClavelLab
- Kind: organization
- Location: Germany
- Website: https://www.ukaachen.de/kliniken-institute/institut-fuer-medizinische-mikrobiologie/forschung/ag-clavel/
- Twitter: clavellab
- Repositories: 1
- Profile: https://github.com/ClavelLab
This is the official GitHub account for the research group of Prof. Thomas Clavel.
Citation (CITATION.cff)
cff-version: 1.2.0
message: Please cite the following work when using this software.
title: Code and manuscript for benchmarking proteins alignment tools for improved genus delineation using the Percentage Of Conserved Proteins (POCP)
url: https://github.com/ClavelLab/pocpbenchmark_manuscript
authors:
  - family-names: Pauvert
    given-names: Charlie
    orcid: https://orcid.org/0000-0001-9832-2507
preferred-citation:
  type: article
  authors:
    - family-names: Pauvert
      given-names: Charlie
      orcid: https://orcid.org/0000-0001-9832-2507
    - family-names: Hitch
      given-names: Thomas C.A.
      orcid: https://orcid.org/0000-0003-2244-7412
    - family-names: Clavel
      given-names: Thomas
      orcid: https://orcid.org/0000-0002-7229-5595
  doi: 10.1101/2025.03.17.643616
  identifiers:
    - type: doi
      value: 10.1101/2025.03.17.643616
    - type: url
      value: http://dx.doi.org/10.1101/2025.03.17.643616
  title: Robust genome-based delineation of bacterial genera
  url: http://dx.doi.org/10.1101/2025.03.17.643616
  database: Crossref
  date-published: 2025-03-17
  year: 2025
  month: 3
  publisher:
    name: Cold Spring Harbor Laboratory