
Repository

Basic Info
  • Host: GitHub
  • Owner: trypanosomatics
  • License: bsd-3-clause
  • Language: R
  • Default Branch: main
  • Size: 53.2 MB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed 7 months ago

README.md

Code for Analysis of Peptide Array Data

Software and code accompanying an upcoming book chapter (citation below). In this chapter, we provide annotated code for analyzing peptide microarray data from the CHAGASTOPE project (aka the Chagas Antigen and Epitope Atlas). The same type of analysis can be applied to other peptide microarray data, with adjustments to the scripts if the array design differs.

The repository contains a series of scripts for performing the various steps involved in the analysis and visualization of CHAGASTOPE data. The scripts are organized sequentially in the code folder, with the order indicated by the number in the script name (e.g., 01_pools_normalize_data). The sequential steps include normalization of array data across samples, smoothing to remove outliers, calculation/detection of reactive (antigenic) peaks and regions, and analysis of single-residue mutagenesis scanning (Alanine Scans) of epitopes as described in Ricci et al. 2023 (DOI: 10.1038/s41467-023-37522-9).
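Because the scripts are meant to run in numeric order, it can help to list them in sequence before executing anything. The following is a minimal sketch, not part of the repository: the function name `list_pipeline_scripts` is hypothetical, and it assumes every main script follows the two-digit prefix pattern seen in 01_pools_normalize_data, with an .R extension.

```shell
#!/usr/bin/env bash
# Sketch: list numbered R scripts in a folder in execution order.
# Assumption: main scripts are named NN_<something>.R (e.g. 01_pools_normalize_data.R).
list_pipeline_scripts() {
  local dir="${1:-.}"
  local f
  # Glob expansion is sorted by name, so 01_ comes before 02_, and so on.
  for f in "$dir"/[0-9][0-9]_*.R; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    basename "$f"
  done
}

# Dry run from the code directory: print the order without executing anything.
list_pipeline_scripts . | while IFS= read -r script; do
  echo "would run: ./$script"
done
```

The dry run only prints script names. Actually running each step requires the arguments that step expects; the `--testing TRUE` flag is documented here only for the first script.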

The code is divided into two sections. The main scripts read and process the parameters passed to them and then run the main function; they are essentially wrappers for that function. The main function, along with other auxiliary functions, is located in the functions sub-folder. Each script is numbered consistently in both sections. This separation is intended for clarity: the main script is concise, allowing users to quickly view the parameters in use, and is designed for those who do not wish to modify the code. In contrast, the scripts containing the auxiliary functions are intended for users who wish to delve into the details of the process and make modifications. Parameters are located in the config or parameters sections of the respective scripts.

Both the main scripts and those in the functions folder must be downloaded for the code to function correctly. The code can be executed either through the terminal (Bash or Windows PowerShell), which allows for modifying arguments passed to the script, or by running the script in a number of development environments that support the R programming language (RStudio is recommended).

A minimal set of data for testing purposes is provided in this repository in the data folder. The complete set of data from the Chagas Antigen and Epitope Atlas is available from the ArrayExpress database under accession numbers: E-MTAB-11651 and E-MTAB-11655.

INSTALL

All code has been tested with R version 4.3.3 and Bioconductor v3.18.
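Before installing dependencies, it may be worth confirming that the local R matches the tested version. The sketch below is an illustration only: the function names `parse_r_version` and `check_r_version` are hypothetical, and it assumes the first line of `R --version` has the usual form `R version X.Y.Z (date) -- "nickname"`.

```shell
#!/usr/bin/env bash
# Sketch: compare the installed R version with the tested one (4.3.3).

# Read `R --version` output on stdin and print only the X.Y.Z part.
parse_r_version() {
  sed -n 's/^R version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Return 0 when the installed R matches the tested version, 1 otherwise.
check_r_version() {
  local tested="4.3.3"
  local found
  if ! command -v R >/dev/null 2>&1; then
    echo "R not found on PATH" >&2
    return 1
  fi
  found=$(R --version | parse_r_version)
  if [ "$found" = "$tested" ]; then
    echo "R $found matches the tested version"
  else
    echo "R $found differs from the tested version $tested" >&2
    return 1
  fi
}

# Usage (uncomment to run the check):
# check_r_version
```

A mismatch does not necessarily mean the code will fail; it only signals that the setup differs from the one the authors tested.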

Instructions for Ubuntu Linux 24.04

The default R version in Ubuntu 24.04 is 4.3.3, which has been tested. The repository includes renv configuration files that automatically install all required dependencies the first time an interactive R session is opened from the code directory. This interactive session is required only once.

```
# in the Linux terminal

# update package indices
sudo apt update

# install R
sudo apt install r-base-core libx11-dev

# clone this repo
git clone https://github.com/trypanosomatics/Peptide-arrays-for-Chagas-disease.git

# enter the code directory and make all R scripts executable
cd Peptide-arrays-for-Chagas-disease/code/
chmod +x *.R

# install all dependencies
# to start auto-installation, please initiate an interactive R session
R
```

Instructions for Windows

Download and install R-4.3.3, and optionally Rtools43 (which may be required to build R packages from source). Also download and install Git for Windows. Note that the R executables are not added to the PATH by default (see the R for Windows FAQ).

Then, in a PowerShell terminal:

```
# clone this repo
git clone https://github.com/trypanosomatics/Peptide-arrays-for-Chagas-disease.git

# enter the code directory
cd Peptide-arrays-for-Chagas-disease/code/

# install all dependencies
# to start auto-installation, please initiate an interactive R session
R.exe
```

Install all dependencies (Linux or Windows)

The instructions are the same for Linux and Windows. Within an initial interactive R session, renv should bootstrap itself. You will notice a message saying "One or more packages recorded in the lockfile are not installed." After renv is installed and ready, run:

```
# this is all run within an interactive R session

# install all dependencies using renv::restore
# this will install all requirements listed in the renv.lock file
# and respecting all configs in renv/settings.json
# you should respond to occasional questions
renv::restore()

# check that everything is OK
renv::status()

# if you get this, you're OK to go!
# No issues found -- the project is in a consistent state.
```

USAGE

Detailed descriptions of what each script does are given in the book chapter. In brief, all code can be tested with the minimal data set provided in this repository. Some examples are shown below:

Example Bash (Linux)

```
# minimal, assumes all testing data and default parameters
./01_pools_normalize_data.R --testing TRUE

# custom main folder and only some specific datasets
./01_pools_normalize_data.R --main_folder /path/to/folder --sources AR,BO,BR
```
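When processing datasets one at a time, the comma-separated source list can be split in the shell to produce one command per source. This is a dry-run sketch under assumptions: the function name `build_commands` is hypothetical, and it assumes `--sources` also accepts a single code, which the examples above do not show explicitly.

```shell
#!/usr/bin/env bash
# Sketch: print one normalization command per source code (nothing is executed).
build_commands() {
  local main_folder="$1"
  local sources="$2"
  local src
  local IFS=','   # split the source list on commas inside this function only
  for src in $sources; do
    printf './01_pools_normalize_data.R --main_folder "%s" --sources %s\n' \
      "$main_folder" "$src"
  done
}

# Example: prints three command lines, one each for AR, BO and BR.
build_commands /path/to/folder AR,BO,BR
```

Printing the commands first makes it easy to review them before running them by hand or piping them to a shell.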

Example Windows PowerShell

On Windows (PowerShell), scripts are passed as arguments to the Rscript.exe executable. The examples mirror those above.

    Rscript.exe '01_pools_normalize_data.R' --testing TRUE
    Rscript.exe '01_pools_normalize_data.R' --main_folder "/path/to/folder" --testing FALSE --sources "AR,BO,BR"

Citation

If you use this code, please cite:

Guadalupe Romer, Ramiro B Quinteros, Fernán Agüero. Software and tools to analyze high-density peptide array data for the Chagas Antigen and Epitope Atlas (2025). In: Trypanosoma cruzi infection: methods and protocols (Karina A Gómez & Carlos A Buscaglia, eds), Methods in Molecular Biology (series). Springer / Humana Press. In process.

BibTeX citation (will be updated soon):

    @incollection{romer_25_software,
      author = {Romer, Guadalupe and Quinteros, Ramiro B. and Agüero, Fernán},
      title = {Software and tools to analyze high-density peptide array data for the Chagas Antigen and Epitope Atlas},
      year = {2025},
      chapter = {},
      pages = {},
      editor = {Gómez, Karina A. and Buscaglia, Carlos A.},
      booktitle = {Trypanosoma cruzi infection: methods and protocols},
      series = {Methods in Molecular Biology},
      publisher = {Humana Press},
      volume = {},
      number = {},
      issn = {},
      doi = {},
      url = {},
    }

Owner

  • Name: Trypanosomatics
  • Login: trypanosomatics
  • Kind: organization
  • Location: Buenos Aires, Argentina

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Code for Analysis of Peptide Array Data
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Guadalupe
    family-names: Romer
    email: gromer@iib.unsam.edu.ar
    affiliation: >-
      Instituto de Investigaciones Biotecnológicas (IIBIO),
      Universidad Nacional de San Martín (UNSAM) -- Consejo
      Nacional de Investigaciones Científicas y Técnicas
      (CONICET)
  - given-names: Ramiro
    family-names: Quinteros
    email: rbquinteros@iib.unsam.edu.ar
    affiliation: >-
      Instituto de Investigaciones Biotecnológicas (IIBIO),
      Universidad Nacional de San Martín (UNSAM) -- Consejo
      Nacional de Investigaciones Científicas y Técnicas
      (CONICET)
  - given-names: Fernán
    family-names: Agüero
    email: fernan@iib.unsam.edu.ar
    orcid: 'https://orcid.org/0000-0003-1331-5741'
    affiliation: >-
      Instituto de Investigaciones Biotecnológicas (IIBIO),
      Universidad Nacional de San Martín (UNSAM) -- Consejo
      Nacional de Investigaciones Científicas y Técnicas
      (CONICET)
  - name: >-
      Instituto de Investigaciones Biotecnológicas (IIBIO),
      Universidad Nacional de San Martín (UNSAM) -- Consejo
      Nacional de Investigaciones Científicas y Técnicas
      (CONICET)
    address: 'Campus Miguelete UNSAM, Av. 25 de Mayo 1401'
    city: San Martín
    country: AR
    post-code: B1650HMQ
    location: Buenos Aires
    website: 'https://www.iib.unsam.edu.ar'
repository-code: >-
  https://github.com/trypanosomatics/Peptide-arrays-for-Chagas-disease
abstract: >-
  Software and code accompanying an upcoming book chapter
  (citation below). In this chapter, we provide annotated
  code for analyzing peptide microarray data from the
  CHAGASTOPE project (aka the Chagas Antigen and Epitope
  Atlas). The same type of analysis can be applied to other
  peptide microarray data, with adjustments to the scripts
  if the array design differs.


  The repository contains a series of scripts for performing
  the various steps involved in the analysis and
  visualization of CHAGASTOPE data. The scripts are
  sequentially organized in the code folder, with the order
  indicated by the number in the script name (e.g.,
  01_pools_normalize_data). The sequential steps include
  normalization of array data across samples, smoothing to
  remove outliers, calculation/detection of reactive
  (antigenic) peaks and regions, and analysis of
  single-residue mutagenesis scanning (Alanine Scans) of
  epitopes as described in (Ricci et al. 2023, DOI:
  10.1038/s41467-023-37522-9).


  The code is divided into two sections: all main scripts
  contain the primary code, which reads and processes the
  parameters called and then runs the main function. They are
  essentially wrappers for the main function. This main
  function, along with other auxiliary functions, is located
  in the functions sub-folder. Each script is numbered
  consistently in both sections. This separation is intended
  for clarity: the main script is concise, allowing users to
  quickly view the parameters in use and is designed for
  those who do not wish to modify the code. In contrast, the
  scripts containing all auxiliary functions are intended
  for users who wish to delve into the details of the
  process and make modifications. Parameters are located in
  the config or parameters sections of the respective
  scripts.


  Both the main scripts and those in the functions folder
  must be downloaded for the code to function correctly. The
  code can be executed either through the terminal (Bash or
  Windows PowerShell), which allows for modifying arguments
  passed to the script, or by running the script in a number
  of development environments that support the R programming
  language (RStudio is recommended).


  A minimal set of data for testing purposes is provided in
  this repository in the data folder. The complete set of
  data from the Chagas Antigen and Epitope Atlas is
  available from the ArrayExpress database under accession
  numbers: E-MTAB-11651 and E-MTAB-11655.
keywords:
  - Peptide arrays
  - microarrays
license: BSD-3-Clause-Attribution
commit: c26becc
version: '1'
date-released: '2025-07-29'
