map2prot

Tool for mapping peptides onto a protein sequence

https://github.com/preston-gw/map2prot

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.6%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: preston-gw
  • License: MIT
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 339 KB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 7
  • Releases: 1
Created over 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

map2prot

Tool for mapping peptides onto a protein sequence

About

The map2prot tool is an R script that maps a set of peptides onto a protein sequence. It generates a chart in which the peptides appear as coloured blocks underneath the protein sequence. Sequence coverage (%) and various metadata are printed on the chart.
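
The mapping and coverage calculation described above can be sketched in a few lines of base R. This is a minimal illustration, not the code in map2prot.R: the function names `map_peptides` and `sequence_coverage` are hypothetical, the protein and peptides are made up, and locating peptides by exact string matching is an assumption.

```r
# Hypothetical helper: returns a logical vector with one element per residue,
# TRUE where at least one peptide covers that residue.
map_peptides <- function(protein, peptides) {
  covered <- logical(nchar(protein))
  peptides <- unique(peptides[nzchar(peptides)])  # drop empty strings and repeats
  for (pep in peptides) {
    # A lookahead pattern reports every start position, including overlapping ones.
    starts <- gregexpr(paste0("(?=", pep, ")"), protein, perl = TRUE)[[1]]
    if (starts[1] == -1) next  # peptide does not occur in the protein
    for (s in starts) {
      covered[s:(s + nchar(pep) - 1)] <- TRUE
    }
  }
  covered
}

# Hypothetical helper: sequence coverage as a percentage of residues covered.
sequence_coverage <- function(protein, peptides) {
  100 * mean(map_peptides(protein, peptides))
}

protein  <- "MKWVTFISLLLLFSSAYSRG"        # made-up 20-residue sequence
peptides <- c("KWVTF", "LLFSS", "XXXX")   # the last one matches nothing

sequence_coverage(protein, peptides)      # 50 (10 of 20 residues covered)
```

Keeping the per-residue logical vector separate from the percentage makes it straightforward to reuse the same vector for drawing, since each TRUE run corresponds to a block that could be plotted under the sequence.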

The current version of the script (v1.0.0) supports protein sequences of up to 700 amino acid residues.

The inputs are:

  • a text file containing the sequences of the peptides;
  • a *.fasta file containing one or more protein sequences, plus the index number of the sequence you want to use (1 for the first or only sequence, 2 for the second sequence, and so on).
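
To illustrate what the index number means for a multi-sequence *.fasta file, here is a small sketch in base R. The script itself relies on seqinR for reading FASTA files; the hand-written parser below is a stand-in so that the example runs without extra packages, and the function name `read_fasta_entry` and the demo sequences are hypothetical.

```r
# Hypothetical helper: return the header and sequence of the n-th FASTA entry.
read_fasta_entry <- function(path, index = 1) {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines)]           # ignore blank lines
  is_header <- startsWith(lines, ">")
  entry_id  <- cumsum(is_header)          # 1 for the first entry, 2 for the second, ...
  n_entries <- sum(is_header)
  if (index < 1 || index > n_entries) {
    stop("index must be between 1 and ", n_entries)
  }
  keep <- entry_id == index
  list(
    header   = sub("^>", "", lines[keep & is_header]),
    sequence = toupper(paste(lines[keep & !is_header], collapse = ""))
  )
}

# Demo with a made-up two-entry file written to a temporary location.
fasta_path <- tempfile(fileext = ".fasta")
writeLines(
  c(">first example", "MKWVTF", "ISLL", ">second example", "GASPVT"),
  fasta_path
)

entry <- read_fasta_entry(fasta_path, index = 2)
entry$header      # "second example"
entry$sequence    # "GASPVT"

# v1.0.0 is stated to handle sequences of up to 700 residues,
# so a length check before mapping is a reasonable precaution.
nchar(entry$sequence) <= 700
```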

If the file containing the peptide sequences is called 'peptides.txt', it will be interpreted as the MaxQuant output file of the same name. If the file is called something else, it will be interpreted as a plain list (as would be uploaded for a UniProt peptide search, for example).
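
The file-name rule above could be implemented along the following lines. This is a sketch under stated assumptions rather than the script's actual logic: the function name `read_peptides` is hypothetical, the file-name comparison is assumed to be exact and case-sensitive, and the MaxQuant table is assumed to be tab-delimited with the peptide sequences in a column named `Sequence`.

```r
# Hypothetical helper: choose the parser from the file name.
read_peptides <- function(path) {
  if (basename(path) == "peptides.txt") {
    # Assumed MaxQuant layout: tab-delimited table with a 'Sequence' column.
    tab <- read.delim(path, colClasses = "character", check.names = FALSE)
    if (!"Sequence" %in% names(tab)) {
      stop("no 'Sequence' column found in ", path)
    }
    peptides <- tab[["Sequence"]]
  } else {
    # Any other name: plain list, one peptide per line.
    peptides <- readLines(path, warn = FALSE)
  }
  peptides <- trimws(peptides)
  unique(peptides[nzchar(peptides)])
}

# Demo with two made-up files in a temporary directory.
demo_dir <- tempfile("map2prot_demo")
dir.create(demo_dir)

mq_path <- file.path(demo_dir, "peptides.txt")
writeLines(c("Sequence\tLength", "KWVTF\t5", "LLFSS\t5"), mq_path)

list_path <- file.path(demo_dir, "my_list.txt")
writeLines(c("KWVTF", "", "LLFSS", "KWVTF"), list_path)

from_maxquant <- read_peptides(mq_path)
from_list     <- read_peptides(list_path)
identical(from_maxquant, from_list)   # TRUE: both give "KWVTF" and "LLFSS"
```

One practical consequence of a rule like this is that a plain list must not be named 'peptides.txt', otherwise it would be parsed as a MaxQuant table.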

map2prot v1.0.0 was written in R for Windows and requires the R package seqinR. It was developed under R version 4.2.1 and seqinR version 4.2-16, using files generated by MaxQuant version 1.6.0.1. It was tested under R version 4.3.2 and seqinR version 4.2-36, as described in v1.0.0testingsummary.pdf.
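
Before running the script, it can help to compare your own setup with the versions listed above. The sketch below uses only base R; the function name `check_environment` is hypothetical, and treating the development version of R (4.2.1) as a lower bound is an assumption, since no minimum version is stated. Note that the package is installed and loaded under the lowercase name `seqinr`.

```r
# Hypothetical helper: report the R version and whether seqinr is available.
check_environment <- function(r_dev_version = "4.2.1", pkg = "seqinr") {
  installed <- requireNamespace(pkg, quietly = TRUE)
  list(
    r_version      = as.character(getRversion()),
    r_at_least_dev = getRversion() >= r_dev_version,  # assumption: 4.2.1 as lower bound
    pkg_installed  = installed,
    pkg_version    = if (installed) {
      as.character(utils::packageVersion(pkg))
    } else {
      NA_character_
    }
  )
}

env <- check_environment()
str(env)

# If the package is missing, it can be installed from CRAN with:
# install.packages("seqinr")
```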

map2prot was adapted from my earlier work on dependent peptides. Please see the citation metadata for more information.

Files

map2prot.R

The R script (i.e., the tool itself). Usage notes and detailed instructions can be found in the script header.

4f5s_A.fasta

An example of a *.fasta file. It contains the sequence of mature bovine serum albumin (A190T variant). The file originates from PRIDE project PXD013040. The albumin sequence originates from Protein Data Bank accession 4F5S.

peptides.txt

An example of a MaxQuant 'peptides.txt' file. It can be used together with 4f5s_A.fasta to test that the script is working properly. The file originates from PRIDE project PXD013040.

v1.0.0testingsummary.pdf

Document summarising testing/validation of v1.0.0.

Getting started

  1. Download and unzip the source code [Releases > v1.0.0 > Source code (zip)].
  2. Follow the instructions in the script header.
  3. Review the text output in the R console, checking for any warnings.
  4. Review the graphical output. It should look like example_output.jpg.

Acknowledgements

I would like to thank Dr Stuart Warriner (University of Leeds) and Professor Andrew Wilson (University of Birmingham) for supporting this project. My work at the University of Leeds was funded by the Engineering and Physical Sciences Research Council.

Owner

  • Login: preston-gw
  • Kind: user
  • Location: United Kingdom

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: map2prot
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: George
    family-names: Preston
    orcid: 'https://orcid.org/0000-0003-4812-139X'
    affiliation: University of Leeds
repository-code: 'https://github.com/preston-gw/map2prot'
abstract: >-
  map2prot is an R script that maps peptide sequences onto 
  a protein sequence. The main mapping/visualisation code
  was adapted from earlier work (see 
  https://github.com/preston-gw/dependent-peptides). So, 
  the most appropriate thing to cite would be the 
  publication describing that work 
  (DOI: 10.1371/journal.pone.0235263).
keywords:
  - Proteomics
  - MaxQuant
license: MIT
version: 1.0.0
preferred-citation:
  type: article
  authors:
  - family-names: "Preston"
    given-names: "George"
    orcid: "https://orcid.org/0000-0003-4812-139X"
  - family-names: "Yang"
    given-names: "Liping"
  - family-names: "Phillips"
    given-names: "David"
  - family-names: "Maier"
    given-names: "Claudia"
  doi: "10.1371/journal.pone.0235263"
  journal: "PLoS ONE"
  start: e0235263 # Article number
  title: "Visualisation tools for dependent peptide searches 
   to support the exploration of in vitro protein modifications"
  issue: 7
  volume: 15
  year: 2020

GitHub Events

Total
  • Watch event: 1
Last Year
  • Watch event: 1