transposonmapper

This repository contains the code for the transposonmapper package to map the transposons from the SATAY sequencing data.

https://github.com/satay-ll/transposonmapper

Science Score: 77.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
    3 of 11 committers (27.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.3%) to scientific vocabulary

Keywords

conda docker pytest python satay transposon yeast

Keywords from Contributors

mesh sequences interactive hacking network-simulation

Repository

Basic Info
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 3
  • Open Issues: 9
  • Releases: 6
Created over 4 years ago · Last pushed 8 months ago
Metadata Files
Readme Changelog Contributing License Code of conduct Citation Zenodo

README.rst

**************************
SATAY and Transposonmapper
**************************

This workflow processes sequencing data for SAturated Transposon Analysis in Yeast (SATAY) in *Saccharomyces cerevisiae*.
It covers all steps from raw sequencing data to transposon mapping, and outputs files that list every insertion site together with its number of reads.

For more information regarding SATAY, see `the satay user website `_ created by the Kornmann-lab.
For more extensive documentation, `see our JupyterBook `_.

The workflow requires input sequencing data in fastq format.
It can perform the following tasks:

- sequence trimming
- quality checking raw and trimmed fastq files
- sequence alignment to the reference genome (*S. cerevisiae* S288C)
- quality checking bam files, indexing and sorting
- transposon mapping

The output files indicate the location of transposon insertions and the number of reads at those locations.
This is presented in both .bed and .wig format.
A list of genes is also generated, presenting the number and distribution of insertions and reads per (essential) gene.
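
As a rough illustration of how such an output file could be inspected, the sketch below parses BED-like lines and totals the insertion sites and reads per chromosome. It assumes the standard BED column order (chromosome, start, end, name, score) with the read count in the fifth column; whether the real output stores raw or rescaled read counts there is an assumption to check against your own files. The names ``Insertion``, ``parse_bed`` and ``summarize`` are made up for this example and are not part of transposonmapper.

.. code-block:: python

    from collections import defaultdict
    from typing import Dict, Iterable, List, NamedTuple, Tuple


    class Insertion(NamedTuple):
        chrom: str
        position: int
        reads: int


    def parse_bed(lines: Iterable[str]) -> List[Insertion]:
        """Parse BED-like lines (assumed layout: chrom start end name score)."""
        insertions = []
        for line in lines:
            line = line.strip()
            # Skip blank lines and header lines such as "track ..." or "browser ...".
            if not line or line.startswith(("track", "browser", "#")):
                continue
            fields = line.split()
            insertions.append(Insertion(fields[0], int(fields[1]), int(fields[4])))
        return insertions


    def summarize(insertions: Iterable[Insertion]) -> Dict[str, Tuple[int, int]]:
        """Return {chromosome: (number of insertion sites, total reads)}."""
        totals = defaultdict(lambda: [0, 0])
        for ins in insertions:
            totals[ins.chrom][0] += 1
            totals[ins.chrom][1] += ins.reads
        return {chrom: (sites, reads) for chrom, (sites, reads) in totals.items()}


    # Hypothetical example data, not taken from a real experiment.
    sample_lines = [
        'track name="example" useScore=1',
        "chrI 100 101 . 12",
        "chrI 250 251 . 3",
        "chrII 40 41 . 7",
    ]
    summary = summarize(parse_bed(sample_lines))
    print(summary)

The same loop could be restricted to a coordinate range to count insertions inside a single gene, provided the gene coordinates are known.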

.. list-table::
   :widths: 25 25
   :header-rows: 1

   * - 
     - Badges
   * - **fair-software.nl recommendations**
     - 
   * - \1. Code repository
     - |GitHub Badge| |GitHub Size Badge|
   * - \2. License
     - |License Badge|
   * - \3. Community Registry
     - |Pypi Badge| |Docker Badge|
   * - \4. Enable Citation
     - |Zenodo Badge|
   * - \5. Checklists
     - |Howfairis Badge|
   * - **Code quality checks**
     -
   * - Continuous integration
     - |CI Build| |CI Publish| |CI Book|
   * - Documentation
     - |JupyterBook Badge| 
   * - Code Quality
     - |Sonarcloud Quality Gate Badge| |Sonarcloud Coverage Badge|

.. |GitHub Badge| image:: https://img.shields.io/badge/github-repo-000.svg?logo=github&labelColor=gray&color=blue
   :target: https://github.com/SATAY-LL/Transposonmapper
   :alt: GitHub Badge

.. |GitHub Size Badge| image:: https://img.shields.io/github/repo-size/SATAY-LL/Transposonmapper
   :alt: GitHub repo size

.. |License Badge| image:: https://img.shields.io/github/license/SATAY-LL/Transposonmapper
   :target: https://github.com/SATAY-LL/Transposonmapper
   :alt: License Badge

.. |Pypi Badge| image:: https://img.shields.io/pypi/v/transposonmapper?color=blue
   :target: https://pypi.org/project/transposonmapper
   :alt: Pypi Badge
  
.. |Docker Badge| image:: https://img.shields.io/docker/automated/leilaicruz/satay
   :target: https://hub.docker.com/r/leilaicruz/satay
   :alt: Docker Automated build

.. |Zenodo Badge| image:: https://zenodo.org/badge/DOI/10.5281/zenodo.5521578.svg
   :target: https://doi.org/10.5281/zenodo.5521578
   :alt: Zenodo Badge

.. |Howfairis Badge| image:: https://img.shields.io/badge/fair--software.eu-%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F%20%20%E2%97%8F-green
   :target: https://fair-software.eu
   :alt: Howfairis badge

.. |CI Build| image:: https://github.com/SATAY-LL/Transposonmapper/actions/workflows/CI_build.yml/badge.svg
   :alt: Continuous integration workflow
   :target: https://github.com/SATAY-LL/Transposonmapper/actions/workflows/CI_build.yml
   
.. |CI Publish| image:: https://github.com/SATAY-LL/Transposonmapper/actions/workflows/CI_publish.yml/badge.svg
   :alt: Continuous integration workflow
   :target: https://github.com/SATAY-LL/Transposonmapper/actions/workflows/CI_publish.yml

.. |CI Book| image:: https://github.com/SATAY-LL/Transposonmapper/actions/workflows/CI_deploy_book.yml/badge.svg
   :alt: CI to build and deploy jupyterbook in gh-pages
   :target: https://github.com/SATAY-LL/Transposonmapper/actions/workflows/CI_deploy_book.yml

.. |JupyterBook Badge| image:: https://img.shields.io/badge/docs-JupyterBook-green
   :alt: Jupyter Book documentation
   :target: https://satay-ll.github.io/Transposonmapper/Introduction.html

.. |Sonarcloud Quality Gate Badge| image:: https://sonarcloud.io/api/project_badges/measure?project=SATAY-LL_Transposonmapper&metric=alert_status
   :target: https://sonarcloud.io/dashboard?id=SATAY-LL_Transposonmapper
   :alt: Sonarcloud Quality Gate

.. |Sonarcloud Coverage Badge| image:: https://sonarcloud.io/api/project_badges/measure?project=SATAY-LL_Transposonmapper&metric=coverage
   :target: https://sonarcloud.io/component_measures?id=SATAY-LL_Transposonmapper&metric=Coverage&view=list
   :alt: Sonarcloud Coverage

***********************
Documentation for users
***********************

PyPI package
============

Users who only need post-processing analysis of the data (i.e. the bam file has already been processed)
can use the default installation. It does not install ``pysam``, so Linux is not required.


.. code-block:: console

   pip install transposonmapper 

Users who need the whole processing pipeline should install the ``linux`` extra:


.. code-block:: console

   pip install transposonmapper[linux]



For more extensive documentation, `see our JupyterBook `_.
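
One quick way to see which of the two installations is present in the current environment is to check whether ``pysam`` can be imported. The sketch below uses only the Python standard library; the helper name ``has_module`` is made up for this illustration and is not part of transposonmapper.

.. code-block:: python

    import importlib.util


    def has_module(name: str) -> bool:
        """Return True if a top-level module called `name` can be imported here."""
        return importlib.util.find_spec(name) is not None


    if has_module("pysam"):
        print("pysam found: the full processing pipeline should be usable")
    else:
        print("pysam not found: only post-processing analysis is available")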

SATAY pipeline
==============

.. image:: https://user-images.githubusercontent.com/11459658/134164634-0806ce7a-4cae-4040-9ea4-e93a27b0b4b3.png
   :width: 400
   :align: center

We provide two methods to run the SATAY pipeline: in a Docker container (recommended) or on a Linux system. The workflow relies
on the following libraries:

- `FASTQC `_ v0.11.9 or later
- `BBMap `_ v38.87 or later
- `Trimmomatic `_ v0.39 or later
- `BWA `_ v0.7.17 or later
- `SAMTools `_ v1.10 or later
- `BCFTools `_ v1.10.2-3 or later
- `Sambamba `_ v0.7.1 or later
- `Transposonmapper `_

The script `satay.sh `_ chains these libraries into a processing pipeline
and opens a GUI to control it.

**Preprocessing steps**


Before running the satay pipeline, the data received from the sequencing company must be preprocessed.

The pipeline does not process each digestion separately, so any preprocessing and trimming of the restriction sites should be done **prior** to running the pipeline.

**What we do if the sequencing data is paired end:**


If the data is paired end, only one read of each pair maps to the transposon insertion site (the end that has been sequenced from the sequencing primer). The other end maps to a location arbitrarily far upstream or downstream of the insertion site, depending on where the restriction site is.

**Preprocessing steps prior to using the satay pipeline:**

- Convert the data to single end by:

  - Extracting the forward reads, i.e. the reads that contain the sequencing primer, either requiring an exact primer match (harsh filtering) or allowing some mismatches in the primer to account for likely sequencing errors (gentle filtering).

- Remove the sequence downstream of the first restriction site for NlaIII and DpnII, to avoid chimeric sequences in the data, which align poorly.

  - Discard reads shorter than 50 bp after trimming at the restriction site, to ensure a reasonably confident alignment score for each remaining read.
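
The preprocessing steps above can be sketched in plain Python as follows. This is a minimal illustration, not the procedure used by the authors: it assumes the sequencing primer sits at the very start of the read, that the read is cut at the start of the first restriction site found after the primer, and that the 50 bp threshold applies to the whole trimmed read. The primer sequence and all function names are hypothetical; ``CATG`` and ``GATC`` are the recognition sequences of NlaIII and DpnII.

.. code-block:: python

    from typing import Optional

    RESTRICTION_SITES = ("CATG", "GATC")  # NlaIII and DpnII recognition sequences
    MIN_LENGTH = 50  # reads shorter than this after trimming are discarded


    def hamming(a: str, b: str) -> int:
        """Number of mismatching positions between two equal-length strings."""
        return sum(x != y for x, y in zip(a, b))


    def has_primer(read: str, primer: str, max_mismatches: int = 0) -> bool:
        """True if the read starts with the primer, allowing some mismatches."""
        if len(read) < len(primer):
            return False
        return hamming(read[:len(primer)], primer) <= max_mismatches


    def trim_at_first_site(read: str, start: int = 0) -> str:
        """Cut the read at the first restriction site at or after `start`."""
        hits = [read.find(site, start) for site in RESTRICTION_SITES]
        hits = [i for i in hits if i != -1]
        return read[:min(hits)] if hits else read


    def preprocess(read: str, primer: str, max_mismatches: int = 0) -> Optional[str]:
        """Return the trimmed forward read, or None if the read is rejected."""
        if not has_primer(read, primer, max_mismatches):
            return None  # not a forward read
        trimmed = trim_at_first_site(read, start=len(primer))
        return trimmed if len(trimmed) >= MIN_LENGTH else None


    PRIMER = "ACGTAC"  # hypothetical primer, for illustration only
    read_ok = PRIMER + "A" * 60
    read_trim = PRIMER + "T" * 50 + "GATC" + "A" * 20
    read_short = PRIMER + "T" * 10 + "CATG" + "A" * 60
    read_mismatch = "ACGTAA" + "A" * 60  # one mismatch in the primer

    print(preprocess(read_trim, PRIMER))
    print(preprocess(read_short, PRIMER))

With ``max_mismatches=0`` the function performs the harsh filtering described above; a small positive value gives the gentle variant.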
    

Docker
------

For a full installation and user guide for Docker containers, 
`see our documentation `_.

The Docker image is hosted at `leilaicruz/satay `_.

Prerequisites:

- Windows, macOS, Linux
- Docker 
- Xserver (for displaying the GUI)

To get the image from DockerHub onto your computer:

- Create an account on DockerHub
- Pull the image

.. code-block:: console

   docker pull leilaicruz/satay:latest


- Verify that the image is on your computer


.. code-block:: console

   docker images

- Move to the directory that contains the Dockerfile and build the image


.. code-block:: console

   docker build . -t leilaicruz/satay:latest

- Move to the directory containing the data you would like to mount in the container, so that you can use ``$(pwd)`` in the commands below (the simplest option). Otherwise, give the absolute path of the directory on your computer that you would like to mount.


To run the docker container, use the commands for your Operating System:

.. code-block:: console

    # For Windows (and WSL):
    docker run --rm -it -e DISPLAY=host.docker.internal:0 -v /$(pwd):/data leilaicruz/satay:latest

    # For macOS
    docker run --rm -it -e DISPLAY=docker.for.mac.host.internal:0 -v $(pwd):/data leilaicruz/satay

    # For Linux
    docker run --rm -it --net=host -e DISPLAY=:0 -v $(pwd):/data leilaicruz/satay

- The flag ``-e`` sets the ``DISPLAY`` environment variable, which lets the GUI be shown outside the container via the Xserver
- The flag ``-v`` mounts the current directory (``pwd``) on the host system to the ``/data`` folder inside the container
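
For scripted use, the three commands above could be assembled from Python. The sketch below only builds the argument list and does not start Docker; the function name ``docker_run_command`` and the ``DISPLAY_BY_SYSTEM`` table are made up for this example, and the display values are copied from the commands listed above. Under WSL, ``platform.system()`` reports ``Linux``, so the Windows settings would have to be requested explicitly.

.. code-block:: python

    import platform
    from typing import List

    IMAGE = "leilaicruz/satay:latest"

    # DISPLAY values taken from the per-OS commands in this README.
    DISPLAY_BY_SYSTEM = {
        "Windows": "host.docker.internal:0",
        "Darwin": "docker.for.mac.host.internal:0",  # macOS
        "Linux": ":0",
    }


    def docker_run_command(data_dir: str, system: str = "") -> List[str]:
        """Build the `docker run` argument list for the given host system."""
        system = system or platform.system()
        if system not in DISPLAY_BY_SYSTEM:
            raise ValueError(f"Unsupported system: {system!r}")
        cmd = ["docker", "run", "--rm", "-it"]
        if system == "Linux":
            cmd.append("--net=host")
        cmd += [
            "-e", f"DISPLAY={DISPLAY_BY_SYSTEM[system]}",
            "-v", f"{data_dir}:/data",  # mount the host directory at /data
            IMAGE,
        ]
        return cmd


    # Hypothetical absolute path; replace it with your own data directory.
    print(" ".join(docker_run_command("/home/user/satay-data", system="Linux")))

The resulting list can be passed to ``subprocess.run`` once the path has been checked.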

**Troubleshooting**

If an error regarding the connection pops up:

.. code-block:: console

    Gtk-WARNING **: cannot open display: :0

On Linux, a possible solution is to run the following command in the terminal: ``xhost +``
      
 

Linux system
------------

Prerequisites:

- Anaconda
- Python 3.7, 3.8

We recommend installing all dependencies in a conda environment:

.. code-block:: console

    git clone https://github.com/SATAY-LL/Transposonmapper.git satay
    cd satay
    conda env create --file conda/environment-linux.yml
    conda activate satay-linux

To start the GUI, simply run

.. code-block:: console

    bash satay.sh


****************************
Documentation for developers
****************************

Installation
============

To install transposonmapper, do:

.. code-block:: console

    git clone https://github.com/SATAY-LL/Transposonmapper.git
    cd Transposonmapper
    conda env create --file conda/environment-dev.yml
    conda activate satay-dev
    pip install -e .[dev]

Run tests (including coverage) with:

.. code-block:: console
    
    pytest





Docker image
============

For more information, see our `Jupyter Book <https://satay-ll.github.io/Transposonmapper/03-docker-doc/03-Docker-Developers.html>`_.




Contributing
============
If you want to contribute to the development of transposonmapper and the SATAY pipeline,
have a look at the `contribution guidelines `_.


************
Contributors
************

This software is part of the research effort of the `LaanLab `_,
Department of BioNanoScience, Delft University of Technology.

- Leila Iñigo de la Cruz
- Gregory van Beek
- Maurits Kok


*******
License
*******

Copyright (c) 2020, Technische Universiteit Delft

Licensed under the Apache License, Version 2.0 (the "License"). 
The 2.0 version of the Apache License, approved by the ASF in 2004, 
helps us achieve our goal of providing reliable and long-lived software products 
through collaborative open source software development.

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


Owner

  • Name: SATAY_LL
  • Login: SATAY-LL
  • Kind: organization

This organization hosts all the repositories related to our satay pipeline.

Citation (CITATION.cff)

# YAML 1.2
---
abstract: "A library for processing sequencing data for SAturated Transposon Analysis in Yeast (SATAY)."
authors: 
  -
    affiliation: "Technische Universiteit Delft"
    family-names: "Iñigo de la Cruz"
    given-names: "Leila "
    orcid: "https://orcid.org/0000-0003-0852-9219"
  -
    affiliation: "Erasmus MC Rotterdam, department of Hematology"
    family-names: "van Beek"
    given-names: Gregory
    orcid: "https://orcid.org/0000-0003-0283-7069"
  -
    affiliation: "Technische Universiteit Delft,Digital Competence Center"
    family-names: Kok
    given-names: Maurits
    orcid: "https://orcid.org/0000-0002-0564-2614"
cff-version: "1.1.0"
keywords: 
  - "transposon-mapping"
  - "Saccharomyces Cerevisiae"
license: "Apache-2.0"
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/SATAY-LL/Transposonmapper"
title: transposonmapper
doi: 10.5281/zenodo.4636301
version: 1.1.5
...

GitHub Events

Total
  • Issue comment event: 1
  • Push event: 1
  • Pull request event: 1
Last Year
  • Issue comment event: 1
  • Push event: 1
  • Pull request event: 1

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 714
  • Total Committers: 11
  • Avg Commits per committer: 64.909
  • Development Distribution Score (DDS): 0.664
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Leila Inigo de la Cruz l****z@g****m 240
Gregory94 g****k@t****l 155
Maurits Kok m****k@t****l 113
Gregory94 G****k@t****l 67
Gregory van Beek g****k@t****t 52
Leila Iñigo de la Cruz l****z 43
Gregory94 g****4@g****m 30
Leila Iñigo De La Cruz - TNW l****z@t****t 6
dependabot[bot] 4****] 4
Maurits Kok m****k@g****m 3
Wteunisse t****l@g****m 1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 48
  • Total pull requests: 33
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 18 hours
  • Total issue authors: 4
  • Total pull request authors: 4
  • Average comments per issue: 1.79
  • Average comments per pull request: 1.3
  • Merged pull requests: 28
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mwakok (29)
  • leilaicruz (17)
  • T-Wisse (1)
  • EKingma (1)
Pull Request Authors
  • leilaicruz (27)
  • Capspar (3)
  • mwakok (3)
  • EKingma (1)
Top Labels
Issue Labels
bug (5) enhancement (3) documentation (2) validation (1) update (1) help wanted (1)
Pull Request Labels
enhancement (2) bug (2) documentation (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 11 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 6
  • Total maintainers: 3
pypi.org: transposonmapper

A library for processing sequencing data for SAturated Transposon Analysis in Yeast (SATAY)

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 11 Last month
Rankings
Dependent packages count: 10.1%
Forks count: 16.9%
Dependent repos count: 21.6%
Average: 22.0%
Stargazers count: 27.8%
Downloads: 33.8%
Maintainers (3)
Last synced: 6 months ago

Dependencies

.github/workflows/CI_build.yml actions
  • SonarSource/sonarcloud-github-action master composite
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
.github/workflows/CI_deploy_book.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v1 composite
  • peaceiris/actions-gh-pages v3.6.1 composite
.github/workflows/CI_publish.yml actions
  • actions/checkout v2 composite
  • actions/checkout v1 composite
  • actions/setup-python v1 composite
  • docker/build-push-action v2 composite
  • docker/login-action v1 composite
  • docker/setup-buildx-action v1 composite
  • pypa/gh-action-pypi-publish master composite
Dockerfile docker
  • continuumio/miniconda3 4.10.3 build
conda/environment.yml pypi
  • matplotlib *
  • scipy *
setup.py pypi
  • numpy *