albatradis

Comparative TraDIS analysis

https://github.com/quadram-institute-bioscience/albatradis

Keywords

bioinformatics bioinformatics-pipeline genomics global-health infectious-diseases microbial-genomics next-generation-sequencing pathogen research sequencing tradis transposon-mutagenesis

Last synced: 9 months ago · JSON representation

Repository

Comparative TraDIS analysis

Basic Info

Host: GitHub
Owner: quadram-institute-bioscience
License: gpl-3.0
Language: Gnuplot
Default Branch: master
Homepage:
Size: 19.6 MB

Statistics

Stars: 9
Watchers: 4
Forks: 4
Open Issues: 3
Releases: 0

Topics

bioinformatics bioinformatics-pipeline genomics global-health infectious-diseases microbial-genomics next-generation-sequencing pathogen research sequencing tradis transposon-mutagenesis

Created over 7 years ago · Last pushed over 3 years ago

Metadata Files

Readme Changelog License

AlbaTraDIS

Introduction

AlbaTraDIS is a software application for performing rapid large-scale comparative analysis of TraDIS experiments whilst also predicting the impact of inserts on nearby genes. It allows for experiements with multiple conditions to be easily analysed using statistical methods developed in the QuaTraDIS toolkit.

Installation

The software in this repository is straightforward to install, however the QuaTraDIS toolkit upon which it depends is more complex.
We recommend using docker as this should guarantee portability, however this is a little trickier to run.
Alternatively use conda. Finally if you are brave and running a debian based linux distro (e.g. Ubuntu) then follow the instructions to install from source.

Conda

The easiest way to install bioinformatics software is with Conda. Once conda is installed, run this command:

conda install -c conda-forge -c bioconda albatradis AlbaTraDIS will be immediately available to use.

Docker

Install Docker which works on Linux, OSX and Windows. There is a docker container which gets automatically built from the latest version of AlbaTraDIS. To use it you would use a command such as this (substituting in your filename/directories), using the example file in this repository: docker run --rm -it -u $(id -u ${USER}):$(id -g ${USER}) -v /path/to/data:/data sbastkowski/albatradis:latest albatradis <program arguments>

From source on Ubuntu/Debian

To install AlbaTraDIS on Ubuntu or Debian first install Quatradis: sudo apt-get update -qq && \ sudo apt-get install -y bzip2 default-jre gcc locales make python3-pip r-base unzip wget && \ sudo apt-get install -y bwa minimap2 smalt pip3 install Bio cython pysam numpy pytest-cov semantic-version snakeviz sudo Rscript -e "install.packages('BiocManager')" -e "BiocManager::install()" -e "BiocManager::install(c('edgeR','getopt', 'MASS'))" git clone https://github.com/quadram-institute-bioscience/QuaTradis.git cd QuaTradis make dev

Next install albatradis git clone https://github.com/quadram-institute-bioscience/albatradis.git pip3 install --user cython wheel albatradis These instructions were tested on Ubuntu 20. Ubuntu 18 or previous may install a version of R < 3.6. We need R 3.6 for this to work.

To install R 3.6 on ubuntu < 19:

```

Add R debian repo to package list file /etc/apt/sources.list (replace 'bionic' with your distro version if necessary)

sudo sh -c "echo 'deb https://cloud.r-project.org/bin/linux/ubuntu bionic-cran35/' >> /etc/apt/sources.list"

Add keys for R repo

gpg --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9 gpg -a --export E298A3A825C0D65DFD57CBB651716619E084DAB9 | sudo apt-key add -

Remove the old version of R

sudo apt-get remove r-base r-base-core

Update packages and install new version

sudo apt-get update sudo apt-get install r-base

Confirm R version is 3.6

R --version ```

Usage

albatradis

This is the main script for the application.

``` usage: albatradis [options] EMBLfile conditionplotfiles controlplotfiles

Tradis analysis

positional arguments: emblfile Annotation file in EMBL format plotfiles Input plot files (optionally gzipped). There must be an equal number of condition and control files

optional arguments: -h, --help show this help message and exit --spangaps SPANGAPS, -s SPANGAPS Span a gap if it is this multiple of a window size (default: 1) --iterations ITERATIONS, -i ITERATIONS No. of times to rescan (default: 1) --minimumblock MINIMUMBLOCK, -b MINIMUMBLOCK Minimum number of reads which must be in 1 block in comparison (default: 100) --minimumlogfc MINIMUMLOGFC, -f MINIMUMLOGFC Minimum log fold change +/- (default: 1) --minimumlogcpm MINIMUMLOGCPM, -c MINIMUMLOGCPM Minimum log counts per million +/- (default: 8.0) --minimumthreshold MINIMUMTHRESHOLD, -m MINIMUMTHRESHOLD Only include insert sites with this number or greater insertions (default: 5) --minimumproportioninsertions MINIMUMPROPORTIONINSERTIONS, -d MINIMUMPROPORTIONINSERTIONS If the proportion of insertions is too low compared to control, dont call decreased insertions below this level (default: 0.1) --dontnormaliseplots, -n Dont normalise input plots (default: False) --prefix PREFIX, -o PREFIX Output directory prefix (default: output) --pvalue PVALUE, -p PVALUE Dont report anything above this p-value (default: 0.05) --qvalue QVALUE, -q QVALUE Dont report anything above this q-value (default: 0.05) --strictsignal, -g A result must be present in the combined plots to be returned (default: False) --useannotation, -a Use the reference annotation rather than a sliding window (default: False) --primefeaturesize PRIMEFEATURESIZE, -z PRIMEFEATURESIZE Feature size when adding 5/3 prime block when --useannotation (default: 198) --windowinterval WINDOWINTERVAL, -l WINDOWINTERVAL Window interval (default: 25) --windowsize WINDOWSIZE, -w WINDOWSIZE Window size (default: 100) --verbose, -v Print out more information about the analysis while it runs (default: False) --debug Turn on debugging (default: False) --version show program's version number and exit ```

Positional arguments

emblfile: This is an annotated reference genome in EMBL format. It can be downloaded from the EBI website.

plot_files: These are insert site plot files, generated using bacteria_tradis script. The same reference genome must be used in all cases and must match the emblfile. Conditions are provided first, followed by controls. The number of conditions must match the number of controls, with a minimum of 1 of each. Ideally you need 2 or more of each.

Optional arguments

span_gaps: When blocks of significat change in insertions are detected they can be fragmented, possibly being at the start and end of a gene and missing in the middle. This option allows you to span these gaps to form more contigous blocks, giving neater results. If you set this too high, then different distinct mechanisms will be merged together, giving you erroneous results.

iterations: You can iteratively look for the highest signals, identify them, report them, then mask them out, and start again. This allows you to progressively identify weaker signals which may be overwhelmed ordinarily. There is no automatic stop, so if you do too many iterations, you will increase the number of false positives.

minimum_block: This is the minimum number of reads which must be in a block/gene to be considered. If you do a scatter plot experimental variation, low abundance equates to greater variability. This hard minimum threshold eliminates a great deal of noise.

minimum_logfc: The minimum log fold change in insertion sites between conditions to consider as significant. It must be an integer.

minimum_logcpm: The minimum log counts per million to consider. It must be an integer and this is approximately equivalent to relative abundance.

minimum_threshold: There can be random bits of DNA which makes it through to sequencing, but they are generally uniformly scattered throughout at low frequency. This controls the minimum threshold. Any insert site with less than this number will be set to zero at the start of the experiment.

minimumproportioninsertions: Often experiments produce different numbers of insertions, its just a natural part of the protocol. If the difference between the condition and controls are too extreme, then the statistics start to break down, particularly when calling decreased numbers of insertions, since the absense of data is probably due to the overall number of insertions/reads rather than something real. This is the minimum proportion allowable.

dontnormaliseplots: By default plots are normalised to the number of reads in the largest plot file, helping to make the statistics and plots look more uniform. You can turn this off if you wish.

prefix: The prefix of the output directory. You'll probably want to change this each time you run the script.

pvalue: Dont report anything above this p-value. You may want to reduce the value further, depending on how adventurous you are. If you set it too high you will get more erroneous results, if you set it too low you may not get any results at all.

strict_signal: By default if there is a strong signal in the forward or reverse directions above the thresholds, it get reported. You can make this even stricter, by also requiring that the combined data must also have a significant signal. This will reduce the number of genes identified, however may also reduce erroneous signals.

use_annotation: By default the software uses a sliding window to identify significant signals. You can choose instead to use the annotated genes passed in via the input EMBL file. Each gene also has a 5' and 3' feature added, so that you dont miss signals in the intergenic regions and also to identify up and down regulation.

primefeaturesize: When you --use_annotation 5' and 3' features are created around each gene. This controls the size of those features in bases.

window_interval: The number of bases to move along when using sliding windows. Ensure it is less than or equal to the windowssize otherwise you will miss parts of the genome. Ideally it should be a maximum of half the windowsize.

window_size: The size of the sliding window in bases. If you set this too high you will only get very strong signals so will miss quite a bit. If you set this too low you will get a huge amount of false positives due to the natural experimental variation. The window size should be about 10 times the average insertion density, so if there are insertions every 10 bases, the window size should be 100 bases.

Output files

annotation.embl: This file contains a modified version of the input annotation file in EMBL format. For example, it adds 5' and 3' features to each gene. It can be visualised in Artemis.

gene_report.csv: This is the main results file in the output of the script. It contains a tab delimited spreadsheet detailing genes identified as being interesting, in the below format. The first column is the gene name derived from the input annotation file. If a signal is identified in an intergenic region, the start and end coordinates are given. The next column is a categorisation of the mechanism, such as up or down regulation, knockout, unclassified etc.... The 3rd and 4th columns are the coordinates of the start and end of the gene, relative to the input annotation file. The MaxLogFC is the maximum log fold change in the signal observed in the gene (or in the 5'/3'). It is rounded to the nearest integer. If the signal is strongest in a single direction, the max log fold change in that direction is reported rather than the value for the combined analysis. The expression column indicates if the gene is experiencing an increase or decrease in insertions. The direction column indicates which direction the significant insertions where primarily detected in (or no direction if both apply). Finally the last column gives the upstream gene, which is often implicated in the mechanism.

| Gene | Category | Start | End | MaxLogFC | Expression | Direction | Upstream | | --- | --- | --- | --- | --- | --- | --- | --- | | zabC | downregulated | 100 | 500 | 1 | increasedinsertions | forward | abc | | yxxY | upregulated | 135 | 234 | 1 | decreasedinsertions | reverse | efg |

regulatedgenereport.csv: This is a filtered version of the gene_report.csv file but only includes genes which are identified as upregulated or downregulated. If no genes are identified with this pattern, the file will not be created. This is really only useful if the experiements included a promotor.

combined.csv: This comma delimited spreadsheet is the output of the Bio-TraDIS toolkit, with additional essentiality categoristations, and an example is listed below. This is the raw data from which the gene_report.csv is derived. It lists each gene or sliding window, and optionally the corresponding 5' and 3' features for a gene. The first 2 columns list the names of the gene or give the coordinates of the sliding window. The 3rd column lists the annotated function of the gene (if available in the annotation file). The numerical columns are derived from EdgeR. The 4th column gives the log fold change between the conditions and the controls. The 5th column gives the log counts per million, which can be thought of as relative abundance. The final column indicate how the essentiality has changed between the conditions and the controls, so a gene can always be non-essential in both the controls and the conditions or essential in all cases. More interestingly though is where there is a change in essentiality between the control and the conditions, indicating a large mechanistic change.

| locustag | genename | function | logFC | logCPM | PValue | q.value | | --- | --- | --- | --- | --- | --- | --- | | thrL | thrL | product | -0.4327 | 4.1269 | 0.5477 | 0.8177 | | thrL5prime | thrL5prime | product | -0.1208 | 4.5885 | 0.8555 | 0.9521 | | thrL3prime | thrL3prime | product | 1.0268 | 4.9723 | 0.1227 | 0.4258 |

forward.csv: This is identical to the combined.csv file, except only insertions in the forward direction were considered during the analysis.

reverse.csv: This is identical to the combined.csv file, except only insertions in the reverse direction were considered during the analysis.

combined.plot: This is the log fold change of each gene or sliding window, in a User plot format suitable for viewing in Artemis. It consists of 2 space delimited integers on each line, where a line corresponds to a base in the reference genome. A positive integer means there has been an increase in insertions, and a negative integer means there has been a decrease in insertions.

forward.plot: This is identical to the combined.plot file, except only insertions in the forward direction were considered during the analysis.

reverse.plot: This is identical to the combined.plot file, except only insertions in the reverse direction were considered during the analysis.

Essentiality.txt: This file is a comma delimited spreadsheet, which contains information on gene essentiality. The first column states the gene name, second column states its essentiality status (always essential, probably always essential, conditionally essential, probably conditionally essential, essential in control, probably essential in control, always non-essential , probably always non-essential, inconsistent replicates). A gene is classified as "always essential", if it is essential in all condition and all control replicates. It is classified as "probably always essential" if the gene was essential in more than 50% of the replicates in control and condition. A gene is classified as "conditionally essential" if it is essential in all replicates of the condition but in none of the replicates in the control. A gene is classified as "probably conditionally essential" if it is essential in more than 50% of replicates in the condition but in less than 50% of the replicates in the control. A gene is classified as "essential in the control" if it is essential in all replicates of the control but in none of the replicates in the condition. A gene is classified as "probably essential in the control" if it is essential in more than 50% of replicates in the control but in less than 50% of the replicates in the condition. A gene is classified as "always non-essential" if it is non-essential in all replicates in the condition and all the replicates in the control. A gene is classified as "probably always non-essential" if it is non-essential in less than 50% of replicates in the condition and in less than 50% of the replicates in the control. In any other case (e.g. 50% of replicates indicate essentiality and 50% do not) the gene gets the tag "inconsistent replicates" assigned. The third column gives the number of replicates in control in which the gene was classified as essential. The fourth column gives the number of replicates in the condition in which the gene was classified as essential. And the fifth column gives the number of replicates in control and condition (they must be the same as per input requirement).

| Gene | Essentiality | Control | Condition | Replicates | | --- | --- | --- | --- | --- | | lpxD | always essential | 2 | 2 | 2 | | ybhK | always nonessential | 0 | 0 | 2 | | pstC | inconsistent data | 0 | 1 | 2 |

albatradis-presence_absence

After you have run the albatradis script, it produces gene_report files. This script performs comparitive analysis and outputs heatmaps, combined spreadsheets, figures and trees (dendrograms).

``` usage: albatradis-presenceabsence [options] EMBLfile genereports

Take in gene report files and produce a heatmap

positional arguments: emblfile Annotation file in EMBL format genereports Gene report spreadsheets

optional arguments: -h, --help show this help message and exit --prefix PREFIX, -o PREFIX Output directory prefix (default: output) --verbose, -v Print out more information about the analysis while it runs (default: False) --debug Turn on debugging (default: False) --version show program's version number and exit ```

Positional arguments

emblfile: This is an annotated reference genome in EMBL format. It can be downloaded from the EBI website.

genereports: One of the outputs of the albatradis script is a gene_report.csv file. You will have 1 of these for each condition, and so providing all of them here will allow for the comparison of conditions. Its probably best to add the name of your condition into the file name to make it easier to identify in the output.

Optional arguments

prefix: This is the output directory prefix and there are a number of output files.

Output files

all_logfc.csv: A tab delimited spreadsheet containing the log fold change integer value for every gene against every condition. It is usually a huge big sparse matrix, so its best to process it with a script.

| Sample | aaaA | bbbB | cccC | dddD | | ------ | ---- | ---- | ---- | ---- | | Cond1 | 1 | -1 | 0 | 0 | | Cond2 | 0 | 1 | 16 | 0 | | Cond3 | 0 | 8 | 0 | 0 | | Cond4 | 1 | 0 | 9 | 0 |

filtered_logfc.csv: Identical to the all_logfc.csv file, except genes only genes with a significant signal in at least 1 condition are kept. This is much easier to look at manually.

full_heatmap.png: A visual representation of the data in all_logfc.csv in PNG format.

filtered_heatmap.png: A visual representation of the data in filtered_logfc.csv in PNG format.

distancematrixdendrogram.tre: A simple dendrogram (tree) in Newick format of the relationships between the conditions, based on the number of shared genes with significant signals. This tree can be visualised in FigTree or any number of different applications.

njnewicktree.tre: A neighbour joining tree in Newick format, created from a distance matrix, based on the number of shared genes with significant signals. This tree can be visualised in FigTree or any number of different applications.

network.csv: A graph representation of the shared genes between conditions in csv format. It can be interactively visualised with Cytoscape. There is 1 node for each condition. It is connected to one node for each gene that shows a value above 0 in logFC in this condition. One gene node can be connected to musltiple conditions but never to other gene nodes.

uniongenereport.csv: A comma separated spreadsheet in the same format as the gene_report.csv file, consisting of a union of all of the input files. A gene is represented by 1 row.

intersectiongenereport.csv: A comma separated spreadsheet in the same format as the gene_report.csv file, consisting of the intersection of all of the input files. So only genes which are found in every condition (common modes of action) are in the file. A gene is represented by 1 row.

albatradis-gene_reports

This is a helper script that you may never need as the functionality is used within the albatradis-presenceabsence script. You can take multiple genereport.csv files and perform set operations on them. It is useful if you know that a few conditions should be merged together as the mechanisms are identical.

``` usage: albatradis-genereports [options] genereport1.csv gene_report2.csv ...

Manipulate gene_report.csv files, such as performing set operations

positional arguments: genereports Gene report spreadsheets

optional arguments: -h, --help show this help message and exit --prefix PREFIX, -o PREFIX Output directory prefix (default: output) --verbose, -v Print out more information about the analysis while it runs (default: False) --debug Turn on debugging (default: False) --version show program's version number and exit ```

Positional arguments

genereports: One of the outputs of the albatradis script is a gene_report.csv file. You will have 1 of these for each condition.

Optional arguments

prefix: This is the output directory prefix and there are a number of output files.

Output files

uniongenereport.csv: A comma separated spreadsheet in the same format as the gene_report.csv file, consisting of a union of all of the input files. A gene is represented by 1 row.

intersectiongenereport.csv: A comma separated spreadsheet in the same format as the gene_report.csv file, consisting of the intersection of all of the input files. So only genes which are found in every condition (common modes of action) are in the file. A gene is represented by 1 row.

albatradis-scatterplot

This script produces scatterplots of your input data plotted against itself and the controls. It is useful as a QC metric to see if the data is biased. Basically you take sliding windows, count the number of reads in each window, then plot those values against the other condition and against the others. This is on a log scale and the outliers are the interesting points.

``` usage: albatradis-scatterplot [options] --control control1.plot --control control2.plot --condition condition1.plot --condition condition2.plot

Create scatter plot of controls vs conditions

optional arguments: -h, --help show this help message and exit --control CONTROL, -c CONTROL control files (use 2 or more) (default: None) --condition CONDITION, -d CONDITION condition files (use 2 or more) (default: None) --windowsize WINDOWSIZE, -w WINDOW_SIZE Window size (default: 50) --outputfile OUTPUTFILE, -o OUTPUTFILE Output filename prefix (default: scatter) --normalise, -n normalise the files (default: False) --verbose, -v Print out more information while it runs (default: False) --debug Turn on debugging (default: False) --version show program's version number and exit ```

Positional arguments

control_: The insert site plots of the controls, where you must have 2 or more.

condition: The insert site plots of the conditions, where you must have 2 or more.

Optional arguments

window_size: The size of the window in bases. The interval is set to the windowsize.

outputfile: The output file prefix.

normalise: Normalise the input files reads to the input file with the largest number of reads.

Output files

scatter*.png: Images of the scatterplots on a log scale in PNG format.

albatradis-annotation

Take in an EMBL file and add flanking 3 prime and 5 prime annotation. It is used as part of the albatradis --use_annotation feature, so you may not need it, as the annotated file is saved in the output directory.

``` usage: albatradis-annotation [options] EMBLfile

Take in an EMBL file and add flanking 3 prime and 5 prime annotation

positional arguments: emblfile Annotation file in EMBL format

optional arguments: -h, --help show this help message and exit --featuresize FEATURESIZE, -s FEATURE_SIZE Feature size (default: 198) --outputfile OUTPUTFILE, -o OUTPUTFILE Output file (default: output.embl) --verbose, -v Print out more information about the analysis while it runs (default: False) --debug Turn on debugging (default: False) --version show program's version number and exit ```

Positional arguments

emblfile: This is an annotated reference genome in EMBL format. It can be downloaded from the EBI website.

Optional arguments

feature_size: When you --use_annotation 5' and 3' features are created around each gene. This controls the size of those features in bases.

outputfile: The name of the output file

Output files

output.embl: The original EMBL file, plus annotated 5' and 3' features, to give another EMBL file, including the reference genome sequence.

albatradis-artemis_project

Sometimes you want to view the insert site plots in Artemis. It can be quite a manual task to open up different replicates and combinations. This script will generate a project.properties file from a spreadsheet which gets automatically loaded by Artemis (from the current working directory). This then makes it quicker to view multiple different insert site plots.

``` usage: albatradis-artemisproject [options] reference experimentsmetadata.csv

Create an artemis project file

positional arguments: reference reference EMBL file experiments_metadata experiments metadata spreadsheet

optional arguments: -h, --help show this help message and exit --control CONTROL, -c CONTROL control files (can use multiple times) (default: None) --outputfile OUTPUTFILE, -o OUTPUTFILE Output filename (default: project.properties) --verbose, -v Print out more information while it runs (default: False) --debug Turn on debugging (default: False) --version show program's version number and exit ```

Positional arguments

reference: This is an annotated reference genome in EMBL format. It can be downloaded from the EBI website.

experiments_metadata: A comma separated spreadsheet in the format of "Drug,Pathway,DetailedPathway,Impact,MIC,Induction,Rep1,Rep2".

Optional arguments

control: Path to the control files.

outputfile: The name of the Artemis project file. If you change this, then Artemis wont work.

Output files

project.properties: The Artemis project file.

Example data

Some example data to run the albatradis and albatradis-presenceabsence workflows are included in the data folder. The data is from the Triclosan dataset presented in Yasir et al. 2019. Running commands as well as description of the provided data for the albatradis script can be found here and for the albatradis-presenceabsence script here.

License

AlbaTraDIS is free software, licensed under GPLv3.

Feedback/Issues

Please report any issues or to provide feedback please go to the issues page. If you make improvements to the software, please send us the changes though a pull request so that the whole community may benefit from your work.

Citation

"AlbaTraDIS: Comparative analysis of large datasets from parallel transposon mutagenesis experiments", Andrew J. Page, Sarah Bastkowski, Muhammad Yasir, A. Keith Turner, Thanh Le Viet, George M. Savva, Mark A. Webber, Ian G. Charles PLOS Comp Bio; doi: https://doi.org/10.1101/593624

Owner

Name: Quadram Institute Bioscience
Login: quadram-institute-bioscience
Kind: organization

Website: https://quadram.ac.uk/
Repositories: 25
Profile: https://github.com/quadram-institute-bioscience

GitHub Events

Total

Issues event: 1
Watch event: 1
Issue comment event: 1

Last Year

Issues event: 1
Watch event: 1
Issue comment event: 1

Committers

Last synced: over 2 years ago

All Time

Total Commits: 217
Total Committers: 7
Avg Commits per committer: 31.0
Development Distribution Score (DDS): 0.295

Past Year

Commits: 1
Committers: 1
Avg Commits per committer: 1.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Andrew Page	a**e@q**k	153
Sarah Bastkowski	s**i@q**k	44
Daniel Mapleson	d**n@g**m	9
sarah bastkowski	4****i	4
Thanh Lê	t**t@q**k	3
andrewjpage	a**e@g**m	3
Andrea Telatin	a**a@t**m	1

Committer Domains (Top 20 + Academic)

quadram.ac.uk: 3 telatin.com: 1

Issues and Pull Requests

Last synced: 9 months ago

All Time

Total issues: 11
Total pull requests: 12
Average time to close issues: over 1 year
Average time to close pull requests: 3 months
Total issue authors: 9
Total pull request authors: 4
Average comments per issue: 1.82
Average comments per pull request: 0.0
Merged pull requests: 12
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 1.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

pmireland (2)
joeseaman (2)
kongjc123 (1)
erland13 (1)
martinclott (1)
unawareK (1)
sbastkowski (1)
minisciencegirl (1)
francesca-short (1)

Pull Request Authors

sbastkowski (5)
thanhleviet (3)
maplesond (3)
telatin (1)

Top Labels

Issue Labels

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 48 last-month

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 2
Total maintainers: 1

pypi.org: albatradis

Comparitive transposon mutagenesis experiment analysis

Homepage: https://github.com/quadram-institute-bioscience/albatradis
Documentation: https://albatradis.readthedocs.io/
License: GPLv3
Latest release: 1.0.1
published over 6 years ago

Versions: 2
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 48 Last month

Rankings

Dependent packages count: 10.0%

Forks count: 15.3%

Average: 19.1%

Stargazers count: 19.3%

Dependent repos count: 21.7%

Downloads: 29.1%

Maintainers (1)

andrewjpage

Last synced: 9 months ago

albatradis

Science Score: 33.0%

Keywords

Repository

Basic Info

Statistics

Topics

Metadata Files

README.md

AlbaTraDIS

Contents

Introduction

Installation

Conda

Docker

From source on Ubuntu/Debian

Add R debian repo to package list file /etc/apt/sources.list (replace 'bionic' with your distro version if necessary)

Add keys for R repo

Remove the old version of R

Update packages and install new version

Confirm R version is 3.6

Usage

albatradis

Positional arguments

Optional arguments

Output files

albatradis-presence_absence

Positional arguments

Optional arguments

Output files

albatradis-gene_reports

Positional arguments

Optional arguments

Output files

albatradis-scatterplot

Positional arguments

Optional arguments

Output files

albatradis-annotation

Positional arguments

Optional arguments

Output files

albatradis-artemis_project

Positional arguments

Optional arguments

Output files

Example data

License

Feedback/Issues

Citation

Owner

GitHub Events

Total

Last Year

Committers

All Time

Past Year

Top Committers

Committer Domains (Top 20 + Academic)

Issues and Pull Requests

All Time

Past Year

Top Authors

Issue Authors

Pull Request Authors

Top Labels

Issue Labels

Pull Request Labels

Packages

pypi.org: albatradis

Rankings

Maintainers (1)

Dependencies