https://github.com/carolinapb/calculate-gene-abundance

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
  • .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.2%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: CarolinaPB
  • Language: Python
  • Default Branch: master
  • Size: 815 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 5 years ago · Last pushed almost 4 years ago
Metadata Files
Readme

README.md

Assemble transcriptome and calculate gene/transcript abundance

First follow the instructions here:

Step by step guide on how to use my pipelines
Click here for an introduction to Snakemake

ABOUT

This is a pipeline that aligns raw RNA-seq reads (downloaded from SRA) to a genome and quantifies gene abundance. Based on: Pertea, M., Kim, D., Pertea, G. M., Leek, J. T., & Salzberg, S. L. (2016). Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. Nature Protocols, 11(9), 1650–1667. https://doi.org/10.1038/nprot.2016.095
The final product is a table with gene abundances for every sample.

Tools used:

Figure: Pipeline workflow (DAG)

Edit config.yaml with the paths to your files

```yaml
METALIST: /path/to/metalist.txt
ASSEMBLY: /path/to/assembly.fa
ANNOTATION: /path/to/annotation.gff3
OUTDIR: /path/to/outdir
```

  • METALIST - text file with sample details. Each row is a different sample/record. Mandatory columns:
    • Run - SRA run accession; this is the most important column
    • AGE - age of individual. If not applicable, leave blank or add NA
    • tissue - tissue where the sample comes from. If not applicable, leave blank or add NA
  • ASSEMBLY - decompressed fasta file
  • ANNOTATION - annotation gff3 file
  • OUTDIR - directory where snakemake will run and where the results will be written to
    To write the results to the directory you run the pipeline from instead of a new directory, open config.yaml and comment out the line OUTDIR: /path/to/outdir
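
Before starting a run, it can help to check that the METALIST file has the mandatory columns. The following is a minimal sketch of such a check and is not part of the pipeline. It assumes the file is tab-separated with a header row (the README does not state the delimiter); `read_metalist` is a hypothetical helper and the accessions are placeholders.

```python
import csv
import io

# Mandatory column names, as listed above.
REQUIRED_COLUMNS = ("Run", "AGE", "tissue")


def read_metalist(handle, delimiter="\t"):
    """Return one dict per sample; blank AGE/tissue values become 'NA'.

    Assumption: the metalist is delimited text with a header row.
    """
    reader = csv.DictReader(handle, delimiter=delimiter)
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        raise ValueError(f"metalist is missing column(s): {', '.join(missing)}")

    samples = []
    for row in reader:
        run = (row["Run"] or "").strip()
        if not run:
            raise ValueError("metalist contains a row with an empty Run value")
        samples.append({
            "Run": run,
            "AGE": (row["AGE"] or "").strip() or "NA",
            "tissue": (row["tissue"] or "").strip() or "NA",
        })
    return samples


# Placeholder accessions, for illustration only.
example = io.StringIO(
    "Run\tAGE\ttissue\n"
    "SRR0000001\t12\tliver\n"
    "SRR0000002\t\tNA\n"
)
samples = read_metalist(example)
```

With a real file, the same function could be called as `read_metalist(open("/path/to/metalist.txt", newline=""))`.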

RESULTS

The most important results are:

  • _files.txt - dated file with an overview of the files used to run the pipeline (for documentation purposes)
  • reads - directory that contains the alignments
  • results - directory with the final results
  • mergedtranscriptome.gtf - non-redundant set of transcripts
  • results/{SRA}/{SRA}.gtf and results/{SRA}/geneabundance.tab - re-estimated abundances for each sample
  • {SRA}/geneabundance{SRA}.txt - re-estimated abundances for each sample with extra information added (AGE and tissue)
  • results/all_abundances.txt - final file with gene abundances for all samples
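
The last results described above are per-sample abundance tables annotated with AGE and tissue, then combined into one file for all samples. The sketch below illustrates that idea only; it is not the pipeline's own code. The column names `Gene ID` and `TPM`, the helper names, and the accessions are assumptions made for the example.

```python
import csv
import io


def annotate_table(handle, run, age="NA", tissue="NA"):
    """Yield each row of a tab-separated table with sample details added."""
    for row in csv.DictReader(handle, delimiter="\t"):
        row.update({"Run": run, "AGE": age, "tissue": tissue})
        yield row


def combine_tables(tables, metadata):
    """Concatenate annotated rows from every sample.

    tables:   maps a run accession to an open per-sample table
    metadata: maps a run accession to {"AGE": ..., "tissue": ...};
              samples without an entry get "NA" for both fields
    """
    combined = []
    for run, handle in tables.items():
        details = metadata.get(run, {})
        combined.extend(
            annotate_table(
                handle,
                run,
                age=details.get("AGE", "NA"),
                tissue=details.get("tissue", "NA"),
            )
        )
    return combined


# Hypothetical per-sample tables; real tables may have other columns.
tables = {
    "SRR0000001": io.StringIO("Gene ID\tTPM\ngeneA\t10.5\ngeneB\t0.0\n"),
    "SRR0000002": io.StringIO("Gene ID\tTPM\ngeneA\t3.2\n"),
}
metadata = {"SRR0000001": {"AGE": "12", "tissue": "liver"}}

all_abundances = combine_tables(tables, metadata)
```

Each element of `all_abundances` is one gene in one sample, so the combined table can be filtered by Run, AGE or tissue without going back to the per-sample files.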

Owner

  • Login: CarolinaPB
  • Kind: user
  • Location: The Netherlands

Dependencies

requirements.txt pypi
  • aioeasywebdav =2.4.0=py36_1000
  • aiohttp =3.6.2=py36h516909a_0
  • appdirs =1.4.3=py_1
  • async-timeout =3.0.1=py_1000
  • attrs =20.2.0=pyh9f0ad1d_0
  • bcrypt =3.2.0=py36h8c4c3a4_0
  • boto3 =1.14.56=pyh9f0ad1d_0
  • botocore =1.17.56=pyh9f0ad1d_0
  • brotlipy =0.7.0=py36h8c4c3a4_1000
  • c-ares =1.16.1=h516909a_3
  • ca-certificates =2020.6.20=hecda079_0
  • cachetools =4.1.1=py_0
  • cairo =1.16.0=h3fc0475_1005
  • certifi =2020.6.20=py36h9f0ad1d_0
  • cffi =1.14.1=py36h0ff685e_0
  • chardet =3.0.4=py36h9f0ad1d_1006
  • configargparse =1.2.3=pyh9f0ad1d_0
  • cryptography =3.1=py36h45558ae_0
  • datrie =0.8.2=py36h8c4c3a4_0
  • decorator =4.4.2=py_0
  • docutils =0.15.2=py36_0
  • dropbox =10.1.1=pyh9f0ad1d_0
  • expat =2.2.9=he1b5a44_2
  • filechunkio =1.6=py36_0
  • fontconfig =2.13.1=h1056068_1002
  • freetype =2.10.2=he06d7ca_0
  • fribidi =1.0.10=h516909a_0
  • ftputil =3.2=py36_0
  • gettext =0.19.8.1=hc5be6a0_1002
  • gitdb =4.0.5=py_0
  • gitpython =3.1.8=py_0
  • glib =2.65.0=h6f030ca_0
  • google-api-core =1.22.2=py36h9f0ad1d_0
  • google-auth =1.21.1=py_0
  • google-cloud-core =1.4.1=pyh9f0ad1d_0
  • google-cloud-storage =1.31.0=pyh9f0ad1d_0
  • google-crc32c =1.0.0=py36h9c36872_0
  • google-resumable-media =1.0.0=pyh9f0ad1d_0
  • googleapis-common-protos =1.51.0=py36h9f0ad1d_2
  • graphite2 =1.3.13=he1b5a44_1001
  • graphviz =2.42.3=h0511662_0
  • grpcio =1.31.0=py36h769ab6c_0
  • harfbuzz =2.7.2=hee91db6_0
  • icu =67.1=he1b5a44_0
  • idna =2.10=pyh9f0ad1d_0
  • idna_ssl =1.1.0=py36_1000
  • importlib-metadata =1.7.0=py36h9f0ad1d_0
  • importlib_metadata =1.7.0=0
  • jinja2 =2.11.2=pyh9f0ad1d_0
  • jmespath =0.10.0=pyh9f0ad1d_0
  • jpeg =9d=h516909a_0
  • jsonschema =3.2.0=py36h9f0ad1d_1
  • ld_impl_linux-64 =2.34=hc38a660_9
  • libblas =3.8.0=17_openblas
  • libcblas =3.8.0=17_openblas
  • libcrc32c =1.1.1=he1b5a44_2
  • libffi =3.2.1=he1b5a44_1007
  • libgcc-ng =9.3.0=h24d8f2e_16
  • libgfortran-ng =7.5.0=hdf63c60_16
  • libgomp =9.3.0=h24d8f2e_16
  • libiconv =1.16=h516909a_0
  • liblapack =3.8.0=17_openblas
  • libopenblas =0.3.10=pthreads_hb3c22a3_4
  • libpng =1.6.37=hed695b0_2
  • libprotobuf =3.13.0=h8b12597_0
  • libstdcxx-ng =9.3.0=hdf63c60_16
  • libtiff =4.1.0=hc7e4089_6
  • libtool =2.4.6=h516909a_1005
  • libuuid =2.32.1=h14c3975_1000
  • libwebp-base =1.1.0=h516909a_3
  • libxcb =1.13=h14c3975_1002
  • libxml2 =2.9.10=h68273f3_2
  • lz4-c =1.9.2=he1b5a44_3
  • markupsafe =1.1.1=py36h8c4c3a4_1
  • multidict =4.7.5=py36h8c4c3a4_1
  • ncurses =6.2=he1b5a44_1
  • networkx =2.5=py_0
  • numpy =1.19.1=py36h3849536_2
  • openssl =1.1.1g=h516909a_1
  • pandas =1.1.1=py36h831f99a_0
  • pango =1.42.4=h7062337_4
  • paramiko =2.7.2=pyh9f0ad1d_0
  • pcre =8.44=he1b5a44_0
  • pip =20.2.3=py_0
  • pixman =0.38.0=h516909a_1003
  • prettytable =0.7.2=py_3
  • protobuf =3.13.0=py36h831f99a_0
  • psutil =5.7.2=py36h8c4c3a4_0
  • pthread-stubs =0.4=h14c3975_1001
  • pyasn1 =0.4.8=py_0
  • pyasn1-modules =0.2.7=py_0
  • pycparser =2.20=pyh9f0ad1d_2
  • pygraphviz =1.3.1=py36_0
  • pynacl =1.3.0=py36h516909a_1001
  • pyopenssl =19.1.0=py_1
  • pyrsistent =0.16.0=py36h8c4c3a4_0
  • pysftp =0.2.9=py36_0
  • pysocks =1.7.1=py36h9f0ad1d_1
  • python =3.6.11=h4d41432_2_cpython
  • python-dateutil =2.8.1=py_0
  • python-irodsclient =0.8.2=py_0
  • python_abi =3.6=1_cp36m
  • pytz =2020.1=pyh9f0ad1d_0
  • pyyaml =5.3.1=py36h8c4c3a4_0
  • ratelimiter =1.2.0=py36h9f0ad1d_1001
  • readline =8.0=he28a2e2_2
  • requests =2.24.0=pyh9f0ad1d_0
  • rsa =3.1.4=py36_0
  • s3transfer =0.3.3=py36h9f0ad1d_1
  • setuptools =49.6.0=py36h9f0ad1d_0
  • six =1.15.0=pyh9f0ad1d_0
  • smmap =3.0.4=pyh9f0ad1d_0
  • snakemake =5.3.0=py36_1
  • snakemake-minimal =5.3.0=py36_1
  • sqlite =3.33.0=h4cf870e_0
  • tk =8.6.10=hed695b0_0
  • typing_extensions =3.7.4.2=py_0
  • urllib3 =1.25.10=py_0
  • wheel =0.35.1=pyh9f0ad1d_0
  • wrapt =1.12.1=py36h8c4c3a4_1
  • xmlrunner =1.7.7=py_0
  • xorg-kbproto =1.0.7=h14c3975_1002
  • xorg-libice =1.0.10=h516909a_0
  • xorg-libsm =1.2.3=h84519dc_1000
  • xorg-libx11 =1.6.12=h516909a_0
  • xorg-libxau =1.0.9=h14c3975_0
  • xorg-libxdmcp =1.1.3=h516909a_0
  • xorg-libxext =1.3.4=h516909a_0
  • xorg-libxpm =3.5.13=h516909a_0
  • xorg-libxrender =0.9.10=h516909a_1002
  • xorg-libxt =1.1.5=h516909a_1003
  • xorg-renderproto =0.11.1=h14c3975_1002
  • xorg-xextproto =7.3.0=h14c3975_1002
  • xorg-xproto =7.0.31=h14c3975_1007
  • xz =5.2.5=h516909a_1
  • yaml =0.2.5=h516909a_0
  • yarl =1.4.2=py36h516909a_0
  • zipp =3.1.0=py_0
  • zlib =1.2.11=h516909a_1009
  • zstd =1.4.5=h6597ccf_2