https://github.com/animesh/applied-computational-genomics

Applied Computational Genomics Course at UU: Spring 2020

Repository

Basic Info
  • Host: GitHub
  • Owner: animesh
  • License: cc-by-sa-4.0
  • Default Branch: master
  • Homepage:
  • Size: 23.9 MB
Statistics
  • Stars: 0
  • Watchers: 0
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Fork of quinlan-lab/applied-computational-genomics
Created over 3 years ago · Last pushed about 4 years ago

### Applied Computational Genomics Course at UU: Spring 2022
- Faculty: Aaron Quinlan (aquinlan at genetics.utah.edu)
- HSEB 3515B, but Zoom (https://utah.zoom.us/j/95686980443) for the first two weeks
- Teaching assistants:
    - Holly Thorpe
    - Jason Kunisaki
    - Casey Sederman
    - Isabelle Cooperstein  
- Meets Tu and Th from 10:30-11:50, starting January 11, 2022.
- TA Hours (TBD): 
    - Wednesday 12PM - 1PM (https://utah.zoom.us/j/95737998575, pw: 314025)
    - Monday 2PM - 3PM (https://utah.zoom.us/j/95805178811, pw: 278239)
- [Homework Submission Link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)


### Overview
This course will provide a comprehensive introduction to the fundamental concepts and experimental approaches used in the analysis and interpretation of experimental genomics data. It will be structured as a series of lectures covering key concepts and analytical strategies. A diverse range of biological questions enabled by modern DNA sequencing technologies will be explored, with topics including sequence alignment, the identification of genetic variation, structural variation, and ChIP-seq and RNA-seq analysis. Students will learn and apply the fundamental data formats and analysis strategies that underlie computational genomics research. **The primary goal of the course is for students to be grounded in theory and leave the course empowered to conduct independent genomic analyses.**

### Important notes
1. Class participation is expected. Ask a question if you have one!
2. When on Zoom, cameras must be on.

### Grading policy
All assignments are due on the date stated in class. Ten percent of the grade will be deducted for each 24 hours that the assignment is late.

### Course lecture slides
- [reading assignment: 01-Brief-History-of-Bioinformatics.pdf](articles/01-Brief-History-of-Bioinformatics.pdf)
- Jan 11, 2022: Course overview and Intro to UNIX
    - [slides](https://docs.google.com/presentation/d/1B8kvetTDwUe-d7hZuV2NufVOMPM7MCxh_MyP-n9yDZo/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=idl6oq-MxbM)
- Jan 13, 2022: Intro to UNIX, Part 2
    - [slides](https://docs.google.com/presentation/d/1YSXYqCSHUZGRVr00oTttv_v1u83ccPLpF5_TMtW0iRI/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=GIGIUMBumME)
- [reading assignment: 02-Human-Genome-Review.pdf](articles/02-Human-Genome-Review.pdf)
- Jan 18, 2022: Intro to UNIX, Part 3 and Intro to the Human Genome
    - [slides](https://docs.google.com/presentation/d/1304Ueup_n8_vqKjQZh-AV3dDAOs2gCqNgrm8o25nBHo/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=yPlpVIsaRCg)
    - Homework #1: https://gist.github.com/arq5x/c0eb84bce2086fbfbe9184668ef87b31#file-hw1-md
        - due Jan 25 at 11:59PM
        - post answers as `UNID.hw1.txt` to this [link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)
- Jan 20, 2022: Pattern searching in the human genome
    - [slides](https://docs.google.com/presentation/d/1W7bwMLAqCIB9unbv4Kswc8P7cE5ATkMhHYJfwBa64L0/edit?usp=sharing)
    - [youtube](https://youtu.be/ngpwuFh7H5M?t=22)

- Jan 25, 2022: Pattern searching in the human genome and Intro to Data Analysis in RStudio
    - [slides](https://docs.google.com/presentation/d/1KAwoHV03d4eZ6StXmT-ihvZDCzuiXVUaH6C9SOySjhA/edit?usp=sharing) 
    - [youtube](https://www.youtube.com/watch?v=Gs4XIPknksc)
- Jan 27, 2022: Data frames and Importing Data 
    - [slides](https://docs.google.com/presentation/d/12Fq7OaLR7sdfQ4DvS5qEUcQiWteEtc87JNgHJPRtv88/edit?usp=sharing)
    - Homework #2: https://gist.github.com/arq5x/c0eb84bce2086fbfbe9184668ef87b31#file-hw2-md
        - due Feb 3 at 11:59PM
        - post answers as `UNID.hw2.txt` to this [link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)
- Feb 1, 2022: More with data frames, precision v. accuracy, very basic RNA-seq analysis
    - [slides](https://docs.google.com/presentation/d/1-AMIHxuEuU1JJ_RkActGi5F8bv_R9EWBWWas2eh6AuY/edit?usp=sharing)
    - [video](https://youtu.be/yc3HH8Dxhf8)
- Feb 3, 2022: Intro to the tidyverse (guest lecturer: Charlie Murtaugh)
    - [slides](https://docs.google.com/presentation/d/1KpudXaBqi4FtsVTVJDqD8ChqX3M44REs/edit?usp=sharing&ouid=107526144078068918726&rtpof=true&sd=true)
- Feb 8, 2022: DNA sequencing technologies
    - [slides](https://docs.google.com/presentation/d/1N0DO5rlHdbrnNhyDhib8fpYYhfbtFOPcKw_Bpvv-2lA/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=fgbk732NdWI)
    - Homework #3: https://gist.github.com/arq5x/c0eb84bce2086fbfbe9184668ef87b31#file-hw3-md
        - due Feb 17 at 11:59PM
        - post answers as `UNID.hw3.html` to this [link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)
- Feb 10, 2022: FASTQ format and tools
    - [slides](https://docs.google.com/presentation/d/1N0DO5rlHdbrnNhyDhib8fpYYhfbtFOPcKw_Bpvv-2lA/edit?usp=sharing)   
- Feb 15, 2022: Sequence mapping and alignment
    - [slides](https://docs.google.com/presentation/d/1RskyGhXx4Lc6wSvvb_ZuCUJGUiP2RAr9X8bGh9Kz77I/edit?usp=sharing)
    - [youtube](https://youtu.be/QuuKYEp5EUA)
- Feb 17, 2022: Sequence alignment, SAM/BAM format, samtools, and IGV
    - [slides](https://docs.google.com/presentation/d/1_iT3btOZqjPmVb8Ryk5ssMBCMxoQ0MVmasZ6G0luA-c/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=XU8atPxM0VQ)
- Feb 22, 2022: Samtools and IGV
    - [slides](https://docs.google.com/presentation/d/1_iT3btOZqjPmVb8Ryk5ssMBCMxoQ0MVmasZ6G0luA-c/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=XU8atPxM0VQ)
- Feb 24, 2022: Poisson Processes in Biology
    - [slides](https://docs.google.com/presentation/d/18TdXaBIuxi0fmbTKREUH5ogwxIabKlFU9UPhhazEOf8/edit?usp=sharing)
    - [youtube](https://youtu.be/zS13juQqsFU)
- March 1, 2022: Uncertainty in RNA-seq data 
    - [slides](https://docs.google.com/presentation/d/1KMVLhMSqTPcsRkflvNFif6xCp_lnkAF_V73S61b0DyY/edit?usp=sharing)
    - [youtube](https://youtu.be/xItNEtQvYaU)
- March 3, 2022: An introduction to awk and bioawk
    - [slides](https://docs.google.com/presentation/d/1ZfLRLxpc12YqeCt8DWojNwKUuW1WZSp_uTXPK2g2e3o/edit#slide=id.p)
    - [youtube](https://youtu.be/iiFhBvA_wfA) 
- Homework #4: https://gist.github.com/arq5x/c0eb84bce2086fbfbe9184668ef87b31#file-hw4-v3-md
    - due Mar 24 at 11:59PM
    - post answers as `UNID.hw4.txt` to this [link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)

- Mar 15, 2022: Genetic Variation
    - [slides](https://docs.google.com/presentation/d/1JnBiaGG_eJAb1LGUiNaXP4DX-oZKaxaDVg1KFqYeAfA/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=N8nTiBOSsHI)

- Mar 17, 2022: SNP and INDEL discovery (part 1)
    - [slides](https://docs.google.com/presentation/d/1D4XY9XxQiyYcwwhomRRONxCPr_bJvcC0WM4sb8vouZM/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=2ro9WCOpQqI)
- Mar 22, 2022: Rates and patterns of human germline variation
    - [slides]()
    - [youtube]()
- Mar 24, 2022: VCF format, Hardy Weinberg Equilibrium, VCF toolkits
    - [slides](https://docs.google.com/presentation/d/1kt2br-ZcDIzRqx__oTdC8i4NlAhluWX2WsPT_clqMaI/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=FZtWnNghRkA)
- Mar 29, 2022: VCF annotation and interpretation
    - [slides](https://docs.google.com/presentation/d/1DN99IgciDD05b5Ve_Eaym0ORPhuDFAPU1fXBPoOq5Vo/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=M8UfW8RNTKI)
- Homework #5: https://gist.github.com/arq5x/c0eb84bce2086fbfbe9184668ef87b31#file-hw5-2022-md
    - due Mar 18 at 11:59PM
    - post answers as `UNID.hw5.txt` to this [link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)
- Mar 31, 2022: Genome Annotation and Resources
    - [slides](https://docs.google.com/presentation/d/1PU4ADdlmZu9jOkUa_FgrS5ppTJ3CsCIXwn1W9WkzApI/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=ElnZGlzb4qo)
- April 5, 2022: Genome Annotation Formats.
    - [slides - 1](https://docs.google.com/presentation/d/1Eylp9pcU8xEhyBJJvL57pSjSukQdhBnG1sWCvJlCngs/edit?usp=sharing)
    - [slides - 2](https://docs.google.com/presentation/d/1yXFB72CHPiVH8zCKBwBOQg-ssmzS9xUOcTNhcsQgV1c/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=tq3GeDXbZXA)

- April 7, 2022: Genome arithmetic with bedtools
    - [bedtools tutorial](http://quinlanlab.org/tutorials/bedtools/bedtools.html)
    - [bedtools docs](https://bedtools.readthedocs.io/en/latest/index.html#)
    - [youtube](https://www.youtube.com/watch?v=1R1KocKEzYY)
- Apr 12, 2022: Real world analyses with bedtools.
    - [slides](https://docs.google.com/presentation/d/1-LR5tHGbvJtmk5rdyBihzd_9viI15KnTtFOrmAHdjsc/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=qV6Iv1Dco-M)
- Homework #6: solve all 10 puzzles at the end of the bedtools tutorial: http://quinlanlab.org/tutorials/bedtools/bedtools.html
    - due April 26 (last day of classes) at 11:59PM
    - post answers as `UNID.hw6.txt` to this [link](https://uofu.app.box.com/f/462f5bfaaeb14f8ebb2b3c25f0cfab59)
- Apr 14, 2022: Monte Carlo simulations and more on UNIX
    - [slides 1](https://docs.google.com/presentation/d/1-LR5tHGbvJtmk5rdyBihzd_9viI15KnTtFOrmAHdjsc/edit?usp=sharing)
    - [slides 2](https://docs.google.com/presentation/d/186g0U-3M-Cy-wznAoFT6WwpgQCvYwmnegrr8UJ7vIJo/edit#slide=id.p)
    - [youtube - Monte Carlo](https://youtu.be/QQ94LZ-gWqM)
    - [youtube - bash_profile](https://youtu.be/zearEb3guLI)
- Apr 19, 2022: The Normal Distribution
    - [slides](https://docs.google.com/presentation/d/1e1cF_fPRtrZvr1Y8N_Kat4o_Ds0QOStIvJDXC2mcy0Q/edit#slide=id.g82b15b332d_0_0)
- Apr 21, 2022: Descriptive plots. The Central Limit Theorem
    - [slides 1](https://docs.google.com/presentation/d/1bcZKEh-nEq-ELVBMY9NYy4a0Ue5glAk0y17CywY9K1w/edit#slide=id.p)
    - [slides 2](https://docs.google.com/presentation/d/1Weh4t69BeEe8rCFXEsfwlOdhX-wV7I8ceePvwXbUCsc/edit?usp=sharing)
- April 26, 2022:  The t-statistic, t-distribution, t-tests, and p-values
    - [slides](https://docs.google.com/presentation/d/1X1l4UYxEzarF69W5p8PofQiAEoFNOqDlFBBppIfUc_w/edit)
    - [youtube](https://www.youtube.com/watch?v=golFyEZhVa8&feature=youtu.be)

Not covered in 2022's course, but available for reference.


- Apr 13, 2020: Q-Q plots
    - [slides](https://docs.google.com/presentation/d/1e1cF_fPRtrZvr1Y8N_Kat4o_Ds0QOStIvJDXC2mcy0Q/edit#slide=id.g82b15b332d_0_0)
- April 22, 2020:  Introduction to Linear Regression
    - [slides](https://docs.google.com/presentation/d/1ugkYc24AmKVEO0-x-M3qZ4EsSIedbSwyqD6cLN0YIiI/edit#slide=id.gd42443b26c_0_0)
    - [youtube](https://www.youtube.com/watch?v=KekLyPeet3k)
- April 27, 2020:  Introduction to tidyverse
    - [slides](https://docs.google.com/presentation/d/1tdQ5B7LhiAE5-G6sJ6nDeCna463dWjKlAenc5DCde0g/edit#slide=id.p1)
    - [youtube]

- The Central Limit Theorem and Confidence Intervals
    - [slides](https://docs.google.com/presentation/d/1Weh4t69BeEe8rCFXEsfwlOdhX-wV7I8ceePvwXbUCsc/edit?usp=sharing)
- Structural and copy number variation
    - [slides](https://docs.google.com/presentation/d/1h_MApL1p21ye0doXDIJdx8snix1VQInyPZKjPT0AhFM/edit?usp=sharing)
    - [youtube](https://www.youtube.com/watch?v=Skfzw5LwJq0)
- Patterns of Mutation in the Human Genome
    - [slides](https://drive.google.com/file/d/1qWJysIa1XAFZ_qH-kOVRDbXKGt9e9lFZ/view?usp=sharing)






### Homework
- [Homework 1: Basic Unix analysis]
- [Homework 2: DNA Pattern exploration in a FASTA file]
- [Homework 3: Working with the FASTQ format]
- [Homework 4: BAM files, samtools, IGV]
- [Homework 5: Exploring genetic variation in VCF files]
- [Homework 6: Bedtools analysis problems. Bottom of page]
- Homework 7: Probability and R


### Syllabus
- **Class 1 (Tu Jan 9; Layer): Course overview and Intro to UNIX**
    - [Class 1 Slides](https://docs.google.com/presentation/d/1B8kvetTDwUe-d7hZuV2NufVOMPM7MCxh_MyP-n9yDZo/edit?usp=sharing)
    - **Required** Reading Prior to Lecture: 
        - Part 1 of [Unix and Perl Primer for Biologists](http://korflab.ucdavis.edu/Unix_and_Perl/current.pdf)
    - Topics covered
        - Brief history of computational biology
        - Course computing environment
        - Intro. to UNIX: Part 1
            - Logging in
            - The "shell"
            - "Home"
            - Navigation
            - File system
            - Files
            - Basic commands: `ls`, `pwd`, `cd`, `mkdir`, `head`
- **Class 2 (Th Jan 11; Layer): Intro to UNIX Part 2**
    - [Class 2 Slides](https://docs.google.com/presentation/d/1YSXYqCSHUZGRVr00oTttv_v1u83ccPLpF5_TMtW0iRI/edit?usp=sharing)
    - **Required** Reading Prior to Lecture: 
        - Part 2 (Advanced UNIX) of [Unix and Perl Primer for Biologists](http://korflab.ucdavis.edu/Unix_and_Perl/current.pdf)
    - Topics covered
        - Intro. to UNIX: Part 2
          - grep
          - cut
          - redirects
    - **Homework 1 assigned. (due by start of class, Jan 17)**
 
- **Class 3 (Tu Jan 16; Quinlan): The human genome**
    - [Class 3 Slides](https://docs.google.com/presentation/d/1304Ueup_n8_vqKjQZh-AV3dDAOs2gCqNgrm8o25nBHo/edit?usp=sharing)
    - **Required** Reading Prior to Lecture: 
        - [Initial sequencing and analysis of the human genome](http://www.nature.com/nature/journal/v409/n6822/full/409860a0.html)
    - Topics covered
      - Karyotype
      - Chromosome structure
      - Centromeres
      - Banding
      - Chromatin
      - How was the genome sequenced?
        - sequencing technology
        - assembly strategy
      - Chromosomes
        - size
        - gene content
        - centromeres
      - Haplotypes
      - Genes and transcripts
      - Repeat content
        - mobile elements
        - simple repeats
      - GC content, banding
      - CpG islands  
- **Class 4 (Th Jan 18; Quinlan): Using UNIX to find patterns in a genome**
    - **Required** Reading Prior to Lecture:    
        - None.     
    - Topics covered
      - The UNIX PATH
      - Environment variables
      - Basic regular expressions with grep
      - sort
      - uniq
    - **Homework 2 (finding biological patterns in FASTA files with UNIX) assigned**
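
The `grep`, `sort`, and `uniq` workflow listed for this class can also be sketched in Python. This is only an illustration under stated assumptions: the function names `count_motif` and `kmer_counts` and the toy sequence are hypothetical, and a real analysis would read the sequence from a FASTA file rather than a string literal.

```python
import re
from collections import Counter


def count_motif(sequence: str, pattern: str) -> int:
    """Count non-overlapping regex matches in a sequence, ignoring case."""
    return len(re.findall(pattern, sequence, flags=re.IGNORECASE))


def kmer_counts(sequence: str, k: int) -> Counter:
    """Tally every overlapping k-mer, similar in spirit to `sort | uniq -c`."""
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical toy sequence containing two copies of TATAAA.
toy = "ACGTTATAAAGGCGCGTATAAACG"
tata_hits = count_motif(toy, "TATA[AT]A")   # character class, as in a grep regex
cg_dinucleotides = kmer_counts(toy, 2)["CG"]
```

Note that `re.findall` counts non-overlapping matches, so a motif that can overlap itself would need a lookahead pattern or the k-mer tally instead.
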
- **Class 5 (Tu Jan 23; Quinlan): Genetic variation: mutations, polymorphisms, and haplotypes**
    - **Required** Reading Prior to Lecture: 
        - [A global reference for human genetic variation](http://www.nature.com/nature/journal/v526/n7571/full/nature15393.html)
    - Topics covered
      - Genetic variation: what, why, etc.
      - Mutation vs. polymorphism
      - De novo mutation
         - Human mutation rates
      - Polymorphism
      - SNPs and INDELs
        - abundance
        - frequency
        - examples
        - 1000 Genomes
        - Site frequency spectrum
      - Population stratification
      - Intro to haplotypes and recombination

- **Class 6 (Th Jan 25; Quinlan): Modern DNA sequencing technologies**
    - **Required** Reading Prior to Lecture: 
        - [Coming of age: ten years of next-generation sequencing technologies](http://www.nature.com/nrg/journal/v17/n6/full/nrg.2016.49.html)
    - Topics covered
        - Illumina sequencing
            - Overview of technology
            - Paired-end vs. single-end
        - PacBio
        - Oxford Nanopore
        - Base calling
        - FASTQ format
        - seqtk, fastx toolkit
    - **Homework 3 (working with the FASTQ format) assigned**
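
As a rough illustration of the FASTQ format covered in this class, the sketch below parses four-line records and decodes base qualities. It assumes the common Phred+33 encoding and unwrapped (exactly four lines per record) input; the names `read_fastq` and `error_probability` and the toy record are hypothetical, not part of the course materials.

```python
from typing import Iterator, List, Tuple


def read_fastq(lines: List[str]) -> Iterator[Tuple[str, str, List[int]]]:
    """Yield (read name, bases, Phred scores) from 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        header, bases, plus, quals = (l.rstrip("\n") for l in lines[i:i + 4])
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed record starting at line {i + 1}")
        # Assumption: Phred+33 encoding, so quality = ASCII code - 33.
        yield header[1:], bases, [ord(c) - 33 for c in quals]


def error_probability(q: int) -> float:
    """Convert a Phred score Q to the probability that the base call is wrong."""
    return 10 ** (-q / 10)


# Hypothetical single-read example.
toy = ["@read1", "ACGT", "+", "II5!"]
records = list(read_fastq(toy))
```

In practice the lines would come from an open file handle (for example `open(path).readlines()`), and dedicated tools such as seqtk are usually the better choice for large files.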

- **[Class 7 (Tu Jan 30; Quinlan): DNA sequence mapping and alignment](https://docs.google.com/presentation/d/1RskyGhXx4Lc6wSvvb_ZuCUJGUiP2RAr9X8bGh9Kz77I/edit?usp=sharing)**
    - **Required** Reading Prior to Lecture: 
        - [Alignment of Next-Generation Sequencing Reads](http://www.annualreviews.org/doi/abs/10.1146/annurev-genom-090413-025358?journalCode=genom)
    - Topics covered
      - Sequence alignment
          - Theory
          - Mapping versus alignment
          - Local versus global alignment
              - Smith-Waterman
              - Needleman-Wunsch
          - Advanced algorithms
          - Alignment for RNA-seq
          - Alignment for SV detection.
          - Tools
              - BWA, etc.
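
To make the local versus global distinction concrete, here is a minimal sketch of the scoring recurrences behind Needleman-Wunsch (global) and Smith-Waterman (local). It assumes a simple linear gap penalty and hypothetical scores of +1 for a match and -1 for a mismatch or gap; it returns only the optimal score, without traceback, and is not how production aligners such as BWA work.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score, computed one DP row at a time."""
    prev = [j * gap for j in range(len(b) + 1)]  # aligning "" against b[:j]
    for i in range(1, len(a) + 1):
        curr = [i * gap]                         # aligning a[:i] against ""
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a: str, b: str, match: int = 1,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Optimal local alignment score: same recurrence, floored at zero."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


global_score = needleman_wunsch_score("ACGT", "AGT")          # one gap required
local_score = smith_waterman_score("TTTACGTTT", "GGACGGG")    # shared "ACG"
```

The only differences between the two functions are the boundary conditions and the zero floor, which lets a local alignment restart anywhere instead of paying for unaligned flanks.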

- **[Class 8 (Th Feb 1; Quinlan): SAM/BAM format, samtools, and IGV](https://docs.google.com/presentation/d/1_iT3btOZqjPmVb8Ryk5ssMBCMxoQ0MVmasZ6G0luA-c/edit?usp=sharing)**
    - The SAM/BAM format
    - Samtools
    - IGV
    - **Homework 4 (creating and working with SAM/BAM files with samtools and IGV) assigned**

- **[Class 9 (Tu Feb 6; Quinlan): SNP and INDEL discovery (part 1)](https://docs.google.com/presentation/d/1D4XY9XxQiyYcwwhomRRONxCPr_bJvcC0WM4sb8vouZM/edit?usp=sharing)**
    - **Required** Reading Prior to Lecture: 
        - [A framework for variation discovery and genotyping using next-generation DNA sequencing data](http://www.nature.com/ng/journal/v43/n5/full/ng.806.html)
    - **Optional** Reading Prior to Lecture: 
        - [A general approach to single-nucleotide polymorphism discovery.](http://www.nature.com/ng/journal/v23/n4/full/ng1299_452.html)
    - **TODO FOR NEXT TIME**: INTRODUCE POISSON MODEL OF COVERAGE. 30X IS A DISTRIBUTION
        - https://twitter.com/OmicsOmicsBlog/status/837829111879983104
    - Topics covered
      - SNP and INDEL calling
          - Theory
              - Basic concept
              - Sequencing error
              - Bayes' theorem and priors
      - Assigning a genotype
      - Common problems and artifacts
          - paralogy
          - low depth
          - high error rate
          - ambiguous alignment
      - Single sample variant detection
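
The interplay of sequencing error, Bayes' theorem, and priors in genotype assignment can be sketched with a deliberately simplified model. The assumptions here are mine, not the course's: a diploid site with one reference and one alternate allele, independent reads, a single uniform per-read error rate, and hypothetical prior probabilities. Real variant callers use per-base qualities and richer models.

```python
from math import comb
from typing import List, Sequence


def genotype_posteriors(n_ref: int, n_alt: int, error: float = 0.01,
                        priors: Sequence[float] = (0.998, 0.001, 0.001)
                        ) -> List[float]:
    """Posterior for genotypes with 0, 1, or 2 alternate alleles.

    Likelihood: binomial over reads, where a read shows the alternate
    allele with probability depending on genotype and error rate.
    """
    n = n_ref + n_alt
    unnormalized = []
    for alt_copies, prior in zip((0, 1, 2), priors):
        frac = alt_copies / 2
        # A read reports ALT if it samples an ALT allele correctly,
        # or samples a REF allele and is miscalled.
        p_alt = frac * (1 - error) + (1 - frac) * error
        likelihood = comb(n, n_alt) * p_alt ** n_alt * (1 - p_alt) ** n_ref
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


het_like = genotype_posteriors(n_ref=10, n_alt=10)   # balanced read support
error_like = genotype_posteriors(n_ref=19, n_alt=1)  # a lone alternate read
```

Under these assumptions a single alternate read among twenty is far better explained by sequencing error than by a heterozygous genotype, which is the intuition behind the "low depth" and "high error rate" problems listed above.
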
- **[Class 10 (Th Feb 8; Quinlan): SNP and INDEL discovery (part 2)](https://docs.google.com/presentation/d/12jeJQPbntPPPGYszIH1l9u83mXFVU1XdJw-bNgbFu28/edit?usp=sharing)**
    - **Required** Reading Prior to Lecture: 
        - [The VCF format and VCFtools](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3137218/)
    - Topics covered
        - VCF format
          - Attributes
          - Genotypes
        - Population calling
        - Basic annotations
    - Landscape of human genetic variation
        - Alleles and genotypes
        - Allele frequency spectrum
        - Hardy-Weinberg equilibrium
        - More on haplotypes and recombination
    - Exploring the format
        - examples
        - IGV
    - Manipulating VCF with bcftools
    - **Homework 5 (variant calling and working with VCF files with bcftools and UNIX) assigned**
- **[Class 11 (Tu Feb 13; Quinlan): VCF format, Hardy Weinberg Equilibrium, VCF toolkits](https://docs.google.com/presentation/d/1kt2br-ZcDIzRqx__oTdC8i4NlAhluWX2WsPT_clqMaI/edit?usp=sharing)**
    - Topics covered
        - VCF Format
        - Allele frequencies
        - Genotype frequencies
        - Hardy Weinberg Equilibrium
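
The relationship between allele frequencies, genotype frequencies, and Hardy-Weinberg equilibrium can be sketched as follows. This is a minimal illustration for a biallelic site, assuming genotype counts have already been extracted from a VCF; the function name `hardy_weinberg` and the example counts are hypothetical.

```python
from typing import Tuple


def hardy_weinberg(n_hom_ref: int, n_het: int, n_hom_alt: int
                   ) -> Tuple[float, Tuple[float, float, float], float]:
    """Return (ref allele frequency, expected genotype counts, chi-squared)."""
    n = n_hom_ref + n_het + n_hom_alt
    # Each individual carries two alleles; heterozygotes contribute one REF.
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1 - p
    # Hardy-Weinberg proportions: p^2, 2pq, q^2.
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return p, expected, chi2


in_equilibrium = hardy_weinberg(25, 50, 25)   # matches p^2, 2pq, q^2 exactly
no_hets = hardy_weinberg(50, 0, 50)           # strong heterozygote deficit
```

A large chi-squared value, as in the second example, signals a departure from equilibrium; in sequencing data that can point to genotyping artifacts as well as to biology.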

- **[Class 12 (Th Feb 15; Quinlan): VCF annotation and interpretation](https://docs.google.com/presentation/d/1DN99IgciDD05b5Ve_Eaym0ORPhuDFAPU1fXBPoOq5Vo/edit?usp=sharing)**
    - **Required** Reading Prior to Lecture: 
        - [Choice of transcripts and software has a large effect on variant annotation](https://genomemedicine.biomedcentral.com/articles/10.1186/gm543)
    - Topics covered
        - Concepts
          - e.g., synonymous, non-synonymous
          - frameshift
          - stopgain
          - constraint
          - impact of transcript model
    - Tools
        - Polyphen
        - SIFT
        - vcfanno
        - CADD
        - VEP
        - SnpEff

- **Class 13 (Tu Feb 20; Quinlan): Variation in genome structure**
    - **Required** Reading Prior to Lecture: 
        - [Genome structural variation discovery and genotyping](http://www.nature.com/nrg/journal/v12/n5/full/nrg2958.html)
    - Topics covered
        - The genome is repetitive
        - Segmental duplication
        - SV versus CNV
        - SV Mechanisms
            - NAHR / ectopic recombination
            - NHEJ
            - Replication mechanisms
        - SV detection
        - Examples
   
- **Class 14 (Th Feb 22; Quinlan): Somatic mutation in cancer**
    - **Required** Reading Prior to Lecture: 
        - [Sensitive detection of somatic point mutations in impure and heterogeneous cancer samples](http://www.nature.com/nbt/journal/v31/n3/full/nbt.2514.html)
    - Topics covered
        - Sources of mutation
        - Mutational landscape
        - Tumor heterogeneity
        - Somatic mutation detection
            - why is it so hard?
        - Using mutation to track cancer evolution
        - Mosaicism and disease

- **Class 15 (Tu Feb 27; Quinlan): Genome annotation**
    - **Required** Reading Prior to Lecture: 
        - None
    - Topics covered
      - How and why do we annotate a genome?
      - Conservation
      - CpG islands
      - RepeatMasker
      - Chromatin modifications
      - DNA methylation
      - Linkage blocks

- **Class 16 (Th Mar 1; Quinlan): Genome data formats and genome arithmetic**
    - **Required** Reading Prior to Lecture: 
        - None
    - Topics covered
      - The genome as a coordinate system
      - BED format
      - GFF format
      - VCF format
      - UCSC and Biomart to retrieve genome annotations
      - UCSC and IGV to visualize
      - a bit of awk

- **Class 17 (Tu Mar 6; Quinlan): Applied genome arithmetic with bedtools; part 1**
    - **Required** Reading Prior to Lecture: 
        - [BEDTools: the Swiss-army tool for genome feature analysis](http://onlinelibrary.wiley.com/doi/10.1002/0471250953.bi1112s47/abstract?userIsAuthenticated=false&deniedAccessCustomisedMessage=)
    - Topics covered
      - The genome as a coordinate system revisited
      - Basic concepts of genome arithmetic
      - Introduction to bedtools
    - **Homework 9 (basic genome arithmetic with bedtools) assigned (due Mar 7)**
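
The idea of treating the genome as a coordinate system can be illustrated with a small interval-intersection sketch. It assumes BED-style coordinates (0-based start, exclusive end) and uses a naive all-pairs comparison; the `Interval` type and `intersect` function are hypothetical names, and this is not how bedtools itself is implemented.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive (BED convention)
    end: int    # exclusive


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping portion of every pair of intervals from a and b."""
    out = []
    for x in a:
        for y in b:
            if x.chrom != y.chrom:
                continue
            start, end = max(x.start, y.start), min(x.end, y.end)
            if start < end:  # abutting intervals (end == start) do not overlap
                out.append(Interval(x.chrom, start, end))
    return out


# Hypothetical features on two chromosomes.
genes = [Interval("chr1", 10, 20), Interval("chr1", 30, 40)]
peaks = [Interval("chr1", 15, 35), Interval("chr2", 10, 20)]
shared = intersect(genes, peaks)
```

Because the end coordinate is exclusive, the length of an interval is simply `end - start`, and two intervals that merely touch share no bases.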

- **Class 18 (Th Mar 8; Quinlan): Applied genome arithmetic with bedtools; part 2**
- **Class 19 (Tu Mar 13; Quinlan): Digging deeper into UNIX, part 1**
    - awk
    - sed
    - tr
    - PATH
    - .bashrc
- **Class 20 (Th Mar 15; Quinlan): ChIP-seq analysis**
    - experimental design
    - protocols
    - examples
    
- Spring Break March 18-25


- **Class 21 (Tu Mar 27; Quinlan): RNA-seq analysis**
    - analyses
    - toolsets
    - Class project assignment
- **Class 22 (Th Mar 29; Quinlan): Basic probability**
    - Probability with coins and dice
    - Probability with DNA
    - Conditional probabilities
    - Use R for examples
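
As a small illustration of probability with DNA and conditional probabilities, the sketch below is written in Python rather than the R listed for this class. It assumes bases are drawn independently with fixed frequencies when computing k-mer probabilities; the function names and the toy sequence are hypothetical.

```python
from collections import Counter
from math import prod
from typing import Dict


def kmer_probability(kmer: str, base_freqs: Dict[str, float]) -> float:
    """P(k-mer at a given position), assuming independent bases."""
    return prod(base_freqs[base] for base in kmer.upper())


def expected_occurrences(kmer: str, genome_length: int,
                         base_freqs: Dict[str, float]) -> float:
    """Expected count = number of start positions times per-position probability."""
    positions = genome_length - len(kmer) + 1
    return positions * kmer_probability(kmer, base_freqs)


def conditional_base_probability(sequence: str, given: str, nxt: str) -> float:
    """Estimate P(next base = nxt | current base = given) from adjacent pairs."""
    seq = sequence.upper()
    pairs = Counter(zip(seq, seq[1:]))
    total_given = sum(n for (first, _), n in pairs.items() if first == given)
    if total_given == 0:
        raise ValueError(f"base {given!r} is never followed by another base")
    return pairs[(given, nxt)] / total_given


uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
p_acgt = kmer_probability("ACGT", uniform)
expected = expected_occurrences("ACGT", 1003, uniform)
p_g_after_c = conditional_base_probability("ACGCGA", "C", "G")
```

Comparing such expectations with observed counts is one way to see that real genomes are not independent draws of bases.
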
- **Class 23 (Tu Apr 3; Quinlan): Statistical tests**
    - Gaussian
      - Z scores
    - Chi-squared
    - Fisher
    - KS test
    - Rank tests
    - Applications
- **Class 24 (Th Apr 5; Quinlan): How do I know if my observation is significant?**
    - Models
    - Expectation
    - Tests for significance
- **Class 25 (Tu Apr 10; Quinlan): Data visualization, part 1**
    - Why
    - Pattern recognition
    - Detect problems
    - Anscombe's quartet
    - **Introduce class projects**
- **Class 26 (Th Apr 12; Quinlan): Data visualization, part 2**
    - http://www.nature.com/collections/qghhqm/pointsofsignificance
    - Scatter plots
    - Histograms
    - Box-and-whisker plots
- **Class 27 (Tu Apr 17; Quinlan): Advanced topics**
    - loops
    - shuffling
    - randomization
    - advanced commands
    - basic scripts and pipelines
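
Loops, shuffling, and randomization come together in a permutation test. The following is a minimal sketch, assuming the statistic of interest is the absolute difference in group means; the function name `permutation_test`, the default of 2000 permutations, the fixed seed, and the example data are all hypothetical choices for illustration.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_test(group_a: Sequence[float], group_b: Sequence[float],
                     n_permutations: int = 2000, seed: int = 0) -> float:
    """Estimate a two-sided p-value by shuffling group labels."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from an impossible p = 0.
    return (hits + 1) / (n_permutations + 1)


separated = permutation_test(range(10, 18), range(0, 8))
identical = permutation_test([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
```

The same shuffle-and-recompute pattern applies to genomic questions, for example asking whether two sets of intervals overlap more often than expected by chance.
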
- **Class 28 (Th Apr 19; Quinlan): Group Presentations, part 1**
- **Class 29 (Tu Apr 24; Quinlan): Group Presentations, part 2**

Owner

  • Name: Ani
  • Login: animesh
  • Kind: user
  • Location: Norway
  • Company: Norwegian University of Science and Technology

A medical graduate from Delhi University with post-graduation in bioinformatics from Jawaharlal Nehru University, India.
