# [Build workflow](https://github.com/RAHenriksen/NGSNGS/actions/workflows/make.yml) [DOI: 10.1093/bioinformatics/btad041](https://doi.org/10.1093/bioinformatics/btad041)
# NEXT GENERATION SIMULATOR FOR NEXT GENERATION SEQUENCING DATA
Rasmus Amund Henriksen, Lei Zhao, Thorfinn Sand Korneliussen \
Contact: rasmus.henriksen@sund.ku.dk
## VERSION
Next Generation Simulator for Next Generation Sequencing Data (NGSNGS), version 0.9.0.
NGSNGS was [published](https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btad041/6994180) as an application note in Bioinformatics (Oxford Academic) on 20 January 2023.
NGSNGS is a new program, so we welcome feedback on potential problems as well as ideas for improvements or additional features.
## INSTALLATION & REQUIREMENTS
* Use a local installation of htslib

        git clone https://github.com/RAHenriksen/NGSNGS.git
        git clone https://github.com/samtools/htslib.git
        cd htslib; make; cd ../NGSNGS; make HTSSRC=../htslib

* Use a systemwide installation of htslib

        git clone https://github.com/RAHenriksen/NGSNGS.git
        cd NGSNGS; make

**NOTE:** A recent version of htslib that includes `bam_set1` is required.
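
One way to check whether a local htslib checkout is recent enough is to look for `bam_set1` in its `sam.h` header. The sketch below assumes the header sits at `../htslib/htslib/sam.h` relative to the NGSNGS directory. The helper name `has_bam_set1` is hypothetical and is not part of NGSNGS or htslib.

```bash
# Hypothetical helper: succeeds if the given header file mentions bam_set1.
has_bam_set1() {
    header="$1"
    [ -f "$header" ] && grep -q 'bam_set1' "$header"
}

# Example use (the path is an assumption; adjust it to where htslib was cloned).
if has_bam_set1 ../htslib/htslib/sam.h; then
    echo "htslib provides bam_set1"
else
    echo "bam_set1 not found; update htslib before building NGSNGS"
fi
```

This only inspects the header text, so it is a quick sanity check rather than a guarantee that the build will succeed.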
## QUICK TUTORIAL
Examples of which parameters to include depending on the desired simulations
### Simulate 1000 reads (-r) from human hg19.fa (-i), generate compressed fq.gz (-f), single end (-seq), make the program use one thread (-t)
~~~~bash
./ngsngs -i hg19.fa -r 1000 -f fq.gz -l 100 -seq se -t 1 -q1 Test_Examples/AccFreqL150R1.txt -o HgSim
~~~~
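
To confirm that a run produced the requested number of reads, the FASTQ records can be counted afterwards. This is a minimal sketch that assumes the standard four lines per FASTQ record. The helper name `count_fastq_reads` is hypothetical, and the output file name in the usage comment is an assumption based on the `-o HgSim` prefix.

```bash
# Hypothetical helper: print the number of records in a FASTQ file,
# assuming four lines per record. Handles plain and gzip-compressed files.
count_fastq_reads() {
    file="$1"
    case "$file" in
        *.gz) gzip -cd "$file" ;;
        *)    cat "$file" ;;
    esac | awk 'END { print NR / 4 }'
}

# Example use (the file name is an assumption; check the actual output name):
# count_fastq_reads HgSim.fq.gz
```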
### Generate bam (-f), paired end (-seq), variable fragment length (-ld norm,350,20) but fixed read length (-cl 100)
~~~~bash
./ngsngs -i hg19.fa -r 1000 -f bam -ld norm,350,20 -cl 100 -seq pe -t 1 -q1 Test_Examples/AccFreqL150R1.txt -q2 Test_Examples/AccFreqL150R2.txt -o HgSim
~~~~
### Disable platform-specific errors (-ne), add a deamination pattern with the Briggs 2007 model (-m b7,...), use an ancient fragment length distribution (-lf) and seed 4 (-s)
~~~~bash
./ngsngs -i hg19.fa -r 1000 -f fq -s 4 -ne -lf Test_Examples/Size_dist_sampling.txt -seq se -q1 Test_Examples/AccFreqL150R1.txt -o HgSim
~~~~
### Paired end reads, cycle length inferred from the quality profile (-q1) to be 150, fragment length (-l) 400, inner distance of 100 (400-150*2)
~~~~bash
./ngsngs -i hg19.fa -r 1000 -f fq -l 400 -seq pe -q1 Test_Examples/AccFreqL150R1.txt -q2 Test_Examples/AccFreqL150R2.txt -a1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATTCGATCTCGTATGCCGTCTTCTGCTTG -a2 AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATTT -o HgSim
~~~~
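
The inner distance quoted in the heading above is the fragment length minus the two read lengths. The arithmetic can be sketched as follows. The helper name `inner_distance` is hypothetical and is not an NGSNGS option.

```bash
# Hypothetical helper: inner distance = fragment length - 2 * cycle length.
inner_distance() {
    fragment_length="$1"
    cycle_length="$2"
    echo $(( fragment_length - 2 * cycle_length ))
}

inner_distance 400 150   # prints 100
```

A negative result would mean that the two reads of a pair overlap.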
### Single end reads, adapter sequence (-a1) and poly-G tail (-p)
~~~~bash
./ngsngs -i hg19.fa -r 1000 -f fq -l 80 -seq se -q1 Test_Examples/AccFreqL150R1.txt -a1 AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATTCGATCTCGTATGCCGTCTTCTGCTTG -p G -o HgSim
~~~~
NB! Adapter sequences are only appended to a read if the cycle length inferred from the quality profile is greater than the fragment length. The poly-X tail is only added if the read plus adapter is still shorter than the cycle length. In the example above the tail is 5 nucleotides long (cycle length - fragment length - adapter length = 150 - 80 - 65 = 5).
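
The tail length from the note above can be reproduced with shell arithmetic. This is a sketch of the stated rule only, not of the NGSNGS source code. The helper name `poly_tail_length` is hypothetical, and returning 0 when nothing is left to fill is an assumption consistent with the note.

```bash
# Hypothetical helper: number of poly-X bases needed to fill a read
# up to the cycle length, or 0 if fragment plus adapter already fills it.
poly_tail_length() {
    cycle_length="$1"
    fragment_length="$2"
    adapter_length="$3"
    remaining=$(( cycle_length - fragment_length - adapter_length ))
    if [ "$remaining" -gt 0 ]; then
        echo "$remaining"
    else
        echo 0
    fi
}

# Adapter from the single end example; its length is 65.
adapter=AGATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATTCGATCTCGTATGCCGTCTTCTGCTTG
poly_tail_length 150 80 "${#adapter}"   # prints 5
```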
## GENERAL
Next Generation Simulator for Next Generation Sequencing Data version 0.9.0
~~~~bash
Next Generation Simulator for Next Generator Sequencing Data version 0.9.0
Usage
./ngsngs [options] -i -r/-c -l/-lf/-ld -seq -f