DNAtools
DNAtools: Tools for Analysing Forensic Genetic DNA Data - Published in JOSS (2020)
Science Score: 93.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ✓ DOI references: found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: links to joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
Development version of the DNAtools R-package
Basic Info
- Host: GitHub
- Owner: mikldk
- License: gpl-3.0
- Language: R
- Default Branch: master
- Size: 1.43 MB
Statistics
- Stars: 0
- Watchers: 3
- Forks: 3
- Open Issues: 1
- Releases: 4
Created about 7 years ago
· Last pushed almost 2 years ago
README.Rmd
---
output: github_document
---
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
message = FALSE
)
```
```{r, echo = FALSE}
library(DNAtools)
```
# DNAtools
[Travis CI build status](https://travis-ci.org/mikldk/DNAtools)
[AppVeyor build status](https://ci.appveyor.com/project/mikldk/DNAtools/branch/master)
[DOI: 10.21105/joss.01981](https://doi.org/10.21105/joss.01981)
There are two main features of this package:
* Computation of the distribution of the numbers of alleles in DNA mixtures.
* Empirical testing of DNA match probabilities.
Each is described in a separate vignette, and a small example is given
below under "Getting started".
The documentation (vignettes and manual) is included in the package
and is also available for reading online.
## Install
### With internet access
To build and install from GitHub, including the vignettes, using R 3.3.0 (or later) and the R `devtools` package 1.11.0 (or later), run this command from within `R`:
```
devtools::install_github("mikldk/DNAtools",
                         build_opts = c("--no-resave-data", "--no-manual"))
```
You can also install the package without vignettes if needed as follows:
```
devtools::install_github("mikldk/DNAtools")
```
### Without internet access
To install on a computer without internet access:
1. Download `DNAtools` as a `.tar.gz` archive from GitHub and transfer it to the destination computer, e.g. using removable media.
1. Install `devtools` and the `DNAtools` prerequisites (`multicool`, `Rcpp`, `RcppParallel`, `RcppProgress`, `Rsolnp`).
1. Install `DNAtools` in `R` using the `devtools::install_local()` function.
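Before the final step it can help to confirm that every prerequisite is already present on the offline machine. The following is a minimal sketch; `missing_prereqs` is a hypothetical helper (not part of `DNAtools` or `devtools`), and the archive path in the comments is a placeholder.

```r
# Prerequisites named in the list above
prereqs <- c("devtools", "multicool", "Rcpp", "RcppParallel",
             "RcppProgress", "Rsolnp")

# Hypothetical helper: return the packages in `pkgs` that are not installed
missing_prereqs <- function(pkgs, installed = rownames(installed.packages())) {
  setdiff(pkgs, installed)
}

# An empty result means all prerequisites are available
missing_prereqs(prereqs)

# Placeholder path to the transferred archive; adjust to the actual file
# archive <- "path/to/DNAtools.tar.gz"
# devtools::install_local(archive)
```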
## Contribute, issues, and support
Please use the GitHub issue tracker
if you want to notify us of an issue or need support.
If you want to contribute, please either create an issue or make a pull request.
## Getting started
Please read the vignettes for more detailed explanations than those given below.
The example below illustrates some of the package's functionality in
compact form.
Say that we have a reference database:
```{r}
data(dbExample, package = "DNAtools")
head(dbExample)[, 2:7]
dim(dbExample)
```
We now find the allele frequencies:
```{r}
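# Locus x occupies columns 2x and 2x+1 (column 1 holds the profile identifier)
allele_freqs <- lapply(1:10, function(x) {
  # Pool both allele columns; each profile contributes two alleles per locus
  al_freq <- table(c(dbExample[[x*2]], dbExample[[1+x*2]]))/(2*nrow(dbExample))
  # Order the frequencies by numeric allele designation
  al_freq[sort.list(as.numeric(names(al_freq)))]
})
# Name each entry by its locus, dropping the ".1" column suffix
names(allele_freqs) <- sub("\\.1", "", names(dbExample)[(1:10)*2])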
```
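To make the column arithmetic concrete, here is the same calculation on a tiny made-up database. The data frame `toy_db` and its locus names `L1` and `L2` are invented for illustration and only assume the layout used above: one identifier column followed by two allele columns per locus.

```r
# Toy database: one id column followed by two allele columns per locus
toy_db <- data.frame(
  id   = 1:4,
  L1.1 = c(10, 11, 10, 12),
  L1.2 = c(11, 11, 12, 12),
  L2.1 = c(7, 7, 8, 9),
  L2.2 = c(7, 8, 8, 9)
)
n_loci <- (ncol(toy_db) - 1) / 2

toy_freqs <- lapply(seq_len(n_loci), function(x) {
  # Pool the two allele columns of locus x
  alleles <- c(toy_db[[x * 2]], toy_db[[1 + x * 2]])
  # Each of the nrow(toy_db) profiles carries two alleles per locus
  freq <- table(alleles) / (2 * nrow(toy_db))
  freq[sort.list(as.numeric(names(freq)))]
})
names(toy_freqs) <- sub("\\.1", "", names(toy_db)[seq_len(n_loci) * 2])

toy_freqs$L1
```

Each locus-wise vector sums to 1, which is a quick sanity check worth running on real data as well.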
```{r, include=FALSE}
# Print a text bar chart: one "|" per percentage point of probability
txtbar <- function(x) {
  y <- round(100 * x)  # use the argument, not the global `noa`
  y2 <- lapply(y, rep.int, x = "|")
  y3 <- lapply(y2, paste0, collapse = "")
  ret <- data.frame(`Number of alleles` = names(x), Frequency = unlist(y3),
                    check.names = FALSE)
  print(ret, quote = FALSE, row.names = FALSE, right = FALSE)
  return(invisible(ret))
}
```
### Number of alleles
One could ask: What is the distribution of the number of alleles observed in a three-person mixture?
This package can calculate that distribution.
We focus on the D16S539 locus:
```{r}
allele_freqs$D16S539
noa <- Pnm_locus(m = 3, theta = 0, alleleProbs = allele_freqs$D16S539)
names(noa) <- seq_along(noa)
noa
```
This can be illustrated by a barchart:
```{r, echo=FALSE, results='markup', comment=''}
txtbar(noa)
```
So it is most likely that a three person mixture on D16S539 has `r names(noa)[which.max(noa)]` alleles.
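To build intuition for what such a distribution represents, the sketch below computes it by brute force for a toy locus. It assumes that with `theta = 0` the `2 * m` alleles of an `m`-person mixture are independent draws from the allele probabilities. `noa_brute_force` is a hypothetical helper, not part of `DNAtools`, and the allele probabilities are made up; enumeration is only feasible for a handful of alleles and contributors.

```r
# Distribution of the number of distinct alleles among 2*m independent
# allele draws (assumed theta = 0), by enumerating all k^(2m) ordered draws.
noa_brute_force <- function(p, m) {
  k <- length(p)
  draws <- as.matrix(expand.grid(rep(list(seq_len(k)), 2 * m)))
  # Probability of an ordered draw is the product of its allele probabilities
  draw_prob <- apply(draws, 1, function(d) prod(p[d]))
  # Number of distinct alleles observed in each ordered draw
  n_distinct <- apply(draws, 1, function(d) length(unique(d)))
  support <- seq_len(min(k, 2 * m))
  dist <- tapply(draw_prob, factor(n_distinct, levels = support), sum)
  dist[is.na(dist)] <- 0  # counts that cannot occur get probability 0
  dist
}

toy_probs <- c(0.5, 0.3, 0.2)   # made-up allele probabilities
toy_noa <- noa_brute_force(toy_probs, m = 2)
toy_noa
```

Under the stated independence assumption, the probability of seeing a single allele is the sum of the fourth powers of the allele probabilities, which gives a simple hand check on the first entry.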
The distribution can also be computed for all loci at once:
```{r}
noa <- Pnm_all(m = 3, theta = 0, probs = allele_freqs, locuswise = TRUE)
noa
```
We can also find the convolution of the locus-wise distributions and thereby the distribution of the total number of distinct alleles across all loci:
```{r}
noa <- Pnm_all(m = 3, theta = 0, probs = allele_freqs)
noa
```
This can be illustrated by a barchart:
```{r, echo=FALSE, results='markup', comment=''}
txtbar(noa)
```
So it is most likely that a three person mixture has `r names(noa)[which.max(noa)]` distinct alleles on all loci combined.
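The convolution step can be sketched in base R. This illustration assumes that allele counts at different loci are independent, so the distribution of their sum is the convolution of the locus-wise distributions. `convolve_counts` is a hypothetical helper and the three locus distributions are made-up numbers, not output of `Pnm_all`.

```r
# Convolve two distributions of allele counts. Each input is a numeric
# vector whose names give the number of alleles ("1", "2", ...).
convolve_counts <- function(pa, pb) {
  # Total allele count for every combination of the two loci
  total <- outer(as.integer(names(pa)), as.integer(names(pb)), "+")
  # Joint probability of every combination, assuming independence
  joint <- outer(as.numeric(pa), as.numeric(pb))
  # Add up the joint probabilities that give the same total
  out <- tapply(joint, total, sum)
  setNames(as.numeric(out), names(out))
}

# Made-up locus-wise distributions
locus_a <- c("1" = 0.1, "2" = 0.6, "3" = 0.3)
locus_b <- c("1" = 0.2, "2" = 0.8)
locus_c <- c("1" = 0.5, "2" = 0.5)

total_noa <- Reduce(convolve_counts, list(locus_a, locus_b, locus_c))
total_noa
```

Folding the loci in one at a time with `Reduce` keeps the intermediate result small, since only the distribution of the running total is carried forward.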
### Empirical testing of DNA match probabilities
Another relevant question is how many matches and near-matches there are between profiles in the database.
This can be calculated as follows:
```{r}
db_summary <- dbCompare(dbExample, hit = 6, trace = FALSE)
db_summary
```
The `hit` argument makes `dbCompare` return the pairs of profiles that fully match at `hit` (here 6) or more loci.
The summary matrix gives the number of pairs matching/partially matching at $(i,j)$ loci.
For example the row
```
     partial
match  0  1  2  3  4  5  6  7  8  9 10
    5  6 19 44 41 26  5
```
means that there are 6+19+44+41+26+5 = 141 pairs of profiles matching exactly at
5 loci.
Conditional on those 5 matches, there are
6 pairs not matching on any of the remaining 5 loci,
19 pairs partially matching on 1 locus and not matching on the remaining 4 loci,
and so on.
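The bookkeeping behind such a summary can be sketched in base R on made-up profiles. The sketch assumes that a locus counts as a match when both genotypes are identical, as a partial match when they share at least one allele without being identical, and as a mismatch otherwise. `locus_status` and `compare_pair` are hypothetical helpers and do not reflect how `dbCompare` is implemented.

```r
# Classify one locus for two genotypes (each a length-2 vector of alleles)
locus_status <- function(g1, g2) {
  if (all(sort(g1) == sort(g2))) return("match")
  if (length(intersect(g1, g2)) > 0) return("partial")
  "mismatch"
}

# Count matching and partially matching loci for one pair of profiles.
# Each profile is a matrix with one row per locus and two allele columns.
compare_pair <- function(p1, p2) {
  status <- vapply(seq_len(nrow(p1)),
                   function(i) locus_status(p1[i, ], p2[i, ]),
                   character(1))
  c(match = sum(status == "match"), partial = sum(status == "partial"))
}

# Three made-up profiles typed at three loci
p1 <- rbind(c(10, 11), c(7, 7), c(14, 15))
p2 <- rbind(c(11, 10), c(7, 8), c(16, 17))
p3 <- rbind(c(10, 11), c(7, 8), c(16, 17))
profiles <- list(p1, p2, p3)

# Compare all pairs and tabulate (matches, partial matches)
pairs <- combn(length(profiles), 2)
counts <- apply(pairs, 2, function(ij) {
  compare_pair(profiles[[ij[1]]], profiles[[ij[2]]])
})
pair_summary <- table(match   = factor(counts["match", ], levels = 0:3),
                      partial = factor(counts["partial", ], levels = 0:3))
pair_summary
```

Comparing every pair this way grows quadratically with the number of profiles, so it is meant only to show how the cells of the summary matrix are defined.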
Owner
- Name: Mikkel Meyer Andersen
- Login: mikldk
- Kind: user
- Location: Denmark
- Repositories: 64
- Profile: https://github.com/mikldk
JOSS Publication
DNAtools: Tools for Analysing Forensic Genetic DNA Data
Published
January 16, 2020
Volume 5, Issue 45, Page 1981
Tags
short tandem repeat markers, forensic genetics, autosomal markers, population genetics, weight of evidence
Committers
Top Committers
| Name | Email | Commits |
|---|---|---|
| Mikkel Meyer Andersen | m****l@m****k | 72 |
| tvedebrink | t****e@m****k | 8 |
| jmcurran | j****n@a****z | 4 |
| Torben Tvedebrink | t****k | 1 |
| Charlotte Soneson | c****n@g****m | 1 |
Issues and Pull Requests
All Time
- Total issues: 4
- Total pull requests: 2
- Average time to close issues: 3 days
- Average time to close pull requests: 21 days
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 2.5
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- standage (2)
- thoree (1)
- tomsing1 (1)
Pull Request Authors
- andrjohns (2)
- csoneson (1)
Packages
- Total packages: 1
- Total downloads: 376 last month (cran)
- Total dependent packages: 1
- Total dependent repositories: 2
- Total versions: 15
- Total maintainers: 1
cran.r-project.org: DNAtools
Tools for Analysing Forensic Genetic DNA Data
- Documentation: http://cran.r-project.org/web/packages/DNAtools/DNAtools.pdf
- License: GPL-2 | GPL-3 | file LICENSE [expanded from: GPL (≥ 2) | file LICENSE]
- Latest release: 0.2-4 (published almost 4 years ago)
Rankings
Forks count: 14.2%
Dependent packages count: 18.1%
Dependent repos count: 19.3%
Average: 23.8%
Downloads: 32.7%
Stargazers count: 34.6%
Dependencies
DESCRIPTION
cran
- R >= 3.3.0 depends
- Rcpp >= 0.12.12 imports
- RcppParallel >= 4.3.20 imports
- Rsolnp >= 1.16 imports
- multicool >= 0.1 imports
- knitr * suggests
- rmarkdown * suggests
- testthat * suggests
- testthis * suggests
