spikeq

A synthetic FASTQ record generator with pattern spiking

https://github.com/rbfinch/spikeq

Science Score: 57.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.6%) to scientific vocabulary

Keywords

bioinformatics bioinformatics-tool fastq json regex spike testing
Last synced: 6 months ago

Repository

Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created over 1 year ago · Last pushed 10 months ago
Metadata Files
Readme Changelog License Citation

README.md

Generates synthetic FASTQ records free of sequences defined by regex patterns, or containing spiked sequences based on regex patterns

Feature set

  • Validates the regex pattern file against the required format before processing (see the schema.json file in the examples directory)
  • Generates FASTQ records with random DNA sequences of specified lengths, and free from regex patterns specified in the regex pattern file
  • Inserts spike patterns derived from a regex set into a subset of sequences using the spike-sequence subcommand, resulting in a FASTQ file with a subset of sequences containing the inserted patterns, and a summary file of the inserted patterns
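The second feature above can be pictured as rejection sampling: draw a random sequence and keep it only if none of the patterns match. The following is a minimal Python sketch of that idea, not spikeq's own implementation (spikeq is a Rust program and its internals are not described here). The pattern list, the record names, the constant quality string, and the inclusive length range are all assumptions made for illustration.

```python
import random
import re

def random_dna(length, rng):
    """Return a uniformly random DNA string of the given length."""
    return "".join(rng.choice("ACGT") for _ in range(length))

def pattern_free_sequence(patterns, min_len, max_len, rng, max_tries=10_000):
    """Draw random sequences until one matches none of the compiled patterns."""
    for _ in range(max_tries):
        seq = random_dna(rng.randint(min_len, max_len), rng)
        if not any(p.search(seq) for p in patterns):
            return seq
    raise RuntimeError("no pattern-free sequence found; patterns may be too permissive")

def fastq_record(name, seq, quality_char="I"):
    """Format one four-line FASTQ record with a placeholder quality string."""
    return f"@{name}\n{seq}\n+\n{quality_char * len(seq)}\n"

# Hypothetical patterns; a real run would take them from the pattern file.
patterns = [re.compile(p) for p in ("GATTACA", "AC[AG]TTT")]
rng = random.Random(42)  # fixed seed so the sketch is reproducible
records = [
    fastq_record(f"synthetic_{i}", pattern_free_sequence(patterns, 200, 800, rng))
    for i in range(5)
]
```

Rejection sampling is simple but slows down as patterns become more permissive, which is why the sketch caps the number of attempts.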

Usage

spikeq may be used to test bioinformatics tools that process FASTQ files, such as grepq (https://github.com/Rbfinch/grepq).

Get instructions and examples using spikeq -h, and spikeq spike-sequence -h for help on the spike-sequence subcommand.

[!NOTE] The regex patterns should include only the DNA sequence characters (A, C, G, T), and not IUPAC ambiguity codes (N, R, Y, etc.). If your regex patterns contain any IUPAC ambiguity codes, transform them to DNA sequence characters (A, C, G, T) before using them with spikeq. See regex.json in the examples directory for an example of a valid pattern file.
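One way to carry out the transformation described in the note is to expand each ambiguity code into a bracket expression over A, C, G and T. The sketch below uses the standard IUPAC nucleotide code table. The function name `expand_iupac` is hypothetical, and the sketch assumes uppercase patterns in which ambiguity letters appear only as plain sequence characters (not inside existing bracket expressions or escapes such as `\S`).

```python
# Standard IUPAC nucleotide ambiguity codes and the bases they stand for.
IUPAC_TO_BASES = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def expand_iupac(pattern):
    """Replace each ambiguity code with a bracket expression of plain bases."""
    return "".join(
        f"[{IUPAC_TO_BASES[ch]}]" if ch in IUPAC_TO_BASES else ch
        for ch in pattern
    )

print(expand_iupac("GGRTTY"))  # GG[AG]TT[CT]
```

Whether spikeq accepts every regex construct produced this way is not stated in the README, so the expanded patterns are worth checking against schema.json and the example regex.json.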

Requirements

  • spikeq has been tested on Linux and macOS. It might work on Windows, but it has not been tested on this platform.
  • Ensure that Rust is installed on your system (https://www.rust-lang.org/tools/install)
  • If the build fails, make sure you have the latest version of the Rust compiler by running rustup update

Installation

  • From crates.io (easiest method)

    • cargo install spikeq
  • From source

    • Clone the repository and cd into the spikeq directory
    • Run cargo build --release
    • Relative to the cloned parent directory, the executable will be located in ./target/release
    • Make sure the executable is in your PATH or use the full path to the executable

Examples

```sh
# Generate 1000 synthetic FASTQ records with sequence lengths between 200 and 800
# that are free from the regex patterns specified in the regex.json file.
# (This run generated the FASTQ file named 459cac6f-8d65-48ed-99aa-f03930b3c02f.fastq.)
spikeq -r regex.json -n 1000 -l 200,800

# Generate the same kind of records, then insert two patterns generated from the
# regex.json file into 10 sequences.
# (This run generated the FASTQ file named 4b1f92dc-14e1-496f-a68b-d1683251d827.fastq
# and the summary file named inserted.json.)
spikeq -r regex.json -n 1000 -l 200,800 spike-sequence --num-patterns 2 --num-sequences 10
```
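To sanity-check output like that produced by the commands above, a small script can count the records, confirm that sequence lengths fall in the requested range, and count how many sequences match a pattern. This Python sketch is not part of spikeq. The helper names, the inclusive treatment of the length bounds, and the in-memory demo data are assumptions, and it handles only unwrapped, four-line-per-record FASTQ.

```python
import io
import re

def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # separator line starting with '+'
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual

def summarize(handle, patterns, min_len, max_len):
    """Count records, records with an unexpected length, and records matching a pattern."""
    compiled = [re.compile(p) for p in patterns]
    total = bad_length = with_pattern = 0
    for _, seq, qual in read_fastq(handle):
        total += 1
        if not (min_len <= len(seq) <= max_len) or len(qual) != len(seq):
            bad_length += 1
        if any(p.search(seq) for p in compiled):
            with_pattern += 1
    return {"records": total, "bad_length": bad_length, "with_pattern": with_pattern}

# Tiny in-memory stand-in for a generated file; replace with open("<file>.fastq").
demo = io.StringIO("@r1\nACGTGATTACAAC\n+\nIIIIIIIIIIIII\n@r2\nACGTACGTAC\n+\nIIIIIIIIII\n")
report = summarize(demo, ["GATTACA"], 5, 20)
print(report)  # {'records': 2, 'bad_length': 0, 'with_pattern': 1}
```

For a file generated without the spike-sequence subcommand, `with_pattern` should be 0 when the same patterns are supplied, since the README describes those records as free from the patterns.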

Citation

If you use spikeq in your research, please cite as follows:

Crosbie, N.D. (2024). spikeq: A synthetic FASTQ record generator with pattern spiking. 10.5281/zenodo.14211052.

Update changes

See CHANGELOG.

Logo attribution

The logo was created using Inkscape and is based on the Thorn Helix SVG Vector at SVGRepo (https://www.svgrepo.com/svg/321583/thorn-helix).

License

MIT

Owner

  • Login: Rbfinch
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Crosbie"
  given-names: "Nicholas D."
  orcid: "https://orcid.org/0000-0002-0319-4248"
title: "spikeq: A synthetic FASTQ record generator with pattern spiking."
version: 1.1.6
doi: 10.5281/zenodo.14211052
date-released: 2024-11-12
url: "https://github.com/Rbfinch/spikeq"

GitHub Events

Total
  • Release event: 2
  • Watch event: 1
  • Public event: 1
  • Push event: 10
  • Create event: 2
Last Year
  • Release event: 2
  • Watch event: 1
  • Public event: 1
  • Push event: 10
  • Create event: 2

Packages

  • Total packages: 1
  • Total downloads:
    • cargo 1,812 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 2
  • Total maintainers: 1
crates.io: spikeq

  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,812 Total
Rankings
Dependent repos count: 24.4%
Dependent packages count: 32.3%
Forks count: 39.1%
Stargazers count: 46.3%
Average: 47.6%
Downloads: 95.8%
Maintainers (1)