https://github.com/apetkau/from-samplesheet-test-nf

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.0%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: apetkau
  • Language: Nextflow
  • Default Branch: main
  • Size: 14.6 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 3 years ago · Last pushed almost 3 years ago
Metadata Files
Readme

README.md

Repository for testing fromSamplesheet

This repository tests fromSamplesheet with paired-end and single-end fastq data in order to compare running times.

To run, first use nextflow pull to download the pipeline files:

    nextflow pull apetkau/from-samplesheet-test-nf

Then follow the instructions below to test the different cases.

Paired-end

Case: 30 paired-end samples

To extract running times for paired-end data:

    time nextflow run apetkau/from-samplesheet-test-nf -r main --input https://raw.githubusercontent.com/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv

For me, this takes ~6 seconds.
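Every case in this README uses the same command shape and differs only in the samplesheet URL, so the runs can be scripted. The following is a minimal sketch, not part of the pipeline: `build_command` and `time_run` are hypothetical helper names, and it assumes `nextflow` is on the `PATH` and that the pipeline has already been pulled.

```python
import subprocess
import time

PIPELINE = "apetkau/from-samplesheet-test-nf"
BASE_URL = "https://raw.githubusercontent.com/apetkau/from-samplesheet-test-nf/main"


def build_command(layout, n_samples):
    """Build the nextflow command for one samplesheet.

    layout is "pe" (paired-end) or "se" (single-end); n_samples is 30 or 60,
    matching the samplesheet file names used in this README.
    """
    samplesheet = f"{BASE_URL}/samplesheet.{layout}.{n_samples}.csv"
    return ["nextflow", "run", PIPELINE, "-r", "main", "--input", samplesheet]


def time_run(command):
    """Run the command and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(command, check=True)
    return time.perf_counter() - start


# Example (requires nextflow to be installed):
# elapsed = time_run(build_command("pe", 30))
```

This measures wall-clock time only, roughly what the `real` line of `time` reports.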

To determine the time spent validating the samplesheet:

```bash
$ grep 'Starting validation' -A1 .nextflow.log
Sep-06 10:43:29.713 [main] DEBUG nextflow.validation.SchemaValidator - Starting validation: 'input': '/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv' with 'assets/schema_input.json'
Sep-06 10:43:29.878 [main] DEBUG nextflow.validation.SchemaValidator - Validation passed: 'input': '/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv' with 'assets/schema_input.json'
Sep-06 10:43:29.903 [main] DEBUG nextflow.validation.SchemaValidator - Starting validation: '/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv' with 'assets/schema_input.json'
Sep-06 10:43:29.952 [main] DEBUG nextflow.validation.SchemaValidator - Validation passed: '/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv' with 'assets/schema_input.json'
```

That is, validation takes < 1 second, but it runs twice.

Or, in summary:

  • Total runtime: 6 seconds
  • Validation time: < 1 second (x2)
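The validation times can also be computed from the log timestamps instead of being read off by eye. This is a minimal sketch that assumes the log line format shown above; `LINE_RE` and `validation_durations` are hypothetical names, and the year 2000 is only a placeholder, since the log timestamps carry no year.

```python
import re
from datetime import datetime

# Captures the timestamp and the validation event from a .nextflow.log line.
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) .*"
    r"SchemaValidator - (?P<event>Starting validation|Validation passed)"
)


def validation_durations(log_text):
    """Return the elapsed seconds of each start/passed pair, in log order."""
    durations = []
    start = None
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        # Placeholder year (assumption): only differences within one run matter.
        ts = datetime.strptime("2000 " + match["ts"], "%Y %b-%d %H:%M:%S.%f")
        if match["event"] == "Starting validation":
            start = ts
        elif start is not None:
            durations.append((ts - start).total_seconds())
            start = None
    return durations


sample = """
Sep-06 10:43:29.713 [main] DEBUG nextflow.validation.SchemaValidator - Starting validation: 'input': '/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv' with 'assets/schema_input.json'
Sep-06 10:43:29.878 [main] DEBUG nextflow.validation.SchemaValidator - Validation passed: 'input': '/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.30.csv' with 'assets/schema_input.json'
"""
print(validation_durations(sample))  # [0.165]
```

To process a real run, pass the contents of `.nextflow.log` instead of `sample`. The sketch does not handle a run that crosses midnight at a year boundary.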

Case: 60 paired-end samples

Repeating above with the file https://raw.githubusercontent.com/apetkau/from-samplesheet-test-nf/main/samplesheet.pe.60.csv gives:

  • Total runtime: 6 seconds
  • Validation time: < 1 second (x2)

Single-end

Compare the run times above with the single-end cases.

Case: 30 single-end samples

    time nextflow run apetkau/from-samplesheet-test-nf -r main --input https://raw.githubusercontent.com/apetkau/from-samplesheet-test-nf/main/samplesheet.se.30.csv

```
$ grep 'Starting validation' -A1 .nextflow.log
Sep-06 10:45:47.894 [main] DEBUG nextflow.validation.SchemaValidator - Starting validation: 'input': '/apetkau/from-samplesheet-test-nf/main/samplesheet.se.30.csv' with 'assets/schema_input.json'
Sep-06 10:46:11.016 [main] DEBUG nextflow.validation.SchemaValidator - Validation passed: 'input': '/apetkau/from-samplesheet-test-nf/main/samplesheet.se.30.csv' with 'assets/schema_input.json'
Sep-06 10:46:11.035 [main] DEBUG nextflow.validation.SchemaValidator - Starting validation: '/apetkau/from-samplesheet-test-nf/main/samplesheet.se.30.csv' with 'assets/schema_input.json'
Sep-06 10:46:34.027 [main] DEBUG nextflow.validation.SchemaValidator - Validation passed: '/apetkau/from-samplesheet-test-nf/main/samplesheet.se.30.csv' with 'assets/schema_input.json'
```

Summary:

  • Total runtime: 52 seconds
  • Validation time: 23 + 23 seconds (from the log timestamps above)

Case: 60 single-end samples

Use https://raw.githubusercontent.com/apetkau/from-samplesheet-test-nf/main/samplesheet.se.60.csv.

  • Total runtime: 12 minutes 14 seconds (734 seconds)
  • Validation time: 368 + 360 seconds
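To make the comparison concrete, the total runtimes reported above can be turned into a scaling ratio for each layout. This is only a rough sketch: it uses two data points per layout, the totals include fixed startup overhead, and the "apparent exponent" is an illustrative quantity rather than a fitted model. The names `runtimes` and `scaling` are hypothetical.

```python
import math

# Total runtimes in seconds, as reported in this README.
runtimes = {
    ("paired-end", 30): 6,
    ("paired-end", 60): 6,
    ("single-end", 30): 52,
    ("single-end", 60): 734,
}


def scaling(layout):
    """Return (runtime ratio, apparent exponent) when going from 30 to 60 samples.

    The exponent k satisfies ratio == (60 / 30) ** k.
    """
    ratio = runtimes[(layout, 60)] / runtimes[(layout, 30)]
    exponent = math.log(ratio) / math.log(60 / 30)
    return ratio, exponent


for layout in ("paired-end", "single-end"):
    ratio, exponent = scaling(layout)
    print(f"{layout}: x{ratio:.1f} runtime for x2 samples (apparent exponent {exponent:.2f})")
```

With these numbers, doubling the paired-end samples leaves the runtime unchanged, while doubling the single-end samples multiplies it by about 14, which is far more than linear growth.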

Owner

  • Name: Aaron Petkau
  • Login: apetkau
  • Kind: user
  • Company: Public Health Agency of Canada

Bioinformatician with the Public Health Agency of Canada.

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 11
  • Total Committers: 1
  • Avg Commits per committer: 11.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Aaron Petkau a****u@c****a 11

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0