Fraggler

Fraggler: A Python Package and CLI Tool for Automated Fragment Analysis - Published in JOSS (2024)

https://github.com/clinical-genomics-umea/fraggler

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Sociology Social Sciences - 87% confidence
Biology Life Sciences - 84% confidence
Mathematics Computer Science - 84% confidence
Last synced: 4 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Clinical-Genomics-Umea
  • License: mit
  • Language: FreeBasic
  • Default Branch: main
  • Size: 11.4 MB
Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 4
  • Open Issues: 0
  • Releases: 1
Created about 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme License

README.md


Description

Fraggler is for fragment analysis in Python! It is a Python package for analyzing fragment analysis (.fsa) files and generating reports from them. It offers a Python API, a command-line tool, and a GUI.


Install

```bash
pip install fraggler
```

Dependencies

Fraggler depends on:

  • pandas
  • scikit-learn
  • lmfit
  • scipy
  • biopython
  • panel
  • altair
  • PySide6

Python API

For an overview of how the library can be used in a Python environment, see tutorial.ipynb.

CLI

Usage

To generate peak area reports or a peak table for all input files, run fraggler -t area or fraggler -t peak respectively, followed by the required arguments and any optional flags.

```bash
usage: fraggler [-h] -t {area,peak} -f FSA -o OUTPUT -l {LIZ,ROX,ORANGE,ROX500} -sc SAMPLE_CHANNEL
                [-min_dist MIN_DISTANCE_BETWEEN_PEAKS] [-min_s_height MIN_SIZE_STANDARD_HEIGHT]
                [-cp CUSTOM_PEAKS] [-height_sample PEAK_HEIGHT_SAMPLE_DATA]
                [-min_ratio MIN_RATIO_TO_ALLOW_PEAK] [-distance DISTANCE_BETWEEN_ASSAYS]
                [-peak_start SEARCH_PEAKS_START] [-m {gauss,voigt,lorentzian}]

Analyze your Fragment analysis files!

options:
  -h, --help            show this help message and exit
  -t {area,peak}, --type {area,peak}
                        Fraggler area or fraggler peak
  -f FSA, --fsa FSA     fsa file to analyze
  -o OUTPUT, --output OUTPUT
                        Output folder
  -l {LIZ,ROX,ORANGE,ROX500}, --ladder {LIZ,ROX,ORANGE,ROX500,LIZ500}
                        Which ladder to use
  -sc SAMPLE_CHANNEL, --sample_channel SAMPLE_CHANNEL
                        Which sample channel to use. E.g: 'DATA1', 'DATA2'...
  -min_dist MIN_DISTANCE_BETWEEN_PEAKS, --min_distance_between_peaks MIN_DISTANCE_BETWEEN_PEAKS
                        Minimum distance between size standard peaks
  -min_s_height MIN_SIZE_STANDARD_HEIGHT, --min_size_standard_height MIN_SIZE_STANDARD_HEIGHT
                        Minimun height of size standard peaks
  -cp CUSTOM_PEAKS, --custom_peaks CUSTOM_PEAKS
                        csv file with custom peaks to find
  -height_sample PEAK_HEIGHT_SAMPLE_DATA, --peak_height_sample_data PEAK_HEIGHT_SAMPLE_DATA
                        Minimum height of peaks in sample data
  -min_ratio MIN_RATIO_TO_ALLOW_PEAK, --min_ratio_to_allow_peak MIN_RATIO_TO_ALLOW_PEAK
                        Minimum ratio of the lowest peak compared to the heighest peak in the assay
  -distance DISTANCE_BETWEEN_ASSAYS, --distance_between_assays DISTANCE_BETWEEN_ASSAYS
                        Minimum distance between assays in a multiple assay experiment
  -peak_start SEARCH_PEAKS_START, --search_peaks_start SEARCH_PEAKS_START
                        Where to start searching for peaks in basepairs
  -m {gauss,voigt,lorentzian}, --peak_area_model {gauss,voigt,lorentzian}
                        Which peak finding model to use
```

Example of CLI command:

```bash
fraggler -t area -f demo/ -o testing_fraggler -l LIZ -sc DATA1
```
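When the CLI is driven from a script, for example to process several input folders, the command can be assembled as an argument list. The following is a minimal sketch that uses only short flags shown in the usage text above; the helper name `build_fraggler_command` is hypothetical and is not part of Fraggler.

```python
import shlex


def build_fraggler_command(analysis_type, fsa, output, ladder, sample_channel,
                           custom_peaks=None, peak_area_model=None):
    """Assemble a fraggler command line as a list of arguments.

    Hypothetical helper for illustration; it only mirrors flags that appear
    in the usage text (-t, -f, -o, -l, -sc, -cp, -m).
    """
    if analysis_type not in ("area", "peak"):
        raise ValueError("analysis_type must be 'area' or 'peak'")
    cmd = ["fraggler", "-t", analysis_type, "-f", str(fsa), "-o", str(output),
           "-l", ladder, "-sc", sample_channel]
    if custom_peaks is not None:
        cmd += ["-cp", str(custom_peaks)]  # csv file with custom peaks
    if peak_area_model is not None:
        cmd += ["-m", peak_area_model]  # gauss, voigt or lorentzian
    return cmd


cmd = build_fraggler_command("area", "demo/", "testing_fraggler", "LIZ", "DATA1")
print(shlex.join(cmd))
# To actually run it (requires fraggler to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when paths contain spaces.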

Peak finding

  • By default, fraggler searches for peaks in the fsa file without any assumptions about assays. To specify custom assays with particular peaks and intervals, pass a .csv file to the --custom_peaks argument. The csv file MUST have the following columns:

| name | start | stop | amount | min_ratio | which | peak_distance |
| ---- | ----- | ---- | ------ | --------- | ----- | ------------- |
| prt1 | 140 | 150 | 2 | 0.2 | FIRST | 5 |

Example of how a file could look:

```txt
name,start,stop,amount,min_ratio,which,peak_distance
prt1,135,155,2,0.2,FIRST,
prt3,190,205,,0.2,FIRST,
prt2,222,236,2,0.2,FIRST,5
prt4,262,290,5,,,
```

  • name: Name of the assay
  • start: Start of the assay in basepairs
  • stop: Stop of the assay in basepairs
  • amount: Optional. Number of peaks in the assay. If left empty, every peak in the interval is included.
  • min_ratio: Optional. Only peaks whose height is at least min_ratio times the height of the highest peak are included, e.g. if min_ratio == 0.2 and the highest peak is 100 units, only peaks with a height of at least 20 are included.
  • which: LARGEST | FIRST. Can be left empty. Which peaks to keep if there are more peaks than amount. For example, with amount set to 2, FIRST keeps the first two peaks and LARGEST keeps the two largest peaks in the interval. Defaults to LARGEST.
  • peak_distance: Optional. The distance between peaks must be below this value.
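The field semantics above can be sketched in plain Python. This is an illustration of the documented rules, not Fraggler's implementation: the function names, the inclusive interval bounds, and the "at least" threshold comparison are assumptions, the peak values are invented for the example, and `peak_distance` is parsed but not applied.

```python
import csv
import io

# The example file from above, embedded as a string.
EXAMPLE = """name,start,stop,amount,min_ratio,which,peak_distance
prt1,135,155,2,0.2,FIRST,
prt3,190,205,,0.2,FIRST,
prt2,222,236,2,0.2,FIRST,5
prt4,262,290,5,,,
"""


def read_assays(text):
    """Parse the custom-peaks CSV; empty optional fields become None."""
    assays = []
    for row in csv.DictReader(io.StringIO(text)):
        assays.append({
            "name": row["name"],
            "start": float(row["start"]),
            "stop": float(row["stop"]),
            "amount": int(row["amount"]) if row["amount"] else None,
            "min_ratio": float(row["min_ratio"]) if row["min_ratio"] else None,
            "which": row["which"] or "LARGEST",  # documented default
            "peak_distance": float(row["peak_distance"]) if row["peak_distance"] else None,
        })
    return assays


def select_peaks(peaks, assay):
    """Filter (size_in_basepairs, height) tuples according to one assay row."""
    # Keep peaks inside the assay interval (bounds assumed inclusive).
    inside = [p for p in peaks if assay["start"] <= p[0] <= assay["stop"]]
    # Drop peaks that are too low relative to the highest peak.
    if inside and assay["min_ratio"] is not None:
        threshold = assay["min_ratio"] * max(height for _, height in inside)
        inside = [p for p in inside if p[1] >= threshold]
    # If too many peaks remain, keep the first or the largest ones.
    if assay["amount"] is not None and len(inside) > assay["amount"]:
        if assay["which"] == "FIRST":
            inside = sorted(inside)[:assay["amount"]]
        else:
            by_height = sorted(inside, key=lambda p: p[1], reverse=True)
            inside = by_height[:assay["amount"]]
    return sorted(inside)


assays = read_assays(EXAMPLE)
# Invented peaks: (size in basepairs, height)
peaks = [(138.0, 100.0), (142.0, 15.0), (147.0, 60.0), (151.0, 90.0), (200.0, 40.0)]
print(select_peaks(peaks, assays[0]))
# [(138.0, 100.0), (147.0, 60.0)]
```

For prt1 the peak at 142 is dropped because its height of 15 is below 0.2 × 100 = 20, and FIRST then keeps the two leftmost of the three remaining peaks. With LARGEST, the peaks at 138 and 151 would be kept instead.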

GUI Integration

The GUI for Fraggler offers an accessible way to analyze FSA files without needing to interact with the command line directly. It allows users to upload files, set parameters, and generate analysis reports visually. The GUI is built with Qt (PySide6) and offers sidebar navigation for file browsing, options for report generation, and an output view for results.

Running the GUI

To run the Fraggler GUI after installing the package, use the following command:

```bash
fraggler-gui
```

GUI Features

  • File Explorer: Navigate and select FSA files for analysis.
  • Parameter Setup: Choose analysis type, set peak detection parameters, and specify custom peaks.
  • Report Generation: Display and export detailed reports in HTML format, including peak area and peak tables.

Documentation

Click here to get full documentation of API.

Output

One example of the report generated from fraggler area can be seen here: Example report

Citation

If you use fraggler, please cite the paper!

Contributions

Please check out How to contribute

Owner

  • Name: Clinical-Genomics-Umea
  • Login: Clinical-Genomics-Umea
  • Kind: organization

JOSS Publication

Fraggler: A Python Package and CLI Tool for Automated Fragment Analysis
Published
August 26, 2024
Volume 9, Issue 100, Page 6869
Authors
William Rosenbaum
Department of Medical Biosciences, Umeå University, SE-90185, Umeå, Sweden, Clinical Genomics Umeå, Umeå University, SE-90185, Umeå, Sweden
Pär Larsson
Clinical Genomics Umeå, Umeå University, SE-90185, Umeå, Sweden
Editor
Charlotte Soneson
Tags
Paralogue ratio test, PRT, Fragment analysis, Bioinformatics, DNA analysis, Automation, Fragment size determination

GitHub Events

Total
  • Watch event: 2
  • Issue comment event: 3
  • Push event: 2
  • Pull request event: 2
  • Fork event: 1
Last Year
  • Watch event: 2
  • Issue comment event: 3
  • Push event: 2
  • Pull request event: 2
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 83
  • Total Committers: 5
  • Avg Commits per committer: 16.6
  • Development Distribution Score (DDS): 0.301
Past Year
  • Commits: 28
  • Committers: 4
  • Avg Commits per committer: 7.0
  • Development Distribution Score (DDS): 0.536
Top Committers
Name Email Commits
Williams UMU-MacBook w****m@u****e 58
Thaddaeus Sandidge t****e@g****m 13
Sandidge t****e@v****m 8
Charlotte Soneson c****n@g****m 3
Pär Larsson p****n@u****e 1
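The all-time figures above can be reproduced from the per-committer counts in the table. The sketch below assumes the common definition of the Development Distribution Score, namely one minus the share of commits made by the most active committer; the page itself does not state the formula.

```python
# Commit counts taken from the "Top Committers" table above.
commit_counts = [58, 13, 8, 3, 1]

total_commits = sum(commit_counts)
avg_per_committer = total_commits / len(commit_counts)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commit_counts) / total_commits

print(total_commits, round(avg_per_committer, 1), round(dds, 3))
# 83 16.6 0.301
```

A DDS close to 0 means one committer wrote almost everything; here roughly 70% of the commits come from a single committer.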

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 3
  • Total pull requests: 20
  • Average time to close issues: about 1 month
  • Average time to close pull requests: about 15 hours
  • Total issue authors: 1
  • Total pull request authors: 3
  • Average comments per issue: 2.33
  • Average comments per pull request: 0.15
  • Merged pull requests: 20
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 5
  • Average time to close issues: N/A
  • Average time to close pull requests: 2 days
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.6
  • Merged pull requests: 5
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • KatyBrown (3)
Pull Request Authors
  • willros (34)
  • ThaddaeusSandidge (2)
  • csoneson (1)

Dependencies

.github/workflows/pdoc.yaml actions
  • actions/checkout v4 composite
  • actions/deploy-pages v4 composite
  • actions/setup-python v5 composite
  • actions/upload-pages-artifact v3 composite
requirements.txt pypi
  • altair ==5.2.0
  • biopython ==1.83
  • fire ==0.5.0
  • lmfit ==1.2.2
  • matplotlib ==3.8.3
  • networkx ==3.2.1
  • pandas ==2.2.0
  • panel ==1.3.8
  • scikit-learn ==1.4.1
  • scipy ==1.12.0
setup.py pypi
  • altair *
  • biopython *
  • bokeh ==3.3.4
  • colorama *
  • fire *
  • lmfit *
  • matplotlib *
  • networkx *
  • numpy *
  • openpyxl *
  • pandas *
  • panel ==1.3.8
  • scikit-learn *
  • scipy *
  • setuptools *
.github/workflows/rere_test.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite