delfies
delfies: a Python package for the detection of DNA breakpoints with neo-telomere addition - Published in JOSS (2025)
Science Score: 98.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 7 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org, zenodo.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
Querying genomes for evidence of Programmed DNA Elimination
Basic Info
Statistics
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 3
- Releases: 8
Metadata Files
README.md
delfies is a tool that identifies genomic locations where double-strand
breaks have occurred followed by telomere addition. It was initially designed
and validated for studying the process of Programmed DNA Elimination in
nematodes, but should work for other clades and applications too.
For details, or to credit the tool, please see and cite the associated paper:
Letcher, B. and Delattre, M. (2025). delfies: a Python package for the detection of DNA breakpoints with neo-telomere addition. Journal of Open Source Software, 10(105), 7385, https://doi.org/10.21105/joss.07385
Getting started
delfies takes as input a genome fasta (gzipped supported) and an indexed SAM/BAM of
sequencing reads aligned to the genome.
```sh
delfies --help
samtools index <aligned_reads>.bam
delfies <genome>.fa.gz <aligned_reads>.bam <output_dir>
cat <output_dir>/breakpoint_locations.bed
```
For how to obtain a suitable SAM/BAM, see input data, and for
downloading a real genome and BAMs for a test run of delfies, see test run.
Installation
Using pip (or equivalent - uv, etc.):
```sh
# Install latest release from PyPI
pip install delfies
# Or install a specific release from PyPI:
pip install delfies==0.10.0
# Or clone and install tip of main
git clone https://github.com/bricoletc/delfies/
pip install ./delfies
```
Input data
Sequencing technologies
delfies is designed to work with both Illumina short reads and ONT or PacBio
long reads. Long reads are better for finding breakpoints in more repetitive
regions of the genome. A high fraction of sequenced bases with a quality >Q20
is desirable (e.g. >70%). I found delfies worked on recent data from all three
sequencing technologies: see test run below.
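To check the >Q20 guideline above on your own reads, the fraction of high-quality bases can be estimated from FASTQ quality strings. The following is a minimal sketch, not part of delfies: the function name `fraction_above_q` is hypothetical, and it assumes the standard Phred+33 quality encoding.

```python
def fraction_above_q(quality_strings, threshold=20, offset=33):
    """Return the fraction of bases whose Phred score is above `threshold`.

    Assumes Phred+33 encoding (score = ASCII code - 33).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for char in qual:
            total += 1
            if ord(char) - offset > threshold:
                passing += 1
    return passing / total if total else 0.0


# Toy example: 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
quals = ["IIII####", "IIIIIIII"]
print(f"{fraction_above_q(quals):.2f}")  # 0.75
```

In practice you would feed in the fourth line of each FASTQ record; a result above roughly 0.7 matches the guideline given above.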
Aligners
To produce a SAM/BAM with which you can find breakpoints, you need to use a read
aligner that reports soft clips (parts of a read that are not aligned to the
reference). Both bowtie2 (in --local mode) and minimap2 (by default) do this.
Use minimap2 for long reads (>300bp), with the appropriate preset (e.g. -x map-ont
for Nanopore data).
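To illustrate what a soft clip looks like in an alignment record, here is a minimal sketch that extracts the soft-clipped ends of a read from its SAM CIGAR string and counts telomere repeats in them. It is not delfies' implementation: the helper name `soft_clips` and the toy read are hypothetical, and TTAGGC is simply the telomere sequence used in the test run of this README.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, read_seq):
    """Return (left_clip, right_clip) sequences of a read given its CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can flank soft clips but are absent from the read sequence.
    ops = [(n, op) for n, op in ops if op != "H"]
    left = read_seq[: ops[0][0]] if ops and ops[0][1] == "S" else ""
    right = (
        read_seq[len(read_seq) - ops[-1][0]:]
        if len(ops) > 1 and ops[-1][1] == "S"
        else ""
    )
    return left, right


# Toy read: 10 aligned bases followed by 12 soft-clipped telomeric bases.
read = "ACGTACGTAC" + "TTAGGCTTAGGC"
left, right = soft_clips("10M12S", read)
print(right, right.count("TTAGGC"))  # TTAGGCTTAGGC 2
```

A read whose soft-clipped end consists of telomere repeats, while the rest aligns to a non-telomeric locus, is the kind of evidence a breakpoint with telomere addition leaves in the alignments.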
Test run with real data
I provide a processed subset of publicly-available data here: https://doi.org/10.5281/zenodo.14101797.
The data consist of a 2kbp region of the assembled genome of Oscheius onirici
and three alignment BAMs from sequencing data produced using Illumina, ONT and
PacBio. The data were aligned to the 2kbp region using minimap2. See the
Zenodo link for details on the sequencing data (read lengths, error rates) and
public links to the raw data.
You can run delfies on the inputs in this archive to make sure it is properly
installed and produces the expected outputs:
```sh
wget https://zenodo.org/records/14282333/files/delfies_zenodo_test_data.tar.gz
tar xf delfies_zenodo_test_data.tar.gz

# Run delfies; for example, having defined genome, bam and odirname variables:
delfies --threads 16 \
  --telo_forward_seq TTAGGC \
  --breakpoint_type all \
  --min_mapq 20 \
  --min_supporting_reads 6 \
  ${genome} ${bam} ${odirname}

# Compare with the expected outputs:
find delfies_zenodo_test_data -name "*breakpoint_locations.bed" | xargs cat
```
User Manual
CLI options
```sh
delfies --help
```
- Do use the --threads option if you have multiple cores/CPUs available.
- [Breakpoints]
  - There are two types of breakpoints: see detailed docs.
  - Nearby breakpoints can be clustered together to account for variability in breakpoint location (--clustering_threshold).
- [Region selection]: You can select a specific region to focus on, specified as a string or as a BED file.
- [Telomeres]
  - Specify the telomere sequence for your organism using --telo_forward_seq. If you're unsure, I recommend the tool telomeric-identifier for finding out.
  - By default, delfies discards breakpoints occurring inside telomere arrays, as they in theory correspond to false positives (cutting + telomere addition at existing telomeres). You can keep these breakpoints with --keep_telomeric_breakpoints.
- [Aligned reads]
  - To analyse confidently-aligned reads only, you can filter reads by MAPQ (--min_mapq) and by bitwise flag (--read_filter_flag).
  - You can tolerate more or fewer mutations in the assembly telomeres (and in the sequencing reads) using --telo_max_edit_distance and --telo_array_size.
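The idea behind clustering nearby breakpoints can be sketched as follows. This is an illustrative single-linkage grouping of positions on one contig, not necessarily the algorithm delfies uses for --clustering_threshold; the function name `cluster_positions` and the example positions are hypothetical.

```python
def cluster_positions(positions, threshold):
    """Group positions so that consecutive members are at most `threshold` bp apart."""
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= threshold:
            # Close enough to the last member of the current cluster: extend it.
            clusters[-1].append(pos)
        else:
            # Too far from the previous position: start a new cluster.
            clusters.append([pos])
    return clusters


print(cluster_positions([100, 103, 250, 98, 255], threshold=5))
# [[98, 100, 103], [250, 255]]
```

A larger threshold merges more candidate positions into a single reported breakpoint, which helps when the cut site varies by a few bases between cells.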
Outputs
The two main outputs of delfies are:
- breakpoint_locations.bed: a BED-formatted file containing the location of identified elimination breakpoints.
- breakpoint_sequences.fasta: a FASTA-formatted file containing the sequences of identified elimination breakpoints.
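For downstream scripting, the BED output can be read with a few lines of Python. The sketch below relies only on the three columns every BED file has (contig, start, end) and keeps any further columns as raw strings, since their exact meaning is described in the detailed docs rather than here. The function name `read_bed` and the example record are made up for illustration.

```python
import io


def read_bed(handle):
    """Parse BED lines into dicts with chrom, start, end and remaining columns."""
    records = []
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        records.append(
            {
                "chrom": fields[0],
                "start": int(fields[1]),  # 0-based, inclusive
                "end": int(fields[2]),    # 0-based, exclusive
                "extra": fields[3:],      # tool-specific columns, left uninterpreted
            }
        )
    return records


# Hypothetical content; real files come from <output_dir>/breakpoint_locations.bed.
example = io.StringIO("contig_1\t1500\t1501\texample_name\t12\t+\n")
for record in read_bed(example):
    print(record["chrom"], record["start"], record["end"], record["extra"])
```

With a real run, replace the `io.StringIO` object with `open("<output_dir>/breakpoint_locations.bed")`.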
Validating breakpoints
I highly recommend visualising your results! E.g., by loading your input
fasta and BAM, together with delfies' output breakpoint_locations.bed, in
IGV.
Confident/true breakpoints will typically have:
- Good read support. Note that breakpoints are ordered by read support in the delfies output file breakpoint_locations.bed, and you can require a minimum number of supporting reads using the CLI option --min_supporting_reads.
- A difference in read coverage before and after the breakpoint. The nature of this difference depends on the ratio between cells with and without the breakpoint. As an example, in organisms that eliminate parts of their genome in the soma, if most sequenced cells are from the soma, expect more reads before the breakpoint than after it ('before' and 'after' defined relative to the reported breakpoint strand).
Ultimately though, only biological experiments can truly validate identified breakpoints.
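As a complement to visual inspection, the coverage criterion above can be turned into a rough numeric check. The sketch below compares mean read depth in a window before and after a breakpoint. It is not part of delfies: the function name `coverage_ratio`, the window size and the toy depths are hypothetical, and it assumes that for a '+' strand breakpoint 'before' means lower coordinates (check the detailed docs for the exact strand convention).

```python
def mean_depth(depths):
    """Mean of a list of per-base depths (0.0 for an empty list)."""
    return sum(depths) / len(depths) if depths else 0.0


def coverage_ratio(depths, breakpoint, window=100, strand="+"):
    """Mean depth 'before' the breakpoint divided by mean depth 'after' it.

    `depths` holds per-base read depth for one contig, indexed by 0-based position.
    Assumption: on the '+' strand, 'before' is the lower-coordinate side.
    """
    left = depths[max(0, breakpoint - window): breakpoint]
    right = depths[breakpoint: breakpoint + window]
    before, after = (left, right) if strand == "+" else (right, left)
    after_mean = mean_depth(after)
    return mean_depth(before) / after_mean if after_mean else float("inf")


# Toy contig: depth drops from 30x to 10x at position 50.
depths = [30] * 50 + [10] * 50
print(coverage_ratio(depths, breakpoint=50, window=20))  # 3.0
```

A ratio well above 1 is consistent with most sequenced cells having lost the region after the breakpoint; a ratio near 1 suggests the breakpoint is rare in the sample or spurious.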
Applications
- The fasta output enables looking for sequence motifs that occur at breakpoints, e.g. using MEME.
- The BED output enables classifying a genome into retained and eliminated regions. The 'strand' of breakpoints is especially useful for this: see detailed docs.
- The BED output also enables assembling past somatic telomeres: for how to do this, see detailed docs.
Detailed documentation
For more details on delfies, including outputs and applications, see detailed_docs.
Contributing
Contributions always welcome!
Please see CONTRIBUTING.md for how (reporting issues, requesting
features, contributing code). This document includes instructions on how to run
delfies' unit and functional tests.
Owner
- Name: Brice Letcher
- Login: bricoletc
- Kind: user
- Company: EMBL-EBI
- Twitter: bricoletc
- Repositories: 2
- Profile: https://github.com/bricoletc
Bioinformatician and early-career researcher - EMBL-EBI and CNRS ~~~~~~ Parsing my way through DNA sequence data
JOSS Publication
delfies: a Python package for the detection of DNA breakpoints with neo-telomere addition
Authors
Tags
Bioinformatics, Genomics, Programmed DNA Elimination, Soma/germline differentiation
Citation (CITATION.cff)
cff-version: "1.2.0"
authors:
- family-names: Letcher
  given-names: Brice
  orcid: "https://orcid.org/0000-0002-8921-6005"
- family-names: Delattre
  given-names: Marie
  orcid: "https://orcid.org/0000-0003-1640-0300"
doi: 10.5281/zenodo.14526258
message: If you use this software in your own work, please cite the associated article in the
  Journal of Open Source Software.
preferred-citation:
  authors:
  - family-names: Letcher
    given-names: Brice
    orcid: "https://orcid.org/0000-0002-8921-6005"
  - family-names: Delattre
    given-names: Marie
    orcid: "https://orcid.org/0000-0003-1640-0300"
  date-published: 2025-01-12
  doi: 10.21105/joss.07385
  issn: 2475-9066
  issue: 105
  journal: Journal of Open Source Software
  publisher:
    name: Open Journals
  start: 7385
  title: "delfies: a Python package for the detection of DNA breakpoints
    with neo-telomere addition"
  type: article
  url: "https://joss.theoj.org/papers/10.21105/joss.07385"
  volume: 10
title: "delfies: a Python package for the detection of DNA breakpoints
  with neo-telomere addition"
GitHub Events
Total
- Create event: 5
- Release event: 3
- Issues event: 10
- Watch event: 3
- Issue comment event: 7
- Push event: 27
- Pull request event: 1
Last Year
- Create event: 5
- Release event: 3
- Issues event: 10
- Watch event: 3
- Issue comment event: 7
- Push event: 27
- Pull request event: 1
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Brice Letcher | b****r@e****r | 144 |
| andrewhsiao11 | 9****1 | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 8
- Total pull requests: 1
- Average time to close issues: 12 days
- Average time to close pull requests: 9 days
- Total issue authors: 3
- Total pull request authors: 1
- Average comments per issue: 1.5
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 8
- Pull requests: 1
- Average time to close issues: 12 days
- Average time to close pull requests: 9 days
- Issue authors: 3
- Pull request authors: 1
- Average comments per issue: 1.5
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- bricoletc (3)
- natir (3)
- andrewhsiao11 (2)
Pull Request Authors
- andrewhsiao11 (2)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 71 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 5
- Total maintainers: 1
pypi.org: delfies
delfies is a tool for the detection of DNA Elimination breakpoints
- Documentation: https://delfies.readthedocs.io/
- License: MIT
- Latest release: 0.10.0 (published 9 months ago)
Dependencies
- black 24.1.1 develop
- flake8 7.0.0 develop
- isort 5.13.2 develop
- mccabe 0.7.0 develop
- mypy-extensions 1.0.0 develop
- packaging 23.2 develop
- pathspec 0.12.1 develop
- platformdirs 4.2.0 develop
- pycodestyle 2.11.1 develop
- pyflakes 3.2.0 develop
- tomli 2.0.1 develop
- typing-extensions 4.7.1 develop
- click 8.1.7
- colorama 0.4.6
- numpy 1.21.1
- pybedtools 0.9.1
- pysam 0.22.0
- six 1.16.0
- black ^24.1.1 develop
- flake8 ^7.0.0 develop
- isort ^5.13.2 develop
- click ^8.1.7
- pybedtools ^0.9.1
- pysam ^0.22.0
- python ^3.8.1
- actions/checkout v4 composite
- actions/upload-artifact v3 composite
- openjournals/openjournals-draft-action master composite
- actions/checkout v4 composite
- actions/download-artifact v4 composite
- actions/setup-python v5 composite
- actions/upload-artifact v4 composite
- pypa/gh-action-pypi-publish release/v1 composite
- snok/install-poetry v1 composite
