readfish
CLI tool for flexible and fast adaptive sampling on ONT sequencers
Science Score: 59.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 15 DOI reference(s) in README
- ✓ Academic publication links: Links to biorxiv.org
- ✓ Committers with academic emails: 3 of 9 committers (33.3%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (15.3%) to scientific vocabulary
Basic Info
- Host: GitHub
- Owner: LooseLab
- License: gpl-3.0
- Language: Python
- Default Branch: main
- Homepage: https://looselab.github.io/readfish/
- Size: 9.92 MB
Statistics
- Stars: 183
- Watchers: 13
- Forks: 35
- Open Issues: 16
- Releases: 9
Topics
Metadata Files
README.md
If you are anything like us (Matt), reading a README is the last thing you do when running code. PLEASE DON'T DO THAT FOR READFISH. This will effect changes to your sequencing and - if you use it incorrectly - cost you money. We have added a list of GOTCHAs at the end of this README. We have almost certainly missed some... so - if something goes wrong, let us know so we can add you to the GOTCHA hall of fame!
> [!NOTE]
> We also have more detailed documentation for your perusal at https://looselab.github.io/readfish

> [!NOTE]
> Now also see our cool FAQ.

> [!WARNING]
> Breaking for any version of MinKNOW <= 6.0.0. As of `readfish >=2024.3.0` we no longer support Guppy.

readfish is a Python package that integrates with the Read Until API.
The Read Until API provides a mechanism for an application to connect to a MinKNOW server to obtain read data in real-time. The data can be analysed in the way most fit for purpose, and a return call can be made to the server to unblock the read in progress and so direct sequencing capacity towards reads of interest.
This implementation of readfish requires Dorado server version >= 7.3.9 and MinKNOW core version >= 6.0.0. It will not work on earlier versions.
To run with earlier versions of MinKNOW please use an earlier version of readfish.
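As a rough illustration of the version requirement above, the following minimal sketch compares dotted version strings against the minimum versions quoted in this README. The helper names (`parse_version`, `is_supported`) are hypothetical and are not part of readfish, which performs its own compatibility checks; the sketch also assumes plain numeric `major.minor.patch` strings.

```python
# Hypothetical helpers, not part of readfish: compare dotted version strings
# against the minimum versions quoted in this README.

def parse_version(text):
    """Turn a dotted version string such as '7.3.9' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


# Minimum versions stated in the README text above.
MIN_DORADO_SERVER = parse_version("7.3.9")
MIN_MINKNOW_CORE = parse_version("6.0.0")


def is_supported(dorado_server_version, minknow_core_version):
    """Return True when both versions meet the stated minimums."""
    return (
        parse_version(dorado_server_version) >= MIN_DORADO_SERVER
        and parse_version(minknow_core_version) >= MIN_MINKNOW_CORE
    )
```

Comparing tuples of integers rather than raw strings avoids the pitfall where `"7.10.0"` sorts before `"7.3.9"` as text.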
The code here has been tested with Dorado in GPU mode using GridION Mk1 and NVIDIA RTX4090s on live sequencing runs and on MacOSX M2Max using playback on a simulated run (see below for how to test this). This code is run at your own risk as it DOES affect sequencing output. You are strongly advised to test your setup prior to running (see below for example tests).
Supported Sequencing Platforms
The following platforms are supported:
- PromethION Big Boy
- P2Solo Smol Big Boy
- P2i Not so Smol Big Boy
- GridION Box
- MinION Smol Boy
> [!WARNING]
> PromethION support is currently only available using the Mappy-rs plugin. See here for more information.

Supported OS's
The following OSs are supported:
- Linux yay
- MacOS boo (Apple Silicon Only)
> [!NOTE]
> MacOS support is available on MinKNOW 5.7 and greater, using the Dorado basecaller on Apple Silicon devices only.

Citation
The paper is available at Nature Biotechnology and bioRxiv.
If you use this software please cite: 10.1038/s41587-020-00746-x
Readfish enables targeted nanopore sequencing of gigabase-sized genomes. Alexander Payne, Nadine Holmes, Thomas Clarke, Rory Munro, Bisrat Debebe, Matthew Loose. Nat Biotechnol (2020); doi: https://doi.org/10.1038/s41587-020-00746-x
Other works
An update preprint is available at bioRxiv
Barcode aware adaptive sampling for Oxford Nanopore sequencers. Alexander Payne, Rory Munro, Nadine Holmes, Christopher Moore, Matt Carlile, Matthew Loose. bioRxiv (2021); doi: https://doi.org/10.1101/2021.12.01.470722
Installation
Our preferred installation method is via conda.
The environment is specified as:
```yaml
name: readfish
channels:
  - bioconda
  - conda-forge
  - defaults
dependencies:
  - python=3.10
  - pip
  - pip:
    - readfish[all]
```
Saving the snippet above as readfish_env.yml and running the following commands will create the environment.
```console
conda env create -f readfish_env.yml
conda activate readfish
```
Apple Silicon
Some users may encounter an issue with grpcio on Apple Silicon. This can be fixed by reinstalling grpcio as follows:
```console
pip uninstall grpcio
GRPC_PYTHON_LDFLAGS=" -framework CoreFoundation" pip install grpcio --no-binary :all:
```
Installing with development dependencies
A conda YAML file is available for installing with dev dependencies - development.yml
```bash
# Quote the URL so the shell does not interpret the `?`, and name the output file explicitly
curl -L -o development.yml "https://raw.githubusercontent.com/LooseLab/readfish/e30f1fa8ac7a37bb39e9d8b49251426fe1674c98/docs/development.yml?token=GHSAT0AAAAAACBZL42IS3QVM4ZGPPW4SHB6ZE67V6Q"
conda env create -f development.yml
conda activate readfish_dev
```
| Important !! |
|:---|
| MinKNOW has now transitioned from Guppy to Dorado. Until MinKNOW version 5.9 both Guppy and Dorado used `ont-pyguppy-client-lib`. As of MinKNOW version 5.9 and Dorado server version 7.3.9 and greater, Dorado required an alternate library, `ont-pybasecall-client-lib`, but Guppy could still be used. As of MinKNOW 6.0 Guppy support has been deprecated and only Dorado support is provided. It is important to ensure the correct library is installed for your specific configuration, as the listed `ont-pyguppy-client-lib` or `ont-pybasecall-client-lib` version may not match the versions installed on your system. To fix this, please see this issue, using the appropriate library. |

ONT's Dorado Basecall Server GPU should be installed and running as a server.
Alternatively, install readfish into a python virtual-environment
```console
# Make a virtual environment
python3 -m venv readfish
. ./readfish/bin/activate
pip install --upgrade pip

# Install our readfish Software
pip install readfish[all]

# Install ont_pybasecall_client that matches your dorado basecall server version. E.G.
pip install ont_pybasecall_client_lib==7.1.2
```
Usage
```console
usage: readfish [-h] [--version]  ...

positional arguments:
                   Sub-commands
    targets        Run targeted sequencing
    barcode-targets
                   Run targeted sequencing
    unblock-all    Unblock all reads
    validate       readfish TOML Validator

options:
  -h, --help       show this help message and exit
  --version        show program's version number and exit
```
TOML File
For information on the TOML files see TOML.md. There are several example TOMLs here, with comments explaining what each field does, as well as the overall purpose of the TOML file.
Testing
To test readfish on your configuration we recommend first running a playback experiment to test unblock speed and then selection.
The following steps should all happen with a configuration (test) flow cell inserted into the target device.
If no test flow cell is available, a simulated device can be created within MinKNOW by following the instructions below. These assume that you are running MinKNOW locally, using default ports. If this is not the case, a developer API token is required on the commands, and the correct port must be set.
Adding a simulated position for testing
1. Linux
In the readfish virtual environment we created earlier:
- See help
```console
python -m minknow_api.examples.manage_simulated_devices --help
```
- Add Minion position
```console
python -m minknow_api.examples.manage_simulated_devices --add MS00000
```
- Add PromethION position
```console
python -m minknow_api.examples.manage_simulated_devices --prom --add S0
```
2. Mac
In the readfish virtual environment we created earlier:
- See help
```console
python -m minknow_api.examples.manage_simulated_devices --help
```
- Add Minion position
```console
python -m minknow_api.examples.manage_simulated_devices --add MS00000
```
- Add PromethION position
```console
python -m minknow_api.examples.manage_simulated_devices --prom --add S0
```
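If you prefer to script the steps above, the sketch below builds the same command lines as Python argument lists. It only uses the module name and the `--add` and `--prom` flags shown above; the helper name `simulated_device_command` is hypothetical and is not part of `minknow_api` or readfish.

```python
import sys

# Module name taken from the commands shown above.
MODULE = "minknow_api.examples.manage_simulated_devices"


def simulated_device_command(position, prom=False):
    """Build the argument list for adding one simulated position.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    command = [sys.executable, "-m", MODULE]
    if prom:
        # PromethION positions need the --prom flag, as in the example above.
        command.append("--prom")
    command += ["--add", position]
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from within the environment where `minknow_api` is installed.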
As a backup, it is possible to restart MinKNOW with a simulated device. This is done as follows:
1. Stop `minknow`
On Linux:
```console
cd /opt/ont/minknow/bin
sudo systemctl stop minknow
```
1. Start MinKNOW with a simulated device
On Linux
```console
sudo ./mk_manager_svc -c /opt/ont/minknow/conf --simulated-minion-devices=1 &
```
You _may_ need to add the host `127.0.0.1` in the MinKNOW UI.
Configuring bulk FAST5 file Playback
Download an open access bulk FAST5 file, either [R9.4.1 4 kHz][bulk - R9.4.1] or [R10 (5 kHz)][bulk - R10.4 5khz].
This file is 21 GB, so make sure you have sufficient space.
A PromethION bulkfile is also available, but please note this is [R10.4 4 kHz][bulk - promethION - R10.4 4khz] and so will give slightly unexpected results on MinKNOW, which assumes 5 kHz.
This file is approximately 35 GB in size.
Previously, to set up playback using a pre-recorded bulk FAST5 file, it was necessary to edit the sequencing configuration file that MinKNOW uses. This is no longer the case. The "old method" steps are kept after this section for reference only, or in case the option to play back directly from a bulk file is removed in future.
To start sequencing using playback, simply begin setting up the run in the MinKNOW UI as you would usually.
Under Run Options you can select Simulated Playback and browse to the downloaded Bulk Fast5 file.

> [!NOTE]
> The instructions below still work but are no longer required. They are left here for reference only. As of MinKNOW 5.7, it is possible to select a bulk FAST5 file for playback in the MinKNOW UI.
Old method: Configuring bulk FAST5 file Playback
To set up a simulation, the sequencing configuration file that MinKNOW uses must be edited.
Steps:
1. Download an open access bulkfile - either [R9.4.1][bulk - R9.4.1] or [R10 (5 kHz)][bulk - R10.4 5khz]. These files are approximately 21 GB, so make sure you have plenty of space. The files are from NA12878 sequencing data using either R9.4.1 or R10.4 pores. Data is not barcoded and the libraries were ligation preps from DNA extracted from cell lines.
1. A PromethION bulkfile is also available, but please note this is [R10.4, 4 kHz][bulk - promethION - R10.4 4khz], and so will give slightly unexpected results on MinKNOW, which assumes 5 kHz.
1. Copy a sequencing TOML file to the `user_scripts` folder:
On Mac if your MinKNOW output directory is the default:
```console
mkdir -p /Library/MinKNOW/data/user_scripts/simulations
cp /Applications/MinKNOW.app/Contents/Resources/conf/package/sequencing/sequencing_MIN106_DNA.toml /Library/MinKNOW/data/user_scripts/simulations/sequencing_MIN106_DNA_sim.toml
```
On Linux:
```console
sudo mkdir -p /opt/ont/minknow/conf/package/sequencing/simulations
cp /opt/ont/minknow/conf/package/sequencing/sequencing_MIN106_DNA.toml /opt/ont/minknow/conf/package/sequencing/simulations/sequencing_MIN106_DNA_sim.toml
```
1. Edit the copied file to add the following line under the line that reads "`[custom_settings]`":
```text
simulation = "/full/path/to/your_bulk.FAST5"
```
Change the text between the quotes to point to your downloaded bulk FAST5 file.
1. Optional: if running Dorado in GPU mode, you can change the parameter `break_reads_after_seconds = 1.0`
to `break_reads_after_seconds = 0.4`. This results in a smaller read chunk. For R10.4 this is not required but can be tried. For adaptive sampling on PromethION, this should be left at 1 second.
1. In MinKNOW >= 6.0.0 this value defaults to 0.8, which is a reasonable balance.
1. In the MinKNOW GUI, right click on a sequencing position and select `Reload Scripts`.
Your version of MinKNOW will now play back the bulkfile rather than live sequencing.
1. Start a sequencing run as you would normally, selecting the corresponding flow
cell type to the edited script (here FLO-MIN106) as the flow cell type.
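The edit described in the old method (adding a `simulation` line under `[custom_settings]`) can also be scripted. The following is a minimal sketch using plain string handling; the helper names are hypothetical, it assumes the section header appears on a line of its own exactly as `[custom_settings]`, and it does not escape special characters in the path.

```python
def add_simulation_line(toml_text, bulk_path):
    """Insert a simulation line directly under the [custom_settings] header.

    Hypothetical helper; assumes the header sits alone on its own line.
    """
    lines = toml_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "[custom_settings]":
            lines.insert(index + 1, f'simulation = "{bulk_path}"')
            return "\n".join(lines) + "\n"
    raise ValueError("no [custom_settings] section found")


def remove_simulation_line(toml_text):
    """Drop any simulation line again, restoring live sequencing."""
    kept = [
        line
        for line in toml_text.splitlines()
        if line.split("=")[0].strip() != "simulation"
    ]
    return "\n".join(kept) + "\n"
```

Keeping a matching removal helper next to the insertion helper makes it harder to leave the simulation line in place by accident once testing is finished.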
Whichever instructions you followed, the run should start and immediately begin a mux scan. Let it run for around
five minutes after which your read length histogram should look as below:

Testing unblock response
Now we shall test unblocking by running `readfish unblock-all` which will simply eject
every single read on the flow cell.
1. To do this run:
```console
readfish unblock-all --device
```
Testing base-calling and mapping
To test selective sequencing you must have access to a [dorado basecall server](https://community.nanoporetech.com/downloads/dorado/release_notes)
and a readfish TOML configuration file.
1. First make a local copy of the example TOML file:
```console
curl -O https://raw.githubusercontent.com/LooseLab/readfish/master/docs/_static/example_tomls/human_chr_selection.toml
```
1. If on PromethION, edit the `mapper_settings.mappy` section to read:
```toml
[mapper_settings.mappy-rs]
```
1. If on MinKNOW core>=5.9.0 and Dorado server version >=7.3.9, edit the `basecaller` section to read:
```toml
[caller_settings.dorado]
```
1. Modify the `fn_idx_in` field in the file to be the full path to a [minimap2](https://github.com/lh3/minimap2) index of the human genome.
1. Modify the `targets` fields for each condition to reflect the naming convention used in your index. This is the sequence name only, up to but not including any whitespace.
e.g. `>chr1 human chromosome 1` would become `chr1`. If these names do not match, then target matching will fail.
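The naming rule above can be checked before starting a run. This is a minimal sketch, assuming targets are given as bare contig names; both helper names are hypothetical and are not part of readfish.

```python
def contig_name(fasta_header):
    """Return the sequence name: the text after '>' up to the first whitespace."""
    fields = fasta_header.lstrip(">").split(maxsplit=1)
    return fields[0] if fields else ""


def missing_targets(targets, fasta_headers):
    """List the targets whose contig name is absent from the reference headers.

    Assumes each target is a bare contig name such as 'chr1'.
    """
    known = {contig_name(header) for header in fasta_headers}
    return [target for target in targets if target not in known]
```

For example, `contig_name(">chr1 human chromosome 1")` gives `chr1`, so a target written as `1` or `chromosome 1` would be reported as missing.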
We can now validate this TOML file to see if it will be loaded correctly.
```console
readfish validate human_chr_selection.toml
```
Errors with the configuration will be written to the terminal along with a text description of the conditions for the experiment as below.
```text
2023-10-05 15:29:18,934 readfish /home/adoni5/mambaforge/envs/readfish_dev/bin/readfish validate human_chr_selection.toml
2023-10-05 15:29:18,934 readfish command='validate'
2023-10-05 15:29:18,934 readfish log_file=None
2023-10-05 15:29:18,934 readfish log_format='%(asctime)s %(name)s %(message)s'
2023-10-05 15:29:18,934 readfish log_level='info'
2023-10-05 15:29:18,934 readfish no_check_plugins=False
2023-10-05 15:29:18,934 readfish no_describe=False
2023-10-05 15:29:18,934 readfish prom=False
2023-10-05 15:29:18,934 readfish toml='human_chr_selection.toml'
2023-10-05 15:29:18,934 readfish.validate eJydVk1v2zgQvetXEMqlxdryxyZAGyAHt0WKAk1TNNlTkBVoiZKIUKQiUonTX79vSEmW2zRo1/BBIkdvZt68GfKIXXV1zdunU3Z9efGZZUYXsmSFVIIVpmWt4GruZC3YlluRcaWkLmdM6FZmFR7JKDpi7tGwrGpNbayphWWv8MLWS8Z1ztar16zAFnOVYFVXc81KoWHGpGacWaDAWStKaXQCrOtK2v6V8aZREnjOMLiGC661UJbxrDXWesTHylCsyjxmTCiVRAOE2PG6wRYeQ1ZdK/KQVKc1xf4oXYUQ2be3yXGyYttO3e0zd8I6GHm8jxSvzJjjbSkc3LeC2UZkspCAzGUrMqeeKB+KyJlaBZx+AXA1UBiLEYiTZYwXeOD2dLo6s4B3M6FzPLVgjswokj6RGQyrdj1bzlbL5eyvOCYMvxQTbZ8KqsCa0h1Dm3n3AugIOHhB0iBy61+tzAVxwvvEZgyUbw1ICQHYUA5hBWu4c6LVoBJ84eta7gjeG0/TRki+qklmHzwHnr8NXJrCW20FKoUdofLAo9g1ikuNMPBhbbCSC8elGmBzk3U1UuCOBDEHWuVcY08XC2WMFYpvkxJ17LaJNAvINS+krRYUTFK5WpH7d710RTsqwaNFN2E1tcJRrW1Sdk3zdOurMv39639szuJgEY8UW2Y6oFaIRI8tIqglPpKhX5r3bXPoPChE81pEfdOdsTjXPG1XS5JjKt4k6/R4udw2Nj25q76nBbeONLHJ81ZA/T2j5eioz3FONQOLBe+UY7y3JiUFUyhENhkIXLi6WYSMFif4JdFgjFCeNyH/54jjHnm7pnMeVupcPsi844rmBWQTGhD/y6/Xny6/bD4jJu7bFVKivAcdZTQG7jvpBFMkQRLcN1GbB7xDE9T3ubR8SzrKxbYrU2U8UUo+iDQ4K+7joDFZamT//rDCNUbItML0/nKFvcW0wn6BlO0f5q3tc8FI8i4H56RSEFCgp3TmY++sSNhVZTqVU9MI6BQRnm+urjd+AGh2cfEpKnQq810KvSOxRQVKFjw3Wp4sPvTat4t30khNcwRpZRY6L+yiKv9+k2qTcuXAAk/K74nFuHRhA8MgvUjqWlLJLhtiA/X5ujmfVo5oDGl4N39G/+S7hhfk5ktXb5GgF6YvziAPsSP9vwpM0qEwUPl6fNsbEMNGq6fXkU4HnDN2HPng/LFwUGMUzQrxZ1PhiIOMJyvtPBw0IVA/fUaaw2n0QRRSS+dtkFfc6a0y2V2Mady0JhMij30KsXWmgSIzgVaA0GI/3DgBK0w8G0UksyPWf3RKMxGD0IBl7zarOn1HhNNo5o2jwwrzVRS06donoge7Nb8DqjZeSDkUahHZjMlEDMg+E20eB4d9wKfsn/DglUuMDAaHgQ+BDVZ9SFbcd6RqxETZ0jds/AZneEni8jn0NXc/HuX+3An3huFmkdux8s7fH/obAz2tkujmpi/O7W1Ec5KEh/tDSidzHNVSp73DM7ZCHliQdVczPYqw3+5J5CNf4yHGcxHVfLfHOSYcvnseJzQ0nUsjEMpB0WO6FTgeQRryxbghBYUJsUUvpRMXPPNjabhInLGb+Mv7dLlav10vk1V8SwX58bbhbyN7JqNwY0qNnxeH1Yvp+433QeE6Uov0x0TrL0Jebj3lZOI9JFGNg0L+CvBlRC9eh3vZFHvWX60cUwKHhd8afA3RFwV5G9ppmMO/HT3pRDq/CqQAPuTxPPQL2NSq/lu6L/YeLJIIm9AtGez9JBGmLjqCvIxDYPJ7MQltxmaazhqChOdnA/8NyIGWKeJP4jvA/gXiz7IPWqD7mQ16ZnvIyF/n0oNenFDyv3yEG+IeMvoPUvBL7w==
2023-10-05 15:29:18,937 readfish.validate Loaded TOML config without error
2023-10-05 15:29:18,937 readfish.validate Initialising Caller
2023-10-05 15:29:18,945 readfish.validate Caller initialised
2023-10-05 15:29:18,945 readfish.validate Initialising Aligner
2023-10-05 15:29:18,947 readfish.validate Aligner initialised
2023-10-05 15:29:18,948 readfish.validate Configuration description:
Region hum_test (control=False).
Region applies to section of flow cell (# = applied, . = not applied):
################################
################################
################################
################################
################################
################################
################################
################################
2023-10-05 15:29:18,948 readfish.validate Using the mappy plugin. Using reference: /home/adoni5/Documents/Bioinformatics/refs/hg38_no_alts.fa.gz.split/hg38_chr_M.mmi.
Region hum_test has targets on 1 contig, with 1 found in the provided reference.
This region has 2 total targets (+ve and -ve strands), covering approximately 100.00% of the genome.
```
1. If your TOML file validates then run the following command:
```console
readfish targets --toml
```
Testing expected results from a selection experiment
The only way to test readfish on a playback run is to look at changes in read length for rejected vs accepted reads. To do this:
1. Start a fresh simulation run using the bulkfile provided above.
2. Restart the readfish command (as above):
```console
readfish targets --toml
```
Analysing results with readfish stats
Once a run is complete, it can be analysed with the readfish stats command.
HTML file output is optional.
```console
readfish stats --toml
```

| Condition | Reads | Alignments: On-Target | Alignments: Off-Target | Alignments: Total | Yield: On-Target | Yield: Off-Target | Yield: Total | Yield: Ratio | Median read length: On-target | Median read length: Off-target | Median read length: Combined | Number of targets | Percent target | Estimated coverage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hum_test | 112,058 | 819 (0.73%) | 111,239 (99.27%) | 112,058 | 9.27 Mb (5.49%) | 159.43 Mb (94.51%) | 168.69 Mb | 1:17.20 | 0 b | 896 b | 896 b | 2 | 3.60% | 0.08 X |

Condition Name: hum_test

| Condition | Contig | Contig Length | Reads: Mapped | Reads: Unmapped | Reads: Total | Alignments: On-Target | Alignments: Off-Target | Alignments: Total | Yield: On-Target | Yield: Off-Target | Yield: Total | Yield: Ratio | Median read length: On-target | Median read length: Off-target | Median read length: Combined | N50: On-Target | N50: Off-Target | N50: Total | Number of targets | Percent target | Estimated coverage |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| hum_test | chr1 | 248,956,422 | 10,015 | 0 | 10,015 | 6 (0.06%) | 10,009 (99.94%) | 10,015 | 48.65 Kb (0.37%) | 13.03 Mb (99.63%) | 13.08 Mb | 1:267.87 | 0 b | 891 b | 891 b | 0 b | 1.35 Kb | 1.35 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr2 | 242,193,529 | 8,825 | 0 | 8,825 | 9 (0.10%) | 8,816 (99.90%) | 8,825 | 47.36 Kb (0.36%) | 13.05 Mb (99.64%) | 13.09 Mb | 1:275.51 | 0 b | 894 b | 894 b | 0 b | 1.49 Kb | 1.49 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr3 | 198,295,559 | 8,005 | 0 | 8,005 | 6 (0.07%) | 7,999 (99.93%) | 8,005 | 193.03 Kb (1.73%) | 10.98 Mb (98.27%) | 11.17 Mb | 1:56.86 | 0 b | 893 b | 893 b | 0 b | 1.42 Kb | 1.42 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr4 | 190,214,555 | 7,381 | 0 | 7,381 | 30 (0.41%) | 7,351 (99.59%) | 7,381 | 861.07 Kb (7.29%) | 10.95 Mb (92.71%) | 11.81 Mb | 1:12.72 | 0 b | 917 b | 917 b | 0 b | 1.60 Kb | 1.60 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr5 | 181,538,259 | 7,545 | 0 | 7,545 | 5 (0.07%) | 7,540 (99.93%) | 7,545 | 50.70 Kb (0.50%) | 10.18 Mb (99.50%) | 10.23 Mb | 1:200.68 | 0 b | 896 b | 896 b | 0 b | 1.40 Kb | 1.40 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr6 | 170,805,979 | 5,808 | 0 | 5,808 | 9 (0.15%) | 5,799 (99.85%) | 5,808 | 116.44 Kb (1.35%) | 8.53 Mb (98.65%) | 8.65 Mb | 1:73.28 | 0 b | 905 b | 905 b | 0 b | 1.49 Kb | 1.49 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr7 | 159,345,973 | 6,383 | 0 | 6,383 | 2 (0.03%) | 6,381 (99.97%) | 6,383 | 26.06 Kb (0.29%) | 9.11 Mb (99.71%) | 9.14 Mb | 1:349.59 | 0 b | 895 b | 895 b | 0 b | 1.44 Kb | 1.44 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr8 | 145,138,636 | 5,208 | 0 | 5,208 | 1 (0.02%) | 5,207 (99.98%) | 5,208 | 285 b (0.00%) | 7.43 Mb (100.00%) | 7.43 Mb | 1:26061.60 | 0 b | 892 b | 892 b | 0 b | 1.44 Kb | 1.44 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr9 | 138,394,717 | 4,253 | 0 | 4,253 | 23 (0.54%) | 4,230 (99.46%) | 4,253 | 91.15 Kb (1.50%) | 6.00 Mb (98.50%) | 6.09 Mb | 1:65.85 | 0 b | 899 b | 899 b | 0 b | 1.46 Kb | 1.46 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr10 | 133,797,422 | 4,424 | 0 | 4,424 | 15 (0.34%) | 4,409 (99.66%) | 4,424 | 95.02 Kb (1.37%) | 6.86 Mb (98.63%) | 6.95 Mb | 1:72.16 | 0 b | 915 b | 915 b | 0 b | 1.56 Kb | 1.56 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr11 | 135,086,622 | 5,349 | 0 | 5,349 | 1 (0.02%) | 5,348 (99.98%) | 5,349 | 287 b (0.00%) | 6.89 Mb (100.00%) | 6.89 Mb | 1:23997.50 | 0 b | 896 b | 896 b | 0 b | 1.35 Kb | 1.35 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr12 | 133,275,309 | 5,508 | 0 | 5,508 | 3 (0.05%) | 5,505 (99.95%) | 5,508 | 2.63 Kb (0.03%) | 7.59 Mb (99.97%) | 7.59 Mb | 1:2888.96 | 0 b | 893 b | 893 b | 0 b | 1.40 Kb | 1.40 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr13 | 114,364,328 | 3,414 | 0 | 3,414 | 8 (0.23%) | 3,406 (99.77%) | 3,414 | 85.71 Kb (1.80%) | 4.69 Mb (98.20%) | 4.77 Mb | 1:54.67 | 0 b | 900 b | 900 b | 0 b | 1.43 Kb | 1.43 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr14 | 107,043,718 | 3,541 | 0 | 3,541 | 12 (0.34%) | 3,529 (99.66%) | 3,541 | 244.18 Kb (4.79%) | 4.86 Mb (95.21%) | 5.10 Mb | 1:19.90 | 0 b | 892 b | 892 b | 0 b | 1.42 Kb | 1.42 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr15 | 101,991,189 | 3,033 | 0 | 3,033 | 3 (0.10%) | 3,030 (99.90%) | 3,033 | 4.29 Kb (0.11%) | 3.79 Mb (99.89%) | 3.80 Mb | 1:883.07 | 0 b | 867 b | 867 b | 0 b | 1.31 Kb | 1.31 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr16 | 90,338,345 | 3,276 | 0 | 3,276 | 1 (0.03%) | 3,275 (99.97%) | 3,276 | 1.97 Kb (0.04%) | 4.51 Mb (99.96%) | 4.51 Mb | 1:2294.28 | 0 b | 900 b | 900 b | 0 b | 1.41 Kb | 1.41 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr17 | 83,257,441 | 3,378 | 0 | 3,378 | 10 (0.30%) | 3,368 (99.70%) | 3,378 | 16.81 Kb (0.36%) | 4.72 Mb (99.64%) | 4.73 Mb | 1:280.52 | 0 b | 907 b | 907 b | 0 b | 1.43 Kb | 1.43 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr18 | 80,373,285 | 3,158 | 0 | 3,158 | 3 (0.09%) | 3,155 (99.91%) | 3,158 | 186.59 Kb (4.06%) | 4.41 Mb (95.94%) | 4.59 Mb | 1:23.61 | 0 b | 899 b | 899 b | 0 b | 1.47 Kb | 1.47 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr19 | 58,617,616 | 2,110 | 0 | 2,110 | 0 (0.00%) | 2,110 (100.00%) | 2,110 | 0 b (0.00%) | 2.53 Mb (100.00%) | 2.53 Mb | 0:0.00 | 0 b | 857 b | 857 b | 0 b | 1.27 Kb | 1.27 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chr20 | 64,444,167 | 370 | 0 | 370 | 370 (100.00%) | 0 (0.00%) | 370 | 3.60 Mb (100.00%) | 0 b (0.00%) | 3.60 Mb | 1:0.00 | 0 b | 2.88 Kb | 2.88 Kb | 0 b | 32.28 Kb | 32.28 Kb | 1 | 100.00% | 0.06 X |
| hum_test | chr21 | 46,709,983 | 265 | 0 | 265 | 265 (100.00%) | 0 (0.00%) | 265 | 3.06 Mb (100.00%) | 0 b (0.00%) | 3.06 Mb | 1:0.00 | 0 b | 2.63 Kb | 2.63 Kb | 0 b | 33.54 Kb | 33.54 Kb | 1 | 100.00% | 0.07 X |
| hum_test | chr22 | 50,818,468 | 1,741 | 0 | 1,741 | 28 (1.61%) | 1,713 (98.39%) | 1,741 | 421.99 Kb (14.61%) | 2.47 Mb (85.39%) | 2.89 Mb | 1:5.85 | 0 b | 922 b | 922 b | 0 b | 1.63 Kb | 1.63 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chrM | 16,569 | 19 | 0 | 19 | 0 (0.00%) | 19 (100.00%) | 19 | 0 b (0.00%) | 16.82 Kb (100.00%) | 16.82 Kb | 0:0.00 | 0 b | 774 b | 774 b | 0 b | 1.11 Kb | 1.11 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chrX | 156,040,895 | 5,636 | 0 | 5,636 | 5 (0.09%) | 5,631 (99.91%) | 5,636 | 3.19 Kb (0.04%) | 7.46 Mb (99.96%) | 7.46 Mb | 1:2336.71 | 0 b | 905 b | 905 b | 0 b | 1.38 Kb | 1.38 Kb | 0 | 0.00% | 0.00 X |
| hum_test | chrY | 57,227,415 | 116 | 0 | 116 | 4 (3.45%) | 112 (96.55%) | 116 | 117.28 Kb (27.65%) | 306.90 Kb (72.35%) | 424.19 Kb | 1:2.62 | 0 b | 989 b | 989 b | 0 b | 28.59 Kb | 28.59 Kb | 0 | 0.00% | 0.00 X |
| hum_test | unmapped | 0 | 0 | 3,297 | 3,297 | 0 (0.00%) | 3,297 (100.00%) | 3,297 | 0 b (0.00%) | 9.10 Mb (100.00%) | 9.10 Mb | 0:0.00 | 0 b | 508 b | 508 b | 0 b | 16.81 Kb | 16.81 Kb | 0 | 0.00% | 0.00 X |
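
To help read the tables above, the sketch below reproduces a few of the derived columns from the raw values. It is only an illustration that is consistent with the figures shown, not the implementation used by `readfish stats`; the function names are hypothetical, and the coverage helper assumes that 1 Mb means 1,000,000 bases.

```python
def percent(part, total):
    """Format part/total as a percentage with two decimals."""
    return f"{100 * part / total:.2f}%" if total else "0.00%"


def yield_ratio(on_target, off_target):
    """Format the on-target to off-target yield as '1:N'.

    Mirrors the '0:0.00' shown in the table when there is no on-target yield.
    """
    if on_target == 0:
        return "0:0.00"
    return f"1:{off_target / on_target:.2f}"


def estimated_coverage(on_target_bases, target_length):
    """Format on-target bases divided by target length as fold coverage."""
    return f"{on_target_bases / target_length:.2f} X"
```

For the summary row, `percent(819, 112058)` gives the 0.73% on-target alignment share and `yield_ratio(9.27, 159.43)` gives the 1:17.20 yield ratio.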
Common Gotchas
These may or may not (!) be mistakes we have made already...
1. If the previous run has not fully completed, i.e. it is still base-calling or processing raw data, you may connect to the wrong instance and see nothing happening. Always check the previous run has finished completely.
1. If you have forgotten to remove your simulation line from your sequencing TOML you will forever be trapped in an Inception-like resequencing of old data... Don't do this!
1. If base-calling doesn't seem to be working:
- Check your base-calling server is running.
- Check the IP of your server is correct.
- Check the port of your server is correct.
1. If you are expecting reads to unblock but they do not, check that you have set `control=false` in your readfish TOML file. `control=true` will prevent any unblocks but does otherwise run the full analysis pipeline.
1. Oh no - every single read is being unblocked - I have nothing on target!
- Double check your reference file is in the correct location.
- Double check your targets exist in that reference file.
- Double check your targets are correctly formatted, with contig names matching the record names in your reference (exclude the description, i.e. use the contig name up to the first whitespace).
Happy readfish-ing!
Acknowledgements
We're really grateful to lots of people for help and support. Here's a few of them...
From the lab: Teri Evans, Sam Holt, Lewis Gallagher, Chris Alder, Thomas Clarke
From ONT: Stu Reid, Chris Wright, Rosemary Dokos, Chris Seymour, Clive Brown, George Pimm, Jon Pugh
From the Nanopore World: Nick Loman, Josh Quick, John Tyson, Jared Simpson, Ewan Birney, Alexander Senf, Nick Goldman, Miten Jain, Lukas Weilguny
And for our Awesome Logo please check out @tim_bassford from @TurbineCreative!
Changelog
2024.3.0
This release is breaking for ALL versions of MinKNOW <= 6 and no longer supports Guppy.
- Introducing support for MinKNOW >=6.0.0 and deprecating support for earlier versions.
- Removing support for legacy guppy base caller and only supporting Dorado in future.
- Optimising batch sending to the base caller through the use of `pass_reads` rather than `pass_read`
- Adding the new strand classifications as used by MinKNOW, including strand2 and short.
2024.2.0
- Add a dorado base-caller which addressed issue #347 - chiefly in Dorado 7.3.9 ONT have moved to `ont-pybasecall-client-lib`, and connections from `ont_pyguppy_client_lib` raise `Connection error. ... LOAD_CONFIG. Reply: INVALID_PROTOCOL` (#344)
- Adds version checking for MinKNOW and Guppy/Dorado, logs if not compatible (#351)
2024.1.0
- bug fix type for `--wait-on-ready` type and actual function (#327), (#323)
- multiple suffix `.mmi` support (#330)
- Change the default `unblock_duration` on the `Analysis` class to use `DEFAULT_UNBLOCK` value defined in `_cli_args.py`. Change type on the Argparser for `--unblock-duration` to float. (#313)
- Big dog Duplex feature - adds ability to select duplex reads that cover a target region. See pull request for details (#324)
2023.1.1
- Fix Readme Logo link (#296)
- Fix bug where we had accidentally started requiring barcoded TOMLs to specify a region. Thanks to @jamesemery for catching this. (#299)
- Correctly handle overriding a decision in internal statistics tracking. (#299)
Owner
- Name: LooseLab
- Login: LooseLab
- Kind: organization
- Repositories: 7
- Profile: https://github.com/LooseLab
GitHub Events
Total
- Issues event: 23
- Watch event: 14
- Issue comment event: 36
- Push event: 4
- Pull request review event: 2
- Pull request event: 6
- Fork event: 3
- Create event: 3
Last Year
- Issues event: 23
- Watch event: 14
- Issue comment event: 36
- Push event: 4
- Pull request review event: 2
- Pull request event: 6
- Fork event: 3
- Create event: 3
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Alex Payne | a****e@n****k | 133 |
| Matt | m****e@n****k | 107 |
| Adoni5 | r****1@g****m | 51 |
| alex | 3****s | 25 |
| Matt Loose | m****e@D****l | 11 |
| Thomas Clarke | T****e@l****m | 6 |
| Svennd | s****n@g****m | 2 |
| Alexander Payne | a****2@g****m | 2 |
| Matt Loose | m****e@d****k | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 150
- Total pull requests: 40
- Average time to close issues: about 1 year
- Average time to close pull requests: about 1 month
- Total issue authors: 64
- Total pull request authors: 4
- Average comments per issue: 4.53
- Average comments per pull request: 1.4
- Merged pull requests: 30
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 13
- Pull requests: 5
- Average time to close issues: 17 days
- Average time to close pull requests: 19 days
- Issue authors: 10
- Pull request authors: 2
- Average comments per issue: 1.77
- Average comments per pull request: 0.4
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Adoni5 (26)
- mattloose (23)
- alexomics (19)
- lborcard (4)
- ahfitzpa (3)
- jamesemery (3)
- HJTsai (3)
- satriobio (3)
- hasindu2008 (3)
- awjga (2)
- ythuang0522 (2)
- DanteV19 (2)
- jennieli421 (2)
- maximilianmordig (2)
- ps-account (2)
Pull Request Authors
- Adoni5 (42)
- mattloose (12)
- alexomics (4)
- satriobio (3)
Top Labels
Issue Labels
Pull Request Labels
Packages
- Total packages: 1
- Total downloads:
  - pypi: 125 last-month
- Total docker downloads: 9
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 11
- Total maintainers: 3
pypi.org: readfish
ONT adaptive sampling
- Documentation: https://looselab.github.io/readfish
- License: MIT License
- Latest release: 2024.3.0 (published over 1 year ago)
Rankings
Dependencies
- actions/checkout v4 composite
- actions/download-artifact v3 composite
- actions/setup-python v4 composite
- actions/upload-artifact v3 composite
- peaceiris/actions-gh-pages v3 composite
- pre-commit/action v3.0.0 composite
- pypa/gh-action-pypi-publish release/v1 composite
- actions/checkout v4 composite
- juliangruber/read-file-action v1 composite
- peter-evans/create-or-update-comment a35cf36e5301d70b76f316e867e7788a55a31dae composite
- actions/stale v8 composite
- cattrs *
- exceptiongroup python_version<"3.11"
- minknow_api *
- more_itertools *
- numpy *
- read_until @ git+https://github.com/nanoporetech/read_until_api@v3.4.1
- readfish [all]
- readfish_summarise >= 0.2.4
- rtoml *