EchoFlow: End-to-end self-supervised sonar-image pipeline - Published in JOSS (2026)
Science Score: 92.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 4 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org, zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
Containerized data pipeline for processing Kongsberg .raw files
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 13
- Releases: 2
Metadata Files
README.md
EchoFlow
Do you have terabytes of unprocessed sonar data? Use EchoFlow for easy viewing and inference.
EchoFlow is a three-stage containerised pipeline that converts raw Kongsberg EK80 echosounder files into echograms and DINO ViT attention maps.
Stages:
1. Conversion (raw) — decodes .raw pings to volume-backscattering strength NetCDF files via pyEcholab.
2. Pre-processing (preprocessing) — contrast-stretches and tiles echograms as PNGs.
3. Inference (infer) — runs a DINO Vision Transformer to produce per-patch attention heat-maps.
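The pre-processing stage above can be pictured with a minimal sketch. It assumes percentile-based clipping and fixed-width tiling along the ping axis; the README does not state which stretch or tile size EchoFlow actually uses, and the function names `contrast_stretch` and `tile` are hypothetical.

```python
import numpy as np


def contrast_stretch(sv_db, lo_pct=2.0, hi_pct=98.0):
    """Clip Sv (dB) to percentile bounds and rescale to uint8 (assumed scheme)."""
    finite = sv_db[np.isfinite(sv_db)]
    if finite.size == 0:
        return np.zeros(sv_db.shape, dtype=np.uint8)
    lo, hi = np.percentile(finite, [lo_pct, hi_pct])
    if hi <= lo:  # flat input: nothing to stretch
        return np.zeros(sv_db.shape, dtype=np.uint8)
    # Treat missing samples as the lower bound so they render as background.
    filled = np.nan_to_num(sv_db, nan=lo, posinf=hi, neginf=lo)
    scaled = (np.clip(filled, lo, hi) - lo) / (hi - lo)
    return np.round(scaled * 255.0).astype(np.uint8)


def tile(image, tile_width):
    """Split a (depth, ping) image into full-width tiles; the remainder is dropped."""
    n_tiles = image.shape[1] // tile_width
    return [image[:, i * tile_width:(i + 1) * tile_width] for i in range(n_tiles)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sv = rng.normal(-70.0, 10.0, size=(256, 1000))  # synthetic Sv values in dB
    tiles = tile(contrast_stretch(sv), 256)
    print(len(tiles), tiles[0].shape, tiles[0].dtype)
```

Each tile could then be written as a PNG with an imaging library of your choice; that step is left out to keep the sketch dependency-light.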
Features
- Preprocessing: Contrast-stretches and tiles echograms into PNGs ready for inference.
- Inference: Uses a DINO Vision Transformer to produce and inspect per-patch attention maps.
- Docker Support: Run the pipeline in an isolated Docker environment for consistency and ease of use.
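To give a feel for what "per-patch" attention means in terms of resolution, here is a small arithmetic sketch. It assumes the standard ViT layout of non-overlapping square patches plus one class token, with the image cropped down to a multiple of the patch size; the helper name `attention_grid` is hypothetical and not part of EchoFlow.

```python
def attention_grid(height, width, patch_size=8):
    """Return (rows, cols, n_tokens) of the patch grid for an image.

    Assumes non-overlapping square patches and a single class token,
    so n_tokens = rows * cols + 1. Pixels beyond the last full patch
    are ignored (equivalent to cropping).
    """
    if patch_size <= 0:
        raise ValueError("patch_size must be positive")
    rows = height // patch_size
    cols = width // patch_size
    return rows, cols, rows * cols + 1


if __name__ == "__main__":
    rows, cols, tokens = attention_grid(480, 480, patch_size=8)
    print(f"heat-map grid: {rows} x {cols}, sequence length: {tokens}")
```

Under these assumptions, a smaller patch size gives a finer heat-map but a longer token sequence, which raises memory use per image.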
Repository Structure
.
├── raw_consumer/ # process raw to xarray
│ ├── Dockerfile.raw # Dockerfile for volumetric backscatter computing with pyEcholab
│ ├── preprocessing.py # Script for converting raw file to volumetric backscatter cubes
├── preprocessing/ # Preprocessing components
│ ├── Dockerfile.preprocessing # Dockerfile for preprocessing
│ ├── preprocessing.py # Script for preprocessing input data
├── inference/ # Inference components
│ ├── Dockerfile.infer # Dockerfile for inference
│ ├── attention_inspect.py # Script for inspecting attention maps
│ ├── inspect_attention.py # Main script for running inference and inspection
│ ├── requirements.txt # Python dependencies for inference demo
│ ├── utils.py # Utility functions for inference demo
│ ├── vision_transformer.py # DINO Vision Transformer model implementation
├── docker-compose.yml # Docker Compose file to run the entire pipeline
├── entrypoint.sh # Entrypoint script for Docker container
├── infer.py # Main script to run inference outside Docker
├── run_docker.sh # Script to run the pipeline using Docker
├── watchdog.py # Script to watch for changes in the pipeline
Installation
Prerequisites
- Docker ≥ 24
- Docker Compose v2 (`docker compose`; note: no hyphen)
- Git
- AWS CLI (for downloading the sample input file)
Clone with submodules
```bash
git clone --recurse-submodules https://github.com/erlingdevold/EchoFlow.git
```
If you have already cloned without submodules:
```bash
git submodule update --init --recursive
```
Without Docker (native install)
The raw_consumer stage requires the HDF5 and netCDF C libraries:
- macOS: `brew install hdf5 netcdf`
- Ubuntu/Debian: `apt-get install libhdf5-dev libnetcdf-dev`
- Docker images include these already; no action is needed for containerised use.
Then install Python dependencies for the stages you need:
```bash
pip install -r raw_consumer/requirements.txt
pip install -r preprocessing/requirements.txt
pip install -r inference/requirements.txt
```
Populate input
The commands below fetch a publicly available NOAA EK80 test file (~105 MB) into data/input/, create an empty ./inference/checkpoint.pth if none exists, and synchronise git submodule URLs before the Docker images are built.
```bash
aws s3 cp --no-sign-request \
  "s3://noaa-wcsd-pds/data/raw/BellM.Shimada/SH2306/EK80/Hake-D20230811-T165727.raw" \
  data/input/
touch ./inference/checkpoint.pth
git submodule sync --recursive
```
Setup
Running with Docker
Run the full pipeline
```bash
docker compose up --build
```
This builds and starts all three stages (Conversion, Pre-processing, Inference) plus the progress monitor.
Run individual stages
You can also run each stage independently:
| Stage | Command |
|-------|---------|
| Stage 1 — Conversion | docker compose up --build raw |
| Stage 2 — Pre-processing | docker compose up --build preprocessing |
| Stage 3 — Inference | docker compose up --build infer |
| All stages (no monitor) | docker compose up --build raw preprocessing infer |
Each stage reads from and writes to bind-mounted directories under ./data/, so stages can be run in sequence without rebuilding upstream containers.
Monitor dashboard
Once the pipeline is running, a progress monitor is available at http://localhost:8050.
The dashboard is a pipeline progress monitor — it tracks file counts and tail-logs for each stage. It is not an inference viewer. Actual outputs are written to:
- `data/raw_consumer/`: converted NetCDF files (Stage 1 output)
- `data/preprocessing/`: echogram PNGs (Stage 2 output)
- `data/inference/`: attention map PNGs (Stage 3 output)
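If you want a quick progress check without opening the dashboard, file counts per stage can be approximated from the output directories listed above. This is a sketch, not the monitor's actual logic: the helper name `count_outputs` is hypothetical, and it assumes Stage 1 files carry a `.nc` suffix and that outputs may sit in nested subdirectories.

```python
from pathlib import Path

# Directories as listed above; suffixes are assumptions (.nc for NetCDF, .png for images).
STAGE_OUTPUTS = {
    "raw_consumer": ("data/raw_consumer", ".nc"),
    "preprocessing": ("data/preprocessing", ".png"),
    "inference": ("data/inference", ".png"),
}


def count_outputs(root="."):
    """Count output files per stage under `root`, searching subdirectories too."""
    counts = {}
    for stage, (rel_dir, suffix) in STAGE_OUTPUTS.items():
        stage_dir = Path(root) / rel_dir
        if not stage_dir.is_dir():
            counts[stage] = 0  # stage has not produced anything yet
            continue
        counts[stage] = sum(1 for p in stage_dir.rglob(f"*{suffix}") if p.is_file())
    return counts


if __name__ == "__main__":
    for stage, n in count_outputs().items():
        print(f"{stage}: {n} file(s)")
```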
Performance / parallelism
Stages 1 (conversion) and 2 (pre-processing) use process pools to parallelise work across files, controlled by the MAX_WORKERS env var (defaults to CPU count). Stage 3 (inference) uses batched GPU inference with a configurable BATCH_SIZE (default 4) and auto-detects CUDA when available. EK80 .raw files are large binary datagram files, so file I/O can limit stages 1–2; adding workers within a node raises throughput, but sub-linearly (in the reference benchmark, Stage 1 stops improving beyond 4 workers). Parallelism is intra-node only; there is no built-in cluster or scheduler integration.
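The per-file process-pool pattern described above might look roughly like the following. This is a sketch under assumptions, not EchoFlow's code: `resolve_max_workers`, `convert_one`, and `convert_all` are hypothetical names, and the per-file work is replaced by a placeholder.

```python
import os
from concurrent.futures import ProcessPoolExecutor


def resolve_max_workers(env=None):
    """Read MAX_WORKERS, falling back to the CPU count when unset or invalid."""
    env = os.environ if env is None else env
    try:
        value = int(env.get("MAX_WORKERS", ""))
    except ValueError:
        value = 0
    return value if value > 0 else (os.cpu_count() or 1)


def convert_one(path):
    """Placeholder for per-file work (hypothetical); returns the output file name."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem + ".nc"


def convert_all(paths, max_workers=None):
    """Process files in parallel, one task per file, preserving input order."""
    workers = max_workers or resolve_max_workers()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(convert_one, paths))


if __name__ == "__main__":
    print(convert_all(["data/input/a.raw", "data/input/b.raw"], max_workers=2))
```

One task per file keeps workers independent, which suits this workload because files do not share state; the trade-off is that a single very large file cannot be split across workers.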
Benchmarking
The benchmark script runs the full three-stage pipeline inside Docker on real EK80 data, varying MAX_WORKERS to measure scaling:
```bash
# 1. Download 64 EK80 test files (~6.7 GB) from the NOAA public bucket
bash downloadbenchdata.sh
# 2. Build Docker images
docker compose build
# 3. Install requirements
pip install tyro
# 4. Run benchmark (default: 1,2,4,8 workers)
python benchmark.py
```
See python benchmark.py --help for options. Reference results on a 12-core Intel i7-12700, 32 GB RAM, RTX 3090 (64 files, 6.7 GB):
| Workers | Stage 1 (s) | Stage 2 (s) | Stage 3 (s) | Total (s) | s/file |
|---------|-------------|-------------|-------------|-----------|--------|
| 1 | 139.6 | 514.0 | 393 | 1046.6 | 16.35 |
| 2 | 86.5 | 275.7 | 393 | 755.2 | 11.80 |
| 4 | 56.3 | 156.2 | 393 | 605.5 | 9.46 |
| 8 | 56.9 | 97.2 | 393 | 547.1 | 8.55 |
Stages 1–2 scale with MAX_WORKERS; stage 3 is GPU-bound. In production, stages overlap via the watchdog. Run python benchmark.py on your hardware to regenerate.
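To read the scaling out of the reference table, speedup and parallel efficiency can be computed directly from the reported timings. The numbers below are copied from the table; the helper names are hypothetical.

```python
# Wall-clock times in seconds from the reference table, keyed by MAX_WORKERS.
STAGE1 = {1: 139.6, 2: 86.5, 4: 56.3, 8: 56.9}
STAGE2 = {1: 514.0, 2: 275.7, 4: 156.2, 8: 97.2}
STAGE3 = 393.0  # identical in every row of the table
N_FILES = 64


def speedup(times, workers):
    """Single-worker time divided by the time at `workers`."""
    return times[1] / times[workers]


def efficiency(times, workers):
    """Speedup per worker; 1.0 would be perfectly linear scaling."""
    return speedup(times, workers) / workers


def total_seconds(workers):
    """End-to-end time when the three stages run back to back."""
    return STAGE1[workers] + STAGE2[workers] + STAGE3


if __name__ == "__main__":
    for w in (1, 2, 4, 8):
        print(
            f"{w} workers: stage 1 x{speedup(STAGE1, w):.2f}, "
            f"stage 2 x{speedup(STAGE2, w):.2f}, "
            f"total {total_seconds(w):.1f} s "
            f"({total_seconds(w) / N_FILES:.2f} s/file)"
        )
```

On these figures, Stage 1 levels off at roughly 2.5x by 4 workers, Stage 2 reaches roughly 5.3x at 8 workers, and the end-to-end time improves by only about 1.9x because the constant 393 s of Stage 3 dominates once stages 1–2 shrink.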
ENV variables
Global
- `MAX_WORKERS`: max parallel workers for stages 1–2 (default: CPU count)
watchdog.py
- `LOG_DIR` (default: `"/data/log"`)
- `INPUT_DIR` (default: `"/data/sonar"`)
- `OUTPUT_DIR` (default: `"/data/processed"`)
raw.py (Stage 1)
- `INPUT_DIR` (default: `"/data/sonar"`)
- `OUTPUT_DIR` (default: `"/data/processed"`)
- `LOG_DIR` (default: `"log"`)
preprocessing.py (Stage 2)
- `INPUT_DIR` (default: `"/data/processed"`)
- `OUTPUT_DIR` (default: `"/data/test_imgs"`)
- `LOG_DIR` (default: `"."`)
- `KEEP_INTERMEDIATES`: preserve `.nc` files after processing (default: `"true"`)
inspect_attention.py (Stage 3)
- `INPUT_DIR` (default: `"/data/test_imgs"`)
- `OUTPUT_DIR` (default: `"/data/inference"`)
- `LOG_DIR` (default: `"."`)
- `PATCH_SZ` (default: `8`)
- `ARCH` (default: `'vit_small'`)
- `DOWNSAMPLE_SIZE` (default: `5000`)
- `BATCH_SIZE`: images per GPU forward pass (default: `4`)
- `DEVICE`: `"cuda"`, `"cpu"`, or auto-detect (default: auto)
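As an illustration of how the Stage 3 variables and their defaults fit together, here is a minimal loader sketch. It is not taken from inspect_attention.py: the `InferenceConfig` class and `load_inference_config` function are hypothetical, and representing the auto-detect default as the string `"auto"` is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class InferenceConfig:
    input_dir: str
    output_dir: str
    log_dir: str
    patch_sz: int
    arch: str
    downsample_size: int
    batch_size: int
    device: str  # "cuda", "cpu", or "auto" (assumed spelling for auto-detect)


def load_inference_config(env=None):
    """Build a config from environment variables, using the documented defaults."""
    env = os.environ if env is None else env
    return InferenceConfig(
        input_dir=env.get("INPUT_DIR", "/data/test_imgs"),
        output_dir=env.get("OUTPUT_DIR", "/data/inference"),
        log_dir=env.get("LOG_DIR", "."),
        patch_sz=int(env.get("PATCH_SZ", "8")),
        arch=env.get("ARCH", "vit_small"),
        downsample_size=int(env.get("DOWNSAMPLE_SIZE", "5000")),
        batch_size=int(env.get("BATCH_SIZE", "4")),
        device=env.get("DEVICE", "auto"),
    )


if __name__ == "__main__":
    print(load_inference_config({"BATCH_SIZE": "16", "DEVICE": "cpu"}))
```

Parsing everything in one place means a malformed value such as `BATCH_SIZE=four` fails at start-up with a `ValueError` rather than partway through a run.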
Output
The output of the inference step, including generated attention maps and transformed images, will be saved in data/inference/. Each run creates a subdirectory named after the input file for organised output management.
Contributing
See CONTRIBUTING.md for guidelines on reporting bugs, suggesting features, and submitting pull requests.
License
Licensed under the MIT License — see LICENSE for details.
Acknowledgements
This pipeline uses the DINO Vision Transformer for attention-based image analysis. The implementation is based on research from the original DINO paper by Facebook AI Research (FAIR).
Owner
- Name: Erling Arvola Devold
- Login: erlingdevold
- Kind: user
- Website: erlingdevdev.github.io
- Repositories: 15
- Profile: https://github.com/erlingdevold
JOSS Publication
EchoFlow: End-to-end self-supervised sonar-image pipeline
Tags
computer-vision self-supervised-learning vision-transformer marine-robotics fisheries-acoustics containerization
Citation (CITATION.cff)
---
cff-version: 1.2.0
title: "EchoFlow"
version: "1.0.0"
doi: 10.5281/zenodo.15634054
message: "If you use EchoFlow in your research, please cite it as:"
authors:
  - family-names: Devold
    given-names: Erling
    orcid: 0009-0000-0949-5992
date-released: 2025-06-10
license: MIT
url: https://github.com/erlingdevold/EchoFlow
GitHub Events
Total
- Release event: 1
- Issues event: 7
- Issue comment event: 5
- Push event: 17
Last Year
- Release event: 1
- Issues event: 7
- Issue comment event: 5
- Push event: 17
Issues and Pull Requests
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Dependencies
- Jinja2 ==3.1.4
- MarkupSafe ==2.1.5
- contourpy ==1.2.1
- cycler ==0.12.1
- filelock ==3.15.4
- fonttools ==4.53.1
- fsspec ==2024.6.1
- kiwisolver ==1.4.5
- matplotlib ==3.9.2
- mpmath ==1.3.0
- networkx ==3.3
- numpy ==2.0.1
- nvidia-cublas-cu12 ==12.1.3.1
- nvidia-cuda-cupti-cu12 ==12.1.105
- nvidia-cuda-nvrtc-cu12 ==12.1.105
- nvidia-cuda-runtime-cu12 ==12.1.105
- nvidia-cudnn-cu12 ==9.1.0.70
- nvidia-cufft-cu12 ==11.0.2.54
- nvidia-curand-cu12 ==10.3.2.106
- nvidia-cusolver-cu12 ==11.4.5.107
- nvidia-cusparse-cu12 ==12.1.0.106
- nvidia-nccl-cu12 ==2.20.5
- nvidia-nvjitlink-cu12 ==12.6.20
- nvidia-nvtx-cu12 ==12.1.105
- packaging ==24.1
- pillow ==10.4.0
- pyparsing ==3.1.2
- python-dateutil ==2.9.0.post0
- six ==1.16.0
- sympy ==1.13.2
- torch ==2.4.0
- torchvision ==0.19.0
- triton ==3.0.0
- typing_extensions ==4.12.2
- matplotlib *
- netcdf4 *
- numpy *
- scikit-learn *
- xarray *
- Babel ==2.14.0
- Jinja2 ==3.1.3
- MarkupSafe ==2.1.5
- PyYAML ==6.0.1
- Pygments ==2.17.2
- QtPy ==2.4.1
- Send2Trash ==1.8.3
- aiobotocore ==2.12.2
- aiohttp ==3.9.3
- aioitertools ==0.11.0
- aiosignal ==1.3.1
- anyio ==4.3.0
- argon2-cffi ==23.1.0
- argon2-cffi-bindings ==21.2.0
- arrow ==1.3.0
- asciitree ==0.3.3
- asttokens ==2.4.1
- async-lru ==2.0.4
- attrs ==23.2.0
- beautifulsoup4 ==4.12.3
- bleach ==6.1.0
- botocore ==1.34.51
- certifi ==2024.2.2
- cffi ==1.16.0
- cftime ==1.6.3
- charset-normalizer ==3.3.2
- click ==8.1.7
- cloudpickle ==3.0.0
- comm ==0.2.2
- contourpy ==1.2.1
- cycler ==0.12.1
- dask ==2024.4.1
- debugpy ==1.8.1
- decorator ==5.1.1
- defusedxml ==0.7.1
- distributed ==2024.4.1
- executing ==2.0.1
- fasteners ==0.19
- fastjsonschema ==2.19.1
- flox ==0.9.6
- fonttools ==4.51.0
- fqdn ==1.5.1
- frozenlist ==1.4.1
- fsspec ==2024.3.1
- future ==1.0.0
- geographiclib ==2.0
- geopy ==2.4.1
- h11 ==0.14.0
- httpcore ==1.0.5
- httpx ==0.27.0
- idna ==3.6
- importlib_metadata ==7.1.0
- ipykernel ==6.29.4
- ipython ==8.23.0
- ipywidgets ==8.1.2
- isoduration ==20.11.0
- jedi ==0.19.1
- jmespath ==1.0.1
- json5 ==0.9.24
- jsonpointer ==2.4
- jsonschema ==4.21.1
- jsonschema-specifications ==2023.12.1
- jupyter ==1.0.0
- jupyter-console ==6.6.3
- jupyter-events ==0.10.0
- jupyter-lsp ==2.2.5
- jupyter_client ==8.6.1
- jupyter_core ==5.7.2
- jupyter_server ==2.13.0
- jupyter_server_terminals ==0.5.3
- jupyterlab ==4.1.6
- jupyterlab_pygments ==0.3.0
- jupyterlab_server ==2.26.0
- jupyterlab_widgets ==3.0.10
- kiwisolver ==1.4.5
- locket ==1.0.0
- lxml ==5.2.1
- matplotlib ==3.8.4
- matplotlib-inline ==0.1.6
- mistune ==3.0.2
- msgpack ==1.0.8
- multidict ==6.0.5
- nbclient ==0.10.0
- nbconvert ==7.16.3
- nbformat ==5.10.4
- nest-asyncio ==1.6.0
- netCDF4 ==1.6.5
- notebook ==7.1.2
- notebook_shim ==0.2.4
- numcodecs ==0.12.1
- numpy ==1.26.4
- numpy-groupies ==0.10.2
- opencv-python ==4.9.0.80
- overrides ==7.7.0
- packaging ==24.0
- pandas ==2.2.1
- pandocfilters ==1.5.1
- parso ==0.8.4
- partd ==1.4.1
- pexpect ==4.9.0
- pillow ==10.3.0
- platformdirs ==4.2.0
- prometheus_client ==0.20.0
- prompt-toolkit ==3.0.43
- psutil ==5.9.8
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- pycparser ==2.22
- pynmea2 ==1.19.0
- pyparsing ==3.1.2
- python-dateutil ==2.9.0.post0
- python-json-logger ==2.0.7
- pytz ==2024.1
- pyzmq ==25.1.2
- qtconsole ==5.5.1
- referencing ==0.34.0
- requests ==2.31.0
- rfc3339-validator ==0.1.4
- rfc3986-validator ==0.1.1
- rpds-py ==0.18.0
- s3fs ==2024.3.1
- scipy ==1.13.0
- six ==1.16.0
- sniffio ==1.3.1
- sortedcontainers ==2.4.0
- soupsieve ==2.5
- stack-data ==0.6.3
- tblib ==3.0.0
- terminado ==0.18.1
- tinycss2 ==1.2.1
- toolz ==0.12.1
- tornado ==6.4
- tqdm ==4.66.2
- traitlets ==5.14.2
- types-python-dateutil ==2.9.0.20240316
- typing_extensions ==4.11.0
- tzdata ==2024.1
- uri-template ==1.3.0
- urllib3 ==2.0.7
- wcwidth ==0.2.13
- webcolors ==1.13
- webencodings ==0.5.1
- websocket-client ==1.7.0
- widgetsnbextension ==4.0.10
- wrapt ==1.16.0
- xarray ==2024.3.0
- xarray-datatree ==0.0.6
- yarl ==1.9.4
- zarr ==2.17.2
- zict ==3.0.0
- zipp ==3.18.1
