funmixer

Unmixing nested concentration observations on river networks. An implementation of the method described in Barnes & Lipp (2024).

https://github.com/alexlipp/funmixer


Repository

Basic Info
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 2
  • Open Issues: 1
  • Releases: 0
Created almost 4 years ago · Last pushed 7 months ago

README.md

Funmixer - Unmixing nested observed concentrations in river networks for source regions

This repository implements an efficient solution to the unmixing of nested concentrations in a (river) network using convex optimisation. The method is described in our article in Water Resources Research.

Data input assumptions

The algorithm requires:

1) A GDAL readable raster of D8 flow directions. We use the ESRI/Arc D8 convention of representing directions with increasing powers of 2 (i.e., 1, 2, 4, 8 etc.) with sink pixels indicated by 0. We assume that every cell in the domain eventually flows into a sink node within the domain (or is itself a sink node). This assumption requires that every boundary pixel is set to be a sink.
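The two raster assumptions above (valid D8 codes, and a boundary made entirely of sinks) can be checked before running the algorithm. The following is a minimal sketch that assumes the raster has already been loaded into a 2D NumPy array; `check_d8` and `set_boundary_to_sinks` are hypothetical helper names for illustration and are not part of funmixer.

```python
import numpy as np

# ESRI/Arc D8 codes: eight flow directions as powers of 2, plus 0 for sinks.
VALID_D8_CODES = (0, 1, 2, 4, 8, 16, 32, 64, 128)


def check_d8(d8):
    """Return a list of problems found in a 2D array of D8 codes."""
    d8 = np.asarray(d8)
    problems = []
    if not np.isin(d8, VALID_D8_CODES).all():
        problems.append("array contains values that are not valid D8 codes")
    # Collect the four edges of the array; every edge pixel should be a sink.
    edges = np.concatenate([d8[0, :], d8[-1, :], d8[:, 0], d8[:, -1]])
    if (edges != 0).any():
        problems.append("some boundary pixels are not sinks (0)")
    return problems


def set_boundary_to_sinks(d8):
    """Return a copy of the array with every boundary pixel set to 0."""
    fixed = np.array(d8, copy=True)
    fixed[0, :] = fixed[-1, :] = 0
    fixed[:, 0] = fixed[:, -1] = 0
    return fixed


# Small illustrative array: the pixel at row 1, column 3 lies on the
# boundary but is not a sink.
example = np.array([
    [0, 0, 0, 0],
    [0, 4, 4, 1],
    [0, 1, 4, 0],
    [0, 0, 0, 0],
])
print(check_d8(example))
print(check_d8(set_boundary_to_sinks(example)))
```

For a real dataset, the array could be obtained with GDAL's Python bindings (for example `gdal.Open(path).ReadAsArray()`). This sketch does not verify that every interior cell eventually drains to a sink, which would also require tracing the flow paths.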

2) A .csv file containing the names, locations and geochemical observations of the sample sites. Sample names (e.g., 'SampleA') are expected in column 1, and the x- and y-coordinates of the sample sites in columns 2 and 3. The coordinates must be in the same reference system as the D8 raster. It is assumed that the sample sites have already been manually aligned onto the drainage network. Each subsequent column is headed by the name of a tracer (e.g., Mg) and contains its concentrations (arbitrary units).
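The column layout described above can be illustrated with a short parser. This is a sketch only: it assumes a comma-delimited file with a single header row, and the header names used here (`Sample.Code`, `x_coordinate`, `y_coordinate`) and the function `read_sample_table` are invented for the example rather than taken from funmixer.

```python
import csv
import io


def read_sample_table(text):
    """Parse sample-site CSV text into (coords, tracers) dictionaries.

    Column 1 is the sample name, columns 2 and 3 are x and y, and every
    remaining column is one tracer.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    if len(header) < 4:
        raise ValueError("expected name, x, y and at least one tracer column")
    tracer_names = header[3:]
    coords = {}
    tracers = {tracer: {} for tracer in tracer_names}
    for row in body:
        if len(row) != len(header):
            raise ValueError(f"row {row!r} does not match the header length")
        name = row[0]
        if name in coords:
            raise ValueError(f"duplicate sample name {name!r}")
        coords[name] = (float(row[1]), float(row[2]))
        for tracer, value in zip(tracer_names, row[3:]):
            tracers[tracer][name] = float(value)
    return coords, tracers


# Hypothetical table with two sites and two tracers.
table = (
    "Sample.Code,x_coordinate,y_coordinate,Mg,Ca\n"
    "SampleA,100.0,200.0,1.5,10.0\n"
    "SampleB,150.0,250.0,3.5,12.0\n"
)
coords, tracers = read_sample_table(table)
print(coords["SampleA"], tracers["Mg"]["SampleB"])
```

A row with a missing value or an extra trailing delimiter fails the length check, which is one way the formatting problems mentioned in this section tend to surface.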

funmixer includes some basic data preprocessing functions that can be used to align the sample sites to the drainage network and fix the boundary conditions of the D8 raster. An example of their use is given in the examples/ directory. Valid example datasets are contained in data/d8.asc and sample_data.dat.

Some common data input problems can be solved by:

  • Checking that there is no trailing whitespace at the end of the sample site data table.
  • Ensuring that the D8 flow-direction raster is in the same reference system as the sample site coordinates.
  • Ensuring that the D8 raster is surrounded by a boundary of sink pixels.
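The trailing whitespace problem can be fixed by hand in a text editor, or with a few lines of Python. The sketch below works on the file contents as a string; `strip_trailing_whitespace` is a hypothetical helper, not a funmixer function.

```python
def strip_trailing_whitespace(text):
    """Remove trailing whitespace from each line and drop trailing blank lines."""
    lines = [line.rstrip() for line in text.splitlines()]
    # Discard empty lines left at the end of the table.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n"


# Two data rows followed by stray spaces, a tab and blank lines.
messy = "SampleA,1.0,2.0 \nSampleB,3.0,4.0\t\n\n  \n"
print(repr(strip_trailing_whitespace(messy)))
```

Note that this also removes trailing tabs, so it is not suitable for a tab-delimited table whose last column may legitimately be empty.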

Installation

The following assumes a UNIX operating system and the conda package manager. conda is preferred as it generally allows for easier installation of the gdal dependency than, for instance, pip. Whilst this package was developed on a UNIX system, the following commands (or similar) should also work on Windows using an Anaconda prompt.

First, clone the repository into a local directory:

git clone https://github.com/AlexLipp/funmixer/ [LOCAL_DIRECTORY]

A conda environment file (requirements.yaml) is provided containing the python dependencies. A conda environment entitled funmixer can be generated from it by running:

conda env create -f requirements.yaml

The environment can then be activated using

conda activate funmixer

Next, install the funmixer python package using:

pip install -e .

This command installs the funmixer python package that can then be imported as normal (e.g., import funmixer).
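A quick way to confirm that the package and its main dependencies are visible to the active environment is to ask Python whether each module can be found, without actually importing it. This is a generic standard-library sketch; `missing_modules` is a hypothetical helper, and the module list is an assumption based on the dependencies named in this repository (the GDAL Python bindings are imported as `osgeo`).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


# An empty list means every module was found in the active environment.
print(missing_modules(["funmixer", "osgeo", "cvxpy", "networkx"]))
```

If `funmixer` appears in the output, the most likely cause is that the conda environment was not activated before running the pip command.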

Problem solving

If you encounter any problems with installation you can contact us or raise an issue on this repository. Based on user feedback, some common problems and solutions are given below:

  • If you're having problems related to permissions, try using sudo before the pip command (e.g., sudo pip install -e .).

Testing

To check that the installation succeeded, you can run the synthetic test script:

python3 tests/synthetic_test.py

This script generates a synthetic dataset and recovers the original input. The results are then plotted.

Unit-tests

Formal unit-tests can be run using:

pytest tests/random_networks_test.py

These tests randomly generate sample networks (full R-ary trees and balanced trees) up to 100 nodes in size, with random source concentrations and sub-basin areas drawn from distributions spanning two orders of magnitude. The tests pass if all the inputted upstream source chemistry is recovered to a relative accuracy of 1%.
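To make the idea behind these tests concrete, the sketch below builds a tiny three-node network, computes the concentration that would be observed at each node, and shows a 1% relative-accuracy check. It is not the repository's test code. It assumes that each sub-basin contributes in proportion to its area (an assumption made here for illustration), and it shows only the forward mixing step, not the convex optimisation that recovers the sources. All function and variable names are hypothetical.

```python
def mix_downstream(downstream, areas, sources):
    """Area-weighted concentration observed at each node of a tree network.

    downstream maps each node to the node it drains into (None at the outlet);
    areas and sources give each node's own sub-basin area and concentration.
    """
    def depth(node):
        steps = 0
        while downstream[node] is not None:
            node = downstream[node]
            steps += 1
        return steps

    total_area = dict(areas)
    total_flux = {node: areas[node] * sources[node] for node in areas}
    # Visit the deepest nodes first so each node is complete before it is
    # added to the node downstream of it.
    for node in sorted(areas, key=depth, reverse=True):
        parent = downstream[node]
        if parent is not None:
            total_area[parent] += total_area[node]
            total_flux[parent] += total_flux[node]
    return {node: total_flux[node] / total_area[node] for node in areas}


def within_relative_tolerance(recovered, truth, rtol=0.01):
    """True if every recovered value is within rtol of the true value."""
    return all(
        abs(recovered[node] - truth[node]) <= rtol * abs(truth[node])
        for node in truth
    )


# Two headwater sub-basins (A and B) draining into an outlet sub-basin (C).
downstream = {"A": "C", "B": "C", "C": None}
areas = {"A": 1.0, "B": 3.0, "C": 4.0}
sources = {"A": 10.0, "B": 2.0, "C": 5.0}

observed = mix_downstream(downstream, areas, sources)
print(observed)  # C is (1*10 + 3*2 + 4*5) / 8 = 4.5
```

In a test of this kind, the observed values would be passed to the unmixing algorithm and the recovered source concentrations compared with `sources` using a check like `within_relative_tolerance`.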

Runtime Benchmark

A timing benchmark can be run using:

python tests/runtime_benchmark.py run

This script benchmarks the runtime of the algorithm for the GUROBI, ECOS and SCS solvers for branching networks up to 500 nodes. This takes ~30 minutes to run on standard laptop hardware. The results are cached to file and can be visualised using:

python tests/runtime_benchmark.py plot

Usage

Some documented example scripts are given in the directory examples/, and are run from the root directory of the repository, e.g.,

python examples/unmix_mwe.py

Cite

If you use this software, please cite the paper, which is published in Water Resources Research.

Barnes, R. and Lipp, A. (2024). Using convex optimization to efficiently apportion tracer and pollutant sources from point concentration observations. Water Resources Research. DOI: 10.1029/2023WR036159.

A .cff citation file is also provided in the repository.

Owner

  • Name: Alex Lipp
  • Login: AlexLipp
  • Kind: user
  • Location: Oxford, UK
  • Company: Merton College, Oxford

Earth & Environmental Scientist at the University of Oxford

Citation (CITATION.cff)

cff-version: 1.0.0
title: funmixer
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Richard
    family-names: Barnes
    email: rijard.barnes@gmail.com
    orcid: 'https://orcid.org/0000-0002-0204-6040'
  - given-names: Alexander
    family-names: Lipp
    email: a.lipp@ucl.ac.uk
    orcid: 'https://orcid.org/0000-0003-2130-8576'
identifiers:
  - type: url
    value: 'https://github.com/AlexLipp/funmixer/'
    description: Code Repository
  - type: doi
    value: 10.1029/2023WR036159
    description: Published manuscript DOI
repository-code: 'https://github.com/AlexLipp/funmixer/'
abstract: >-
  Rivers transport elements, minerals, chemicals, and
  pollutants produced in their upstream basins. A sample
  from a river is a mixture of all of its upstream sources,
  making it challenging to pinpoint the contribution from
  each individual source. Here, we show how a nested sample
  design and convex optimization can be used to efficiently
  unmix downstream samples of a well-mixed, conservative
  tracer into the contributions of their upstream sources.
  Our approach is significantly faster than previous
  methods. We represent the river's sub-catchments, defined
  by sampling sites, using a directed acyclic graph. This
  graph is used to build a convex optimization problem
  which, thanks to its convexity, can be quickly solved to
  global optimality---in under a second on desktop hardware
  for datasets of $\sim$100 samples or fewer. Uncertainties
  in the upstream predictions can be generated using Monte
  Carlo resampling. We provide an open-source implementation
  of this approach in Python. The inputs required are
  straightforward: a table containing sample locations and
  observed tracer concentrations, along with a D8
  flow-direction raster map. As a case study, we use this
  method to map the elemental geochemistry of sediment
  sources for rivers draining the Cairngorms mountains, UK.
  This method could be extended to non-conservative and
  non-steady state tracers. We also show, theoretically, how
  multiple tracers could be simultaneously inverted to
  recover upstream run-off or erosion rates as well as
  source concentrations. Overall, this approach can provide
  valuable insights to researchers in various fields,
  including water quality, geochemical exploration,
  geochemistry, hydrology, and wastewater epidemiology.

GitHub Events

Total
  • Issues event: 2
  • Issue comment event: 1
  • Push event: 7
  • Pull request review event: 2
  • Pull request review comment event: 6
  • Create event: 2
Last Year
  • Issues event: 2
  • Issue comment event: 1
  • Push event: 7
  • Pull request review event: 2
  • Pull request review comment event: 6
  • Create event: 2

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 12
  • Average time to close issues: 1 minute
  • Average time to close pull requests: about 2 hours
  • Total issue authors: 2
  • Total pull request authors: 2
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 4
  • Average time to close issues: 1 minute
  • Average time to close pull requests: 13 minutes
  • Issue authors: 2
  • Pull request authors: 1
  • Average comments per issue: 1.5
  • Average comments per pull request: 0.0
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • simon-m-mudd (1)
  • AlexLipp (1)
Pull Request Authors
  • AlexLipp (11)
  • r-barnes (1)

Dependencies

pyproject.toml pypi
setup.py pypi
  • cvxpy *
  • cython *
  • gdal *
  • hypothesis *
  • matplotlib *
  • networkx *
  • numpy *
  • pandas *
  • pygraphviz *
  • pytest *
  • tqdm *