https://github.com/cgre-aachen/bayseg
An unsupervised machine learning algorithm for the segmentation of spatial data sets.
Science Score: 23.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 2 DOI reference(s) in README
- ✓ Academic publication links: links to researchgate.net, springer.com
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (17.2%) to scientific vocabulary
Statistics
- Stars: 63
- Watchers: 11
- Forks: 15
- Open Issues: 8
- Releases: 0
Metadata Files
README.md
BaySeg
Easy-to-use unsupervised spatial segmentation in Python.
Contents
Introduction
A Python library for unsupervised clustering of n-dimensional datasets, designed for the segmentation of one-, two- and three-dimensional data in the field of geological modeling and geophysics. The library is based on the algorithm developed by Wang et al., 2017 and combines Hidden Markov Random Fields with Gaussian Mixture Models in a Bayesian inference framework. It currently supports up to two physical dimensions and is in an early development stage.
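To give a feel for what combining a Markov random field with Gaussian cluster models means, here is a minimal sketch in plain Python. It is not BaySeg's implementation: it uses iterated conditional modes (ICM), a simple deterministic optimizer, instead of Bayesian inference, and it assumes the Gaussian means and standard deviations are already known rather than estimated. All function and variable names (`icm_segment_1d`, `beta`, and so on) are hypothetical and chosen for illustration only.

```python
import math
import random


def gaussian_neg_log_likelihood(x, mean, std):
    """Negative log-density of x under a univariate Gaussian."""
    return 0.5 * math.log(2.0 * math.pi * std ** 2) + (x - mean) ** 2 / (2.0 * std ** 2)


def count_transitions(labels):
    """Number of positions where the label differs from its predecessor."""
    return sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def icm_segment_1d(values, means, stds, beta=1.0, n_sweeps=10):
    """Label a 1D sequence with iterated conditional modes (ICM).

    The energy of a site is its Gaussian negative log-likelihood (data term)
    plus a penalty `beta` for each neighbour with a different label
    (Potts-style smoothness term).
    """
    n_labels = len(means)
    # Start from the label that best explains each value on its own.
    labels = [
        min(range(n_labels), key=lambda k: gaussian_neg_log_likelihood(x, means[k], stds[k]))
        for x in values
    ]
    for _ in range(n_sweeps):
        changed = False
        for i, x in enumerate(values):
            neighbours = [labels[j] for j in (i - 1, i + 1) if 0 <= j < len(values)]

            def energy(k):
                data_term = gaussian_neg_log_likelihood(x, means[k], stds[k])
                smooth_term = beta * sum(1 for n in neighbours if n != k)
                return data_term + smooth_term

            best = min(range(n_labels), key=energy)
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break  # no label changed during this sweep, so we have converged
    return labels


# Synthetic two-layer sequence with noisy observations.
random.seed(0)
truth = [0] * 30 + [1] * 30
observed = [random.gauss(0.0 if t == 0 else 2.0, 1.0) for t in truth]

# beta=0 ignores the neighbours; beta>0 favours spatially coherent segments.
independent = icm_segment_1d(observed, means=[0.0, 2.0], stds=[1.0, 1.0], beta=0.0)
smoothed = icm_segment_1d(observed, means=[0.0, 2.0], stds=[1.0, 1.0], beta=1.5)
print(count_transitions(independent), count_transitions(smoothed))
```

With `beta=0` every sample is labelled independently, so noise produces many spurious label changes. A positive `beta` penalizes neighbouring samples with different labels, which is the spatial-coherence idea behind the random field part of the model.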
Examples
1D: Segmentation of geophysical well log data

(The well log data shown above are taken from the machine learning contest of Hall, 2016.)
2D: Combined segmentation of geophysical and remote sensing data
You can try out how BaySeg segments 2D data sets in an interactive Jupyter Notebook that runs in your own web browser via Binder.
Installation
As the library is still in early development, the current way to install it is to clone this repository and then import it manually into your projects. We plan to provide convenient installation via PyPI in the future.
Dependencies
BaySeg depends on several excellent components of the Python ecosystem:
- numpy for efficient numerical implementation
- scikit-learn for mixture models
- scipy for its statistical functionality
- matplotlib for plotting
- tqdm for convenient progress meters
Cloning directly from GitHub
First, clone the repository with the following command (or download the zip file manually from the GitHub page):
git clone https://github.com/cgre-aachen/bayseg.git
then append the path to the repository:
import sys
sys.path.append("path/to/cloned/repository/bayseg")
to import the module:
import bayseg
Getting Started
Instantiate the classifier with the n-dimensional array storing the data and the number of labels:
clf = bayseg.BaySeg(data_ndarray, n_labels)
Then use the fit() method to classify your data with your desired number of iterations:
clf.fit(n_iter)
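The two calls above can be put together into a small end-to-end sketch. The synthetic data and the `(n_samples, n_features)` array layout are assumptions made for illustration and should be checked against the library's docstrings. Only the constructor and `fit()` calls shown above are taken from this README. The import is guarded so the script still runs when the cloned repository is not on `sys.path`.

```python
import numpy as np

# Synthetic 1D "well log": 200 depth samples, 2 features, two layers.
# The (n_samples, n_features) layout is an assumption about the expected input.
rng = np.random.default_rng(seed=42)
upper = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(100, 2))
lower = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(100, 2))
data_ndarray = np.vstack([upper, lower])

n_labels = 2   # number of clusters to look for
n_iter = 100   # number of iterations passed to fit()

try:
    import bayseg  # requires the cloned repository on sys.path
except ImportError:
    bayseg = None
    print("bayseg not found; add the cloned repository to sys.path first")

if bayseg is not None:
    # Same two calls as in the Getting Started text above.
    clf = bayseg.BaySeg(data_ndarray, n_labels)
    clf.fit(n_iter)
```

The values of `n_labels` and `n_iter` are placeholders. Suitable values depend on the data set.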
References
- Wang, H., Wellmann, J. F., Li, Z., Wang, X., & Liang, R. Y. (2017). A Segmentation Approach for Stochastic Geological Modeling Using Hidden Markov Random Fields. Mathematical Geosciences, 49(2), 145-177.
- Wang, H., Wellmann, F., Zhang, T., Schaaf, A., Kanig, R. M., Verweij, E., ... & van der Kruk, J. (2019). Pattern Extraction of Topsoil and Subsoil Heterogeneity and Soil‐Crop Interaction Using Unsupervised Bayesian Machine Learning: An Application to Satellite‐Derived NDVI Time Series and Electromagnetic Induction Measurements. Journal of Geophysical Research: Biogeosciences.
- Wang, H. (2020). Finding patterns in subsurface using Bayesian machine learning approach. Underground Space, 5(1), 84-92.
- Herbert, C., Camps, A., Wellmann, F., & Vall‐llossera, M. (2021). Bayesian unsupervised machine learning approach to segment Arctic sea ice using SMOS data. Geophysical Research Letters, 48(6).
- Hall, B. (2016). Facies classification using machine learning. The Leading Edge, 35(10), 906-909.
Contact
The library is based on research by Hui Wang and Florian Wellmann, carried out for a research project in the German Collaborative Research Center SFB TR32. It was rewritten in Python from MATLAB code by Alexander Schaaf.
BaySeg is currently being developed by the LuF Computational Geoscience and Reservoir Engineering (CGRE) and the Aachen Institute for Advanced Study in Computational Engineering Science (AICES) at RWTH Aachen University, Germany.
For more information and contacts, please see: http://www.cgre.rwth-aachen.de/

Owner
- Name: Computational Geoscience and Reservoir Engineering @RWTH Aachen
- Login: cgre-aachen
- Kind: organization
- Email: florian.wellmann@cgre.rwth-aachen.de
- Location: Aachen, Germany
- Website: https://www.cgre.rwth-aachen.de
- Repositories: 36
- Profile: https://github.com/cgre-aachen
We investigate novel methods to integrate geoscientific data and knowledge in process simulations of subsurface flow and transport problems.
GitHub Events
Total
- Watch event: 3
- Fork event: 1
Last Year
- Watch event: 3
- Fork event: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 33
- Total pull requests: 5
- Average time to close issues: about 1 month
- Average time to close pull requests: about 11 hours
- Total issue authors: 3
- Total pull request authors: 2
- Average comments per issue: 0.12
- Average comments per pull request: 0.8
- Merged pull requests: 5
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- alex-schaaf (21)
- derTPK (1)
- hwang051785 (1)
Pull Request Authors
- alex-schaaf (3)
- christophherbert (1)
Dependencies
- matplotlib *
- numpy *
- pandas *
- pytest *
- scikit_learn *
- scipy *
- tqdm *