CoreBreakout
CoreBreakout: Subsurface Core Images to Depth-Registered Datasets - Published in JOSS (2020)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in JOSS metadata
- ✓ Academic publication links: Links to joss.theoj.org
- ✓ Committers with academic emails: 3 of 5 committers (60.0%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: Published in Journal of Open Source Software
Repository
Segmentation and depth-alignment of geological core sample image columns via Mask-RCNN
Basic Info
Statistics
- Stars: 27
- Watchers: 4
- Forks: 16
- Open Issues: 4
- Releases: 2
Metadata Files
README.md
CoreBreakout
Requirements, installation, and contribution guidelines can be found below. Our full usage and API documentation can be found at: corebreakout.readthedocs.io
Overview
corebreakout is a Python package built around matterport/Mask_RCNN for the segmentation and depth-alignment of geological core sample images. It provides utilities and an API to enable the workflow depicted in the figure below, as well as a CoreColumn data structure to manage and manipulate the resulting depth-registered image data:

We are currently using this package to enable research on Lithology Prediction of Slabbed Core Photos Using Machine Learning Models, and are working on getting a DOI for the project through the Journal of Open Source Software.
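To make the idea of "depth-registered" image data concrete, here is a minimal sketch of how each pixel row of a cropped core column could be assigned a depth once its top and base depths are known. The `row_depths` helper and the linear-interpolation assumption are illustrative only; they are not part of the corebreakout API and do not describe how `CoreColumn` is implemented.

```python
from typing import List


def row_depths(top: float, base: float, n_rows: int) -> List[float]:
    """Assign a depth to the centre of each pixel row of a core image.

    Assumption (hypothetical helper, not the corebreakout API): depth
    increases linearly from `top` at the upper edge of the first row
    to `base` at the lower edge of the last row.
    """
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    if base <= top:
        raise ValueError("base must be deeper than top")
    step = (base - top) / n_rows
    # Row i spans [top + i*step, top + (i+1)*step]; report its midpoint.
    return [top + (i + 0.5) * step for i in range(n_rows)]


# A 4-row image covering one depth unit starting at 1000.0
depths = row_depths(top=1000.0, base=1001.0, n_rows=4)
print(depths)
```

Pairing an image array with a per-row depth list like this is what allows columns from different images to be stacked, sliced, or plotted against a common depth axis.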
Getting Started
Target Platform
This package was developed on Linux (Ubuntu, PopOS), and has also been tested on OS X. It may work on other platforms, but we make no guarantees.
Requirements
In addition to Python>=3.6, the packages listed in requirements.txt are required. Notable exceptions to the list are:
- 1.3<=tensorflow-gpu<=1.14 (or possibly just tensorflow)
- mrcnn via submodule: matterport/Mask_RCNN
The TensorFlow requirement is not explicitly listed in requirements.txt due to the ambiguity between tensorflow and tensorflow-gpu in versions <=1.14. The latter is almost certainly required for training new models, although it may be possible to perform inference with saved models on CPU, and use of the CoreColumn data structure does not require a GPU.
Note that TensorFlow GPU capabilities are implemented with CUDA, which requires a supported NVIDIA GPU.
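Because TensorFlow is not pinned in requirements.txt, it can be useful to check the installed version against the bounds stated above before training or running inference. The following is a small sketch; the helper names (`parse_major_minor`, `tf_version_supported`, `installed_tf_version`) are hypothetical and not provided by corebreakout.

```python
import re
from typing import Optional, Tuple

# Bounds taken from the requirement stated above: 1.3 <= tensorflow(-gpu) <= 1.14
MIN_TF = (1, 3)
MAX_TF = (1, 14)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '1.14.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def tf_version_supported(version: str) -> bool:
    """Return True if the version falls within the stated bounds."""
    return MIN_TF <= parse_major_minor(version) <= MAX_TF


def installed_tf_version() -> Optional[str]:
    """Return the installed TensorFlow version, or None if it is absent."""
    try:
        import tensorflow as tf  # imported lazily; may be slow to load
    except ImportError:
        return None
    return tf.__version__


print(tf_version_supported("1.14.0"))
```

In an activated environment, `tf_version_supported(installed_tf_version())` can be called once `installed_tf_version()` is known to return a string. Note that this only checks the version number; it says nothing about whether a GPU is visible to TensorFlow.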
Additional (Optional) Requirements
Optionally, jupyter is required to run demo and test notebooks, and pytest is required to run unit tests. Both of these should be manually installed if you plan to modify or contribute to the package source code.
We also provide a script for extraction of top/base depths from core image text using pytesseract. After installing the Tesseract OCR Engine on your machine, you can install the pytesseract package with conda or pip.
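As a rough illustration of what depth extraction from OCR text involves, the sketch below runs `pytesseract.image_to_string` on an image and searches the result for a top/base pair. This is not the script shipped with the repository: the label format (two numbers separated by a dash or the word "to"), the regular expression, and the function names `parse_depths` and `ocr_depths` are all assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Assumed label format (hypothetical): "<top> - <base>" or "<top> to <base>",
# e.g. "Depth: 2500.0 - 2503.5 m". Real core box labels vary widely.
DEPTH_PATTERN = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:-|to)\s*(\d+(?:\.\d+)?)", re.IGNORECASE
)


def parse_depths(text: str) -> Optional[Tuple[float, float]]:
    """Return (top, base) parsed from text, or None if no valid pair is found."""
    match = DEPTH_PATTERN.search(text)
    if match is None:
        return None
    top, base = float(match.group(1)), float(match.group(2))
    # Reject pairs where the base is not deeper than the top.
    if base <= top:
        return None
    return top, base


def ocr_depths(image_path: str) -> Optional[Tuple[float, float]]:
    """Run Tesseract OCR on an image file and parse a depth pair from it."""
    # Imported here so parse_depths works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    text = pytesseract.image_to_string(Image.open(image_path))
    return parse_depths(text)


print(parse_depths("Depth: 2500.0 - 2503.5 m"))
```

Keeping the text parsing separate from the OCR call makes the parsing logic easy to test on plain strings, which helps when tuning the pattern to a particular label style.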
Download code
$ git clone --recurse-submodules https://github.com/rgmyr/corebreakout.git
$ cd corebreakout
Download data (optional)
To make use of the provided dataset and model, or to train a new model starting from the pretrained COCO weights, you will need to download the assets.zip folder from the v0.2 Release.
Unzip and place this folder in the root directory of the repository (its contents will be ignored by git -- see the .gitignore). If you would like to place it elsewhere, you should modify the paths in corebreakout/defaults.py to point to your preferred location.
The current version of assets/data has JSON annotation files which include an imageData field representing the associated images as strings. For now you can delete this field and reduce the size of the data with scripts/prune_imageData.py:
$ python scripts/prune_imageData.py assets/
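For readers curious what this kind of pruning amounts to, here is a minimal sketch of the idea: walk a directory, load each JSON annotation file, drop the imageData field, and write the file back. It is an assumption-based illustration, not the contents of scripts/prune_imageData.py, and the function name `prune_image_data` is hypothetical.

```python
import json
from pathlib import Path


def prune_image_data(root: str) -> int:
    """Delete the "imageData" field from every JSON file under `root`.

    Returns the number of files that were modified. Sketch only: the
    repository's own script may differ in behavior and options.
    """
    n_modified = 0
    for path in Path(root).rglob("*.json"):
        with path.open() as f:
            annotation = json.load(f)
        # Only touch JSON objects that actually carry the field.
        if isinstance(annotation, dict) and "imageData" in annotation:
            del annotation["imageData"]
            with path.open("w") as f:
                json.dump(annotation, f, indent=2)
            n_modified += 1
    return n_modified
```

Calling `prune_image_data("assets/")` would rewrite the files in place, so it is worth keeping the original assets.zip until the pruned annotations have been checked.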
Installation
We recommend installing corebreakout and its dependencies in an isolated environment, and further recommend the use of conda. See Conda: Managing environments.
To create a new conda environment called corebreakout-env and activate it:
$ conda create -n corebreakout-env python=3.6 tensorflow-gpu=1.14
$ conda activate corebreakout-env
Note: If you want to try a CPU-only installation, then replace tensorflow-gpu with tensorflow. You may also lower the version number if you are on a machine with CUDA<10.0 (required for TensorFlow>=1.13). See TensorFlow GPU requirements for more compatibility details.
Then install the rest of the required packages into the environment:
$ conda install --file requirements.txt
Finally, install mrcnn and corebreakout using pip. Develop mode installation (-e) is recommended (but not required) for corebreakout, since many users will want to change some of the default parameters to suit their own data without having to reinstall afterward:
$ pip install ./Mask_RCNN
$ pip install -e .
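After the two pip commands, a quick way to confirm that both packages are visible to the active environment is to look them up without importing them (importing would also load TensorFlow and Keras). The sketch below assumes the importable module names are `mrcnn` and `corebreakout`, as the package names above suggest; `check_importable` is a hypothetical helper.

```python
import importlib.util
from typing import Dict, Iterable


def check_importable(names: Iterable[str]) -> Dict[str, bool]:
    """Map each top-level module name to whether Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Assumed module names for the two packages installed above.
status = check_importable(["mrcnn", "corebreakout"])
for name, ok in status.items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

A "MISSING" entry usually means the install ran in a different environment than the one currently activated.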
Usage
Please refer to our readthedocs page for full documentation!
Development and Community Guidelines
Submit an Issue
- Navigate to the repository's issue tab
- Search for existing related issues
- If necessary, create and submit a new issue
Contributing
- Please see CONTRIBUTING.md and the Code of Conduct for how to contribute to the project
Testing
- Most corebreakout functionality not requiring trained model weights can be verified with pytest:
$ cd <root_directory>
$ pytest .
- Model usage via the CoreSegmenter class can be verified by running tests/notebooks/test_inference.ipynb (requires saved model weights)
- Plotting of CoreColumns can be verified by running tests/notebooks/test_plotting.ipynb
Owner
- Name: Ross Meyer
- Login: rgmyr
- Kind: user
- Location: Golden, CO
- Company: Colorado School of Mines
- Repositories: 2
- Profile: https://github.com/rgmyr
Research [Data] Scientist @ CSM CoRE
JOSS Publication
CoreBreakout: Subsurface Core Images to Depth-Registered Datasets
Tags
image processing, deep learning, geology, geoscience, subsurface
GitHub Events
Total
- Watch event: 3
- Fork event: 1
Last Year
- Watch event: 3
- Fork event: 1
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| rgmyr | r****r@u****u | 26 |
| Thomas | 3****o | 12 |
| Jesse Pisel | j****l | 2 |
| Daniel S. Katz | d****z@i****g | 2 |
| Katherine Barnhart | k****t@u****v | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 13
- Total pull requests: 9
- Average time to close issues: about 2 months
- Average time to close pull requests: about 20 hours
- Total issue authors: 9
- Total pull request authors: 6
- Average comments per issue: 2.54
- Average comments per pull request: 0.11
- Merged pull requests: 8
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- rgmyr (2)
- jessepisel (2)
- MonkeyLever (2)
- JesperDramsch (2)
- atwahsz (1)
- LukasMosser (1)
- metazool (1)
- brendonhall (1)
- mobiuscreek (1)
Pull Request Authors
- ThomasMGeo (3)
- jessepisel (2)
- kbarnhart (1)
- arfon (1)
- danielskatz (1)
- zanejobe (1)
Dependencies
- IPython *
- Pillow *
- cython *
- dill *
- h5py *
- jupyter *
- keras >=2.0.8,<=2.2.5
- matplotlib *
- numpy <=1.16.4
- opencv-python *
- pandas *
- scikit-image *
- scipy *
- tensorflow <=1.14
- IPython *
- Pillow *
- cython *
- dill *
- h5py *
- jupyter *
- keras >=2.0.8,<=2.2.5
- matplotlib *
- numpy <=1.16.4
- opencv-python *
- pandas *
- scikit-image *
- scipy *
- IPython *
- Pillow *
- cython *
- dill *
- h5py *
- imgaug *
- keras >=2.0.8,<=2.2.5
- matplotlib *
- numpy <=1.16.4
- opencv-python *
- scikit-image *
- scipy *
