CoreBreakout

CoreBreakout: Subsurface Core Images to Depth-Registered Datasets - Published in JOSS (2020)

https://github.com/rgmyr/corebreakout

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 1 DOI reference(s) in JOSS metadata
✓ Academic publication links: Links to joss.theoj.org
✓ Committers with academic emails: 3 of 5 committers (60.0%) from academic institutions
○ Institutional organization owner
✓ JOSS paper metadata: Published in Journal of Open Source Software

Keywords from Contributors

particles

Scientific Fields

Mathematics Computer Science - 84% confidence

Artificial Intelligence and Machine Learning Computer Science - 71% confidence

Last synced: 6 months ago

Repository

Segmentation and depth-alignment of geological core sample image columns via Mask-RCNN

Basic Info

Host: GitHub
Owner: rgmyr
License: other
Language: Python
Default Branch: master
Homepage:
Size: 206 MB

Statistics

Stars: 27
Watchers: 4
Forks: 16
Open Issues: 4
Releases: 2

Created about 7 years ago · Last pushed over 5 years ago

Metadata Files

Readme, Changelog, Contributing, License, Code of conduct

CoreBreakout

Requirements, installation, and contribution guidelines can be found below. Our full usage and API documentation can be found at: corebreakout.readthedocs.io

Overview

corebreakout is a Python package built around matterport/Mask_RCNN for the segmentation and depth-alignment of geological core sample images. It provides utilities and an API to enable the workflow depicted in the figure below, as well as a CoreColumn data structure to manage and manipulate the resulting depth-registered image data:

We are currently using this package to enable research on Lithology Prediction of Slabbed Core Photos Using Machine Learning Models. The project has since been published in the Journal of Open Source Software (DOI: 10.21105/joss.01969).
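To make the idea of "depth-registered" image data concrete, here is a minimal sketch in plain Python. It is not the corebreakout CoreColumn API: the class name `DepthRegisteredColumn` and its methods are hypothetical, and the sketch assumes depth varies linearly from the top to the base of a column, with one depth value per image row.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DepthRegisteredColumn:
    """Hypothetical stand-in for a depth-registered image column.

    NOT the corebreakout CoreColumn API; it only illustrates pairing
    each image row with a depth value.
    """
    rows: List[List[int]]  # image rows, e.g. grayscale pixel values
    top: float             # depth at the top edge of the column
    base: float            # depth at the base edge of the column

    def depths(self) -> List[float]:
        """Depth at the center of each row, by linear interpolation."""
        step = (self.base - self.top) / len(self.rows)
        return [self.top + (i + 0.5) * step for i in range(len(self.rows))]

    def slice_depth(self, start: float, stop: float) -> "DepthRegisteredColumn":
        """Return the sub-column whose row-center depths lie in [start, stop)."""
        step = (self.base - self.top) / len(self.rows)
        keep = [i for i, d in enumerate(self.depths()) if start <= d < stop]
        if not keep:
            raise ValueError("no rows in the requested depth range")
        first, last = keep[0], keep[-1]
        return DepthRegisteredColumn(
            rows=self.rows[first:last + 1],
            top=self.top + first * step,
            base=self.top + (last + 1) * step,
        )


# Four single-pixel rows spanning depths 100.0 to 102.0
col = DepthRegisteredColumn(rows=[[0], [1], [2], [3]], top=100.0, base=102.0)
sub = col.slice_depth(100.5, 101.5)
```

Storing only `top` and `base` and deriving the per-row depths keeps the sketch short; a real structure may well store an explicit depth array instead, so that non-uniform sampling can be represented.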

Getting Started

Target Platform

This package was developed on Linux (Ubuntu, PopOS), and has also been tested on OS X. It may work on other platforms, but we make no guarantees.

Requirements

In addition to Python>=3.6, the packages listed in requirements.txt are required. Notable exceptions to the list are:

1.3<=tensorflow-gpu<=1.14 (or possibly just tensorflow)
mrcnn via submodule: matterport/Mask_RCNN

The TensorFlow requirement is not explicitly listed in requirements.txt due to the ambiguity between tensorflow and tensorflow-gpu in versions <=1.14. The latter is almost certainly required for training new models, although it may be possible to perform inference with saved models on CPU, and use of the CoreColumn data structure does not require a GPU.

Note that TensorFlow GPU capabilities are implemented with CUDA, which requires a supported NVIDIA GPU.
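As a quick sanity check of the version range above, the following sketch tests whether a TensorFlow version string falls within 1.3 to 1.14. The helper names are hypothetical, and the sketch assumes that any 1.14.x patch release counts as satisfying "<=1.14".

```python
def tf_version_supported(version: str) -> bool:
    """True if major.minor of `version` lies in the range 1.3 to 1.14.

    Assumption: patch releases such as 1.14.0 count as within "<=1.14".
    """
    major, minor = (int(part) for part in version.split(".")[:2])
    return (1, 3) <= (major, minor) <= (1, 14)


def describe_tensorflow() -> str:
    """Report whether the installed TensorFlow (if any) is in range."""
    try:
        import tensorflow as tf
    except ImportError:
        return "TensorFlow is not installed"
    status = "supported" if tf_version_supported(tf.__version__) else "outside 1.3 to 1.14"
    return f"TensorFlow {tf.__version__}: {status}"


if __name__ == "__main__":
    print(describe_tensorflow())
```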

Additional (Optional) Requirements

Optionally, jupyter is required to run demo and test notebooks, and pytest is required to run unit tests. Both of these should be manually installed if you plan to modify or contribute to the package source code.

We also provide a script for extraction of top/base depths from core image text using pytesseract. After installing the Tesseract OCR Engine on your machine, you can install the pytesseract package with conda or pip.
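For orientation, the sketch below shows one way top/base depths could be pulled out of OCR text once it has been read from an image, for example with `pytesseract.image_to_string`. It is not the provided script: the function `parse_top_base` is hypothetical, and it assumes the first two numbers in the text are the top and base depths, in that order.

```python
import re
from typing import Optional, Tuple

# Matches integers or decimals such as "8450" or "8453.5"
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_top_base(ocr_text: str) -> Optional[Tuple[float, float]]:
    """Interpret the first two numbers in `ocr_text` as (top, base).

    Hypothetical helper: returns None when fewer than two numbers are
    found, or when the base is not deeper than the top.
    """
    values = [float(match) for match in NUMBER.findall(ocr_text)]
    if len(values) < 2:
        return None
    top, base = values[0], values[1]
    if base <= top:
        return None
    return top, base


# In practice the text would come from OCR, e.g.:
#   ocr_text = pytesseract.image_to_string(image)
print(parse_top_base("Depth 8450.0 - 8453.5 ft"))
```

Real core box labels often carry other numbers (box or well identifiers), so a production parser would need label-specific rules; this sketch would misread such text.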

Download code

$ git clone --recurse-submodules https://github.com/rgmyr/corebreakout.git
$ cd corebreakout

Download data (optional)

To make use of the provided dataset and model, or to train a new model starting from the pretrained COCO weights, you will need to download the assets.zip folder from the v0.2 Release.

Unzip and place this folder in the root directory of the repository (its contents will be ignored by git -- see the .gitignore). If you would like to place it elsewhere, you should modify the paths in corebreakout/defaults.py to point to your preferred location.

The current version of assets/data has JSON annotation files which include an imageData field representing the associated images as strings. For now you can delete this field and reduce the size of the data with scripts/prune_imageData.py:

$ python scripts/prune_imageData.py assets/
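The sketch below illustrates what pruning the imageData field amounts to. It is not the contents of scripts/prune_imageData.py: the function `prune_image_data` is hypothetical, and it assumes each annotation file is a JSON object with imageData as a top-level key.

```python
import json
from pathlib import Path


def prune_image_data(root: str) -> int:
    """Delete the top-level 'imageData' key from every *.json under `root`.

    Hypothetical sketch, not the repository script. Returns the number
    of files that were rewritten.
    """
    changed = 0
    for path in Path(root).rglob("*.json"):
        with open(path) as f:
            annotation = json.load(f)
        if isinstance(annotation, dict) and "imageData" in annotation:
            del annotation["imageData"]
            with open(path, "w") as f:
                json.dump(annotation, f, indent=2)
            changed += 1
    return changed
```

Calling `prune_image_data("assets/")` would rewrite the affected files in place, so it is worth keeping a copy of the original archive.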

Installation

We recommend installing corebreakout and its dependencies in an isolated environment, and further recommend the use of conda. See Conda: Managing environments.

To create a new conda environment called corebreakout-env and activate it:

$ conda create -n corebreakout-env python=3.6 tensorflow-gpu=1.14
$ conda activate corebreakout-env

Note: If you want to try a CPU-only installation, replace tensorflow-gpu with tensorflow. You may also need to lower the version number if your machine has CUDA<10.0, since CUDA 10.0 is required for TensorFlow>=1.13. See TensorFlow GPU requirements for more compatibility details.

Then install the rest of the required packages into the environment:

$ conda install --file requirements.txt

Finally, install mrcnn and corebreakout using pip. Develop mode installation (-e) is recommended (but not required) for corebreakout, since many users will want to change some of the default parameters to suit their own data without having to reinstall afterward:

$ pip install ./Mask_RCNN
$ pip install -e .

Usage

Please refer to our readthedocs page for full documentation!

Development and Community Guidelines

Submit an Issue

Navigate to the repository's issue tab
Search for existing related issues
If necessary, create and submit a new issue

Contributing

Please see CONTRIBUTING.md and the Code of Conduct for how to contribute to the project.

Testing

Most corebreakout functionality not requiring trained model weights can be verified with pytest:

$ cd <root_directory>
$ pytest .

Model usage via the CoreSegmenter class can be verified by running tests/notebooks/test_inference.ipynb (requires saved model weights).
Plotting of CoreColumns can be verified by running tests/notebooks/test_plotting.ipynb.

Owner

Name: Ross Meyer
Login: rgmyr
Kind: user
Location: Golden, CO
Company: Colorado School of Mines

Repositories: 2
Profile: https://github.com/rgmyr

Research [Data] Scientist @ CSM CoRE

JOSS Publication

CoreBreakout: Subsurface Core Images to Depth-Registered Datasets

Published

June 19, 2020

DOI

10.21105/joss.01969

Volume 5, Issue 50, Page 1969

Authors

Ross G. Meyer

Department of Geology and Geological Engineering, Colorado School of Mines

Thomas P. Martin

Department of Geology and Geological Engineering, Colorado School of Mines

Zane R. Jobe

Department of Geology and Geological Engineering, Colorado School of Mines

Editor

Katy Barnhart

GitHub Events

Total

Watch event: 3
Fork event: 1

Last Year

Watch event: 3
Fork event: 1

Committers

Last synced: 7 months ago

All Time

Total Commits: 43
Total Committers: 5
Avg Commits per committer: 8.6
Development Distribution Score (DDS): 0.395

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
rgmyr	r**r@u**u	26
Thomas	3****o	12
Jesse Pisel	j****l	2
Daniel S. Katz	d**z@i**g	2
Katherine Barnhart	k**t@u**v	1
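The all-time committer statistics can be reproduced from the table above. The sketch assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it reproduces the reported value.

```python
# Commit counts copied from the Top Committers table
commits = {
    "rgmyr": 26,
    "Thomas": 12,
    "Jesse Pisel": 2,
    "Daniel S. Katz": 2,
    "Katherine Barnhart": 1,
}

total = sum(commits.values())               # 43 commits
average = total / len(commits)              # commits per committer
# Assumed definition: DDS = 1 - (top committer's commits / total commits)
dds = 1 - max(commits.values()) / total

print(round(average, 1), round(dds, 3))  # prints: 8.6 0.395
```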

Committer Domains (Top 20 + Academic)

usgs.gov: 1
ieee.org: 1
utexas.edu: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 13
Total pull requests: 9
Average time to close issues: about 2 months
Average time to close pull requests: about 20 hours
Total issue authors: 9
Total pull request authors: 6
Average comments per issue: 2.54
Average comments per pull request: 0.11
Merged pull requests: 8
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

rgmyr (2)
jessepisel (2)
MonkeyLever (2)
JesperDramsch (2)
atwahsz (1)
LukasMosser (1)
metazool (1)
brendonhall (1)
mobiuscreek (1)

Pull Request Authors

ThomasMGeo (3)
jessepisel (2)
kbarnhart (1)
arfon (1)
danielskatz (1)
zanejobe (1)

Top Labels

Issue Labels

enhancement (1)

Pull Request Labels

Dependencies

docs/rtd_requirements.txt pypi

IPython *
Pillow *
cython *
dill *
h5py *
jupyter *
keras >=2.0.8,<=2.2.5
matplotlib *
numpy <=1.16.4
opencv-python *
pandas *
scikit-image *
scipy *
tensorflow <=1.14

requirements.txt pypi

IPython *
Pillow *
cython *
dill *
h5py *
jupyter *
keras >=2.0.8,<=2.2.5
matplotlib *
numpy <=1.16.4
opencv-python *
pandas *
scikit-image *
scipy *

setup.py pypi

IPython *
Pillow *
cython *
dill *
h5py *
imgaug *
keras >=2.0.8,<=2.2.5
matplotlib *
numpy <=1.16.4
opencv-python *
scikit-image *
scipy *
