CoreBreakout

CoreBreakout: Subsurface Core Images to Depth-Registered Datasets - Published in JOSS (2020)

https://github.com/rgmyr/corebreakout

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    3 of 5 committers (60.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords from Contributors

particles

Scientific Fields

Mathematics Computer Science - 84% confidence
Artificial Intelligence and Machine Learning Computer Science - 71% confidence
Last synced: 4 months ago

Repository

Segmentation and depth-alignment of geological core sample image columns via Mask-RCNN

Basic Info
  • Host: GitHub
  • Owner: rgmyr
  • License: other
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 206 MB
Statistics
  • Stars: 27
  • Watchers: 4
  • Forks: 16
  • Open Issues: 4
  • Releases: 2
Created almost 7 years ago · Last pushed over 5 years ago
Metadata Files
Readme Changelog Contributing License Code of conduct

README.md

CoreBreakout


Requirements, installation, and contribution guidelines can be found below. Our full usage and API documentation can be found at: corebreakout.readthedocs.io

Overview

corebreakout is a Python package built around matterport/Mask_RCNN for the segmentation and depth-alignment of geological core sample images. It provides utilities and an API to enable the workflow depicted in the figure below, as well as a CoreColumn data structure to manage and manipulate the resulting depth-registered image data:

We are currently using this package to enable research on Lithology Prediction of Slabbed Core Photos Using Machine Learning Models, and are working on getting a DOI for the project through the Journal of Open Source Software.
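As a rough illustration of what "depth-registered" means here, the sketch below assigns a depth to every pixel row of a segmented core column and stacks several columns into one continuous record. It assumes depth varies linearly between a column's top and base, which is a simplification; `row_depths` and `stack_columns` are hypothetical helpers, not part of the corebreakout API.

```python
def row_depths(top, base, n_rows):
    """Return the depth at the center of each pixel row, assuming depth
    increases linearly from `top` (first row) to `base` (last row)."""
    if n_rows <= 0:
        raise ValueError("n_rows must be positive")
    step = (base - top) / n_rows
    return [top + (i + 0.5) * step for i in range(n_rows)]


def stack_columns(columns):
    """Concatenate (top, base, n_rows) segments into one depth list,
    ordered from shallowest to deepest."""
    depths = []
    for top, base, n_rows in sorted(columns):
        depths.extend(row_depths(top, base, n_rows))
    return depths


# Two hypothetical 2 m columns, each imaged at 4 pixel rows, given out of order.
depths = stack_columns([(102.0, 104.0, 4), (100.0, 102.0, 4)])
print(depths)
```

In the package itself, the CoreColumn data structure handles this bookkeeping together with the image data; see the documentation for its actual interface.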

Getting Started

Target Platform

This package was developed on Linux (Ubuntu, PopOS), and has also been tested on OS X. It may work on other platforms, but we make no guarantees.

Requirements

In addition to Python>=3.6, the packages listed in requirements.txt are required. Notable exceptions to the list are:

The TensorFlow requirement is not explicitly listed in requirements.txt due to the ambiguity between tensorflow and tensorflow-gpu in versions <=1.14. The latter is almost certainly required for training new models, although it may be possible to perform inference with saved models on CPU, and use of the CoreColumn data structure does not require a GPU.

Note that TensorFlow GPU capabilities are implemented with CUDA, which requires a supported NVIDIA GPU.
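Before training, it can help to confirm which TensorFlow build is installed and whether it can see a GPU. The following is a minimal sketch rather than part of corebreakout: `tensorflow_status` is a hypothetical helper, and it falls back to `tf.test.is_gpu_available()` on 1.x releases that lack `tf.config.list_physical_devices`.

```python
import importlib


def tensorflow_status():
    """Report whether TensorFlow is importable and whether it can see a GPU."""
    try:
        tf = importlib.import_module("tensorflow")
    except ImportError:
        return {"installed": False, "version": None, "gpu": False}

    # Newer releases expose tf.config.list_physical_devices; older 1.x
    # releases (including 1.14) provide tf.test.is_gpu_available instead.
    list_devices = getattr(getattr(tf, "config", None), "list_physical_devices", None)
    if list_devices is not None:
        gpu = len(list_devices("GPU")) > 0
    else:
        gpu = bool(tf.test.is_gpu_available())
    return {"installed": True, "version": tf.__version__, "gpu": gpu}


print(tensorflow_status())
```

If `gpu` is reported as False with tensorflow-gpu installed, the CUDA and driver versions are the first thing to check.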

Additional (Optional) Requirements

Optionally, jupyter is required to run demo and test notebooks, and pytest is required to run unit tests. Both of these should be manually installed if you plan to modify or contribute to the package source code.

We also provide a script for extraction of top/base depths from core image text using pytesseract. After installing the Tesseract OCR Engine on your machine, you can install the pytesseract package with conda or pip.
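The general idea of OCR-based depth extraction can be sketched as follows. This is not the script shipped with the package: the "Top:" / "Base:" label format, `parse_top_base`, and `read_depths` are assumptions made for illustration, and real core photos will usually need a pattern tuned to their own labeling.

```python
import re

# Assumed label format: the word "top" or "base", up to 10 non-digit
# characters (colon, spaces), then a depth value.
LABELED_DEPTH = re.compile(r"(top|base)\D{0,10}(\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_top_base(text):
    """Return (top, base) depths parsed from OCR text."""
    found = {label.lower(): float(value)
             for label, value in LABELED_DEPTH.findall(text)}
    if "top" not in found or "base" not in found:
        raise ValueError("could not find both top and base depths in %r" % text)
    return found["top"], found["base"]


def read_depths(image_path):
    """OCR an image with pytesseract, then parse its top/base depths."""
    import pytesseract  # needs the Tesseract OCR engine installed
    from PIL import Image
    return parse_top_base(pytesseract.image_to_string(Image.open(image_path)))


print(parse_top_base("Core 3  Top: 2502.50 m  Base: 2505.00 m"))
```

Keeping the parsing separate from the OCR call makes the pattern easy to test on plain strings before running Tesseract on real images.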

Download code

$ git clone --recurse-submodules https://github.com/rgmyr/corebreakout.git
$ cd corebreakout

Download data (optional)

To make use of the provided dataset and model, or to train a new model starting from the pretrained COCO weights, you will need to download the assets.zip folder from the v0.2 Release.

Unzip and place this folder in the root directory of the repository (its contents will be ignored by git -- see the .gitignore). If you would like to place it elsewhere, you should modify the paths in corebreakout/defaults.py to point to your preferred location.

The current version of assets/data has JSON annotation files which include an imageData field representing the associated images as strings. For now you can delete this field and reduce the size of the data with scripts/prune_imageData.py:

$ python scripts/prune_imageData.py assets/
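For readers curious about what such pruning involves, here is a minimal sketch of the idea. It is not the contents of scripts/prune_imageData.py; `prune_image_data` is a hypothetical function that assumes each annotation file is a JSON object with an optional top-level imageData key.

```python
import json
from pathlib import Path


def prune_image_data(root):
    """Remove the 'imageData' key from every JSON file under `root`.

    Returns the number of files that were rewritten.
    """
    pruned = 0
    for path in sorted(Path(root).rglob("*.json")):
        with path.open() as f:
            annotation = json.load(f)
        # Only rewrite files that actually carry the embedded image string.
        if isinstance(annotation, dict) and "imageData" in annotation:
            del annotation["imageData"]
            with path.open("w") as f:
                json.dump(annotation, f, indent=2)
            pruned += 1
    return pruned
```

Calling `prune_image_data("assets/")` would rewrite the annotation files in place, so it is worth keeping a copy of the original archive.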

Installation

We recommend installing corebreakout and its dependencies in an isolated environment, and further recommend the use of conda. See Conda: Managing environments.


To create a new conda environment called corebreakout-env and activate it:

$ conda create -n corebreakout-env python=3.6 tensorflow-gpu=1.14
$ conda activate corebreakout-env

Note: If you want to try a CPU-only installation, then replace tensorflow-gpu with tensorflow. You may also lower the version number if you are on a machine with CUDA<10.0 (required for TensorFlow>=1.13). See TensorFlow GPU requirements for more compatibility details.


Then install the rest of the required packages into the environment:

$ conda install --file requirements.txt


Finally, install mrcnn and corebreakout using pip. Develop mode installation (-e) is recommended (but not required) for corebreakout, since many users will want to change some of the default parameters to suit their own data without having to reinstall afterward:

$ pip install ./Mask_RCNN
$ pip install -e .

Usage

Please refer to our readthedocs page for full documentation!

Development and Community Guidelines

Submit an Issue

  • Navigate to the repository's issue tab
  • Search for existing related issues
  • If necessary, create and submit a new issue

Contributing

Testing

  • Most corebreakout functionality not requiring trained model weights can be verified with pytest:

$ cd <root_directory>
$ pytest .

  • Model usage via the CoreSegmenter class can be verified by running tests/notebooks/test_inference.ipynb (requires saved model weights)
  • Plotting of CoreColumns can be verified by running tests/notebooks/test_plotting.ipynb

Owner

  • Name: Ross Meyer
  • Login: rgmyr
  • Kind: user
  • Location: Golden, CO
  • Company: Colorado School of Mines

Research [Data] Scientist @ CSM CoRE

JOSS Publication

CoreBreakout: Subsurface Core Images to Depth-Registered Datasets
Published
June 19, 2020
Volume 5, Issue 50, Page 1969
Authors
Ross G. Meyer ORCID
Department of Geology and Geological Engineering, Colorado School of Mines
Thomas P. Martin ORCID
Department of Geology and Geological Engineering, Colorado School of Mines
Zane R. Jobe ORCID
Department of Geology and Geological Engineering, Colorado School of Mines
Editor
Katy Barnhart ORCID
Tags
image processing deep learning geology geoscience subsurface

GitHub Events

Total
  • Watch event: 3
  • Fork event: 1
Last Year
  • Watch event: 3
  • Fork event: 1

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 43
  • Total Committers: 5
  • Avg Commits per committer: 8.6
  • Development Distribution Score (DDS): 0.395
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
rgmyr r****r@u****u 26
Thomas 3****o 12
Jesse Pisel j****l 2
Daniel S. Katz d****z@i****g 2
Katherine Barnhart k****t@u****v 1
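The summary figures above can be reproduced from the commit counts in the table. The sketch below assumes the Development Distribution Score is defined as one minus the top committer's share of all commits; that definition is not stated on this page, but it matches the reported value.

```python
# Commit counts copied from the Top Committers table.
commits = {
    "rgmyr": 26,
    "Thomas": 12,
    "Jesse Pisel": 2,
    "Daniel S. Katz": 2,
    "Katherine Barnhart": 1,
}

total = sum(commits.values())
average = total / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

print(total, average, round(dds, 3))
```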

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 13
  • Total pull requests: 9
  • Average time to close issues: about 2 months
  • Average time to close pull requests: about 20 hours
  • Total issue authors: 9
  • Total pull request authors: 6
  • Average comments per issue: 2.54
  • Average comments per pull request: 0.11
  • Merged pull requests: 8
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rgmyr (2)
  • jessepisel (2)
  • MonkeyLever (2)
  • JesperDramsch (2)
  • atwahsz (1)
  • LukasMosser (1)
  • metazool (1)
  • brendonhall (1)
  • mobiuscreek (1)
Pull Request Authors
  • ThomasMGeo (3)
  • jessepisel (2)
  • kbarnhart (1)
  • arfon (1)
  • danielskatz (1)
  • zanejobe (1)
Top Labels
Issue Labels
enhancement (1)
Pull Request Labels

Dependencies

docs/rtd_requirements.txt pypi
  • IPython *
  • Pillow *
  • cython *
  • dill *
  • h5py *
  • jupyter *
  • keras >=2.0.8,<=2.2.5
  • matplotlib *
  • numpy <=1.16.4
  • opencv-python *
  • pandas *
  • scikit-image *
  • scipy *
  • tensorflow <=1.14
requirements.txt pypi
  • IPython *
  • Pillow *
  • cython *
  • dill *
  • h5py *
  • jupyter *
  • keras >=2.0.8,<=2.2.5
  • matplotlib *
  • numpy <=1.16.4
  • opencv-python *
  • pandas *
  • scikit-image *
  • scipy *
setup.py pypi
  • IPython *
  • Pillow *
  • cython *
  • dill *
  • h5py *
  • imgaug *
  • keras >=2.0.8,<=2.2.5
  • matplotlib *
  • numpy <=1.16.4
  • opencv-python *
  • scikit-image *
  • scipy *