crackmnist

CrackMNIST - A Large-Scale Dataset for Crack Tip Detection in Digital Image Correlation Data

https://github.com/dlr-wf/crackmnist

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 25 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (8.0%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: dlr-wf
License: mit
Language: Python
Default Branch: master
Size: 1.41 MB

Statistics

Stars: 3
Watchers: 0
Forks: 2
Open Issues: 0
Releases: 1

Created over 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

CrackMNIST - A Large-Scale Dataset for Crack Tip Detection in Digital Image Correlation Data

Introduction

Fatigue crack growth (FCG) experiments play a crucial role in materials science and engineering, particularly for the safe design of structures and components. However, conventional FCG experiments are both time-consuming and costly, relying primarily on integral measurement techniques such as the potential drop method to determine crack length.

Digital Image Correlation (DIC) is a non-contact optical technique that enables full-field displacement measurements during experiments. Accurately identifying crack tip positions from DIC data is essential but challenging due to inherent noise and artifacts.

Recently, a deep learning-based approach was introduced to automatically detect crack tip positions [1,2]. This method involved manually annotating a single experiment to train a convolutional neural network (CNN). Furthermore, an iterative crack tip correction technique was later developed to enhance detection accuracy [3]. However, this method is not fully automated and requires more time than applying a pre-trained CNN. With the rise of self-driven laboratories generating vast amounts of DIC data [4,5], reliable crack tip detection is essential for efficient and rapid data evaluation.

References:

1. Strohmann T et al. (2021) Automatic detection of fatigue crack paths using digital image correlation and convolutional neural networks. Fatigue and Fracture of Engineering Materials and Structures 44, 1336-1348 https://doi.org/10.1111/ffe.13433
2. Melching D et al. (2022) Explainable machine learning for precise fatigue crack tip detection. Scientific Reports 12, 9513 https://doi.org/10.1038/s41598-022-13275-1
3. Melching D et al. (2024) An iterative crack tip correction algorithm discovered by physical deep symbolic regression. International Journal of Fatigue 187, 108432 https://doi.org/10.1016/j.ijfatigue.2024.108432
4. Paysan F et al. (2023) A Robot-Assisted Microscopy System for Digital Image Correlation in Fatigue Crack Growth Testing. Experimental Mechanics 63, 975-986 https://doi.org/10.1007/s11340-023-00964-9
5. Strohmann T et al. (2024) Next generation fatigue crack growth experiments of aerospace materials. Scientific Reports 14, 14075 https://doi.org/10.1038/s41598-024-63915-x

Objective

The objective of this project is to create a diverse, large-scale, and standardized dataset for training and evaluating deep learning-based crack tip detection methods. In addition to supporting research and practical applications, the dataset is intended to serve an educational purpose by providing a high-quality resource for students and researchers in materials science and mechanics.

DIC data

The dataset contains DIC data in the form of planar displacement fields ($u_x, u_y$), both measured in mm, from eight FCG experiments performed on different materials and specimen geometries. The tested materials (AA2024, AA7475, and AA7010) are aluminum alloys with an average Young's modulus (E) of approximately 70 GPa and a Poisson’s ratio (ν) of 0.33. For details, please refer to the corresponding data sheets.

The maximum nominal uniform stress applied to the MT specimens is σ_N = 47 MPa (sinusoidal loading, constant amplitude). The minimum load can be derived from the load ratio R = F_min/F_max. The expected stress intensity factors K_I vary approximately between 1 and 40 MPa√m.

| Experiment | Material | Specimen Type | Thickness [mm] | Orientation | R |
|------------------|:------------------:|:-------------:|:--------------:|:-----------:|:---:|
| MT1602024LT1 | AA2024^r | MT160 | 2 | LT | 0.1 |
| MT1602024LT2 | AA2024^r | MT160 | 2 | LT | 0.3 |
| MT1602024LT3 | AA2024^r | MT160 | 2 | LT | 0.5 |
| MT1602024TL1 | AA2024^r | MT160 | 2 | TL | 0.1 |
| MT1602024TL2 | AA2024^r | MT160 | 2 | TL | 0.3 |
| MT1607475LT1 | AA7475^r | MT160 | 4 | LT | 0.1 |
| MT1607475TL1 | AA7475^r | MT160 | 4 | TL | 0.3 |
| CT757010SL451 | AA7010^f | CT75 | 12 | SL45° | 0.1 |

^r Rolled material

^f Forged material
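As a small worked example of the load ratio relation, the sketch below derives the minimum nominal stress and the stress range for the R values listed in the table. It assumes that the nominal stress scales linearly with the applied force, so that R = F_min/F_max also equals σ_min/σ_max, and it uses the 47 MPa value stated for the MT specimens only. The function name `minimum_stress` is illustrative and not part of the package.

```python
def minimum_stress(sigma_max_mpa: float, load_ratio: float) -> float:
    """Return the minimum nominal stress in MPa for a load ratio R.

    Assumes stress is proportional to force, so R = F_min / F_max
    is also sigma_min / sigma_max.
    """
    if not 0.0 <= load_ratio < 1.0:
        raise ValueError("this sketch only covers load ratios with 0 <= R < 1")
    return load_ratio * sigma_max_mpa


# Maximum nominal stress stated for the MT specimens.
SIGMA_MAX_MPA = 47.0

# Load ratios that appear in the experiment table.
for r in (0.1, 0.3, 0.5):
    sigma_min = minimum_stress(SIGMA_MAX_MPA, r)
    stress_range = SIGMA_MAX_MPA - sigma_min
    print(f"R = {r}: sigma_min = {sigma_min:.1f} MPa, range = {stress_range:.1f} MPa")
```

A higher load ratio at the same maximum stress gives a smaller stress range per cycle.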

Data annotation

Crack tip positions in the DIC data are annotated with the high-fidelity crack tip correction method from [3].

Crack tip annotation

The crack tip positions are stored as binary segmentation masks such that the labelled datasets can directly be used for training semantic segmentation models.
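Going the other way, from a binary mask back to a crack tip position, can be sketched as follows. This is an illustration under assumptions, not the package's own routine: it assumes one crack tip per mask and takes the centroid of all nonzero pixels, which works whether the tip is marked by a single pixel or by a small region. The helper name `crack_tip_from_mask` is hypothetical.

```python
def crack_tip_from_mask(mask):
    """Return the (row, col) centroid of all nonzero pixels in a 2D mask.

    Returns None if the mask contains no labelled pixel.
    Accepts any 2D sequence of numbers, e.g. nested lists.
    """
    row_sum = col_sum = count = 0
    for i, row in enumerate(mask):
        for j, value in enumerate(row):
            if value:
                row_sum += i
                col_sum += j
                count += 1
    if count == 0:
        return None
    return row_sum / count, col_sum / count


# Toy 5 x 5 mask with a single labelled pixel at row 2, column 3.
EXAMPLE_MASK = [[0] * 5 for _ in range(5)]
EXAMPLE_MASK[2][3] = 1

print(crack_tip_from_mask(EXAMPLE_MASK))
```

For masks held as NumPy arrays, `np.argwhere(mask).mean(axis=0)` gives the same centroid for a non-empty mask.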

Labelled datasets

We provide three datasets of different sizes ("S", "M", "L"). The datasets are split into training, validation, and test sets. The following table shows the number of samples in each dataset.

| Dataset | Training | Validation | Test |
|---------|----------|------------|-------|
| S | 10048 | 5944 | 5944 |
| M | 21672 | 11736 | 11672 |
| L | 42088 | 11736 | 16560 |
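To compare the three sizes at a glance, the sample counts from the table can be turned into totals and split fractions. The sketch below only restates the table. The dictionary keys `"val"` and `"test"` are labels chosen for this illustration and are not confirmed as valid values of the package's `split` parameter.

```python
# Sample counts copied from the table above.
SAMPLES = {
    "S": {"train": 10048, "val": 5944, "test": 5944},
    "M": {"train": 21672, "val": 11736, "test": 11672},
    "L": {"train": 42088, "val": 11736, "test": 16560},
}


def split_fractions(counts):
    """Return each split's share of the total number of samples."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


for size, counts in SAMPLES.items():
    fractions = split_fractions(counts)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in fractions.items())
    print(f"{size}: {sum(counts.values())} samples ({summary})")
```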

The datasets are provided in four different pixel resolutions ($28 \times 28$, $64 \times 64$, $128 \times 128$, $256 \times 256$) and stored in HDF5 format.

An overview of which experiments are included in each dataset for training, validation, and testing can be found in the file size_splits.json.

Visualization of labelled data samples

The following figure shows examples of labelled data samples from the CrackMNIST dataset.

CrackMNIST samples

The inputs consist of the planar displacement fields ($u_x, u_y$), and the outputs are the binary segmentation masks.

Visualization of different pixel resolutions

The figure below shows the y-displacement field of a DIC sample at different pixel resolutions.

DIC pixel resolutions

Usage

Installation

The package can be installed via pip:

```bash
pip install crackmnist
```

Datasets are uploaded to Zenodo and are downloaded automatically upon usage.

Getting started

The datasets can be loaded using the implemented class CrackMNIST as follows:

```python
from crackmnist import CrackMNIST

train_dataset = CrackMNIST(split="train", pixels=28, size="S")
```

Here, the parameters `split`, `pixels`, and `size` specify the dataset split, pixel resolution, and dataset size, respectively.

The folder examples contains a Jupyter notebook getting_started.ipynb that demonstrates how to train a simple U-Net segmentation model and how to evaluate and visualize the results.
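One common way to evaluate a binary segmentation result is the Dice score, which measures the overlap between a predicted mask and the ground-truth mask. The sketch below is a plain-Python illustration of that metric. It is not taken from the notebook, which may use a different metric, and the function name `dice_score` is an assumption.

```python
def dice_score(pred, target, eps=1e-7):
    """Return the Dice score between two binary 2D masks of equal shape.

    Dice = 2 * |pred AND target| / (|pred| + |target|).
    The small eps keeps the score defined (and equal to 1) when both masks are empty.
    """
    intersection = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p, t = bool(p), bool(t)
            intersection += int(p and t)
            pred_sum += int(p)
            target_sum += int(t)
    return (2 * intersection + eps) / (pred_sum + target_sum + eps)


# Toy example: the prediction marks two pixels, one of which is correct.
prediction = [[0, 1], [1, 0]]
ground_truth = [[0, 1], [0, 0]]
print(f"Dice score: {dice_score(prediction, ground_truth):.3f}")
```

Because crack tip masks contain very few positive pixels, an overlap-based metric like this is usually more informative than pixel accuracy, which is dominated by the background.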

Contributors

Code implementation and data annotation by:
- Erik Schultheis
- David Melching

Experiment conduction and DIC data acquisition by:
- Florian Paysan
- Ferdinand Dömling
- Eric Dietrich

Supervision and conceptualization by:
- David Melching
- Eric Breitbarth

Citation

If you use the dataset or code in your research, please cite this GitHub repository:

```bibtex
@misc{crackmnist,
  title={Crack-MNIST - A Large-Scale Dataset for Crack Tip Detection in Digital Image Correlation Data},
  author={David Melching and Erik Schultheis and Florian Paysan and Ferdinand Dömling and Eric Dietrich and Eric Breitbarth},
  journal={GitHub repository},
  howpublished={\url{https://www.github.com/dlr-wf/crackmnist}},
  year={2025}
}
```

License and Limitations

The package is developed for research and educational purposes only and must not be used for any production or specification purposes. We do not guarantee its flawless implementation or execution in any form. However, if you run into errors in the code or find any bugs, feel free to contact us.

The code is licensed under the MIT License (see LICENSE file). The datasets are licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).

Owner

Name: German Aerospace Center (DLR) - Institute of Materials Research
Login: dlr-wf
Kind: organization
Location: Germany

Repositories: 1
Profile: https://github.com/dlr-wf

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this package, please cite it as below."
authors:
- family-names: "Melching"
  given-names: "David"
  orcid: "https://orcid.org/0000-0001-5111-6511"
- family-names: "Schultheis"
  given-names: "Erik"
  orcid: "https://orcid.org/0009-0007-4728-7124"
- family-names: "Paysan"
  given-names: "Florian"
- family-names: "Dömling"
  given-names: "Ferdinand"
- family-names: "Dietrich"
  given-names: "Eric"
- family-names: "Breitbarth"
  given-names: "Eric"
  orcid: "https://orcid.org/0000-0002-3479-9143"
title: "CrackMNIST - A Large-Scale Dataset for Crack Tip Detection in Digital Image Correlation Data"
version: 1.0.0
doi: 10.5281/zenodo.15013128
date-released: 2025-03-12
url: "https://github.com/dlr-wf/crackmnist"

GitHub Events

Total

Release event: 1
Watch event: 2
Public event: 1
Push event: 3
Fork event: 2
Create event: 1

Last Year

Release event: 1
Watch event: 2
Public event: 1
Push event: 3
Fork event: 2
Create event: 1

Packages

Total packages: 1
Total downloads:
- pypi 10 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 1
Total maintainers: 1

pypi.org: crackmnist

Documentation: https://crackmnist.readthedocs.io/
License: MIT License Copyright (c) 2025 German Aerospace Center (DLR) Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
Latest release: 1.0.0
published over 1 year ago

Versions: 1
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 10 Last month

Rankings

Dependent packages count: 9.5%

Average: 31.5%

Dependent repos count: 53.5%

Maintainers (1)

melc-da

Last synced: 10 months ago

Dependencies

pyproject.toml pypi

alive-progress ~=3.2.0
h5py ~=3.12.1
numpy ~=2.1.3
torch ~=2.5.1
torchvision ~=0.20.1
