sorghum_100_masks
Science Score: 65.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
✓DOI references
Found 6 DOI reference(s) in README -
○Academic publication links
-
○Academic email domains
-
✓Institutional organization owner
Organization cct-datascience has institutional domain (datascience.cals.arizona.edu) -
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: cct-datascience
- License: bsd-3-clause
- Language: Shell
- Default Branch: main
- Size: 7.26 MB
Statistics
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Generate Image Masks and Canopy Cover from Sorghum 100 Images
This is based on the Sorghum 100 dataset (Ren et al 2021). Some code that may be useful for generating masks that separate pixes with plant from soil, and calculates canopy cover as the % of the image that is plant.
The soil masking algorithm is from Burnette et al (2019) and is documented here: https://github.com/terraref/extractors-stereo-rgb/tree/master/canopycover
The Sorghum 100 dataset has been used in two Kaggle competitions:
- 2021: https://www.kaggle.com/c/sorghum-biomass-prediction/overview/iccv-2021-cvppa
- 2022: https://www.kaggle.com/competitions/sorghum-id-fgvc-9/overview
Proceedure
Download data
Follow these instructions to get Kaggle data.
sh
pip install kaggle
export KAGGLE_USERNAME=datadinosaur
export KAGGLE_KEY=xxxxxxxxxxxxxx
kaggle competitions download -c sorghum-biomass-prediction
Run algorithms
The core masking algorithm is here https://github.com/AgPipeline/transformer-soilmask.
It is called in the mask_file.sh script (Burnette et al 2018, Burnette et al 2019)
sh
unzip sorghum-biomass-prediction.zip
for script in `ls run_all_part*`
do
nohup bash $script > nohup${script}.out 2>&1 &
done
To count / check progress:
```sh
The total count of source files (not masked)
find . -name "*.png" | grep -v mask | wc -l
The total count of generated mask files
find . -name "*mask.png" | wc -l ```
there are
* 277,327 training images (according to Kaggle)
* 19,442 testing images (according to Kaggle)
* 324,927 total images (according to find + wc)
The core canopy cover generating algorithm can be found at https://github.com/AgPipeline/transformer-canopycover.
It is called in the canopy_cover_file.sh script
To generate canopy cover as a set of individual CSV files:
sh
for script in `ls run_canopycover_part*`
do
nohup bash $script > nohup${script}.out 2>&1 &
done
Similar to above, the following can be used to count / check progress:
```sh
The total count of masked image files
find . -name "*.png" | grep mask | wc -l
The total count of the CSV files generated so far
find . -name "*.csv" | wc -l ```
To merge the individual CSV files into a single file named sorghum_biomass_canopycover.csv in the current folder:
```bash
Merge CSV files - assumes the CSV files can be found in a folder named "data" residing in the current folder
python3 mergecsv.py --output-file sorghumbiomass_canopycover.csv $PWD/data/ ./ ```
Running the merge_csv.py script for the first time will create the output file and add the found data to it.
Subsequent runs of this script will append the found data to the existing output file.
The canopy cover value goes from 0.0 = no plants to 100.0 = all plants / no soil. The file contains timestamp, canopy cover, species, site, and method columns. Some of these columns may be empty if there isn't a value available for that column. At a minimum, the canopy cover, site, and method columns will be populated.
A sample of the header and the first few rows of data:
|localdatetime|canopycover|species|site|method| |--------------|------------|-------|----|------| | |0| |2017-04-2613-59-06-988_mask|Green Canopy Cover Estimation from Field Scanner RGB images | | |0.737| |2017-05-0512-28-24-229_mask|Green Canopy Cover Estimation from Field Scanner RGB images |
References
Burnette, Maxwell, et al. (2018) "TERRA-REF data processing infrastructure." Proceedings of the Practice and Experience on Advanced Research Computing. 2018. 1-7. https://doi.org/10.1145/3219104.3219152
Burnette et al (2019) terraref/extractors-stereo-rgb: Season 6 Data Publication (2019) (Version S6Pub2019). Zenodo. http://doi.org/10.5281/zenodo.3406304
Christophe Schnaufer, Julian L. Pistorius, and David S. LeBauer "An open, scalable, and flexible framework for automated aerial measurement of field experiments", Proc. SPIE 11414, Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping V, 114140A (19 May 2020); https://doi.org/10.1117/12.2560008
Ren, C., Dulay, J., Rolwes, G., Pauli, D., Shakoor, N., & Stylianou, A. (2021). Multi-resolution outlier pooling for sorghum classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 2931-2939).
Owner
- Name: CCT Data Science
- Login: cct-datascience
- Kind: organization
- Email: cct-datascience@arizona.edu
- Location: United States of America
- Website: datascience.cals.arizona.edu
- Twitter: cct_datascience
- Repositories: 29
- Profile: https://github.com/cct-datascience
Citation (CITATION.cff)
cff-version: 1.2.0
title: >-
Generate Image Masks and Canopy Cover from Sorghum
100 Images
message: >-
If you use this software, please cite it using these metadata.
type: software
authors:
- given-names: Chris
family-names: Schnaufer
email: schnaufer@arizona.edu
affiliation: University of Arizona
orcid: 'https://orcid.org/0000-0002-6150-4558'
- given-names: David
family-names: LeBauer
email: dlebauer@arizona.edu
affiliation: University of Arizona
orcid: 'https://orcid.org/0000-0001-7228-053X'
identifiers:
- type: doi
value: 10.5281/zenodo.6456476
description: DOI for most recent version on Zenodo
repository-code: >-
https://github.com/cct-datascience/sorghum_100_masks
license: BSD-3-Clause
references:
- authors:
- given-names: Max
family-names: Burnette
affiliation: NCSA
- given-names: David
family-names: LeBauer
affiliation: University of Arizona
- given-names: Zongyang
family-names: Li
- given-names: Wei
family-names: Qin
- given-names: Solmaz
family-names: Hajmohammadi
- given-names: Craig
family-names: Willis
affiliation: University of Illinois
- given-names: Sidike
family-names: Paheding
affiliation: Saint Louis University
- given-names: Nick
family-names: Heyek
affiliation: University of Illinois
version: S6_Pub_2019
type: software
title: "terraref/extractors-stereo-rgb: Season 6 Data Publication (2019) (Version S6_Pub_2019)"
date-released: 2019-09-12
repository-code: https://github.com/terraref/extractors-stereo-rgb
license: BSD-3-Clause
- authors:
- given-names: Chris
family-names: Schnaufer
email: schnaufer@arizona.edu
affiliation: University of Arizona
orcid: 'https://orcid.org/0000-0002-6150-4558'
- given-names: David
family-names: LeBauer
email: dlebauer@arizona.edu
affiliation: University of Arizona
orcid: 'https://orcid.org/0000-0001-7228-053X'
title: "An open, scalable, and flexible framework for automated aerial measurement of field experiments"
type: "conference-paper"
conference:
name: "Proc. SPIE 11414, Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping V, 114140A"
date-start: 2020-05-19