agml

AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.

https://github.com/project-agml/agml

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    15 of 28 committers (53.6%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.9%) to scientific vocabulary

Keywords

agriculture computer-vision dataset deep-learning image-classification object-detection pytorch semantic-segmentation synthetic-data
Last synced: 6 months ago · JSON representation

Repository

AgML is a centralized framework for agricultural machine learning. AgML provides access to public agricultural datasets for common agricultural deep learning tasks, with standard benchmarks and pretrained models, as well as the ability to generate synthetic data and annotations.

Basic Info
Statistics
  • Stars: 235
  • Watchers: 14
  • Forks: 34
  • Open Issues: 11
  • Releases: 27
Topics
agriculture computer-vision dataset deep-learning image-classification object-detection pytorch semantic-segmentation synthetic-data
Created over 4 years ago · Last pushed 10 months ago
Metadata Files
Readme Contributing License Code of conduct Authors

README.md



👨🏿‍💻👩🏽‍💻🌈🪴 Want to join the AI Institute for Food Systems team and help lead AgML development? 🪴🌈👩🏼‍💻👨🏻‍💻

We're looking to hire a postdoc with both Python library development and ML experience. Send your resume and GitHub profile link to jmearles@ucdavis.edu!


Overview

AgML is a comprehensive library for agricultural machine learning. Currently, AgML provides access to a wealth of public agricultural datasets for common agricultural deep learning tasks. In the future, AgML will provide ag-specific ML functionality related to data, training, and evaluation. Here's a conceptual diagram of the overall framework.

agml framework

AgML supports both the TensorFlow and PyTorch machine learning frameworks.

Installation

To install the latest release of AgML, run the following command:

```shell
pip install agml
```

NOTE: Some features of AgML, such as synthetic data generation, require GUI applications. When running AgML through Windows Subsystem for Linux (WSL), it may be necessary to configure your WSL environment to utilize these features. Please follow the Microsoft documentation to install all necessary prerequisites and update WSL. The latest version of WSL includes built-in support for running Linux GUI applications.

Quick Start

AgML is designed for easy usage of agricultural data in a variety of formats. You can start off by using the AgMLDataLoader to download and load a dataset into a container:

```python
import agml

loader = agml.data.AgMLDataLoader('apple_flower_segmentation')
```

You can then use the in-built processing methods to get the loader ready for your training and evaluation pipelines. This includes, but is not limited to, batching data, shuffling data, splitting data into training, validation, and test sets, and applying transforms.

```python
import albumentations as A

# Batch the dataset into collections of 8 pieces of data:
loader.batch(8)

# Shuffle the data:
loader.shuffle()

# Apply transforms to the input images and output annotation masks:
loader.mask_to_channel_basis()
loader.transform(
    transform = A.RandomContrast(),
    dual_transform = A.Compose([A.RandomRotate90()])
)

# Split the data into train/val/test sets.
loader.split(train = 0.8, val = 0.1, test = 0.1)
```

The split datasets can be accessed using loader.train_data, loader.val_data, and loader.test_data. Processing applied to the main loader also applies to the split datasets until the split attributes are first accessed; after that, you need to apply processing to each of the split loaders independently. You can also toggle processing on and off using the loader.eval(), loader.reset_preprocessing(), and loader.disable_preprocessing() methods.
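As a rough illustration of what a 0.8/0.1/0.1 split means in practice, the sketch below partitions shuffled indices by fraction. It is plain Python and not AgML's internal implementation: the helper name `split_indices`, the fixed seed, and the choice to give any rounding remainder to the test set are all assumptions made for the example. The count of 148 matches the number of images listed for the apple flower segmentation dataset.

```python
import random


def split_indices(n, train=0.8, val=0.1, test=0.1, seed=0):
    """Shuffle range(n) and partition it by the given fractions (hypothetical helper)."""
    if abs(train + val + test - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(n * train)
    n_val = int(n * val)
    # Assumption: whatever is left after rounding down goes to the test set.
    return (
        indices[:n_train],
        indices[n_train:n_train + n_val],
        indices[n_train + n_val:],
    )


train_idx, val_idx, test_idx = split_indices(148)
print(len(train_idx), len(val_idx), len(test_idx))  # 118 14 16
```

Because the three lists are disjoint slices of one shuffled list, every image lands in exactly one split.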

You can visualize data using the agml.viz module, which supports multiple different types of visualization for different data types:

```python
# Disable processing and batching for the test data:
test_ds = loader.test_data
test_ds.batch(None)
test_ds.reset_preprocessing()

# Visualize the image and mask side-by-side:
agml.viz.visualize_image_and_mask(test_ds[0])

# Visualize the mask overlaid onto the image:
agml.viz.visualize_overlaid_masks(test_ds[0])
```

AgML supports both TensorFlow and PyTorch as backends, and can export your loaders to native TensorFlow and PyTorch formats when you want to use them in a training pipeline. You can either export the AgMLDataLoader to a tf.data.Dataset or torch.utils.data.DataLoader, or convert the data internally within the AgMLDataLoader itself, which keeps its core functionality available.

```python
# Export the loader as a tf.data.Dataset:
train_ds = loader.train_data.export_tensorflow()

# Convert to PyTorch tensors without exporting.
train_ds = loader.train_data
train_ds.as_torch_dataset()
```

You're now ready to use AgML for training your own models! Luckily, AgML comes with a training module that enables quick-start training of standard deep learning models on agricultural datasets. Training a grape detection model is as simple as the following code:

```python
import agml
import agml.models

import albumentations as A

loader = agml.data.AgMLDataLoader('grape_detection_californiaday')
loader.split(train = 0.8, val = 0.1, test = 0.1)
processor = agml.models.preprocessing.EfficientDetPreprocessor(
    image_size = 512, augmentation = [A.HorizontalFlip(p=0.5)]
)
loader.transform(processor)

model = agml.models.DetectionModel(num_classes=loader.num_classes)

model.run_training(loader)
```

Public Dataset Listing

AgML contains a wide variety of public datasets from various locations across the world:

AgML Dataset World Map

The following is a comprehensive list of all datasets available in AgML. For more information, you can use agml.data.public_data_sources(...) with various filters to filter datasets according to your desired specification.

| Dataset | Task | Number of Images |
| :--- | ---: | ---: |
| bean_disease_uganda | Image Classification | 1295 |
| carrot_weeds_germany | Semantic Segmentation | 60 |
| plant_seedlings_aarhus | Image Classification | 5539 |
| soybean_weed_uav_brazil | Image Classification | 15336 |
| sugarcane_damage_usa | Image Classification | 153 |
| crop_weeds_greece | Image Classification | 508 |
| sugarbeet_weed_segmentation | Semantic Segmentation | 1931 |
| rangeland_weeds_australia | Image Classification | 17509 |
| fruit_detection_worldwide | Object Detection | 565 |
| leaf_counting_denmark | Image Classification | 9372 |
| apple_detection_usa | Object Detection | 2290 |
| mango_detection_australia | Object Detection | 1730 |
| apple_flower_segmentation | Semantic Segmentation | 148 |
| apple_segmentation_minnesota | Semantic Segmentation | 670 |
| rice_seedling_segmentation | Semantic Segmentation | 224 |
| plant_village_classification | Image Classification | 55448 |
| autonomous_greenhouse_regression | Image Regression | 389 |
| grape_detection_syntheticday | Object Detection | 448 |
| grape_detection_californiaday | Object Detection | 126 |
| grape_detection_californianight | Object Detection | 150 |
| guava_disease_pakistan | Image Classification | 306 |
| apple_detection_spain | Object Detection | 967 |
| apple_detection_drone_brazil | Object Detection | 689 |
| plant_doc_classification | Image Classification | 2598 |
| plant_doc_detection | Object Detection | 2598 |
| wheat_head_counting | Object Detection | 6512 |
| peachpear_flower_segmentation | Semantic Segmentation | 42 |
| red_grapes_and_leaves_segmentation | Semantic Segmentation | 258 |
| white_grapes_and_leaves_segmentation | Semantic Segmentation | 273 |
| ghai_romaine_detection | Object Detection | 500 |
| ghai_green_cabbage_detection | Object Detection | 500 |
| ghai_iceberg_lettuce_detection | Object Detection | 500 |
| riseholme_strawberry_classification_2021 | Image Classification | 3520 |
| ghai_broccoli_detection | Object Detection | 500 |
| bean_synthetic_earlygrowth_aerial | Semantic Segmentation | 2500 |
| ghai_strawberry_fruit_detection | Object Detection | 500 |
| vegann_multicrop_presence_segmentation | Semantic Segmentation | 3775 |
| corn_maize_leaf_disease | Image Classification | 4188 |
| tomato_leaf_disease | Image Classification | 11000 |
| vine_virus_photo_dataset | Image Classification | 3866 |
| tomato_ripeness_detection | Object Detection | 804 |
| embrapa_wgisd_grape_detection | Object Detection | 239 |
| growliflower_cauliflower_segmentation | Semantic Segmentation | 1542 |
| strawberry_detection_2023 | Object Detection | 204 |
| strawberry_detection_2022 | Object Detection | 175 |
| almond_harvest_2021 | Object Detection | 50 |
| almond_bloom_2023 | Object Detection | 100 |
| gemini_flower_detection_2022 | Object Detection | 134 |
| gemini_leaf_detection_2022 | Object Detection | 25 |
| gemini_pod_detection_2022 | Object Detection | 98 |
| gemini_plant_detection_2022 | Object Detection | 402 |
| paddy_disease_classification | Image Classification | 10407 |
| onion_leaf_classification | Image Classification | 4502 |
| chilli_leaf_classification | Image Classification | 10974 |
| orange_leaf_disease_classification | Image Classification | 5813 |
| papaya_leaf_disease_classification | Image Classification | 2159 |
| blackgram_plant_leaf_disease_classification | Image Classification | 1007 |
| arabica_coffee_leaf_disease_classification | Image Classification | 58549 |
| banana_leaf_disease_classification | Image Classification | 1288 |
| coconut_tree_disease_classification | Image Classification | 5798 |
| rice_leaf_disease_classification | Image Classification | 3829 |
| tea_leaf_disease_classification | Image Classification | 5867 |
| betel_leaf_disease_classification | Image Classification | 3589 |
| java_plum_leaf_disease_classification | Image Classification | 2400 |
| sunflower_disease_classification | Image Classification | 2358 |
| cucumber_disease_classification | Image Classification | 7689 |
| iNatAg | Image Classification | 4720903 |
| iNatAg-mini | Image Classification | 560844 |
| soybean_insect_classification | Image Classification | 6410 |
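To make the idea of filtering the listing concrete, here is a small sketch that selects datasets by task and minimum image count from a handful of rows copied out of the table above. It does not call AgML: `DatasetInfo` and `filter_datasets` are hypothetical names introduced for this example, and the real agml.data.public_data_sources(...) filters may be spelled differently. Dataset names are written with underscores between words.

```python
from typing import NamedTuple


class DatasetInfo(NamedTuple):
    name: str
    task: str
    num_images: int


# A few rows copied from the dataset listing (not the full table).
DATASETS = [
    DatasetInfo("apple_detection_usa", "Object Detection", 2290),
    DatasetInfo("mango_detection_australia", "Object Detection", 1730),
    DatasetInfo("grape_detection_californiaday", "Object Detection", 126),
    DatasetInfo("wheat_head_counting", "Object Detection", 6512),
    DatasetInfo("apple_flower_segmentation", "Semantic Segmentation", 148),
    DatasetInfo("plant_seedlings_aarhus", "Image Classification", 5539),
]


def filter_datasets(datasets, task=None, min_images=0):
    """Keep datasets matching the task (if given) with at least min_images images."""
    return [
        d for d in datasets
        if (task is None or d.task == task) and d.num_images >= min_images
    ]


large_detection = filter_datasets(DATASETS, task="Object Detection", min_images=1000)
print([d.name for d in large_detection])
# ['apple_detection_usa', 'mango_detection_australia', 'wheat_head_counting']
```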

iNatAg and iNatAg-mini

AgML provides an API with direct access to iNatAg (and iNatAg-mini), one of the world's largest collections of agricultural images dedicated to image classification. Collectively, this dataset contains over 4 million images along with detailed species classifications, and supports a variety of large-scale agricultural machine learning tasks. You can instantiate the iNatAg dataset (or iNatAg-mini, a smaller variant of iNatAg for smaller-scale applications) as follows:

```python
# To select a collection of scientific family names.
loader = agml.data.AgMLDataLoader.from_parent("iNatAg", filters={"family_name": ["...", "..."]})

# To select common names.
loader = agml.data.AgMLDataLoader.from_parent("iNatAg", filters={"common_name": "..."})
```

Usage Information

Using Public Agricultural Data

AgML aims to provide easy access to a range of existing public agricultural datasets. The core of AgML's public data pipeline is AgMLDataLoader. You can use the AgMLDataLoader or agml.data.download_public_dataset() to download a dataset locally; on future runs it is loaded automatically from disk. From there, the data within the loader can be split into train/val/test sets, batched, augmented and transformed, and converted into a training-ready dataset (including batching, tensor conversion, and image formatting).
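The download-once behavior described above follows a common caching pattern, sketched below in plain Python. This is an illustration only and not how AgML is implemented: `ensure_local` and `fake_fetch` are hypothetical names, and the "download" here just writes a placeholder file into a temporary directory.

```python
import tempfile
from pathlib import Path


def ensure_local(name, root, fetch):
    """Return the local directory for `name`, calling `fetch` only if it is missing."""
    target = Path(root) / name
    if not target.exists():
        target.mkdir(parents=True)
        fetch(target)  # hypothetical downloader that writes files into `target`
    return target


calls = []


def fake_fetch(target):
    # Stand-in for a real download: record the call and write a marker file.
    calls.append(target.name)
    (target / "images.txt").write_text("placeholder")


with tempfile.TemporaryDirectory() as root:
    first = ensure_local("apple_flower_segmentation", root, fake_fetch)
    second = ensure_local("apple_flower_segmentation", root, fake_fetch)
    print(first == second, len(calls))  # True 1
```

The second call finds the directory already on disk and skips the fetch, which is the same effect as loading a previously downloaded dataset on a later run.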

To see the various ways in which you can use AgML datasets in your training pipelines, check out the example notebook.

Annotation Formats

A core aim of AgML is to provide datasets in a standardized format, enabling the synthesizing of multiple datasets into a single training pipeline. To this end, we provide annotations in the following formats:

  • Image Classification: Image-To-Label-Number
  • Object Detection: COCO JSON
  • Semantic Segmentation: Dense Pixel-Wise
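For orientation, the sketch below shows what each of the three annotation styles can look like as plain Python data. The file names, class IDs, and the helpers `bbox_to_corners` and `class_pixel_counts` are made up for illustration, and the exact on-disk layout AgML uses may differ. The COCO fields shown (images, categories, annotations, with boxes as [x, y, width, height]) follow the standard COCO detection format.

```python
import json

# Image classification: each image maps to an integer label.
classification = {"leaf_001.jpg": 0, "leaf_002.jpg": 2}

# Object detection: a COCO-style dictionary. Boxes are [x, y, width, height].
coco = {
    "images": [{"id": 1, "file_name": "vine_001.jpg", "width": 640, "height": 480}],
    "categories": [{"id": 1, "name": "grape"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [100, 120, 50, 80], "area": 4000, "iscrowd": 0},
    ],
}

# Semantic segmentation: one class ID per pixel (0 is background in this example).
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def bbox_to_corners(bbox):
    """Convert a COCO [x, y, width, height] box to [x_min, y_min, x_max, y_max]."""
    x, y, w, h = bbox
    return [x, y, x + w, y + h]


def class_pixel_counts(mask):
    """Count how many pixels carry each class ID in a dense mask."""
    counts = {}
    for row in mask:
        for class_id in row:
            counts[class_id] = counts.get(class_id, 0) + 1
    return counts


print(bbox_to_corners(coco["annotations"][0]["bbox"]))  # [100, 120, 150, 200]
print(class_pixel_counts(mask))  # {0: 8, 1: 4}
coco_json = json.dumps(coco)  # the detection annotations serialize directly to JSON
```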

Contributions

We welcome contributions! If you would like to contribute a new feature, fix an issue that you've noticed, or even just mention a bug or feature that you would like to see implemented, please don't hesitate to use the Issues tab to bring it to our attention.

See the contributing guidelines for more information.

Funding

This project is partly funded by the National AI Institute for Food Systems.

Owner

  • Name: AgML
  • Login: Project-AgML
  • Kind: organization

AgML is a comprehensive library for agricultural machine learning.

GitHub Events

Total
  • Create event: 20
  • Issues event: 4
  • Release event: 4
  • Watch event: 53
  • Delete event: 8
  • Issue comment event: 15
  • Member event: 1
  • Push event: 129
  • Pull request review event: 25
  • Pull request review comment event: 34
  • Pull request event: 32
  • Fork event: 6
Last Year
  • Create event: 20
  • Issues event: 4
  • Release event: 4
  • Watch event: 53
  • Delete event: 8
  • Issue comment event: 15
  • Member event: 1
  • Push event: 129
  • Pull request review event: 25
  • Pull request review comment event: 34
  • Pull request event: 32
  • Fork event: 6

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 914
  • Total Committers: 28
  • Avg Commits per committer: 32.643
  • Development Distribution Score (DDS): 0.248
Past Year
  • Commits: 147
  • Committers: 7
  • Avg Commits per committer: 21.0
  • Development Distribution Score (DDS): 0.612
Top Committers
Name Email Commits
amogh7joshi j****n@g****m 687
Naitik Jain n****n@s****n 54
Mason Earles j****s@J****l 31
Mason Earles 2****s 21
Leandro G. Almeida l****a@g****m 18
Heesup Yun h****n@u****u 16
Dario Guevara d****a@u****u 13
smbanx s****x@g****m 10
alexolenskyj a****j@u****u 9
Dario Guevara d****1@g****m 8
github-actions[bot] g****] 7
Mason Earles j****s@c****u 6
Pranav Raja p****a@p****n 5
Mason Earles j****s@c****u 5
pranavraja99 p****9@i****m 3
varunUCDavis v****a@u****u 3
dguevara d****a@a****u@v****u 3
Naitik n****1@g****m 2
Ooberaj y****k@c****u 2
Ooberaj y****k@c****u 2
ctyeong c****g@g****m 2
Mason Earles j****s@c****u 1
Ooberaj y****k@Y****l 1
Ooberaj y****k@c****u 1
Ooberaj y****k@c****u 1
Pranav Raja p****a@P****l 1
amnjoshi a****i@a****u@v****u 1
momtanu-ag m****y@u****u 1
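The all-time figures above can be reproduced from the table, assuming the usual definition of the Development Distribution Score as one minus the top committer's share of all commits. The short check below uses only numbers listed in this section; the variable names are chosen for the example.

```python
total_commits = 914
total_committers = 28
top_committer_commits = 687  # first row of the Top Committers table

avg_commits = total_commits / total_committers
# Assumed definition: DDS = 1 - (commits by top committer / total commits)
dds = 1 - top_committer_commits / total_commits

print(round(avg_commits, 3), round(dds, 3))  # 32.643 0.248
```

A DDS near 0 means one committer wrote almost everything, while a value closer to 1 means the work is spread across many people. The past-year score of 0.612 cannot be rechecked the same way, because per-committer counts for that period are not listed here.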

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 29
  • Total pull requests: 52
  • Average time to close issues: 30 days
  • Average time to close pull requests: 6 days
  • Total issue authors: 15
  • Total pull request authors: 10
  • Average comments per issue: 4.1
  • Average comments per pull request: 0.23
  • Merged pull requests: 48
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 24
  • Average time to close issues: 15 days
  • Average time to close pull requests: 9 days
  • Issue authors: 3
  • Pull request authors: 4
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 24
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ctyeong (7)
  • masonearles (3)
  • Vincent-WangCH (3)
  • StupiddCupid (3)
  • alexolenskyj (2)
  • NielsRogge (1)
  • khawar-islam (1)
  • bradezard131 (1)
  • boudiafA (1)
  • xml94 (1)
  • Akshatha-Mohan (1)
  • Icecream-blue-sky (1)
  • andreaceruti (1)
  • dariojavo (1)
  • cmbadgujar10 (1)
Pull Request Authors
  • amogh7joshi (27)
  • lalmei (19)
  • naitikjain3071 (6)
  • Ooberaj (5)
  • smbanx (4)
  • dariojavo (3)
  • pranavraja99 (1)
  • momtanu-ag (1)
  • ctyeong (1)
  • alexolenskyj (1)
Top Labels
Issue Labels
bug (11) synthetic (7) enhancement (4) dataset (2) documentation (1)
Pull Request Labels
enhancement (19) release (9) bug (7) synthetic (7) documentation (7) dataset (5) models (1)

Packages

  • Total packages: 2
  • Total downloads: unknown
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 54
proxy.golang.org: github.com/Project-AgML/AgML
  • Versions: 27
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.0%
Average: 9.6%
Dependent repos count: 10.2%
Last synced: 6 months ago
proxy.golang.org: github.com/project-agml/agml
  • Versions: 27
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 9.0%
Average: 9.6%
Dependent repos count: 10.2%
Last synced: 6 months ago

Dependencies

requirements.txt pypi
  • albumentations *
  • matplotlib *
  • numpy *
  • opencv-python *
  • pyyaml >=5.4.1
  • scikit-learn *
  • tensorflow *
  • torch *
  • torchvision *
  • tqdm *
setup.py pypi
  • line *