Recent Releases of mcity_data_engine

mcity_data_engine - v1.1.0: Depth Estimation, Segmentation, Class Mapping, Dataset SUNRGBD

Initial release of the Monocular Depth Estimation Module, adding support for depth estimation, segmentation, and class mapping. To use this module, configure your workflow and dataset in the config:

SELECTED_WORKFLOW = ["auto_label_mask"]
SELECTED_DATASET = {"name": "SUNRGBD", "n_samples": None}

Each individual workflow can be configured in WORKFLOWS = {...}
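As a rough sketch of how these settings fit together, the fragment below pairs the selection variables with a WORKFLOWS entry and a small consistency check. Only SELECTED_WORKFLOW, SELECTED_DATASET, and the name WORKFLOWS come from these notes; the option key `example_option` and the helper `check_config` are hypothetical and only illustrate the shape.

```python
# Hypothetical config fragment; the real option names live in the project's config.
SELECTED_WORKFLOW = ["auto_label_mask"]
SELECTED_DATASET = {"name": "SUNRGBD", "n_samples": None}  # None presumably means "all samples"

WORKFLOWS = {
    "auto_label_mask": {
        # Placeholder option, not one of the engine's real keys.
        "example_option": True,
    },
}


def check_config(selected, workflows):
    """Return the selected workflow names that have no entry in WORKFLOWS."""
    return [name for name in selected if name not in workflows]


missing = check_config(SELECTED_WORKFLOW, WORKFLOWS)
print("Workflows without configuration:", missing)
```

A check like this catches a typo in a workflow name before any data is loaded.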

Visual Data

As mentioned in v1.0.0, the Mcity Data Engine supports visual data. Leveraging Voxel51, we convert any dataset into the Voxel51 (V51) dataset format and perform all operations on it. This way, the Data Engine can be used with any dataset that can be converted into the V51 format.

This release expands dataset support by integrating an additional dataset, SUNRGB-D.

The SUNRGB-D dataset contains 10,335 RGB-D images, each of which has a corresponding RGB image, depth image, and camera intrinsics. It contains images from the NYU depth v2, Berkeley B3DO, and SUN3D datasets.

With SUNRGB-D integrated, the Mcity Data Engine gains a dataset with depth and segmentation data for the new workflows.
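To illustrate why a depth image and camera intrinsics are useful together, here is a minimal sketch of pinhole back-projection, which maps a pixel with a known depth to a 3D point in the camera frame. The intrinsics values below are made up for the example and are not taken from SUNRGB-D; the function name `backproject` is likewise hypothetical.

```python
def backproject(u, v, depth_m, fx, fy, cx, cy):
    """Map pixel (u, v) with metric depth to a 3D point in the camera frame.

    Uses the standard pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Illustrative intrinsics (focal lengths and principal point in pixels).
fx, fy, cx, cy = 500.0, 500.0, 320.0, 240.0

point = backproject(u=420, v=240, depth_m=2.0, fx=fx, fy=fy, cx=cx, cy=cy)
print(point)
```

A pixel on the principal point maps to a point straight ahead of the camera, at x = y = 0.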

Data Curation

Unlike the other datasets currently available in the Mcity Data Engine, SUNRGB-D requires a manual download.

To download the dataset, use the following commands:

curl -o sunrgbd.zip https://rgbd.cs.princeton.edu/data/SUNRGBD.zip
unzip sunrgbd.zip

Load the dataset into Voxel51 using:

python main.py

The dataset should then appear in your Voxel51 environment:

[Image: mde_gt_heatmaps]
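For scripted setups, the manual download step above could also be done from Python with the standard library. This is a sketch and not part of the Data Engine: the function names `download`, `extract`, and `fetch_sunrgbd` are hypothetical, and only the URL and the archive file name come from the commands above.

```python
import urllib.request
import zipfile
from pathlib import Path

SUNRGBD_URL = "https://rgbd.cs.princeton.edu/data/SUNRGBD.zip"  # from the curl command above


def download(url, zip_path):
    """Download url to zip_path unless that file already exists."""
    zip_path = Path(zip_path)
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    return zip_path


def extract(zip_path, target_dir):
    """Unpack the archive into target_dir and return the archived names."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target_dir)
        return archive.namelist()


def fetch_sunrgbd(target_dir="."):
    """Equivalent of the curl + unzip commands (large download)."""
    archive_path = download(SUNRGBD_URL, Path(target_dir) / "sunrgbd.zip")
    return extract(archive_path, target_dir)
```

Calling `fetch_sunrgbd()` triggers the full download, so it is deliberately not invoked here. Because `download` skips existing files, rerunning it after an interrupted transfer would keep the partial archive; delete the file first in that case.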

Customize Heatmaps (Optional)

When working with depth maps, adjusting the color scheme and opacity can improve visualization.

To modify these settings:

  • Click the palette icon at the top-left corner of Voxel51.
  • Under "Color annotations by", select "value".
  • Scroll to the bottom to "Colorscale", select "name", and type viridis in the text box for optimal contrast.

[Image: mde_color_customization]

Merged PRs

  • Updated teacher to autolabel: naming consistency, config, and pytest by @isabelmoore in https://github.com/mcity/mcity_data_engine/pull/127

Full Changelog: https://github.com/mcity/mcity_data_engine/commits/v1.1.0

- Python
Published by daniel-bogdoll 11 months ago

mcity_data_engine - v1.0.0: AWS Integration, Data Curation, Model Training, Model Inference

Initial release of the Mcity Data Engine with a focus on the task of object detection for visual datasets. To use the Mcity Data Engine, select your workflows and the dataset in the config:

SELECTED_WORKFLOW = ["embedding_selection", "auto_labeling"]
SELECTED_DATASET = {"name": "fisheye8k", "n_samples": None}

Each individual workflow can be configured in WORKFLOWS = {...}. Start the Mcity Data Engine with python main.py.

Visual Data

The Mcity Data Engine supports visual data. Leveraging Voxel51, we convert any dataset into the Voxel51 (V51) dataset format and perform all operations on it. This way, the Data Engine can be used with any dataset that can be converted into the V51 format.

Currently connected datasets:

[Image: currently connected datasets]

Data Curation

To select samples of interest, the Data Engine initially provides three workflows:

Selection by Embedding Ensemble

Leveraging the Voxel51 Brain component, the Data Engine computes image embeddings based on an ensemble of models. These are leveraged to select both representative and unique samples.
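To make the idea of "representative" and "unique" samples concrete, here is a simplified stand-in that does not use Voxel51 Brain and is not its algorithm. It assumes uniqueness can be approximated by the distance to the nearest neighbor in embedding space and representativeness by closeness to the mean embedding, with scores averaged over the ensemble. All function names and the toy embeddings are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def uniqueness_scores(embeddings):
    """Distance to the nearest other embedding; larger means more unique."""
    return [
        min(euclidean(e, other) for j, other in enumerate(embeddings) if j != i)
        for i, e in enumerate(embeddings)
    ]


def representativeness_scores(embeddings):
    """Negative distance to the mean embedding; larger means more typical."""
    dim = len(embeddings[0])
    centroid = [sum(e[d] for e in embeddings) / len(embeddings) for d in range(dim)]
    return [-euclidean(e, centroid) for e in embeddings]


def ensemble_mean(score_lists):
    """Average per-sample scores across the models of an ensemble."""
    return [sum(scores) / len(scores) for scores in zip(*score_lists)]


# Toy 2-D embeddings of four images from two hypothetical models.
model_a = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
model_b = [(1.0, 1.0), (1.1, 1.0), (1.0, 1.2), (9.0, 9.0)]

unique = ensemble_mean([uniqueness_scores(model_a), uniqueness_scores(model_b)])
most_unique = max(range(len(unique)), key=unique.__getitem__)
print("Most unique sample index:", most_unique)
```

In this toy data the fourth image lies far from the other three in both embedding spaces, so it receives the highest averaged uniqueness score.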

Selection by Language-Prompted Zero-Shot Ensemble

Leveraging Zero-Shot Object Detection Models from Hugging Face, the Mcity Data Engine identifies images that include n instances of classes of interest. It combines an ensemble of models to reduce both false positives and false negatives.
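One way such an ensemble could be combined is a simple vote over per-image detection counts. The sketch below assumes this voting scheme, which the release notes do not specify: an image is selected when enough models agree that it contains at least n instances of the class of interest. The function `select_by_vote` and the toy counts are hypothetical.

```python
def select_by_vote(counts_per_model, min_instances, min_votes):
    """Return image IDs for which at least min_votes models report
    min_instances or more detections of the class of interest."""
    votes = {}
    for counts in counts_per_model:
        for image_id, count in counts.items():
            if count >= min_instances:
                votes[image_id] = votes.get(image_id, 0) + 1
    return sorted(image_id for image_id, v in votes.items() if v >= min_votes)


# Toy per-image detection counts for one class from three hypothetical models.
counts_per_model = [
    {"img_1": 3, "img_2": 0, "img_3": 1},
    {"img_1": 2, "img_2": 1, "img_3": 0},
    {"img_1": 2, "img_2": 0, "img_3": 2},
]

selected = select_by_vote(counts_per_model, min_instances=2, min_votes=2)
print(selected)
```

Raising `min_votes` trades recall for precision: a single model's false positive no longer selects an image, while a single model's miss does not reject it as long as enough other models agree.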

Selection by Anomaly Detection

Leveraging anomaly detection models from Anomalib, the Data Engine detects frames that contain anomalies. This workflow requires a labeled dataset. During training, a known class is treated as an anomaly and excluded from the training dataset. During inference, samples with a high anomaly score can be selected for inspection.
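The hold-out idea described above can be sketched without Anomalib. The snippet assumes each sample carries a list of class labels and that a trained model has already produced one anomaly score per sample; the helper names, the sample layout, and the score values are hypothetical placeholders.

```python
def split_by_held_out_class(samples, anomaly_class):
    """Split samples so the training set excludes every sample
    that contains the class treated as an anomaly."""
    train = [s for s in samples if anomaly_class not in s["labels"]]
    held_out = [s for s in samples if anomaly_class in s["labels"]]
    return train, held_out


def select_anomalous(scores, threshold):
    """Return sample IDs whose anomaly score is at or above the threshold."""
    return sorted(sid for sid, score in scores.items() if score >= threshold)


samples = [
    {"id": "a", "labels": ["car"]},
    {"id": "b", "labels": ["car", "bus"]},
    {"id": "c", "labels": ["truck"]},
]
train, held_out = split_by_held_out_class(samples, anomaly_class="bus")

# Placeholder scores; a real run would take them from the trained model.
scores = {"a": 0.12, "b": 0.91, "c": 0.35}
to_inspect = select_anomalous(scores, threshold=0.5)
print([s["id"] for s in train], to_inspect)
```

Because the held-out class never appears during training, samples containing it should score high at inference time, which is what makes them candidates for inspection.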

Data Labeling

The Data Engine is connected to CVAT for manual labeling. Based on the results of the data curation workflows, samples can be filtered, manually inspected, and labeled. Given trained models, auto labeling can be performed through model inference.

Model Training and Inference

The Mcity Data Engine currently supports three model sources for model training and inference:

  • Hugging Face Object Detection
  • Ultralytics
  • Custom Repositories (CO-DETR example)

Trained models are uploaded to Hugging Face and can be used for later inference. Custom models are wrapped in a container and can be run through Docker or Singularity.

Merged PRs

  • [pdoc] Updated documentation by @github-actions in https://github.com/mcity/mcity_data_engine/pull/34
  • Data Download and Extraction from AWS by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/44
  • Dataset n samples by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/45
  • added requirements_colab.txt by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/47
  • Auto labeling by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/53
  • Auto Labeling working and tested for all HF models in config. RT-DETR commented out, as there is a tensor mismatch by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/54
  • Test for Anomaly Detection Inference by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/84
  • Anomaly detection model coverage by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/86
  • Aws Download group specification + test by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/88
  • Auto labeling Test for Hugging Face by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/89
  • Zero shot stable by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/95
  • Add ensemble exploration workflow and enhance logging for detection collection by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/96
  • Data-selection-notebook by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/97
  • Bump cryptography from 43.0.1 to 44.0.1 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/52
  • [pdoc] Updated documentation by @github-actions in https://github.com/mcity/mcity_data_engine/pull/35
  • Updating mask-teacher with changes from main by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/104
  • colab changes by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/106
  • Colab by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/107
  • added for colab by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/109
  • added for colab by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/110
  • Add public Docs with GitHub Actions to test by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/108
  • added for colab by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/111
  • Merge mask-auto-labeling into main by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/105
  • Back to 2-step workflow (tests and docs needs installed pip env) by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/112
  • Create dependabot.yml by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/113
  • changed 51 layout by @rajanikant-patnaik in https://github.com/mcity/mcity_data_engine/pull/122
  • Update tests_documentation.yml by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/123
  • Bump ultralytics from 8.3.75 to 8.3.78 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/125
  • Bump huggingface-hub from 0.27.0 to 0.29.1 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/124
  • Bump transformers from 4.48.1 to 4.49.0 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/121
  • Bump wandb from 0.19.1 to 0.19.6 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/119
  • Bump anomalib from 1.1.1 to 1.2.0 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/118
  • Bump accelerate from 1.1.1 to 1.4.0 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/120
  • Bump datasets from 3.2.0 to 3.3.1 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/114
  • Automatic grouping of fields in sidebar by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/128
  • Bump timm from 1.0.14 to 1.0.15 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/133
  • Bump datasets from 3.3.1 to 3.3.2 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/135
  • Bump wandb from 0.19.6 to 0.19.7 by @dependabot in https://github.com/mcity/mcity_data_engine/pull/134
  • Zero shot object detection Improvements by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/129
  • New group for Ensemble Selection by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/130
  • Ultralytics Model Training and Inference by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/132
  • Add YouTube video to README by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/140
  • Supporting dataset views by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/141
  • Integrating CoDETR changes by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/137
  • [pdoc] Updated documentation by @github-actions in https://github.com/mcity/mcity_data_engine/pull/103
  • [pdoc] Updated documentation by @github-actions in https://github.com/mcity/mcity_data_engine/pull/143
  • Cleanup, Documentation, Evaluation of prior Merges by @daniel-bogdoll in https://github.com/mcity/mcity_data_engine/pull/142

Full Changelog: https://github.com/mcity/mcity_data_engine/commits/v1.0.0

- Python
Published by daniel-bogdoll 12 months ago

mcity_data_engine - v0.1

Release Notes

  • Parallelized multi-GPU workflow introduced for zero-shot object detection
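The release note does not say how the work is split across GPUs. As one plausible reading, a data-parallel setup gives each GPU worker its own shard of the dataset; the sketch below shows only that sharding step. The function `shard_indices` and the worker counts are hypothetical.

```python
def shard_indices(n_samples, n_workers):
    """Round-robin assignment of sample indices to workers (one per GPU)."""
    return [list(range(worker, n_samples, n_workers)) for worker in range(n_workers)]


shards = shard_indices(n_samples=10, n_workers=3)
for gpu_id, shard in enumerate(shards):
    print(f"GPU {gpu_id}: {shard}")
```

Round-robin assignment keeps shard sizes within one sample of each other, so no worker sits idle for long when per-image inference cost is roughly uniform.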

- Python
Published by daniel-bogdoll about 1 year ago