Recent Releases of perspective-occupancy-map

📦 Release Notes — POMv2 v1.0.0

This is the first public release of POMv2, a modular deep learning pipeline for semantic top-view segmentation from monocular perspective images.

Two-Branch Architecture:
- Semantic POM Head using DeepLabV3 to predict object footprints in perspective view.
- PON Encoder to extract spatial features projected into BEV space.
Learned Perspective-to-BEV Projection:
- Uses camera geometry to map perspective logits into a top-down grid.
- The projected logits are fused with learned features and decoded using a UNet.
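
To make the geometric part of this step concrete, here is a minimal sketch of warping a perspective logit map into a top-down grid. It is not the POMv2 implementation: it assumes a pinhole camera with zero pitch and roll, a flat ground plane, and nearest-neighbour sampling, and it leaves out the learned fusion and the UNet decoder entirely. The function names, grid ranges, and parameters (`bev_to_pixel`, `warp_to_bev`, `cam_height`, `cell`) are hypothetical, and plain Python lists are used only to keep the example dependency-free.

```python
def bev_to_pixel(x, z, fx, fy, cx, cy, cam_height):
    """Project a flat-ground point (x metres right, z metres ahead) to pixel (u, v).

    Assumes camera axes x right, y down, z forward, with the ground plane
    cam_height metres below the optical centre (hypothetical setup).
    """
    u = fx * x / z + cx
    v = fy * cam_height / z + cy
    return u, v


def warp_to_bev(logits, fx, fy, cx, cy, cam_height,
                x_range=(-4.0, 4.0), z_range=(2.0, 10.0), cell=1.0, fill=0.0):
    """Fill a top-down grid by sampling a single-channel perspective logit map."""
    height, width = len(logits), len(logits[0])
    n_cols = int(round((x_range[1] - x_range[0]) / cell))
    n_rows = int(round((z_range[1] - z_range[0]) / cell))
    bev = []
    for r in range(n_rows):
        # Row 0 is the farthest row, so the grid reads like a top-down map.
        z = z_range[1] - (r + 0.5) * cell
        row = []
        for c in range(n_cols):
            x = x_range[0] + (c + 0.5) * cell
            u, v = bev_to_pixel(x, z, fx, fy, cx, cy, cam_height)
            ui, vi = int(round(u)), int(round(v))
            # Cells that project outside the image keep the fill value.
            inside = 0 <= ui < width and 0 <= vi < height
            row.append(logits[vi][ui] if inside else fill)
        bev.append(row)
    return bev
```

Calling `warp_to_bev(logits, fx, fy, cx, cy, cam_height)` on a nested list of per-pixel scores returns a grid of the same scores arranged by ground position. Cells whose ground point falls outside the camera's field of view stay at `fill`.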
Multi-Level Supervision:
- Joint loss on both Semantic POM and BEV segmentation for stronger training signals.
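
As an illustration of what a joint loss over the two outputs could look like, the sketch below sums a per-pixel cross-entropy on the perspective (Semantic POM) prediction and on the BEV prediction. The release notes do not state which loss functions or weights POMv2 uses, so the choice of cross-entropy and the `pom_weight` and `bev_weight` parameters are assumptions made for this example.

```python
import math


def cross_entropy(logits, target):
    """Softmax cross-entropy for one pixel; logits is a list of class scores."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in logits))
    return log_sum_exp - logits[target]


def mean_cross_entropy(pixel_logits, targets):
    """Average the per-pixel cross-entropy over a flat list of pixels."""
    losses = [cross_entropy(l, t) for l, t in zip(pixel_logits, targets)]
    return sum(losses) / len(losses)


def joint_loss(pom_logits, pom_targets, bev_logits, bev_targets,
               pom_weight=1.0, bev_weight=1.0):
    """Weighted sum of the perspective-branch and BEV-branch losses.

    The weights are hypothetical knobs, not values taken from POMv2.
    """
    pom = mean_cross_entropy(pom_logits, pom_targets)
    bev = mean_cross_entropy(bev_logits, bev_targets)
    return pom_weight * pom + bev_weight * bev
```

Because both terms contribute gradients, the perspective branch receives a direct training signal instead of relying only on what flows back through the projection from the BEV output.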
Temporal Stability via POM:
- Perspective occupancy maps offer stable spatial cues across frames.
- This stability helps the model generalize and stay robust.
Cross-Dataset Generalization:
- Supports pretraining the perspective segmentation branch on large datasets (e.g., Cityscapes).
Experiment Management:
- YAML-based configuration system.
- Integrated with Weights & Biases (wandb) for logging, checkpointing, and visualization.
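
A config-driven setup usually layers experiment-specific overrides on top of shared defaults. The sketch below shows one way to do that with nested dictionaries, as they might look after a YAML file has been parsed. The parsing step itself is omitted, and every key shown (`model`, `train`, `lr`, and so on) is hypothetical rather than taken from the POMv2 configuration files.

```python
import copy

# Hypothetical defaults; the real keys live in the repository's YAML files.
DEFAULTS = {
    "model": {"backbone": "deeplabv3", "bev_decoder": "unet"},
    "train": {"lr": 0.001, "epochs": 50, "batch_size": 8},
    "logging": {"use_wandb": True},
}


def merge_config(base, override):
    """Return a new config where values in override replace those in base.

    Nested dictionaries are merged key by key; neither input is modified.
    """
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged
```

For example, `merge_config(DEFAULTS, {"train": {"lr": 0.01}})` changes only the learning rate and leaves every other default in place.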

The codebase is modular, with models/, datasets/, and utils/ directories plus config-driven training (train.py) and evaluation (eval.py) scripts.

Released under the MIT License.

Published by shantanusingh16 11 months ago