automated-california-av-dataset
AV Collision report dataset from California DMV
Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 1 DOI reference(s) in README
- ✓ Academic publication links: Links to: zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: saquibmh
- License: cc0-1.0
- Language: Python
- Default Branch: main
- Size: 160 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
Collision Report Extraction
A tool that extracts information from California DMV autonomous vehicle (AV) collision reports.
The California DMV receives collision reports for autonomous vehicles in PDF format, which makes compiling the data by hand slow and laborious. This extractor automatically pulls the relevant information from each collision report and consolidates it into a single Excel file, simplifying the collection and analysis of AV collision data.
Authors: Saquib M Haroon and Alyssa Ryan @ University of Arizona
Libraries used
The extractor relies on the following libraries to generate the final Excel file from the AV collision reports:
EasyOCR
pdf2image
openpyxl
OpenCV
NumPy
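How these libraries can fit together is sketched below. This is a minimal illustration, not the repository's actual code: it assumes each PDF is rasterized with pdf2image, read with EasyOCR, and written out with openpyxl. The function names (`make_reader`, `ocr_pdf`, `build_rows`, `write_workbook`, `extract_reports`), the column names, and the 300 dpi setting are hypothetical, and the parsing of individual report fields is omitted.

```python
from pathlib import Path


def make_reader():
    """Create an EasyOCR reader for English text."""
    # Imported lazily: EasyOCR is a heavy dependency and downloads
    # its model weights the first time a Reader is created.
    import easyocr

    return easyocr.Reader(["en"])


def ocr_pdf(pdf_path, reader, dpi=300):
    """Return the OCR'd text of each page of a PDF as a list of strings."""
    import numpy as np
    from pdf2image import convert_from_path  # requires poppler to be installed

    pages = convert_from_path(str(pdf_path), dpi=dpi)  # list of PIL images
    texts = []
    for page in pages:
        # detail=0 returns only the recognized strings, without boxes or scores.
        lines = reader.readtext(np.array(page), detail=0)
        texts.append("\n".join(lines))
    return texts


def build_rows(records, columns):
    """Flatten per-report dicts into a header row plus one row per report."""
    rows = [list(columns)]
    for record in records:
        # Missing fields become empty cells rather than raising KeyError.
        rows.append([record.get(col, "") for col in columns])
    return rows


def write_workbook(rows, out_path):
    """Write a list of rows to the active sheet of a new Excel workbook."""
    from openpyxl import Workbook

    workbook = Workbook()
    sheet = workbook.active
    for row in rows:
        sheet.append(row)
    workbook.save(out_path)


def extract_reports(pdf_dir, out_path):
    """OCR every PDF in a folder and consolidate the results into one file."""
    reader = make_reader()  # reuse one reader across all reports
    records = []
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        pages = ocr_pdf(pdf_path, reader)
        records.append(
            {"file": pdf_path.name, "pages": len(pages), "text": "\n".join(pages)}
        )
    write_workbook(build_rows(records, ["file", "pages", "text"]), out_path)
```

Under these assumptions, calling `extract_reports("reports", "collisions.xlsx")` would produce one row per PDF. A real extractor would replace the raw `text` column with the individual fields parsed from each report.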
Extracted Dataset
Find the latest extracted dataset (up to June 2023) here
Future Work
Use NLP models to automatically extract injury information from the description.
Geocode the address to identify collision coordinates.
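As an illustration of the geocoding idea, here is one possible sketch. It is not part of the project: it assumes the geopy package and the public Nominatim service, neither of which appears in the library list above. The function names and the `user_agent` value are hypothetical.

```python
def geocode_addresses(addresses, geocode):
    """Map each address to (latitude, longitude), or None if no match is found.

    `geocode` is any callable that takes an address string and returns an
    object with `latitude` and `longitude` attributes, or None.
    """
    coords = {}
    for address in addresses:
        location = geocode(address)
        if location is None:
            coords[address] = None
        else:
            coords[address] = (location.latitude, location.longitude)
    return coords


def make_nominatim_geocoder(user_agent):
    """Build a rate-limited geocoding callable backed by Nominatim (assumed choice)."""
    # Imported lazily so that geocode_addresses has no hard dependency on geopy.
    from geopy.extra.rate_limiter import RateLimiter
    from geopy.geocoders import Nominatim

    geolocator = Nominatim(user_agent=user_agent)
    # Nominatim's usage policy asks for at most one request per second.
    return RateLimiter(geolocator.geocode, min_delay_seconds=1)
```

Passing the geocoder in as a callable keeps the lookup logic independent of any one service, so a different provider could be swapped in without changing `geocode_addresses`.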
Please feel free to contribute to this project.
Lab website
Visit our Website: Ryan Research Lab.
Please Cite the Paper
Haroon, S. M., & Ryan, A. (2024). Understanding key factors in automated vehicle collisions: Automating data extraction and analyzing key insights using explainable AI. Journal of Transportation Safety & Security, 1-24. Link
Owner
- Login: saquibmh
- Kind: user
- Repositories: 1
- Profile: https://github.com/saquibmh
Citation (CITATION.cff)
cff-version: 1.1.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Haroon
    given-names: Saquib Mohammed
    orcid: https://orcid.org/0009-0001-3788-9590
title: saquibmh/Automated-California-AV-Dataset: Automated AV database generator
version: 1.0
date-released: 2023-07-10