mini-webvision
Creates the Mini-WebVision dataset for use with a PyTorch dataloader.
Repository
Basic Info
- Host: GitHub
- Owner: sangamesh-kodge
- License: apache-2.0
- Language: Python
- Default Branch: master
- Homepage: https://www.linkedin.com/in/sangamesh-kodge/
- Size: 6.84 KB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
Readme.md
Mini-WebVision
This project preprocesses the Google Images partition of the WebVision 1.0 dataset to obtain the Mini-WebVision dataset and arranges it in a directory structure similar to that of the ImageNet-1k dataset.
Mini-WebVision contains about 61K Google images from the first 50 classes of the WebVision dataset.
- Number of train images: 61234
- Number of val images: 2500 (50 per class)
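Selecting the first 50 classes can be pictured as a filter over a file list. The sketch below is not taken from `helper.py`. It assumes a hypothetical list format with one `<relative path> <integer label>` pair per line, and the function name and example paths are illustrative only.

```python
from collections import defaultdict

NUM_CLASSES = 50  # Mini-WebVision keeps the first 50 WebVision classes


def select_first_classes(lines, num_classes=NUM_CLASSES):
    """Group image paths by label, keeping only labels below num_classes.

    Each line is assumed (hypothetically) to look like "<path> <label>".
    """
    selected = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the last whitespace so paths containing spaces survive.
        path, label_text = line.rsplit(maxsplit=1)
        label = int(label_text)
        if label < num_classes:
            selected[label].append(path)
    return dict(selected)


if __name__ == "__main__":
    # Made-up entries, only to show the filtering behaviour.
    example = [
        "google/q0001/a.jpg 0",
        "google/q0172/b.jpg 49",
        "google/q0180/c.jpg 50",  # label 50 is outside the first 50 classes
    ]
    print(select_first_classes(example))
```

Labels are treated as zero-based here, so the first 50 classes are labels 0 through 49. If the actual file lists use a different convention, the comparison would need to change.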
Below is the final directory structure for this project:
Mini-WebVision
train
| nxxxxxxxx
| nxxxxxxxx
| ...
|
val
| nxxxxxxxx
| nxxxxxxxx
| ...
|
info
| xxxx
| xxxx
| ...
|
create_MiniWebVision_as_ImageNet.sh
helper.py
Readme.md
Citation.cff
LICENSE
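Because the layout above follows the ImageNet convention of one folder per class, the `train` and `val` folders can be read with torchvision's `ImageFolder`. The following is a minimal sketch, not code from this repository. The batch size, worker count, 224-pixel crops, and ImageNet normalization statistics are assumptions chosen for illustration, as are the names `dataset_dirs` and `build_loaders`.

```python
from pathlib import Path


def dataset_dirs(root="Mini-WebVision"):
    """Return the train and val directories under the dataset root."""
    root = Path(root)
    return root / "train", root / "val"


def build_loaders(root="Mini-WebVision", batch_size=64, num_workers=4):
    """Build PyTorch dataloaders over the ImageNet-style folders.

    torch and torchvision are imported lazily so the path helper above
    can be used without them installed.
    """
    from torch.utils.data import DataLoader
    from torchvision import datasets, transforms

    # Assumed preprocessing: commonly used ImageNet statistics and crops.
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                     std=[0.229, 0.224, 0.225])
    train_tf = transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        normalize,
    ])
    val_tf = transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        normalize,
    ])

    train_dir, val_dir = dataset_dirs(root)
    # ImageFolder derives class labels from the sub-folder names.
    train_set = datasets.ImageFolder(str(train_dir), transform=train_tf)
    val_set = datasets.ImageFolder(str(val_dir), transform=val_tf)

    train_loader = DataLoader(train_set, batch_size=batch_size,
                              shuffle=True, num_workers=num_workers)
    val_loader = DataLoader(val_set, batch_size=batch_size,
                            shuffle=False, num_workers=num_workers)
    return train_loader, val_loader
```

Calling `build_loaders()` from the project root after the script has finished should give one loader per split. The class indices follow the alphabetical order of the folder names.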
Instructions
- Clone this repository
- Navigate to the root of this project
Run the following command in your terminal/command prompt:
sh create_MiniWebVision_as_ImageNet.sh
Expected Terminal logs
```
(base) Mini-WebVision >sh create_MiniWebVision_as_ImageNet.sh
--2024-02-07 11:04:19-- https://data.vision.ee.ethz.ch/cvl/webvision/google_resized_256.tar
Resolving data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)... 129.132.52.178, 2001:67c:10ec:36c2::178
Connecting to data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)|129.132.52.178|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16980316160 (16G) [application/x-tar]
Saving to: google_resized_256.tar
100%[================================================================================>] 16,980,316,160 16.2MB/s in 11m 59s
2024-02-07 11:16:19 (22.5 MB/s) - google_resized_256.tar saved [16980316160/16980316160]
--2024-02-07 11:16:19-- https://data.vision.ee.ethz.ch/cvl/webvision/val_images_256.tar
Resolving data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)... 129.132.52.178, 2001:67c:10ec:36c2::178
Connecting to data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)|129.132.52.178|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 873574400 (833M) [application/x-tar]
Saving to: val_images_256.tar
100%[===================================================================================>] 873,574,400 24.2MB/s in 35s
2024-02-07 11:16:55 (23.7 MB/s) - val_images_256.tar saved [873574400/873574400]
--2024-02-07 11:16:55-- https://data.vision.ee.ethz.ch/cvl/webvision/info.tar
Resolving data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)... 129.132.52.178, 2001:67c:10ec:36c2::178
Connecting to data.vision.ee.ethz.ch (data.vision.ee.ethz.ch)|129.132.52.178|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 190914560 (182M) [application/x-tar]
Saving to: info.tar
100%[===================================================================================>] 190,914,560 24.6MB/s in 8.3s
2024-02-07 11:17:04 (21.9 MB/s) - info.tar saved [190914560/190914560]
Creating directory structure similar to ImageNet for training dataset
Creating directory structure similar to ImageNet for val dataset
Removing Redundant files.
Mini-WebVision Dataset Processed!
```
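After the script reports that the dataset is processed, a quick sanity check can compare the folders against the counts stated earlier (50 classes, 61234 train images, 2500 val images). The sketch below is an illustrative helper and not part of this repository. The set of image extensions and the function names are assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def count_images(split_dir):
    """Return {class_folder_name: number_of_images} for one split."""
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(
                1 for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts


def check_split(split_dir, expected_classes=50, expected_total=None):
    """Return a list of problems found; an empty list means none were found."""
    counts = count_images(split_dir)
    problems = []
    if len(counts) != expected_classes:
        problems.append(
            f"{len(counts)} class folders, expected {expected_classes}")
    total = sum(counts.values())
    if expected_total is not None and total != expected_total:
        problems.append(f"{total} images, expected {expected_total}")
    return problems


if __name__ == "__main__":
    root = Path("Mini-WebVision")
    # Only run the check when the processed folders are present.
    if (root / "train").is_dir() and (root / "val").is_dir():
        print("train:", check_split(root / "train", expected_total=61234))
        print("val:", check_split(root / "val", expected_total=2500))
```

An empty list for both splits suggests the download and preprocessing completed as described. A mismatch usually points to an interrupted download or extraction.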
Conclusion
This project preprocesses the Google Images partition of the WebVision 1.0 dataset to obtain the Mini-WebVision dataset. The script automates the download and preprocessing steps and produces a directory structure similar to ImageNet.
Source Code
This repository builds on the GitHub codebase for preprocessing the Google partition of WebVision 1.0, found at WebVision1.0-Google.
Citation
Kindly cite the repository if you use the code. Thanks!
APA
Kodge, S. (2024). MiniWebVision [Computer software]. https://github.com/sangamesh-kodge/Mini-WebVision
Bibtex
@software{Kodge_MiniWebVision_2024,
author = {Kodge, Sangamesh},
month = feb,
title = {{MiniWebVision}},
url = {https://github.com/sangamesh-kodge/Mini-WebVision},
year = {2024}
}
Owner
- Login: sangamesh-kodge
- Kind: user
- Repositories: 1
- Profile: https://github.com/sangamesh-kodge
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Kodge"
  given-names: "Sangamesh"
  orcid: "https://orcid.org/0000-0001-9713-5400"
title: "MiniWebVision"
date-released: 2024-2-7
url: "https://github.com/sangamesh-kodge/Mini-WebVision"
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1