masters-thesis-ip102-classification

Code for my Master's Thesis in Data Science: Pest Insect Classification on the IP102 Dataset. The paper focuses on improving classification performance on this dataset by addressing two of its major issues, namely class imbalance and intra-class variance.

https://github.com/alexandruiordan99/masters-thesis-ip102-classification

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.3%) to scientific vocabulary

Keywords

computervision ip102
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: AlexandruIordan99
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 66.4 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
computervision ip102
Created over 1 year ago · Last pushed 11 months ago
Metadata Files
Readme Citation

ReadMe.md

Overview

This repository contains the code for a Master's thesis on pest insect classification using the IP102 dataset. The thesis aims to improve classification performance on this dataset by addressing two of its major issues: class imbalance and intra-class variance. Class imbalance is addressed through data augmentation and transfer learning with ImageNet weights, while intra-class variance is addressed with a clustering algorithm. These techniques are implemented with EfficientNet models, versions 1 and 2.
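To make the term "class imbalance" concrete, the following minimal sketch computes an imbalance ratio (size of the largest class divided by the size of the smallest). It is not taken from the repository: the function name `imbalance_ratio` and the toy labels are assumptions for illustration only, and no IP102 statistics are implied.

```python
from collections import Counter


def imbalance_ratio(labels):
    """Return (largest class size) / (smallest class size) for a list of labels."""
    counts = Counter(labels)
    if not counts:
        raise ValueError("labels must not be empty")
    return max(counts.values()) / min(counts.values())


# Toy labels, not IP102 data: class 0 has 6 samples, class 1 has 3, class 2 has 1.
toy_labels = [0] * 6 + [1] * 3 + [2] * 1
print(imbalance_ratio(toy_labels))  # 6.0
```

A ratio of 1.0 means perfectly balanced classes; larger values mean the rarest class is more heavily outnumbered.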

In the main folder, DatasettoBGR lets users quickly convert the IP102 dataset to BGR format and save it. F1Calculator lets users calculate F1-scores using the accuracy and recall metrics from the model evaluation output. Finally, geometricsmote is a local copy of the code from https://github.com/georgedouzas/imbalanced-learn-extra. This local copy was necessary because the geometricsmote import was not compatible with Python 3.12.6. The original authors are credited at the top of that file.
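As a point of reference for the F1 calculation, here is a minimal sketch of the standard F1 formula, the harmonic mean of precision and recall. The text above names accuracy and recall as the inputs; this sketch assumes precision and recall, as in the usual definition, and the function name `f1_score` is hypothetical. It does not reproduce F1Calculator's actual interface.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero."""
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    if precision + recall == 0.0:
        return 0.0  # avoid division by zero
    return 2.0 * precision * recall / (precision + recall)


print(f1_score(0.5, 0.5))  # 0.5
```

Because the harmonic mean is pulled toward the smaller input, a model with high precision but low recall (or the reverse) still receives a low F1-score.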

To run:

Use a Linux machine

Install up-to-date machine learning drivers from NVIDIA

Preferably use at least an NVIDIA RTX 4080 (10GB of RAM). Otherwise, many of the models will fail because there is not enough VRAM. If you understandably do not own such an expensive graphics card, lower the model's batch size. This will, however, lower its accuracy.

Clone the repo

Obtain the dataset from its authors or from Kaggle. Kaggle is linked instead of the authors' page because their Google Drive link is broken.

Update the paths in the code to the locations of your training, validation and testing sets.
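Before starting a long training run, it can help to confirm that the dataset paths actually exist. The following is a minimal sketch of such a check, not code from the repository: the `DATASET_DIRS` dictionary, the placeholder paths and the `missing_dirs` helper are all hypothetical and should be adapted to however the scripts store their paths.

```python
from pathlib import Path

# Hypothetical locations; replace with wherever the IP102 splits live locally.
DATASET_DIRS = {
    "train": Path("/path/to/ip102/train"),
    "val": Path("/path/to/ip102/val"),
    "test": Path("/path/to/ip102/test"),
}


def missing_dirs(dirs):
    """Return the names of the splits whose directory does not exist."""
    return [name for name, path in dirs.items() if not Path(path).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(DATASET_DIRS)
    if missing:
        print("Missing dataset directories:", ", ".join(missing))
    else:
        print("All dataset directories found.")
```

Running a check like this first surfaces a mistyped path immediately instead of partway through model setup.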

Owner

  • Login: AlexandruIordan99
  • Kind: user

Citation (Citations.py)


      

GitHub Events

Total
  • Push event: 5
Last Year
  • Push event: 5