Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.2%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: BMW-lab-MSU
  • License: bsd-3-clause
  • Language: MATLAB
  • Default Branch: main
  • Size: 226 KB
Statistics
  • Stars: 0
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created over 1 year ago · Last pushed 9 months ago
Metadata Files
Readme License Citation

README.md

drone-detection

This repository contains the code for part of Trevor Vannoy's dissertation.

Data setup

You can specify the locations of the data and results folders in the dataSetup.m script. By default, the scripts use the following relative path setup, where code is a folder containing this repository:

├── code
├── data
│   ├── combined
│   ├── preprocessed
│   ├── raw
│   │   ├── 2022-06-23
│   │   ├── 2022-06-24
│   │   ├── 2022-07-28
│   │   └── 2022-07-29
│   ├── testing
│   ├── training
│   └── validation
└── results
    ├── changepoint-results
    │   └── runtimes
    ├── testing
    └── training
        ├── classifiers
        ├── data-sampling
        ├── default-params
        └── hyperparameter-tuning
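As a rough illustration of the default layout, the folder locations could be assembled with `fullfile` as shown below. The variable names (`baseDataDir`, `rawDataDir`, and so on) are assumptions made for this sketch and may not match the names actually defined in dataSetup.m.

```matlab
% Hypothetical sketch of relative paths matching the default layout.
% Variable names are illustrative only; check dataSetup.m for the real ones.
% The paths are relative to the repository root, so "data" and "results"
% sit one level up, next to the "code" folder.
baseDataDir    = fullfile("..", "data");
baseResultsDir = fullfile("..", "results");

rawDataDir          = fullfile(baseDataDir, "raw");
combinedDataDir     = fullfile(baseDataDir, "combined");
preprocessedDataDir = fullfile(baseDataDir, "preprocessed");
trainingDataDir     = fullfile(baseDataDir, "training");
validationDataDir   = fullfile(baseDataDir, "validation");
testingDataDir      = fullfile(baseDataDir, "testing");

trainingResultsDir = fullfile(baseResultsDir, "training");
testingResultsDir  = fullfile(baseResultsDir, "testing");
```

Replacing the two base folders with absolute paths is what would let the code run from any working directory.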

Running the code

[!IMPORTANT] In general, you need to call pathSetup.m first before running anything, as that script adds all the folders in this repo to your MATLAB path. Additionally, if you use the default relative data path setup described above, you must run all of your code from the root of this repository, not the subfolders; if you specify full paths in dataSetup.m, then you can run the code from anywhere.
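For orientation, the effect described for pathSetup.m can be approximated with the built-in `addpath` and `genpath` functions. This is a minimal sketch that assumes the current working directory is the repository root; it is not the contents of pathSetup.m.

```matlab
% Hypothetical stand-in for pathSetup.m: add the current folder (assumed to
% be the repository root) and all of its subfolders to the MATLAB path.
repoRoot = pwd;
addpath(genpath(repoRoot));
```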

[!TIP] This code is designed to run on a computing cluster. If you have access to a computing cluster that uses slurm, you can update and use the scripts in the slurm folder. The code will still run perfectly fine on a normal desktop computer—it will just take longer.

The folder icons (📁) after the headings link to the relevant folder in the repository.

There are three main portions of code, listed below. You have to run the data preparation first, but the supervised learning and changepoint detection sections are independent. Use the following links to jump to the sections.

Data preparation 📁

  1. (optional) Convert the csv label files into .mat files using convertAllLabels.m; this has already been done in the archived dataset.
  2. Combine the individual raw data files into larger groups that can then be split into training and testing sets using combineDataForTrainingTesting.m.
  3. Preprocess the data using preprocess.m.
  4. Split the preprocessed data into the training, validation, and testing sets using splitData.m.

Using slurm: if you have access to a slurm cluster, these steps can be done by running [`prepare-data.sh`](slurm/prepare-data.sh).
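To make the last step concrete, here is a generic sketch of a random three-way split by index. The sample count, the 60/20/20 fractions, and the purely random assignment are all assumptions for illustration; the repository's own split script may partition the data differently.

```matlab
% Generic three-way split sketch (not the repository's implementation).
rng(0);                       % fixed seed so the split is repeatable
nSamples  = 100;              % hypothetical number of preprocessed items
fractions = [0.6, 0.2, 0.2];  % assumed train/validation/test fractions

shuffled = randperm(nSamples);
nTrain = round(fractions(1) * nSamples);
nVal   = round(fractions(2) * nSamples);

% Consecutive, non-overlapping slices of the shuffled indices.
trainIdx = shuffled(1:nTrain);
valIdx   = shuffled(nTrain+1 : nTrain+nVal);
testIdx  = shuffled(nTrain+nVal+1 : end);
```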

Supervised learning 📁

Feature extraction 📁

Before training any of the feature-based algorithms, we need to precompute the features:

  • training data: precomputeTrainingFeatures.m
  • validation data: precomputeValidationFeatures.m
  • testing data: precomputeTestingFeatures.m

Using slurm: if you have access to a slurm cluster, these steps can be done by running [`precompute-features.sh`](slurm/precompute-features.sh).

Training 📁

Using slurm: if you have access to a slurm cluster, all the training and testing can be launched using [`run-row-methods.sh`](slurm/run-row-methods.sh) and [`run-image-methods.sh`](slurm/run-image-methods.sh).

Data sampling parameter tuning (row methods only)

For the row-based methods, we first need to create the grid search parameters using createDataSamplingGrid.m. Once that is done, we can perform the grid search.
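As an illustration of what a sampling grid can look like, the sketch below builds every combination of two sampling parameters and stores one combination per row, so that a single integer index selects a grid point. The parameter names and values are hypothetical; they are chosen only so that the grid has 16 points, matching the `1:16` loops used in this README, and they are not taken from createDataSamplingGrid.m.

```matlab
% Hypothetical sampling parameters (names and values are assumptions).
undersamplingRatios = [0, 0.25, 0.5, 0.75];
nAugmented          = [0, 1, 2, 3];

% Full factorial grid: one row per parameter combination.
[U, A] = ndgrid(undersamplingRatios, nAugmented);
samplingGrid = table(U(:), A(:), ...
    'VariableNames', {'UndersamplingRatio', 'NAugmented'});

% A grid search index then picks out one row of parameters.
gridIdx = 1;
params  = samplingGrid(gridIdx, :);
```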

For the feature-based methods, call the evalSamplingGridRowFeatureMethod function with the grid search index, e.g.:

    evalSamplingGridRowFeatureMethod(@AdaBoost,1,UseParallel=true,UseGPU=true)

For the deep learning methods, call the evalSamplingGridRowDataMethod function with the grid search index, e.g.:

    evalSamplingGridRowDataMethod(@CNN1d,1,UseParallel=true,UseGPU=true)

This grid search was designed to run in parallel on a computing cluster, specifically using slurm job arrays. See the samplingGridSearch*.slurm scripts for full details of the function calls for each of the classifiers. In particular, for the neural networks that have more than one hidden layer, the parameters (e.g., layer sizes) have to be passed to the evalSamplingGrid* functions.
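One common way to connect a slurm job array to a grid index is to read the `SLURM_ARRAY_TASK_ID` environment variable, which slurm sets for each array task. The sketch below shows that pattern; it is an assumption about how the mapping could be done, not a copy of the repository's slurm scripts, and the fallback value of 1 is only there so that the snippet also runs outside slurm.

```matlab
% Read the job array index that slurm assigns to this task.
taskId = getenv('SLURM_ARRAY_TASK_ID');

if isempty(taskId)
    % Not running inside a slurm job array; fall back to the first grid point.
    gridIdx = 1;
else
    gridIdx = str2double(taskId);
end

% The index would then be passed to the grid search function, e.g.:
% evalSamplingGridRowFeatureMethod(@AdaBoost,gridIdx,UseParallel=true,UseGPU=true)
```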

If you don't have access to a computing cluster, you can run the grid search methods in a for loop:

    for gridIdx = 1:16
        evalSamplingGridRowFeatureMethod(@AdaBoost,gridIdx,UseParallel=true,UseGPU=true)
    end

You could also use a parallel for loop, which may or may not be faster than running each iteration with parallel feature extraction and training (UseParallel=true):

    parfor gridIdx = 1:16
        evalSamplingGridRowFeatureMethod(@AdaBoost,gridIdx,UseParallel=false,UseGPU=true)
    end

Once the grid search for an algorithm is done, run the selectBestSamplingParams function to save the sampling parameters that resulted in the best MCC value, e.g.:

    selectBestSamplingParams("AdaBoost")
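For reference, MCC (the Matthews correlation coefficient) is computed from the four confusion-matrix counts, and picking the best grid point amounts to taking the maximum. The sketch below uses made-up counts for three grid points; it illustrates the selection criterion only and is not the code inside selectBestSamplingParams.

```matlab
% Matthews correlation coefficient from confusion-matrix counts
% (true positives, true negatives, false positives, false negatives).
mcc = @(tp, tn, fp, fn) (tp.*tn - fp.*fn) ./ ...
    sqrt((tp+fp).*(tp+fn).*(tn+fp).*(tn+fn));

% Made-up counts for three hypothetical grid points (one row per point).
tp = [40; 45; 30];
tn = [900; 880; 950];
fp = [60; 80; 10];
fn = [10; 5; 20];

% Score every grid point and keep the index of the highest MCC.
scores = mcc(tp, tn, fp, fn);
[bestMcc, bestIdx] = max(scores);
```

MCC is often preferred over accuracy when the classes are heavily imbalanced, because it only approaches 1 when both classes are predicted well.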

Train default 2D CNNs (image methods only) 📁

For the image-based methods (2D CNNs), we need to train networks with the default hyperparameters. This is because the default hyperparameters might perform better than the parameters found during tuning, and thus we would prefer to use the default parameters for the final training.

The default 2D CNNs can be trained with the trainCNN2dManualParams function. See the trainDefaultCNN2d*.slurm scripts to see the relevant function call for each 2D CNN.

For example, here's the code used to train the default 3-layer 2D CNN:

    p.FilterSize=[16,2;16,2;16,2];
    p.Nfilters=[20,20,20];
    trainCNN2dManualParams(@CNN2d,UseGPU=true,ClassifierParams=p)

Model hyperparameter tuning 📁

Create hyperparameter search values

First, we need to create mat files that contain the model's hyperparameter search values. Each model has a separate function to create its associated hyperparameter search values, except AdaBoost and RUSBoost, which use the same search values.

Examples:

```matlab
% For AdaBoost and RUSBoost
createBoostTreesHyperparamSearchRange

createCNN2d1LayerHyperparamSearchRange
```

See hyperparameter-tuning for the functions that create the hyperparameter search values.

Tune hyperparameters

There are three different hyperparameter tuning functions, one for each of the algorithm types:

  • feature engineering methods: tuneHyperparamsRowFeatureMethod
  • 1D CNNs: tuneHyperparamsRowDataMethod
  • 2D CNNs: tuneHyperparamsImageMethod

Examples:

    tuneHyperparamsRowFeatureMethod("StatsNeuralNetwork1Layer",@StatsNeuralNetwork,UseParallel=true);
    tuneHyperparamsRowDataMethod("CNN1d1Layer",@CNN1d,UseGPU=true,UseParallel=true);
    tuneHyperparamsImageMethod("CNN2d3Layer",@CNN2d,UseGPU=true);

See the tuneHyperparams*.slurm scripts in the slurm folder to see the function calls for each method.

Final training 📁

After hyperparameter tuning is done, the algorithms need to be trained one final time on the entire training/validation set.

Similar to the hyperparameter tuning, there are three different training functions, one for each of the algorithm types:

  • feature engineering methods: trainRowFeatureMethod
  • 1D CNNs: trainRowDataMethod
  • 2D CNNs: trainImageMethod

Each of the methods takes the classifier name as a string.

Examples:

    trainRowFeatureMethod("AdaBoost");
    trainRowDataMethod("CNN1d5Layer");
    trainImageMethod("CNN2d3Layer");

Testing 📁

Using slurm: again, if you have access to a slurm cluster, all the training and testing code can be launched using [`run-row-methods.sh`](slurm/run-row-methods.sh) and [`run-image-methods.sh`](slurm/run-image-methods.sh).

Similar to the hyperparameter tuning, there are three different testing functions, one for each of the algorithm types:

  • feature engineering methods: testRowFeatureMethod
  • 1D CNNs: testRowDataMethod
  • 2D CNNs: testImageMethod

Each of the methods takes the classifier name as a string.

Examples:

    testRowFeatureMethod("AdaBoost");
    testRowDataMethod("CNN1d5Layer");
    testImageMethod("CNN2d3Layer");

See the train*.slurm scripts in the slurm folder to see the function calls for each method.

Owner

  • Name: BMW Lab @ MSU
  • Login: BMW-lab-MSU
  • Kind: organization
  • Location: Montana State University

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Vannoy"
  given-names: "Trevor C."
  orcid: "https://orcid.org/0000-0003-4034-9963"
- family-names: "Sweeney"
  given-names: "Nathaniel B."
- family-names: "Whitaker"
  given-names: "Bradley M."
  orcid: "https://orcid.org/0000-0001-8884-9743"
title: "BMW-lab-MSU/insect-detection-remote-sensing-mdpi"
version: 1.0.2
doi: 10.5281/zenodo.10055809
date-released: 2024-01-26
url: "https://github.com/BMW-lab-MSU/insect-detection-remote-sensing-mdpi"

GitHub Events

Total
  • Delete event: 1
  • Push event: 8
  • Create event: 1
Last Year
  • Delete event: 1
  • Push event: 8
  • Create event: 1