diachrysia-classification

Classification of two twin moth species using microscopic spectroscopy and machine learning

https://github.com/kadyb/diachrysia-classification

Keywords

classification dataset machine-learning moths r spectroscopy

Repository

Basic Info
  • Host: GitHub
  • Owner: kadyb
  • License: mit
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 9.65 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 5 years ago · Last pushed almost 4 years ago

README.md

Diachrysia classification

This repository contains the data, code, and results for the article “Reflectance spectroscopy and machine learning as a tool for the categorization of twin species based on the example of the Diachrysia genus”.

Dataset

The data directory contains two datasets: the first holds collection metadata for each moth specimen (Legislative), and the second holds the measured spectra (Spectra). Both files are available in two formats, .csv and .xlsx.

The Legislative file contains the following columns:

  • ID: individual identifier (species-number)
  • Species: Diachrysia chrysitis or Diachrysia stenochrysis
  • Sex: Male (♂) or Female (♀)
  • Year_catch: year the moth was caught
  • Day_catch: day of the year when the moth was caught
  • Locality: place where the moth was caught
  • UTM_code: zone in the UTM coordinate system
  • Longitude: east–west position in degrees
  • Latitude: north–south position in degrees
  • Feature_level: how strongly the morphological feature is marked, where 1 means weak, 2 means strong, and 3 means very strong

The Spectra file contains the following columns:

  • ID: individual identifier (species-number)
  • Species: Diachrysia chrysitis or Diachrysia stenochrysis
  • Scale: part of the scale on which the spectrometer measurement was made (Glass or Brown)
  • 400-2100: spectral band number (one column per band)
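As a rough illustration of working with the Spectra layout described above, the sketch below reads a tiny in-memory stand-in for the file and separates the band columns from the identifying columns. The reflectance values, the three band names, and the ID strings are made up for the example; the real file in the data directory should be read from its actual path instead.

```r
# Synthetic stand-in for the Spectra file. The column layout follows the
# description above; all values, band names, and IDs here are hypothetical.
csv_text <- "ID,Species,Scale,400,401,402
chrysitis-1,Diachrysia chrysitis,Glass,0.10,0.12,0.15
chrysitis-1,Diachrysia chrysitis,Brown,0.05,0.06,0.07
stenochrysis-1,Diachrysia stenochrysis,Glass,0.20,0.22,0.25
stenochrysis-1,Diachrysia stenochrysis,Brown,0.04,0.05,0.06"

# check.names = FALSE keeps numeric column names such as "400" unchanged
spectra <- read.csv(text = csv_text, check.names = FALSE)

# Band columns are assumed to be the ones whose names are purely numeric
band_cols <- grep("^[0-9]+$", names(spectra), value = TRUE)

# Keep only the measurements taken on the glass part of the scale
glass <- spectra[spectra$Scale == "Glass", ]
band_matrix <- as.matrix(glass[, band_cols])
```

Reading with `check.names = FALSE` matters here because R would otherwise rewrite numeric column names (for example `400` to `X400`).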

Reproduction

  1. Open the diachrysia-classification.Rproj project file in RStudio.
  2. Run 01_randomforest.R to build the classification models, assess classification performance at the general and individual levels, and determine the importance of the spectral features.
  3. Run 02_KS_test.R to determine the importance of the spectral bands for species discrimination using the Kolmogorov–Smirnov test.
  4. Run 03_LDA_best_features.R to determine the most useful spectral bands for species classification using Linear Discriminant Analysis and the D-statistic.
  5. Run 04_LDA_combinations.R to determine the minimum set of spectral features needed to distinguish the species with 100% accuracy.
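The idea behind the last step, searching for the smallest set of bands that separates the species perfectly, can be sketched as follows. This is not the code from 04_LDA_combinations.R: the toy data, the band names, the helper functions, and the use of leave-one-out cross-validation as the accuracy measure are all assumptions made for illustration.

```r
library(MASS)  # provides lda(); MASS is a recommended package shipped with R

set.seed(1)
n <- 15  # specimens per species (made-up number)

# Hypothetical toy data: only band_b carries a species signal
toy <- data.frame(
  species = factor(rep(c("chrysitis", "stenochrysis"), each = n)),
  band_a  = rnorm(2 * n, mean = 0.5, sd = 0.05),
  band_b  = c(rnorm(n, 0.2, 0.02), rnorm(n, 0.8, 0.02)),
  band_c  = rnorm(2 * n, mean = 0.5, sd = 0.05)
)

# Leave-one-out accuracy of an LDA model restricted to the given bands
loo_accuracy <- function(data, bands) {
  fit <- lda(species ~ ., data = data[, c("species", bands), drop = FALSE],
             CV = TRUE)
  mean(fit$class == data$species)
}

# Try band subsets of increasing size and stop at the first perfect one
smallest_perfect_set <- function(data, bands) {
  for (k in seq_along(bands)) {
    for (combo in combn(bands, k, simplify = FALSE)) {
      if (loo_accuracy(data, combo) == 1) return(combo)
    }
  }
  NULL  # no subset reached 100% accuracy
}

bands <- c("band_a", "band_b", "band_c")
best <- smallest_perfect_set(toy, bands)
```

An exhaustive search like this grows combinatorially with the number of bands, so in practice it is usually restricted to a short list of candidate bands chosen beforehand.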

Results

The results of the code are saved in the results directory:

  • ks-test.csv: importance of the spectral bands for species discrimination, determined by the D-statistic
  • rf-importance.csv: average importance of the spectral features for classification in the random forest models
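For readers unfamiliar with the D-statistic, it is the two-sample Kolmogorov–Smirnov statistic: the largest vertical distance between the empirical distribution functions of the two species for a given band, ranging from 0 to 1. The sketch below computes it per band on synthetic data using base R's `ks.test`. The data, band names, and the `ks_d` helper are hypothetical, and the column layout of ks-test.csv is not assumed here.

```r
set.seed(42)
n <- 30  # specimens per species (made-up number)
species <- rep(c("chrysitis", "stenochrysis"), each = n)

# Hypothetical reflectance values: band "400" has no species signal,
# band "401" differs strongly between the two species
bands <- cbind(
  "400" = rnorm(2 * n, mean = 0.5, sd = 0.05),
  "401" = c(rnorm(n, 0.3, 0.05), rnorm(n, 0.7, 0.05))
)

# D-statistic of the two-sample Kolmogorov-Smirnov test for one band
ks_d <- function(values, groups) {
  parts <- split(values, groups)
  unname(ks.test(parts[[1]], parts[[2]])$statistic)
}

# One D value per band, then ranked from most to least discriminating
d_stats <- apply(bands, 2, ks_d, groups = species)
ranking <- sort(d_stats, decreasing = TRUE)
```

A band with D close to 1 has almost non-overlapping value distributions for the two species, which is why it is a useful measure of discriminating power.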

Owner

  • Name: Krzysztof Dyba
  • Login: kadyb
  • Kind: user
  • Location: Poland
  • Company: Adam Mickiewicz University

Spatial Data Science | Remote Sensing | R

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use data or code from this repository, please cite it as below."
preferred-citation:
  type: article
  authors:
  - family-names: "Dyba"
    given-names: "Krzysztof"
    orcid: "https://orcid.org/0000-0002-8614-3816"
  - family-names: "Wąsala"
    given-names: "Roman"
    orcid: "https://orcid.org/0000-0001-7348-987X"
  - family-names: "Piekarczyk"
    given-names: "Jan"
    orcid: "https://orcid.org/0000-0002-2405-6741"
  - family-names: "Gabała"
    given-names: "Elżbieta"
    orcid: "https://orcid.org/0000-0002-9296-8208"
  - family-names: "Gawlak"
    given-names: "Magdalena"
    orcid: "https://orcid.org/0000-0001-7994-2882"
  - family-names: "Jasiewicz"
    given-names: "Jarosław"
    orcid: "https://orcid.org/0000-0003-2837-9078"
  - family-names: "Ratajkiewicz"
    given-names: "Henryk"
    orcid: "https://orcid.org/0000-0002-3512-1350"
  title: "Reflectance spectroscopy and machine learning as a tool for the categorization of twin species based on the example of the Diachrysia genus"
  journal: "Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy"
  doi: "10.1016/j.saa.2022.121058"
  url: "https://www.sciencedirect.com/science/article/pii/S1386142522002062"
  volume: 273
  pages: 121058
  year: 2022
  month: 5
