ml_individual_tree_mortality

Does Machine Learning outperform Logistic Regression in predicting individual tree mortality? - code and data

https://github.com/aitorvv/ml_individual_tree_mortality

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 5 DOI reference(s) in README
  • Academic publication links
    Links to: researchgate.net, scholar.google, zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.5%) to scientific vocabulary

Keywords

decision-trees forestry knn logistic-regression machine-learning modeling mortality naive-bayes random-forest survival svm

Repository

Basic Info
  • Host: GitHub
  • Owner: aitorvv
  • License: mit
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 1.29 GB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 5
Created almost 2 years ago · Last pushed 12 months ago

README.md

Does Machine Learning outperform Logistic Regression in predicting individual tree mortality?

:computer: :floppy_disk: :bar_chart: Original data, code and results related to the study


Manuscript DOI: Does machine learning outperform logistic regression in predicting individual tree mortality?

:open_file_folder: Repository DOI: DOI


:sparkles: Highlights

  • Six algorithms (five Machine Learning algorithms plus Logistic binomial Regression) were compared in predicting individual tree mortality.
  • Effects of dataset size, variable set, thinning, inventory length, and cross-validation were studied.
  • Random Forest performed best in all the proposed case studies except cross-validation.
  • Logistic binomial Regression appears to be the more robust algorithm under cross-validation.

:book: Abstract

Tree mortality is a crucial process in forest dynamics and a key component of forest growth models and simulators. Factors like competition, drought, and pathogens drive tree mortality, but the underlying mechanism is challenging to model. Current environmental changes further complicate modeling approaches, as they influence and alter all the factors involved in mortality. However, innovative classification algorithms can dig deep into data to find patterns that can model or even explain these relationships. We use Logistic binomial Regression as the reference algorithm for predicting individual tree mortality. However, different machine learning (ML) alternatives already applied to other forest modeling topics can be used for this purpose. Here, we compare the performance of five different ML algorithms (Decision Trees, Random Forest, Naive Bayes, K-Nearest Neighbour, and Support Vector Machine) against Logistic binomial Regression in individual tree mortality classification under 40 different case studies and a cross-validation case study. The data come from Norway spruce long-term experimental plots, with a total of 75,522 tree records and an average mortality rate of 10.28%. Across the case studies, using more variables improved overall performance, as expected, while larger datasets decreased the performance of the algorithms. Performance was also higher in unmanaged plots than in thinned ones. Random Forest outperformed the other algorithms in all cases except cross-validation, where it was the weakest. Our results demonstrate the potential of ML in assessing tree mortality. When the model application is not clearly defined and/or model interpretability is needed, Logistic binomial Regression is still the best tool for evaluating individual tree mortality.
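As a rough illustration of the reference approach named in the abstract, the sketch below fits a Logistic binomial Regression in base R and scores it on held-out records. It is not the code from this repository: the data are simulated, the variable names (`dbh`, `competition`, `dead`) are hypothetical, and the use of AUC as the performance metric is an assumption made only for this example.

```r
# Minimal sketch on SIMULATED data; not the repository's analysis.
# Variable names (dbh, competition, dead) are hypothetical.
set.seed(42)
n <- 2000
trees <- data.frame(
  dbh = runif(n, 5, 60),         # diameter at breast height (cm), simulated
  competition = runif(n, 0, 1)   # competition index, simulated
)

# Simulated mortality: small, suppressed trees die more often,
# and dead trees are the minority class
p_dead <- plogis(-2.5 - 0.05 * (trees$dbh - 30) + 2 * (trees$competition - 0.5))
trees$dead <- rbinom(n, 1, p_dead)

# Hold out 30% of the records for evaluation
test_idx <- sample(n, size = round(0.3 * n))
train <- trees[-test_idx, ]
test <- trees[test_idx, ]

# Logistic binomial regression as the reference classifier
fit <- glm(dead ~ dbh + competition, data = train, family = binomial)
pred <- predict(fit, newdata = test, type = "response")

# AUC via the rank (Mann-Whitney) formulation; metric chosen for illustration
auc <- function(score, label) {
  r <- rank(score)
  n_pos <- sum(label == 1)
  n_neg <- sum(label == 0)
  (sum(r[label == 1]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

auc_lr <- auc(pred, test$dead)
print(auc_lr)
```

Any alternative classifier could be scored on the same `test` split with the same `auc()` helper, which keeps the comparison against the logistic baseline like-for-like.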


:file_folder: Repository Contents

  • :open_file_folder: 1_data: raw and processed data; check here for a detailed description
  • :open_file_folder: 2_code: compilation of the code used for data curation, analysis and outputs included in the document; check here for a detailed description
  • :open_file_folder: 3_figures: figures, charts, tables and additional resources included in the document; check here for a detailed description
  • :open_file_folder: 4_bibliography: compilation of all the literature cited or consulted during the creation of the document

:thinking: How to use the resources of this repository

:dizzy: To download the contents of this repository, you can follow this guide.

:recycle: To reproduce the analysis, users must:

  • :floppy_disk: Data:

    • WorldClim data required for the simulations must be downloaded from its official website
  • :computer: Prerequisites: R must be installed, along with the libraries loaded in each script (RStudio was also used to develop the code). Some analyses, specifically training Random Forest models, demand substantial computing power and can run out of memory on a standard computer. Access to high-performance computing services is highly recommended in those cases.

  • :scroll: Usage: run the scripts in numerical order to reproduce each step correctly
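The usage step above could be scripted along these lines. This is only a sketch: it assumes the scripts are `.R` files stored directly inside `2_code` and that each filename starts with a number, which may not match the actual layout; `order_scripts()` is a hypothetical helper, not part of the repository.

```r
# Hypothetical helper: sort script paths by their leading number, so that
# "10_x.R" comes after "2_x.R" (plain alphabetical sorting would not do this)
order_scripts <- function(paths) {
  names_only <- basename(paths)
  # Files without a numeric prefix become NA and are placed last
  prefix <- suppressWarnings(as.numeric(sub("^([0-9]+).*$", "\\1", names_only)))
  paths[order(prefix, names_only)]
}

# Assumed location of the scripts; adjust to the real folder layout
scripts <- list.files("2_code", pattern = "\\.[Rr]$", full.names = TRUE)

for (script in order_scripts(scripts)) {
  message("Running ", script)
  source(script)
}
```

Scripts that rely on relative paths may need the working directory set before sourcing, and memory-intensive steps such as Random Forest training may be better run separately on a high-performance machine.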


:link: About the authors

Aitor Vázquez Veloso:

Email ORCID Google Scholar ResearchGate LinkedIn X

Astor Toraño Caicoya:

ORCID ResearchGate LinkedIn

Felipe Bravo Oviedo:

ORCID ResearchGate LinkedIn X

Peter Biber:

ORCID ResearchGate

Enno Uhl:

ORCID ResearchGate

Hans Pretzsch:

ORCID ResearchGate


License

MIT License

The content of this repository is under the MIT license.


:pencil: How to cite this repository?

You can use the citation file, or copy the citation in APA or BibTeX format using the "Cite this repository" button on the right-hand side of the repository page; more details are available here.


Owner

  • Name: aitor
  • Login: aitorvv
  • Kind: user
  • Location: Universidad de Valladolid, Palencia
  • Company: iuFOR

GitHub Events

Total
  • Release event: 2
  • Watch event: 1
  • Push event: 7
  • Fork event: 1
  • Create event: 2
Last Year
  • Release event: 2
  • Watch event: 1
  • Push event: 7
  • Fork event: 1
  • Create event: 2