ml_individual_tree_mortality
Does Machine Learning outperform Logistic Regression in predicting individual tree mortality? - code and data
Statistics
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 5
Metadata Files
README.md
Does Machine Learning outperform Logistic Regression in predicting individual tree mortality?
:computer: :floppy_disk: :bar_chart: Original data, code and results related to the study
Manuscript DOI: Does machine learning outperform logistic regression in predicting individual tree mortality?
:open_file_folder: Repository DOI:
:sparkles: Highlights
- Six algorithms (five Machine Learning methods and Logistic binomial Regression) were compared in predicting individual tree mortality.
- Effects of dataset size, variable set, thinning, inventory length, and cross-validation were studied.
- Random Forest outperformed the other algorithms in every case study except cross-validation.
- Logistic binomial Regression appears to be the more robust algorithm under cross-validation.
:book: Abstract
Tree mortality is a crucial process in forest dynamics and a key component of forest growth models and simulators. Factors like competition, drought, and pathogens drive tree mortality, but the underlying mechanism is challenging to model. Current environmental changes complicate modeling approaches even further, as they influence and alter all the factors involved in mortality. However, innovative classification algorithms can go deep into data to find patterns that can model or even explain their relationship. We use Logistic binomial Regression as the reference algorithm for predicting individual tree mortality. However, different machine learning (ML) alternatives already applied to other forest modeling topics can be used for this purpose. Here, we compare the performance of five different ML algorithms (Decision Trees, Random Forest, Naive Bayes, K-Nearest Neighbour, and Support Vector Machine) against Logistic binomial Regression in individual tree mortality classification under 40 different case studies and a cross-validation case study. The data come from Norway spruce long-term experimental plots, which have a total of 75,522 tree records and a 10.28% mortality rate on average. Across the case studies, general performance improved as expected when more variables were used, while more extensive datasets decreased the performance level of the algorithms. Performance was also higher for plots that remained unmanaged than for thinned ones. Random Forest outperformed the other algorithms in all cases except cross-validation, where it was the weakest. Our results demonstrate the potential of ML in assessing tree mortality. When the model application is not clearly defined and/or model interpretability is needed, Logistic binomial Regression is still the best tool for evaluating individual tree mortality.
:file_folder: Repository Contents
- :open_file_folder: 1_data: raw and processed data; check here for a detailed description
- :open_file_folder: 2_code: the code used for data curation, analysis and the outputs included in the document; check here for a detailed description
- :open_file_folder: 3_figures: figures, charts, tables and additional resources included in the document; check here for a detailed description
- :open_file_folder: 4_bibliography: all the literature cited or consulted during the creation of the document
:thinking: How to use the resources of this repository
:dizzy: To download the contents of this repository, you can follow this guide.
:recycle: To reproduce the analysis, users must:
:floppy_disk: Data:
- WorldClim data required for the simulations must be downloaded from its official website
:computer: Prerequisites: installation and code: R must be installed, together with the libraries loaded in each script (RStudio was also used to develop the code). Some analyses, specifically training the RF models, demand substantial computing power and can cause out-of-memory errors on a standard computer. Access to high-performance computing services is highly recommended in those cases.
:scroll: Usage: follow the numerical order of the scripts to reproduce each step correctly
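As a minimal sketch of what "numerical order" means in practice, the Python helper below sorts script names by their leading number, so that a script numbered 10 comes after 2 rather than right after 1 (which is what a plain alphabetical sort would give). The file names in the usage note and the assumption that the scripts sit directly inside `2_code` are hypothetical; the actual names and layout in the repository may differ. The analysis itself is written in R; this helper only prints the `Rscript` commands in order.

```python
import re
from pathlib import Path


def script_order_key(name):
    """Sort key: leading integer of the file name, then the name itself."""
    match = re.match(r"(\d+)", name)
    # Files without a numeric prefix sort after all numbered scripts.
    number = int(match.group(1)) if match else float("inf")
    return (number, name)


def ordered_scripts(names):
    """Return the R script names from `names`, sorted by numeric prefix."""
    r_scripts = [n for n in names if n.lower().endswith(".r")]
    return sorted(r_scripts, key=script_order_key)


if __name__ == "__main__":
    # Folder name taken from the repository contents; a flat layout is assumed.
    code_dir = Path("2_code")
    if code_dir.is_dir():
        for script in ordered_scripts(p.name for p in code_dir.iterdir()):
            # Print the command instead of running it, so the sketch has no side effects.
            print(f"Rscript {code_dir / script}")
```

For example, with the hypothetical names `10_figures.R`, `2_models.R` and `1_data.R`, `ordered_scripts` returns them as `1_data.R`, `2_models.R`, `10_figures.R`.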
:link: About the authors
Aitor Vázquez Veloso:
Astor Toraño Caicoya:
Felipe Bravo Oviedo:
Peter Biber:
Enno Uhl:
Hans Pretzsch:
License
The content of this repository is under the MIT license.
:pencil: How to cite this repository?
You can use the citation file or copy the citation directly in APA or BibTeX format using the "Cite this repository" button on the right-hand side of the repository page; here are more details.
Owner
- Name: aitor
- Login: aitorvv
- Kind: user
- Location: Universidad de Valladolid, Palencia
- Company: iuFOR
- Website: https://es.linkedin.com/in/aitorvazquezveloso
- Twitter: aitorvv
- Repositories: 1
- Profile: https://github.com/aitorvv
GitHub Events
Total
- Release event: 2
- Watch event: 1
- Push event: 7
- Fork event: 1
- Create event: 2
Last Year
- Release event: 2
- Watch event: 1
- Push event: 7
- Fork event: 1
- Create event: 2

