damagedlogginganalyzer
A project about an analyzation of a statistic of damaged logging (wood) in Germany using Python.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
-
✓CITATION.cff file
Found CITATION.cff file -
✓codemeta.json file
Found codemeta.json file -
✓.zenodo.json file
Found .zenodo.json file -
○DOI references
-
○Academic publication links
-
○Academic email domains
-
○Institutional organization owner
-
○JOSS paper metadata
-
○Scientific vocabulary similarity
Low similarity (12.5%) to scientific vocabulary
Keywords
Repository
A project about an analyzation of a statistic of damaged logging (wood) in Germany using Python.
Basic Info
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Topics
Metadata Files
README.md
DamagedLoggingAnalyzer

A project about of analyzing a statistic of damaged logging wood in Germany using Python.
This is my individual project for the module Research Software Engineering in SS24. The task was to analyze a dataset from genesis.destatis using Python and to find interesting aspects and potential questions that could be explored using this data.
If you are only interested in the results, please jump to the section Damaged Logging.
Installation
bash
git clone git@github.com:HokageM/DamagedLoggingAnalyzer.git
cd DamagedLoggingAnalyzer
pip install .
Usage
Commandline
```bash usage: damagedlogganalyzer [-h] [--version] [--calculate-most-dangerous-reasons] [--plot-reason-dependencies] [--plot-owner-dependencies] [--plot-temporal-dependencies-all] [--predict] [--out-dir OUT_DIR] CSV
Analyzes the data about damaged wood from the CSV file.
positional arguments: CSV Path to the CSV containing the statistic.
options: -h, --help show this help message and exit --version show program's version number and exit --calculate-most-dangerous-reasons Calculates the most dangerous reasons for each specie. --plot-reason-dependencies Create combined plots for each specie and owner combinations for all reasons. Plots will be saved in: output-path/Specie/allreasons/Owner/plot.png. --plot-owner-dependencies Create combined plots for each specie and reason combinations for all owners. Plots will be saved in: output-path/Specie/Reason/allowners/plot.png. --plot-temporal-dependencies-all Create plots for temporal dependencies for each specie, reason and owner combination. Plots will be saved in: output-path/Specie/Reason/Owner/plot.png. Note: use --plot- owner-dependencies and --plot-reason-dependencies. --predict Estimates a death count function using Polynomial Regression with K-Fold Cross Validation to predict the numbers for the year 2024. Plots will be saved in: output- path/Prediction2024/Specie/Reasons/Owner/plot.png.Note: will created a new model for every specie, reason and owner combination. --out-dir OUTDIR Output directory for the plots. ```
Library
The following classes are available:
python
from damagedlogginganalyzer.DamagedLoggingAnalyzer import DamagedLoggingAnalyzer
from damagedlogginganalyzer.CSVAnalyzer import CSVAnalyzer
from damagedlogginganalyzer.Plotter import Plotter
from damagedlogginganalyzer.WoodOracle import WoodOracle
from damagedlogginganalyzer.Oracle import Oracle
The classes CSVAnalyzer and Oracle are independent of this project and can be used for other projects.
Moreover, the classes DamagedLoggingAnalyzer, WoodOracle and Plotter are specific for this project / data set.
Damaged Logging
Note: You can optionally read the notebook storyofthis_project.ipynb to have an interactive experience with the project.
What is the dataset about?
The dataset contains statistics on forest wood harvesting due to various damages in Germany, listed by year, type of wood species groups, and ownership types of forests.
Each entry specifies the volume of wood harvested (in cubic meters) due to different causes such as wind/storm, snow/ice damage, insects, drought, and other reasons.
Why is this dataset interesting?
Here are some interesting aspects and potential questions, which could explore using this data:
Temporal Trends: How has the damage-caused wood harvesting changed over the years? Are there increasing trends in certain types of damage like drought or insects, possibly linked to climate change? How will the year 2024 look like in terms of the volume of wood harvested due to different causes?
Damage Types: Which type of damage causes the most wood harvesting? How do different types of forests compare in their vulnerability to specific damage types?
Forest Management: Are there noticeable differences in wood harvesting due to damage across different forest ownership types (e.g., state-owned vs. privately-owned forests)? This could reflect different management practices and their effectiveness.
Potential Questions (will not be answered in this project):
Impact of Extreme Weather: Are there particular years with exceptionally high damage that could be correlated to extreme weather events or climate anomalies?
Economic and Ecological Impact: What might be the economic impact of these losses? How might these harvesting activities due to damages impact the ecological balance and biodiversity in these forests?
Temporal Trends
Question: How has the damage-caused wood harvesting changed over the years?
I created individual plots for the total volume of wood harvested due to different reasons (drought, wind/storm, snow, insects, miscellaneous, total) and different owners over the years for different types of wood species. Here are some examples (the other plots can be found in the plots/specie/reason/owner/plot.png directory or can be generated with the following command:
bash
damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --plot-temporal-dependencies-all --out-dir plots
Deaths of Oak and Red Oak caused by insects and owned by Insgesamt over the years in Germany:

Deaths of Pine caused by insects and owned by Insgesamt over the years in Germany:

Additionally, I created combined plots for the different types of wood species.
Note: In the following, I will only show the combined plots for the different types of wood species and owned by Insgesamt. The other plots can be found in the plots/specie/all_reasons/owner/plot.png directory or can be generated with the following command:
bash
damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --plot-reason-dependencies --out-dir plots
Total Oak and Red Oak deaths over the years in Germany:

Total Beech and Hardwood deaths over the years in Germany:

Total Spruce deaths over the years in Germany:

Total Pine deaths over the years in Germany:

Total tree deaths over the years in Germany:

All in all, one can see that the deaths of all species due to the most reasons depend on the year and fluctuate between high and low values.
However, the total deaths of all species are increasing over the years especially for the reasons Sonstiges (miscellaneous), which could be caused by fires, diseases, or other reasons.
The definition of Sonstiges is not clear in the dataset.
Question: Are there increasing trends in certain types of damage like drought or insects, possibly linked to climate change?
All in all, the death of all species due to drought, snow, and insects can be modeled as linear (near constant) functions. Please look in Prediction_2024 for the function estimations.
The deaths of all species due to wind/storm depends on the year and fluctuate between high and low values. But one can see a very high number of deaths due to wind/storm in the year 2006 and 2018.
The deaths of all species due to Sonstiges (miscellaneous) can be modeled quit good with a polynomial function and are increasing over the years.
The total deaths of all species are increasing over the years.
Question: How will the year 2024 look like in terms of the volume of wood harvested due to different causes?
I used polynomial regression with k-fold cross validation to predict the volume of wood harvested due to different causes in the year 2024. Note: The prediction is based on the data from 2006 to 2023. All plots can be found in the Prediction_2024 directory or can be generated with the following command:
bash
damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --predict --out-dir path/to/output
The death of all species due to "Sonsitges" (miscellaneous) can be modeled quit good with a polynomial function, e.g. for the Beech and Hardwood species group:

In some cases prediction does not make sense, because the death do not follow a polynomial function and depend on other factors, e.g. death causes by insects:

Deaths due to nature like wind/storm, snow, and drought can be modeled as linear functions, e.g. for the Beech and Hardwood species group. Note: One need to handle the outliers in the data, e.g. the death of the year 2018 for the Beech and Hardwood species group due to wind/storm, this can be done by using a Ridge Regression model. Those outliers come from special events like storms, which are not predictable with the current model.

Damage Types
Question: Which type of damage causes the most wood harvesting?
This is solved by calculating the maximum damage for each type of wood species group, which can be done with the following command:
bash
damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --calculate-most-dangerous-reasons
The maximum damage for each type of wood species group is:
| Specie | Reason | Amount | |--------------------------------------------------------|-------------|--------| | Eiche und Roteiche | Wind/ Sturm | 2048 | | Buche und sonstiges Laubholz | Wind/ Sturm | 9124 | | Kiefer und L�rche | Wind/ Sturm | 19806 | | Fichte und Tanne und Douglasie und sonstiges Nadelholz | Sonstiges | 208725 | | Insgesamt | Sonstiges | 218181 |
Question: How do different types of forests compare in their vulnerability to specific damage types?
This is also solved by the following command:
bash
damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --calculate-most-dangerous-reasons --plot-temporal-dependencies-all
Analyzing the plots show that Buche und sonstiges Laubholz and Eiche und Roteiche have fewer deaths in any reason
compared to Kiefer und L�rche and Fichte und Tanne und Douglasie und sonstiges Nadelholz.
The death counts are up to 10 times higher for Kiefer und L�rche and Fichte und Tanne und Douglasie und sonstiges Nadelholz
compared to Buche und sonstiges Laubholz and Eiche und Roteiche.
Forest Management
Question: Are there noticeable differences in wood harvesting due to damage across different forest ownership types (e.g., state-owned vs. privately-owned forests)? This could reflect different management practices and their effectiveness.
This is solved by the following command:
bash
damaged_logg_analyzer data/DamagedLoggingWoodFixTable.csv --plot-owner-dependencies
Analyzing the plots show that the deaths due to different reasons are similar for the different owners. So it seems that the owner does not have a big impact on the death count. Here are some examples and the other plots can be found in the plots/specie/reason/all_owners/plot.png:
Deaths of Oak and Red Oak caused by insects and owned by Insgesamt over the years in Germany:

Deaths of Pine caused by insects and owned by Insgesamt over the years in Germany:

Contact Information
If you have any questions, suggestions, or concerns about this project, feel free to contact me:
- LinkedIn: Mike Trzaska
- GitHub: HokageM
Statistic about Damaged Logging
From: genesis.destatis
Statistic Number: 41261-0003
Citation
If you use this software, please cite it as described in the CITATION.cff file.
License
This project is licensed under the MIT License - see the LICENSE file for details.
Owner
- Name: Mike Trzaska
- Login: HokageM
- Kind: user
- Location: Hamburg
- Company: Eppendorf Liquidhandling GmbH
- Repositories: 1
- Profile: https://github.com/HokageM
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: DamagedLoggingAnalyzer
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Mike
family-names: Trzaska
email: m.trzaska663@gmail.com
repository-code: 'https://github.com/HokageM/DamagedLoggingAnalyzer'
abstract: >-
A project about of analyzing a statistic of damaged
logging wood in Germany using Python.
This is my individual project for the module Research
Software Engineering in SS24. The task was to analyze a
dataset from genesis.destatis using Python and to find
interesting aspects and potential questions that could be
explored using this data.
keywords:
- csv
- scikit-learn
- pandas
- polynomial-regression
license: MIT
version: v1.0
date-released: '2024-05-17'
GitHub Events
Total
- Watch event: 1
Last Year
- Watch event: 1
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/checkout v3 composite
- actions/setup-python v3 composite