mdl_build_census_database
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.9%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: luckinet
- License: gpl-3.0
- Language: R
- Default Branch: main
- Size: 728 KB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
mdl_build_census_database
Rationale
This module builds a harmonized database of census statistics on areal land use, forestry and agricultural commodities (crops and livestock). Basic data are sourced from the FAO; more detailed data are taken from open sources such as national or pan-regional statistical agencies (databases and yearbooks) or other collation efforts that produce sub-national census datasets (such as countryStat, FAO Data Lab).
The input data
Methods
Meta data
The module-specific meta data capture ...
```
----
geography : Brazil
spatial : Nation, Estado, Municipality
period : (1974)1990 - 2022
variables :
- land : hectares_covered
- crops : hectares_planted, hectares_harvested, tons_produced, kiloPerHectare_yield
- livestock : number_heads
- tech : -
- social : -
sampling : survey, census
----
```
Tools
The output
Change-Log
Please find documentation of recent changes here.
Acknowledgements
other snippets
Scripts (in the folder '/src') are organised either per data-series (such as fao, countrystat or eurostat) or per nation. Each script follows a clearly defined template, where
1) the meta-data are recorded,
2) geometries (if available) and data tables are recorded and
3) geometries (if available) and data tables are normalized (whereby territory names are matched with the gazetteer, commodities/land-use concepts are matched with the LUCKINet land-use ontology and tables are translated to a common standard via tabshiftr).
After collecting all information in a harmonized database some further steps are required. The final script 99_make_database.R carries these out:
- summarize values per territorial unit where they were reported more than once, or where harmonization mapped several external concepts to the same harmonized concept.
- optionally interpolate missing values (depending on the model run)
- carry out checks that ensure the patterns are within reasonable bounds.
- determine quality flags for provenance documentation.
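The first three of these steps can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the code in 99_make_database.R: the toy values, the helper `interpolate_series()` and the `interpolated` flag column are hypothetical, and linear interpolation is only one possible way to fill gaps.

```r
# Hypothetical toy table; column names follow the harmonized table layout,
# the values are invented for illustration.
census <- data.frame(
  gazID     = c(1L, 1L, 1L, 1L, 2L, 2L),
  ontoID    = "wheat",
  year      = c(2000L, 2000L, 2001L, 2003L, 2000L, 2001L),
  harvested = c(10, 5, 20, 40, 7, 9)
)

# 1) sum values that were reported more than once for the same
#    territorial unit, concept and year
summed <- aggregate(harvested ~ gazID + ontoID + year, data = census, FUN = sum)

# 2) optionally fill missing years by linear interpolation within each
#    unit/concept series (assumption: linear interpolation is acceptable)
interpolate_series <- function(df) {
  if (nrow(df) < 2) {
    # a single observation cannot be interpolated, return it unchanged
    df$interpolated <- FALSE
    return(df)
  }
  years <- seq(min(df$year), max(df$year))
  data.frame(
    gazID        = df$gazID[1],
    ontoID       = df$ontoID[1],
    year         = years,
    harvested    = approx(x = df$year, y = df$harvested, xout = years)$y,
    interpolated = !(years %in% df$year)  # hypothetical provenance flag
  )
}

groups <- split(summed, list(summed$gazID, summed$ontoID), drop = TRUE)
filled <- do.call(rbind, lapply(groups, interpolate_series))
rownames(filled) <- NULL

# 3) a very simple plausibility check: areas must not be negative
stopifnot(all(filled$harvested >= 0))
```

Keeping a logical column that marks interpolated rows is one simple way to carry provenance forward to the quality-flag step; the actual flags used by the module may differ.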
Database structure
Each script produces an *.rds-file that contains a data-frame of the harmonized data tables and a geopackage (*.gpkg) file of the geometry associated with those data (typically based on GADM). Each harmonized table then contains the following columns:
| name | type | description |
|:---------- |:--------- |:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| tabID | integer | the identifier of the specific table (see inv_tables.csv) from which the observation originates |
| geoID | integer | the identifier of the specific geometry data-series to which the observation is associated/where it occurs |
| gazID | integer | the administrative hierarchy identifier |
| gazName | character | the (hierarchical) name of the territorial unit. This is a combination of all the parents up to the territory in question |
| gazMatch | character | the match between the harmonized territorial unit and the external territorial unit |
| year | YYYY | the year in which the census observation has been recorded |
| ontoID | character | the identifier of the land use concept |
| ontoName | character | the (hierarchical) name of the land use concept |
| ontoMatch | character | the match between the harmonized land use concept and the external land use concept |
| harvested | numeric | the area that was harvested [hectare] |
| planted | numeric | the area that was planted [hectare] |
| area | numeric | either the area of landcover or land use or in case an agricultural commodity is quantified only in coarse detail without specification of whether it is measured by harvested or planted area [hectare] |
| production | numeric | the production quantity [tonnes] |
| yield | numeric | the yield [production per harvested area] |
| headcount | numeric | the number of animals (for livestock only) |
| ... | numeric | possibly other variables that are also reported and which may give some indication of or hint at the above variables |
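As a rough sketch of how such a table could be read and checked in R: the helper `check_harmonized()`, the temporary file and all values below are hypothetical, and only the column names are taken from the table above. The yield is derived here as tonnes per hectare, which is an assumption; the database may store it in another unit.

```r
# Identifier columns listed in the table above
required_cols <- c("tabID", "geoID", "gazID", "gazName", "gazMatch", "year",
                   "ontoID", "ontoName", "ontoMatch")

# Hypothetical helper: stop if an identifier column is missing
check_harmonized <- function(tbl) {
  missing <- setdiff(required_cols, names(tbl))
  if (length(missing) > 0) {
    stop("missing columns: ", paste(missing, collapse = ", "))
  }
  invisible(TRUE)
}

# Invented single-row table standing in for the output of one script
toy <- data.frame(
  tabID = 1L, geoID = 1L, gazID = 1L,
  gazName = "Brazil", gazMatch = "close",
  year = 2020L,
  ontoID = "id_001", ontoName = "wheat", ontoMatch = "close",
  harvested = 100,   # hectare
  production = 250   # tonnes
)

# Round trip through an *.rds file, as produced by the scripts
path <- tempfile(fileext = ".rds")
saveRDS(toy, path)
tbl <- readRDS(path)

check_harmonized(tbl)

# Derive yield as production per harvested area (here: tonnes per hectare)
tbl$yield <- tbl$production / tbl$harvested
```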
Each geometry contains a layer per territorial level with an associated attribute table that has the following columns:
| name | type | description |
|:-------- |:--------- |:------------------------------------------------------------------------------------------------------------------------- |
| fid | integer | territorial unit identifier |
| gazID | integer | the administrative hierarchy identifier |
| gazName | character | the (hierarchical) name of the territorial unit. This is a combination of all the parents up to the territory in question |
| gazClass | numeric | the class to which the territorial units are associated in the gazetteer |
| match | character | the match of the harmonised and the external territorial concept |
| external | character | the (hierarchical) name of the external territorial unit |
| geoID | integer | the identifier of the geometry dataseries from which the territory originates |
| geom | geometry | the geometric information of the territorial unit (simple features standard) |
script structure
Owner
- Name: LUCKINet
- Login: luckinet
- Kind: organization
- Website: https://www.idiv.de/en/luckinet.html
- Repositories: 3
- Profile: https://github.com/luckinet
Welcome to the "land-use change knowledge integration" network's software repository
Citation (CITATION.cff)
cff-version: 1.2.0
title: luckinet - build census database
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Steffen
    family-names: Ehrmann
    email: steffen.ehrmann@posteo.de
    affiliation: >-
      Deutsches Zentrum für integrative
      Biodiversitätsforschung (iDiv) Halle-Jena-Leipzig
    orcid: 'https://orcid.org/0000-0002-2958-0796'
repository-code: 'https://github.com/luckinet/mdl_build_census_database'
repository: 'https://github.com/luckinet/loca'
abstract: >-
  This is the main script for building a database of
  (national and sub-national) census data for all crop and
  land-use dimensions of LUCKINet and all livestock
  dimensions of GPW. It is a module of the LOCA (LUCKINet
  overall computation algorithm) pipeline and depends on
  input from other files in the repository
  https://github.com/luckinet/loca.
keywords:
  - census data
  - subnational
  - landuse
  - livestock
  - crop
  - production
  - harvested area
  - planted area
  - luckinet
  - global pasture watch
license: CC-BY-4.0
version: 0.7.0
date-released: '2025-03-14'
GitHub Events
Total
- Release event: 1
- Push event: 11
- Create event: 1
Last Year
- Release event: 1
- Push event: 11
- Create event: 1