Science Score: 72.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ✓ Academic publication links: Links to: zenodo.org
- ✓ Committers with academic emails: 1 of 2 committers (50.0%) from academic institutions
- ✓ Institutional organization owner: Organization nerc-ceh has institutional domain (www.ceh.ac.uk)
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (14.7%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: NERC-CEH
- License: other
- Language: R
- Default Branch: master
- Size: 91.7 MB
Statistics
- Stars: 2
- Watchers: 4
- Forks: 1
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Tracking Land-Use Change
This is an open, reproducible, computational research project on land-use change in the UK.
The code is developed by:
based on a combination of targets and workflowr packages in R.
The background to the method for the estimation of land-use change is described in this paper.
Project dependencies
This is an open, shareable, reproducible, computational research project.
All the computational work and document preparation is done with the R statistical computing environment.
The research project is contained in a single directory, with the exception that some data sets are too large to store on GitHub.
- We use the renv package to manage the R package versions used by the project.
- We are using the targets package to structure the project so that the work is computationally reproducible.
- The project code and documents are shared publicly on GitHub at https://github.com/NERC-CEH/luct
- The main report is produced using bookdown and shared publicly on GitHub at https://nerc-ceh.github.io/luct/
- We are exploring the workflowr package to structure the project so that all the materials and outputs are available via an openly accessible, automatically generated website. However, GitHub cannot currently show both the bookdown and workflowr website documents simultaneously, so this is still under investigation.
Workflow management
The project uses the R targets
package to structure and manage the workflow and to make it reproducible.
Central to this is the idea of the workflow as a "pipeline" - a defined
list of functions which transform data.
Here, the core pipeline contains the computational steps that read, reformat
and process the input data (time series and maps of land use and land-use
change data), and run the data assimilation steps that estimate the matrices
of land-use change, and produce the maps of past land use.
Potentially there can be multiple pipelines, which produce other analyses,
reports, or publications, in addition to the core process.
These are used to generate documentation in
the form of web pages with the workflowr package, but are not discussed further
here.
The core pipeline is defined in the file _targets.R as a list of "targets".
The targets represent the steps in the series of computations which make up the
pipeline.
A target is defined with the syntax tar_target(target_name, function_name(inputs)).
The target is thus a named R data object which is the outcome of a named function
with specified inputs. The one exception to this is that the target may simply be
a file for input or output.
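As a rough illustration of the syntax above, a minimal _targets.R might look like the sketch below. It is not the project's actual pipeline: the target names, the helper functions read_land_use() and summarise_area(), and the path data-raw/land_use.csv are all hypothetical. Only library(targets), tar_target() and its format = "file" argument come from the targets package.

```r
# _targets.R (illustrative sketch only; all names and paths are hypothetical)
library(targets)

# Hypothetical helper functions; in a project like this they would live in R/.
read_land_use <- function(path) read.csv(path)
summarise_area <- function(df) {
  aggregate(area_km2 ~ land_use, data = df, FUN = sum)
}

pipeline <- list(
  # File target: the command returns a path and targets tracks the file contents.
  tar_target(land_use_file, "data-raw/land_use.csv", format = "file"),
  # Ordinary targets: each is a named R object returned by a function call.
  tar_target(land_use, read_land_use(land_use_file)),
  tar_target(area_by_class, summarise_area(land_use))
)

# A _targets.R file must end by returning the list of targets.
pipeline
```

With a file like this in the project root, tar_make() would run the pipeline and tar_read(area_by_class) would retrieve a stored result.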
In the current project, the core pipeline is a list of 87 targets which specify the
input files, the reformatting and transformation of these data, and subsequent
calculations which make up the data assimilation algorithm.
The pipeline is managed using a "Make"-like
procedure, which analyses the dependencies between the different steps in the pipeline.
If there have been no changes to the code in the target functions or to the input data
since the pipeline was last run, it identifies that everything is up to date, and no
further computation is needed. If the source code of any target function or the content
of any data file has changed, it identifies which parts of the pipeline are affected,
and all the targets that depend on the change are recomputed. This has several
advantages: forcing the workflow to be declared at a higher level of abstraction;
running only the necessary computation, so saving run-time for tasks that are already
up to date; and, most importantly, providing tangible evidence that the results match
the underlying code and data, confirming that the computation is reproducible.
To identify changes, each target is represented by its hash value, stored in the
_targets directory.
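The idea of detecting changes by comparing hashes can be sketched in base R as follows. This is a simplified illustration, not how targets is implemented: the helper names file_hash() and is_outdated() are hypothetical, MD5 is used only as a stand-in hashing algorithm, and only a data file is hashed here, whereas targets also tracks the code of the functions.

```r
# Return the MD5 hash of a file's contents (the tools package ships with R).
file_hash <- function(path) unname(tools::md5sum(path))

# A step is out of date when the current input hash differs from the stored one.
is_outdated <- function(path, stored_hash) {
  !identical(file_hash(path), stored_hash)
}

# Hypothetical input file, written to a temporary location.
input <- tempfile(fileext = ".csv")
writeLines(c("year,area", "2000,10"), input)

# Hash recorded after the first "run" of the step.
stored <- file_hash(input)

# Nothing has changed yet, so the step is up to date.
unchanged <- is_outdated(input, stored)

# Editing the file changes its hash, so the step must be rerun.
writeLines(c("year,area", "2000,12"), input)
changed <- is_outdated(input, stored)
```

In the targets package itself, tar_outdated() lists the targets that need to be rerun and tar_make() reruns them.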
Project directory structure
_targets directory
This directory is managed by the targets package. It contains the
metadata describing the status of the computational pipelines and the
cached results of those computations.
analysis directory
workflowr creates a set of
standard directories. See the package documentation for details on how
these directories are used. The analysis directory contains
R Markdown notebooks which document
the workflow. These are still in development.
R directory
This contains the bulk of the R source code for the functions used in the project.
data-raw directory
This contains the raw data files for the project, in their original form as far as possible. To avoid duplication, this is a symbolic link to an earlier iteration. However, many of these files are too large to share via GitHub, and would need to be shared by another mechanism (e.g. as binary assets).
data directory
This contains the processed data files resulting from transformations of the raw data. This typically involves reprojection, reclassification, filtering and unit conversions. Again, many of these are too large to share via GitHub.
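A minimal sketch of this kind of processing step, in base R, is shown below. The table, the class codes, the lookup and the function process_land_use() are all hypothetical and are not taken from the project. Reprojection is omitted because it needs a spatial package.

```r
# Hypothetical raw table: land-use class codes and areas in hectares.
raw <- data.frame(
  class_code = c(1, 2, 2, 5, NA),
  area_ha    = c(120, 340, 60, 15, 40)
)

# Hypothetical lookup from numeric class codes to land-use names.
lookup <- c("1" = "woodland", "2" = "cropland", "5" = "urban")

process_land_use <- function(df, lookup) {
  # Filtering: drop rows with a missing class code.
  df <- df[!is.na(df$class_code), ]
  # Reclassification: map numeric codes to named classes.
  df$land_use <- unname(lookup[as.character(df$class_code)])
  # Unit conversion: 100 hectares = 1 square kilometre.
  df$area_km2 <- df$area_ha / 100
  df[, c("land_use", "area_km2")]
}

processed <- process_land_use(raw, lookup)
```

Writing each transformation as a small function like this fits the pipeline approach, since each function call can then become one target.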
docs directory
This contains the HTML web pages generated from the R Markdown files
with workflowr or bookdown.
output directory
This contains output files from the project, i.e. the results of the data assimilation.
slurm directory
This contains files for the steps which require high-performance computing (HPC), run via slurm, the widely used job scheduling system on HPC systems. These are generic enough to run on any HPC machine with slurm, and have been run on both JASMIN and POLAR, although the queue names, number of processors and memory limits will be system-specific.
manuscripts directory
The report is prepared and formatted using
bookdown in a subdirectory of
manuscripts that contains all
the necessary infrastructure files (templates, bibliographies, etc.).
renv directory
The renv package keeps track of the
R packages (and their versions) used by the project. It allows anyone to
reinstate the same packages and versions in their local copy of the
project.
The renv directory contains the information needed by renv to
reinstate the local package environment.
.gitignore
.gitignore in the R project root directory is used for all manual
entries so that all the manual rules are in one place. Packages, such as
renv, may create their own .gitignore files in subdirectories that
they manage.
Installation
Assuming you already have a current version of R installed, clone the project repository https://github.com/NERC-CEH/luct from GitHub.
When you open the project, you may get warning messages about packages
not being installed. This is because you need to use the
renv package to reinstate the
packages that are used by the project.
- Install renv in that project if it is not already installed.
- Use renv::restore() to install all the needed packages in the project-specific library.
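The two steps above could be wrapped up as in the sketch below. The function name restore_project_library() is hypothetical. The calls install.packages(), requireNamespace() and renv::restore() are standard, and the sketch assumes it is run from the project root with network access to a package repository.

```r
# Hypothetical convenience wrapper around the two installation steps.
restore_project_library <- function(project = ".") {
  # Install renv only if it is not already available.
  if (!requireNamespace("renv", quietly = TRUE)) {
    install.packages("renv")
  }
  # Reinstate the package versions recorded for the project.
  # prompt = FALSE skips the interactive confirmation.
  renv::restore(project = project, prompt = FALSE)
}
```

Calling restore_project_library() from the project root would then install the recorded package versions into the project-specific library.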
Get data
Any files in data, output and _targets that are more than
trivially small are not shared via Git and GitHub. They will be shared
via a separate, yet to be determined, mechanism (e.g.
Zenodo).
renv collaboration
The renv package is used to keep track of the installed packages and
their versions. See the renv collaboration
guide or
the workflow for synchronising package environments between
collaborators.
Still to do
- More detailed setup instructions and notes should go in this project-level README.md file.
- The README.md files in the subdirectories are currently generic, but should describe the purpose of each subdirectory and the files in that directory.
Acknowledgements
The website is based on a template by Ross W. Gayler
Owner
- Name: UK Centre for Ecology & Hydrology
- Login: NERC-CEH
- Kind: organization
- Location: UK
- Website: http://www.ceh.ac.uk/
- Repositories: 155
- Profile: https://github.com/NERC-CEH
Citation (CITATION)
The project is currently work-in-progress so please contact us first if you wish to cite it. As work is finalised we will add preferred citations.
GitHub Events
Total
- Watch event: 1
- Create event: 1
Last Year
- Watch event: 1
- Create event: 1
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| ADCEH\plevy | p****y@c****k | 53 |
| plevy | p****y@l****l | 15 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 0
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 0
- Total pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0