template-repo-data-analysis

Reusable, project-oriented data analysis template designed to align with the FAIR principles. The template offers a structured and scalable approach for managing scientific data and code, particularly suited for collaborative and open science environments.

https://github.com/agrdatasci/template-repo-data-analysis

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 3 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.5%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: AgrDataSci
  • License: cc-by-4.0
  • Language: R
  • Default Branch: main
  • Homepage:
  • Size: 665 KB
Statistics
  • Stars: 2
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 2
Created about 3 years ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

A FAIR and project-oriented template for open science data workflows

Kauê de Sousa, Marie-Angélique Laporte

Here we introduce a reusable, project-oriented data analysis template designed to align with the FAIR principles (Findable, Accessible, Interoperable, and Reusable). The template offers a structured and scalable approach for managing scientific data and code, particularly suited for collaborative and open science environments. It integrates good practices from R-based workflows, GitHub–Zenodo integration, and metadata standards including DataCite and CFF.


Objectives

The template aims to:

  • Improve consistency and reproducibility across research projects.
  • Ensure research outputs are FAIR and ready for long-term archiving.
  • Provide a ready-to-use metadata structure for data and code publication.
  • Support automated workflows and interoperability with platforms like Zenodo, GitHub, and institutional repositories.

Repository structure

The repository uses a project-oriented layout inspired by best practices in the R community.

```text
template-repo-data-analysis/
├── data/                                     # Anonymized raw and cleaned datasets
├── docs/                                     # Reports or additional documentation
├── metadata/                                 # Metadata files and templates
│   ├── project-metadata.xlsx                 # Excel file with project metadata
│   ├── project-metadata.json                 # JSON file with project metadata for DataCite
│   ├── example-metadata-data-mip-uganda.csv  # Description of the dataset used as example
│   └── README.md
├── output/                                   # Model results, figures, tables
├── script/                                   # Scripts for validation, metadata generation, etc.
├── .gitignore                                # Indicates which files or folders to exclude from version control
├── LICENSE                                   # A valid license file establishing the rights to use the data
├── CITATION.cff                              # Used by GitHub and Zenodo to generate citation metadata
├── template-repo-data-analysis.Rproj         # RStudio file to set up the environment (must be renamed)
└── README.md                                 # Project overview
```
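After copying the template, the layout can be checked with a few lines of base R. The sketch below is illustrative only and is not part of the repository: the helper `check_layout()` is hypothetical, and the expected entries are simply the top-level folders and files from the tree above.

```r
# Minimal sketch: check that a project copy has the expected top-level layout.
# Entry names are taken from the tree above; check_layout() is a hypothetical helper.
expected_dirs  <- c("data", "docs", "metadata", "output", "script")
expected_files <- c("README.md", "LICENSE", "CITATION.cff", ".gitignore")

# Return the expected entries that are missing under `root`.
check_layout <- function(root = ".") {
  missing_dirs  <- expected_dirs[!dir.exists(file.path(root, expected_dirs))]
  missing_files <- expected_files[!file.exists(file.path(root, expected_files))]
  c(missing_dirs, missing_files)
}

# Demonstration on a temporary folder that holds only part of the layout.
root <- file.path(tempdir(), "layout-demo")
dir.create(file.path(root, "data"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(root, "README.md"))

missing_entries <- check_layout(root)
print(missing_entries)
```

Calling `check_layout()` with no arguments from the project root would run the same check on a real copy of the template.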

Best practices incorporated

  • Project-oriented structure.
  • FAIR-compliant metadata following Zenodo and DataCite standards.
  • Automation via R scripts for JSON conversion and validation.
  • Citation support through a CITATION.cff file.
  • Clear separation of raw data, outputs, scripts, and documentation.
  • GitHub–Zenodo integration for DOI minting.

Metadata management

The metadata/ folder contains:

  • metadata.xlsx: Main spreadsheet with sheets for general metadata, authors, funders, dates, and communities.
  • metadata.json: DataCite-compliant metadata for institutional or repository submission.
  • CITATION.cff: Used by GitHub and Zenodo to generate citation metadata (this file must be placed in the repository root).
  • README.md: Guide to using the metadata tools.
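To illustrate how spreadsheet metadata can end up as DataCite-style JSON, the sketch below builds a small record from two data frames that stand in for spreadsheet sheets and serializes it with the jsonlite package. It is not the code shipped in script/. The sheet layout, the column names (`field`, `value`, `given_name`, `family_name`, `orcid`), the placeholder values, and the helper `build_datacite_record()` are all assumptions, and only a handful of DataCite properties are covered. In a real project the data frames would more likely be read from the spreadsheet, for example with `readxl::read_excel()`.

```r
# Minimal sketch: turn spreadsheet-style metadata into DataCite-like JSON.
# Requires the jsonlite package. Column names, field names, and values are
# placeholders chosen for illustration, not the template's actual schema.
library(jsonlite)

# Stand-ins for two sheets of the metadata spreadsheet.
general <- data.frame(
  field = c("title", "publisher", "publication_year", "license"),
  value = c("Example dataset", "Zenodo", "2025", "CC-BY-4.0"),
  stringsAsFactors = FALSE
)
authors <- data.frame(
  given_name  = c("Jane", "John"),
  family_name = c("Doe", "Roe"),
  orcid       = c("0000-0000-0000-0000", NA),  # placeholder identifier
  stringsAsFactors = FALSE
)

# Hypothetical helper: assemble a nested list mirroring a few DataCite properties.
build_datacite_record <- function(general, authors) {
  g <- setNames(as.list(general$value), general$field)
  creators <- lapply(seq_len(nrow(authors)), function(i) {
    creator <- list(
      name       = paste(authors$family_name[i], authors$given_name[i], sep = ", "),
      givenName  = authors$given_name[i],
      familyName = authors$family_name[i]
    )
    # Attach an ORCID only when one is provided.
    if (!is.na(authors$orcid[i])) {
      creator$nameIdentifiers <- list(list(
        nameIdentifier       = paste0("https://orcid.org/", authors$orcid[i]),
        nameIdentifierScheme = "ORCID"
      ))
    }
    creator
  })
  list(
    titles          = list(list(title = g$title)),
    creators        = creators,
    publisher       = g$publisher,
    publicationYear = as.integer(g$publication_year),
    rightsList      = list(list(rightsIdentifier = g$license))
  )
}

record <- build_datacite_record(general, authors)
json <- toJSON(record, pretty = TRUE, auto_unbox = TRUE)
cat(json, "\n")
```

Building the nested list first and serializing once keeps the mapping from spreadsheet columns to JSON properties in a single place, which makes it easier to adapt when the spreadsheet structure changes.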

How to use this template

  1. Clone or fork the repository and open it in RStudio.
  2. Complete metadata.xlsx using the provided structure.
  3. Add your data to the data/ folder.
  4. Run your analysis.
  5. Run the R scripts from the script/ folder to:
    • Validate metadata
    • Convert to JSON
  6. Publish to Zenodo:
    • Enable the GitHub repo in your Zenodo GitHub settings.
    • Create a GitHub release. Zenodo will use the CITATION.cff to generate the DOI metadata.
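The metadata validation in step 5 can be pictured with a small check like the one below. It is a sketch under stated assumptions, not the validation script shipped in script/: the required field names, the `field`/`value` layout, and the helper `validate_metadata()` are hypothetical, and the only rules applied are presence, non-emptiness, and a four-digit publication year.

```r
# Minimal sketch of a metadata check in base R.
# Field names and rules are assumptions made for illustration only.
required_fields <- c("title", "description", "publication_year", "license")

# Hypothetical helper: return a character vector of problems (empty if none).
validate_metadata <- function(meta, required = required_fields) {
  problems <- character(0)

  # Every required field must be present.
  missing_fields <- setdiff(required, meta$field)
  if (length(missing_fields) > 0) {
    problems <- c(problems, paste("missing field:", missing_fields))
  }

  # Required fields that are present must not be empty or NA.
  present <- meta[meta$field %in% required, , drop = FALSE]
  empty <- is.na(present$value) | trimws(present$value) == ""
  if (any(empty)) {
    problems <- c(problems, paste("empty value:", present$field[empty]))
  }

  # The publication year, if given, must be four digits.
  year <- meta$value[meta$field == "publication_year"]
  if (length(year) == 1 && !is.na(year) && !grepl("^[0-9]{4}$", year)) {
    problems <- c(problems, "publication_year is not a four-digit year")
  }

  problems
}

# Example input with a missing license, an empty description, and a bad year.
meta <- data.frame(
  field = c("title", "description", "publication_year"),
  value = c("Example dataset", "", "25"),
  stringsAsFactors = FALSE
)

problems <- validate_metadata(meta)
if (length(problems) > 0) {
  message("Metadata problems found:\n", paste(" -", problems, collapse = "\n"))
}
```

Returning the list of problems, rather than stopping at the first one, lets all issues in the spreadsheet be fixed in a single pass before the JSON conversion and the Zenodo release.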

Reusability and extensions

This template is suitable for:

  • Research projects requiring data publication.
  • Collaborative projects across CGIAR and university partners.
  • Students and researchers new to reproducible data workflows.

The structure is extensible and can support:

  • Additional metadata schemas and controlled vocabularies (e.g., Dublin Core, DCAT).
  • Workflow automation using GitHub Actions or R scripts.
  • Integration with institutional data catalogs.


Contribute

This template is open for improvement! You can:

  • Suggest edits via GitHub Issues.
  • Fork the repository for your own project.
  • Contribute back improvements via pull request.


Owner

  • Name: AgrDataSci
  • Login: AgrDataSci
  • Kind: organization
  • Email: k.desousa@cgiar.org
  • Location: France

We develop methods and tools to support sustainable food systems, rural development, and digital inclusion.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this dataset or software, please cite it as below."
title: "A FAIR and project-oriented template for open science data workflows"
version: "v1.2"
date-released: 2025-06-09
authors:
  - family-names: de Sousa
    given-names: Kauê
    orcid: https://orcid.org/0000-0002-7571-7845
    affiliation: "Bioversity International; University of Inland Norway"
  - family-names: Laporte
    given-names: Marie-Angélique
    orcid: https://orcid.org/0000-0002-8461-9745
    affiliation: "Bioversity International"
repository-code: https://github.com/AgrDataSci/template-repo-data-analysis
license: CC-BY-4.0
keywords:
  - data-driven research
  - FAIR principles
  - data management
  - open science

GitHub Events

Total
  • Release event: 2
  • Issue comment event: 1
  • Push event: 31
  • Pull request event: 2
  • Create event: 3
Last Year
  • Release event: 2
  • Issue comment event: 1
  • Push event: 31
  • Pull request event: 2
  • Create event: 3

Committers

Last synced: 12 months ago

All Time
  • Total Commits: 11
  • Total Committers: 1
  • Avg Commits per committer: 11.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • Kauê de Sousa (d****e@g****m): 11 commits

Issues and Pull Requests

Last synced: about 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • marieALaporte (1)