biodt-fair

Documentation, resources, and other materials related to FAIR and open science from the BioDT project.

https://github.com/biodt/biodt-fair

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.4%) to scientific vocabulary

Keywords

biodiversity digital-twin fair open-science ro-crate
Last synced: 4 months ago

Repository

Documentation, resources, and other materials related to FAIR and open science from the BioDT project.

Basic Info
Statistics
  • Stars: 4
  • Watchers: 2
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Topics
biodiversity digital-twin fair open-science ro-crate
Created almost 3 years ago · Last pushed 7 months ago
Metadata Files
Readme License Citation Authors Codemeta

README.md

BioDT - FAIR

About the Project

The Biodiversity Digital Twin (BioDT) project is an EU-funded initiative running from June 2022 to May 2025. The project includes a dedicated Work Package focused on "Improving Quality of Data, Workflows and Models through FAIR Principles".

Throughout the project, we've addressed several aspects of FAIR implementation:

  • Data stream FAIRification
  • Collaboration with research infrastructures (GBIF, DiSSCo, LifeWatch ERIC, eLTER)
  • Semantic mapping
  • Data quality indicators
  • Workflow enhancement

This repository serves as a technical companion to our official project outputs, providing:

  • Working materials and technical resources related to FAIR principles
  • Tools for implementing FAIR across data, software, models, and workflows
  • Open access to resources for the wider biodiversity informatics community
  • Long-term access to these resources beyond the project's conclusion

Publications & Presentations: Available in our Zenodo community.

Official Deliverables: Project milestones and deliverables will be made available after formal approval by the EU.

BioDT FAIR Deliverables and Milestones

Image created for the BioDT project by @juliancervos

Repository content

The information in this repository is distributed across different sections and pages:

  • The Documentation page: detailed information about the RO-Crate metadata profiles developed within BioDT.
  • The Issues and Pull requests tabs: as with any Git code repository, these are for collaborating on the materials in this repository. Issues can be opened to start a discussion on any topic concerning FAIR in BioDT, while Pull Requests capture discussions around specific code contributions.

Prototype Digital Twin (pDT) directories

Each prototype Digital Twin (pDT) has its own dedicated directory in this repository where FAIR-related materials are stored, including metadata descriptions for digital objects under development. For the most current FAIR metadata information, please refer to each pDT's individual repository within the BioDT GitHub organization.

The pDTs and their associated datasets and models vary in maturity (for details, see the papers in the RIO Journal collection). Throughout the project, we have identified key challenges and areas for FAIR improvement, implemented solutions within BioDT in collaboration with data providers and research infrastructures, and documented requirements for future collaborations and resource allocation to fully implement FAIR principles.

Example FAIRification: Grassland pDT

The Grassland pDT demonstrates our practical FAIR implementation approach. eLTER grassland data was collated, harmonised, and published on B2Share alongside a static description of the Grassmind model. Using the RO-Crate Python package ro-crate-py, we developed a script to automatically generate RO-Crate metadata files for the datasets relevant to the project, linking them to the model description. These RO-Crate files received their own Persistent Identifiers on B2Share (e.g., http://hdl.handle.net/11304/23a8d7d8-07bb-4405-a01b-96efa3bb09b0).
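As a rough illustration of the kind of metadata file such a script produces, here is a minimal sketch that builds an RO-Crate 1.1 `ro-crate-metadata.json` document by hand. It deliberately uses only the Python standard library instead of ro-crate-py, so it is not the project's script. The function name `build_crate`, the `example.org` URLs, the use of `mentions` to link to the model description, and the entity types are all assumptions made for illustration.

```python
import json

RO_CRATE_CONTEXT = "https://w3id.org/ro/crate/1.1/context"
RO_CRATE_SPEC = "https://w3id.org/ro/crate/1.1"


def build_crate(name, description, date_published, license_url,
                dataset_urls, model_url):
    """Return an RO-Crate 1.1 metadata document as a plain dict."""
    # Metadata descriptor: identifies this file and the spec it conforms to.
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": RO_CRATE_SPEC},
        "about": {"@id": "./"},
    }
    # Root data entity: describes the crate as a whole.
    root = {
        "@id": "./",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "datePublished": date_published,
        "license": {"@id": license_url},
        "hasPart": [{"@id": url} for url in dataset_urls],
        # Assumption: "mentions" is used here to point at the model description.
        "mentions": {"@id": model_url},
    }
    # One entity per published dataset, plus one for the model description.
    datasets = [{"@id": url, "@type": "Dataset"} for url in dataset_urls]
    model = {"@id": model_url, "@type": "SoftwareApplication"}
    return {
        "@context": RO_CRATE_CONTEXT,
        "@graph": [descriptor, root, *datasets, model],
    }


# Hypothetical inputs; none of these URLs refer to real BioDT records.
crate = build_crate(
    name="Example grassland crate",
    description="Illustrative crate linking datasets to a model description.",
    date_published="2024-01-01",
    license_url="https://creativecommons.org/licenses/by/4.0/",
    dataset_urls=["https://example.org/data/site-a",
                  "https://example.org/data/site-b"],
    model_url="https://example.org/models/grassland-model",
)
metadata_json = json.dumps(crate, indent=2)
```

Writing `metadata_json` to a file named `ro-crate-metadata.json` next to the data would give a crate that generic RO-Crate tooling can read; a library such as ro-crate-py handles these structural details automatically.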

fdo_profiles/ directory

This folder stores the metadata profiles (and related materials) used in BioDT, which are closely related to the FAIR Digital Objects (FDO) and RO-Crate frameworks. The profiles are explained in detail in the documentation for the metadata profiles. The profile work has focused on:

  • Kernel attributes: the attributes that apply to all digital objects in BioDT, regardless of their purpose. They cover fundamental metadata such as IDs, type, author, and license.
  • Model attributes: the models/ subdirectory covers the metadata for the main software from each pDT.
  • Dataset attributes: the datasets/ subdirectory includes the profiles for the different types of data used in BioDT.
  • Workflow attributes: the workflows/ subdirectory focuses on the elements that bring everything together (connecting the data to the models, sending jobs to HPC, etc).
  • Mapping Set attributes: the mapping-sets/ subdirectory contains resources for mappings between semantic artefacts.
  • Additionally, other auxiliary resources can be found in the other/ directory.
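To make the idea of kernel attributes concrete, the following sketch checks whether a metadata entity carries a small set of attributes that every digital object is expected to have. The attribute names in `KERNEL_ATTRIBUTES` are an assumption derived from the examples listed above (ID, type, author, license); the actual BioDT profiles are defined in the metadata profile documentation and may require more or different attributes. The function name and example entities are hypothetical.

```python
# Assumed kernel attribute set, based only on the examples named in this README.
KERNEL_ATTRIBUTES = ("@id", "@type", "author", "license")


def missing_kernel_attributes(entity):
    """Return the kernel attributes that are absent or empty in an entity."""
    return [key for key in KERNEL_ATTRIBUTES if not entity.get(key)]


# Hypothetical entities for illustration only.
complete_entity = {
    "@id": "https://example.org/objects/1",
    "@type": "Dataset",
    "author": {"@id": "https://orcid.org/0000-0000-0000-0000"},
    "license": {"@id": "https://creativecommons.org/licenses/by/4.0/"},
}
incomplete_entity = {
    "@id": "https://example.org/objects/2",
    "@type": "SoftwareApplication",
}

complete_missing = missing_kernel_attributes(complete_entity)
incomplete_missing = missing_kernel_attributes(incomplete_entity)
```

Because the check only depends on a tuple of attribute names, the same function could be reused for type-specific profiles by passing a different attribute list.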

examples/ directory

This directory contains some materials that have been developed mainly for illustrative purposes. For example:

  • leipzig_workshop/: This subdirectory contains a Jupyter notebook and some metadata files used during the BioDT workshop in Leipzig, Nov 2023. It aims to give an introduction to FAIR and RO-Crate in the context of data for BioDT.
  • dataset_ro-crate.ipynb: Walks through how to turn an existing dataset into an RO-Crate, with descriptions of the main elements of RO-Crate and how some FAIR principles are achieved. To check it out, simply click on the file and go through the text.
  • fdo_examples_basic.ipynb: Short illustration of what FDOs can enable within BioDT, developed as an example for the MS26 milestone. To be further extended with more content (e.g. an RO-Crate example for collection records).
  • fdo_definitions.py: To support the previous notebooks, this contains some example class definitions of FAIR Digital Objects (FDOs) for BioDT. This will be further developed as the project progresses to reflect our understanding of how FDOs can function within BioDT.
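For readers unfamiliar with the idea of modelling FDOs as classes, here is a minimal sketch of what such a definition could look like. It is not the content of fdo_definitions.py: the class names, fields, and the placeholder PID are hypothetical, and the only external fact relied on is that Handle PIDs can be resolved through the `hdl.handle.net` proxy.

```python
from dataclasses import dataclass, field

HANDLE_PROXY = "https://hdl.handle.net/"


@dataclass(frozen=True)
class FairDigitalObject:
    """A digital object identified by a Handle PID, with typed metadata."""
    pid: str
    object_type: str
    metadata: dict = field(default_factory=dict)

    def resolver_url(self):
        # A Handle PID resolves through the public Handle proxy.
        return HANDLE_PROXY + self.pid

    def to_jsonld(self):
        # Flatten the object into a JSON-LD style entity.
        return {"@id": self.resolver_url(),
                "@type": self.object_type,
                **self.metadata}


@dataclass(frozen=True)
class DatasetFDO(FairDigitalObject):
    """Hypothetical specialisation whose type defaults to Dataset."""
    object_type: str = "Dataset"


# Placeholder PID; it does not refer to a real record.
example_fdo = DatasetFDO(pid="00.00000/example-dataset",
                         metadata={"name": "Example dataset"})
record = example_fdo.to_jsonld()
```

Keeping the PID and type as first-class fields, with everything else in a metadata mapping, mirrors the split between kernel attributes and type-specific attributes described for the profiles.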

Usage

This repo contains mostly JSON metadata files and isolated Python scripts taken from other code repositories. Any relevant software dependencies needed to run such scripts can be installed using Poetry (see pyproject.toml).

RO-Crate

The RO-Crate framework has been adopted in the BioDT project to build FAIR Digital Twins and to address the challenges of packaging and describing different digital objects in a machine-actionable and interoperable way. We have created profiles (https://biodt.github.io/biodt-fair/metadata_profiles) which consist of a number of metadata attributes, designed to strike a balance between providing enough details about the digital object they are describing, while remaining as minimal as possible. Most of the attributes come from Schema.org (the standard vocabulary that RO-Crate relies on), yet other initiatives and community standards have been taken into consideration for the attribute selection. The metadata structure provides detailed provenance, including authorship, licensing, and also more type-specific information, such as spatial and temporal coverage (in the case of datasets) or software version and requirements (for models).
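The split between shared provenance attributes and type-specific ones can be sketched as follows. The property names `spatialCoverage`, `temporalCoverage`, `softwareVersion`, and `softwareRequirements` are standard Schema.org terms, but the helper `provenance`, the chosen entity types, and all values are assumptions for illustration; the actual BioDT profiles may select different attributes.

```python
def provenance(identifier, entity_type, name, author_id, license_url):
    """Attributes shared by every entity: identity, authorship, licensing."""
    return {
        "@id": identifier,
        "@type": entity_type,
        "name": name,
        "author": {"@id": author_id},
        "license": {"@id": license_url},
    }


# Hypothetical identifiers and values throughout.
AUTHOR = "https://orcid.org/0000-0000-0000-0000"
LICENSE = "https://creativecommons.org/licenses/by/4.0/"

# Dataset: shared attributes plus spatial and temporal coverage.
dataset_entity = {
    **provenance("https://example.org/data/site-a", "Dataset",
                 "Example site observations", AUTHOR, LICENSE),
    "spatialCoverage": {"@id": "#example-site"},
    "temporalCoverage": "2010-01-01/2020-12-31",  # ISO 8601 interval
}
# The place referenced by spatialCoverage, kept as its own flat entity.
place_entity = {"@id": "#example-site", "@type": "Place",
                "name": "Example grassland site"}

# Model: shared attributes plus software version and requirements.
model_entity = {
    **provenance("https://example.org/models/grassland-model",
                 "SoftwareApplication", "Example grassland model",
                 AUTHOR, LICENSE),
    "softwareVersion": "1.0.0",
    "softwareRequirements": "Python 3",
}

# Attributes that appear on both entity types.
shared_keys = sorted(set(dataset_entity) & set(model_entity))
```

Here `shared_keys` contains only the attributes returned by `provenance`, which shows how a small common core can coexist with per-type extensions.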

Example RO-Crate Profiles

Image created for the BioDT project by @juliancervos

License

European Union Public Licence v. 1.2

Owner

  • Name: BioDT
  • Login: BioDT
  • Kind: organization

Horizon EU Biodiversity Digital Twin

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: biodt-fair
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Julian
    family-names: Lopez Gordillo
    email: julian.lopezgordillo@naturalis.nl
    affiliation: Naturalis Biodiversity Center
    orcid: 'https://orcid.org/0000-0003-0401-5122'
  - given-names: Sharif
    family-names: Islam
    email: sharif.islam@naturalis.nl
    affiliation: Naturalis Biodiversity Center
    orcid: 'https://orcid.org/0000-0001-8050-0299'
  - given-names: Christoph
    family-names: Wohner
    orcid: 'https://orcid.org/0000-0002-0655-3699'
    email: christoph.wohner@umweltbundesamt.at
    affiliation: Environment Agency Austria
identifiers:
  - type: swh
    value: 'swh:1:dir:d7503dfd028fc233b929dd5fdd3e5bee3dd90805'
repository-code: 'https://github.com/BioDT/biodt-fair'
url: 'https://biodt.github.io/biodt-fair/'
abstract: >-
  Documentation, resources, and other materials related to
  FAIR and open science from the BioDT project.
keywords:
  - biodiversity
  - fair
  - digital-twin
  - ro-crate
  - open-science
license: EUPL-1.2

CodeMeta (codemeta.json)

{
  "@context": "https://w3id.org/codemeta/3.0",
  "type": "SoftwareSourceCode",
  "applicationCategory": "Biodiversity",
  "author": [
    {
      "id": "https://orcid.org/0000-0003-0401-5122",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Naturalis Biodiversity Center"
      },
      "email": "julian.lopezgordillo@naturalis.nl",
      "familyName": "Lopez Gordillo",
      "givenName": "Julian"
    }
  ],
  "codeRepository": "https://github.com/BioDT/biodt-fair",
  "contributor": [
    {
      "id": "https://orcid.org/0000-0001-8050-0299",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Naturalis Biodiversity Center"
      },
      "email": "sharif.islam@naturalis.nl",
      "familyName": "Islam",
      "givenName": "Sharif"
    },
    {
      "id": "https://orcid.org/0000-0002-0655-3699",
      "type": "Person",
      "affiliation": {
        "type": "Organization",
        "name": "Environment Agency Austria"
      },
      "email": "christoph.wohner@umweltbundesamt.at",
      "familyName": "Wohner",
      "givenName": "Christoph"
    }
  ],
  "description": "Documentation, resources, and other materials related to FAIR and open science from the BioDT project.",
  "funder": {
    "type": "Organization",
    "name": "European Commission"
  },
  "identifier": "https://archive.softwareheritage.org/swh:1:dir:d7503dfd028fc233b929dd5fdd3e5bee3dd90805",
  "keywords": [
    "biodiversity",
    "fair",
    "digital-twin",
    "ro-crate"
  ],
  "license": "https://spdx.org/licenses/EUPL-1.2",
  "name": "biodt-fair",
  "programmingLanguage": "Python 3",
  "developmentStatus": "inactive",
  "funding": "HORIZON-RIA"
}

GitHub Events

Total
  • Issues event: 5
  • Delete event: 2
  • Issue comment event: 5
  • Push event: 53
  • Create event: 2
Last Year
  • Issues event: 5
  • Delete event: 2
  • Issue comment event: 5
  • Push event: 53
  • Create event: 2

Committers

Last synced: almost 2 years ago

All Time
  • Total Commits: 20
  • Total Committers: 1
  • Avg Commits per committer: 20.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 20
  • Committers: 1
  • Avg Commits per committer: 20.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Julian Lopez Gordillo j****v@g****m 20

Issues and Pull Requests

Last synced: almost 2 years ago

All Time
  • Total issues: 3
  • Total pull requests: 0
  • Average time to close issues: 10 days
  • Average time to close pull requests: N/A
  • Total issue authors: 2
  • Total pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 3
  • Pull requests: 0
  • Average time to close issues: 10 days
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 0.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • juliancervos (4)
  • sharifX (2)
  • jgrieb (1)
Pull Request Authors
  • sharifX (1)
Top Labels
Issue Labels
question (1) pdt:ces (1) ri:gbif (1) vocabs (1)
Pull Request Labels