tool-discovery

The CLARIAH tool discovery repository holds the Tool Source Registry and the configuration for the software metadata harvesting pipeline.

https://github.com/clariah/tool-discovery

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 17 committers (11.8%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Keywords from Contributors

linked-data semantic-web linked-data-api open-api sparql swagger-ui
Last synced: 9 months ago

Repository

Basic Info
Statistics
  • Stars: 12
  • Watchers: 6
  • Forks: 8
  • Open Issues: 2
  • Releases: 12
Created about 5 years ago · Last pushed 11 months ago
Metadata Files
Readme Contributing Codemeta

README.md

Project Status: Active. The project has reached a stable, usable state and is being actively developed.

CLARIAH Tool Discovery

This repository contains everything related to tool discovery and software metadata in CLARIAH.

  • Dockerfile: The Docker container for the CLARIAH Tool Discovery pipeline, including both the harvester and the server and API powering the CLARIAH Tool Store.
  • source-registry/: The tool source registry, containing the source repository locations and service endpoints for all CLARIAH tools. This is open for contributions.
  • etc/, static/: Supporting files for the deployment at
  • legacy/cmdi/: Legacy CMDI metadata as gathered in WP3 task MD4T at Utrecht University.

Service

The tool discovery service consists of a harvester that runs at regular intervals (each night) and a tool store. It is deployed at https://tools.clariah.nl (production, may not be available yet at this time!) and https://tools.dev.clariah.nl (development).

All harvested data is also available as individual files via https://tools.dev.clariah.nl/files/

Usage

For CLARIAH (local development):

docker build -t clariah-tool-discovery .
docker run -itd -p 8080:80 --env-file=local-dev.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped clariah-tool-discovery

We recommend that you also pass an extra --env GITHUB_TOKEN=.......... or you will likely hit GitHub's API rate limit during harvesting. Similarly, you can pass a ZENODO_ACCESS_TOKEN.
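
As an alternative to passing tokens on the command line, they can be kept in a separate env file. The following is a minimal sketch, not part of this repository: the file name secrets.env is hypothetical, the token values are placeholders, and it relies on docker run accepting more than one --env-file option. It only prints the resulting docker run command and does not start a container.

```shell
#!/bin/sh
# Hypothetical file name; keep this file out of version control.
SECRETS_FILE="${SECRETS_FILE:-secrets.env}"

# Restrict permissions so that only the current user can read the tokens.
umask 077

# Write the tokens in KEY=value form, one per line, as --env-file expects.
# The fallback values are placeholders, not working tokens.
cat > "$SECRETS_FILE" <<EOF
GITHUB_TOKEN=${GITHUB_TOKEN:-replace-with-your-github-token}
ZENODO_ACCESS_TOKEN=${ZENODO_ACCESS_TOKEN:-replace-with-your-zenodo-token}
EOF

# Print (not execute) a run command that loads both env files.
echo "docker run -itd -p 8080:80 --env-file=local-dev.env --env-file=$SECRETS_FILE --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped clariah-tool-discovery"
```

Keeping the tokens in a file avoids leaving them in the shell history.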

More generic:

docker build -t codemeta-server-tool --build-arg nginx_pass=some_password .
docker run -itd -p 80:80 --env-file=my-env.env --name=cm-srv -v codemeta_volume:/tool-store-data --restart=unless-stopped codemeta-server-tool

To harvest from local YAML source files rather than a remote git repository, add -v $PWD/source-registry/:/usr/src/source-registry/source-registry/ to the docker run command and set LOCAL_SOURCE_REGISTRY=true in my-env.env.
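
The two steps for a local source registry can be sketched as follows. This is an illustration only: the ENV_FILE variable is a hypothetical convenience, and the script prints the docker run command rather than executing it.

```shell
#!/bin/sh
# Env file passed to the container; defaults to the name used in this README.
ENV_FILE="${ENV_FILE:-my-env.env}"

# Append the setting only if the variable is not set in the file yet.
# An existing LOCAL_SOURCE_REGISTRY line is left unchanged.
touch "$ENV_FILE"
grep -q '^LOCAL_SOURCE_REGISTRY=' "$ENV_FILE" || echo 'LOCAL_SOURCE_REGISTRY=true' >> "$ENV_FILE"

# Print (not execute) the generic run command with the extra bind mount
# for the local source-registry directory.
echo "docker run -itd -p 80:80 --env-file=$ENV_FILE --name=cm-srv -v codemeta_volume:/tool-store-data -v $PWD/source-registry/:/usr/src/source-registry/source-registry/ --restart=unless-stopped codemeta-server-tool"
```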

Event-based collection, i.e. allowing clients to push codemeta files, can be enabled by setting --env-arg UPLOADER=true. You can then POST your codemeta.json file with curl -u <nginx-user> -X POST -H "Content-Type: application/json" -d @codemeta.json <url>/rest/
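
A minimal sketch of such an upload follows. The codemeta.json content is a placeholder for illustration, and whether the server accepts a record this small is an assumption. TOOLSTORE_URL and NGINX_USER are hypothetical variables standing in for <url> and <nginx-user>. curl's -d @file form reads the request body from the named file.

```shell
#!/bin/sh
# Write a small codemeta.json for illustration; all values are placeholders.
cat > codemeta.json <<'EOF'
{
  "@context": "https://w3id.org/codemeta/3.0",
  "@type": "SoftwareSourceCode",
  "name": "example-tool",
  "codeRepository": "https://example.org/example-tool.git"
}
EOF

# Upload only when TOOLSTORE_URL (hypothetical variable) is set;
# otherwise print the command that would be used.
if [ -n "${TOOLSTORE_URL:-}" ]; then
  curl -u "${NGINX_USER:?set NGINX_USER first}" -X POST \
       -H "Content-Type: application/json" \
       -d @codemeta.json "$TOOLSTORE_URL/rest/"
else
  echo "curl -u <nginx-user> -X POST -H 'Content-Type: application/json' -d @codemeta.json <url>/rest/"
fi
```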

For a private git repository, add -e GIT_USER='youruser' -e GIT_PASSWORD='yourtoken' to the docker run command.

To clean up, remove the volume codemeta_volume.
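
The cleanup can be sketched as follows, assuming the container name cm-srv and the volume name codemeta_volume from the commands above. The run helper and the DRY_RUN variable are hypothetical additions: by default the commands are only printed. Docker refuses to remove a volume that a container still uses, so the container is stopped and removed first.

```shell
#!/bin/sh
# Print each command by default; set DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run docker stop cm-srv                 # stop the running container
run docker rm cm-srv                   # remove the stopped container
run docker volume rm codemeta_volume   # delete the harvested data volume
```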

Integration: API usage instructions

If you want to query the Tool Store from other software, please read this document for instructions on how to use our SPARQL endpoint.
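
As a rough illustration of what a SPARQL query against the Tool Store might look like, the sketch below writes a query to a file and sends it using the standard SPARQL protocol (URL-encoded POST). The endpoint URL is not given here, so SPARQL_ENDPOINT is a hypothetical variable that you must set yourself. That tools are stored as schema:SoftwareSourceCode resources with a schema:name is an assumption based on the codemeta vocabulary, not something this README confirms.

```shell
#!/bin/sh
# Write the query to a file: list up to ten tools with their names.
# The class and property used here are assumptions about the stored data.
cat > list-tools.rq <<'EOF'
PREFIX schema: <http://schema.org/>
SELECT ?tool ?name WHERE {
  ?tool a schema:SoftwareSourceCode ;
        schema:name ?name .
}
ORDER BY ?name
LIMIT 10
EOF

# Send the query only when SPARQL_ENDPOINT (hypothetical variable) is set.
if [ -n "${SPARQL_ENDPOINT:-}" ]; then
  # --data-urlencode "query@FILE" URL-encodes the file content and
  # sends it as the "query" form field in a POST request.
  curl -s -H "Accept: application/sparql-results+json" \
       --data-urlencode "query@list-tools.rq" "$SPARQL_ENDPOINT"
else
  echo "Set SPARQL_ENDPOINT to the endpoint URL to run list-tools.rq"
fi
```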

Owner

  • Name: CLARIAH
  • Login: CLARIAH
  • Kind: organization

CLARIAH offers humanities scholars a Common Lab providing access to large collections of digital resources and innovative tools for research

CodeMeta (codemeta.json)

{
  "@context": [
    "https://w3id.org/codemeta/3.0",
    "http://schema.org",
    "https://w3id.org/software-types",
    "https://w3id.org/software-iodata"
  ],
  "author": {
    "@id": "https://orcid.org/0000-0002-1046-0006",
    "affiliation": {
      "@id": "https://huc.knaw.nl"
    },
    "familyName": "van Gompel",
    "givenName": "Maarten",
    "url": "https://proycon.anaproy.nl",
    "@type": "Person"
  },
  "@id": "https://github.com/CLARIAH/tool-discovery.git",
  "@type": "SoftwareSourceCode",
  "name": "CLARIAH Tool Discovery",
  "version": "2.0.0",
  "dateCreated": "2022-01-05T16:21:48Z",
  "dateModified": "2025-03-07T12:45:44Z",
  "description": "This is the over-arching project for CLARIAH Tool Discovery, its components harvest and aggregate codemeta from source repositories and service endpoints, automatically converting known metadata schemes in the process. This project holds the Tool Source Registry, pointing to all the tools that are to be harvested. It also holds the validation schema.",
  "developmentStatus": [
    "https://www.repostatus.org/#active",
    "https://w3id.org/research-technology-readiness-levels#Level8Complete"
  ],
  "codeRepository": "https://github.com/CLARIAH/tool-discovery.git",
  "contIntegration": "https://github.com/CLARIAH/tool-discovery/actions/",
  "license": "https://spdx.org/licenses/GPL-3.0-only",
  "identifier": "tool-discovery",
  "issueTracker": "https://github.com/CLARIAH/tool-discovery/issues",
  "readme": "https://github.com/CLARIAH/tool-discovery/blob/master/README.md",
  "softwareHelp": [
    {
      "@type": "HowTo",
      "name": "Contributor Guidelines",
      "description": "This explains how to add your own software to be harvested by us and answers various Frequently Asked Questions.",
      "url": "https://github.com/CLARIAH/tool-discovery/blob/master/CONTRIBUTING.md"
    },
    {
      "@type": "TechArticle",
      "name": "Software Metadata Requirements",
      "description": "This document specifies the technical and organisational requirements for your software metadata, including the precise form in which it can be supplied.",
      "url": "https://github.com/CLARIAH/clariah-plus/blob/main/requirements/software-metadata-requirements.md"
    },
    {
      "@type": "TechArticle",
      "name": "Querying the CLARIAH Tool Store for integration in other software",
      "description": "This document is intended for other developers that want to make use of the information that has been harvested and aggregated in the tool discovery pipeline. It explains the API available in our backend.",
      "url": "https://github.com/CLARIAH/tool-discovery/blob/master/API_USAGE.md"
    }
  ],
  "applicationCategory": [
    "https://vocabs.dariah.eu/tadirah/exploration",
    "https://vocabs.dariah.eu/tadirah/browsing",
    "https://vocabs.dariah.eu/tadirah/discovering",
    "https://vocabs.dariah.eu/tadirah/gathering",
    "https://w3id.org/nwo-research-fields#SoftwareForHumanities",
    "https://w3id.org/nwo-research-fields#DatabasesForHumanities"
  ],
  "keywords": [
    "metadata",
    "codemeta",
    "rdf",
    "software metadata",
    "schema.org",
    "linked data",
    "harvester"
  ],
  "maintainer": {
    "@id": "https://orcid.org/0000-0002-1046-0006",
    "affiliation": {
      "@id": "https://huc.knaw.nl"
    },
    "familyName": "van Gompel",
    "givenName": "Maarten",
    "url": "https://proycon.anaproy.nl",
    "@type": "Person"
  },
  "producer": {
    "@id": "https://huc.knaw.nl",
    "@type": "Organization",
    "name": "KNAW Humanities Cluster",
    "url": "https://huc.knaw.nl",
    "parentOrganization": {
      "@id": "https://knaw.nl",
      "@type": "Organization",
      "name": "KNAW",
      "url": "https://knaw.nl",
      "location": {
        "@type": "Place",
        "name": "Amsterdam"
      }
    }
  },
  "programmingLanguage": "shell",
  "screenshot": [
    "https://raw.githubusercontent.com/proycon/codemeta-server/master/screenshot_index_cards.jpg",
    "https://raw.githubusercontent.com/proycon/codemeta-server/master/screenshot_index_table.jpg",
    "https://raw.githubusercontent.com/proycon/codemeta-server/master/screenshot_page.jpg",
    "https://raw.githubusercontent.com/proycon/codemeta-server/master/screenshot_sparql.jpg"
  ],
  "softwareRequirements": [
    {
      "@id": "/dependency/codemetapy",
      "@type": "SoftwareApplication",
      "identifier": "codemetapy",
      "name": "codemetapy",
      "runtimePlatform": "Python 3"
    },
    {
      "@id": "/dependency/codemeta-harvester",
      "@type": "SoftwareApplication",
      "identifier": "codemeta-harvester",
      "name": "codemeta-harvester"
    },
    {
      "@id": "/dependency/codemeta-server",
      "@type": "WebApplication",
      "identifier": "codemeta-server",
      "name": "codemeta-server"
    }
  ],
  "isSourceCodeOf": {
    "@type": "WebApplication",
    "name": "CLARIAH Tools",
    "url": "https://tools.clariah.nl",
    "description": "This is a web portal where you can find all tools (i.e. software and software services) developed in the CLARIAH project, as well as some tools from predecessors and sister projects. This list is automatically harvested from the tool producers and providers themselves, and updated daily. Our tools are designed for researchers and developers in the Humanities and Social Sciences. Not all tools are suitable for all audiences and not all tools are mature and stable, this information should be clearly indicated for each tool, so you can make an informed judgement whether a tool might be suitable for you.",
    "provider": "https://huc.knaw.nl"
  },
  "funding": {
    "@type": "Grant",
    "name": "CLARIAH-PLUS (NWO grant 184.034.023)",
    "funder": {
      "@type": "Organization",
      "name": "NWO",
      "url": "https://www.nwo.nl"
    }
  }
}

GitHub Events

Total
  • Issues event: 1
  • Watch event: 3
  • Issue comment event: 15
  • Push event: 33
  • Pull request event: 22
  • Fork event: 6
  • Create event: 1
Last Year
  • Issues event: 1
  • Watch event: 3
  • Issue comment event: 15
  • Push event: 33
  • Pull request event: 22
  • Fork event: 6
  • Create event: 1

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 293
  • Total Committers: 17
  • Avg Commits per committer: 17.235
  • Development Distribution Score (DDS): 0.143
Past Year
  • Commits: 32
  • Committers: 2
  • Avg Commits per committer: 16.0
  • Development Distribution Score (DDS): 0.031
Top Committers
Name                   Email           Commits
Maarten van Gompel     p****n@a****l   251
Jaap Blom              j****m@b****l   8
mlongobardo-gituname   u****n          6
Dirk Roorda            t****n@i****m   5
David de Boer          d****d@d****l   4
Peter Kleiweg          p****g@r****l   3
Bram Buitendijk        b****k@d****l   3
Kathrin Dentler        k****r@t****c   2
Odijk                  j****k@u****l   2
Menzo Windhouwer       m****r@d****l   2
kerim1                 k****r@d****l   1
jessededoes            d****s@x****l   1
Xander Wilcke          w****e@v****l   1
Rik D.T. Janssen       1****n          1
Richard Zijdeman       r****n@i****l   1
Hayco de Jong          h****g@d****l   1
mwigham                3****m          1

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 10
  • Total pull requests: 52
  • Average time to close issues: 19 days
  • Average time to close pull requests: 7 days
  • Total issue authors: 6
  • Total pull request authors: 20
  • Average comments per issue: 1.9
  • Average comments per pull request: 1.1
  • Merged pull requests: 51
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 23
  • Average time to close issues: N/A
  • Average time to close pull requests: 1 day
  • Issue authors: 1
  • Pull request authors: 5
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.61
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • proycon (4)
  • ddeboer (2)
  • pebbe (1)
  • firmao (1)
  • kequach (1)
  • rlzijdeman (1)
Pull Request Authors
  • firmao (16)
  • jblom (7)
  • brambg (3)
  • pebbe (2)
  • JanOdijk (2)
  • xmichele (2)
  • jan-niestadt (2)
  • dirkroorda (2)
  • mwigham (2)
  • menzowindhouwer (2)
  • BeritJanssen (2)
  • ddeboer (2)
  • jrvosse (1)
  • JessedeDoes (1)
  • rlzijdeman (1)
Top Labels
Issue Labels
bug (5) ready (4) question (1)
Pull Request Labels

Dependencies

schemas/shacl/requirements.txt pypi
  • pyshacl *
.github/workflows/shacl.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
Dockerfile docker
  • proycon/codemeta-harvester $BASETAG build