fairscape-cli
Data Validation and Packaging utility for sending evidence graphs to FAIRSCAPE
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (13.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: fairscape
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://fairscape.github.io/fairscape-cli/
- Size: 30.5 MB
Statistics
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
- Releases: 4
Metadata Files
README.md
fairscape-cli
A utility for packaging objects and validating metadata for FAIRSCAPE.
Documentation: https://fairscape.github.io/fairscape-cli/
Features
fairscape-cli provides a command-line interface (CLI) for creating, managing, and publishing scientific data packages:
- RO-Crate Management: Create and manipulate RO-Crate packages locally.
  - Initialize RO-Crates in new or existing directories.
  - Add data, software, and computation metadata.
  - Copy files into the crate structure alongside metadata registration.
- Schema Handling: Define, infer, and validate data schemas (Tabular, HDF5).
  - Create schema definition files.
  - Add properties with constraints.
  - Infer schemas directly from data files.
  - Validate data files against specified schemas.
  - Register schemas within RO-Crates.
- Data Import: Fetch data from external sources and convert them into RO-Crates.
  - Import NCBI BioProjects.
  - Convert Portable Encapsulated Projects (PEPs) to RO-Crates.
- Build Artifacts: Generate derived outputs from RO-Crates.
  - Create detailed HTML datasheets summarizing crate contents.
  - Generate provenance evidence graphs (JSON and HTML).
- Release Management: Organize multiple related RO-Crates into a cohesive release package.
  - Initialize a release structure.
  - Automatically link sub-crates and propagate metadata.
  - Build a top-level datasheet for the release.
- Publishing: Publish RO-Crate metadata to external repositories.
  - Upload RO-Crate directories or zip files to Fairscape.
  - Create datasets on Dataverse instances.
  - Mint or update DOIs on DataCite.
Requirements
Python 3.8+
Installation
```console
$ pip install fairscape-cli
```
Command Overview
The CLI is organized into several top-level commands:
- rocrate: Core local RO-Crate manipulation (create, add files/metadata).
- schema: Operations on data schemas (create, infer, add properties, add to crate).
- validate: Validate data against schemas.
- import: Fetch external data into RO-Crate format (e.g., bioproject, pep).
- build: Generate outputs from RO-Crates (e.g., datasheet, evidence-graph).
- release: Manage multi-part RO-Crate releases (e.g., create, build).
- publish: Publish RO-Crates to repositories (e.g., fairscape, dataverse, doi).
Use --help for details on any command or subcommand:
```console
$ fairscape-cli --help
$ fairscape-cli rocrate --help
$ fairscape-cli rocrate add --help
$ fairscape-cli schema create --help
```
Examples
Creating an RO-Crate
Create an RO-Crate in a specified directory:
```console
$ fairscape-cli rocrate create \
    --name "My Analysis Crate" \
    --description "RO-Crate containing analysis scripts and results" \
    --organization-name "My Org" \
    --project-name "My Project" \
    --keywords "analysis" \
    --keywords "python" \
    --author "Jane Doe" \
    --version "1.1.0" \
    ./my_analysis_crate
```
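When crate creation is scripted rather than typed by hand, the repeated `--keywords` flag is the part most easily gotten wrong. The following is a minimal sketch, assuming only the flags shown in the example above; `build_create_command` is a hypothetical helper, not part of fairscape-cli, and it only assembles the argument list without running anything.

```python
import shlex
from typing import List, Sequence


def build_create_command(
    crate_dir: str,
    name: str,
    description: str,
    organization_name: str,
    project_name: str,
    keywords: Sequence[str],
) -> List[str]:
    """Assemble the argv for `fairscape-cli rocrate create` (hypothetical helper)."""
    argv = [
        "fairscape-cli", "rocrate", "create",
        "--name", name,
        "--description", description,
        "--organization-name", organization_name,
        "--project-name", project_name,
    ]
    # --keywords is passed once per keyword, as in the console example.
    for keyword in keywords:
        argv += ["--keywords", keyword]
    # The crate directory is the final positional argument.
    argv.append(crate_dir)
    return argv


command = build_create_command(
    "./my_analysis_crate",
    name="My Analysis Crate",
    description="RO-Crate containing analysis scripts and results",
    organization_name="My Org",
    project_name="My Project",
    keywords=["analysis", "python"],
)
# Print a shell-quoted form for inspection; pass `command` to subprocess.run to execute it.
print(shlex.join(command))
```

Building the command as a list avoids shell-quoting problems with values that contain spaces, such as the crate name.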
Initialize an RO-Crate in the current working directory:
```console
# Navigate to an empty directory first if desired
$ mkdir my_analysis_crate && cd my_analysis_crate

$ fairscape-cli rocrate init \
    --name "My Analysis Crate" \
    --description "RO-Crate containing analysis scripts and results" \
    --organization-name "My Org" \
    --project-name "My Project" \
    --keywords "analysis" \
    --keywords "python"
```
Adding Content and Metadata to an RO-Crate
Two subcommands are available: add copies a file into the crate and records its metadata, while register records the metadata only.
Add a dataset file and its metadata:
```console
$ fairscape-cli rocrate add dataset \
    --name "Raw Measurements" \
    --author "John Smith" \
    --version "1.0" \
    --date-published "2023-10-27" \
    --description "Raw sensor measurements from Experiment A." \
    --keywords "raw-data" \
    --keywords "sensors" \
    --data-format "csv" \
    --source-filepath "./source_data/measurements.csv" \
    --destination-filepath "data/measurements.csv" \
    ./my_analysis_crate
```
Add a software script file and its metadata:
```console
$ fairscape-cli rocrate add software \
    --name "Analysis Script" \
    --author "Jane Doe" \
    --version "1.1.0" \
    --description "Python script for processing raw measurements." \
    --keywords "analysis" \
    --keywords "python" \
    --file-format "py" \
    --source-filepath "./scripts/process_data.py" \
    --destination-filepath "scripts/process_data.py" \
    ./my_analysis_crate
```
Register computation metadata (metadata only):
```console
# Assuming the script and dataset were added previously and have GUIDs:
# Dataset GUID: ark:59852/dataset-raw-measurements-xxxx
# Software GUID: ark:59852/software-analysis-script-yyyy

$ fairscape-cli rocrate register computation \
    --name "Data Processing Run" \
    --run-by "Jane Doe" \
    --date-created "2023-10-27T14:30:00Z" \
    --description "Execution of the analysis script on the raw measurements." \
    --keywords "processing" \
    --used-dataset "ark:59852/dataset-raw-measurements-xxxx" \
    --used-software "ark:59852/software-analysis-script-yyyy" \
    --generated "ark:59852/dataset-processed-results-zzzz" \
    ./my_analysis_crate

# Note: You would typically register the generated dataset ('processed-results') separately.
```
Register dataset metadata (metadata only, file assumed present or external):
```console
$ fairscape-cli rocrate register dataset \
    --name "Processed Results" \
    --guid "ark:59852/dataset-processed-results-zzzz" \
    --author "Jane Doe" \
    --version "1.0" \
    --description "Processed results from the analysis script." \
    --keywords "results" \
    --data-format "csv" \
    --filepath "results/processed.csv" \
    --generated-by "ark:59852/computation-data-processing-run-wwww" \
    ./my_analysis_crate
```
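The used-dataset, used-software, generated, and generated-by options link the registered items into a provenance chain. The sketch below illustrates how such links can be followed backwards from a result to its inputs. It is a simplified illustration only: the `RECORDS` dictionary, its key names, and `trace_inputs` are hypothetical, and they do not reflect the EVI or RO-Crate property names or how fairscape-cli stores metadata. The GUIDs are the placeholder values from the examples above.

```python
from typing import Dict, List

RAW = "ark:59852/dataset-raw-measurements-xxxx"
SCRIPT = "ark:59852/software-analysis-script-yyyy"
RUN = "ark:59852/computation-data-processing-run-wwww"
RESULTS = "ark:59852/dataset-processed-results-zzzz"

# Hypothetical, simplified records keyed by GUID (illustrative key names).
RECORDS: Dict[str, dict] = {
    RAW: {"kind": "dataset"},
    SCRIPT: {"kind": "software"},
    RUN: {"kind": "computation", "used": [RAW, SCRIPT]},
    RESULTS: {"kind": "dataset", "generated_by": RUN},
}


def trace_inputs(guid: str) -> List[str]:
    """Follow generated_by and used links back from an item, depth first."""
    record = RECORDS[guid]
    lineage = [guid]
    if "generated_by" in record:
        lineage += trace_inputs(record["generated_by"])
    for used_guid in record.get("used", []):
        lineage += trace_inputs(used_guid)
    return lineage


lineage = trace_inputs(RESULTS)
for item in lineage:
    print(RECORDS[item]["kind"], item)
```

Starting from the processed results, the walk visits the computation first and then the dataset and software it used.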
Schema Management
Create a tabular schema definition file:
```console
$ fairscape-cli schema create \
    --name 'Measurement Schema' \
    --description 'Schema for raw sensor measurements' \
    --schema-type tabular \
    --separator ',' \
    --header true \
    ./measurement_schema.json
```
Add properties to the tabular schema file:
```console
# Add a string property (column 0)
$ fairscape-cli schema add-property string \
    --name 'Timestamp' \
    --index 0 \
    --description 'Measurement time (ISO8601)' \
    ./measurement_schema.json

# Add a number property (column 1)
$ fairscape-cli schema add-property number \
    --name 'Value' \
    --index 1 \
    --description 'Sensor reading' \
    --minimum 0 \
    ./measurement_schema.json
```
Infer a schema from an existing data file:
```console
$ fairscape-cli schema infer \
    --name "Inferred Results Schema" \
    --description "Schema inferred from processed results" \
    ./my_analysis_crate/results/processed.csv \
    ./processed_schema.json
```
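To give a sense of what inferring a schema from a tabular file involves, here is a minimal sketch that guesses one type per column by trying to parse every value. This is an illustration under stated assumptions, not the algorithm fairscape-cli uses: the type names ("integer", "number", "string") and the functions `classify` and `infer_column_types` are hypothetical.

```python
import csv
import io
from typing import Dict


def classify(value: str) -> str:
    """Return the narrowest illustrative type that can represent the value."""
    try:
        int(value)
        return "integer"
    except ValueError:
        pass
    try:
        float(value)
        return "number"
    except ValueError:
        return "string"


def infer_column_types(text: str, separator: str = ",") -> Dict[str, str]:
    """Infer one type per column from CSV text whose first row is a header."""
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    header, data = rows[0], rows[1:]
    inferred = {}
    for index, name in enumerate(header):
        kinds = {classify(row[index]) for row in data}
        if kinds <= {"integer"}:
            inferred[name] = "integer"
        elif kinds <= {"integer", "number"}:
            # A mix of integers and decimals widens to a general number.
            inferred[name] = "number"
        else:
            inferred[name] = "string"
    return inferred


sample = (
    "Timestamp,Value,Count\n"
    "2023-10-27T14:30:00Z,1.5,3\n"
    "2023-10-27T14:31:00Z,2,4\n"
)
inferred = infer_column_types(sample)
print(inferred)
```

An inferred schema is only as good as the sample it was derived from, so it is worth reviewing the generated file before validating other data against it.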
Add an existing schema file to an RO-Crate:
```console
$ fairscape-cli schema add-to-crate \
    ./measurement_schema.json \
    ./my_analysis_crate
```
Validation
Validate a data file against a schema file:
```console
# Successful validation
$ fairscape-cli validate schema \
    --schema-path ./measurement_schema.json \
    --data-path ./my_analysis_crate/data/measurements.csv

# Example failure
$ fairscape-cli validate schema \
    --schema-path ./measurement_schema.json \
    --data-path ./source_data/measurements_invalid.csv
```
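Conceptually, validating a tabular file means checking each row against the constraints attached to each column. The sketch below shows this for the two properties defined earlier (a string at index 0 and a number at index 1 with a minimum of 0). It is not fairscape-cli's implementation: the `COLUMNS` structure is a hypothetical in-memory stand-in and does not mirror the on-disk schema file format, and `validate_rows` is an illustrative name.

```python
import csv
import io
from typing import List

# Hypothetical constraints mirroring the Timestamp and Value properties.
COLUMNS = [
    {"name": "Timestamp", "index": 0, "type": "string"},
    {"name": "Value", "index": 1, "type": "number", "minimum": 0},
]


def validate_rows(text: str, separator: str = ",", header: bool = True) -> List[str]:
    """Return one human-readable message per violated constraint."""
    errors = []
    rows = list(csv.reader(io.StringIO(text), delimiter=separator))
    if header:
        rows = rows[1:]  # skip the header row
    for row_number, row in enumerate(rows, start=1):
        for column in COLUMNS:
            if column["index"] >= len(row):
                errors.append(f"row {row_number}: missing column {column['name']}")
                continue
            cell = row[column["index"]]
            if column["type"] == "number":
                try:
                    value = float(cell)
                except ValueError:
                    errors.append(f"row {row_number}: {column['name']} is not a number")
                    continue
                minimum = column.get("minimum")
                if minimum is not None and value < minimum:
                    errors.append(f"row {row_number}: {column['name']} is below {minimum}")
    return errors


valid = "Timestamp,Value\n2023-10-27T14:30:00Z,1.5\n"
invalid = "Timestamp,Value\n2023-10-27T14:31:00Z,-2\n2023-10-27T14:32:00Z,abc\n"
valid_errors = validate_rows(valid)
invalid_errors = validate_rows(invalid)
print(valid_errors)
print(invalid_errors)
```

Collecting all violations rather than stopping at the first one makes it easier to fix a data file in a single pass.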
Importing Data
Import an NCBI BioProject into a new RO-Crate:
```console
$ fairscape-cli import bioproject \
    --accession PRJNA123456 \
    --author "Importer Name" \
    --output-dir ./bioproject_prjna123456_crate \
    --crate-name "Imported BioProject PRJNA123456"
```
Convert a PEP project to an RO-Crate:
```console
$ fairscape-cli import pep \
    ./path/to/my_pep_project \
    --output-path ./my_pep_rocrate \
    --crate-name "My PEP Project Crate"
```
Building Outputs
Generate an HTML datasheet for an RO-Crate:
```console
$ fairscape-cli build datasheet ./my_analysis_crate
# Output will be ./my_analysis_crate/ro-crate-datasheet.html by default
```
Generate a provenance graph for a specific item within the crate:
```console
# Assuming 'ark:59852/dataset-processed-results-zzzz' is the item of interest
$ fairscape-cli build evidence-graph \
    ./my_analysis_crate \
    ark:59852/dataset-processed-results-zzzz \
    --output-json ./my_analysis_crate/prov/results_prov.json \
    --output-html ./my_analysis_crate/prov/results_prov.html
```
Release Management
Create the structure for a multi-part release:
```console
$ fairscape-cli release create \
    --name "My Big Release Q4 2023" \
    --description "Combined release of Experiment A and Experiment B crates" \
    --organization-name "My Org" \
    --project-name "Overall Project" \
    --keywords "release" \
    --keywords "experiment-a" \
    --keywords "experiment-b" \
    --version "2.0" \
    --author "Release Manager" \
    --publisher "My Org Publishing" \
    ./my_big_release

# Manually copy or move your individual RO-Crate directories (e.g., experiment_a_crate, experiment_b_crate)
# into the ./my_big_release directory now.
```
Build the release (link sub-crates, update metadata, generate datasheet):
```console
$ fairscape-cli release build ./my_big_release
```
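Because sub-crates are copied into the release directory by hand, it can help to confirm which directories actually look like RO-Crates before building. The following is a minimal pre-flight sketch, assuming a sub-crate is any immediate subdirectory containing an ro-crate-metadata.json file; `find_sub_crates` is a hypothetical helper and does not represent how fairscape-cli discovers sub-crates.

```python
import tempfile
from pathlib import Path
from typing import List


def find_sub_crates(release_dir: Path) -> List[Path]:
    """List immediate subdirectories that contain an ro-crate-metadata.json file."""
    return sorted(
        child
        for child in release_dir.iterdir()
        if child.is_dir() and (child / "ro-crate-metadata.json").is_file()
    )


# Demonstrate on a throwaway directory laid out like the release example.
with tempfile.TemporaryDirectory() as tmp:
    release = Path(tmp)
    for crate_name in ("experiment_a_crate", "experiment_b_crate"):
        (release / crate_name).mkdir()
        (release / crate_name / "ro-crate-metadata.json").write_text("{}")
    # A directory without metadata is not reported as a sub-crate.
    (release / "notes").mkdir()
    found = [path.name for path in find_sub_crates(release)]

print(found)
```

A directory missing from the result usually means its metadata file was not copied along with the data.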
Publishing
Upload an RO-Crate to Fairscape:
```console
# Ensure FAIRSCAPE_USERNAME and FAIRSCAPE_PASSWORD are set as environment variables or use options
$ fairscape-cli publish fairscape \
    --rocrate ./my_analysis_crate \
    --username

# Works with either directories or zip files
$ fairscape-cli publish fairscape \
    --rocrate ./my_analysis_crate.zip \
    --username
```
Publish RO-Crate metadata to Dataverse:
```console
# Ensure DATAVERSE_API_TOKEN is set as an environment variable or use --token
$ fairscape-cli publish dataverse \
    --rocrate ./my_analysis_crate/ro-crate-metadata.json \
    --url https://my.dataverse.instance.edu \
    --collection my_collection_alias \
    --token
```
Mint a DOI using DataCite:
```console
# Ensure DATACITE_USERNAME and DATACITE_PASSWORD are set or use options
$ fairscape-cli publish doi \
    --rocrate ./my_analysis_crate/ro-crate-metadata.json \
    --prefix 10.1234 \
    --username MYORG.MYREPO \
    --password
```
Contribution
If you'd like to request a feature or report a bug, please create a GitHub Issue using one of the templates provided.
License
This project is licensed under the terms of the MIT license.
Owner
- Name: fairscape
- Login: fairscape
- Kind: organization
- Repositories: 3
- Profile: https://github.com/fairscape
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: FAIRSCAPE CLI
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Maxwell Adam
    family-names: Levinson
    email: mal8ch@virginia.edu
    affiliation: University of Virginia
    orcid: 'https://orcid.org/0000-0003-0384-8499'
  - given-names: Sadnan
    family-names: Al Manir
    affiliation: University of Virginia
    email: ma3xy@virginia.edu
    orcid: 'https://orcid.org/0000-0003-4647-3877'
  - given-names: Timothy
    family-names: Clark
    email: twclark@virginia.edu
    affiliation: University of Virginia
    orcid: 'https://orcid.org/0000-0003-4060-7360'
repository-code: 'https://github.com/fairscape/fairscape-cli'
url: 'https://fairscape.net'
abstract: >-
  FAIRSCAPE is a FAIRness and AI-readiness service providing
  deep provenance graphs and data dictionaries with data
  element validation on uploaded data, software, and
  computations, with special reference to biomedical
  datasets. FAIRSCAPE provenance graphs are represented
  using the Evidence Graph Ontology, EVI, an OWL2
  representation derived from W3C PROV and specialized for
  biomedical research data. The FAIRSCAPE server is a
  cloud-ready environment that processes RO-Crate
  data+metadata packages produced by the FAIRSCAPE CLI- and
  GUI-based clients, registers them with persistent IDs,
  decomposes and registers their components, computes
  provenance graph entailments of each component, and
  integrates the provenance graphs. FAIRSCAPE server
  provides a web-based GUI for inspecting metadata,
  visualizing provenance graphs, and obtaining downloads
  packaged as RO-Crates.
  FAIRSCAPE is supported by the U.S. National Institutes of
  Health Bridge2AI program under grants OT2OD032742
  [Bridge2AI: Cell Maps for AI (CM4AI) Data Generation
  Project] and OT2OD032701 [Bridge2AI: Patient-Focused
  Collaborative Hospital Repository Uniting Standards
  (CHoRUS) for Equitable AI], and by the Frederick Thomas
  Fund of the University of Virginia.
keywords:
  - FAIR
  - Data Science
  - EVI
  - Ontology
  - Provenance
license: MIT
GitHub Events
Total
- Release event: 3
- Delete event: 2
- Push event: 123
- Pull request event: 9
- Create event: 9
Last Year
- Release event: 3
- Delete event: 2
- Push event: 123
- Pull request event: 9
- Create event: 9
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 0
- Total pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Total issue authors: 0
- Total pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: less than a minute
- Issue authors: 0
- Pull request authors: 2
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mlev71 (4)
- coleslaw481 (2)
Pull Request Authors
- mlev71 (11)
- sadnanalmanir (3)
- jniestroy (3)
- pdurbin (1)
- coleslaw481 (1)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 350 last-month
- Total dependent packages: 3
- Total dependent repositories: 0
- Total versions: 43
- Total maintainers: 2
pypi.org: fairscape-cli
A utility for packaging objects and validating metadata for FAIRSCAPE
- Homepage: https://github.com/fairscape/fairscape-cli
- Documentation: https://fairscape.github.io/fairscape-cli/
- License: Copyright 2023 THE RECTOR AND VISITORS OF THE UNIVERSITY OF VIRGINIA Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
- Latest release: 1.1.6 (published 7 months ago)
Dependencies
- actions/cache v2 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v4 composite
- actions/setup-python v4 composite
- python 3.11.6-slim build
- click ^8.1.3
- fairscape-models *
- fairscape-models ^0.1.2
- imageio ^2.27.0
- mkdocs-material ^9.1.18
- pandas ^2.0.0
- prettytable ^3.7.0
- pydantic *
- pyld ^2.0.3
- pyld *
- python ^3.8
- Click *
- prettytable >=3.9.0
- pydantic >=2.5.1