Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.2%) to scientific vocabulary

Repository

Basic Info
  • Host: GitHub
  • Owner: BAMresearch
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 104 KB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 3
Created over 2 years ago · Last pushed over 2 years ago
Metadata Files
Readme License Citation

README.md

ExtPybis

Motivation

ExtPybis is an extension of the Pybis library meant for improving data synchronisation with an openBIS datastore.

Glossary

Disclaimer:

The terms sample and object are aliases for each other; you can use them synonymously within the pybis environment. ExtPybis primarily uses the terms Sample and Collection.

  • ExtPybis - A Subclass of the Openbis class in the pybis library. It is used for connecting to the openBIS database and for all database accesses.
  • Space - The highest "directory" abstraction in the datastore. Contains all other openBIS "directories". One space is defined per user or per group.
  • Project - Second highest "directory" abstraction in the datastore. Contained in a space, you can define infinitely many of them.
    • Example: UCT_Compression
  • Collection / Experiment - Third highest "directory" abstraction in the datastore. Contained in a project, you can define infinitely many of them.
    • Example: UCT_Compression_Testseries01
  • Sample Type / Object Type - A blueprint defining the general structure of a specific experimental set-up (metadata, relations, ...). You need to define one before you can upload samples / objects of real-world experiments.
    • Example: EXPERIMENTAL_STEP_UCT
  • Sample / Object - An abstract representation of your experiment with the metadata you defined, also called Experimental Step in the openBIS database. Contained in a collection, you can define infinitely many of them.
    • Example: UCT_Compression_Experiment_17_10_2022
  • Dataset - An openBIS abstraction containing your data files, such as Excel sheets, CSV or DAT files, together with some metadata.
    • Example: UCT_Compression_Experiment_17_10_2022_RAW_DATA
  • Property - One piece of metadata you use to describe your samples / objects.
    • Example: NAME
  • Parents of a sample / object - You can define another sample / object to be a parent of a different sample / object.
    • Example: A concrete mixture could be a parent of a concrete compression experiment
    • Note: Also defined are Children with the same relation in the other direction.

You can look at the definition from pybis here

Identification in openBIS

Each sample / object in openBIS will automatically get:

  • a Permid - A unique number
  • a Code - A unique name consisting of the sample type's / object type's prefix plus a running number
  • an Identifier - Path SPACE/PROJECT/CODE
  • a Path - Directory path to the current sample / object, usually SPACE/PROJECT/COLLECTION/CODE
  • a Type - Name of the sample type / object type of the current sample / object
  • an Experiment - Directory path to the collection, if the current sample / object is part of a collection / experiment

The permid or the code can be used to clearly identify a specific sample / object and e.g. link it to others.
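To make the naming scheme concrete, the following minimal sketch assembles a code, an identifier and a path from their parts. The helper names (`make_code`, `make_identifier`, `make_path`) and the example values are hypothetical and not part of pybis or ExtPybis; openBIS assigns these values itself. The leading slash is an assumption about how openBIS usually writes such paths.

```python
def make_code(prefix: str, counter: int) -> str:
    """Combine a sample type prefix with a running number, e.g. UCT + 12."""
    return f"{prefix}{counter}"


def make_identifier(space: str, project: str, code: str) -> str:
    """Identifier layout described above: SPACE/PROJECT/CODE."""
    # Leading slash is an assumption, not taken from the text above.
    return "/" + "/".join([space, project, code])


def make_path(space: str, project: str, collection: str, code: str) -> str:
    """Path layout described above: SPACE/PROJECT/COLLECTION/CODE."""
    return "/" + "/".join([space, project, collection, code])


code = make_code("UCT", 12)
identifier = make_identifier("MY_SPACE", "UCT_COMPRESSION", code)
path = make_path("MY_SPACE", "UCT_COMPRESSION", "UCT_COMPRESSION_TESTSERIES01", code)

print(identifier)
print(path)
```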

Structure

ExtPybis

ExtPybis is an extension of the existing Openbis class defined in pybis and as such inherits all pybis functionality. It provides additional functionality for getting an overview of the datastore and finding out what the structure of the database looks like. ExtPybis also makes it easier to search for existing samples / objects and retrieve their identifiers from metadata such as NAME.

Example Workflow of ExtPybis

1. Connecting to the database

```python
from extpybis.openbis import ExtOpenbis

# Define an ExtPybis object with the URL of your openBIS database
o = ExtOpenbis('https://yourdatabase.com')

# If username is not given, it is read from your OS account (Windows/Linux).
# If password is not given, you will be prompted for it.
o.connect_to_datastore(username='name', password=None)
```
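The fallback behaviour described in the comments (username taken from the operating system account, password requested interactively) can be pictured with the standard library alone. This is only a sketch of the idea, not ExtPybis's actual implementation; `resolve_credentials` and its injectable `prompt` parameter are hypothetical names introduced for illustration.

```python
import getpass


def resolve_credentials(username=None, password=None, prompt=getpass.getpass):
    """Fill in missing credentials: OS account name and an interactive prompt."""
    if username is None:
        # Name of the account running the script
        username = getpass.getuser()
    if password is None:
        # getpass.getpass hides the typed characters in a terminal
        password = prompt(f"Password for {username}: ")
    return username, password


# Non-interactive demonstration: the prompt is replaced by a stub.
user, secret = resolve_credentials(username="name", prompt=lambda message: "stub-password")
print(user)
```

Passing the prompt as a parameter keeps the helper usable in scripts and tests where no terminal is available.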

2. Create a blueprint for the samples / objects - create a new sample type / object type

```python
# In order to define a new sample / object type use the following method (only possible as admin)
o.create_sample_type(
    sample_code='EXPERIMENTAL_STEP_UCT',  # The code identifier of the new sample / object type
    sample_prefix='UCT',  # The prefix of the sample / object. Will appear before every sample / object code
    sample_properties={
        # Define properties here with data in the order [Data Type, Label, Description]
        '$name': ['VARCHAR', 'Name', 'Name'],  # Default system property
        'date': ['DATE', 'Date', 'Date of the experiment'],
        'property1': ['VARCHAR', 'my_label1', 'my_description1'],
        'property2': ['INTEGER', 'my_label2', 'my_description2'],
        'property3': ['FLOAT', 'my_label3', 'my_description3'],
    },
)
```

You can find all the possible property types like 'VARCHAR' or 'FLOAT' here under 'create property types'

You can also define the sample / object type in the Web-View of the datastore.

You need to be an admin to create new sample / object types.
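Since every property is defined as a three-element list in the order [Data Type, Label, Description], a quick shape check before calling the datastore can catch slips such as a forgotten comma. The sketch below is a hypothetical helper (`check_property_definitions` is not part of ExtPybis) and deliberately does not validate the data type names themselves.

```python
def check_property_definitions(sample_properties):
    """Return a list of problems found in a {code: [type, label, description]} dict."""
    problems = []
    for code, definition in sample_properties.items():
        if not isinstance(definition, (list, tuple)) or len(definition) != 3:
            problems.append(f"{code}: expected [data type, label, description]")
            continue
        if not all(isinstance(item, str) and item for item in definition):
            problems.append(f"{code}: every entry must be a non-empty string")
    return problems


good = {"$name": ["VARCHAR", "Name", "Name"]}
# Adjacent string literals are concatenated by Python, so a missing comma
# silently produces a two-element list.
bad = {"date": ["DATE", "Date" "Date of the experiment"]}

print(check_property_definitions(good))
print(check_property_definitions(bad))
```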

3. Create and upload samples / objects of type "EXPERIMENTAL_STEP_UCT"

For each measured experiment, a sample / object of the corresponding type (EXPERIMENTAL_STEP_UCT) has to be created.

a) Define the sample

```python
concrete_experiment = o.new_sample(
    type = 'EXPERIMENTAL_STEP_UCT',  # Has to be the same as a defined sample type. Here the one from chapter 2
    space = '',  # your preassigned personal space
    project = 'UCT_Compression',
    collection = '/UCT_Compression/UCT_Compression_Testseries01',
)

# DATE values are given in the format YYYY-MM-DD
experiment_metadata = {'$name': 'ConcreteExperiment1', 'date': '2022-10-17'}
concrete_experiment.set_props(experiment_metadata)
```

b) Upload the Sample to the Database

```python
concrete_experiment.save()
```

Good to know:

  • To upload a sample / object of a given type that type has to already exist in the datastore (See 2. Create a new sample type / object type).
  • The project and collection the sample should be uploaded to must exist before uploading. New projects / collections can be created via the Web-View (ELN) or with pybis:

    ```python
    project_obj = o.new_project(
        space='<Your Username>',
        code='UCT_Compression',
        description="...",
    )
    project_obj.save()

    collection_obj = o.new_collection(
        project='UCT_Compression',
        code='UCT_Compression_Testseries01',
        type="COLLECTION",
    )
    collection_obj.save()
    ```

  • You can look up which metadata the Experimental Step Type accepts either by getting the Excel import template with the ExtPybis get_metadata_import_template() method or by getting the properties of the Experimental Step Type with the ExtPybis get_sample_type_properties() method (or via the admin Web-View).

Return of o.get_metadata_import_template(EXPERIMENTAL_STEP_UCT) as an Excel Sheet or Pandas Dataframe (o = connected ExtPybis instance)

|   | Param     | Label     | Description     | Value |
|---|-----------|-----------|-----------------|-------|
| 0 | $name     | Name      | Name            |       |
| 1 | date      | Date      | Date of the experiment |  |
| 2 | property1 | my_label1 | my_description1 |       |
| 3 | property2 | my_label2 | my_description2 |       |
| 4 | property3 | my_label3 | my_description3 |       |

If you use the Excel sheet above for filling in metadata, the metadata can then be uploaded to the sample via o.import_props_from_template(<path to filled out Excel sheet>, sample).
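Conceptually, a filled-out template is a mapping from the Param column to the Value column. The sketch below illustrates that idea with plain Python rows instead of an Excel file; it is not how import_props_from_template works internally, and `rows_to_props` as well as the sample values are hypothetical.

```python
# Rows as they might look after reading the filled-out template
template_rows = [
    {"Param": "$name", "Label": "Name", "Value": "ConcreteExperiment1"},
    {"Param": "date", "Label": "Date", "Value": "2022-10-17"},
    {"Param": "property1", "Label": "my_label1", "Value": ""},
    {"Param": "property2", "Label": "my_label2", "Value": None},
]


def rows_to_props(rows):
    """Map Param -> Value, skipping rows whose Value was left empty."""
    props = {}
    for row in rows:
        value = row.get("Value")
        if value is None or value == "":
            continue
        props[row["Param"]] = value
    return props


props = rows_to_props(template_rows)
print(props)
```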

4. Upload corresponding datasets.

```python
concrete_dataset = o.new_dataset(
    type = 'RAW_DATA',  # Or PROCESSED_DATA or some other dataset type. Check your own datastore with o.get_dataset_types() for available dataset types
    collection = concrete_experiment.collection,
    sample = concrete_experiment.identifier,
    files = ['path_to_file1', 'path_to_file2'],
    props = {'$name': 'UCT_Compression_Experiment_17_10_2022_RAW_DATA'},
)
concrete_dataset.save()  # As with samples, the dataset is only uploaded once it is saved
```

If you have done everything correctly, you should be able to see your Experimental Step of type 'EXPERIMENTAL_STEP_UCT' and your datasets in the Web-View of the database in the location you defined.
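Because the dataset is built from a list of file paths, it can be worth confirming that every file exists before starting the upload. The following sketch uses only the standard library; `missing_files` is a hypothetical helper, and the temporary files merely stand in for real measurement data.

```python
import tempfile
from pathlib import Path


def missing_files(files):
    """Return the paths (as strings) that do not point to an existing file."""
    return [str(f) for f in files if not Path(f).is_file()]


with tempfile.TemporaryDirectory() as tmp:
    existing = Path(tmp) / "measurement.csv"
    existing.write_text("force,displacement\n")
    absent = Path(tmp) / "not_there.dat"
    # Only the second path should be reported
    result = missing_files([existing, absent])

print(result)
```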

5. Searching for specific samples / objects in database.

```python
# a Pandas DataFrame of all suitable samples / objects
samples = o.get_samples(
    space="",  # search in this space
    project="TESTAMBETON",  # search in this project
    props=['$NAME', 'DATE'],  # show these properties in the results; props="*" retrieves all properties
    where={
        "DATE": "2022-05-10",  # query
    },
).df
```

For detailed information see here under 'search for samples / objects'. Searching is also possible via Web-View.
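Once a search result is available, a typical follow-up is to look up the identifier that belongs to a given NAME. The sketch below works on plain dictionaries, such as those a DataFrame would yield via `to_dict("records")`. The column names `identifier` and `$NAME`, the helper `find_identifiers` and the sample records are assumptions for illustration and should be checked against your own result.

```python
def find_identifiers(records, prop, value):
    """Return the identifiers of all records whose property equals the value."""
    return [rec["identifier"] for rec in records if rec.get(prop) == value]


# Hypothetical search result, one dictionary per sample / object
records = [
    {"identifier": "/MY_SPACE/TESTAMBETON/UCT1", "$NAME": "ConcreteExperiment1", "DATE": "2022-05-10"},
    {"identifier": "/MY_SPACE/TESTAMBETON/UCT2", "$NAME": "ConcreteExperiment2", "DATE": "2022-05-10"},
]

matches = find_identifiers(records, "$NAME", "ConcreteExperiment2")
print(matches)
```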

Generating and working with type checkers.

You can generate type checkers based on pydantic models to check your samples for formatting errors before uploading them to openBIS.

```python
sample_properties = {"$name": "sample_name"}

SampleModel = o.generate_typechecker("SAMPLE_TYPE")
sample_model_return = SampleModel(**sample_properties)
# exclude_unset leaves out property types that were not set in sample_properties
sample_properties = sample_model_return.dict(exclude_unset=True)
```

The sample_properties have now been checked for their types and cast / formatted where possible. The functionality includes:

  • Casting strings like "2.13" or "7" to float or integer respectively
  • Checking if CONTROLLED_VOCABULARY properties are within their defined vocabularies
  • Casting date strings into the required format. Example: "17.10.2023 10:45" will be formatted to "2023-10-17"
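The three behaviours listed above can be imitated with the standard library to show what such a check does. This is an illustrative sketch, not the pydantic-based checker that ExtPybis generates; the function names, the list of accepted date formats and the example vocabulary are all assumptions.

```python
from datetime import datetime

# Input formats tried in order; an assumption, the real checker may accept others
DATE_FORMATS = ("%Y-%m-%d", "%d.%m.%Y %H:%M", "%d.%m.%Y")


def cast_number(value, target):
    """Cast a string such as '2.13' or '7' to float or int."""
    return target(value)


def cast_date(value):
    """Normalise a date string to YYYY-MM-DD."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {value!r}")


def check_vocabulary(value, allowed):
    """Reject values that are not part of the controlled vocabulary."""
    if value not in allowed:
        raise ValueError(f"{value!r} is not one of {sorted(allowed)}")
    return value


print(cast_number("2.13", float), cast_number("7", int))
print(cast_date("17.10.2023 10:45"))
print(check_vocabulary("CONCRETE", {"CONCRETE", "MORTAR"}))
```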

Owner

  • Name: Bundesanstalt für Materialforschung und -prüfung
  • Login: BAMresearch
  • Kind: organization
  • Email: oss@bam.de
  • Location: Berlin/Germany

German Federal scientific research institute for materials testing and research

Citation (CITATION.cff)

cff-version: 0.0.2
title: ExtPybis
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Cezary
    family-names: Kujath
    email: cezary.kujath@bam.de
    affiliation: Bundesanstalt für Materialforschung und -prüfung (BAM)
  - given-names: Annika
    family-names: Robens-Radermacher
    email: annika.robens-radermacher@bam.de
    affiliation: Bundesanstalt für Materialforschung und -prüfung (BAM)
    orcid: 'https://orcid.org/0000-0001-9653-6085'
repository-code: 'https://github.com/BAMresearch/ExtPybis'
abstract: >-
  ExtPybis is an extension of the Pybis library meant for improving data synchronisation with an openBIS datastore.
keywords:
  - pybis
  - obenBIS
license: MIT

Dependencies

.github/workflows/extpybis_tests.yml actions
  • actions/cache v2 composite
  • actions/checkout v2 composite
  • conda-incubator/setup-miniconda v2 composite
.github/workflows/publish_conda.yml actions
  • actions/checkout v3 composite
  • conda-incubator/setup-miniconda v2 composite
environment.yml pypi
pyproject.toml pypi