gfbio-data-search

GFBio Data Search

https://github.com/gfbio/gfbio-data-search

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
✓ Academic publication links: Links to: zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (10.3%) to scientific vocabulary

Last synced: 6 months ago

Repository

GFBio Data Search

Basic Info

Host: GitHub
Owner: gfbio
License: lgpl-3.0
Language: TypeScript
Default Branch: main
Homepage:
Size: 8.9 MB

Statistics

Stars: 1
Watchers: 12
Forks: 0
Open Issues: 0
Releases: 2

Created over 2 years ago · Last pushed 7 months ago

Metadata Files

Readme Changelog License Citation Zenodo

GFBio Data Search

Description

The GFBio Dataset Search, built upon the Dai:Si Dataset Search UI, facilitates the exploration of datasets distributed and published across the GFBio data centers. It is an integral part of the GFBio Search and Harvesting Infrastructure.

Version

Current version: 1.0.0

See CHANGELOG.md for details on version history and changes.

Developer Guide

This section provides a guide for setting up and operating the GFBio Dataset Search for local development. It focuses on the local development stack, outlined in the Docker Compose file (docker-compose.yml). This file configures three main services: a Node Express API for the backend, an Angular application for the frontend, and an Elasticsearch index for indexing and retrieving search results.

Docker Stack

```yaml
version: "3"

services:
  backend:
    build:
      context: .
      dockerfile: ./docker/backend/Dockerfile
    container_name: gfbiosearchbackenddev
    env_file:
      - ./search/backend/.env
    ports:
      - "3000:3000"
    volumes:
      - ./search/backend:/backend
    networks:
      - custom_network

  frontend:
    build:
      context: .
      dockerfile: ./docker/frontend/Dockerfile.dev
    container_name: gfbiosearchfrontenddev
    volumes:
      - ./search/frontend:/frontend
    ports:
      - "4200:4200"
    environment:
      - CHOKIDAR_USEPOLLING=true
    networks:
      - custom_network

  index:
    image: docker.elastic.co/elasticsearch/elasticsearch:7.10.0
    container_name: gfbiosearchindexdev
    environment:
      - discovery.type=single-node
    ports:
      - "9200:9200"
      - "9300:9300"
    volumes:
      - esdata:/usr/share/elasticsearch/data
    networks:
      - custom_network
    ulimits:
      memlock:
        soft: -1
        hard: -1
    deploy:
      resources:
        limits:
          memory: 2g

volumes:
  esdata:

networks:
  custom_network:
    driver: bridge
```

To initiate the local development stack for the GFBio Dataset Search, perform the following three steps: copy the environment file for backend configuration from a template, build the Docker containers, and populate the Elasticsearch index with dummy data. These steps are automated by an 'init' command in the Makefile, executable on Linux and Unix-like systems. Execute the command from the base folder of the repository:

```bash
make init
```

For day-to-day work in the local development environment, use docker-compose to start, stop, and rebuild the containers. To start the services:

```bash
docker-compose up
```

To rebuild the services, especially after changes to the Docker configuration:

```bash
docker-compose up --build
```

To stop the running services and clean up the local development environment:

```bash
docker-compose down
```

After starting the stack, access the frontend in your browser at:

localhost:4200

Modifications to the frontend source code will automatically trigger a browser reload, reflecting changes immediately. Changes to the backend code will automatically restart the Node server.

Note: When changing information in the backend environment file, you must rebuild the containers to apply the changes, as environment variables are set during build time.
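Once the stack is running, a quick reachability check can confirm that all three services answer on their host ports. The following is a minimal sketch, not part of the repository: the ports come from the Docker Compose file above, while `services`, `serviceUrl`, and `checkStack` are hypothetical names, and the global `fetch` assumes Node 18 or newer.

```typescript
// Host ports taken from docker-compose.yml; the keys are labels for this sketch only.
const services: Record<string, number> = {
  backend: 3000,
  frontend: 4200,
  index: 9200,
};

// Build the base URL under which a service is reachable from the host.
function serviceUrl(port: number, host: string = "localhost"): string {
  return `http://${host}:${port}/`;
}

// Request each base URL once and report whether anything answered.
// Any HTTP response (even a 404) counts as reachable; a network error does not.
async function checkStack(): Promise<Record<string, boolean>> {
  const result: Record<string, boolean> = {};
  for (const [name, port] of Object.entries(services)) {
    try {
      await fetch(serviceUrl(port));
      result[name] = true;
    } catch {
      result[name] = false;
    }
  }
  return result;
}
```

Calling `checkStack().then(console.log)` prints one boolean per service. The check only tells you that a port answers, not that the service behind it is healthy.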

Frontend and Backend Code

The backend and frontend code are located under the search folder:

```
search
├── backend
│   ├── package.json
│   ├── package-lock.json
│   ├── README.md
│   ├── server.js
│   ├── src
│   │   ├── ...
│   └── tests
├── frontend
│   ├── angular.json
│   ├── dist
│   │   └── DatasetSearch
│   ├── karma.conf.js
│   ├── package.json
│   ├── package-lock.json
│   ├── README.md
│   ├── src
│   │   ├── ...
│   ├── tsconfig.app.json
│   ├── tsconfig.json
│   ├── tsconfig.spec.json
│   └── tslint.json
└── LICENSE
```
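To illustrate how a backend of this kind might hand a search term to Elasticsearch, the sketch below builds a paginated request body using the standard `multi_match` query together with `from` and `size`. It is not taken from the repository's backend: `buildSearchRequest` and the field names `title` and `description` are assumptions made for illustration, and the real fields are defined by the index mapping.

```typescript
// Shape of the request body sent to Elasticsearch's _search endpoint in this sketch.
interface SearchRequest {
  query: { multi_match: { query: string; fields: string[] } };
  from: number;
  size: number;
}

// Turn a user-supplied term and a zero-based page number into a search request.
// The default field names are hypothetical placeholders.
function buildSearchRequest(
  term: string,
  page: number = 0,
  pageSize: number = 10,
  fields: string[] = ["title", "description"],
): SearchRequest {
  return {
    query: { multi_match: { query: term.trim(), fields } },
    from: page * pageSize, // offset of the first hit on the requested page
    size: pageSize,        // number of hits per page
  };
}
```

For example, `buildSearchRequest("beetle", 2, 5)` requests hits 10 to 14 for the term "beetle".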

The Index

Information about the Elasticsearch index is located in the index folder and comprises the current mapping, the script to populate the index with dummy data and the sample data:

```
index
├── index_mapping.json
├── populate_index.sh
└── sample_data.json
```
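One common way to load sample documents into Elasticsearch is the `_bulk` endpoint, which expects newline-delimited JSON: an action line followed by a source line for each document. The sketch below shows how such a body can be assembled. It does not reproduce what `populate_index.sh` actually does, and the index name `example_index` and the document fields are hypothetical.

```typescript
// A document to be indexed; the real fields are defined by index_mapping.json.
type SampleDoc = Record<string, unknown>;

// Build the newline-delimited JSON body expected by Elasticsearch's _bulk endpoint:
// one action line plus one source line per document, each terminated by a newline.
function toBulkBody(indexName: string, docs: SampleDoc[]): string {
  const lines: string[] = [];
  for (const doc of docs) {
    lines.push(JSON.stringify({ index: { _index: indexName } }));
    lines.push(JSON.stringify(doc));
  }
  return lines.map((line) => line + "\n").join("");
}

// Hypothetical documents and index name, for illustration only.
const body = toBulkBody("example_index", [
  { title: "Dataset A" },
  { title: "Dataset B" },
]);
```

The resulting string can be sent with a POST request to `http://localhost:9200/_bulk` using the header `Content-Type: application/x-ndjson`.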

Contact Us

Please email any questions and comments to our Service Helpdesk (info@gfbio.org).

References

Shafiei, F., Löffler, F., Thiel, S., Opasjumruskit, K., Grabiger, D., Rauh, P., König-Ries, B.: [Dai:Si] - A Modular Dataset Retrieval Framework with a Semantic Search for Biological Data, 2021.

Acknowledgements

This work was supported by the German Research Foundation (DFG) within the project “Establishment of the National Research Data Infrastructure (NFDI)” in the consortium NFDI4Biodiversity (project number 442032008).
This work was supported by the German Research Foundation (DFG) within the project "German Federation for Biological Data e.V.: Concept for a sustainable research data management of environmental data for Germany" (project number 408180549).

Owner

Name: GFBio
Login: gfbio
Kind: organization
Location: Germany

Website: https://www.gfbio.org/
Repositories: 18
Profile: https://github.com/gfbio

German Federation for Biological Data

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Pfaff"
  given-names: "Claas-Thido"
  orcid: "https://orcid.org/0000-0003-1572-8590"
- family-names: "Weber"
  given-names: "Marc"
  orcid: "https://orcid.org/0000-0003-0694-5817"
- family-names: "Franz"
  given-names: "Linus"
  orcid: "https://orcid.org/0009-0007-0441-8630"
title: "GFBio Data Search"
abstract: "The GFBio Data Search is based on the Dai:Si search UI (https://api.semanticscholar.org/CorpusID:240005304) and enables a search of datasets\nwhich are distributed and published across the GFBio Data Centers (https://gfbio.org/data-centers/). The data centers and the data sources they provide are listed in an aggregator service. A harvester service collects the resources from the aggregator and extracts the information into an Elasticsearch index."
type: software
doi: 10.5281/zenodo.8308204
keywords:
  - Research Software
  - Research Data Management
  - Biodiversity
  - Ecology
  - GfBio
license: GNU LGPL v3.0

GitHub Events

Total

Watch event: 1
Push event: 3
Create event: 2

Last Year

Watch event: 1
Push event: 3
Create event: 2

Dependencies

docker/backend/Dockerfile docker

node 14 build

docker/frontend/Dockerfile docker

httpd 2.4 build
node 14 build

docker-compose.yml docker

httpd 2.4
