genome-portal

This is the repository for the Swedish Reference Genome Portal, a service facilitating access to and discovery of genome data for non-model eukaryotic species studied in Sweden.

https://github.com/scilifelabdatacentre/genome-portal

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README
  • Academic publication links
    Links to: zenodo.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.4%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 0
  • Watchers: 5
  • Forks: 0
  • Open Issues: 2
  • Releases: 11
Created almost 2 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Citation Codeowners

README.md

Swedish Reference Genome Portal

This repository contains the source code for the Swedish Reference Genome Portal, which:

  • Showcases genome research performed in Sweden on non-model eukaryotic species.
  • Lowers the barrier of entry to access, visualise, and interpret genome data.
  • Encourages sharing of genomic annotations, even the seldom-published kind.
  • Strives to present FAIR data, available in public repositories.

Table of Contents

  1. Overview
  2. Cite this portal
  3. Contributing
  4. Funding
  5. Contact us
  6. Technical overview
  7. Credits

Overview

  • The Swedish Reference Genome Portal website is built using the Hugo static web generator.

  • The JBrowse2 genome browser is embedded within the website to visually explore genome datasets.

  • Primary data files come from public repositories (such as ENA) and are prepared for display in JBrowse2 by our Makefile recipes (essentially compression and indexing).

  • The code for the Genome Portal is available under an MIT (open source) license.

  • The Genome Portal website is currently hosted by the KTH Royal Institute of Technology in Stockholm.

Cite this portal

See 'Cite this repository' in the "About" section at the top right of this page.

Contributing

Two types of contributions are especially welcome:

  • Datasets for display in the portal: Consult our requirements for including a genome dataset in the portal, and contact us if you have any questions.

  • Source code and documentation: We welcome contributions, small and large, to our codebase and documentation. They will be published after review and approval by the Genome Portal team. Fork, open a PR, or contact us to discuss ideas!

Funding

This service is supported by SciLifeLab and the Knut and Alice Wallenberg Foundation through the Data-Driven Life Science (DDLS) program, as well as by the Swedish Foundation for Strategic Research (SSF).

Contact us

We welcome all questions and suggestions (including feature requests or bug reports).

Technical overview

This section contains high-level technical documentation about the source code.

Repository layout

  • The config/ directory contains information about data sources (tracks and assemblies) displayed in the genome browser.

    • Each species subdirectory includes:
    • config.yml : specifies the assembly and tracks to be displayed in JBrowse2.
    • config.json : starting point from which to generate a complete JBrowse2 configuration, based on config.yml. A common use is to define default browsing sessions.
  • Different make recipes prepare the material described in config/ for use by JBrowse2. The main operations are downloading data files, compressing using bgzip and indexing with samtools.

  • The website content resides in the hugo directory.

    • Most importantly, each species gets:
    • A content subdirectory in hugo/content/species/ (e.g. hugo/content/species/clupea_harengus).
    • A data directory in hugo/data/ (taxonomic information and statistics).
    • An assets directory in hugo/assets (data inventory).
  • The scripts folder contains executables to help:

    1. Build and serve the website using Docker.
    2. Add a new species to the website content.
    3. Add new datasets to the portal.
  • The tests folder contains tests and fixtures, mainly covering the data preparation scripts.

  • The docker folder contains two Docker files:

    1. docker/data.dockerfile used for data preparation (everything that make needs).
    2. docker/hugo.dockerfile used to build and serve the website.
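
As a rough illustration of the compress-and-index step mentioned above, the sketch below applies bgzip and samtools faidx to a FASTA file. It is not taken from the repository's make recipes: the function names run and prepare_fasta, the DRY_RUN variable, and the file name genome.fa are all hypothetical.

```shell
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical sketch of preparing one FASTA file for a genome browser.
prepare_fasta() {
    fasta="$1"
    # bgzip writes <fasta>.gz and removes the uncompressed input;
    # -f overwrites an existing output file.
    run bgzip -f "$fasta"
    # samtools faidx can index a bgzip-compressed FASTA directly.
    run samtools faidx "${fasta}.gz"
}
```

Calling `DRY_RUN=1 prepare_fasta genome.fa` prints the two commands instead of executing them, which is a cheap way to check what would run before touching large files.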

Local development

The steps described below require Docker to be installed.

1. Clone the repository

```bash
git clone git@github.com:ScilifelabDataCentre/genome-portal.git
cd genome-portal
```

2. Build and install the genomic data

```bash
# Build local image from docker/data.dockerfile
./scripts/dockerbuild data

# Run the dockermake script to build the assets and install them locally.
./scripts/dockermake
```

You may need to be patient: some files are tens of gigabytes. If only a subset of species is of interest, you can restrict the scope of the build:

```bash
./scripts/dockermake SPECIES=clupea_harengus,linum_tenue
```
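
The SPECIES value is a comma-separated list of species names. A minimal sketch for building that list from the config/ directory follows; it assumes that a species is any subdirectory of config/ containing a config.yml file, and the function name species_list is hypothetical rather than part of the repository's scripts.

```shell
# Print a comma-separated list of species found under a config directory.
# Assumption: a species is any subdirectory that contains a config.yml file.
species_list() {
    config_dir="${1:-config}"
    result=""
    for path in "$config_dir"/*/config.yml; do
        [ -e "$path" ] || continue   # the glob matched nothing
        name="$(basename "$(dirname "$path")")"
        result="${result:+$result,}$name"
    done
    echo "$result"
}

# Print the resulting command instead of running it.
echo "./scripts/dockermake SPECIES=$(species_list config)"
```

Printing the command first makes it easy to trim the list by hand before starting a long build.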

3. Run the web application container

Then to run the website locally, you have several options:

Using the latest development image

```bash
docker pull ghcr.io/scilifelabdatacentre/swg-hugo-site:dev
./scripts/dockerserve -t dev
```

Using a local build

```bash
./scripts/dockerbuild -t local -k hugo
./scripts/dockerserve -t local
```

Using the Hugo development server

This last method is adequate when you want to see changes to the source immediately reflected in the web browser.

It requires the additional step of installing the JBrowse static bundle in hugo/static/browser:

```bash
./scripts/download_jbrowse v2.15.4 hugo/static/browser
./scripts/dockerserve -d
```


Any of these methods will serve the website at http://localhost:8080/.
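
To confirm that the site is actually being served, one option is to poll the URL with curl until it responds. The sketch below is only an illustration: the function name wait_for_site and its retry parameters are hypothetical and not part of the repository's scripts.

```shell
# Poll a URL until it answers with a successful HTTP status.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_site() {
    url="${1:-http://localhost:8080/}"
    attempts="${2:-10}"
    delay="${3:-2}"
    i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl exit non-zero on HTTP errors (4xx/5xx).
        if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
            echo "Site is up at $url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "No response from $url after $attempts attempts" >&2
    return 1
}
```

Running `wait_for_site` with no arguments checks http://localhost:8080/ up to ten times, two seconds apart.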

Making a new release/updating the dev cluster

We use Kubernetes to deploy and manage both the production and development instances of the genome portal.

This repository is responsible for building the two Docker images needed for the deployment. This is controlled by a GitHub Actions workflow file.

To update the production instance, we need to create a new release on GitHub:

  • Identify a commit to base the release on.
  • Agree with the team on:
    • The commit to tag.
    • The planned version number (we use semantic versioning).
    • The contents of the release (use the previous releases as inspiration).
  • Once you have the go-ahead, either:
    • Create an annotated tag locally (e.g. git tag -a v1.3.1 -m "v1.3.1") and push the tag, then create the release on that tag using GitHub's interface.
    • Create the release using GitHub's interface, specify the commit you want to use, and let GitHub create the tag for you.
  • Once the release is published, a GitHub Actions workflow is triggered automatically to build the two images. The Docker images are tagged with the same string as the git tag (i.e. vX.X.X) and are also given the tag "latest". The Docker images created from this repository are published to the GitHub Container Registry.
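
For the local tagging route, a small check on the tag name can catch typos before anything is pushed. The sketch below is an illustration only: the function names is_release_tag and print_tag_commands are hypothetical, and the remote name origin is an assumption. It prints the git commands rather than running them.

```shell
# Succeeds only for tags of the form vX.Y.Z (digits only).
is_release_tag() {
    printf '%s\n' "$1" | grep -Eq '^v[0-9]+\.[0-9]+\.[0-9]+$'
}

# Print the commands that would create and push an annotated tag.
# Assumption: the GitHub remote is called "origin".
print_tag_commands() {
    tag="$1"
    if ! is_release_tag "$tag"; then
        echo "Refusing: '$tag' is not of the form vX.Y.Z" >&2
        return 1
    fi
    echo "git tag -a $tag -m \"$tag\""
    echo "git push origin $tag"
}

print_tag_commands v1.3.1
```

Printing instead of executing leaves the final decision with the person making the release, in keeping with the team agreement step above.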

To update the development instance

  • Identify the commit you want the docker images to be built off of.
  • If the commit is on the main branch, a GitHub Actions workflow run will already have built the images (unless the commit message was prefixed to skip CI). The images are tagged with the full commit hash. If the images are already built, your job in this repository is done.
  • If the commit is on any other branch, you'll need to trigger a workflow_dispatch to create the Docker images.
  • Head to the Actions tab on GitHub and select the action "Build and push both docker images to the GitHub Container Registry". From there, choose to run the workflow manually. You can specify the name of the image tag if you want; otherwise, leave the input blank and the images will be tagged with the full commit hash.
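
Since development images are tagged with the full commit hash by default, the image reference can be derived from the commit. The sketch below is a hedged illustration: the function name image_ref is hypothetical, and it uses the swg-hugo-site image name that appears in the local development instructions. The name of the second (data) image is not given here, so it is left out.

```shell
# Print the expected image reference for a given full commit hash.
# Assumption: images are tagged with the 40-character lowercase hash.
image_ref() {
    commit="$1"
    if ! printf '%s\n' "$commit" | grep -Eq '^[0-9a-f]{40}$'; then
        echo "Expected a full 40-character commit hash, got '$commit'" >&2
        return 1
    fi
    echo "ghcr.io/scilifelabdatacentre/swg-hugo-site:$commit"
}
```

Inside a clone of the repository, `image_ref "$(git rev-parse HEAD)"` prints the reference for the currently checked-out commit.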

Once the images are built you can head over to our private repository that contains the kubernetes manifest files and follow the instructions there on how to apply your changes to the cluster.

Credits

The Swedish Reference Genome Portal is developed and maintained by the DDLS Data Science Node in Evolution and Biodiversity (DSN-EB) team as part of the SciLifeLab Data Platform, operated by the SciLifeLab Data Centre. Members of the DSN-EB team are affiliated with the SciLifeLab Data Centre and the National Bioinformatics Infrastructure Sweden (NBIS), based at Uppsala University and the Swedish Museum of Natural History.

Owner

  • Name: SciLifeLab Data Centre
  • Login: ScilifelabDataCentre
  • Kind: organization
  • Location: Stockholm and Uppsala, Sweden

The SciLifeLab Data Centre provides the SciLifeLab platforms with services for IT and data management.

Citation (CITATION.cff)

cff-version: 1.2.0
title: The Swedish Reference Genome Portal
message: "If you use or reuse this software, please cite it using the following metadata."
type: software
authors:
  - family-names: Lantz
    given-names: Henrik
  - family-names: Brink
    given-names: Daniel P.
    orcid: "https://orcid.org/0000-0003-4041-0250"
  - family-names: Crean
    given-names: Rory
  - family-names: Ågren
    given-names: Quentin
  - family-names: Fuentes-Pardo
    given-names: Angela P. 
    orcid: "https://orcid.org/0000-0002-5734-9030"
  - family-names: Kochari
    given-names: Arnold
    orcid: "https://orcid.org/0000-0003-1373-5121"
  - family-names: Kultima
    given-names: Hanna
    orcid: "https://orcid.org/0000-0001-7724-2567"
  - family-names: Persson
    given-names: Bengt
  - family-names: Rung
    given-names: Johan
    orcid: "https://orcid.org/0000-0001-5875-8429"
repository-code: "https://github.com/ScilifelabDataCentre/genome-portal"
url: "https://genomes.scilifelab.se"
license: MIT
identifiers:
  - description: "This is the collection of archived snapshots of all versions of the Swedish Reference Genome Portal"
    type: doi
    value: 10.5281/zenodo.14049736
references:
  - authors:
      - family-names: Diesh
        given-names: Colin
      - family-names: Stevens
        given-names: Garrett J.
      - family-names: Xie
        given-names: Peter
      - family-names: De Jesus Martinez
        given-names: Teresa
      - family-names: Hershberg
        given-names: Elliot A.
      - family-names: Leung
        given-names: Angel
      - family-names: Guo
        given-names: Emma
      - family-names: Dider
        given-names: Shihab
      - family-names: Zhang
        given-names: Junjun
      - family-names: Bridge
        given-names: Caroline
      - family-names: Hogue
        given-names: Gregory
      - family-names: Duncan
        given-names: Andrew
      - family-names: Morgan
        given-names: Matthew
      - family-names: Flores
        given-names: Tia
      - family-names: Bimber
        given-names: Benjamin N.
      - family-names: Haw
        given-names: Robin
      - family-names: Cain
        given-names: Scott
      - family-names: Buels
        given-names: Robert M.
      - family-names: Stein
        given-names: Lincoln D.
      - family-names: Holmes
        given-names: Ian H. 
    doi: 10.1186/s13059-023-02914-z
    issue: 1
    journal: Genome Biology
    scope: "Please cite this paper for referencing JBrowse 2, the genome browser software utilized on the Swedish Reference Genome Portal."
    title: "JBrowse 2: a modular genome browser with views of synteny and structural variation"
    type: article
    volume: 24
    year: 2023

GitHub Events

Total
  • Create event: 62
  • Commit comment event: 2
  • Release event: 6
  • Delete event: 54
  • Member event: 1
  • Issue comment event: 122
  • Push event: 402
  • Pull request review comment event: 174
  • Pull request event: 87
  • Pull request review event: 213
Last Year
  • Create event: 62
  • Commit comment event: 2
  • Release event: 6
  • Delete event: 54
  • Member event: 1
  • Issue comment event: 122
  • Push event: 402
  • Pull request review comment event: 174
  • Pull request event: 87
  • Pull request review event: 213

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 110
  • Average time to close issues: 20 days
  • Average time to close pull requests: 6 days
  • Total issue authors: 1
  • Total pull request authors: 5
  • Average comments per issue: 1.0
  • Average comments per pull request: 1.84
  • Merged pull requests: 97
  • Bot issues: 0
  • Bot pull requests: 2
Past Year
  • Issues: 0
  • Pull requests: 71
  • Average time to close issues: N/A
  • Average time to close pull requests: 8 days
  • Issue authors: 0
  • Pull request authors: 5
  • Average comments per issue: 0
  • Average comments per pull request: 2.27
  • Merged pull requests: 59
  • Bot issues: 0
  • Bot pull requests: 1
Top Authors
Issue Authors
  • RMCrean (1)
  • brinkdp (1)
  • apfuentes (1)
Pull Request Authors
  • RMCrean (65)
  • kwentine (34)
  • brinkdp (32)
  • apfuentes (3)
  • dependabot[bot] (2)
Top Labels
Issue Labels
Pull Request Labels
dependencies (2) python (1)

Dependencies

hugo/Dockerfile docker
  • alpine 3.19.1 build
  • nginxinc/nginx-unprivileged alpine build
.github/workflows/container_reg.yml actions
  • actions/checkout v4 composite
  • docker/build-push-action v5 composite
  • docker/login-action v3 composite
  • docker/metadata-action v5 composite
.github/workflows/lighthouse.yml actions
  • actions/checkout v4 composite
  • actions/setup-node v4 composite
  • peaceiris/actions-hugo v3 composite
.github/workflows/precommit.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
.github/workflows/trivy-scan.yml actions
  • actions/checkout v4 composite
  • aquasecurity/trivy-action master composite
  • github/codeql-action/upload-sarif v3 composite
requirements.txt pypi
  • pre-commit ==3.7.0
  • requests ==2.31.0