ShUShER

ShUShER: private browser-based placement of sensitive genome samples on phylogenetic trees - Published in JOSS (2021)

https://github.com/amkram/shusher

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 2 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: biorxiv.org, joss.theoj.org
  • Committers with academic emails
    2 of 3 committers (66.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Engineering Computer Science - 40% confidence
Last synced: 6 months ago

Repository

Private, browser-based placement of genome sequences on phylogenetic trees using UShER.

Basic Info
  • Host: GitHub
  • Owner: amkram
  • License: agpl-3.0
  • Language: JavaScript
  • Default Branch: master
  • Homepage:
  • Size: 101 MB
Statistics
  • Stars: 11
  • Watchers: 1
  • Forks: 1
  • Open Issues: 0
  • Releases: 10
Created almost 5 years ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Code of conduct Zenodo

README.md

private (Shh :shushing_face:) Ultrafast Sample placement on Existing tRees

[![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](LICENSE) [![Integration Tests](https://github.com/amkram/shusher/actions/workflows/build_and_test.yml/badge.svg)](https://github.com/amkram/shusher/actions/workflows/build_and_test.yml)

:computer_mouse: Access ShUShER here!

ShUShER is a browser tool for placing sensitive genome sequences on phylogenetic trees using UShER.

Usage | How it works | Installation

Usage

:warning: This tool is intended only for sequences that cannot be shared publicly. If you do not have this requirement, please use the UShER web tool and submit your sequences to an INSDC member institution (NCBI, EMBL-EBI, or DDBJ) and to GISAID.

ShUShER is currently designed for use with SARS-CoV-2 genomes. The user supplies a set of samples in FASTA or VCF format, and the provided samples are placed on a continuously growing global tree (read more). After placement, subtrees containing user samples can be visualized (using Auspice).

Loading samples

Samples can be provided to ShUShER in either FASTA (.fa, .fasta, .fna) or VCF format (.vcf).

All samples must be in a single file. When you load your samples into ShUShER, they will not be uploaded to our servers and the data will remain on your computer.
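As an illustration of the accepted input types, the sketch below classifies a file by its extension. The helper name `detectSampleFormat` is hypothetical and is not taken from the ShUShER source; it only encodes the extensions listed above.

```javascript
// Hypothetical helper (not part of ShUShER's documented code): classify a
// sample file by extension, using the extensions listed in this section.
const FASTA_EXTENSIONS = ['.fa', '.fasta', '.fna'];
const VCF_EXTENSIONS = ['.vcf'];

function detectSampleFormat(fileName) {
  const lower = fileName.toLowerCase();
  if (FASTA_EXTENSIONS.some((ext) => lower.endsWith(ext))) return 'fasta';
  if (VCF_EXTENSIONS.some((ext) => lower.endsWith(ext))) return 'vcf';
  return null; // unsupported extension
}
```

In a browser, the `name` property of a `File` object selected by the user could be passed to this function before the file contents are read locally.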

Running UShER

After loading your samples, two input fields will appear:

The first field selects the existing tree to place your samples on. Currently, the only option is the global SARS-CoV-2 tree (maintained here).

After UShER places your samples on the global tree, it will output subtrees containing your samples. The second field allows you to select how many closely-related samples from the global tree to include in each subtree.

Interpreting results

After UShER has finished running, a table of information about your samples will be displayed.

Two numbers are reported for each sample:

Number of maximally parsimonious placements is the number of potential placements in the tree with minimal parsimony score. A higher number indicates a less confident placement.

Parsimony score is the number of mutations/changes that must be added to the tree when placing this sample. The higher the number, the more diverged the sample.
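The relationship between the two numbers can be sketched as follows. This is a toy illustration, not UShER's placement algorithm: it assumes a list of candidate placements is already available, each with the number of mutations it would add to the tree. The function name, node names, and scores are invented for the example.

```javascript
// Each candidate placement is { node, score }, where score is the number of
// mutations that placing the sample at that node would add to the tree.
// The node names and scores below are made up for illustration.
function summarizePlacements(candidates) {
  if (candidates.length === 0) {
    throw new Error('at least one candidate placement is required');
  }
  // The parsimony score is the smallest number of added mutations.
  const parsimonyScore = Math.min(...candidates.map((c) => c.score));
  // Every candidate that ties for that minimum is equally parsimonious.
  const best = candidates.filter((c) => c.score === parsimonyScore);
  return { parsimonyScore, numBestPlacements: best.length };
}

const summary = summarizePlacements([
  { node: 'node_1', score: 3 },
  { node: 'node_2', score: 1 },
  { node: 'node_3', score: 1 },
]);
console.log(summary); // { parsimonyScore: 1, numBestPlacements: 2 }
```

Here two nodes tie for the minimum score, so the sample has two maximally parsimonious placements and its position in the tree is less certain than if there were only one.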

Visualizing subtrees

Each sample in the table will have a button that opens the subtree containing that sample in Auspice. The subtree visualization opens in a new browser tab, but no data is sent over the Internet.

Downloading data

Newick files for each of the generated subtrees can be downloaded at the bottom of the Auspice visualization page.

How it works

The ShUShER web app uses a ported version of UShER that can be run client-side in a web browser. The original C++ code base is compiled to WebAssembly with Emscripten and wrapped in a React frontend (read more about the port here). User-provided samples are not transmitted across the Internet, and computation is performed locally in the browser. We use a modified version of Auspice to display the subtrees computed by UShER. The visualization opens in a new browser tab, using localStorage to share data between tabs without transmitting any user data over the web.
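The tab-to-tab handoff can be sketched with the Web Storage interface, which is shared by same-origin tabs in a browser. The key name, function names, and the shape of the stored object are assumptions made for this example and are not taken from the ShUShER source. A small in-memory stand-in is included so the sketch also runs outside a browser.

```javascript
// Minimal in-memory stand-in for the Web Storage interface so the sketch
// runs outside a browser. In a browser, pass window.localStorage instead.
class MemoryStorage {
  constructor() { this.items = new Map(); }
  setItem(key, value) { this.items.set(key, String(value)); }
  getItem(key) { return this.items.has(key) ? this.items.get(key) : null; }
  removeItem(key) { this.items.delete(key); }
}

// Hypothetical key name; the real app may use a different one.
const SUBTREE_KEY = 'shusher-subtree';

// Tab A: store the subtree (here a Newick string plus a label) as JSON.
function publishSubtree(storage, subtree) {
  storage.setItem(SUBTREE_KEY, JSON.stringify(subtree));
}

// Tab B: read the subtree back; returns null if nothing was stored.
function readSubtree(storage) {
  const raw = storage.getItem(SUBTREE_KEY);
  return raw === null ? null : JSON.parse(raw);
}

const storage = new MemoryStorage();
publishSubtree(storage, { sample: 'sample_1', newick: '((A,B),sample_1);' });
console.log(readSubtree(storage).newick); // ((A,B),sample_1);
```

Because the data is written to and read from storage on the user's machine, nothing in this exchange requires a network request.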

FASTA to VCF conversion is performed by aligning each provided sample pairwise to the reference SARS-CoV-2 genome. The implementation of pairwise alignment is from Nextclade.
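To make the conversion step concrete, the sketch below shows how variant records could be read off a pairwise alignment once it exists. It is a simplification under stated assumptions: the alignment itself is taken as given, only substitutions are reported, and gaps and ambiguous bases are skipped. The function name and record shape are hypothetical and do not reflect the Nextclade or ShUShER code.

```javascript
// Simplified sketch: handles substitutions only. Gaps ('-') and ambiguous
// bases ('N') are skipped; a real VCF conversion must also handle indels.
function callSubstitutions(alignedRef, alignedSample) {
  if (alignedRef.length !== alignedSample.length) {
    throw new Error('aligned sequences must have the same length');
  }
  const variants = [];
  let refPos = 0; // 1-based position in the ungapped reference
  for (let i = 0; i < alignedRef.length; i++) {
    const ref = alignedRef[i].toUpperCase();
    const alt = alignedSample[i].toUpperCase();
    if (ref === '-') continue; // insertion relative to the reference
    refPos += 1;
    if (alt === '-' || alt === 'N' || alt === ref) continue;
    variants.push({ pos: refPos, ref, alt });
  }
  return variants;
}

const variants = callSubstitutions('ACGT-ACGT', 'ACTTGAC-T');
console.log(variants); // [ { pos: 3, ref: 'G', alt: 'T' } ]
```

Positions are counted along the reference, so an insertion in the sample does not shift the coordinates of later variants.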

Installation (for developers)

ShUShER currently supports building only on Linux systems and has been tested on Ubuntu 20.04.

If you would like to run ShUShER locally or modify the source, first download the source code, e.g.:

wget https://github.com/amkram/shusher/archive/refs/tags/latest.tar.gz

tar xvzf latest.tar.gz

The above commands will download and extract the latest tagged release of ShUShER. View all "Releases" in the right sidebar if you want to download a specific version. Alternatively, cloning this repository will give you the latest unreleased code, which may be unstable.

The downloaded source code contains code for building both the web app and the UShER port.

Running the web app locally

Enter the web-app subdirectory and run

npm install

To build the app, run

npm run build

And to start the local server, run

npm start

You should now be able to access ShUShER in your browser at localhost:4000

Compiling UShER to WebAssembly

The directory usher-port contains the original C++ UShER code and a script that will compile it to WebAssembly. You only need to compile UShER yourself if you want to change the UShER source code. Otherwise, the web app will automatically use the most recent pre-compiled release from this repository.

1. Install Dependencies

sudo apt-get update

sudo apt-get install wget python3 build-essential cmake protobuf-compiler dh-autoreconf

2. Compile UShER

./installUbuntuWeb.sh

This script will download the C++ library dependencies of UShER, make some modifications necessary for WebAssembly compilation, and then compile them using Emscripten. Output in the build directory includes usher.wasm, usher.js, usher.data, and usher.worker.js, all of which are used by the ShUShER web app.

3. Specify custom UShER code

By default, the web app grabs the latest tagged release of the WebAssembly UShER bundle from this repository. If you compiled UShER yourself using the above steps, you can tell ShUShER to use your compiled code instead.

In the web-app subdirectory, edit package.json and change the following entry:

"config": {
  "usherBundle": "latest"
}

to

"config": {
  "usherBundle": "[path to build output]"
}
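For context on how such a setting can be consumed: npm exposes entries of the package.json "config" object to scripts started with `npm run` as `npm_package_config_<key>` environment variables. The sketch below reads the value that way. How ShUShER itself reads the setting is not described here, so the function name and the returned object are hypothetical.

```javascript
// Hypothetical resolver: decide where to load the UShER bundle from.
// npm exposes package.json "config" entries to scripts run via `npm run`
// as npm_package_config_<key> environment variables.
function resolveUsherBundle(env) {
  const value = env.npm_package_config_usherBundle || 'latest';
  return value === 'latest'
    ? { source: 'release', location: 'latest' } // pre-compiled release
    : { source: 'local', location: value };     // path to local build output
}

console.log(resolveUsherBundle(process.env));
```

When run outside an npm script the variable is unset, so the sketch falls back to "latest".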

Contributing

We welcome and encourage contributions to ShUShER from the community. If you would like to contribute, please read the contribution guidelines and code of conduct.

About this repository

usher-port contains the scripts and files needed to compile UShER to WebAssembly. See here for details on the process.

web-app contains the React application that uses the UShER port.

Twice a day, the UShER C++ source hosted in this repository is updated from the main UShER repository.

Upon each push to the master branch, the integration test GitHub Action is run, which (1) compiles the latest source from the main UShER repo to a binary executable, (2) compiles UShER to WebAssembly with this repo's latest code, and (3) runs both on a sample file and compares the outputs to ensure they are the same.
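The comparison in step (3) could look roughly like the sketch below. It is an assumption about what "the same" means, not the repository's actual test: here two outputs match if they are identical after normalizing line endings and trailing whitespace. The function names are hypothetical.

```javascript
// Hypothetical comparison step: treat two outputs as equal if they match
// line by line after normalizing line endings and trailing whitespace.
function normalizeOutput(text) {
  return text
    .replace(/\r\n/g, '\n')
    .split('\n')
    .map((line) => line.trimEnd())
    .join('\n')
    .trim();
}

function outputsMatch(nativeOutput, wasmOutput) {
  return normalizeOutput(nativeOutput) === normalizeOutput(wasmOutput);
}
```

A stricter check, such as a byte-for-byte comparison, would catch more differences but would also fail on harmless platform differences such as line endings.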

New releases are tagged periodically and pushed to the live web app.

Acknowledgements

This project uses or adapts code from several open-source projects. We are grateful for their contributions.

Pairwise sequence alignment uses the implementation from Nextclade.

Visualization of subtrees is performed with Auspice.

Scripts to modify the Auspice server are from auspice.us.

Nextclade, Auspice, and auspice.us are part of the Nextstrain project.

The core functionality of this tool is a ported version of UShER.

Owner

  • Name: Alex Kramer
  • Login: amkram
  • Kind: user
  • Location: Santa Cruz, CA
  • Company: @corbett-lab

Graduate student at UC Santa Cruz - Biomolecular Engineering and Bioinformatics

JOSS Publication

ShUShER: private browser-based placement of sensitive genome samples on phylogenetic trees
Published
October 14, 2021
Volume 6, Issue 66, Page 3677
Authors
Alexander Kramer ORCID
Department of Biomolecular Engineering, University of California Santa Cruz. Santa Cruz, CA 95064, USA, Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
Yatish Turakhia ORCID
Department of Electrical and Computer Engineering, University of California, San Diego; San Diego, CA 92093, USA
Russell Corbett-Detig ORCID
Department of Biomolecular Engineering, University of California Santa Cruz. Santa Cruz, CA 95064, USA, Genomics Institute, University of California Santa Cruz, Santa Cruz, CA 95064, USA
Editor
Mikkel Meyer Andersen ORCID
Tags
phylogenetics WebAssembly React SARS-CoV-2

GitHub Events

Total
  • Watch event: 1
  • Push event: 1
Last Year
  • Watch event: 1
  • Push event: 1

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 353
  • Total Committers: 3
  • Avg Commits per committer: 117.667
  • Development Distribution Score (DDS): 0.008
Past Year
  • Commits: 5
  • Committers: 1
  • Avg Commits per committer: 5.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Alex Kramer a****r@u****u 350
Alex Kramer a****.@u****u 2
root r****t@D****n 1
Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0

Dependencies

web-app/package-lock.json npm
  • 891 dependencies
web-app/package.json npm
  • @material-ui/core ^4.11.4
  • @material-ui/icons ^4.11.2
  • @material-ui/lab ^4.0.0-alpha.58
  • auspice 2.23.0
  • express ^4.17.1
  • heroku-ssl-redirect 0.0.4
  • lodash ^4.17.21
  • node-gzip ^1.1.2
  • react-file-drop ^3.1.2
  • react-redux ^7.2.4
.github/workflows/build_and_test.yml actions
  • actions/checkout v2 composite
  • actions/github-script v3 composite
.github/workflows/bundle.yml actions
  • actions/checkout v2 composite
  • marvinpinto/action-automatic-releases latest composite
.github/workflows/deploy_heroku.yml actions
  • actions/checkout v2 composite
  • akhileshns/heroku-deploy v3.12.12 composite
.github/workflows/deploy_production.yml actions
  • appleboy/ssh-action master composite
.github/workflows/pull_usher.yml actions
  • EndBug/add-and-commit v7.2.1 composite
  • actions/checkout v2 composite
usher-port/Dockerfile docker
  • ubuntu 18.04 build
usher-port/environment.yml pypi