The UCSCXenaTools R package

The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq - Published in JOSS (2019)

https://github.com/ropensci/ucscxenatools

Science Score: 95.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 10 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org
  • Committers with academic emails
    1 of 4 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords

api-client bioinformatics ccle downloader icgc r tcga toil treehouse ucsc ucsc-xena

Repository

:package: An R package for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq https://cran.r-project.org/web/packages/UCSCXenaTools/

Basic Info
Statistics
  • Stars: 109
  • Watchers: 4
  • Forks: 14
  • Open Issues: 1
  • Releases: 15
Created over 6 years ago · Last pushed 4 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, echo = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)
```

# UCSCXenaTools



[![CRAN status](https://www.r-pkg.org/badges/version/UCSCXenaTools)](https://cran.r-project.org/package=UCSCXenaTools)
[![lifecycle](https://img.shields.io/badge/lifecycle-stable-blue.svg)](https://lifecycle.r-lib.org/articles/stages.html)
[![R-CMD-check](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml/badge.svg)](https://github.com/ropensci/UCSCXenaTools/actions/workflows/main.yml) 
[![](https://cranlogs.r-pkg.org/badges/grand-total/UCSCXenaTools?color=orange)](https://cran.r-project.org/package=UCSCXenaTools)
[![rOpenSci](https://badges.ropensci.org/315_status.svg)](https://github.com/ropensci/software-review/issues/315)
[![DOI](https://joss.theoj.org/papers/10.21105/joss.01627/status.svg)](https://doi.org/10.21105/joss.01627)



**UCSCXenaTools** is an R package for accessing genomics data from the UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq.
Public omics data from UCSC Xena are supported through [**multiple turn-key Xena Hubs**](https://xenabrowser.net/datapages/), which are a collection of UCSC-hosted public databases such as TCGA, ICGC, TARGET, GTEx, CCLE, and others. Databases are normalized so they can be combined, linked, filtered, explored and downloaded.

**Who is the target audience and what are the scientific applications of this package?**

* Target Audience: cancer and clinical researchers, bioinformaticians
* Applications: genomic and clinical analyses

## Table of Contents

* [Installation](#installation)
* [Data Hub List](#data-hub-list)
* [Basic usage](#basic-usage)
* [Citation](#citation)
* [How to contribute](#how-to-contribute)
* [Acknowledgment](#acknowledgment)

## Installation

Install the stable release from r-universe or CRAN with:

```{r, eval=FALSE}
install.packages('UCSCXenaTools', repos = c('https://ropensci.r-universe.dev', 'https://cloud.r-project.org'))
#install.packages("UCSCXenaTools")
```

You can also install the development version of **UCSCXenaTools** from GitHub with:

```{r gh-installation, eval = FALSE}
# install.packages("remotes")
remotes::install_github("ropensci/UCSCXenaTools")
```

To build the vignettes locally, add two options:

```{r, eval=FALSE}
remotes::install_github("ropensci/UCSCXenaTools", build_vignettes = TRUE, dependencies = TRUE)
```
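
After installing, it can be useful to confirm that the package is available before loading it. The following is a minimal sketch using only base R; the variable names `pkg` and `installed` are illustrative and not part of the package.

```r
# Check whether UCSCXenaTools is installed without attaching it.
pkg <- "UCSCXenaTools"
installed <- requireNamespace(pkg, quietly = TRUE)

if (installed) {
  # Report the installed version
  message(pkg, " version: ", as.character(utils::packageVersion(pkg)))
} else {
  message(pkg, " is not installed; see the installation commands above.")
}
```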

## Data Hub List

All datasets are available at <https://xenabrowser.net/datapages/>.

Currently, **UCSCXenaTools** supports the following data hubs of UCSC Xena.

* UCSC Public Hub: 
* TCGA Hub: 
* GDC Xena Hub (new): 
* GDC v18.0 Xena Hub (old): 
* ICGC Xena Hub: 
* Pan-Cancer Atlas Hub: 
* UCSC Toil RNAseq Recompute Compendium Hub: 
* PCAWG Xena Hub: 
* ATAC-seq Hub: 
* Treehouse Xena Hub: 
* ...

Users can manually update the dataset list to the newest version from UCSC Xena with the `XenaDataUpdate()` function, then restart R and run `library(UCSCXenaTools)` again.

If the URL of a data hub changes or a new data hub comes online, please let me know by email or by [opening an issue on GitHub](https://github.com/ropensci/UCSCXenaTools/issues).


## Basic usage

Downloading UCSC Xena datasets and loading them into R with **UCSCXenaTools** is a five-step workflow: `generate`, `filter`, `query`, `download` and `prepare`. These steps are implemented as the functions `XenaGenerate`, `XenaFilter`, `XenaQuery`, `XenaDownload` and `XenaPrepare`, respectively. The functions are easy to use and combine well with other packages such as `dplyr`.

To show the basic usage of **UCSCXenaTools**, we will download the clinical data of the LUNG, LUAD and LUSC cohorts from the TCGA (hg19 version) data hub. To learn more about **UCSCXenaTools**, run `browseVignettes("UCSCXenaTools")` to read the vignette.

### XenaData data.frame

**UCSCXenaTools** ships with a built-in `data.frame`, `XenaData`, which records information about all datasets on the UCSC Xena Data Hubs. It is used to generate instances of the `XenaHub` class.

You can load `XenaData` after loading `UCSCXenaTools` into R.

```{r}
library(UCSCXenaTools)
data(XenaData)

head(XenaData)
```
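
Because `XenaData` is an ordinary `data.frame`, it can also be explored with base R before building a `XenaHub` object. The sketch below assumes the table has the columns `XenaHostNames`, `XenaCohorts` and `XenaDatasets`; check `names(XenaData)` if your version differs. The variable names are illustrative.

```r
library(UCSCXenaTools)
data(XenaData)

# Number of datasets registered under each hub name
hub_counts <- sort(table(XenaData$XenaHostNames), decreasing = TRUE)
hub_counts

# Datasets on the TCGA hub whose names mention "clinical"
tcga <- XenaData[XenaData$XenaHostNames == "tcgaHub", ]
clinical <- tcga[grepl("clinical", tcga$XenaDatasets, ignore.case = TRUE), ]
head(clinical[, c("XenaCohorts", "XenaDatasets")])
```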

### Workflow

Select datasets.

```{r}
# The filter options of XenaFilter() support regular expressions
XenaGenerate(subset = XenaHostNames=="tcgaHub") %>% 
  XenaFilter(filterDatasets = "clinical") %>% 
  XenaFilter(filterDatasets = "LUAD|LUSC|LUNG") -> df_todo

df_todo
```
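
The two chained `XenaFilter()` calls above narrow the selection step by step, so a dataset is kept only if it matches both patterns. The base R sketch below illustrates that logic with `grepl()`, assuming the filter behaves like case-insensitive regular-expression matching. The dataset names are hypothetical examples chosen for illustration, not output from the package.

```r
# Hypothetical dataset names, used only to illustrate the matching logic
datasets <- c(
  "TCGA.LUAD.sampleMap/LUAD_clinicalMatrix",
  "TCGA.LUSC.sampleMap/LUSC_clinicalMatrix",
  "TCGA.LUNG.sampleMap/LUNG_clinicalMatrix",
  "TCGA.BRCA.sampleMap/BRCA_clinicalMatrix",
  "TCGA.LUAD.sampleMap/HiSeqV2"
)

# Two successive filters act like a logical AND of two patterns.
# "LUAD|LUSC|LUNG" uses regex alternation to match any of the three cohorts.
is_clinical <- grepl("clinical", datasets, ignore.case = TRUE)
is_lung <- grepl("LUAD|LUSC|LUNG", datasets, ignore.case = TRUE)

selected <- datasets[is_clinical & is_lung]
selected
```

Only the three lung clinical matrices remain: the BRCA entry fails the cohort pattern and the `HiSeqV2` entry fails the `clinical` pattern.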

Query and download.

```{r}
XenaQuery(df_todo) %>%
  XenaDownload() -> xe_download
```
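
To keep downloaded files in a known location rather than the default, a destination directory can be passed to `XenaDownload()`. This is a sketch that assumes the `destdir` argument of `XenaDownload()` and needs network access; the directory name `xena_clinical` is a hypothetical choice.

```r
library(UCSCXenaTools)

# Re-create the selection from the workflow above
sel <- XenaGenerate(subset = XenaHostNames == "tcgaHub")
sel <- XenaFilter(sel, filterDatasets = "clinical")
sel <- XenaFilter(sel, filterDatasets = "LUAD|LUSC|LUNG")

# Hypothetical output directory inside the session's temporary folder
out_dir <- file.path(tempdir(), "xena_clinical")
dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)

# Both steps below contact the Xena hub and need network access
xe_query <- XenaQuery(sel)
xe_download <- XenaDownload(xe_query, destdir = out_dir)

# List the files that were written
list.files(out_dir)
```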

Load the downloaded data into R for analysis.

```{r}
cli = XenaPrepare(xe_download)
class(cli)
names(cli)
```
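
When several files are prepared at once, the result can be handled like any named list of data frames. The sketch below assumes that structure and uses a small stand-in object, `cli_demo`, whose names, columns and values are made up for illustration. It shows one way to inspect the tables and stack them on their shared columns.

```r
# Stand-in for the result of XenaPrepare(): a named list of data frames.
# The names, columns and values here are made up for illustration only.
cli_demo <- list(
  LUAD_clinicalMatrix = data.frame(sampleID = c("S1", "S2"), gender = c("F", "M")),
  LUSC_clinicalMatrix = data.frame(sampleID = "S3", gender = "M")
)

# Rows and columns of each table (one column of the matrix per table)
dims <- vapply(cli_demo, dim, integer(2))
dims

# Keep only the columns shared by every table, tag each row with its
# source table, and stack the tables into one data frame
shared <- Reduce(intersect, lapply(cli_demo, names))
combined <- do.call(rbind, Map(
  function(df, nm) cbind(source = nm, df[, shared, drop = FALSE]),
  cli_demo, names(cli_demo)
))
rownames(combined) <- NULL
combined
```

Restricting to shared columns avoids errors from `rbind()` when clinical tables from different cohorts do not have identical columns.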

## More to read

- [Introduction and basic usage of UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-intro/)
- [UCSCXenaTools: Retrieve Gene Expression and Clinical Information from UCSC Xena for Survival Analysis](https://shixiangwang.github.io/home/en/post/ucscxenatools-201908/)
- [Obtain RNAseq Values for a Specific Gene in Xena Database](https://shixiangwang.github.io/home/en/post/2020-07-22-ucscxenatools-single-gene/)
- [UCSC Xena Access APIs in UCSCXenaTools](https://shixiangwang.github.io/home/en/tools/ucscxenatools-api/)

## Citation

If you use this package, please cite the following paper.

```
Wang et al., (2019). The UCSCXenaTools R package: a toolkit for accessing genomics data
  from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq. 
  Journal of Open Source Software, 4(40), 1627, https://doi.org/10.21105/joss.01627

# For BibTex
  
@article{Wang2019UCSCXenaTools,
	journal = {Journal of Open Source Software},
	doi = {10.21105/joss.01627},
	issn = {2475-9066},
	number = {40},
	publisher = {The Open Journal},
	title = {The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq},
	url = {https://dx.doi.org/10.21105/joss.01627},
	volume = {4},
	author = {Wang, Shixiang and Liu, Xuesong},
	pages = {1627},
	date = {2019-08-05},
	year = {2019},
	month = {8},
	day = {5},
}
```

To cite UCSC Xena, use the following paper.

```
Goldman, Mary, et al. "The UCSC Xena Platform for cancer genomics data 
    visualization and interpretation." BioRxiv (2019): 326470.
```

## How to contribute

To contribute, please follow these steps:

* Clone the project from GitHub
* Open `UCSCXenaTools.Rproj` in RStudio
* Modify the source code
* Run `devtools::check()` and fix all errors, warnings and notes
* Create a pull request

## Acknowledgment

This package is based on [XenaR](https://github.com/mtmorgan/XenaR); thanks to [Martin Morgan](https://github.com/mtmorgan) for his work.

[![ropensci_footer](https://ropensci.org/public_images/ropensci_footer.png)](https://ropensci.org)

Owner

  • Name: rOpenSci
  • Login: ropensci
  • Kind: organization
  • Email: info@ropensci.org
  • Location: Berkeley, CA

JOSS Publication

The UCSCXenaTools R package: a toolkit for accessing genomics data from UCSC Xena platform, from cancer multi-omics to single-cell RNA-seq
Published
August 05, 2019
Volume 4, Issue 40, Page 1627
Authors
Shixiang Wang
School of Life Science and Technology, ShanghaiTech University, Shanghai Institute of Biochemistry and Cell Biology, Chinese Academy of Sciences, University of Chinese Academy of Sciences
Xuesong Liu
School of Life Science and Technology, ShanghaiTech University
Editor
Arfon Smith
Tags
cancer genomics data access

Papers & Mentions

Total mentions: 4

  • The SLC Family Are Candidate Diagnostic and Prognostic Biomarkers in Clear Cell Renal Cell Carcinoma
  • The SMART App: an interactive web application for comprehensive DNA methylation analysis and visualization
  • Construction of a Prognostic Risk Prediction Model for Obesity Combined With Breast Cancer
  • CXCL12 and IL7R as Novel Therapeutic Targets for Liver Hepatocellular Carcinoma Are Correlated With Somatic Mutations and the Tumor Immunological Microenvironment

GitHub Events

Total
  • Create event: 1
  • Release event: 1
  • Issues event: 11
  • Watch event: 8
  • Issue comment event: 19
  • Push event: 9
  • Fork event: 2
Last Year
  • Create event: 1
  • Release event: 1
  • Issues event: 11
  • Watch event: 8
  • Issue comment event: 19
  • Push event: 10
  • Fork event: 2

Committers

Last synced: 5 months ago

All Time
  • Total Commits: 266
  • Total Committers: 4
  • Avg Commits per committer: 66.5
  • Development Distribution Score (DDS): 0.023
Past Year
  • Commits: 9
  • Committers: 1
  • Avg Commits per committer: 9.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
ShixiangWang w****x@s****n 260
Martin Morgan m****n@f****g 3
Shixiang Wang (王诗翔) w****g@i****m 2
rOpenSci Bot m****t@g****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 4 months ago

All Time
  • Total issues: 46
  • Total pull requests: 0
  • Average time to close issues: about 1 month
  • Average time to close pull requests: N/A
  • Total issue authors: 15
  • Total pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 5
  • Pull requests: 0
  • Average time to close issues: 3 months
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 2.8
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • ShixiangWang (33)
  • ayueme (1)
  • markgene (1)
  • gbregni (1)
  • danshu (1)
  • vuzun (1)
  • Amhaslam (1)
  • ghost (1)
  • adomingues (1)
  • sckott (1)
  • jianguozhouzunyimedicaluniversity (1)
  • Shiywa (1)
  • jbagnot (1)
  • iamhealthy (1)
  • sunta3iouxos (1)
Pull Request Authors
Top Labels
Issue Labels
enhancement (10) bug (7) more-info-needed (6) help wanted (4) data-source-update (3) good first issue (2) plan (2) question (1) invalid (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 1,662 last-month
  • Total docker downloads: 524
  • Total dependent packages: 2
  • Total dependent repositories: 6
  • Total versions: 36
  • Total maintainers: 1
cran.r-project.org: UCSCXenaTools

Download and Explore Datasets from UCSC Xena Data Hubs

  • Versions: 36
  • Dependent Packages: 2
  • Dependent Repositories: 6
  • Downloads: 1,662 Last month
  • Docker Downloads: 524
Rankings
Stargazers count: 4.1%
Forks count: 5.8%
Average: 11.4%
Dependent repos count: 11.9%
Dependent packages count: 13.6%
Docker downloads count: 16.5%
Downloads: 16.6%
Maintainers (1)
Last synced: 4 months ago