tidySummarizedExperiment

Brings SummarizedExperiment to the tidyverse

https://github.com/tidyomics/tidySummarizedExperiment

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.2%) to scientific vocabulary

Keywords

bioconductor genomics rnaseq summarizedexperiment tidyverse transcriptomics
Last synced: 6 months ago · JSON representation

Repository

Brings SummarizedExperiment to the tidyverse

Basic Info
Statistics
  • Stars: 26
  • Watchers: 3
  • Forks: 7
  • Open Issues: 17
  • Releases: 8
Topics
bioconductor genomics rnaseq summarizedexperiment tidyverse transcriptomics
Created over 5 years ago · Last pushed 6 months ago
Metadata Files
Readme

README.Rmd

---
title: "tidySummarizedExperiment - part of tidytranscriptomics"
output: github_document
always_allow_html: true
---


[![Lifecycle:maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing) [![R build status](https://github.com/stemangiola/tidySummarizedExperiment/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/stemangiola/tidySummarizedExperiment/actions)


```{r echo=FALSE}
knitr::opts_chunk$set(fig.path="man/figures/")
```
**Brings SummarizedExperiment to the tidyverse!**

website: [stemangiola.github.io/tidySummarizedExperiment/](https://stemangiola.github.io/tidySummarizedExperiment/)

Another [nice introduction](https://carpentries-incubator.github.io/bioc-intro/60-next-steps/index.html) by carpentries-incubator.

Please also have a look at

- [tidySingleCellExperiment](https://stemangiola.github.io/tidySingleCellExperiment/) for tidy manipulation of SingleCellExperiment objects
- [tidyseurat](https://stemangiola.github.io/tidyseurat/) for tidy manipulation of Seurat objects
- [tidybulk](https://stemangiola.github.io/tidybulk/) for tidy analysis of RNA sequencing data
- [nanny](https://github.com/stemangiola/nanny) for tidy high-level data analysis and manipulation
- [tidygate](https://github.com/stemangiola/tidygate) for adding custom gate information to your tibble
- [tidyHeatmap](https://stemangiola.github.io/tidyHeatmap/) for heatmaps produced with tidy principles


```{r, echo=FALSE, include=FALSE}
library(knitr)
knitr::opts_chunk$set(warning=FALSE, message=FALSE)
```

# Introduction

tidySummarizedExperiment provides a bridge between Bioconductor [SummarizedExperiment](https://bioconductor.org/packages/release/bioc/html/SummarizedExperiment.html) [@morgan2020summarized] and the tidyverse [@wickham2019welcome]. It creates an invisible layer that enables viewing the
Bioconductor *SummarizedExperiment* object as a tidyverse tibble, and provides SummarizedExperiment-compatible *dplyr*, *tidyr*, *ggplot* and *plotly* functions. This allows users to get the best of both Bioconductor and tidyverse worlds.


## Functions/utilities available

SummarizedExperiment-compatible Functions | Description
------------ | -------------
`all` | After all `tidySummarizedExperiment` is a SummarizedExperiment object, just better

tidyverse Packages | Description
------------ | -------------
`dplyr` | Almost all `dplyr` APIs like for any tibble
`tidyr` | Almost all `tidyr` APIs like for any tibble
`ggplot2` | `ggplot` like for any tibble
`plotly` | `plot_ly` like for any tibble

Utilities | Description
------------ | -------------
`as_tibble` | Convert cell-wise information to a `tbl_df`

## Installation

```{r, eval=FALSE}
if (!requireNamespace("BiocManager", quietly=TRUE)) {
      install.packages("BiocManager")
  }

BiocManager::install("tidySummarizedExperiment")
```

From Github (development)
```{r, eval=FALSE}
devtools::install_github("stemangiola/tidySummarizedExperiment")
```

Load libraries used in the examples.

```{r}
library(ggplot2)
library(tidySummarizedExperiment)
```


# Create `tidySummarizedExperiment`, the best of both worlds!

This is a SummarizedExperiment object but it is evaluated as a tibble. So it is fully compatible both with SummarizedExperiment and tidyverse APIs.

```{r}
pasilla_tidy <- tidySummarizedExperiment::pasilla 
```

**It looks like a tibble**

```{r}
pasilla_tidy
```

**But it is a SummarizedExperiment object after all**

```{r}
assays(pasilla_tidy)
```


# Tidyverse commands

We can use tidyverse commands to explore the tidy SummarizedExperiment object.

We can use `slice` to choose rows by position, for example to choose the first row.

```{r}
pasilla_tidy %>%
    slice(1)
```

We can use `filter` to choose rows by criteria.

```{r}
pasilla_tidy %>%
    filter(condition == "untreated")
```

We can use `select` to choose columns.

```{r}
pasilla_tidy %>%
    select(.sample)
```

We can use `count` to count how many rows we have for each sample.

```{r}
pasilla_tidy %>%
    count(.sample)
```

We can use `distinct` to see what distinct sample information we have.

```{r}
pasilla_tidy %>%
    distinct(.sample, condition, type)
```

We could use `rename` to rename a column. For example, to modify the type column name.

```{r}
pasilla_tidy %>%
    rename(sequencing=type)
```

We could use `mutate` to create a column. For example, we could create a new type column that contains single
and paired instead of single_end and paired_end.

```{r}
pasilla_tidy %>%
    mutate(type=gsub("_end", "", type))
```

We could use `unite` to combine multiple columns into a single column.

```{r}
pasilla_tidy %>%
    unite("group", c(condition, type))
```

We can also combine commands with the tidyverse pipe `%>%`.

For example, we could combine `group_by` and `summarise` to get the total counts for each sample.

```{r}
pasilla_tidy %>%
    group_by(.sample) %>%
    summarise(total_counts=sum(counts))
```

We could combine `group_by`, `mutate` and `filter` to get the transcripts with mean count > 0.

```{r}
pasilla_tidy %>%
    group_by(.feature) %>%
    mutate(mean_count=mean(counts)) %>%
    filter(mean_count > 0)
```


# Plotting

```{r}
my_theme <-
    list(
        scale_fill_brewer(palette="Set1"),
        scale_color_brewer(palette="Set1"),
        theme_bw() +
            theme(
                panel.border=element_blank(),
                axis.line=element_line(),
                panel.grid.major=element_line(size=0.2),
                panel.grid.minor=element_line(size=0.1),
                text=element_text(size=12),
                legend.position="bottom",
                aspect.ratio=1,
                strip.background=element_blank(),
                axis.title.x=element_text(margin=margin(t=10, r=10, b=10, l=10)),
                axis.title.y=element_text(margin=margin(t=10, r=10, b=10, l=10))
            )
    )
```

We can treat `pasilla_tidy` as a normal tibble for plotting.

Here we plot the distribution of counts per sample.

```{r plot1}
pasilla_tidy %>%
    ggplot(aes(counts + 1, group=.sample, color=`type`)) +
    geom_density() +
    scale_x_log10() +
    my_theme
```

Owner

  • Name: tidyomics
  • Login: tidyomics
  • Kind: organization

Open organizaion of developers creating tidy-style analysis packages in R/Bioconductor and beyond. Reach out to join.

GitHub Events

Total
  • Issues event: 2
  • Issue comment event: 3
  • Push event: 16
  • Pull request review comment event: 6
  • Pull request event: 2
  • Create event: 2
Last Year
  • Issues event: 2
  • Issue comment event: 3
  • Push event: 16
  • Pull request review comment event: 6
  • Pull request event: 2
  • Create event: 2

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 1
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 1
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • stemangiola (1)
Pull Request Authors
  • stemangiola (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

DESCRIPTION cran
  • R >= 4.1.0 depends
  • SummarizedExperiment * depends
  • S4Vectors * imports
  • cli * imports
  • dplyr * imports
  • ellipsis * imports
  • fansi * imports
  • ggplot2 * imports
  • lifecycle * imports
  • magrittr * imports
  • methods * imports
  • pillar * imports
  • plotly * imports
  • purrr * imports
  • rlang * imports
  • stringr * imports
  • tibble >= 3.0.4 imports
  • tidyr * imports
  • tidyselect * imports
  • utils * imports
  • vctrs * imports
  • BiocStyle * suggests
  • knitr * suggests
  • markdown * suggests
  • testthat * suggests
.github/workflows/check-bioc.yml actions
  • actions/cache v3 composite
  • actions/checkout v2 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite