xlcutter

Parse Batches of 'xlsx' Files Based on a Template

https://github.com/bisaloo/xlcutter

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (16.5%) to scientific vocabulary

Keywords

data-extraction excel non-rectangular-data r r-package tidy-data
Last synced: 4 months ago

Repository

Parse Batches of 'xlsx' Files Based on a Template

Basic Info
Statistics
  • Stars: 7
  • Watchers: 1
  • Forks: 0
  • Open Issues: 3
  • Releases: 2
Created about 3 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Codemeta

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# xlcutter


[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/license/mit/)
[![R-CMD-check](https://github.com/Bisaloo/xlcutter/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/Bisaloo/xlcutter/actions/workflows/R-CMD-check.yaml)
[![Codecov test coverage](https://codecov.io/gh/Bisaloo/xlcutter/branch/main/graph/badge.svg)](https://app.codecov.io/gh/Bisaloo/xlcutter?branch=main)
[![lifecycle-concept](https://raw.githubusercontent.com/reconverse/reconverse.github.io/master/images/badge-concept.svg)](https://www.reconverse.org/lifecycle.html#concept)


This package allows you to parse entire folders of non-rectangular 'xlsx' files
into a single rectangular and tidy 'data.frame' based on a custom template file
defining the column names of the output.

## Installation

You can install the latest stable version of this package from CRAN:

``` r
install.packages("xlcutter")
```

or the development version from [GitHub](https://github.com/) with:

``` r
# install.packages("remotes")
remotes::install_github("Bisaloo/xlcutter")
```

## Example

Non-rectangular Excel files are common in many domains. For a simple
demonstration here, we use the example of the ["Blue
timesheet"](https://templates.office.com/en-us/blue-timesheet-tm77799521),
in which employees can log their working hours.

A typical use case of xlcutter in this example would be a manager who wants
to get a single rectangular dataset with the timesheets from different
employees.

![Screenshot of timesheets from two fictitious employees](man/figures/screenshot_timesheets.png)

Your first step to extract the data is to define the various columns you want
in the output in a *template* file. You can mark the data cells to extract with
any custom marker, with the default being `{{ column_name }}`.

![Screenshot of a template for the timesheet example](man/figures/screenshot_template.png)

```{r}
library(xlcutter)

data_files <- list.files(
  system.file("example", "timesheet", package = "xlcutter"),
  pattern = "\\.xlsx$",
  full.names = TRUE
)

template_file <- system.file(
  "example", "timesheet_template.xlsx",
  package = "xlcutter"
)

xlsx_cutter(
  data_files,
  template_file
)
```
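
To make the idea of template-based extraction more concrete, here is a minimal base R sketch of the principle. It is not xlcutter's actual implementation: the cell tables, the `make_cells()`, `extract_markers()` and `cut_one()` helpers, and the employee data are all hypothetical and only illustrate how marked cells in a template can be mapped to columns in the output.

``` r
# Hypothetical stand-in for spreadsheet contents: one row per cell.
make_cells <- function(contents) {
  data.frame(
    address = c("A1", "B1", "A2", "B2"),
    content = contents,
    stringsAsFactors = FALSE
  )
}

# The template marks the cells to extract and names the output columns.
template_cells <- make_cells(
  c("Employee", "{{ employee_name }}", "Total hours", "{{ total_hours }}")
)

# Two fictitious data files sharing the template layout.
data_cells <- list(
  alice = make_cells(c("Employee", "Alice", "Total hours", "38")),
  bob   = make_cells(c("Employee", "Bob", "Total hours", "41"))
)

# Find the marked cells and strip the markers to get the column names.
extract_markers <- function(cells, marker_open = "{{", marker_close = "}}") {
  is_marker <- startsWith(cells$content, marker_open) &
    endsWith(cells$content, marker_close)
  marked <- cells[is_marker, ]
  data.frame(
    address = marked$address,
    column = trimws(substr(
      marked$content,
      nchar(marker_open) + 1,
      nchar(marked$content) - nchar(marker_close)
    )),
    stringsAsFactors = FALSE
  )
}

# Read the values found at the marked addresses into a one-row data.frame.
cut_one <- function(cells, markers) {
  values <- cells$content[match(markers$address, cells$address)]
  as.data.frame(
    as.list(setNames(values, markers$column)),
    stringsAsFactors = FALSE
  )
}

markers <- extract_markers(template_cells)

# One row per data file, one column per marker in the template.
result <- do.call(rbind, lapply(data_cells, cut_one, markers = markers))
print(result)
```

In this sketch, a different marker style only requires passing other `marker_open` and `marker_close` values to the hypothetical `extract_markers()` helper. All values come out as character strings, so any type conversion would be a separate step.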

## Other examples of use cases

Other typical use cases for this package could be:

- a hospital that wants to collate non-rectangular information sheets from
  different patients into a single rectangular dataset

Owner

  • Name: Hugo Gruson
  • Login: Bisaloo
  • Kind: user
  • Location: Heidelberg
  • Company: EMBL

Evolutionary Biologist turned Research Software Engineer in R.

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "identifier": "xlcutter",
  "description": "Parse entire folders of non-rectangular 'xlsx' files into a single rectangular and tidy 'data.frame' based on a custom template file defining the column names of the output.",
  "name": "xlcutter: Parse Batches of 'xlsx' Files Based on a Template",
  "relatedLink": "https://hugogruson.fr/xlcutter/",
  "codeRepository": "https://github.com/Bisaloo/xlcutter",
  "issueTracker": "https://github.com/Bisaloo/xlcutter/issues",
  "license": "https://spdx.org/licenses/MIT",
  "version": "0.1.0",
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "name": "R",
    "url": "https://r-project.org"
  },
  "runtimePlatform": "R version 4.2.2 Patched (2022-11-10 r83330)",
  "author": [
    {
      "@type": "Person",
      "givenName": "Hugo",
      "familyName": "Gruson",
      "email": "hugo.gruson+R@normalesup.org",
      "@id": "https://orcid.org/0000-0002-4094-1476"
    }
  ],
  "copyrightHolder": [
    {
      "@type": "Person",
      "givenName": "Hugo",
      "familyName": "Gruson",
      "email": "hugo.gruson+R@normalesup.org",
      "@id": "https://orcid.org/0000-0002-4094-1476"
    }
  ],
  "maintainer": [
    {
      "@type": "Person",
      "givenName": "Hugo",
      "familyName": "Gruson",
      "email": "hugo.gruson+R@normalesup.org",
      "@id": "https://orcid.org/0000-0002-4094-1476"
    }
  ],
  "softwareSuggestions": [
    {
      "@type": "SoftwareApplication",
      "identifier": "knitr",
      "name": "knitr",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=knitr"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "rmarkdown",
      "name": "rmarkdown",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=rmarkdown"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "testthat",
      "name": "testthat",
      "version": ">= 3.0.0",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=testthat"
    }
  ],
  "softwareRequirements": {
    "1": {
      "@type": "SoftwareApplication",
      "identifier": "tidyxl",
      "name": "tidyxl",
      "provider": {
        "@id": "https://cran.r-project.org",
        "@type": "Organization",
        "name": "Comprehensive R Archive Network (CRAN)",
        "url": "https://cran.r-project.org"
      },
      "sameAs": "https://CRAN.R-project.org/package=tidyxl"
    },
    "SystemRequirements": null
  },
  "fileSize": "767.235KB",
  "readme": "https://github.com/Bisaloo/xlcutter/blob/main/README.md",
  "contIntegration": [
    "https://github.com/Bisaloo/xlcutter/actions/workflows/R-CMD-check.yaml",
    "https://app.codecov.io/gh/Bisaloo/xlcutter?branch=main"
  ],
  "developmentStatus": "https://www.reconverse.org/lifecycle.html#concept",
  "keywords": [
    "data-extraction",
    "excel",
    "r",
    "r-package",
    "tidy-data",
    "non-rectangular-data"
  ]
}

GitHub Events

Total
  • Push event: 3
Last Year
  • Push event: 3

Committers

Last synced: about 1 year ago

All Time
  • Total Commits: 36
  • Total Committers: 1
  • Avg Commits per committer: 36.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 2
  • Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Hugo Gruson B****o 36

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 8
  • Total pull requests: 6
  • Average time to close issues: 3 months
  • Average time to close pull requests: 22 days
  • Total issue authors: 2
  • Total pull request authors: 1
  • Average comments per issue: 0.38
  • Average comments per pull request: 0.0
  • Merged pull requests: 6
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Bisaloo (7)
  • playmobilmeister (1)
Pull Request Authors
  • Bisaloo (7)

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/lint-changed-files.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/pkgdown.yaml actions
  • JamesIves/github-pages-deploy-action 4.1.4 composite
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/render_readme.yml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/checkout v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • tidyxl * imports
  • knitr * suggests
  • rmarkdown * suggests
  • testthat >= 3.0.0 suggests