pysparklyr

Extension to {sparklyr} that allows you to interact with Spark & Databricks Connect

https://github.com/mlverse/pysparklyr

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.0%) to scientific vocabulary

Keywords

databricks pyspark r spark spark-connect
Last synced: 6 months ago

Repository

Basic Info
Statistics
  • Stars: 17
  • Watchers: 7
  • Forks: 4
  • Open Issues: 16
  • Releases: 9
Created over 2 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog License

README.Rmd

---
output: github_document
---



```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
```

# pysparklyr



[![R-CMD-check](https://github.com/mlverse/pysparklyr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/mlverse/pysparklyr/actions/workflows/R-CMD-check.yaml)
[![Spark-Connect](https://github.com/mlverse/pysparklyr/actions/workflows/spark-tests.yaml/badge.svg)](https://github.com/mlverse/pysparklyr/actions/workflows/spark-tests.yaml)
[![codecov](https://codecov.io/gh/mlverse/pysparklyr/graph/badge.svg?token=O1N9qPabpF)](https://app.codecov.io/gh/mlverse/pysparklyr)
[![CRAN status](https://www.r-pkg.org/badges/version/pysparklyr)](https://CRAN.R-project.org/package=pysparklyr)


Integrates `sparklyr` with PySpark and Databricks. This package exists because
the new Spark Connect and Databricks Connect connection methods do not work
with the standard `sparklyr` integration.

## Installing

To install the released version from CRAN, use:

```r
install.packages("pysparklyr")
```

To install the development version from GitHub, use:

```r
remotes::install_github("mlverse/pysparklyr")
```

## Using

To learn how to use the package, see the Spark Connect and Databricks Connect
article on the official `sparklyr` website:
[Spark Connect, and Databricks Connect v2](https://spark.posit.co/deployment/databricks-connect.html)
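As a rough orientation before reading the article, here is a minimal sketch of opening a Databricks Connect session through `sparklyr` with `pysparklyr` installed. The helper name `connect_databricks()` is hypothetical and not part of either package. The sketch assumes the workspace URL and access token are stored in the `DATABRICKS_HOST` and `DATABRICKS_TOKEN` environment variables, and that a matching Python environment is already set up. The linked article remains the authoritative reference.

```r
# Hypothetical helper (not part of sparklyr or pysparklyr): wraps
# sparklyr::spark_connect() for the "databricks_connect" method.
connect_databricks <- function(cluster_id,
                               host = Sys.getenv("DATABRICKS_HOST"),
                               token = Sys.getenv("DATABRICKS_TOKEN")) {
  # Fail early if any required value is missing or empty.
  stopifnot(nzchar(cluster_id), nzchar(host), nzchar(token))

  # Assumption: pysparklyr is installed, so sparklyr can hand this
  # connection method over to it.
  sparklyr::spark_connect(
    master = host,
    cluster_id = cluster_id,
    token = token,
    method = "databricks_connect"
  )
}

# Usage (needs a reachable cluster, so it is left commented out):
# sc <- connect_databricks("<your-cluster-id>")
# sparklyr::spark_disconnect(sc)
```

A plain Spark Connect session would presumably follow the same pattern, with `method = "spark_connect"` and a `master` address pointing at the Spark Connect server.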

Owner

  • Name: mlverse
  • Login: mlverse
  • Kind: organization

Open source libraries to scale Data Science

GitHub Events

Total
  • Create event: 5
  • Release event: 3
  • Issues event: 13
  • Watch event: 3
  • Issue comment event: 30
  • Push event: 178
  • Pull request event: 27
  • Fork event: 2
Last Year
  • Create event: 5
  • Release event: 3
  • Issues event: 13
  • Watch event: 3
  • Issue comment event: 30
  • Push event: 178
  • Pull request event: 27
  • Fork event: 2

Issues and Pull Requests

All Time
  • Total issues: 64
  • Total pull requests: 93
  • Average time to close issues: 20 days
  • Average time to close pull requests: 3 days
  • Total issue authors: 17
  • Total pull request authors: 5
  • Average comments per issue: 1.56
  • Average comments per pull request: 0.33
  • Merged pull requests: 86
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 13
  • Pull requests: 29
  • Average time to close issues: 2 days
  • Average time to close pull requests: 8 days
  • Issue authors: 8
  • Pull request authors: 3
  • Average comments per issue: 0.46
  • Average comments per pull request: 0.83
  • Merged pull requests: 26
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • edgararuiz (41)
  • dskard (3)
  • seugurlu (2)
  • romangehrn (2)
  • tschutte (2)
  • blairj09 (2)
  • shalutiwari (2)
  • kmishra9 (1)
  • francisbarton (1)
  • wurli (1)
  • JavOrraca (1)
  • zacdav-db (1)
  • dwh1142 (1)
  • gottalottasoul (1)
  • sharon-wang (1)
Pull Request Authors
  • edgararuiz (82)
  • zacdav-db (6)
  • edward-burn (2)
  • t-kalinowski (2)
  • blairj09 (1)
Top Labels
Issue Labels
fix-before-release (5) in-release (5) fixed-in-dev (3) enhancement (3) documentation (1)
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 1,017 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 9
  • Total maintainers: 1
cran.r-project.org: pysparklyr

Provides a 'PySpark' Back-End for the 'sparklyr' Package

  • Versions: 9
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 1,017 Last month
Rankings
Forks count: 21.3%
Stargazers count: 25.7%
Dependent packages count: 27.9%
Dependent repos count: 36.8%
Average: 39.6%
Downloads: 86.1%
Maintainers (1)

Dependencies

.github/workflows/R-CMD-check.yaml actions
  • actions/checkout v3 composite
  • r-lib/actions/check-r-package v2 composite
  • r-lib/actions/setup-pandoc v2 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
DESCRIPTION cran
  • DBI * imports
  • cli * imports
  • dbplyr * imports
  • dplyr * imports
  • fs * imports
  • glue * imports
  • magrittr * imports
  • methods * imports
  • purrr * imports
  • reticulate * imports
  • rlang * imports
  • sparklyr >= 1.8.2.9000 imports
  • tidyr * imports
  • tidyselect * imports
.github/workflows/spark-tests.yaml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml actions
  • actions/cache v2 composite
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
  • actions/upload-artifact v3 composite
  • r-lib/actions/setup-r v2 composite
  • r-lib/actions/setup-r-dependencies v2 composite