kibior

Interact easily with Elasticsearch-related backend in R

https://github.com/regisoc/kibior

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
○ codemeta.json file
○ .zenodo.json file
✓ DOI references: Found 3 DOI reference(s) in README
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (14.2%) to scientific vocabulary

Keywords

data-science database datasets elasticsearch elasticsearch-client push-pull r search search-engine

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: regisoc
Language: R
Default Branch: master
Homepage:
Size: 2.13 MB

Statistics

Stars: 3
Watchers: 4
Forks: 1
Open Issues: 2
Releases: 0

Topics

data-science database datasets elasticsearch elasticsearch-client push-pull r search search-engine

Created almost 6 years ago · Last pushed over 4 years ago

Metadata Files

Readme

kibior: easy scientific data handling, searching and sharing with Elasticsearch

Version: 0.1.1

TL;DR

| | |
|-|-|
| What | kibior is an R package dedicated to easing the pain of data handling in science, notably with biological data. |
| Where | kibior uses Elasticsearch as its database and search engine. |
| Who | kibior is built for data science and data manipulation, wherever a data-related action or need is involved, notably sharing data. It mainly targets bioinformaticians and, more broadly, data scientists. |
| When | Available now from this repository or from the CRAN repository. |
| Public instances | Use the `$get_kibio_instance()` method to connect to Kibio and access known datasets. See Kibio datasets at the end of this document for a complete list. |
| Cite this package | In an R session, run `citation("kibior")` |
| Publication | 10.1093/bioinformatics/btab157 |

Main features

This package allows:

- Pushing, pulling, joining, sharing and searching tabular data between an R session and one or more Elasticsearch instances/clusters.
- Querying and filtering massive datasets with the Elasticsearch engine.
- Multiple live Elasticsearch connections to different addresses.
- Method autocompletion in suitable environments (e.g. R CLI, RStudio).
- Importing and exporting datasets from and to files.
- Server-side execution for most operations (i.e. on Elasticsearch instances/clusters).

How

Install

```r
# Get from CRAN
install.packages("kibior")

# or get the latest from GitHub
devtools::install_github("regisoc/kibior")
```

Run

```r
# Load the package
library(kibior)

# Get a specific instance (replace the placeholders with your host and port)
kc <- Kibior$new("server_or_address", port)

# Or try something bigger...
kibio <- Kibior$get_kibio_instance()
kibio$list()
```

Examples

The examples below show a selection of the features offered by KibioR. See the Introduction vignette for more advanced usage.

Example: `push` datasets

```r
# Push data (R memory -> Elasticsearch)
dplyr::starwars %>% kc$push("sw")
dplyr::storms %>% kc$push("st")
```

Example: `pull` datasets

```r
# Pull data with columns selection (Elasticsearch -> R memory)
kc$pull("sw",
        query = "homeworld:(naboo || tatooine)",
        columns = c("name", "homeworld", "height", "mass", "species"))

# See the vignette for query syntax
```
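
For readers unfamiliar with the query string, the base-R sketch below shows what the filter and column selection above would amount to on an in-memory table. It assumes the query matches rows whose `homeworld` is either value, ignoring case; the `sw_local` data frame is a hypothetical stand-in for the `sw` index, with illustrative values only.

```r
# Hypothetical stand-in for the "sw" dataset (illustrative values only)
sw_local <- data.frame(
  name      = c("Luke Skywalker", "Padme Amidala", "Leia Organa"),
  homeworld = c("Tatooine", "Naboo", "Alderaan"),
  height    = c(172, 165, 150),
  stringsAsFactors = FALSE
)

# Keep rows whose homeworld is naboo or tatooine
# (case-insensitive matching is an assumption about the index analyzer)
wanted <- tolower(sw_local$homeworld) %in% c("naboo", "tatooine")

# Keep only the requested columns
result <- sw_local[wanted, c("name", "homeworld", "height")]
print(result)
```

The point of doing this through `pull` rather than locally is that the filtering happens on the Elasticsearch side, so only matching rows and selected columns travel to the R session.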

Example: `copy` datasets

```r
# Copy dataset (Elasticsearch internal operation)
kc$copy("sw", "sw_copy")
```

Example: `delete` datasets

```r
# Delete datasets
kc$delete("sw_copy")
```

Example: `list`, `match` dataset names

```r
# List available datasets
kc$list()

# Search for index names starting with "s"
kc$match("s*")
```

Example: get `columns` names and list unique `keys` in values

```r
# Get columns of all datasets starting with "s"
kc$columns("s*")

# Get unique values of a column
kc$keys("sw", "homeworld")
```

Example: some basic Elasticsearch statistical methods

```r
# Count number of lines in dataset
kc$count("st")

# Count number of lines with query (name of the storm is Anita)
kc$count("st", query = "name:anita")

# Generic stats on two columns
kc$stats("sw", c("height", "mass"))

# Specific descriptive stats with query
kc$avg("sw", c("height", "mass"), query = "homeworld:naboo")
```

Example: `join`

```r
# Inner join between:
# 1/ an Elasticsearch-based dataset with query ("sw"),
# 2/ and an in-memory R dataset (dplyr::starwars)
kc$inner_join("sw", dplyr::starwars,
              left_query = "hair_color:black",
              left_columns = c("name", "mass", "height"),
              by = "name")
```
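
To make the join semantics concrete, here is a minimal base-R sketch of the equivalent in-memory operation. It assumes the join above behaves like a standard inner join applied after the left-side query and column selection; the `left` and `right` data frames are hypothetical stand-ins with illustrative values, not package data.

```r
# Hypothetical stand-ins: `left` plays the role of the Elasticsearch dataset,
# `right` the in-memory table (illustrative values only)
left <- data.frame(
  name       = c("A", "B", "C"),
  hair_color = c("black", "brown", "black"),
  mass       = c(80, 75, 60),
  height     = c(180, 170, 165),
  stringsAsFactors = FALSE
)
right <- data.frame(
  name    = c("A", "B", "D"),
  species = c("Human", "Human", "Droid"),
  stringsAsFactors = FALSE
)

# Apply the left-side filter and column selection first
left_sel <- left[left$hair_color == "black", c("name", "mass", "height")]

# Inner join: keep only names present on both sides
joined <- merge(left_sel, right, by = "name")
print(joined)
```

Only "A" survives here: "C" passes the filter but has no match on the right, and "B" matches on the right but is filtered out on the left.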

Owner

Name: regis
Login: regisoc
Kind: user

Repositories: 3
Profile: https://github.com/regisoc

Committers

Last synced: almost 3 years ago

All Time

Total Commits: 129
Total Committers: 2
Avg Commits per committer: 64.5
Development Distribution Score (DDS): 0.031

Top Committers

Name	Email	Commits
regisoc	r**1@u**a	125
regis	r**c@u**m	4
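
The figures above are consistent with a common definition of the Development Distribution Score: one minus the top committer's share of all commits. Assuming that definition (it is not stated on this page), the arithmetic can be checked in R:

```r
total_commits <- 129  # "Total Commits" listed above
top_commits   <- 125  # commits by the top committer
n_committers  <- 2    # "Total Committers" listed above

# Assumed definition: DDS = 1 - (top committer's share of commits)
dds <- 1 - top_commits / total_commits

# Average commits per committer
avg_commits <- total_commits / n_committers

print(round(dds, 3))
print(avg_commits)
```

A score this close to zero reflects that nearly all commits come from a single committer.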

Committer Domains (Top 20 + Academic)

ulaval.ca: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 4
Total pull requests: 0
Average time to close issues: about 2 months
Average time to close pull requests: N/A
Total issue authors: 1
Total pull request authors: 0
Average comments per issue: 0.75
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

regisoc (4)

Pull Request Authors

Top Labels

Issue Labels

enhancement (4)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- cran 236 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 2
Total maintainers: 1

cran.r-project.org: kibior

A Simple Data Management and Sharing Tool

Homepage: https://github.com/regisoc/kibior
Documentation: http://cran.r-project.org/web/packages/kibior/kibior.pdf
License: GPL-2
Status: removed
Latest release: 0.1.1 (published about 5 years ago)

Versions: 2
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 236 Last month

Rankings

Stargazers count: 28.5%

Forks count: 28.8%

Dependent packages count: 29.8%

Dependent repos count: 35.5%

Average: 35.7%

Downloads: 56.1%

Maintainers (1)

regis.ongaro-carcy2@crchudequebec.ulaval.ca

Last synced: 10 months ago

Dependencies

DESCRIPTION cran

R >= 4.0 depends
Biostrings * imports
R6 >= 2.5.0 imports
Rsamtools * imports
data.table >= 1.13.2 imports
dplyr >= 1.0.2 imports
elastic >= 1.1.0 imports
jsonlite >= 1.7.1 imports
magrittr >= 1.5 imports
purrr >= 0.3.4 imports
rio >= 0.5.16 imports
rtracklayer * imports
stringr >= 1.4.0 imports
tibble >= 3.0.4 imports
tidyr >= 1.1.2 imports
ggplot2 >= 3.3.2 suggests
knitr >= 1.30 suggests
readr >= 1.4.0 suggests
rmarkdown >= 2.5 suggests
testthat >= 3.0.0 suggests
xml2 >= 1.3.2 suggests
yaml >= 2.2.1 suggests
