Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ○ codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 3 DOI reference(s) in README
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.2%) to scientific vocabulary
Repository
Interact easily with Elasticsearch-related backend in R
Statistics
- Stars: 3
- Watchers: 4
- Forks: 1
- Open Issues: 2
- Releases: 0
Metadata Files
README.md

kibior: easy scientific data handling, searching and sharing with Elasticsearch
Version: 0.1.1
TL;DR
| | |
|-|-|
| What | kibior is an R package that eases the pain of data handling in science, most notably with biological data. |
| Where | kibior uses Elasticsearch as its database and search engine. |
| Who | kibior is built for data science and data manipulation, that is, any workflow involving data handling, and notably data sharing. It mainly targets bioinformaticians and, more broadly, data scientists. |
| When | Available now from this repository or from CRAN. |
| Public instances | Use the `$get_kibio_instance()` method to connect to Kibio and access known datasets. See Kibio datasets at the end of this document for a complete list. |
| Cite this package | In an R session, run `citation("kibior")` |
| Publication | 10.1093/bioinformatics/btab157 |
Main features
This package allows:
- Pushing, pulling, joining, sharing and searching tabular data between an R session and one or multiple Elasticsearch instances/clusters.
- Massive data query and filter with the Elasticsearch engine.
- Multiple living Elasticsearch connections to different addresses.
- Method autocompletion in proper environments (e.g. R cli, RStudio).
- Import and export datasets from and to files.
- Server-side execution for most operations (i.e. on Elasticsearch instances/clusters).
How
Install
```r
# Get from CRAN
install.packages("kibior")

# or get the latest from GitHub
devtools::install_github("regisoc/kibior")
```
Run
```r
# load
library(kibior)

# Get a specific instance
kc <- Kibior$new("server_or_address", port)

# Or try something bigger...
kibio <- Kibior$get_kibio_instance()
kibio$list()
```
Examples
Here is a sample of the features offered by kibior.
See Introduction vignette for more advanced usage.
Example: push datasets
```r
# Push data (R memory -> Elasticsearch)
dplyr::starwars %>% kc$push("sw")
dplyr::storms %>% kc$push("st")
```
Example: pull datasets
```r
# Pull data with columns selection (Elasticsearch -> R memory)
kc$pull("sw",
        query = "homeworld:(naboo || tatooine)",
        columns = c("name", "homeworld", "height", "mass", "species"))
# see vignette for query syntax
```
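To make the meaning of the query and column selection concrete, here is a minimal in-memory sketch of the same selection using dplyr only. It does not call kibior or Elasticsearch. The names `wanted_worlds`, `wanted_columns` and `local_result` are illustrative, and the case-insensitive matching is an assumption about how the query string is analyzed; the shape of the value returned by `kc$pull()` is not reproduced here.

```r
library(dplyr)

# In-memory analogue of:
#   kc$pull("sw", query = "homeworld:(naboo || tatooine)", columns = c(...))
# Assumption: the query matches case-insensitively, hence tolower().
wanted_worlds <- c("naboo", "tatooine")
wanted_columns <- c("name", "homeworld", "height", "mass", "species")

local_result <- dplyr::starwars %>%
  filter(tolower(homeworld) %in% wanted_worlds) %>%  # rows with NA homeworld are dropped
  select(all_of(wanted_columns))                     # keep only the requested columns

print(local_result)
```

Comparing this local result with what `kc$pull()` returns can serve as a quick sanity check after pushing the dataset.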
Example: copy datasets
```r
# Copy dataset (Elasticsearch internal operation)
kc$copy("sw", "sw_copy")
```
Example: delete datasets
```r
# Delete datasets
kc$delete("sw_copy")
```
Example: list, match dataset names
```r
# List available datasets
kc$list()

# Search for index names starting with "s"
kc$match("s*")
```
Example: get column names and list the unique values of a column
```r
# Get columns of all datasets starting with "s"
kc$columns("s*")

# Get unique values of a column
kc$keys("sw", "homeworld")
```
Example: some Elasticsearch basic statistical methods
```r
# Count number of lines in dataset
kc$count("st")

# Count number of lines with query (name of the storm is Anita)
kc$count("st", query = "name:anita")

# Generic stats on two columns
kc$stats("sw", c("height", "mass"))

# Specific descriptive stats with query
kc$avg("sw", c("height", "mass"), query = "homeworld:naboo")
```
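For readers who want to cross-check the server-side numbers, the filtered count and the filtered average can be approximated locally with dplyr. This is a sketch under assumptions, not kibior's implementation: it assumes the query matches case-insensitively and that missing values are ignored when averaging. The names `anita_rows` and `naboo_means` are illustrative.

```r
library(dplyr)

# Analogue of kc$count("st", query = "name:anita").
# Assumption: the query matches case-insensitively, hence tolower().
anita_rows <- dplyr::storms %>%
  filter(tolower(name) == "anita") %>%
  nrow()

# Analogue of kc$avg("sw", c("height", "mass"), query = "homeworld:naboo").
# Assumption: rows with a missing value are ignored, hence na.rm = TRUE.
naboo_means <- dplyr::starwars %>%
  filter(tolower(homeworld) == "naboo") %>%
  summarise(across(c(height, mass), ~ mean(.x, na.rm = TRUE)))

print(anita_rows)
print(naboo_means)
```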
Example: join
```r
# Inner join between:
#   1/ an Elasticsearch-based dataset with query ("sw"),
#   2/ and an in-memory R dataset (dplyr::starwars)
kc$inner_join("sw", dplyr::starwars,
              left_query = "hair_color:black",
              left_columns = c("name", "mass", "height"),
              by = "name")
```
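The join above combines a filter and a column selection on the left side with a join on `name`. A minimal local sketch of that logic with dplyr is shown below. It runs entirely in memory and is only an approximation: how kibior names columns that exist on both sides is not covered by this document, so the `.x`/`.y` suffixes here are dplyr's defaults, not kibior's. The names `left` and `joined` are illustrative.

```r
library(dplyr)

# Left side: filter (hair colour is black) and column selection.
left <- dplyr::starwars %>%
  filter(hair_color == "black") %>%
  select(name, mass, height)

# Inner join on "name" with the full in-memory dataset.
# mass and height exist on both sides, so dplyr suffixes them with .x and .y.
joined <- inner_join(left, dplyr::starwars, by = "name")

print(names(joined))
```

Because character names are unique in `dplyr::starwars`, the joined table has exactly one row per row of `left`.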
Owner
- Name: regis
- Login: regisoc
- Kind: user
- Repositories: 3
- Profile: https://github.com/regisoc
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 129
- Total Committers: 2
- Avg Commits per committer: 64.5
- Development Distribution Score (DDS): 0.031
Top Committers
| Name | Email | Commits |
|---|---|---|
| regisoc | r****1@u****a | 125 |
| regis | r****c@u****m | 4 |
Issues and Pull Requests
Last synced: 11 months ago
All Time
- Total issues: 4
- Total pull requests: 0
- Average time to close issues: about 2 months
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.75
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- regisoc (4)
Packages
- Total packages: 1
- Total downloads:
  - cran: 236 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
cran.r-project.org: kibior
A Simple Data Management and Sharing Tool
- Homepage: https://github.com/regisoc/kibior
- Documentation: http://cran.r-project.org/web/packages/kibior/kibior.pdf
- License: GPL-2
- Status: removed
- Latest release: 0.1.1 (published about 5 years ago)
Dependencies
- R >= 4.0 depends
- Biostrings * imports
- R6 >= 2.5.0 imports
- Rsamtools * imports
- data.table >= 1.13.2 imports
- dplyr >= 1.0.2 imports
- elastic >= 1.1.0 imports
- jsonlite >= 1.7.1 imports
- magrittr >= 1.5 imports
- purrr >= 0.3.4 imports
- rio >= 0.5.16 imports
- rtracklayer * imports
- stringr >= 1.4.0 imports
- tibble >= 3.0.4 imports
- tidyr >= 1.1.2 imports
- ggplot2 >= 3.3.2 suggests
- knitr >= 1.30 suggests
- readr >= 1.4.0 suggests
- rmarkdown >= 2.5 suggests
- testthat >= 3.0.0 suggests
- xml2 >= 1.3.2 suggests
- yaml >= 2.2.1 suggests