https://github.com/ahasverus/gpack

:package: R package to web scrap G**gle services

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.2%) to scientific vocabulary

Keywords

docker-image google-images google-scholar google-search google-trends openvpn r rselenium rvest unix-systems webscraping

Repository

Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 6 years ago · Last pushed about 3 years ago

README.Rmd

---
output: github_document
---




```{r, include = FALSE}
knitr::opts_chunk$set(collapse  = TRUE,
                      comment   = "#>",
                      fig.path  = "man/figures/",
                      out.width = "100%")
```



gpack 
=========================================================


[![R CMD Check](https://github.com/ahasverus/gpack/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ahasverus/gpack/actions/workflows/R-CMD-check.yaml)
[![Website](https://github.com/ahasverus/gpack/actions/workflows/pkgdown.yaml/badge.svg)](https://github.com/ahasverus/gpack/actions/workflows/pkgdown.yaml)
[![CRAN status](https://www.r-pkg.org/badges/version/gpack)](https://CRAN.R-project.org/package=gpack)
[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://choosealicense.com/licenses/mit/)



The goal of the R package `gpack` is to provide tools to scrape G\*\*gle 
Services (Scholar, Pictures, Trends, Search). As G\*\*gle does not provide any API
and does not allow web scraping, the user's public IP address can be banned. This 
package relies on the software OpenVPN to periodically change the IP address
and the user-agent (i.e. the technical information about your system).



## System requirements

Before using the package `gpack` you must follow these instructions:


### Operating system

The package `gpack` has been developed **only for Unix platforms** (macOS and GNU/Linux).
If you are on Windows, you can use Docker to start a GNU/Linux container.

**Important:** the package `gpack` must be run **outside RStudio** (e.g. under a terminal).
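
As a quick sanity check, the two constraints above can be tested from R. This is a minimal sketch and not part of `gpack`: the helper names `is_unix()` and `in_rstudio()` are hypothetical, and the RStudio test assumes that RStudio sets the environment variable `RSTUDIO` to `"1"` in its R sessions.

```r
# Hypothetical helpers (not exported by gpack)

# TRUE on Unix-alikes (macOS, GNU/Linux), FALSE on Windows
is_unix <- function() {
  .Platform$OS.type == "unix"
}

# TRUE when the session appears to run inside RStudio
# (assumption: RStudio sets RSTUDIO="1" in its R sessions)
in_rstudio <- function() {
  identical(Sys.getenv("RSTUDIO"), "1")
}

if (!is_unix()) {
  message("gpack targets Unix platforms only: consider a GNU/Linux Docker container.")
}
if (in_rstudio()) {
  message("Run gpack from an R session started in a terminal, not from RStudio.")
}
```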


### OpenVPN

The package `gpack` uses [**OpenVPN**](https://openvpn.net/), a Virtual Private Network 
(VPN) system that creates a secure connection to a VPN server. To install this software,
please follow these [**instructions**](https://gist.github.com/ahasverus/41f8a99583149534cac08e7b8f13c51b).

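You also need to store your Unix user password, because `openvpn` requires superuser 
rights to be controlled. In R, run the command `usethis::edit_r_environ()` to open 
your `.Renviron` file, add the line `UNIX_PASSWD='xxx99_999xXxx'` (replacing the 
placeholder with your own password), then restart R so that the change takes effect.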
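
After restarting R, you can confirm that the variable is visible to the session. The snippet below is only a sketch: `has_unix_passwd()` is a hypothetical helper, not a `gpack` function, and it only tests that `UNIX_PASSWD` is non-empty without printing its value.

```r
# Hypothetical helper (not part of gpack):
# TRUE when the environment variable UNIX_PASSWD is defined and non-empty
has_unix_passwd <- function() {
  nzchar(Sys.getenv("UNIX_PASSWD"))
}

if (!has_unix_passwd()) {
  message("UNIX_PASSWD is not set: add it to your .Renviron file and restart R.")
}
```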



### Docker engine

The software [**Docker**](https://www.docker.com/) must be installed and running.
The technology [Selenium](https://www.selenium.dev/) will be run inside a Docker
container.
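
One way to verify from R that Docker is installed and running is to query the daemon with `docker info`. This is a sketch under assumptions: `docker_is_running()` is a hypothetical helper, not a `gpack` function, and it assumes that `docker info` returns a non-zero exit status when the daemon cannot be reached.

```r
# Hypothetical helper (not part of gpack)
docker_is_running <- function() {
  # The Docker client must be available in the PATH
  if (!nzchar(Sys.which("docker"))) {
    return(FALSE)
  }
  # Assumption: `docker info` exits with a non-zero status
  # when the Docker daemon is unreachable
  status <- suppressWarnings(
    system2("docker", "info", stdout = FALSE, stderr = FALSE)
  )
  identical(as.integer(status), 0L)
}

if (!docker_is_running()) {
  message("Docker does not seem to be installed and running.")
}
```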



### Selenium image

The Docker image 
[`selenium/standalone-firefox`](https://hub.docker.com/r/selenium/standalone-firefox) 
must be installed. This image contains the Selenium technology running a Firefox browser.
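
The image can be downloaded with the standard `docker pull` command. The sketch below wraps that command in R; `pull_selenium_image()` is a hypothetical helper, not a `gpack` function, and no image tag is specified because the README does not name one.

```r
# Hypothetical helper (not part of gpack):
# runs `docker pull <image>` and returns TRUE (invisibly) on success
pull_selenium_image <- function(image = "selenium/standalone-firefox") {
  if (!nzchar(Sys.which("docker"))) {
    message("Docker client not found in the PATH.")
    return(invisible(FALSE))
  }
  status <- system2("docker", c("pull", image))
  invisible(identical(as.integer(status), 0L))
}

# Usage (downloads the image, so it is left commented out):
# pull_selenium_image()
```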



## Installation

You can install the development version from [GitHub](https://github.com/) with:

```{r eval = FALSE}
# install.packages("remotes")
remotes::install_github("ahasverus/gpack")
```

Then you can attach the package `gpack`:

```{r eval = FALSE}
library("gpack")
```



## Overview

The package `gpack` provides two main functions:

- `check_system()`: must be run first to check the integrity of the system
- `scrap_gscholar()`: gets reference metadata from G\*\*gle Scholar



## Citation

Please cite this package as: 

> Casajus N (`r format(Sys.Date(), "%Y")`) gpack: An R package to web scrap 
G\*\*gle Services (Scholar, Pictures, Trends, Search). R package version 0.0.1.



## Code of Conduct

Please note that the `gpack` project is released with a 
[Contributor Code of Conduct](https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html). 
By contributing to this project, you agree to abide by its terms.

Owner

  • Name: Nicolas Casajus
  • Login: ahasverus
  • Kind: user
  • Location: Montpellier, France
  • Company: @FRBCesab

Data scientist


Committers

Last synced: 7 months ago

All Time
  • Total Commits: 94
  • Total Committers: 1
  • Avg Commits per committer: 94.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • Nicolas Casajus (n****s@g****m): 94 commits

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 1
  • Total pull requests: 0
  • Average time to close issues: 4 days
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • nmouquet (1)