tripadvisor-review-scraper

Scrape TripAdvisor Reviews

https://github.com/algo7/tripadvisor-review-scraper

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.9%) to scientific vocabulary

Keywords

scraper tripadvisor tripadvisor-scraper
Last synced: 6 months ago

Repository

Scrape TripAdvisor Reviews

Basic Info
  • Host: GitHub
  • Owner: algo7
  • License: gpl-3.0
  • Language: JavaScript
  • Default Branch: main
  • Homepage:
  • Size: 120 MB
Statistics
  • Stars: 16
  • Watchers: 2
  • Forks: 9
  • Open Issues: 5
  • Releases: 11
Topics
scraper tripadvisor tripadvisor-scraper
Created almost 4 years ago · Last pushed 6 months ago
Metadata Files
Readme License Citation

README.md

TripAdvisor-Review-Scraper

A simple scraper for TripAdvisor (Hotel, Restaurant, Airline) reviews.

Requirements

  1. Go v1.21+
  2. Make [Optional]
  3. Docker [Optional]
  4. Docker Compose [Optional]
  5. Node.js v18+ [Optional. Only required if you want to use the deprecated scraper written in Node.js.]

How to Install Docker:

  1. Windows
  2. Mac
  3. Linux

Project Layout

Scraper

There are 2 scrapers available:

  1. Scraper written in Go
  2. Scraper written in Node.js [Deprecated]

The Go scraper is preferred: it calls the API directly, which makes it much faster than the Node.js scraper, which takes the traditional route of parsing HTML. Instructions for using each scraper are located in its own folder.
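The difference between the two approaches can be sketched in a few lines of Go. This is only an illustration, not the project's code: the `Review` type, its JSON field names, and the `<span class="title">` markup are hypothetical stand-ins, and no real TripAdvisor endpoint or page structure is assumed. The point is that a structured API response decodes in a single step, while HTML has to be searched for fragments that depend on the page layout.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// Review is a hypothetical record; the field names are illustrative only.
type Review struct {
	Title  string `json:"title"`
	Rating int    `json:"rating"`
}

// parseAPIResponse decodes reviews from a JSON payload, as an API-based
// scraper might receive them. The payload shape is an assumption.
func parseAPIResponse(body string) ([]Review, error) {
	var reviews []Review
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&reviews); err != nil {
		return nil, err
	}
	return reviews, nil
}

// titleRe matches a hypothetical review-title element in rendered HTML.
var titleRe = regexp.MustCompile(`<span class="title">([^<]*)</span>`)

// parseHTMLTitles extracts review titles from page markup. It breaks
// whenever the (assumed) markup changes, which is the usual weakness of
// HTML scraping.
func parseHTMLTitles(page string) []string {
	var titles []string
	for _, m := range titleRe.FindAllStringSubmatch(page, -1) {
		titles = append(titles, m[1])
	}
	return titles
}

func main() {
	apiBody := `[{"title":"Great stay","rating":5}]`
	reviews, err := parseAPIResponse(apiBody)
	if err != nil {
		panic(err)
	}
	fmt.Println(reviews[0].Title, reviews[0].Rating) // prints "Great stay 5"

	page := `<div><span class="title">Great stay</span></div>`
	fmt.Println(parseHTMLTitles(page)) // prints "[Great stay]"
}
```

The API path also returns typed fields such as the rating directly, whereas the HTML path would need a separate pattern for every field it wants to recover.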

Container Provisioner

Automates the process of provisioning containers for the scraper.

See the container provisioner's own documentation for more details.

Proxy Pool

Provides a pool of proxies for the scraper to use.

See the proxy pool's own documentation for more details.
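As a rough illustration of what a proxy pool offers a scraper, the following Go sketch rotates through a fixed list of proxy addresses in round-robin order. The README does not say how this project selects proxies, so the round-robin policy, the `ProxyPool` type, and the example addresses are all assumptions made for the sake of the example.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"sync"
)

// ProxyPool is a hypothetical round-robin pool of proxy URLs.
type ProxyPool struct {
	mu      sync.Mutex
	proxies []*url.URL
	next    int
}

// NewProxyPool parses the given addresses and returns a pool over them.
func NewProxyPool(addrs []string) (*ProxyPool, error) {
	if len(addrs) == 0 {
		return nil, errors.New("proxy pool needs at least one address")
	}
	pool := &ProxyPool{}
	for _, a := range addrs {
		u, err := url.Parse(a)
		if err != nil {
			return nil, fmt.Errorf("parse %q: %w", a, err)
		}
		pool.proxies = append(pool.proxies, u)
	}
	return pool, nil
}

// Next returns the next proxy in rotation; it is safe for concurrent use.
func (p *ProxyPool) Next() *url.URL {
	p.mu.Lock()
	defer p.mu.Unlock()
	u := p.proxies[p.next]
	p.next = (p.next + 1) % len(p.proxies)
	return u
}

// Client builds an HTTP client that sends its requests through the next
// proxy in the rotation.
func (p *ProxyPool) Client() *http.Client {
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(p.Next())},
	}
}

func main() {
	// Placeholder addresses; a real pool would list reachable proxies.
	pool, err := NewProxyPool([]string{"http://127.0.0.1:8001", "http://127.0.0.1:8002"})
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		fmt.Println(pool.Next()) // cycles 8001, 8002, 8001
	}
}
```

A scraper would call `pool.Client()` before each batch of requests so that consecutive batches leave through different proxies.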

Owner

  • Name: algo7
  • Login: algo7
  • Kind: user
  • Location: Taiwan Nantou | Switzerland Zurich | Switzerland Lausanne
  • Company: ECHO

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Lo"
  given-names: "Aviv"
title: "TripAdvisor-Review-Scraper"
version: 1.0.0
date-released: 2022-03-03
url: "https://github.com/algo7/TripAdvisor-Review-Scraper"

GitHub Events

Total
  • Issues event: 2
  • Watch event: 4
  • Delete event: 55
  • Issue comment event: 28
  • Push event: 27
  • Pull request review event: 5
  • Pull request event: 103
  • Fork event: 4
  • Create event: 54
Last Year
  • Issues event: 2
  • Watch event: 4
  • Delete event: 55
  • Issue comment event: 28
  • Push event: 27
  • Pull request review event: 5
  • Pull request event: 103
  • Fork event: 4
  • Create event: 54

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 0
  • Total pull requests: 45
  • Average time to close issues: N/A
  • Average time to close pull requests: 17 days
  • Total issue authors: 0
  • Total pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.36
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 43
Past Year
  • Issues: 0
  • Pull requests: 45
  • Average time to close issues: N/A
  • Average time to close pull requests: 17 days
  • Issue authors: 0
  • Pull request authors: 3
  • Average comments per issue: 0
  • Average comments per pull request: 0.36
  • Merged pull requests: 13
  • Bot issues: 0
  • Bot pull requests: 43
Top Authors
Issue Authors
  • algo7 (2)
  • murat-donut (1)
Pull Request Authors
  • dependabot[bot] (206)
  • algo7 (9)
  • eltorio (3)
  • irregularised (1)
  • Shidooo (1)
Top Labels
Issue Labels
bug (1) enhancement (1)
Pull Request Labels
dependencies (205) go (184) javascript (13) github_actions (8)

Packages

  • Total packages: 2
  • Total downloads: unknown
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 0
    (may contain duplicates)
  • Total versions: 13
proxy.golang.org: github.com/algo7/TripAdvisor-Review-Scraper/scraper
  • Versions: 0
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 8.7%
Average: 9.3%
Dependent repos count: 9.8%
Last synced: 6 months ago
proxy.golang.org: github.com/algo7/TripAdvisor-Review-Scraper/container_provisioner
  • Versions: 13
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent packages count: 8.4%
Dependent repos count: 10.4%
Stargazers count: 12.5%
Average: 12.5%
Forks count: 18.9%
Last synced: 6 months ago

Dependencies

setup/go.mod go
  • github.com/Microsoft/go-winio v0.4.16
  • github.com/ProtonMail/go-crypto v0.0.0-20210428141323-04723f9f07d7
  • github.com/acomagu/bufpipe v1.0.3
  • github.com/emirpasic/gods v1.12.0
  • github.com/go-git/gcfg v1.5.0
  • github.com/go-git/go-billy/v5 v5.3.1
  • github.com/go-git/go-git/v5 v5.4.2
  • github.com/imdario/mergo v0.3.12
  • github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99
  • github.com/kevinburke/ssh_config v0.0.0-20201106050909-4977a11b4351
  • github.com/mitchellh/go-homedir v1.1.0
  • github.com/sergi/go-diff v1.1.0
  • github.com/xanzy/ssh-agent v0.3.0
  • golang.org/x/crypto v0.0.0-20210421170649-83a5a9bb288b
  • golang.org/x/net v0.0.0-20210326060303-6b1517762897
  • golang.org/x/sys v0.0.0-20210502180810-71e4cd670f79
  • gopkg.in/warnings.v0 v0.1.2
setup/go.sum go
  • github.com/Microsoft/go-winio v0.4.14
  • github.com/Microsoft/go-winio v0.4.16
  • github.com/ProtonMail/go-crypto v0.0.0-20210428141323-04723f9f07d7
  • github.com/acomagu/bufpipe v1.0.3
  • github.com/anmitsu/go-shlex v0.0.0-20161002113705-648efa622239
  • github.com/armon/go-socks5 v0.0.0-20160902184237-e75332964ef5
  • github.com/creack/pty v1.1.9
  • github.com/davecgh/go-spew v1.1.0
  • github.com/davecgh/go-spew v1.1.1
  • github.com/emirpasic/gods v1.12.0
  • github.com/flynn/go-shlex v0.0.0-20150515145356-3f9db97f8568
  • github.com/gliderlabs/ssh v0.2.2
  • github.com/go-git/gcfg v1.5.0
  • github.com/go-git/go-billy/v5 v5.2.0
  • github.com/go-git/go-billy/v5 v5.3.1
  • github.com/go-git/go-git-fixtures/v4 v4.2.1
  • github.com/go-git/go-git/v5 v5.4.2
  • github.com/google/go-cmp v0.3.0
  • github.com/imdario/mergo v0.3.12
  • github.com/jbenet/go-context v0.0.0-20150711004518-d14ea06fba99
  • github.com/jessevdk/go-flags v1.5.0
  • github.com/kevinburke/ssh_config v0.0.0-20201106050909-4977a11b4351
  • github.com/konsorten/go-windows-terminal-sequences v1.0.1
  • github.com/kr/pretty v0.1.0
  • github.com/kr/pretty v0.2.1
  • github.com/kr/pty v1.1.1
  • github.com/kr/text v0.1.0
  • github.com/kr/text v0.2.0
  • github.com/matryer/is v1.2.0
  • github.com/mitchellh/go-homedir v1.1.0
  • github.com/niemeyer/pretty v0.0.0-20200227124842-a10e7caefd8e
  • github.com/pkg/errors v0.8.1
  • github.com/pkg/errors v0.9.1
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/sergi/go-diff v1.1.0
  • github.com/sirupsen/logrus v1.4.1
  • github.com/stretchr/objx v0.1.0
  • github.com/stretchr/objx v0.1.1
  • github.com/stretchr/testify v1.2.2
  • github.com/stretchr/testify v1.4.0
  • github.com/stretchr/testify v1.7.0
  • github.com/xanzy/ssh-agent v0.3.0
  • golang.org/x/crypto v0.0.0-20190219172222-a4c6cb3142f2
  • golang.org/x/crypto v0.0.0-20210322153248-0c34fe9e7dc2
  • golang.org/x/crypto v0.0.0-20210421170649-83a5a9bb288b
  • golang.org/x/net v0.0.0-20210226172049-e18ecbb05110
  • golang.org/x/net v0.0.0-20210326060303-6b1517762897
  • golang.org/x/sys v0.0.0-20180905080454-ebe1bf3edb33
  • golang.org/x/sys v0.0.0-20190507160741-ecd444e8653b
  • golang.org/x/sys v0.0.0-20190916202348-b4ddaad3f8a3
  • golang.org/x/sys v0.0.0-20200302150141-5c8b2ff67527
  • golang.org/x/sys v0.0.0-20201119102817-f84b799fce68
  • golang.org/x/sys v0.0.0-20210320140829-1e4c9ba3b0c4
  • golang.org/x/sys v0.0.0-20210324051608-47abb6519492
  • golang.org/x/sys v0.0.0-20210502180810-71e4cd670f79
  • golang.org/x/term v0.0.0-20201126162022-7de9c90e9dd1
  • golang.org/x/text v0.3.3
  • golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e
  • gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405
  • gopkg.in/check.v1 v1.0.0-20190902080502-41f04d3bba15
  • gopkg.in/check.v1 v1.0.0-20200227125254-8fa46927fb4f
  • gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c
  • gopkg.in/warnings.v0 v0.1.2
  • gopkg.in/yaml.v2 v2.2.2
  • gopkg.in/yaml.v2 v2.2.4
  • gopkg.in/yaml.v2 v2.3.0
  • gopkg.in/yaml.v3 v3.0.0-20200313102051-9f266ea9e77c
.github/workflows/ci.yml actions
  • actions/checkout v2 composite
  • docker/login-action v1 composite
.github/workflows/codeql.yml actions
  • actions/checkout v3 composite
  • github/codeql-action/analyze v2 composite
  • github/codeql-action/autobuild v2 composite
  • github/codeql-action/init v2 composite
container_provisioner/Dockerfile docker
  • base latest build
  • golang 1.21.1-alpine3.18 build
container_provisioner/docker-compose.yml docker
  • ghcr.io/algo7/tripadvisor-review-scraper/container_provisioner latest
  • redis alpine
scraper/Dockerfile docker
  • base latest build
  • node 19-slim build
scraper/docker-compose.yml docker
  • ghcr.io/algo7/tripadvisor-review-scraper/scraper latest
container_provisioner/go.mod go
  • github.com/Microsoft/go-winio v0.6.1
  • github.com/andybalholm/brotli v1.0.5
  • github.com/aws/aws-sdk-go-v2 v1.21.0
  • github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.4.13
  • github.com/aws/aws-sdk-go-v2/config v1.18.42
  • github.com/aws/aws-sdk-go-v2/credentials v1.13.40
  • github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.13.11
  • github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.41
  • github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.35
  • github.com/aws/aws-sdk-go-v2/internal/ini v1.3.43
  • github.com/aws/aws-sdk-go-v2/internal/v4a v1.1.4
  • github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.9.14
  • github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.1.36
  • github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.35
  • github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.15.4
  • github.com/aws/aws-sdk-go-v2/service/s3 v1.40.0
  • github.com/aws/aws-sdk-go-v2/service/sso v1.14.1
  • github.com/aws/aws-sdk-go-v2/service/ssooidc v1.17.1
  • github.com/aws/aws-sdk-go-v2/service/sts v1.22.0
  • github.com/aws/smithy-go v1.14.2
  • github.com/cespare/xxhash/v2 v2.2.0
  • github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f
  • github.com/docker/distribution v2.8.2+incompatible
  • github.com/docker/docker v24.0.6+incompatible
  • github.com/docker/go-connections v0.4.0
  • github.com/docker/go-units v0.5.0
  • github.com/gofiber/fiber/v2 v2.49.2
  • github.com/gofiber/template v1.8.2
  • github.com/gofiber/template/html/v2 v2.0.5
  • github.com/gofiber/utils v1.1.0
  • github.com/gogo/protobuf v1.3.2
  • github.com/google/go-cmp v0.5.9
  • github.com/google/uuid v1.3.1
  • github.com/klauspost/compress v1.16.7
  • github.com/mattn/go-colorable v0.1.13
  • github.com/mattn/go-isatty v0.0.19
  • github.com/mattn/go-runewidth v0.0.15
  • github.com/moby/term v0.5.0
  • github.com/morikuni/aec v1.0.0
  • github.com/opencontainers/go-digest v1.0.0
  • github.com/opencontainers/image-spec v1.0.2
  • github.com/pkg/errors v0.9.1
  • github.com/redis/go-redis/v9 v9.2.1
  • github.com/rivo/uniseg v0.2.0
  • github.com/stretchr/testify v1.8.4
  • github.com/valyala/bytebufferpool v1.0.0
  • github.com/valyala/fasthttp v1.49.0
  • github.com/valyala/tcplisten v1.0.0
  • golang.org/x/mod v0.8.0
  • golang.org/x/net v0.8.0
  • golang.org/x/sys v0.12.0
  • golang.org/x/time v0.3.0
  • golang.org/x/tools v0.6.0
  • gotest.tools/v3 v3.5.1
container_provisioner/go.sum go
  • github.com/Azure/go-ansiterm v0.0.0-20210617225240-d185dfc1b5a1
  • github.com/Microsoft/go-winio v0.6.1
  • github.com/andybalholm/brotli v1.0.5
  • github.com/aws/aws-sdk-go-v2 v1.21.0
  • github.com/aws/aws-sdk-go-v2/aws/protocol/eventstream v1.4.13
  • github.com/aws/aws-sdk-go-v2/config v1.18.42
  • github.com/aws/aws-sdk-go-v2/credentials v1.13.40
  • github.com/aws/aws-sdk-go-v2/feature/ec2/imds v1.13.11
  • github.com/aws/aws-sdk-go-v2/internal/configsources v1.1.41
  • github.com/aws/aws-sdk-go-v2/internal/endpoints/v2 v2.4.35
  • github.com/aws/aws-sdk-go-v2/internal/ini v1.3.43
  • github.com/aws/aws-sdk-go-v2/internal/v4a v1.1.4
  • github.com/aws/aws-sdk-go-v2/service/internal/accept-encoding v1.9.14
  • github.com/aws/aws-sdk-go-v2/service/internal/checksum v1.1.36
  • github.com/aws/aws-sdk-go-v2/service/internal/presigned-url v1.9.35
  • github.com/aws/aws-sdk-go-v2/service/internal/s3shared v1.15.4
  • github.com/aws/aws-sdk-go-v2/service/s3 v1.40.0
  • github.com/aws/aws-sdk-go-v2/service/sso v1.14.1
  • github.com/aws/aws-sdk-go-v2/service/ssooidc v1.17.1
  • github.com/aws/aws-sdk-go-v2/service/sts v1.22.0
  • github.com/aws/smithy-go v1.14.2
  • github.com/bsm/ginkgo/v2 v2.12.0
  • github.com/bsm/gomega v1.27.10
  • github.com/cespare/xxhash/v2 v2.2.0
  • github.com/davecgh/go-spew v1.1.1
  • github.com/davecgh/go-spew v1.1.0
  • github.com/dgryski/go-rendezvous v0.0.0-20200823014737-9f7001d12a5f
  • github.com/docker/distribution v2.8.2+incompatible
  • github.com/docker/docker v24.0.6+incompatible
  • github.com/docker/go-connections v0.4.0
  • github.com/docker/go-units v0.5.0
  • github.com/gofiber/fiber/v2 v2.49.2
  • github.com/gofiber/template v1.8.2
  • github.com/gofiber/template/html/v2 v2.0.5
  • github.com/gofiber/utils v1.1.0
  • github.com/gogo/protobuf v1.3.2
  • github.com/google/go-cmp v0.5.9
  • github.com/google/go-cmp v0.5.8
  • github.com/google/uuid v1.3.1
  • github.com/jmespath/go-jmespath v0.4.0
  • github.com/jmespath/go-jmespath/internal/testify v1.5.1
  • github.com/kisielk/errcheck v1.5.0
  • github.com/kisielk/gotool v1.0.0
  • github.com/klauspost/compress v1.16.7
  • github.com/mattn/go-colorable v0.1.13
  • github.com/mattn/go-isatty v0.0.19
  • github.com/mattn/go-isatty v0.0.16
  • github.com/mattn/go-runewidth v0.0.15
  • github.com/moby/term v0.5.0
  • github.com/morikuni/aec v1.0.0
  • github.com/opencontainers/go-digest v1.0.0
  • github.com/opencontainers/image-spec v1.0.2
  • github.com/pkg/errors v0.9.1
  • github.com/pmezard/go-difflib v1.0.0
  • github.com/redis/go-redis/v9 v9.2.1
  • github.com/rivo/uniseg v0.2.0
  • github.com/stretchr/objx v0.1.0
  • github.com/stretchr/testify v1.8.4
  • github.com/valyala/bytebufferpool v1.0.0
  • github.com/valyala/fasthttp v1.49.0
  • github.com/valyala/tcplisten v1.0.0
  • github.com/yuin/goldmark v1.2.1
  • github.com/yuin/goldmark v1.1.27
  • golang.org/x/crypto v0.0.0-20191011191535-87dc89f01550
  • golang.org/x/crypto v0.0.0-20200622213623-75b288015ac9
  • golang.org/x/crypto v0.0.0-20190308221718-c2843e01d9a2
  • golang.org/x/mod v0.3.0
  • golang.org/x/mod v0.8.0
  • golang.org/x/mod v0.2.0
  • golang.org/x/net v0.8.0
  • golang.org/x/net v0.0.0-20190404232315-eb5bcb51f2a3
  • golang.org/x/net v0.0.0-20201021035429-f5854403a974
  • golang.org/x/net v0.0.0-20190620200207-3b0461eec859
  • golang.org/x/net v0.0.0-20200226121028-0de0cce0169b
  • golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9
  • golang.org/x/sync v0.1.0
  • golang.org/x/sync v0.0.0-20190423024810-112230192c58
  • golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e
  • golang.org/x/sys v0.12.0
  • golang.org/x/sys v0.6.0
  • golang.org/x/sys v0.0.0-20190215142949-d0b11bdaac8a
  • golang.org/x/sys v0.0.0-20190412213103-97732733099d
  • golang.org/x/sys v0.0.0-20200930185726-fdedc70b468f
  • golang.org/x/sys v0.0.0-20220811171246-fbc7d0a398ab
  • golang.org/x/text v0.3.0
  • golang.org/x/text v0.3.3
  • golang.org/x/time v0.3.0
  • golang.org/x/tools v0.0.0-20180917221912-90fa682c2a6e
  • golang.org/x/tools v0.0.0-20191119224855-298f0cb1881e
  • golang.org/x/tools v0.0.0-20200619180055-7c47624df98f
  • golang.org/x/tools v0.0.0-20210106214847-113979e3529a
  • golang.org/x/tools v0.6.0
  • golang.org/x/xerrors v0.0.0-20190717185122-a985d3407aa7
  • golang.org/x/xerrors v0.0.0-20191011141410-1b5146add898
  • golang.org/x/xerrors v0.0.0-20191204190536-9bdfabe68543
  • golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1
  • gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405
  • gopkg.in/yaml.v2 v2.2.8
  • gopkg.in/yaml.v3 v3.0.1
  • gotest.tools/v3 v3.5.1
scraper/package-lock.json npm
  • 208 dependencies
scraper/package.json npm
  • eslint ^8.36.0 development
  • axios ^1.3.4
  • chalk ^5.2.0
  • cheerio ^1.0.0-rc.12
  • csvtojson ^2.0.10
  • json2csv ^5.0.7
  • puppeteer-extra ^3.3.6
  • puppeteer-extra-plugin-adblocker ^2.13.6
  • puppeteer-extra-plugin-block-resources ^2.4.3
  • rand-user-agent ^2.0.4