biscrawler

An Automation Webcrawler for Extracting Central Bankers' Speeches

https://github.com/davidycliao/biscrawler

Science Score: 31.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.7%) to scientific vocabulary

Keywords

bank-for-international-settlements central-banker central-bankers-speeches python scraper scraping speeches text-as-data
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: davidycliao
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 59.7 MB
Statistics
  • Stars: 10
  • Watchers: 1
  • Forks: 2
  • Open Issues: 0
  • Releases: 0
Created over 4 years ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

bisCrawler: An Automation Webcrawler for Extracting Central Bankers' Speeches 🛠️🧰

An automated web-crawling framework for extracting central bankers' speeches from the website of the Bank for International Settlements (https://www.bis.org), built on Selenium and the Chrome browser.

Environment Setup

  1. Install Anaconda Navigator and Python >= 3.9 beforehand. Then open a terminal and download the bisCrawler repository with git clone. For an introduction to git and GitHub, see this Tutorial for Beginners.

git clone git@github.com:davidycliao/bisCrawler.git

  2. Copy the commands below and paste them into the terminal:

```
# Change into the bisCrawler directory once the repository is downloaded.
cd bisCrawler

# Create a conda environment named bisCrawler.
conda create -n bisCrawler python=3.9
```
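The setup requirements above can be checked from Python before running the crawler. The following is a minimal sketch, not part of bisCrawler itself: the helper name `environment_problems` is hypothetical, and the check relies on conda exporting the active environment's name in the `CONDA_DEFAULT_ENV` variable, so the second check only passes once the environment has been activated.

```python
import os
import sys

MIN_PYTHON = (3, 9)      # minimum version named in the setup step
ENV_NAME = "bisCrawler"  # name passed to `conda create -n`


def environment_problems(version_info=sys.version_info, environ=os.environ):
    """Return a list of problems; an empty list means both checks passed.

    Hypothetical helper for illustration, not a function shipped with bisCrawler.
    """
    problems = []
    # Compare only (major, minor) against the required minimum.
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append("Python 3.9 or newer is required")
    # conda sets CONDA_DEFAULT_ENV to the name of the active environment.
    if environ.get("CONDA_DEFAULT_ENV") != ENV_NAME:
        problems.append(f"conda environment '{ENV_NAME}' is not active")
    return problems


for problem in environment_problems():
    print("warning:", problem)
```

Returning a list rather than raising on the first failure lets all problems be reported in one pass.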

Instruction

  1. Activate the environment created above. Alternatively, the bisCrawler environment can be opened via Anaconda Navigator.

conda activate bisCrawler

  2. Install the dependencies from requirements.txt using pip.

pip install -r requirements.txt

  3. Call the bisCrawler module
  • In the terminal:

```
# Note: run this in the terminal where you activated the environment.
python bisCrawler.py
```

  • In Jupyter Notebook:

from bisCrawler import scraper

scraper()

  4. When running bisCrawler

When bisCrawler is running, it asks which page you would like to scrape (type a single page number from 1 to the last page). It then automatically builds a pandas DataFrame that stores the bankers' speeches and the URLs of the corresponding text documents.
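The README does not show how the page prompt is handled internally. As a rough sketch of the kind of validation such a prompt needs, the hypothetical helper below (`parse_page_choice` and its `last_page` parameter are assumptions, not bisCrawler's API) turns the typed text into a page number and rejects anything outside the range 1 to the last page.

```python
def parse_page_choice(raw, last_page):
    """Convert text typed at the prompt into a valid page number.

    Hypothetical helper for illustration. Raises ValueError when the text
    is not a whole number between 1 and last_page (inclusive).
    """
    text = raw.strip()
    # Accept plain ASCII digits only, so signs, decimals and blanks are rejected.
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"expected a whole number, got {raw!r}")
    page = int(text)
    if not 1 <= page <= last_page:
        raise ValueError(f"page must be between 1 and {last_page}")
    return page


# Interactive use might look like this (the value of last_page is an assumption):
# page = parse_page_choice(input("Which page would you like to scrape? "), last_page=10)
```

Keeping the validation in a separate function from the `input()` call makes it possible to test the rule without a terminal.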

What bisCrawler Scrapes

The crawler automatically scrapes central bankers' speeches from the official BIS website, collecting each central banker's name, the date and title of the speech, and the URL of the corresponding text document.
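One way to picture a scraped entry is as a small record with the four fields listed above. The sketch below is illustrative only: the class name `SpeechRecord`, the field names, and the sample values are assumptions rather than the structure bisCrawler actually uses.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SpeechRecord:
    """One scraped speech (hypothetical structure, field names assumed)."""
    name: str   # central banker who gave the speech
    date: str   # kept as text, since the site's date format is not specified here
    title: str  # title of the speech
    url: str    # link to the text document


def to_rows(records):
    """Convert records into plain dicts, one per speech."""
    return [asdict(record) for record in records]


# Invented sample values, for illustration only.
example = SpeechRecord(
    name="Speaker A",
    date="2021-01-05",
    title="An example title",
    url="https://example.org/speech-a",
)
print(to_rows([example]))
```

A list of dicts like the one returned by `to_rows` can be passed directly to `pandas.DataFrame` to obtain one row per speech.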

Webscraped Data

The scraped dataframe will be stored as centralbankspeeches.csv in the bisCrawler folder.
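Once the file exists, it can be read back for analysis. The sketch below uses only the standard library and counts speeches per speaker. The column names (`name`, `date`, `title`, `url`) and the sample rows are assumptions, since the README does not list the CSV header; an in-memory string stands in for the real file.

```python
import csv
import io
from collections import Counter


def read_speeches(handle):
    """Parse CSV text from an open file object into a list of dicts."""
    return list(csv.DictReader(handle))


def speeches_per_speaker(rows, column="name"):
    """Count rows per value of `column`; "name" is an assumed header."""
    return Counter(row[column] for row in rows)


# Stand-in for open("centralbankspeeches.csv", newline="", encoding="utf-8").
# The header and rows below are invented for illustration.
sample = io.StringIO(
    "name,date,title,url\n"
    "Speaker A,2021-01-05,First title,https://example.org/a\n"
    "Speaker A,2021-02-10,Second title,https://example.org/b\n"
    "Speaker B,2021-03-15,Third title,https://example.org/c\n"
)
rows = read_speeches(sample)
print(speeches_per_speaker(rows).most_common())
```

Because the output is described as a pandas DataFrame, `pandas.read_csv("centralbankspeeches.csv")` is the more direct route when pandas is installed; the standard-library version is shown here only to keep the example dependency-free.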

Cite

Please cite this page if you use this toolkit for your research.

For example, with BibTeX:

@misc{bisCrawler,
  howpublished = {\url{https://github.com/davidycliao/bisCrawler}},
  title = {bisCrawler: An Automation Webcrawler for Extracting Central Bankers' Speeches},
  author = {David Yen-Chieh Liao and Li Tang},
  publisher = {GitHub},
  year = {2021}
}

Owner

  • Name: David Liao
  • Login: davidycliao
  • Kind: user
  • Location: Birmingham | Colchester
  • Company: @Connected-Politics-Lab

Researcher at UoB + member of @Connected-Politics-Lab

Citation (CITATION.cff)

cff-version: 0.0.1
message: "If you use this software, please cite it as below."
authors:
- family-names: "Liao"
  given-names: "David Yen-Chieh"
  orcid: ""
title: "bisCrawler: An Automation Webcrawler for Extracting Central Bankers' Speeches"
version: 0.0.1
doi: 
date-released: 2022-01-10
url: "https://github.com/davidycliao/bisCrawler"


GitHub Events

Total
  • Watch event: 1
  • Delete event: 1
  • Push event: 1
  • Pull request event: 1
Last Year
  • Watch event: 1
  • Delete event: 1
  • Push event: 1
  • Pull request event: 1