workbench-sydney-speaks

Workbench for corpus tools accessing the Sydney Speaks corpus

https://github.com/australian-text-analytics-platform/workbench-sydney-speaks

Science Score: 39.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file: found
  • .zenodo.json file: found
  • DOI references: found 2 DOI reference(s) in README
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity: low similarity (11.2%) to scientific vocabulary

Keywords

corpus-linguistics, corpus-tools, notebook
Last synced: 6 months ago

Repository

Workbench for corpus tools accessing the Sydney Speaks corpus

Basic Info
  • Host: GitHub
  • Owner: Australian-Text-Analytics-Platform
  • License: MIT
  • Language: Jupyter Notebook
  • Default Branch: master
  • Size: 76.2 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 1
  • Open Issues: 1
  • Releases: 0
Topics
corpus-linguistics, corpus-tools, notebook
Created almost 4 years ago · Last pushed over 3 years ago
Metadata Files
Readme, Contributing, License, Zenodo

README.md

Workbench for Sydney Speaks corpus

Current version: v0.0.0

A workbench for corpus tools accessing the Sydney Speaks corpus. This workbench is based on the GLAM Workbench template: https://github.com/GLAM-Workbench/glam-workbench-template. For more information, see the Workbench for Sydney Speaks corpus section of the GLAM Workbench.

Access and use of the Sydney Speaks corpus

This workbench requires users to already have access to the Sydney Speaks corpus. Remember to respect the contents and context of the corpus and follow the requirements of any license agreed to when obtaining access.

Accessing Oni

Some of the notebooks in this workbench currently run against a demo version of Oni deployed on Nectar, which requires an API token.

To get an API token, go to https://data-dev.ldaca.edu.au, log in via GitHub and generate an API token.

Edit the vars.env file in your notebooks/ folder with:

API_KEY=PASTE_YOUR_KEY_HERE

Do not commit this file to GitHub, as it contains your private token.
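
As a rough sketch of how a notebook might then use the token (this is not documented in this repository: the endpoint path and Authorization header below are assumptions, so check the Oni/LDaCA API documentation for the real details), the key can be loaded from vars.env with python-dotenv, which is listed in notebooks/requirements.txt:

    # Minimal sketch: load the Oni API token from vars.env and make an
    # authenticated request. The base URL, endpoint path and header name
    # are assumptions, not taken from this repository's notebooks.
    import os

    import requests
    from dotenv import load_dotenv

    load_dotenv("vars.env")            # reads API_KEY from the notebooks/ folder
    api_key = os.environ["API_KEY"]

    response = requests.get(
        "https://data-dev.ldaca.edu.au/api/object",   # hypothetical endpoint
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=30,
    )
    response.raise_for_status()
    print(response.json())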

Notebook topics

Remember to clear any notebook output before committing, to avoid pushing sensitive information to GitHub. See https://github.com/Australian-Text-Analytics-Platform/ldaca-sydney-speaks for the original notebook and instructions on how to use it.
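
One way to clear outputs is sketched below, using nbformat (installed alongside Jupyter); the notebook filename is only an example, and this workflow is not prescribed by the repository:

    # Sketch: strip all cell outputs from a notebook so that harvested data
    # or API responses are not pushed to GitHub. The filename is an example.
    import nbformat

    path = "example.ipynb"
    nb = nbformat.read(path, as_version=4)
    for cell in nb.cells:
        if cell.cell_type == "code":
            cell.outputs = []
            cell.execution_count = None
    nbformat.write(nb, path)

The command-line equivalent is jupyter nbconvert --clear-output --inplace example.ipynb.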

See the GLAM Workbench for more details.

Run these notebooks

There are a number of different ways to use these notebooks. Binder is the quickest and easiest option, but it doesn't save your data. I've listed the options below from easiest to most complicated (requiring more technical knowledge).

Using Binder

Launch on Binder

Click on the button above to launch the notebooks in this repository using the Binder service (it might take a little while to load). This is a free service, but note that sessions will close if you stop using the notebooks, and no data will be saved. Make sure you download any changed notebooks or harvested data that you want to save.

See Using Binder for more details.

Using Reclaim Cloud

Launch on Reclaim Cloud

Reclaim Cloud is a paid hosting service, aimed particularly at supporting digital scholarship in the humanities. Unlike Binder, the environments you create on Reclaim Cloud will save your data – even if you switch them off! To run this repository on Reclaim Cloud for the first time:

  • Create a Reclaim Cloud account and log in.
  • Click on the button above to start the installation process.
  • A dialogue box will ask you to set a password; this is used to limit access to your Jupyter installation.
  • Sit back and wait for the installation to complete!
  • Once the installation is finished, click on the 'Open in Browser' button of your newly created environment (note that you might need to wait a few minutes before everything is ready).

See Using Reclaim Cloud for more details.

Using Docker

You can use Docker to run a pre-built computing environment on your own computer. It will set up everything you need to run the notebooks in this repository. This is free, but requires more technical knowledge – you'll have to install Docker on your computer, and be able to use the command line.

  • Install Docker Desktop.
  • Create a new directory for this repository and open it from the command line.
  • From the command line, run the following command:
    docker run -p 8888:8888 --name workbench-sydney-speaks -v "$PWD":/home/jovyan/work quay.io/glamworkbench/workbench-sydney-speaks repo2docker-entrypoint jupyter lab --ip 0.0.0.0 --NotebookApp.token='' --LabApp.default_url='/lab/tree/index.ipynb'
  • It will take a while to download and configure the Docker image. Once it's ready, you'll see a message saying that Jupyter Notebook is running.
  • Point your web browser to http://127.0.0.1:8888
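
If you'd rather check from a script than a browser, a quick sanity check might look like the sketch below (the /api/status endpoint is part of the standard Jupyter Server REST API, and responds without authentication here only because the command above disables the token):

    # Optional sanity check that the containerised Jupyter server is up.
    # /api/status is a standard Jupyter Server endpoint; it responds without
    # a token here because the docker command uses --NotebookApp.token=''.
    import requests

    resp = requests.get("http://127.0.0.1:8888/api/status", timeout=10)
    resp.raise_for_status()
    print(resp.json())   # reports running kernels, connections and last activity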

See Using Docker for more details.

Setting up on your own computer

If you know your way around the command line and are comfortable installing software, you might want to set up your own computer to run these notebooks.

Assuming you have recent versions of Python and Git installed, the steps might be something like:

  • Create a virtual environment, eg: python -m venv workbench-sydney-speaks
  • Open the new directory: cd workbench-sydney-speaks
  • Activate the environment: source bin/activate
  • Clone the repository: git clone https://github.com/Australian-Text-Analytics-Platform/workbench-sydney-speaks.git notebooks
  • Open the new notebooks directory: cd notebooks
  • Install the necessary Python packages: pip install -r requirements.in
  • Run Jupyter: jupyter lab

See the GLAM Workbench for more details.

Cite as

See the GLAM Workbench or Zenodo for up-to-date citation details.


This repository is part of the GLAM Workbench.

Owner

  • Name: Australian-Text-Analytics-Platform
  • Login: Australian-Text-Analytics-Platform
  • Kind: organization


Dependencies

dev-requirements.in pypi
  • black * development
  • flake8 * development
  • isort * development
  • nbqa * development
  • nbval * development
  • pre-commit * development
  • pytest * development
notebooks/requirements.txt pypi
  • python-dotenv >=0.19.2
requirements.in pypi
  • altair *
  • jupyterlab *
  • pandas *
  • requests *
  • voila *