workbench-sydney-speaks
Workbench for corpus tools accessing the Sydney Speaks corpus
https://github.com/australian-text-analytics-platform/workbench-sydney-speaks
Science Score: 39.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 2 DOI reference(s) in README
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (11.2%) to scientific vocabulary
Statistics
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Workbench for Sydney Speaks corpus
Current version: v0.0.0
A workbench for corpus tools accessing the Sydney Speaks corpus. This workbench is based on the GLAM Workbench (https://github.com/GLAM-Workbench/glam-workbench-template). For more information, see the Workbench for Sydney Speaks corpus section of the GLAM Workbench.
Access and use of the Sydney Speaks corpus
This workbench requires users to already have access to the Sydney Speaks corpus. Remember to respect the contents and context of the corpus, and follow the requirements of any license you agreed to when obtaining access.
Accessing Oni
Some of the notebooks in this workbench currently run against a demo version of Oni deployed on Nectar, which requires an API token.
To get an API token, go to https://data-dev.ldaca.edu.au, log in via GitHub and generate an API token.
Edit the vars.env file in your notebooks/ folder with:
API_KEY=PASTE_YOUR_KEY_HERE
Do not commit this file to GitHub, as it contains your private token.
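As a rough illustration of how a notebook might pick up the token, the sketch below reads `API_KEY` from `vars.env` using only the standard library. The function names `read_env_file` and `load_api_key` are hypothetical, and the default path assumes the notebook runs from the `notebooks/` folder. The notebooks in this repository may load the file differently, for example with the `python-dotenv` package listed in the dependencies.

```python
from pathlib import Path

# Placeholder text shown in the vars.env instructions.
PLACEHOLDER = "PASTE_YOUR_KEY_HERE"


def read_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw_line in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(path="vars.env"):
    """Return API_KEY from the env file, or raise if it is missing or unedited."""
    api_key = read_env_file(path).get("API_KEY", "")
    if not api_key or api_key == PLACEHOLDER:
        raise ValueError(f"Set API_KEY in {path} before running the notebooks")
    return api_key
```

Calling `load_api_key()` at the top of a notebook fails early with a clear message if the placeholder was never replaced, rather than failing later on a rejected request.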
Notebook topics
- LDACA-Sydney-Speaks notebook – Example notebook to run against LDACA
Remember to remove any notebook output before committing, so that you do not push sensitive information. See https://github.com/Australian-Text-Analytics-Platform/ldaca-sydney-speaks for the original notebook and instructions on how to use it.
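One way to clear outputs before committing is sketched below. It assumes the usual notebook JSON layout (format version 4, with a top-level `cells` list), and the function names are hypothetical. Running `jupyter nbconvert --clear-output --inplace` on the notebook is an alternative if nbconvert is installed.

```python
import json


def strip_outputs(notebook):
    """Clear outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def strip_outputs_in_file(path):
    """Rewrite the .ipynb file at `path` without any cell outputs."""
    with open(path, encoding="utf-8") as handle:
        notebook = json.load(handle)
    strip_outputs(notebook)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(notebook, handle, indent=1, ensure_ascii=False)
        handle.write("\n")
```

Markdown cells are left untouched, so any sensitive text typed into them still needs a manual check.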
See the GLAM Workbench for more details.
Run these notebooks
There are a number of different ways to use these notebooks. Binder is quickest and easiest, but it doesn't save your data. I've listed the options below from easiest to most complicated (requiring more technical knowledge).
Using Binder
Click on the button above to launch the notebooks in this repository using the Binder service (it might take a little while to load). This is a free service, but note that sessions will close if you stop using the notebooks, and no data will be saved. Make sure you download any changed notebooks or harvested data that you want to save.
See Using Binder for more details.
Using Reclaim Cloud
Reclaim Cloud is a paid hosting service, aimed particularly at supporting digital scholarship in the humanities. Unlike Binder, the environments you create on Reclaim Cloud will save your data – even if you switch them off! To run this repository on Reclaim Cloud for the first time:
- Create a Reclaim Cloud account and log in.
- Click on the button above to start the installation process.
- A dialogue box will ask you to set a password. This is used to limit access to your Jupyter installation.
- Sit back and wait for the installation to complete!
- Once the installation is finished, click on the 'Open in Browser' button of your newly created environment (note that you might need to wait a few minutes before everything is ready).
See Using Reclaim Cloud for more details.
Using Docker
You can use Docker to run a pre-built computing environment on your own computer. It will set up everything you need to run the notebooks in this repository. This is free, but requires more technical knowledge – you'll have to install Docker on your computer, and be able to use the command line.
- Install Docker Desktop.
- Create a new directory for this repository and open it from the command line.
- From the command line, run the following command:
docker run -p 8888:8888 --name workbench-sydney-speaks -v "$PWD":/home/jovyan/work quay.io/glamworkbench/workbench-sydney-speaks repo2docker-entrypoint jupyter lab --ip 0.0.0.0 --NotebookApp.token='' --LabApp.default_url='/lab/tree/index.ipynb'
- It will take a while to download and configure the Docker image. Once it's ready you'll see a message saying that Jupyter Notebook is running.
- Point your web browser to
http://127.0.0.1:8888
See Using Docker for more details.
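If the browser shows nothing at first, the container may still be starting. The sketch below is one way to check from Python whether anything is answering at the address above. The function name `jupyter_is_up` is hypothetical, and the check only confirms that some HTTP server responds, not that it is Jupyter.

```python
from urllib.error import HTTPError, URLError
from urllib.request import ProxyHandler, build_opener


def jupyter_is_up(url="http://127.0.0.1:8888", timeout=2.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # An empty ProxyHandler bypasses any system proxy for this local address.
    opener = build_opener(ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except HTTPError:
        # The server answered, just not with a success status.
        return True
    except (URLError, OSError):
        # Connection refused, timed out, or otherwise unreachable.
        return False


if __name__ == "__main__":
    print("reachable" if jupyter_is_up() else "not reachable yet")
```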
Setting up on your own computer
If you know your way around the command line and are comfortable installing software, you might want to set up your own computer to run these notebooks.
Assuming you have recent versions of Python and Git installed, the steps might be something like:
- Create a virtual environment, e.g.:
python -m venv workbench-sydney-speaks
- Open the new directory:
cd workbench-sydney-speaks
- Activate the environment:
source bin/activate
- Clone the repository:
git clone https://github.com/Australian-Text-Analytics-Platform/workbench-sydney-speaks.git notebooks
- Open the new notebooks directory:
cd notebooks
- Install the necessary Python packages:
pip install -r requirements.in
- Run Jupyter:
jupyter lab
See the GLAM Workbench for more details.
Cite as
See the GLAM Workbench or Zenodo for up-to-date citation details.
This repository is part of the GLAM Workbench.
Owner
- Name: Australian-Text-Analytics-Platform
- Login: Australian-Text-Analytics-Platform
- Kind: organization
- Website: https://atap.edu.au
- Repositories: 9
- Profile: https://github.com/Australian-Text-Analytics-Platform
Dependencies
- black * development
- flake8 * development
- isort * development
- nbqa * development
- nbval * development
- pre-commit * development
- pytest * development
- python-dotenv >=0.19.2
- altair *
- jupyterlab *
- pandas *
- requests *
- voila *
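After a manual install, a quick check that the runtime packages listed above can be found may save some debugging. The sketch below is a minimal example. The helper name `missing_modules` is hypothetical, and the import names are assumptions based on each package's usual module name (`python-dotenv` is imported as `dotenv`).

```python
from importlib.util import find_spec

# Import names for the non-development dependencies listed above.
RUNTIME_MODULES = ["altair", "dotenv", "jupyterlab", "pandas", "requests", "voila"]


def missing_modules(names=RUNTIME_MODULES):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    print("Missing:", ", ".join(missing) if missing else "none")
```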