ConTEXT Explorer
ConTEXT Explorer: a web-based text analysis tool for exploring and visualizing concepts across time - Published in JOSS (2021)
Science Score: 95.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: 7 found in README and JOSS metadata
- ✓ Academic publication links: joss.theoj.org, zenodo.org
- ✓ Committers with academic emails: 2 of 6 committers (33.3%) from academic institutions
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
ConTEXT Explorer is an open web-based system for exploring and visualizing concepts (combinations of co-occurring words and phrases) over time in text documents.
Statistics
- Stars: 9
- Watchers: 2
- Forks: 3
- Open Issues: 5
- Releases: 1
Metadata Files
README.md
ConTEXT-Explorer
ConTEXT Explorer is an open web-based system for exploring and visualizing concepts (combinations of co-occurring words and phrases) over time in text documents. ConTEXT Explorer is designed to lower the barriers to applying information retrieval and machine learning for text analysis, including:
- preprocessing text with a sentencizer and tokenizer in a spaCy pipeline;
- building a Gensim word2vec model for discovering similar terms, which can be used to expand queries;
- indexing the cleaned text and creating a search engine using Whoosh, which allows sentences to be ranked using the Okapi BM25F function;
- visualizing results across time in interactive plots using Plotly.
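To give a feel for the sentence-ranking step, here is a minimal, dependency-free sketch of plain Okapi BM25. It is not Whoosh's BM25F implementation (which also weights fields), and it assumes whitespace tokenization and the common defaults k1 = 1.2 and b = 0.75. The function name `bm25_scores` and the toy sentences are made up for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, sentences, k1=1.2, b=0.75):
    """Score each tokenized sentence against the query with plain Okapi BM25."""
    n = len(sentences)
    avg_len = sum(len(s) for s in sentences) / n
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for s in sentences for term in set(s))
    scores = []
    for sent in sentences:
        tf = Counter(sent)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue
            # Smoothed IDF variant that stays non-negative for common terms.
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: longer sentences need more hits to score well.
            norm = tf[term] + k1 * (1 - b + b * len(sent) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Toy corpus (hypothetical), already lower-cased and split on whitespace.
corpus = [
    "the budget debate focused on climate policy".split(),
    "climate change and climate policy dominated the session".split(),
    "the minister answered questions about transport".split(),
]
scores = bm25_scores(["climate", "policy"], corpus)
# Indices of sentences, best match first.
ranking = sorted(range(len(corpus)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The second sentence ranks first because it mentions "climate" twice, and the third scores zero because it contains neither query term.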
It is designed to be user-friendly, enabling researchers to make sense of their data without technical knowledge. Users may:
- upload (and save) a text corpus, and customize search fields;
- add terms to the query using input from the word2vec model, sentence ranking, or manually;
- check term frequencies across time;
- group terms with the "ALL" or "ANY" operator, and combine the groups to form more complex queries;
- view results across time for each query (using raw counts or proportion of relevant documents);
- save and reload results for further analysis;
- download a subset of a corpus filtered by user-defined terms.
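The grouping and counting options in this list can be sketched in plain Python. This is an illustration under stated assumptions, not the app's code: documents are (year, text) pairs, terms match whole lower-cased tokens, groups are combined with a single "ALL" or "ANY" operator, and "proportion" means matching documents divided by all documents in that year. All function names are hypothetical.

```python
from collections import defaultdict


def group_matches(tokens, terms, operator):
    """Return True if the token set satisfies one term group."""
    hits = [term in tokens for term in terms]
    return all(hits) if operator == "ALL" else any(hits)


def query_matches(text, groups, combine="ALL"):
    """Evaluate several (operator, terms) groups against one document."""
    tokens = set(text.lower().split())
    results = [group_matches(tokens, terms, op) for op, terms in groups]
    return all(results) if combine == "ALL" else any(results)


def results_by_year(docs, groups, combine="ALL"):
    """Return {year: (raw_count, proportion)} of matching documents."""
    total = defaultdict(int)
    relevant = defaultdict(int)
    for year, text in docs:
        total[year] += 1
        if query_matches(text, groups, combine):
            relevant[year] += 1
    return {y: (relevant[y], relevant[y] / total[y]) for y in sorted(total)}


# Made-up corpus: (year, text) pairs.
docs = [
    (2019, "tax cuts and housing policy"),
    (2019, "housing affordability crisis"),
    (2020, "climate policy and tax reform"),
    (2020, "health funding"),
    (2020, "tax and housing reform"),
]
# (tax OR housing) AND policy
groups = [("ANY", ["tax", "housing"]), ("ALL", ["policy"])]
summary = results_by_year(docs, groups)
print(summary)
```

Reporting both values per year shows why the proportion view matters: each year has one matching document, but 2020 has more documents overall, so its proportion is lower.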
More details can be found in the user manual below.
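One of the options listed above is adding query terms suggested by the word2vec model. As a rough illustration of what "similar terms" means there, the sketch below ranks terms by cosine similarity between word vectors. The three-dimensional vectors are invented for the example, and `most_similar` here is a hypothetical stand-alone helper, not the Gensim method of the same name.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(term, vectors, topn=2):
    """Return the topn (term, similarity) pairs closest to the given term."""
    target = vectors[term]
    scored = [
        (other, cosine(target, vec))
        for other, vec in vectors.items()
        if other != term
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:topn]


# Made-up vectors; a trained model would typically use 100 or more dimensions.
vectors = {
    "tax": [0.9, 0.1, 0.0],
    "levy": [0.8, 0.2, 0.1],
    "climate": [0.0, 0.9, 0.3],
    "emissions": [0.1, 0.8, 0.4],
}
suggestions = most_similar("tax", vectors)
print([name for name, _ in suggestions])
```

A user would then decide which of the suggested terms to add to the query.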
How to install
Get the app
Clone this repo to your local environment:
git clone https://github.com/alicia-ziying-yang/conTEXT-explorer.git
Set up environment
ConTEXT Explorer is developed using Plotly Dash in Python. It was developed with Python 3.7.5, and all required packages are listed in requirement.txt. To help you install the application correctly, we provide a conda environment file, ce-env.yml, for setting up a virtual environment. Enter the folder:
cd conTEXT-explorer
and run:
conda env create -f ce-env.yml
To activate this environment, use:
conda activate ce-env
Install the application
Then install ConTEXT Explorer with:
pip install .
Run the app
If you want to run ConTEXT Explorer on your local computer, comment out the line for the Ubuntu server and uncomment the last line in app.py:
# app.run_server(debug=False, host="0.0.0.0") # ubuntu server
app.run_server(debug=False, port="8010") # local test
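As an alternative to editing app.py by hand, the choice between the two configurations could be driven by an environment variable. The sketch below is only a suggestion: `CE_TARGET` is a hypothetical variable name that the project does not read, and `run_server_kwargs` is a made-up helper. The keyword arguments mirror the two lines above.

```python
import os


def run_server_kwargs(environ=None):
    """Pick keyword arguments for app.run_server from an environment variable.

    CE_TARGET is a hypothetical variable name, not something the project reads.
    """
    environ = os.environ if environ is None else environ
    if environ.get("CE_TARGET", "local") == "server":
        # Mirrors the "ubuntu server" line: listen on all interfaces.
        return {"debug": False, "host": "0.0.0.0"}
    # Mirrors the "local test" line: default host, port 8010.
    return {"debug": False, "port": "8010"}


print(run_server_kwargs({"CE_TARGET": "server"}))
print(run_server_kwargs({}))
# In app.py this could be used as: app.run_server(**run_server_kwargs())
```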
To start the application, use:
start-ce
or
python app.py
The address at which the app can be accessed is displayed in the output.
If you want to run ConTEXT Explorer on an Ubuntu server, use:
nohup python app.py &
How to use
A sample corpus with a saved analysis is preloaded in the app. Feel free to explore the app's features using this example. See the manual below for more details.
Click here to view the paged PDF version

Contact and Contribution
This application is designed and developed by Ziying (Alicia) Yang, Gosia Mikolajczak, and Andrew Turpin from the University of Melbourne in Australia.
If you encounter any errors while using the app, have suggestions for improvement, or want to contribute to this project by adding new functions or features, please open an issue or submit a pull request.
JOSS Publication
ConTEXT Explorer: a web-based text analysis tool for exploring and visualizing concepts across time
Authors
University of Melbourne
Tags
Dash, Data Analysis, Data Visualization
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Ziying Alicia Yang | 8****g | 131 |
| GosiaMi | 4****i | 11 |
| Alicia Yang | z****3@4****l | 6 |
| Daniel S. Katz | d****z@i****g | 2 |
| Fabian-Robert Stöter | m****l@f****m | 1 |
| Andrew Turpin | a****n@u****u | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 18
- Total pull requests: 3
- Average time to close issues: 6 days
- Average time to close pull requests: about 18 hours
- Total issue authors: 4
- Total pull request authors: 3
- Average comments per issue: 2.28
- Average comments per pull request: 0.33
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- baileythegreen (8)
- alicia-ziying-yang (5)
- faroit (4)
- sara-02 (1)
Pull Request Authors
- baileythegreen (1)
- faroit (1)
- danielskatz (1)
Dependencies
- Whoosh ==2.7.4
- backports.csv ==1.0.7
- dash ==1.14.0
- dash-bootstrap-components ==0.10.5
- dash-core-components ==1.10.2
- dash-daq ==0.5.0
- dash-html-components ==1.0.3
- dash-table ==4.9.0
- dash-uploader ==0.4.1
- gensim ==3.8.1
- gunicorn >=19.9.0
- nltk ==3.4.5
- numpy >=1.16.2
- pandas ==1.2.3
- pytest ==5.3.5
- selenium ==3.141.0
- spacy ==2.2.4
- urllib3 ==1.25.8
- wordcloud ==1.8.1
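
Because most of these dependencies are pinned to exact versions, it can help to check an environment against the pins before running the app. The following is a minimal sketch, assuming purely numeric version strings and only the `==` and `>=` operators that appear in the list above. The helper names and the sample "installed" versions are made up; in a real environment the installed versions could be looked up with `importlib.metadata.version` (Python 3.8+).

```python
import re

# Matches list entries such as "- dash ==1.14.0" or "gunicorn >=19.9.0".
PIN = re.compile(r"^\s*-?\s*([A-Za-z0-9_.\-]+)\s*(==|>=)\s*([0-9][0-9.]*)\s*$")


def parse_pins(lines):
    """Parse dependency lines into {name: (operator, version)}."""
    pins = {}
    for line in lines:
        match = PIN.match(line)
        if match:
            name, op, version = match.groups()
            pins[name.lower()] = (op, version)
    return pins


def version_tuple(version):
    """Turn '1.14.0' into (1, 14, 0); only handles purely numeric versions."""
    return tuple(int(part) for part in version.split("."))


def satisfies(installed, op, required):
    """Check one installed version string against an '==' or '>=' pin."""
    have, want = version_tuple(installed), version_tuple(required)
    # Pad with zeros so that '19.9' and '19.9.0' compare as equal.
    width = max(len(have), len(want))
    have += (0,) * (width - len(have))
    want += (0,) * (width - len(want))
    return have == want if op == "==" else have >= want


def check(pins, installed):
    """Return {name: reason} for every pin that is missing or not satisfied."""
    problems = {}
    for name, (op, required) in pins.items():
        if name not in installed:
            problems[name] = "not installed"
        elif not satisfies(installed[name], op, required):
            problems[name] = f"{installed[name]} does not satisfy {op}{required}"
    return problems


sample = ["- dash ==1.14.0", "- gunicorn >=19.9.0", "- spacy ==2.2.4"]
installed = {"dash": "1.14.0", "gunicorn": "20.1.0", "spacy": "3.0.0"}  # made-up environment
problems = check(parse_pins(sample), installed)
print(problems)
```

In this made-up environment only spaCy is flagged, since 3.0.0 does not match the exact pin 2.2.4.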
