https://github.com/australian-text-analytics-platform/topsbm

topsbm topic modelling examples

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.1%) to scientific vocabulary
Last synced: 10 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: Australian-Text-Analytics-Platform
  • Language: Jupyter Notebook
  • Default Branch: main
  • Size: 7.49 MB
Statistics
  • Stars: 1
  • Watchers: 5
  • Forks: 1
  • Open Issues: 1
  • Releases: 0
Created over 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme

README.md

ATAP: TopSBM

TopSBM is a topic modelling approach that infers a hierarchy of topic clusters and word clusters in your Corpus in a non-parametric manner by leveraging stochastic block models.

The approach was developed by E. G. Altmann et al.

The name combines Top (Topic) and SBM (Stochastic Block Model).

This repository integrates TopSBM into the ATAP platform.
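To make the stochastic block model framing above more concrete, here is a minimal sketch of the input such a model works on, assuming the usual TopSBM setup in which a corpus is represented as a bipartite network of documents and words, with edge weights given by word counts. The function name `build_bipartite_edges` and the toy documents are hypothetical and are not part of this repository or the TopSBM code. The sketch stops before any model fitting.

```python
from collections import Counter


def build_bipartite_edges(docs):
    """Return a Counter mapping (doc_index, word) to the word's count in that document.

    Hypothetical helper for illustration only; not part of TopSBM or ATAP.
    """
    edges = Counter()
    for doc_index, tokens in enumerate(docs):
        for word in tokens:
            edges[(doc_index, word)] += 1
    return edges


# Toy corpus: each document is already tokenised.
docs = [
    ["network", "topic", "model", "topic"],
    ["block", "model", "network"],
]

edges = build_bipartite_edges(docs)

# Weighted degree of each node on either side of the bipartite network.
doc_degree = Counter()
word_degree = Counter()
for (doc_index, word), count in edges.items():
    doc_degree[doc_index] += count
    word_degree[word] += count

for (doc_index, word), count in sorted(edges.items()):
    print(f"doc {doc_index} -- {word}: {count}")
```

A block model would then group the document nodes and the word nodes of this network into clusters. That inference step is what TopSBM provides and is not reproduced here.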

Demo

This is a demo Jupyter notebook for TopSBM with ATAP Corpus integration. At the end of the notebook, you'll be able to download a Corpus with the TopSBM results. You can then upload this Corpus to other ATAP tools for further analysis.

Demo notebook: Binder

Citations

  • TopSBM website: https://topsbm.github.io/
  • TopSBM repository: https://github.com/martingerlach/hSBM_Topicmodel

Local

If you are running this repository locally, you'll first need to:

```shell
# 1. activate your virtual environment
./scripts/install_topsbm.sh topsbm
```

Owner

  • Name: Australian-Text-Analytics-Platform
  • Login: Australian-Text-Analytics-Platform
  • Kind: organization

GitHub Events

Total
  • Watch event: 1
  • Push event: 2
  • Pull request event: 1
Last Year
  • Watch event: 1
  • Push event: 2
  • Pull request event: 1

Dependencies

environment.yml pypi
  • aiofiles ==22.1.0
  • aiosqlite ==0.19.0
  • annotated-types ==0.6.0
  • anyio ==4.0.0
  • appnope ==0.1.3
  • argon2-cffi ==23.1.0
  • argon2-cffi-bindings ==21.2.0
  • arrow ==1.3.0
  • asttokens ==2.4.0
  • atap-corpus ==0.1.3
  • attrs ==23.1.0
  • babel ==2.13.1
  • backcall ==0.2.0
  • beautifulsoup4 ==4.12.2
  • bleach ==6.1.0
  • blis ==0.7.11
  • bottleneck ==1.3.7
  • catalogue ==2.0.10
  • chardet ==5.2.0
  • charset-normalizer ==3.3.1
  • click ==8.1.7
  • cloudpathlib ==0.16.0
  • colorlog ==6.7.0
  • comm ==0.1.4
  • confection ==0.1.3
  • coolname ==2.2.0
  • cymem ==2.0.8
  • debugpy ==1.8.0
  • decorator ==5.1.1
  • defusedxml ==0.7.1
  • entrypoints ==0.4
  • et-xmlfile ==1.1.0
  • executing ==2.0.0
  • fastjsonschema ==2.18.1
  • fqdn ==1.5.1
  • idna ==3.4
  • ipykernel ==6.26.0
  • ipython ==8.16.1
  • ipython-genutils ==0.2.0
  • isoduration ==20.11.0
  • jedi ==0.19.1
  • jinja2 ==3.1.2
  • joblib ==1.3.2
  • json5 ==0.9.14
  • jsonpointer ==2.4
  • jsonschema ==4.19.1
  • jsonschema-specifications ==2023.7.1
  • langcodes ==3.3.0
  • llvmlite ==0.41.1
  • markupsafe ==2.1.3
  • matplotlib-inline ==0.1.6
  • mistune ==3.0.2
  • murmurhash ==1.0.10
  • nbclassic ==1.0.0
  • nbclient ==0.8.0
  • nbconvert ==7.9.2
  • nbformat ==5.9.2
  • nest-asyncio ==1.5.8
  • networkx ==3.2.1
  • notebook ==6.5.6
  • notebook-shim ==0.2.3
  • numba ==0.58.1
  • numexpr ==2.8.7
  • odfpy ==1.4.1
  • openpyxl ==3.1.2
  • overrides ==7.4.0
  • pandas ==2.1.1
  • pandocfilters ==1.5.0
  • parso ==0.8.3
  • pexpect ==4.8.0
  • pickleshare ==0.7.5
  • platformdirs ==3.11.0
  • plotly ==5.18.0
  • preshed ==3.0.9
  • prometheus-client ==0.17.1
  • prompt-toolkit ==3.0.39
  • psutil ==5.9.6
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • pyarrow ==13.0.0
  • pydantic ==2.4.2
  • pydantic-core ==2.10.1
  • pydot ==1.4.2
  • pygments ==2.16.1
  • python-json-logger ==2.0.7
  • pytz ==2023.3.post1
  • pyxlsb ==1.0.10
  • pyyaml ==6.0.1
  • pyzmq ==24.0.1
  • referencing ==0.30.2
  • requests ==2.31.0
  • rfc3339-validator ==0.1.4
  • rfc3986-validator ==0.1.1
  • rpds-py ==0.10.6
  • scikit-learn ==1.3.2
  • seaborn ==0.13.0
  • send2trash ==1.8.2
  • smart-open ==6.4.0
  • sniffio ==1.3.0
  • soupsieve ==2.5
  • spacy ==3.7.2
  • spacy-legacy ==3.0.12
  • spacy-loggers ==1.0.5
  • srsly ==2.4.8
  • stack-data ==0.6.3
  • tenacity ==8.2.3
  • terminado ==0.17.1
  • thinc ==8.2.1
  • threadpoolctl ==3.2.0
  • tinycss2 ==1.2.1
  • tornado ==6.3.3
  • tqdm ==4.66.1
  • traitlets ==5.12.0
  • typer ==0.9.0
  • types-python-dateutil ==2.8.19.14
  • typing-extensions ==4.8.0
  • tzdata ==2023.3
  • uri-template ==1.3.0
  • urllib3 ==2.0.7
  • wasabi ==1.1.2
  • wcwidth ==0.2.8
  • weasel ==0.3.3
  • webcolors ==1.13
  • webencodings ==0.5.1
  • websocket-client ==1.6.4
  • xlrd ==2.0.1
  • xlsxwriter ==3.1.9
  • y-py ==0.6.2
  • ypy-websocket ==0.8.4
requirements.dev.txt pypi
  • jupyterlab <4.0 development
  • jupyterlab-vim <4.0 development
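When running locally, it can help to confirm that the active environment matches the exact pins listed under environment.yml. The following is a rough sketch using only the Python standard library. The helper names `parse_pin` and `check_pins` are hypothetical, and only `==` pins are handled, so range constraints such as `jupyterlab <4.0` are out of scope.

```python
from importlib import metadata


def parse_pin(line):
    """Split a line such as 'pandas ==2.1.1' into ('pandas', '2.1.1')."""
    cleaned = line.strip().lstrip("•").strip()
    name, _, version = cleaned.partition("==")
    return name.strip(), version.strip()


def check_pins(lines):
    """Return {name: (pinned, installed)} for packages that are missing or differ.

    `installed` is None when the package is not installed at all.
    """
    problems = {}
    for line in lines:
        name, pinned = parse_pin(line)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != pinned:
            problems[name] = (pinned, installed)
    return problems


# A few pins copied from the list above.
pins = ["atap-corpus ==0.1.3", "pandas ==2.1.1", "spacy ==3.7.2"]

for name, (pinned, installed) in check_pins(pins).items():
    print(f"{name}: pinned {pinned}, installed {installed or 'not installed'}")
```

If nothing is printed, every listed package is installed at its pinned version.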