gismo
GISMO is an NLP tool to rank and organize a corpus of documents according to a query.
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ✓ Academic publication links: links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (11.7%) to scientific vocabulary
Repository
Statistics
- Stars: 7
- Watchers: 3
- Forks: 1
- Open Issues: 1
- Releases: 14
Metadata Files
README.md
A Generic Information Search... With a Mind of its Own!
GISMO is an NLP tool to rank and organize a corpus of documents according to a query.
Gismo stands for Generic Information Search... with a Mind of its Own.
- Free software: MIT License
- Github: https://github.com/balouf/gismo/
- Documentation: https://balouf.github.io/gismo/
Features
Gismo combines three main ideas:
- TF-IDTF: a symmetric version of the TF-IDF embedding.
- DIteration: a fast, push-based variant of the PageRank algorithm.
- Fuzzy dendrogram: a variant of the Louvain clustering algorithm.
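To give an intuition of what "push-based" means in the list above, here is a minimal sketch of a PageRank-style diffusion where each node pushes its pending "fluid" to its neighbors. It is an illustration only, not Gismo's actual implementation: the function name `push_pagerank`, its parameters, and the choice to drop the fluid of dangling nodes are all assumptions made for this example.

```python
def push_pagerank(graph, damping=0.85, tol=1e-10):
    """Approximate PageRank scores with a push-based diffusion (illustrative sketch).

    `graph` maps each node to the list of nodes it links to.
    Every link target must also be a key of `graph`.
    """
    nodes = list(graph)
    # Fluid still to be diffused; starts as the uniform teleportation mass.
    fluid = {n: (1.0 - damping) / len(nodes) for n in nodes}
    # Fluid already absorbed by each node; converges to the score vector.
    history = {n: 0.0 for n in nodes}
    while sum(fluid.values()) > tol:
        # Greedy choice: process the node that currently holds the most fluid.
        node = max(fluid, key=fluid.get)
        amount = fluid[node]
        fluid[node] = 0.0
        history[node] += amount
        targets = graph[node]
        # Push the damped fluid to out-neighbors (dangling nodes drop theirs).
        for target in targets:
            fluid[target] += damping * amount / len(targets)
    # Normalize so that the scores sum to 1.
    total = sum(history.values())
    return {n: score / total for n, score in history.items()}


# Hypothetical toy graph: "b" and "c" both link to "a", and "a" links back to "b".
scores = push_pagerank({"a": ["b"], "b": ["a"], "c": ["a"]})
print(sorted(scores, key=scores.get, reverse=True))
```

Each push removes at least a fraction `1 - damping` of the processed fluid from the system, so the loop terminates; picking the fullest node first is one simple scheduling choice among many.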
Quickstart
Install gismo:
```console
$ pip install gismo
```
Use gismo in a Python project:
```pycon
>>> from gismo.common import toy_source_dict
>>> from gismo import Corpus, Embedding, CountVectorizer, Gismo
>>> corpus = Corpus(toy_source_dict, to_text=lambda x: x['content'])
>>> embedding = Embedding(vectorizer=CountVectorizer(dtype=float))
>>> embedding.fit_transform(corpus)
>>> gismo = Gismo(corpus, embedding)
>>> gismo.rank("Mogwaï")
>>> gismo.get_features_by_rank()
['mogwaï', 'gizmo', 'chinese', 'in', 'demon', 'folklore', 'is']
```
To get the hang of a typical Gismo workflow, check the Toy Example notebook. For more advanced uses, see the other tutorials or go directly to the reference section.
Credits
Thanks to Thomas Bonald, Anne Bouillard, Marc-Olivier Buob, and Dohy Hong for their helpful contributions.
This package was created with Cookiecutter and the francois-durand/package_helper project template.
Owner
- Name: Fabien Mathieu
- Login: balouf
- Kind: user
- Location: Paris, France
- Company: LINCS
- Website: https://www.lincs.fr/people/fabien-mathieu/
- Repositories: 6
- Profile: https://github.com/balouf
Researcher at Lincs
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: Generic Information Search with a Mind of its Own (Gismo)
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - family-names: Mathieu
    given-names: Fabien
    email: fabien.mathieu@normalesup.org
url: "https://balouf.github.io/gismo/"
GitHub Events
Total
- Release event: 1
- Watch event: 2
- Delete event: 8
- Issue comment event: 8
- Push event: 23
- Pull request event: 12
- Create event: 7
Last Year
- Release event: 1
- Watch event: 2
- Delete event: 8
- Issue comment event: 8
- Push event: 23
- Pull request event: 12
- Create event: 7
Committers
Last synced: almost 3 years ago
All Time
- Total Commits: 225
- Total Committers: 3
- Avg Commits per committer: 75.0
- Development Distribution Score (DDS): 0.56
Top Committers
| Name | Email | Commits |
|---|---|---|
| fabien | f****u@n****m | 99 |
| Fabien | f****u@n****g | 79 |
| pyup-bot | g****t@p****o | 47 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 103
- Average time to close issues: N/A
- Average time to close pull requests: 11 days
- Total issue authors: 1
- Total pull request authors: 3
- Average comments per issue: 1.0
- Average comments per pull request: 0.88
- Merged pull requests: 38
- Bot issues: 0
- Bot pull requests: 1
Past Year
- Issues: 0
- Pull requests: 25
- Average time to close issues: N/A
- Average time to close pull requests: 21 days
- Issue authors: 0
- Pull request authors: 3
- Average comments per issue: 0
- Average comments per pull request: 1.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 1
Top Authors
Issue Authors
- pyup-bot (1)
- hadifar (1)
Pull Request Authors
- pyup-bot (132)
- balouf (3)
- dependabot[bot] (2)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 403 last month
- Total dependent packages: 0
- Total dependent repositories: 2
- Total versions: 16
- Total maintainers: 1
pypi.org: gismo
- Documentation: https://balouf.github.io/gismo/
- License: MIT
- Latest release: 0.5.2 (published 8 months ago)
Dependencies
- IPython >=7.15.0
- beautifulsoup4 >=4.9.0
- bs4 >=0.0.1
- dill >=0.3.1.1
- gismo >=0.3.0
- lxml >=4.5.0
- nbsphinx >=0.6.1
- pytest >=5.4.1
- requests >=2.23.0
- setuptools >=46.1.3
- beautifulsoup4 >=4.9.0
- bs4 >=0.0.1
- dill >=0.3.1.1
- gismo >=0.3.0
- lxml >=4.5.0
- nbsphinx >=0.6.1
- numba >=0.49.0
- numpy >=1.18.4
- pytest >=5.4.1
- requests >=2.23.0
- scikit-learn >=0.23.1
- scipy >=1.4.1
- setuptools >=46.1.3
- spacy >=2.3.4
- IPython >=7.15.0 development
- Sphinx >=3.1.1 development
- bump2version >=1.0.0 development
- coverage >=5.1 development
- flake8 >=3.8.3 development
- nbsphinx >=0.7.1 development
- pip >=20.2.4 development
- pytest >=5.4.3 development
- pytest-cov >=2.10.0 development
- pytest-runner >=5.2 development
- sphinx_rtd_theme >=0.5.0 development
- tox >=3.15.2 development
- twine >=3.4.1 development
- watchdog >=0.10.2 development
- wheel >=0.34.2 development
- actions/checkout v3 composite
- actions/setup-python v4 composite
- codecov/codecov-action v3 composite
- JamesIves/github-pages-deploy-action v4 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
- actions/checkout v3 composite
- actions/setup-python v4 composite
