nltk

NLTK Source

https://github.com/nltk/nltk

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    50 of 469 committers (10.7%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary

Keywords

machine-learning natural-language-processing nlp nltk python

Keywords from Contributors

closember wx tk qt gtk distributed data-mining deep-neural-networks information-retrieval gensim
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: nltk
  • License: apache-2.0
  • Language: Python
  • Default Branch: develop
  • Homepage: https://www.nltk.org
  • Size: 338 MB
Statistics
  • Stars: 14,256
  • Watchers: 455
  • Forks: 2,940
  • Open Issues: 268
  • Releases: 0
Topics
machine-learning natural-language-processing nlp nltk python
Created over 16 years ago · Last pushed 6 months ago
Metadata Files
Readme Changelog Contributing License Citation Security Authors

README.md

Natural Language Toolkit (NLTK)

NLTK -- the Natural Language Toolkit -- is a suite of open source Python modules, data sets, and tutorials supporting research and development in Natural Language Processing. NLTK requires Python version 3.8, 3.9, 3.10, 3.11 or 3.12.
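The supported-version list above can be checked at runtime before importing the toolkit. The following is a minimal sketch that uses only the standard library; the names `SUPPORTED` and `is_supported_python` are illustrative and are not part of NLTK.

```python
import sys

# (major, minor) pairs listed as supported in the README text above.
SUPPORTED = {(3, 8), (3, 9), (3, 10), (3, 11), (3, 12)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is in SUPPORTED.

    Defaults to the running interpreter when no version is given.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED


if __name__ == "__main__":
    # Report whether the current interpreter is in the listed range.
    print(is_supported_python())
```

A set of explicit pairs is used rather than a range comparison so that the check matches the README's enumerated list exactly.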

For documentation, please visit nltk.org.

Contributing

Do you want to contribute to NLTK development? Great! Please read CONTRIBUTING.md for more details.

See also how to contribute to NLTK.

Donate

Have you found the toolkit helpful? Please support NLTK development by donating to the project via PayPal, using the link on the NLTK homepage.

Citing

If you publish work that uses NLTK, please cite the NLTK book, as follows:

Bird, Steven, Edward Loper and Ewan Klein (2009).
Natural Language Processing with Python. O'Reilly Media Inc.

Copyright

Copyright (C) 2001-2025 NLTK Project

For license information, see LICENSE.txt.

AUTHORS.md contains a list of everyone who has contributed to NLTK.

Redistributing

  • NLTK source code is distributed under the Apache 2.0 License.
  • NLTK documentation is distributed under the Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States license.
  • NLTK corpora are provided under the terms given in the README file for each corpus; all are redistributable and available for non-commercial use.
  • NLTK may be freely redistributed, subject to the provisions of these licenses.

Owner

  • Name: Natural Language Toolkit
  • Login: nltk
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
title: >-
  Natural Language ToolKit (NLTK)
message: >-
  Please cite this software using the metadata from
  'preferred-citation'.
type: software
authors:
  - name: "NLTK Team"
    email: "nltk.team@gmail.com"
repository-code: "https://github.com/nltk/nltk"
url: "https://www.nltk.org"
license: Apache-2.0
keywords:
  - "NLP"
  - "CL"
  - "natural language processing"
  - "computational linguistics"
  - "parsing"
  - "tagging"
  - "tokenizing"
  - "syntax"
  - "linguistics"
  - "language"
  - "natural language"
  - "text analytics"
preferred-citation:
  title: >-
    Natural Language Processing with Python: Analyzing
    Text with the Natural Language Toolkit
  type: book
  authors:
    - given-names: Steven
      family-names: Bird
      orcid: https://orcid.org/0000-0003-3782-7733
    - given-names: Ewan
      family-names: Klein
      orcid: https://orcid.org/0000-0002-0520-8447
    - given-names: Edward
      family-names: Loper
  year: 2009
  month: 6
  url: "https://www.nltk.org/book/"
  isbn: "9780596516499"
  publisher:
    name: "O'Reilly Media, Inc."
    website: "https://www.oreilly.com/"
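As an illustration of how the `preferred-citation` fields map onto a readable citation string, here is a small sketch. The dictionary copies values from the CITATION.cff above by hand (no YAML parser is used), and the function name `format_citation` and the output layout are assumptions made for this example, not a format prescribed by the project.

```python
# Values copied by hand from the `preferred-citation` entry above.
preferred_citation = {
    "title": (
        "Natural Language Processing with Python: Analyzing "
        "Text with the Natural Language Toolkit"
    ),
    "authors": [
        {"given-names": "Steven", "family-names": "Bird"},
        {"given-names": "Ewan", "family-names": "Klein"},
        {"given-names": "Edward", "family-names": "Loper"},
    ],
    "year": 2009,
    "publisher": "O'Reilly Media, Inc.",
}


def format_citation(entry):
    """Render 'Authors (year). Title. Publisher.' from a CFF-like dict."""
    names = [f"{a['given-names']} {a['family-names']}" for a in entry["authors"]]
    if len(names) > 1:
        author_text = ", ".join(names[:-1]) + " and " + names[-1]
    else:
        author_text = names[0]
    # Strip a trailing period so the publisher does not end in "..".
    publisher = entry["publisher"].rstrip(".")
    return f"{author_text} ({entry['year']}). {entry['title']}. {publisher}."


if __name__ == "__main__":
    print(format_citation(preferred_citation))
```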

GitHub Events

Total
  • Create event: 1
  • Commit comment event: 1
  • Issues event: 91
  • Watch event: 737
  • Delete event: 1
  • Issue comment event: 268
  • Push event: 28
  • Gollum event: 4
  • Pull request review comment event: 31
  • Pull request review event: 45
  • Pull request event: 55
  • Fork event: 82
Last Year
  • Create event: 1
  • Commit comment event: 1
  • Issues event: 91
  • Watch event: 737
  • Delete event: 1
  • Issue comment event: 268
  • Push event: 28
  • Gollum event: 4
  • Pull request review comment event: 31
  • Pull request review event: 45
  • Pull request event: 55
  • Fork event: 82

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 13,330
  • Total Committers: 469
  • Avg Commits per committer: 28.422
  • Development Distribution Score (DDS): 0.689
Past Year
  • Commits: 127
  • Committers: 16
  • Avg Commits per committer: 7.938
  • Development Distribution Score (DDS): 0.441
Top Committers
Name Email Commits
Steven Bird s****1@g****m 4,151
Edward Loper e****r@l****u 2,188
Ewan Klein e****n@g****m 1,477
alvations a****s@g****m 739
Dan Garrette d****e@g****m 357
Mikhail Korobov k****4@g****m 276
Pierpaolo Pantone 2****o@g****m 257
Steven Xu 193
Ilia Kurenkov i****v@g****m 178
Tom Aarsen C****v@g****m 175
Will Roberts w****s@r****e 154
Eric Kafe k****c@g****m 132
Peter Ljunglöf p****f@h****e 106
Paul Bone p****e@c****u 94
Dmitrijs Milajevs d****t@g****m 91
Sumukh Ghodke s****e@g****m 91
hoontw h****w@g****m 80
Joel Nothman j****n@s****u 79
nschneid n****t@g****m 72
Joseph Frazee j****e@g****m 72
Marcus Uneson m****n@g****m 69
Haejoong Lee h****g@l****u 65
Mike Recachinas m****p@v****u 65
Trevor Cohn t****n@c****u 64
xim x****t@a****o 63
lrnzcig l****g@g****m 62
Long Duong l****9@g****m 59
Rob Speer r****r@m****u 56
purificant p****t 55
Greg Aumann g****n@g****m 53
and 439 more...
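The summary figures above can be reproduced from the listed counts. The sketch below assumes that the Development Distribution Score is defined as 1 minus the top committer's share of all commits; the page does not state the formula, but this definition agrees with the listed all-time value. The function names are illustrative.

```python
def avg_commits_per_committer(total_commits, committers):
    """Mean number of commits per committer."""
    return total_commits / committers


def development_distribution_score(total_commits, top_committer_commits):
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    return 1 - top_committer_commits / total_commits


# All-time figures from the listing: 13,330 commits, 469 committers,
# and 4,151 commits by the top committer.
all_time_avg = avg_commits_per_committer(13_330, 469)
all_time_dds = development_distribution_score(13_330, 4_151)

# Past-year figures from the listing: 127 commits, 16 committers.
past_year_avg = avg_commits_per_committer(127, 16)

if __name__ == "__main__":
    print(f"{all_time_avg:.3f} {all_time_dds:.3f} {past_year_avg:.4f}")
```

The past-year score (0.441) cannot be checked the same way, because the listing does not give the top committer's commit count for that period.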

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 279
  • Total pull requests: 224
  • Average time to close issues: 11 months
  • Average time to close pull requests: 4 months
  • Total issue authors: 242
  • Total pull request authors: 80
  • Average comments per issue: 3.75
  • Average comments per pull request: 2.67
  • Merged pull requests: 132
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 59
  • Pull requests: 71
  • Average time to close issues: 16 days
  • Average time to close pull requests: 27 days
  • Issue authors: 50
  • Pull request authors: 24
  • Average comments per issue: 1.14
  • Average comments per pull request: 2.23
  • Merged pull requests: 30
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • alvations (12)
  • ekaf (12)
  • BLKSerene (3)
  • mcepl (2)
  • hiDevman (2)
  • shavetakhepra (2)
  • LordTT (2)
  • tomaarsen (2)
  • TomerYS (2)
  • DavidNemeskey (2)
  • kloczek (2)
  • Killpit (2)
  • ExplodingCabbage (2)
  • Sion1225 (2)
  • jmccrae (2)
Pull Request Authors
  • ekaf (57)
  • purificant (16)
  • tomaarsen (12)
  • alvations (11)
  • Mike014 (6)
  • WilliamPLaCroix (4)
  • Shazid08 (4)
  • Copilot (3)
  • eidheim (2)
  • drewvid (2)
  • josecols (2)
  • naktinis (2)
  • Higgs32584 (2)
  • ivanmilevtues (2)
  • trevorjwood (2)
Top Labels
Issue Labels
enhancement (11), nltk_data (8), wordnet (8), bug (8), corpus (7), tokenizer (7), good first issue (6), SMT (6), inactive (5), resolved (5), critical (4), invalid (4), tagger (4), CI (4), nice idea (3), pythonic (3), documentation (3), installation (3), tests (2), metrics (2), pleaseverify (2), admin (2), internals (2), windows related (2), parsing (2), stanford api (2), stem/lemma (2), multithread / multiprocessing (2), classifier (1), language-model (1)
Pull Request Labels
corpus (34), tokenizer (25), CI (16), parsing (16), metrics (15), tagger (14), stem/lemma (12), GUI (11), classifier (9), critical (8), enhancement (7), admin (6), bug (5), sentiment (4), internals (4), tests (3), language-model (3), twitter (2), cluster (2), wordnet (2), nice idea (2), translate (2), plot (1), needs review (1), cli (1), documentation (1), pythonic (1), inactive (1), LGTM (1)

Packages

  • Total packages: 16
  • Total downloads:
    • PyPI: 35,588,824 in the last month
  • Total docker downloads: 974,969,708
  • Total dependent packages: 1,491
    (may contain duplicates)
  • Total dependent repositories: 59,006
    (may contain duplicates)
  • Total versions: 119
  • Total maintainers: 7
  • Total advisories: 5
pypi.org: nltk

Natural Language Toolkit

  • Versions: 63
  • Dependent Packages: 1,440
  • Dependent Repositories: 57,572
  • Downloads: 35,588,824 Last month
  • Docker Downloads: 974,969,708
Rankings
Dependent packages count: 0.0%
Dependent repos count: 0.0%
Downloads: 0.1%
Average: 0.2%
Docker downloads count: 0.3%
Forks count: 0.4%
Stargazers count: 0.4%
Last synced: 6 months ago
alpine-v3.18: py3-nltk-pyc

Precompiled Python bytecode for py3-nltk

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 0.5%
Forks count: 0.8%
Stargazers count: 1.3%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.18: py3-nltk

Natural Language Toolkit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 0.5%
Forks count: 0.8%
Stargazers count: 1.3%
Maintainers (1)
Last synced: 6 months ago
alpine-edge: py3-nltk

Natural Language Toolkit

  • Versions: 8
  • Dependent Packages: 2
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Forks count: 0.7%
Stargazers count: 1.3%
Average: 1.4%
Dependent packages count: 3.4%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: nltk
  • Homepage: http://nltk.org/
  • License: Apache-2.0
  • Latest release: 3.6.7
    published about 4 years ago
  • Versions: 15
  • Dependent Packages: 43
  • Dependent Repositories: 717
Rankings
Dependent repos count: 0.9%
Dependent packages count: 1.6%
Average: 1.8%
Forks count: 2.1%
Stargazers count: 2.5%
Last synced: 6 months ago
alpine-edge: py3-nltk-pyc

Precompiled Python bytecode for py3-nltk

  • Versions: 6
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Forks count: 0.8%
Stargazers count: 1.4%
Average: 4.1%
Dependent packages count: 14.1%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.17: py3-nltk

Natural Language Toolkit

  • Versions: 1
  • Dependent Packages: 2
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Forks count: 0.8%
Stargazers count: 1.3%
Average: 5.3%
Dependent packages count: 19.0%
Maintainers (1)
Last synced: 6 months ago
anaconda.org: nltk

NLTK has been called a wonderful tool for teaching and working in computational linguistics using Python and an amazing library to play with natural language.

  • Versions: 16
  • Dependent Packages: 4
  • Dependent Repositories: 717
Rankings
Dependent repos count: 5.7%
Forks count: 6.1%
Stargazers count: 6.8%
Average: 10.0%
Dependent packages count: 21.6%
Last synced: 6 months ago
alpine-v3.22: py3-nltk-pyc

Precompiled Python bytecode for py3-nltk

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.21: py3-nltk-pyc

Precompiled Python bytecode for py3-nltk

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.21: py3-nltk

Natural Language Toolkit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.20: py3-nltk-pyc

Precompiled Python bytecode for py3-nltk

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.22: py3-nltk

Natural Language Toolkit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.19: py3-nltk

Natural Language Toolkit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.19: py3-nltk-pyc

Precompiled Python bytecode for py3-nltk

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago
alpine-v3.20: py3-nltk

Natural Language Toolkit

  • Versions: 1
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Dependent repos count: 0.0%
Dependent packages count: 0.0%
Average: 100%
Maintainers (1)
Last synced: 6 months ago

Dependencies

.github/workflows/cffconvert.yml actions
  • actions/checkout v3 composite
  • citation-file-format/cffconvert-github-action 2.0.0 composite
.github/workflows/ci.yaml actions
  • actions/cache v3 composite
  • actions/checkout v3 composite
  • actions/setup-java v3 composite
  • actions/setup-python v3 composite
  • pre-commit/action v2.0.3 composite
.github/workflows/labeler.yml actions
  • actions/labeler v4 composite
requirements-ci.txt pypi
  • click *
  • gensim >=4.0.0
  • markdown-it-py *
  • matplotlib *
  • mdit-plain *
  • mdit-py-plugins *
  • pytest *
  • pytest-mock *
  • pytest-xdist *
  • pyyaml *
  • regex *
  • scikit-learn *
  • tqdm *
  • twython *
requirements-test.txt pypi
  • pylint * test
  • pytest >=6.0.1 test
  • pytest-cov >=2.10.1 test
  • pytest-mock * test
  • tox * test
setup.py pypi
  • click *
  • joblib *
  • regex >=2021.8.3
  • tqdm *