wopr

Memory Based Word Predictor/Language Model http://ilk.uvt.nl/wopr/

https://github.com/languagemachines/wopr

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.7%) to scientific vocabulary

Keywords

language-modelling lm nlp

Keywords from Contributors

computational-linguistics folia punctuation tokeniser
Last synced: 9 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: LanguageMachines
  • License: other
  • Language: C++
  • Default Branch: master
  • Size: 1.75 MB
Statistics
  • Stars: 5
  • Watchers: 6
  • Forks: 0
  • Open Issues: 1
  • Releases: 4
Created about 11 years ago · Last pushed over 1 year ago
Metadata Files
Readme Changelog License Authors Codemeta

README

Info
====

Run 'sh bootstrap' first if starting with a fresh checkout.

Dependencies
============

Wopr needs libticcutils and timbl, and optionally libfolia.

General
=======

./configure
make

With a local Timbl installation:

./configure --prefix=/home/pberck/local --with-timbl=/home/pberck/local
make
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/pberck/local/lib

If Timbl has been installed system-wide, the configure script should be
able to find it without the --with-timbl option.

Timbl support can be explicitly disabled by specifying --without-timbl. You
will be left with a Wopr which can create data sets and run an n-gram
language model.
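
The build steps above can be collected into one small shell function. This
is only a sketch: build_wopr is a hypothetical helper name, it assumes that
'sh bootstrap' is what generates the configure script on a fresh checkout,
and it assumes that a local Timbl lives under the same prefix that Wopr is
installed into (as in the /home/pberck/local example).

```shell
# Hypothetical helper (not part of Wopr): run from the top of a checkout.
# $1 is an optional prefix that is assumed to hold a local Timbl install.
build_wopr() {
    prefix=${1:-}

    # Assumption: a fresh checkout has no configure script until
    # 'sh bootstrap' has been run, so bootstrap only when it is missing.
    [ -x ./configure ] || sh bootstrap || return 1

    if [ -n "$prefix" ]; then
        # Local Timbl: install Wopr into the prefix and look for Timbl there.
        ./configure --prefix="$prefix" --with-timbl="$prefix" || return 1
    else
        # System-wide Timbl: configure should find it on its own.
        ./configure || return 1
    fi

    make
}
```

Calling build_wopr with no argument follows the plain ./configure and make
path; calling it as build_wopr /home/pberck/local follows the local Timbl
path. LD_LIBRARY_PATH still has to be exported separately in that case.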

OS X
====

./configure --with-timbl=/Users/pberck/install/

If needed, point to the Homebrew Unicode (ICU) libraries:

export DYLD_LIBRARY_PATH=/usr/local/Cellar/icu4c/50.1/lib/:$DYLD_LIBRARY_PATH

Debian 
======

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

This seems to be needed when the libraries are installed in the default
place. If they are installed somewhere else, use the following instead
(with the correct path filled in):

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/pberck/local/lib
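
The export lines above append unconditionally. If LD_LIBRARY_PATH starts out
empty, the result begins with a colon, and the dynamic linker on glibc-based
systems is documented to treat an empty entry as the current directory.
Sourcing the same line repeatedly also adds the directory more than once. A
minimal sketch of a more careful variant, where add_lib_path is a
hypothetical helper name and not something Wopr ships:

```shell
# Hypothetical helper: append a directory to LD_LIBRARY_PATH only if it is
# not already listed, without creating an empty leading entry.
add_lib_path() {
    dir=$1
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*)
            ;;  # already present, nothing to do
        *)
            # Insert the separating colon only when the variable is non-empty.
            LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir"
            ;;
    esac
    export LD_LIBRARY_PATH
}

add_lib_path /usr/local/lib
```

Replace /usr/local/lib with the directory that actually holds the Timbl
and ticcutils libraries, for example /home/pberck/local/lib.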

Owner

  • Name: Language Machines
  • Login: LanguageMachines
  • Kind: organization
  • Email: proycon@anaproy.nl
  • Location: Nijmegen, The Netherlands

NLP Research group at Centre for Language Studies, Radboud University Nijmegen

CodeMeta (codemeta.json)

{
  "@context": [
    "https://doi.org/10.5063/schema/codemeta-2.0",
    "http://schema.org",
    {
      "entryPoints": {
        "@reverse": "schema:actionApplication"
      },
      "interfaceType": {
        "@id": "codemeta:interfaceType"
      }
    }
  ],
  "@type": "SoftwareSourceCode",
  "identifier": "wopr",
  "name": "Wopr",
  "version": "0.43",
  "description": "WOPR is a wrapper around the k-nearest neighbor classifier in TiMBL, offering word prediction and language modeling functionalities. Trained on a text corpus, WOPR can predict missing words, report perplexities at the word level and the text level, and generate spelling correction hypotheses.",
  "license": "https://spdx.org/licenses/GPL-3.0",
  "url": "https://ilk.uvt.nl/wopr",
  "producer": {
    "@id": "https://www.ru.nl/cls",
    "@type": "Organization",
    "name": "Centre for Language Studies",
    "url": "https://www.ru.nl/cls",
    "parentOrganization": {
      "@id": "https://www.ru.nl",
      "name": "Radboud University",
      "@type": "Organization",
      "url": "https://www.ru.nl",
      "location": {
        "@type": "Place",
        "name": "Nijmegen"
      }
    }
  },
  "author": [
    {
      "@type": "Person",
      "givenName": "Peter",
      "familyName": "Berck",
      "affiliation": {
        "@id": "https://www.ru.nl/cls"
      }
    },
    {
      "@type": "Person",
      "givenName": "Ko",
      "familyName": "van der Sloot",
      "affiliation": {
        "@id": "https://www.ru.nl/cls"
      }
    }
  ],
  "sourceOrganization": {
    "@id": "https://www.ru.nl/cls"
  },
  "programmingLanguage": {
    "@type": "ComputerLanguage",
    "identifier": "c++",
    "name": "C++"
  },
  "operatingSystem": "POSIX",
  "codeRepository": "https://github.com/LanguageMachines/wopr",
  "softwareRequirements": [
    {
      "@type": "SoftwareApplication",
      "identifier": "ticcutils",
      "name": "ticcutils"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "timbl",
      "name": "timbl"
    },
    {
      "@type": "SoftwareApplication",
      "identifier": "libfolia",
      "name": "libfolia"
    }
  ],
  "readme": "https://github.com/LanguageMachines/wopr/blob/master/README",
  "issueTracker": "https://github.com/LanguageMachines/wopr/issues",
  "contIntegration": "https://travis-ci.org/LanguageMachines/wopr",
  "releaseNotes": "https://github.com/LanguageMachines/wopr/releases",
  "developmentStatus": "inactive",
  "keywords": [
    "nlp",
    "natural language processing",
    "language modelling"
  ],
  "referencePublication": [
    {
      "@type": "TechArticle",
      "name": "Wopr",
      "author": [
        "Peter Berck"
      ],
      "datePublished": "2012",
      "url": "https://ilk.uvt.nl/wopr/woprdoc.pdf"
    }
  ],
  "dateCreated": "2008-04-27",
  "entryPoints": [
    {
      "@type": "EntryPoint",
      "name": "wopr",
      "urlTemplate": "file:///wopr",
      "description": "Command-line interface to WOPR",
      "interfaceType": "CLI"
    }
  ]
}

GitHub Events

Total
  • Push event: 2
Last Year
  • Push event: 2

Committers

Last synced: 11 months ago

All Time
  • Total Commits: 1,102
  • Total Committers: 7
  • Avg Commits per committer: 157.429
  • Development Distribution Score (DDS): 0.216
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name                 Email             Commits
pberck               p****k@1****3     864
(no author)          (****)@1****3     87
Ko van der Sloot     K****t@l****l     76
sloot                s****t@1****3     54
Peter Berck          p****r@b****e     15
Maarten van Gompel   p****n@a****l     5
antalb               a****b@1****3     1

Issues and Pull Requests

Last synced: 11 months ago

All Time
  • Total issues: 3
  • Total pull requests: 0
  • Average time to close issues: about 10 hours
  • Average time to close pull requests: N/A
  • Total issue authors: 3
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • proycon (1)
  • matjemeisje (1)
  • kosloot (1)
Pull Request Authors
Top Labels
Issue Labels
bug (2) PRIORITY (1)
Pull Request Labels