Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (18.3%) to scientific vocabulary
Keywords
r
r-package
rstats
wordnet
Last synced: 9 months ago
Repository
📖 A Very Nice Interface To Princeton's WordNet
Basic Info
Statistics
- Stars: 6
- Watchers: 2
- Forks: 1
- Open Issues: 1
- Releases: 1
Created over 4 years ago
· Last pushed over 3 years ago
Metadata Files
Readme
License
README.Rmd
---
output: github_document
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
devtools::load_all()
```
# sehrnett
The goal of sehrnett is to provide a nice (and fast) interface to [Princeton's WordNet](https://wordnet.princeton.edu/). Unlike the original [wordnet package](https://cran.r-project.org/package=wordnet) (Feinerer et al., 2020), you need neither to install WordNet nor to set up rJava.
The data is not included in the package. If the data is not yet available locally, run `download_wordnet()` to download it from the Internet (~100 MB zipped, ~400 MB unzipped). Please make sure you agree with the [WordNet License](https://wordnet.princeton.edu/license-and-commercial-use).
## Installation
``` r
devtools::install_github("chainsawriot/sehrnett")
```
## `get_lemmas`
The most basic function is `get_lemmas`. It returns basic information about the lemmas[^1] you provide.
```r
library(sehrnett)
```
```{r}
get_lemmas(c("very", "nice"))
```
```{r}
get_lemmas("nice")
```
```{r}
get_lemmas("nice", pos = "n")
```
Please note that some definitions in WordNet are considered pejorative or offensive, e.g.
```{r}
get_lemmas("dog")
```
### Dot notation
The dot notation ("lemma.pos.sensenum") can be used to quickly search for a particular word sense. For example, one can search for "king.n.10" to pin down the word sense of "king" as a chess piece.
```{r}
get_lemmas("king.n.10")
```
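To make the structure of the dot notation concrete, here is a minimal base R sketch that splits such a string into its three parts. `parse_dot_notation()` is a hypothetical helper written for illustration, not a function of `sehrnett`, and it says nothing about how the package parses the notation internally.

```r
# Hypothetical helper (not part of sehrnett): split "lemma.pos.sensenum".
parse_dot_notation <- function(x) {
  # Anchor on the last two fields so that lemmas containing dots still parse.
  pattern <- "^(.+)\\.([a-z])\\.([0-9]+)$"
  parts <- regmatches(x, regexec(pattern, x))
  # Extract the i-th captured group, or NA when the input is not in dot notation.
  pick <- function(i) {
    vapply(parts, function(p) if (length(p) == 4) p[i] else NA_character_, character(1))
  }
  is_dot <- lengths(parts) == 4
  data.frame(
    lemma = ifelse(is_dot, pick(2), x),  # plain lemmas pass through unchanged
    pos = pick(3),
    sensenum = as.integer(pick(4)),
    stringsAsFactors = FALSE
  )
}

parse_dot_notation(c("king.n.10", "nice"))
```

Inputs that are not in dot notation keep their text as the lemma and get `NA` for `pos` and `sensenum`.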
### Lemmatization
The [morphological processing](https://wordnet.princeton.edu/documentation/morphy7wn) of the original WordNet is partially implemented in `sehrnett` [^2]. As WordNet's database contains only information about lemmas (e.g. *eat*), you need to convert inflected variants (e.g. *ate*, *eaten*, *eating*) back to their lemmas to query them. This process is known as [lemmatization](https://en.wikipedia.org/wiki/Lemmatisation).
`sehrnett` provides such lemmatization, but you need to provide exactly one `pos` and leave `lemmatize` set to `TRUE` (the default).
```{r}
get_lemmas(c("ate", "ducking"), pos = "v")
```
```{r}
get_lemmas(c("loci", "lemmata", "boxesful"), pos = "n")
```
```{r}
get_lemmas(c("nicest", "stronger"), pos = "a")
```
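The idea behind this kind of lemmatization can be sketched in a few lines of base R. The sketch below is modeled on the noun detachment rules described in WordNet's morphy documentation; it is not `sehrnett`'s actual implementation. The exception list and the vocabulary are tiny stand-ins invented for illustration, and `lemmatize_noun()` is a hypothetical name.

```r
# Noun detachment rules: replace `suffix` with `ending` to form a candidate lemma.
noun_rules <- data.frame(
  suffix = c("s", "ses", "xes", "zes", "ches", "shes", "men", "ies"),
  ending = c("",  "s",   "x",   "z",   "ch",   "sh",   "man", "y"),
  stringsAsFactors = FALSE
)

# Toy stand-ins (assumptions): the real exception list and lemma index
# come from WordNet's data, not from these vectors.
noun_exceptions <- c(loci = "locus", lemmata = "lemma")
known_lemmas <- c("dog", "box", "church", "pony", "locus", "lemma")

lemmatize_noun <- function(word) {
  # 1. Irregular forms are resolved through the exception list first.
  if (word %in% names(noun_exceptions)) return(unname(noun_exceptions[word]))
  # 2. Words that are already lemmas are returned unchanged.
  if (word %in% known_lemmas) return(word)
  # 3. Try each rule and keep the first candidate that is a known lemma.
  for (i in seq_len(nrow(noun_rules))) {
    suffix <- noun_rules$suffix[i]
    if (endsWith(word, suffix)) {
      stem <- substr(word, 1, nchar(word) - nchar(suffix))
      candidate <- paste0(stem, noun_rules$ending[i])
      if (candidate %in% known_lemmas) return(candidate)
    }
  }
  NA_character_  # no rule produced a known lemma
}

vapply(c("dogs", "boxes", "churches", "ponies", "loci"), lemmatize_noun, character(1))
```

Checking each candidate against the vocabulary is what keeps a rule such as "s" from turning *boxes* into the non-word *boxe*. It also shows why a single `pos` is needed: each part of speech has its own rules and exceptions.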
## A practical example
Suppose you want to know the synonyms of the word "nuance" (very important for academic writing). You can first search for the lemma "nuance" with `get_lemmas`.
```{r}
res <- get_lemmas("nuance")
res
```
A lemma can have multiple word senses, in which case you need to choose the one you want to convey. In this case, there is only one. You can then look up the `synsetid` (cognitive synonym identifier) of that word sense.
```{r}
# get_synonyms() is a wrapper to get_synsetids
get_synsetids(res$synsetid[1])
```
## Chainability
All `get_` functions are chainable by using the magrittr pipe operator.
```{r}
c("switch off") %>% get_lemmas(pos = "v") %>% get_synonyms
```
## `get_outdegrees`
WordNet is a network: synsetids are connected to each other in a directed graph. A node (a synsetid) is linked to another node by edges of different link types, each labelled with a `linkid`. You can list all available `linkid`s with the function `list_linktypes`.
```{r}
list_linktypes()
```
```{r}
## all hypernyms
get_lemmas("dog", pos = "n", sensenum = 1) %>% get_outdegrees(linkid = 1)
```
```{r}
## all hyponyms
get_lemmas("dog", pos = "n", sensenum = 1) %>% get_outdegrees(linkid = 2)
```
```{r}
## all antonyms
get_lemmas("nice", pos = "a", sensenum = 1) %>% get_outdegrees(linkid = 30)
```
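The directed-graph view can be illustrated with a toy edge list in base R. Everything in this sketch is an assumption made for illustration: the synset ids are invented, `out_neighbours()` is a hypothetical helper, and the real data lives in the WordNet database that `sehrnett` queries. The link ids 1, 2 and 30 are reused only to echo the examples above.

```r
# Toy edge list: each row is a directed edge from one synset to another.
edges <- data.frame(
  from   = c(1L, 1L, 2L, 3L),
  to     = c(2L, 3L, 4L, 1L),
  linkid = c(1L, 2L, 1L, 30L)
)

# Hypothetical helper: targets of the edges leaving `synsetid`,
# optionally restricted to the given link type(s).
out_neighbours <- function(edges, synsetid, linkid = NULL) {
  keep <- edges$from %in% synsetid
  if (!is.null(linkid)) keep <- keep & edges$linkid %in% linkid
  edges$to[keep]
}

out_neighbours(edges, 1L)               # follow every outgoing edge of node 1
out_neighbours(edges, 1L, linkid = 1L)  # follow only edges of link type 1
```

Filtering by `linkid` is what separates, say, the hypernyms of a synset from its antonyms: they are all outgoing edges of the same node and differ only in their label.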
### Sugars
`sehrnett` provides several syntactic sugars as `get_` functions. For example:
```{r}
## all hyponyms
get_lemmas("dog", pos = "n", sensenum = 1) %>% get_hyponyms()
```
```{r}
get_lemmas("nice", pos = "a", sensenum = 1) %>% get_antonyms()
```
```{r}
get_lemmas("nice", pos = "a", sensenum = 1) %>% get_derivatives()
```
---
[^1]: Yes, the plural of *lemma* can also be *lemmata*, you Latin-speaking people.
[^2]: Like many implementations (e.g. NLTK, Ruby's rwordnet and node-wordnet-magic), the morphological processing is only partial. Collocations and hyphenation are not supported. Therefore, please don't expect lemmatizing *asking for it* to yield *ask for it* (as documented on WordNet's website).
Owner
- Login: chainsawriot
- Kind: user
- Location: Germany
- Company: @gesistsa
- Website: http://www.chainsawriot.com
- Repositories: 241
- Profile: https://github.com/chainsawriot
Committers
Last synced: over 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| chainsawriot | c****y@g****m | 32 |
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 2
- Total pull requests: 1
- Average time to close issues: 1 day
- Average time to close pull requests: 24 minutes
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 0.5
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- chainsawriot (2)
Pull Request Authors
- chainsawriot (1)
Packages
- Total packages: 1
- Total downloads: 226 last month (CRAN)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 1
cran.r-project.org: sehrnett
A Very Nice Interface to 'WordNet'
- Homepage: https://github.com/chainsawriot/sehrnett
- Documentation: http://cran.r-project.org/web/packages/sehrnett/sehrnett.pdf
- License: GPL (≥ 3)
- Latest release: 0.1.0 (published over 3 years ago)
Rankings
Forks count: 21.9%
Stargazers count: 22.5%
Dependent packages count: 29.8%
Average: 30.0%
Dependent repos count: 35.5%
Downloads: 40.2%
Maintainers (1)
Last synced: 9 months ago
Dependencies
DESCRIPTION
cran
- DBI * imports
- RSQLite * imports
- dplyr * imports
- magrittr * imports
- purrr * imports
- tibble * imports
- utils * imports
- covr * suggests
- testthat >= 3.0.0 suggests
.github/workflows/R-CMD-check.yaml
actions
- actions/checkout v3 composite
- r-lib/actions/check-r-package v2 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite
.github/workflows/test-coverage.yaml
actions
- actions/checkout v3 composite
- actions/upload-artifact v3 composite
- r-lib/actions/setup-r v2 composite
- r-lib/actions/setup-r-dependencies v2 composite