lingmatch

An all-in-one R package for the assessment of linguistic similarity

https://github.com/miserman/lingmatch

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    1 of 2 committers (50.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.3%) to scientific vocabulary

Keywords

nlp r rcpp text-analysis
Last synced: 6 months ago

Repository

Statistics
  • Stars: 11
  • Watchers: 3
  • Forks: 0
  • Open Issues: 0
  • Releases: 5
Created over 8 years ago · Last pushed about 1 year ago

README.md

lingmatch

An all-in-one R package for the assessment of linguistic matching and/or accommodation.

features

  • Input raw text, a document-term matrix (DTM), or LIWC output.
  • Apply various weighting functions to a DTM.
  • Measure similarity and/or accommodation with various metrics.
  • Calculate standard forms of Language Style Matching (LSM) and Latent Semantic Similarity (LSS).
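
To give a rough sense of what the DTM and similarity features involve, here is a minimal base-R sketch that builds a tiny document-term matrix and computes cosine similarity between two rows. It does not call lingmatch; the naive tokenizer and the helper names `tokenize` and `cosine` are assumptions made for illustration, and the package's own tokenization may differ.

```R
# Two short texts to compare.
texts <- c("First text to look at.", "Text to compare that text with.")

# Naive tokenizer (an assumption): lowercase, then split on non-letters.
tokenize <- function(x) {
  tokens <- strsplit(tolower(x), "[^a-z']+")[[1]]
  tokens[nzchar(tokens)]
}
tokens <- lapply(texts, tokenize)

# Document-term matrix: one row per text, one column per unique term.
vocab <- sort(unique(unlist(tokens)))
dtm <- t(vapply(
  tokens,
  function(tk) as.numeric(table(factor(tk, levels = vocab))),
  numeric(length(vocab))
))
colnames(dtm) <- vocab

# Cosine similarity between the two raw word-count vectors.
cosine <- function(a, b) sum(a * b) / sqrt(sum(a^2) * sum(b^2))
cosine_similarity <- cosine(dtm[1, ], dtm[2, ])
print(cosine_similarity)
```

Weighting would slot in between the two steps: the counts in `dtm` would be transformed before the similarity is computed.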

installation

Download R from r-project.org, then install the package from an R console:

Release (version 1.0.7)

```R
install.packages("lingmatch")
```

Development (version 1.0.8)

```R
# install.packages("remotes")
remotes::install_github("miserman/lingmatch")
```

And load the package:

```R
library(lingmatch)
```

examples

You can make a quick comparison between two bits of text; by default, this gives the cosine similarity between raw word-count vectors:

```R
lingmatch("First text to look at.", "Text to compare that text with.")
```

Or, given a vector of texts:

```R
text = c(
  "Why, hello there! How are you this evening?",
  "I am well, thank you for your inquiry!",
  "You are a most good at social interactions person!",
  "Why, thank you! You're not all bad yourself!"
)
```

Process the texts in one step:

```R
# with a dictionary
inquirer_cats = lma_process(text, dict = "inquirer", dir = "~/Dictionaries")

# with a latent semantic space
glove_vectors = lma_process(text, space = "glove", dir = "~/Latent Semantic Spaces")
```

Or process the texts step by step, then measure similarity between each:

```R
dtm = lma_dtm(text)
dtm_weighted = lma_weight(dtm)
dtm_categorized = lma_termcat(dtm_weighted, lma_dict(1:9))
similarity = lma_simets(dtm_categorized, metric = "canberra")
```

Or do that within a single function call:

```R
similarity = lingmatch(
  text,
  weight = "frequency", dict = lma_dict(1:9), metric = "canberra"
)$sim
```

Or, if you want a standard form (as in this example), specify a default:

```R
similarity = lingmatch(text, type = "lsm")$sim
```
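
For intuition about what a standard LSM score measures, the sketch below applies the commonly published per-category formula, `1 - |a - b| / (a + b + .0001)`, averaged over categories. It is written in base R and is not taken from the package's source; the category names and percentages are made-up values, so treat it as an illustration under those assumptions rather than a reproduction of lingmatch's output.

```R
# Hypothetical percentages of words in each function-word category
# for two texts (made-up values, for illustration only).
text_a <- c(ppron = 10.0, article = 6.0, prep = 12.0, conj = 5.0)
text_b <- c(ppron = 8.0, article = 6.0, prep = 15.0, conj = 0.0)

# Per-category matching: 1 - |a - b| / (a + b + .0001).
# The small constant guards against division by zero when both are 0.
category_lsm <- 1 - abs(text_a - text_b) / (text_a + text_b + 0.0001)

# Overall LSM is the mean across categories.
lsm <- mean(category_lsm)
print(round(category_lsm, 3))
print(round(lsm, 3))
```

Identical category rates score 1, while a category used by only one text scores close to 0, which is why a single mismatched category can pull the average down noticeably.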

Owner

  • Name: Micah Iserman
  • Login: miserman
  • Kind: user

GitHub Events

Total
  • Watch event: 1
  • Push event: 2
Last Year
  • Watch event: 1
  • Push event: 2

Committers

Last synced: about 2 years ago

All Time
  • Total Commits: 137
  • Total Committers: 2
  • Avg Commits per committer: 68.5
  • Development Distribution Score (DDS): 0.401
Past Year
  • Commits: 20
  • Committers: 1
  • Avg Commits per committer: 20.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name       Email           Commits
miserman   m****n@t****u   82
Micah      m****n@g****m   55
Committer Domains (Top 20 + Academic)
ttu.edu: 1
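
The all-time figures above are consistent with a Development Distribution Score defined as one minus the top committer's share of commits. That definition is an assumption here, since the page does not state it; the short R check below uses only the commit counts listed under Top Committers.

```R
# Commit counts as listed under "Top Committers".
commits <- c(miserman = 82, Micah = 55)

total_commits <- sum(commits)
avg_per_committer <- total_commits / length(commits)

# Assumed definition: DDS = 1 - (top committer's commits / total commits).
dds <- 1 - max(commits) / total_commits

print(total_commits)
print(avg_per_committer)
print(round(dds, 3))
```

Under the same assumed definition, a year with a single committer gives `1 - 20 / 20 = 0`, which matches the Past Year value.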

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: about 1 month
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • amehtaSF (2)
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • cran 306 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 3
  • Total versions: 8
  • Total maintainers: 1
cran.r-project.org: lingmatch

Linguistic Matching and Accommodation

  • Versions: 8
  • Dependent Packages: 0
  • Dependent Repositories: 3
  • Downloads: 306 Last month
Rankings
Dependent repos count: 16.4%
Stargazers count: 17.4%
Average: 24.9%
Forks count: 27.8%
Dependent packages count: 28.6%
Downloads: 34.4%
Maintainers (1)
Last synced: 6 months ago

Dependencies

DESCRIPTION cran
  • Matrix * depends
  • R >= 3.5 depends
  • methods * depends
  • Rcpp * imports
  • RcppParallel * imports
  • knitr * suggests
  • rmarkdown * suggests
  • splot * suggests
  • testthat >= 2.1.0 suggests