Samewords

Samewords: Automatic word disambiguation in critical text editions - Published in JOSS (2019)

https://github.com/stenskjaer/samewords

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 7 DOI reference(s) in README and JOSS metadata
  • Academic publication links
    Links to: joss.theoj.org, zenodo.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Keywords from Contributors

mesh
Last synced: 6 months ago

Repository

Automatically annotate potentially ambiguous words in critical text editions made with LaTeX and reledmac.

Basic Info
  • Host: GitHub
  • Owner: stenskjaer
  • License: mit
  • Language: Python
  • Default Branch: master
  • Homepage:
  • Size: 632 KB
Statistics
  • Stars: 7
  • Watchers: 2
  • Forks: 1
  • Open Issues: 8
  • Releases: 2
Created almost 9 years ago · Last pushed almost 5 years ago
Metadata Files
Readme Changelog License Codemeta

README.md

Samewords


Word disambiguation in critical text editions

In critical text editions, notes in the critical apparatus are normally keyed to the line on which the words occur. This leads to ambiguous references when an apparatus note refers to a word that occurs more than once in a line. For example:

We have a passage of text here, such a nice place for a critical
note.

----
1 a] om. M

It is unclear which of the three instances of "a" in line 1 the note refers to.
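The ambiguity can be made concrete with a few lines of Python. This is only a minimal sketch of the problem, not Samewords' own code: the function names `word_counts` and `is_ambiguous` are hypothetical, and the line is treated as plain text without any LaTeX markup.

```python
import re
from collections import Counter


def word_counts(line):
    """Count how often each word occurs in one typeset line (case-insensitive)."""
    return Counter(re.findall(r"\w+", line.lower()))


def is_ambiguous(lemma, line):
    """A lemma is ambiguous when it occurs more than once in the line it is keyed to."""
    return word_counts(line)[lemma.lower()] > 1


line = "We have a passage of text here, such a nice place for a critical"
print(word_counts(line)["a"])         # candidate referents for the note "a] om. M"
print(is_ambiguous("a", line))        # the note on "a" is ambiguous
print(is_ambiguous("passage", line))  # a note on "passage" would not be
```

A note on "passage" would be unproblematic, since that word occurs only once in the line.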

Reledmac is a great LaTeX package that facilitates typesetting critical editions of prime quality. It already provides facilities for disambiguating identical words, but it requires the creator of the critical text to mark all potential instances of ambiguous references manually (see the reledmac handbook for the details on that). Samewords automates this step for the editor.
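To give a rough idea of what automating this step means, the sketch below wraps every repeated word of a plain-text passage in `\sameword{}`. It is a naive illustration under stated assumptions, not the algorithm Samewords uses: the function name `mark_repeated_words` is hypothetical, the input is assumed to contain no LaTeX macros (their names would be matched as words), and the details of how reledmac expects lemma words inside `\edtext` to be marked are left out. It marks repeats across the whole passage because line breaks are not known before typesetting.

```python
import re
from collections import Counter


def mark_repeated_words(text):
    r"""Wrap every word that occurs more than once in `text` in \sameword{...}."""
    counts = Counter(w.lower() for w in re.findall(r"\w+", text))

    def wrap(match):
        word = match.group(0)
        # Only words with at least one other occurrence can be ambiguous.
        return rf"\sameword{{{word}}}" if counts[word.lower()] > 1 else word

    return re.sub(r"\w+", wrap, text)


print(mark_repeated_words("a nice place for a note"))
```

For real editions, use the `samewords` command described below; the sketch only shows the shape of the transformation.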

Install and usage

    pip3 install samewords

That's it!

This requires Python 3.6 to be installed on your system. For more details on installation, see the installation section of the documentation.

Now call the script with the file you want annotated as the only argument to get the annotated version back in the terminal.

    samewords my-awesome-edition.tex

This will send the annotated version to stdout. To see that it actually contains some \sameword{} macros, you can try running it through grep:

    samewords my-awesome-edition.tex | grep sameword
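If you would rather drive the tool from Python, the same check can be sketched with the standard library. The helper names `annotate` and `lines_with_sameword` are hypothetical. The sketch assumes only what is described above: that the `samewords` command is on your `PATH` and prints the annotated text to stdout when given a file as its only argument.

```python
import subprocess
from typing import List


def annotate(path: str) -> str:
    """Run `samewords <path>` and return the annotated text it prints to stdout."""
    result = subprocess.run(
        ["samewords", path], capture_output=True, text=True, check=True
    )
    return result.stdout


def lines_with_sameword(text: str) -> List[str]:
    """Return the lines that contain "sameword", like `grep sameword`."""
    return [line for line in text.splitlines() if "sameword" in line]
```

Calling `lines_with_sameword(annotate("my-awesome-edition.tex"))` would then list the annotated lines.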

You can define an output location with the --output option:

    samewords --output ~/Desktop/test/output my-awesome-edition.tex

This will check whether ~/Desktop/test/output is a directory or a file. If it is a directory, it will put the file inside that directory (with the original name). If it is a file, it will ask you whether you want to overwrite it. If it is neither a directory nor a file, it will create the file output and write the content to that.
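The three cases can be summarised in a short sketch. This is an illustration of the behaviour described above, not the actual implementation: the function `resolve_output` and its return shape are hypothetical, and what happens when a file with the original name already exists inside the target directory is left open here.

```python
from pathlib import Path


def resolve_output(output, source):
    """Return (target path, needs_confirmation) for an --output value."""
    out = Path(output).expanduser()
    if out.is_dir():
        # Directory: keep the original file name inside it.
        return out / Path(source).name, False
    if out.is_file():
        # Existing file: the user has to confirm the overwrite.
        return out, True
    # Neither: a new file is created at exactly this path.
    return out, False
```

With `~/Desktop/test/output` as an existing directory, the target would be `~/Desktop/test/output/my-awesome-edition.tex`.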

Alternatively, regular shell redirection works just as well on Unix systems:

    samewords my-beautiful-edition.tex > ~/Desktop/test/output.tex

See more in the documentation.

Owner

  • Name: Michael Stenskjær Christensen
  • Login: stenskjaer
  • Kind: user

JOSS Publication

Samewords: Automatic word disambiguation in critical text editions
Published
April 13, 2019
Volume 4, Issue 36, Page 941
Authors
Michael Stenskjær Christensen ORCID
Saxo-Institute, University of Copenhagen, Representation and Reality, University of Gothenburg
Editor
Kevin M. Moerman ORCID
Tags
LaTeX textual criticism text editing

CodeMeta (codemeta.json)

{
  "@context": "https://raw.githubusercontent.com/codemeta/codemeta/master/codemeta.jsonld",
  "@type": "Code",
  "author": [
    {
      "@id": "https://orcid.org/0000-0002-8190-679X",
      "@type": "Person",
      "email": "michael.stenskjaer@gmail.com",
      "name": "Michael Stenskjær Christensen",
      "affiliation": "University of Copenhagen, University of Gothenburg"
    }
  ],
  "identifier": "http://doi.org/10.5281/zenodo.1306293",
  "codeRepository": "https://github.com/stenskjaer/samewords",
  "datePublished": "2018-07-05",
  "dateModified": "2018-07-05",
  "dateCreated": "2018-07-05",
  "description": "Samewords automatically annotates potentially ambiguous words in critical text editions made with Reledmac in LaTeX.",
  "keywords": "Scholarly editing, Critical editing, LaTeX",
  "license": "MIT",
  "title": "Samewords",
  "version": "v0.5.0"
}

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 285
  • Total Committers: 3
  • Avg Commits per committer: 95.0
  • Development Distribution Score (DDS): 0.032
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Michael Stenskjær Christensen m****r@g****m 276
Michael Stenskjær Christensen m****n@c****m 8
dependabot[bot] 4****] 1
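The all-time figures can be reproduced from the committer table. The sketch below assumes the common definition of the Development Distribution Score, 1 minus the top committer's share of all commits; that definition is an assumption, since this page does not state it. The dictionary keys are just the masked identities shown in the table.

```python
# Commit counts per committer identity, copied from the table above.
commits = {"m****r@g****m": 276, "m****n@c****m": 8, "dependabot[bot]": 1}

total = sum(commits.values())
average = total / len(commits)
# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total

print(total, round(average, 1), round(dds, 3))
```

Note that the count of three committers treats the two e-mail identities of the same author as separate committers.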

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 45
  • Total pull requests: 1
  • Average time to close issues: 2 months
  • Average time to close pull requests: 6 days
  • Total issue authors: 5
  • Total pull request authors: 1
  • Average comments per issue: 3.78
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 1
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • floriandk (22)
  • stenskjaer (18)
  • maieul (2)
  • cmar057 (2)
  • Padlina (1)
Pull Request Authors
  • dependabot[bot] (1)
Top Labels
Issue Labels
bug (4) enhancement (1) to be implemented (1)
Pull Request Labels
dependencies (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 24 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 3
  • Total versions: 18
  • Total maintainers: 1
pypi.org: samewords

Package for disambiguation of identical terms in critical editions in LaTeX with reledmac.

  • Versions: 18
  • Dependent Packages: 0
  • Dependent Repositories: 3
  • Downloads: 24 Last month
Rankings
Dependent repos count: 9.0%
Dependent packages count: 10.0%
Average: 17.3%
Stargazers count: 19.3%
Forks count: 22.6%
Downloads: 25.6%
Maintainers (1)
Last synced: 6 months ago

Dependencies

Pipfile pypi
  • black * develop
  • pytest ==5.3.2 develop
  • regex ==2018.8.17
  • samewords *
Pipfile.lock pypi
  • appdirs ==1.4.4 develop
  • attrs ==20.3.0 develop
  • black ==19.10b0 develop
  • click ==8.0.0rc1 develop
  • importlib-metadata ==4.0.0 develop
  • more-itertools ==8.7.0 develop
  • packaging ==20.9 develop
  • pathspec ==0.8.1 develop
  • pluggy ==0.13.1 develop
  • py ==1.10.0 develop
  • pyparsing ==3.0.0b2 develop
  • pytest ==5.3.2 develop
  • regex ==2018.8.17 develop
  • toml ==0.10.2 develop
  • typed-ast ==1.4.3 develop
  • typing-extensions ==3.7.4.3 develop
  • wcwidth ==0.2.5 develop
  • zipp ==3.4.1 develop
  • regex ==2018.8.17
  • samewords *
setup.py pypi
  • regex ==2018.8.17