https://github.com/clarin-eric/pressmint
PressMint: Interoperable Corpora of Historical Newspapers
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (6.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: clarin-eric
- Default Branch: main
- Homepage: https://www.clarin.eu/pressmint
- Size: 50.8 KB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 1
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
PressMint: Interoperable Corpora of Historical Newspapers
The CLARIN PressMint project plans to compile corpora of historical newspapers for a number of countries and languages.
PressMint corpora are to be interoperable, i.e. encoded to a common PressMint schema, which is a customisation of the TEI Guidelines, with various downstream formats (TSV, CoNLL-U, JSON, etc.) also available. The same scripts should be able to process the common data in any PressMint corpus, despite the different kinds of information included in the individual corpora.
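To illustrate what deriving a downstream format from a common TEI encoding could look like, here is a minimal sketch that flattens tokens into TSV. The PressMint schema is not yet published in this repository, so the sample document, the use of `<s>`, `<w>` and `<pc>` elements with a `lemma` attribute, the TSV column names, and the function name `tei_tokens_to_tsv` are all assumptions made for illustration only; the TEI namespace URI itself is the standard one.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Standard TEI and XML namespaces.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical sample: the real PressMint encoding may differ.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text><body><div>
    <s xml:id="s1">
      <w lemma="the">The</w>
      <w lemma="paper">papers</w>
      <pc>.</pc>
    </s>
  </div></body></text>
</TEI>"""


def tei_tokens_to_tsv(tei_xml: str) -> str:
    """Flatten word and punctuation tokens into one TSV row per token."""
    root = ET.fromstring(tei_xml)
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(["sentence_id", "token", "lemma"])
    for sent in root.iterfind(".//tei:s", TEI_NS):
        sent_id = sent.get(XML_ID, "")
        for tok in sent:
            # Strip the namespace to get the local element name.
            local_name = tok.tag.split("}")[-1]
            if local_name in ("w", "pc"):
                token_text = (tok.text or "").strip()
                writer.writerow([sent_id, token_text, tok.get("lemma", "")])
    return out.getvalue()


TSV_OUTPUT = tei_tokens_to_tsv(SAMPLE_TEI)
print(TSV_OUTPUT, end="")
```

Because the function only relies on elements assumed to be common to every corpus, the same script could in principle run unchanged over any corpus that follows the shared schema, ignoring whatever corpus-specific information is also present.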
The PressMint Git workflow, scripts and documentation will be based on the ParlaMint project, which builds richly annotated corpora of parliamentary proceedings for a large number of countries and autonomous regions.
This Git repository is, as yet, a stub with content still to be added. Note that there are several branches for different parts of the development.
The repository contains the following directories:
- The Samples directory contains one subdirectory per contributing (CLARIN) country. It will eventually include samples for all variants and formats of the PressMint corpora.
Owner
- Name: CLARIN ERIC
- Login: clarin-eric
- Kind: organization
- Email: trac@clarin.eu
- Location: Utrecht, The Netherlands
- Website: https://www.clarin.eu/content/development-information
- Repositories: 132
- Profile: https://github.com/clarin-eric
CLARIN central source code hub
GitHub Events
Total
- Member event: 6
- Push event: 16
- Pull request event: 3
- Create event: 4
Last Year
- Member event: 6
- Push event: 16
- Pull request event: 3
- Create event: 4