Science Score: 67.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: links to zenodo.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.1%) to scientific vocabulary
Repository
Parallel corpus for Early Modern French
Basic Info
- Host: GitHub
- Owner: FreEM-corpora
- Language: Python
- Default Branch: master
- Size: 3.92 MB
Statistics
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 2
Metadata Files
README.md
FreEM Norm corpus
WARNING: This repository supersedes [PARALLEL17](https://github.com/e-ditiones/PARALLEL17), which is no longer maintained.
Parallel corpus (diplomatic vs normalised) of 17th c. French texts.
For more information about FreEM corpora, cf. our website.
Corpus
The corpus is available in the corpus folder.
A detailed list of the content is available here.
Transcriptions
Transcriptions are near-diplomatic. The long ſ is kept (plaiſir, not plaisir). Ligatures that have since disappeared (ſt, st, ct) are not kept, whereas those still used in contemporary French (œ, æ) are kept.
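The conventions above can be illustrated with a minimal Python sketch. It is not part of the FreEM tooling: the function names are hypothetical, and it assumes that "not kept" means an obsolete ligature is expanded into its component letters. Only the two ligatures that have a standard Unicode code point (U+FB05 and U+FB06) are handled; the ct ligature has none and is left out.

```python
# Hypothetical helpers illustrating the transcription conventions.
# They are NOT the project's normaliser.

# Obsolete typographic ligatures with a standard Unicode code point,
# mapped to their component letters (assumed expansion).
OBSOLETE_LIGATURES = {
    "\ufb05": "ſt",  # long s + t ligature
    "\ufb06": "st",  # s + t ligature
}


def apply_transcription_conventions(text: str) -> str:
    """Expand obsolete ligatures; leave long ſ, œ and æ untouched."""
    for ligature, expansion in OBSOLETE_LIGATURES.items():
        text = text.replace(ligature, expansion)
    return text


def modernise_long_s(text: str) -> str:
    """Replace long ſ with s, only to show the plaiſir/plaisir pair."""
    return text.replace("ſ", "s")


if __name__ == "__main__":
    print(apply_transcription_conventions("plaiſir"))
    print(apply_transcription_conventions("e\ufb06re"))
    print(modernise_long_s("plaiſir"))
```

The first function corresponds to the diplomatic side described here. The second covers just one spelling difference between the diplomatic and normalised sides, so it is no substitute for a full normaliser.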
Use the normaliser
[TO DO]
Contribute
If you want to contribute, clone the repository and send us a pull request, or send an email to simon.gabay[at]unige.ch.
Acknowledgments
Additional data and corrections have been provided by Philippe Gambette (GitHub) and Jonathan Poinhos.
Cite this repository
If you use the data:
bibtex
@software{gabay_simon_2022_6481179,
author = {Gabay, Simon and
Gambette, Philippe},
title = {{FreEM-corpora/FreEMnorm: FreEM norm Parallel
(original vs. normalised) corpus for Early Modern
French}},
month = jan,
year = 2022,
note = {If you use this software, please cite it as below.},
publisher = {Zenodo},
version = {1.0.1},
doi = {10.5281/zenodo.6481179},
url = {https://doi.org/10.5281/zenodo.6481179}
}
You can additionally cite one of our latest publications:
bibtex
@inproceedings{gabay:hal-02276150,
TITLE = {{A Workflow For On The Fly Normalisation Of 17th c. French}},
AUTHOR = {Gabay, Simon and Riguet, Marine and Barrault, Lo{\"i}c},
URL = {https://hal.archives-ouvertes.fr/hal-02276150},
BOOKTITLE = {{DH2019}},
ADDRESS = {Utrecht, Netherlands},
ORGANIZATION = {{ADHO}},
YEAR = {2019},
MONTH = Jul,
KEYWORDS = {17th Century France ; Parallel corpus building},
PDF = {https://hal.archives-ouvertes.fr/hal-02276150/file/DH2019_final.pdf},
HAL_ID = {hal-02276150},
HAL_VERSION = {v1},
}
bibtex
@inproceedings{gabay:hal-02596669,
TITLE = {{Traduction automatique pour la normalisation du fran{\c c}ais du XVII e si{\`e}cle}},
AUTHOR = {Gabay, Simon and Barrault, Lo{\"i}c},
URL = {https://hal.archives-ouvertes.fr/hal-02596669},
BOOKTITLE = {{TALN 2020}},
ADDRESS = {Nancy, France},
ORGANIZATION = {{ATALA}},
SERIES = {27{\`e}me Conf{\'e}rence sur le Traitement Automatique des Langues Naturelles},
YEAR = {2020},
MONTH = Jun,
KEYWORDS = {Normalisation ; 17th c French ; Neural Machine Translation (NMT) ; Statistical Machine Translation (SMT) ; Digital humanities ; Humanit{\'e}s num{\'e}riques ; Fran{\c c}ais classique ; Traduction automatique neuronale ; Traduction automatique statistique},
PDF = {https://hal.archives-ouvertes.fr/hal-02596669/file/main.pdf},
HAL_ID = {hal-02596669},
HAL_VERSION = {v1},
}
bibtex
@inproceedings{gabay:hal-03596653,
TITLE = {{Automatic Normalisation of Early Modern French}},
AUTHOR = {Bawden, Rachel and Poinhos, Jonathan and Kogkitsidou, Eleni and Gambette, Philippe and Sagot, Beno{\^i}t and Gabay, Simon},
URL = {https://hal.inria.fr/hal-03596653},
BOOKTITLE = {{Proceedings of the 13th Language Resources and Evaluation Conference}},
ADDRESS = {Marseille, France},
ORGANIZATION = {{European Language Resources Association}},
YEAR = {2022},
MONTH = Jun,
HAL_ID = {hal-03540226},
HAL_VERSION = {v1},
}
Please keep me posted if you use this data!
Contact
simon.gabay[at]unige.ch
Licence

This work is licensed under a Creative Commons Attribution 4.0 International Licence.
Owner
- Name: FreEM-corpora
- Login: FreEM-corpora
- Kind: organization
- Repositories: 2
- Profile: https://github.com/FreEM-corpora
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: Gabay
    given-names: Simon
    orcid: https://orcid.org/0000-0001-9094-4475
  - family-names: Gambette
    given-names: Philippe
    orcid: https://orcid.org/0000-0001-7062-0262
title: "FreEM-corpora/FreEMnorm: FreEM norm Parallel (original vs. normalised) corpus for Early Modern French"
version: "1.0.1"
doi: "10.5281/zenodo.5865428"
license: cc-by-4.0
date-released: 2022-01-17
GitHub Events
Total
- Watch event: 1
- Push event: 11
Last Year
- Watch event: 1
- Push event: 11