https://github.com/alan-turing-institute/latin_annotation
Code for analysing the semantic annotation of Latin data from SemEval 2020 task 1
Science Score: 13.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (6.3%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: alan-turing-institute
- Language: Jupyter Notebook
- Default Branch: master
- Size: 49.7 MB
Statistics
- Stars: 0
- Watchers: 4
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
Latin annotation
Code for analysing the semantic annotation of Latin data from SemEval 2020 task 1
- interannotator_latinise.py (developed by Barbara McGillivray): calculates inter-annotator agreement between the annotators who annotated the word virtus.
- diachronic_analysis.py (developed by Barbara McGillivray): analyses the association between new senses of target words and CE texts.
- annotation_analysis.py (developed by Daria Kondakova): analyses the factors that influence annotation confidence, and the annotation results for individual words.
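To illustrate the kind of calculation interannotator_latinise.py is described as performing, here is a minimal sketch of an agreement measure for two annotators. It assumes Cohen's kappa over categorical labels; the README does not say which measure the script actually uses, and the function name `cohen_kappa` and the sense labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: the README does not name the agreement measure used
    in the repository; kappa is shown here only as a common choice.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, from each annotator's own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical sense labels for six passages containing one target word.
annotator_1 = ["s1", "s1", "s2", "s2", "s1", "s3"]
annotator_2 = ["s1", "s2", "s2", "s2", "s1", "s3"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A value of 1 indicates perfect agreement and a value near 0 indicates agreement no better than chance. For graded (ordinal) judgements, a weighted variant would be more appropriate.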
Confidence analysis
In the annotation task, each word was annotated by a single person, with the exception of virtus. To account for potential differences between individual annotators, we conducted a quantitative analysis of the annotated data. The analysis had two objectives: (1) to find out whether annotators have a personal style of annotation that would affect further analysis of the data; and (2) to look for features of the words themselves that could influence the annotators’ decisions.
The results of the first part of the analysis are presented in the form of spreadsheets with data aggregated by (a) annotator and (b) number of senses of the annotated word, stored in confidence analysis/spreadsheets.
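The two aggregations described above can be sketched as follows. This is an illustration only: the record layout, the confidence scale, and the helper name `mean_confidence_by` are hypothetical and are not taken from the repository's notebook or spreadsheets.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (annotator, target word, number of senses, confidence).
records = [
    ("A", "word_1", 2, 3),
    ("A", "word_2", 4, 2),
    ("B", "word_3", 2, 3),
    ("B", "word_4", 4, 1),
]


def mean_confidence_by(records, key_index):
    """Average the confidence column, grouped by the column at key_index."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[3])
    return {key: mean(values) for key, values in groups.items()}


# (a) aggregated by annotator, (b) aggregated by number of senses.
by_annotator = mean_confidence_by(records, 0)
by_sense_count = mean_confidence_by(records, 2)
```

Comparing the per-annotator means would hint at a personal annotation style, while the per-sense-count means would hint at an effect of the word itself.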
The folders heatmaps and words visualised contain visualisations of the annotation data at the level of individual words. More information on the specific visualisations can be found in the respective folders.
The commented code can be viewed in the Jupyter notebook annotation_analysis.ipynb.
Vagueness score
The folder vagueness contains the code that calculates the vagueness score of each word and produces the related plots, together with additional documentation.
Owner
- Name: The Alan Turing Institute
- Login: alan-turing-institute
- Kind: organization
- Email: info@turing.ac.uk
- Website: https://turing.ac.uk
- Repositories: 477
- Profile: https://github.com/alan-turing-institute
The UK's national institute for data science and artificial intelligence.
GitHub Events
Total
- Issues event: 1
Last Year
- Issues event: 1
Issues and Pull Requests
Last synced: 9 months ago
All Time
- Total issues: 1
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 0.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- mhauru (1)