https://github.com/alan-turing-institute/latin_annotation

Code for analysing the semantic annotation of Latin data from SemEval 2020 task 1

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found codemeta.json file)
○ .zenodo.json file
○ DOI references
○ Academic publication links
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (6.3%) to scientific vocabulary

Last synced: 9 months ago

Repository

Basic Info

Host: GitHub
Owner: alan-turing-institute
Language: Jupyter Notebook
Default Branch: master
Size: 49.7 MB

Statistics

Stars: 0
Watchers: 4
Forks: 0
Open Issues: 0
Releases: 1

Created almost 6 years ago · Last pushed about 4 years ago

Metadata Files

Readme

Latin annotation

Code for analysing the semantic annotation of Latin data from SemEval 2020 task 1

interannotator_latinise.py (developed by Barbara McGillivray): calculates the inter-annotator agreement among the annotators who annotated the word virtus.

diachronic_analysis.py (developed by Barbara McGillivray): analyses the association between new senses of target words and CE texts.

annotation_analysis.py (developed by Daria Kondakova): analyses factors influencing the annotation confidence and the annotation results for individual words.
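To illustrate the kind of calculation interannotator_latinise.py is described as performing, here is a minimal sketch of Cohen's kappa for two annotators. The README does not say which agreement measure the script uses, so the choice of kappa, the function name `cohen_kappa`, and the sense labels below are all assumptions made for illustration, not the repository's own code or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items.

    Assumption: categorical sense labels, one per item, in the same order.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty set of items")

    n = len(labels_a)
    # Observed agreement: share of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's label proportions, summed.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical sense labels for six occurrences of one word (not real data).
annotator_1 = ["s1", "s1", "s2", "s2", "s1", "s3"]
annotator_2 = ["s1", "s2", "s2", "s2", "s1", "s3"]

kappa = cohen_kappa(annotator_1, annotator_2)
print(round(kappa, 3))  # 0.739
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when one sense dominates the sample. With more than two annotators, or with graded rather than categorical judgements, a different measure would be needed.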

Confidence analysis

The setup of the annotation task meant that each word, apart from uirtus, was annotated by only one person. To account for potential differences between individual annotators, we conducted a quantitative analysis of the annotated data. The analysis had two objectives: (1) to find out whether annotators have a personal annotation style that would affect further analysis of the data; and (2) to look for features of the words themselves that could influence the annotators’ decisions.

The results of the first part of the analysis are presented in the form of spreadsheets with data aggregated by (a) annotator and (b) number of senses of the annotated word, stored in confidence analysis/spreadsheets.
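The two aggregations described above can be sketched as follows. This is only an illustration under assumed names: the field names `annotator`, `n_senses` and `confidence`, the numeric confidence scale, the helper `mean_confidence_by`, and the sample records are hypothetical and are not taken from the repository's spreadsheets.

```python
from collections import defaultdict
from statistics import mean


def mean_confidence_by(records, key):
    """Group annotation records by `key` and average their confidence values."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["confidence"])
    return {group: mean(values) for group, values in groups.items()}


# Hypothetical annotation records (not real data from the project).
records = [
    {"annotator": "A1", "word": "word_a", "n_senses": 3, "confidence": 4},
    {"annotator": "A1", "word": "word_a", "n_senses": 3, "confidence": 2},
    {"annotator": "A2", "word": "word_b", "n_senses": 2, "confidence": 3},
    {"annotator": "A2", "word": "word_b", "n_senses": 2, "confidence": 5},
    {"annotator": "A2", "word": "word_a", "n_senses": 3, "confidence": 3},
]

# (a) aggregated by annotator, (b) aggregated by number of senses of the word.
by_annotator = mean_confidence_by(records, "annotator")
by_senses = mean_confidence_by(records, "n_senses")

print(by_annotator)
print(by_senses)
```

Comparing the per-annotator means against the per-sense-count means gives a first indication of whether differences in confidence track the person annotating or the word being annotated.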

The folders heatmaps and words visualised contain visualisations of the annotation data at the level of individual words. More information on the specific visualisations can be found in the respective folders.

The commented code can be viewed in the Jupyter notebook annotation_analysis.ipynb.

Vagueness score

The folder vagueness contains the code for calculating the vagueness score of each word, together with the related plots and additional documentation.

Owner

Name: The Alan Turing Institute
Login: alan-turing-institute
Kind: organization
Email: info@turing.ac.uk

Website: https://turing.ac.uk
Repositories: 477
Profile: https://github.com/alan-turing-institute

The UK's national institute for data science and artificial intelligence.

GitHub Events

Total

Issues event: 1

Last Year

Issues event: 1

Issues and Pull Requests

All Time

Total issues: 1
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 1
Total pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
