https://github.com/camel-lab/conllx_evaluation

Evaluate accuracy of CoNLL-X annotations performed by annotators

Repository

Basic Info

Host: GitHub
Owner: CAMeL-Lab
License: mit
Language: Python
Default Branch: main
Size: 104 KB

Statistics

Stars: 0
Watchers: 2
Forks: 1
Open Issues: 0
Releases: 0

Created about 4 years ago · Last pushed 11 months ago

Camel-depeval

Compare two CoNLL-X files or directories, to obtain the tokenization F-score and POS tag accuracy, as well as the LAS, UAS, and label scores.

Because comparison usually occurs between a gold file and a parsed file, the two inputs are distinguished by the gold and parsed keywords. The inputs do not actually need to be gold and parsed files; any two CoNLL-X files or directories will do.

The tree alignment part of the code uses ced_word_alignment.

Note: the evaluator is also CoNLL-U compatible.
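
To make the attachment metrics concrete, here is a minimal sketch of how POS accuracy, UAS, label accuracy, and LAS can be computed for two already-aligned sentences. It is not the repository's implementation: `Token`, `read_sentence`, `attachment_scores`, and `make_conllx` are hypothetical names, and reading the POS tag from the fourth column (CPOSTAG) is an assumption. Only the standard 10-column, tab-separated CoNLL-X layout is relied on.

```python
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    form: str
    pos: str
    head: int
    deprel: str


def read_sentence(conllx_text: str) -> List[Token]:
    """Parse one sentence from tab-separated CoNLL-X (or CoNLL-U) lines."""
    tokens = []
    for line in conllx_text.strip().splitlines():
        cols = line.split("\t")
        # Skip comments, blank lines, and CoNLL-U range/empty-node IDs.
        if len(cols) < 8 or not cols[0].isdigit():
            continue
        # Columns: ID FORM LEMMA CPOSTAG POSTAG FEATS HEAD DEPREL ...
        tokens.append(Token(cols[1], cols[3], int(cols[6]), cols[7]))
    return tokens


def attachment_scores(gold: List[Token], parsed: List[Token]) -> Dict[str, float]:
    """Percentages over token pairs; assumes both sentences have equal length."""
    if len(gold) != len(parsed) or not gold:
        raise ValueError("sentences must be non-empty and of equal length")
    pairs = list(zip(gold, parsed))
    counts = {
        "pos": sum(g.pos == p.pos for g, p in pairs),
        "uas": sum(g.head == p.head for g, p in pairs),
        "label": sum(g.deprel == p.deprel for g, p in pairs),
        "las": sum(g.head == p.head and g.deprel == p.deprel for g, p in pairs),
    }
    return {name: round(100 * c / len(pairs), 3) for name, c in counts.items()}


def make_conllx(rows) -> str:
    """Build a CoNLL-X sentence from (form, pos, head, deprel) tuples."""
    return "\n".join(
        "\t".join([str(i), form, "_", pos, pos, "_", str(head), rel, "_", "_"])
        for i, (form, pos, head, rel) in enumerate(rows, start=1)
    )


gold_text = make_conllx([("She", "PRON", 2, "SBJ"), ("reads", "VERB", 0, "ROOT"),
                         ("long", "ADJ", 4, "MOD"), ("books", "NOUN", 2, "OBJ")])
# Token 3 has a wrong head; token 4 has a wrong label.
parsed_text = make_conllx([("She", "PRON", 2, "SBJ"), ("reads", "VERB", 0, "ROOT"),
                           ("long", "ADJ", 2, "MOD"), ("books", "NOUN", 2, "MOD")])

scores = attachment_scores(read_sentence(gold_text), read_sentence(parsed_text))
print(scores)
```

In this toy sentence, two of four tokens have both the correct head and the correct label, so LAS is lower than either UAS or label accuracy.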

Methodology

1. Two files or directories are passed to the evaluator. If two directories are passed, they must contain files with matching names.
2. The files are read, and the trees in each pair of files are compared.
3. The trees are aligned using ced_word_alignment.
   - This involves inserting null alignment tokens.
4. The evaluation scores are then calculated.
   - The tokenization F-score is calculated on all aligned tokens, while the remaining metrics are calculated after removing insertions (null alignment tokens added to the gold tree).
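
The last step can be illustrated with a small sketch. It assumes the alignment is available as a list of (gold, parsed) token pairs in which `None` stands for a null alignment token; this representation, the function names, and the precision/recall definitions are assumptions for illustration, and the repository's exact formula may differ.

```python
from typing import Dict, List, Optional, Tuple

# (gold_token, parsed_token); None marks a null alignment token.
AlignedPair = Tuple[Optional[str], Optional[str]]


def tokenization_scores(pairs: List[AlignedPair]) -> Dict[str, float]:
    """Precision, recall, and F-score over all aligned pairs, as percentages."""
    gold_count = sum(g is not None for g, _ in pairs)
    parsed_count = sum(p is not None for _, p in pairs)
    matched = sum(g is not None and g == p for g, p in pairs)
    precision = matched / parsed_count if parsed_count else 0.0
    recall = matched / gold_count if gold_count else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return {
        "tokenization_precision": round(100 * precision, 3),
        "tokenization_recall": round(100 * recall, 3),
        "tokenization_f_score": round(100 * f_score, 3),
    }


def drop_insertions(pairs: List[AlignedPair]) -> List[AlignedPair]:
    """Remove pairs whose gold side is a null alignment token."""
    return [(g, p) for g, p in pairs if g is not None]


aligned = [("A", "A"), ("B", "B"), (None, "x"), ("C", "C"), ("D", None)]
tok_scores = tokenization_scores(aligned)
remaining = drop_insertions(aligned)
print(tok_scores, len(remaining))
```

Here the parsed side contains one extra token (`x`) and misses one gold token (`D`), so three of four tokens match on each side. The extra token is dropped before any per-token metric would be computed.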

Assumptions

Since ced_word_alignment is used, the second and third assumptions are the same.
- No words are added to either the parsed or gold files.
- No changes to the word order.
- Text is in the same script and encoding.

Contents

- align_trees.py: aligns trees using the ced_word_alignment algorithm
- class_conllx: used to read CoNLL-X files
- classes: dataclasses used throughout the code
- conllx_counts: gets different statistics after comparing 2 CoNLL-X files
- conllx_scores: calculates scores given counts
- evaluate_conllx_driver: main script
- handle_args: simplifies use of the argparse library
- requirements.txt: dependencies needed to run the scripts
- ced_word_alignment/: the ced alignment library
- README.md: this document

Requirements

Python 3.8 and above.

Before using the evaluator, install the dependencies by running the following command:

```bash
pip install -r requirements.txt
```

Usage

```text
usage: evaluate_conllx_driver.py [-h] [-g] [-p] [-gd] [-pd]

This script takes 2 CoNLL-X files or 2 directories of CoNLL-X files and evaluates the scores.

required arguments:
  -g , --gold           the gold CoNLL-X file
  -p , --parsed         the parsed CoNLL-X file

or:
  -gd , --gold_dir      the gold directory containing CoNLL-X files
  -pd , --parsed_dir    the parsed directory containing CoNLL-X files
```
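
For readers who want to see how such an interface could be wired up, the following is a minimal argparse sketch that accepts the four options listed above and checks that either two files or two directories were given. It is not the repository's handle_args module; `build_parser` and `resolve_inputs` are hypothetical names, and the validation logic is an assumption.

```python
import argparse
from typing import Tuple


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Takes 2 CoNLL-X files or 2 directories of CoNLL-X files "
                    "and evaluates the scores."
    )
    parser.add_argument("-g", "--gold", metavar="", help="the gold CoNLL-X file")
    parser.add_argument("-p", "--parsed", metavar="", help="the parsed CoNLL-X file")
    parser.add_argument("-gd", "--gold_dir", metavar="",
                        help="the gold directory containing CoNLL-X files")
    parser.add_argument("-pd", "--parsed_dir", metavar="",
                        help="the parsed directory containing CoNLL-X files")
    return parser


def resolve_inputs(args: argparse.Namespace) -> Tuple[str, str, str]:
    """Return ("files" | "dirs", gold_path, parsed_path) or raise ValueError."""
    if args.gold and args.parsed:
        return ("files", args.gold, args.parsed)
    if args.gold_dir and args.parsed_dir:
        return ("dirs", args.gold_dir, args.parsed_dir)
    raise ValueError("pass either -g and -p, or -gd and -pd")


file_args = build_parser().parse_args(["-g", "gold.conllx", "-p", "parsed.conllx"])
dir_args = build_parser().parse_args(
    ["--gold_dir=data/samples_gold", "--parsed_dir=data/samples_parsed"]
)
print(resolve_inputs(file_args))
print(resolve_inputs(dir_args))
```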

Examples

The sentences used are taken from CamelTB1001introduction1.conllx and CamelTB1001night1_1.conllx (data can be obtained from The Camel Treebank).

Sample 1:

The tokenization is the same, so the F-score is 100%, and the insertion/deletion counts are both 0.

```text
python src/main.py -g data/samples_gold/sample_1.conllx -p data/samples_parsed/sample_1.conllx
```

| Metric | Score |
|-|-|
| tokenization_f_score | 100.0 |
| tokenization_precision | 100.0 |
| tokenization_recall | 100.0 |
| word_accuracy | 100.0 |
| pos | 81.579 |
| uas | 55.263 |
| label | 65.789 |
| las | 44.737 |
| pp_uas_score | 0 |
| pp_label_score | 0 |
| pp_las_score | 0 |

Sample 2:

```text
python src/main.py -g data/samples_gold/sample_2.conllx -p data/samples_parsed/sample_2.conllx
```

| Metric | Score |
|-|-|
| tokenization_f_score | 90.385 |
| tokenization_precision | 90.385 |
| tokenization_recall | 90.385 |
| word_accuracy | 97.222 |
| pos | 86.538 |
| uas | 65.385 |
| label | 75.0 |
| las | 57.692 |
| pp_uas_score | 0.0 |
| pp_label_score | 0.0 |
| pp_las_score | 0.0 |

Sample normalization:

With the arguments x (punctuation), n (numbers), and a (alef, yeh, and ta marbuta), the evaluation ignores tokenization differences caused by variant forms of these characters. When the arguments are used, the following pairs are treated as equal:

- 1 and ١
- , and ،
- ي and ى
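
One way to picture this normalization is as a set of character translation tables applied to each token before comparison. The sketch below is an assumption about how it could work, not the repository's code: the `normalize` function and its keyword names are hypothetical, and the exact character mappings (in particular mapping ta marbuta to heh and alef variants to bare alef) are common conventions chosen for illustration. Code points are written as escapes to avoid right-to-left display ambiguity.

```python
# n: Arabic-Indic digits U+0660..U+0669 -> ASCII digits.
NUMBERS = str.maketrans({chr(0x0660 + i): str(i) for i in range(10)})

# x: Arabic comma, semicolon, and question mark -> ASCII counterparts.
PUNCTUATION = str.maketrans({"\u060C": ",", "\u061B": ";", "\u061F": "?"})

# a: alef variants -> bare alef, alef maksura -> yeh, ta marbuta -> heh.
ALEF_YEH_TA = str.maketrans({
    "\u0623": "\u0627",  # alef with hamza above
    "\u0625": "\u0627",  # alef with hamza below
    "\u0622": "\u0627",  # alef with madda
    "\u0649": "\u064A",  # alef maksura -> yeh
    "\u0629": "\u0647",  # ta marbuta -> heh
})


def normalize(form: str, punct: bool = False, number: bool = False,
              alef_yeh_ta: bool = False) -> str:
    """Apply the selected normalizations to a single token form."""
    if punct:
        form = form.translate(PUNCTUATION)
    if number:
        form = form.translate(NUMBERS)
    if alef_yeh_ta:
        form = form.translate(ALEF_YEH_TA)
    return form


# The three pairs listed above compare equal once normalized.
digit_equal = normalize("\u0661", number=True) == "1"
comma_equal = normalize("\u060C", punct=True) == ","
yeh_equal = normalize("\u0649", alef_yeh_ta=True) == "\u064A"
print(digit_equal, comma_equal, yeh_equal)
```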

Without normalization

```text
python src/main.py -g data/samples_gold/sample_4_norm.conllx -p data/samples_parsed/sample_4_norm.conllx
```

| Metric | Score |
|-|-|
| tokenization_f_score | 80.0 |
| tokenization_precision | 80.0 |
| tokenization_recall | 80.0 |
| word_accuracy | 75.0 |
| pos | 80.0 |
| uas | 80.0 |
| label | 80.0 |
| las | 80.0 |
| pp_uas_score | 50.0 |
| pp_label_score | 50.0 |
| pp_las_score | 50.0 |

With normalization of punctuation and numbers (you can also add a to make the argument -xna)

```text
python src/main.py -g data/samples_gold/sample_4_norm.conllx -p data/samples_parsed/sample_4_norm.conllx -xn
```

| Metric | Score |
|-|-|
| tokenization_f_score | 100.0 |
| tokenization_precision | 100.0 |
| tokenization_recall | 100.0 |
| word_accuracy | 100.0 |
| pos | 100.0 |
| uas | 100.0 |
| label | 100.0 |
| las | 100.0 |
| pp_uas_score | 100.0 |
| pp_label_score | 100.0 |
| pp_las_score | 100.0 |

Run evaluation on a folder

```text
python src/main.py --gold_dir=data/samples_gold --parsed_dir=data/samples_parsed
```

| | tokenization_f_score | tokenization_precision | tokenization_recall | word_accuracy | pos | uas | label | las | pp_uas_score | pp_label_score | pp_las_score |
|-|-|-|-|-|-|-|-|-|-|-|-|
| sample_4_norm | 80.0 | 80.0 | 80.0 | 75.0 | 80.0 | 80.0 | 80.0 | 80.0 | 50.0 | 50.0 | 50.0 |
| sample_2 | 90.385 | 90.385 | 90.385 | 97.222 | 86.538 | 65.385 | 75.0 | 57.692 | 0.0 | 0.0 | 0.0 |
| sample_1 | 100.0 | 100.0 | 100.0 | 100.0 | 81.579 | 55.263 | 65.789 | 44.737 | 0.0 | 0.0 | 0.0 |
| sample_3 | 80.0 | 80.0 | 80.0 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
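
Directory mode depends on the two directories containing files with matching names. The sketch below shows one way to pair such files and fail early when a file has no counterpart. `pair_files` is a hypothetical helper, and the `.conllx` suffix filter is an assumption; the temporary directories only stand in for data/samples_gold and data/samples_parsed.

```python
import tempfile
from pathlib import Path
from typing import List, Tuple


def pair_files(gold_dir, parsed_dir, suffix: str = ".conllx") -> List[Tuple[Path, Path]]:
    """Pair files that share a name; raise if either side has an unmatched file."""
    gold = {p.name: p for p in Path(gold_dir).glob(f"*{suffix}")}
    parsed = {p.name: p for p in Path(parsed_dir).glob(f"*{suffix}")}
    unmatched = sorted(gold.keys() ^ parsed.keys())
    if unmatched:
        raise ValueError(f"files without a counterpart: {unmatched}")
    return [(gold[name], parsed[name]) for name in sorted(gold)]


# Build two throwaway directories with matching (empty) files.
with tempfile.TemporaryDirectory() as root:
    gold_dir = Path(root, "samples_gold")
    parsed_dir = Path(root, "samples_parsed")
    for directory in (gold_dir, parsed_dir):
        directory.mkdir()
        for name in ("sample_1.conllx", "sample_2.conllx"):
            (directory / name).write_text("")
    paired_names = [gold.name for gold, _ in pair_files(gold_dir, parsed_dir)]

print(paired_names)
```

Each resulting pair could then be evaluated on its own to produce one table row per file, as in the output above.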

License

conllx_evaluator is available under the MIT license. See the LICENSE file for more info.

Owner

Name: CAMeL Lab
Login: CAMeL-Lab
Kind: organization
Location: Abu Dhabi, UAE

Website: http://camel-lab.com
Repositories: 22
Profile: https://github.com/CAMeL-Lab

The Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi

GitHub Events

Total

Delete event: 1
Push event: 6
Pull request event: 2
Create event: 2

Last Year

Delete event: 1
Push event: 6
Pull request event: 2
Create event: 2

Dependencies

ced_word_alignment/requirements.txt pypi

docopt ==0.6.2
editdistance ==0.5.3

requirements.txt pypi

editdistance ==0.6.0
numpy ==1.22.4
pandas ==1.4.2
python-dateutil ==2.8.2
pytz ==2022.1
six ==1.16.0
