https://github.com/camel-lab/conllx_evaluation

Evaluate accuracy of CoNLL-X annotations performed by annotators

Repository

Basic Info
  • Host: GitHub
  • Owner: CAMeL-Lab
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 104 KB
Statistics
  • Stars: 0
  • Watchers: 2
  • Forks: 1
  • Open Issues: 0
  • Releases: 0
Created about 4 years ago · Last pushed 11 months ago
README.md

Camel-depeval

Compare two CoNLL-X files or directories, to obtain the tokenization F-score and POS tag accuracy, as well as the LAS, UAS, and label scores.

Since comparison usually occurs between gold and parsed files, the two files/directories will be differentiated using gold and parsed keywords. In other words, you do not need to have gold and parsed files to compare; any two will do.

The tree alignment part of the code uses ced_word_alignment.

Note: the evaluator is also CoNLL-U compatible.
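For orientation, here is a minimal sketch of reading the columns the evaluation relies on. It is not the repository's class_conllx module. The `ConllxToken` class and `read_conllx` function are hypothetical names, and taking the POS tag from the fourth column is an assumption. The ten tab-separated columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL) follow the CoNLL-X format. Comment lines and non-integer IDs are skipped, so simple CoNLL-U input also passes through.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConllxToken:
    """Hypothetical container for the columns used in evaluation."""
    id: int
    form: str
    pos: str
    head: int
    deprel: str


def read_conllx(text: str) -> List[List[ConllxToken]]:
    """Split CoNLL-X text into sentences (lists of tokens)."""
    sentences: List[List[ConllxToken]] = []
    current: List[ConllxToken] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # CoNLL-U style comment line
        cols = line.split("\t")
        if len(cols) < 8 or not cols[0].isdigit():
            continue  # skips CoNLL-U ranges such as 1-2 and empty nodes such as 1.1
        current.append(
            ConllxToken(
                id=int(cols[0]),
                form=cols[1],
                pos=cols[3],  # assumption: CPOSTAG (UPOS in CoNLL-U)
                head=int(cols[6]),
                deprel=cols[7],
            )
        )
    if current:
        sentences.append(current)
    return sentences
```

Calling `read_conllx(open(path, encoding="utf-8").read())` would return one list of tokens per tree.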

Methodology

  1. Two files or directories are passed to the evaluator. If two directories are passed, the directories must have matching file names.
  2. The files are read, and the trees in each pair of files are compared.
  3. The trees are aligned using ced_word_alignment
    • this involves inserting null alignment tokens
  4. The evaluation scores are then calculated
    • the tokenization F-score is calculated on all aligned tokens, while the remaining metrics are calculated after removing insertions (null alignment tokens added to the gold tree)
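The scoring step can be sketched as follows. This is an illustration under stated assumptions, not the repository's conllx_counts or conllx_scores code. `Token`, `tokenization_scores`, and `attachment_scores` are hypothetical names. A null alignment token is represented as `None`, a token counts as correctly tokenized when the aligned forms are identical, and head indices are assumed to be already mapped into a shared index space.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Token:
    form: str
    pos: str
    head: int
    label: str


# (gold, parsed); None stands for a null alignment token.
Pair = Tuple[Optional[Token], Optional[Token]]


def tokenization_scores(pairs: List[Pair]) -> Dict[str, float]:
    """Precision, recall, and F-score over all aligned tokens."""
    gold_total = sum(g is not None for g, _ in pairs)
    parsed_total = sum(p is not None for _, p in pairs)
    matches = sum(
        g is not None and p is not None and g.form == p.form for g, p in pairs
    )
    precision = matches / parsed_total if parsed_total else 0.0
    recall = matches / gold_total if gold_total else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": round(100 * precision, 3),
        "recall": round(100 * recall, 3),
        "f_score": round(100 * f_score, 3),
    }


def attachment_scores(pairs: List[Pair]) -> Dict[str, float]:
    """POS, UAS, label, and LAS after dropping insertions (gold side is None)."""
    kept = [(g, p) for g, p in pairs if g is not None]
    total = len(kept)

    def pct(count: int) -> float:
        return round(100 * count / total, 3) if total else 0.0

    # Assumption: a gold token with no parsed counterpart counts as wrong.
    both = [(g, p) for g, p in kept if p is not None]
    return {
        "pos": pct(sum(g.pos == p.pos for g, p in both)),
        "uas": pct(sum(g.head == p.head for g, p in both)),
        "label": pct(sum(g.label == p.label for g, p in both)),
        "las": pct(sum(g.head == p.head and g.label == p.label for g, p in both)),
    }
```

Keeping the two functions separate mirrors the split described in step 4: tokenization is scored on every aligned pair, while the other metrics only see pairs that have a gold token.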

Assumptions

Since ced_word_alignment is used, the second and third assumptions are the same.

  • No words are added to either the parsed or gold files.
  • No changes to the word order.
  • Text is in the same script and encoding.


Contents

  • align_trees.py aligns trees using the ced_word_alignment algorithm
  • class_conllx used to read CoNLL-X files
  • classes dataclasses used throughout the code
  • conllx_counts gets different statistics after comparing 2 CoNLL-X files
  • conllx_scores calculates scores given counts
  • evaluate_conllx_driver main script
  • handle_args simplifies use of the argparse library
  • requirements.txt dependencies needed to run the scripts.
  • ced_word_alignment/ the ced alignment library
  • README.md this document.

Requirements

  • Python 3.8 and above.

To use, you need to first install the necessary dependencies by running the following command:

```bash
pip install -r requirements.txt
```


Usage

```text
usage: evaluate_conllx_driver.py [-h] [-g] [-p] [-gd] [-pd]

This script takes 2 CoNLL-X files or 2 directories of CoNLL-X files and evaluates the scores.

required arguments:
  -g , --gold          the gold CoNLL-X file
  -p , --parsed        the parsed CoNLL-X file

or:
  -gd , --gold_dir     the gold directory containing CoNLL-X files
  -pd , --parsed_dir   the parsed directory containing CoNLL-X files
```
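As an illustration of how such an interface can be declared with argparse, here is a minimal sketch. It is not the repository's handle_args module. The option names come from the usage text and the examples in this README, while `build_parser`, `input_mode`, and the `norm_*` destination names are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        description="Takes 2 CoNLL-X files or 2 directories of CoNLL-X "
        "files and evaluates the scores."
    )
    parser.add_argument("-g", "--gold", help="the gold CoNLL-X file")
    parser.add_argument("-p", "--parsed", help="the parsed CoNLL-X file")
    parser.add_argument("-gd", "--gold_dir", help="the gold directory")
    parser.add_argument("-pd", "--parsed_dir", help="the parsed directory")
    # Single-letter switches can be combined on the command line, e.g. -xn.
    parser.add_argument("-x", dest="norm_punct", action="store_true",
                        help="normalize punctuation")
    parser.add_argument("-n", dest="norm_numbers", action="store_true",
                        help="normalize numbers")
    parser.add_argument("-a", dest="norm_letters", action="store_true",
                        help="normalize alef, yeh, and ta marbuta")
    return parser


def input_mode(args: argparse.Namespace) -> str:
    """Return 'files' or 'dirs'; reject an incomplete pair of inputs."""
    if args.gold and args.parsed:
        return "files"
    if args.gold_dir and args.parsed_dir:
        return "dirs"
    raise ValueError("pass either -g and -p, or -gd and -pd")
```

A caller would run `args = build_parser().parse_args()` and branch on `input_mode(args)`.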


Examples

The sentences used are taken from CamelTB_1001_introduction_1.conllx and CamelTB_1001_night_1_1.conllx (the data can be obtained from The Camel Treebank).

Sample 1:

The tokenization is the same, so the F-score is 100% and the insertion and deletion counts are both 0.

```text
python src/main.py -g data/samples_gold/sample_1.conllx -p data/samples_parsed/sample_1.conllx
```

| metric | score |
|-|-|
| tokenization_f_score | 100.0 |
| tokenization_precision | 100.0 |
| tokenization_recall | 100.0 |
| word_accuracy | 100.0 |
| pos | 81.579 |
| uas | 55.263 |
| label | 65.789 |
| las | 44.737 |
| pp_uas_score | 0 |
| pp_label_score | 0 |
| pp_las_score | 0 |

Sample 2:

```text
python src/main.py -g data/samples_gold/sample_2.conllx -p data/samples_parsed/sample_2.conllx
```

| metric | score |
|-|-|
| tokenization_f_score | 90.385 |
| tokenization_precision | 90.385 |
| tokenization_recall | 90.385 |
| word_accuracy | 97.222 |
| pos | 86.538 |
| uas | 65.385 |
| label | 75.0 |
| las | 57.692 |
| pp_uas_score | 0.0 |
| pp_label_score | 0.0 |
| pp_las_score | 0.0 |

Sample normalization:

With the arguments x (punctuation), n (numbers), and a (alef, yeh, and ta marbuta), the evaluation ignores tokenization differences that come only from these character variants. When the arguments are used, the following pairs are treated as equal:

  • 1 and ١
  • , and ،
  • ي and ى
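One way such normalization could work is sketched below. The character mappings are assumptions chosen to cover the pairs listed above, not the repository's actual tables. The direction of each mapping, the extra punctuation marks, the alef variants, and the ta marbuta target are illustrative, and `normalize` is a hypothetical name.

```python
from typing import Dict

# Arabic-Indic digits U+0660..U+0669 mapped to ASCII digits.
NUMBER_MAP: Dict[int, str] = {0x0660 + i: str(i) for i in range(10)}

# Arabic comma, semicolon, and question mark mapped to ASCII counterparts.
PUNCT_MAP: Dict[int, str] = {0x060C: ",", 0x061B: ";", 0x061F: "?"}

# Alef variants to bare alef, alef maksura to yeh, ta marbuta to heh.
LETTER_MAP: Dict[int, str] = {
    0x0622: "\u0627",  # alef with madda above
    0x0623: "\u0627",  # alef with hamza above
    0x0625: "\u0627",  # alef with hamza below
    0x0649: "\u064A",  # alef maksura to yeh
    0x0629: "\u0647",  # ta marbuta to heh (assumed target)
}


def normalize(form: str, punctuation: bool = False,
              numbers: bool = False, letters: bool = False) -> str:
    """Apply only the requested normalizations to a token form."""
    table: Dict[int, str] = {}
    if punctuation:
        table.update(PUNCT_MAP)
    if numbers:
        table.update(NUMBER_MAP)
    if letters:
        table.update(LETTER_MAP)
    return form.translate(table)
```

Normalizing both the gold and the parsed form before comparing them makes a pair such as 1 and ١ count as a match when the matching switch is on.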

Without normalization

```text
python src/main.py -g data/samples_gold/sample_4_norm.conllx -p data/samples_parsed/sample_4_norm.conllx
```

| metric | score |
|-|-|
| tokenization_f_score | 80.0 |
| tokenization_precision | 80.0 |
| tokenization_recall | 80.0 |
| word_accuracy | 75.0 |
| pos | 80.0 |
| uas | 80.0 |
| label | 80.0 |
| las | 80.0 |
| pp_uas_score | 50.0 |
| pp_label_score | 50.0 |
| pp_las_score | 50.0 |

With normalization of punctuation and numbers (you can also add a to make the argument -xna)

```text
python src/main.py -g data/samples_gold/sample_4_norm.conllx -p data/samples_parsed/sample_4_norm.conllx -xn
```

| metric | score |
|-|-|
| tokenization_f_score | 100.0 |
| tokenization_precision | 100.0 |
| tokenization_recall | 100.0 |
| word_accuracy | 100.0 |
| pos | 100.0 |
| uas | 100.0 |
| label | 100.0 |
| las | 100.0 |
| pp_uas_score | 100.0 |
| pp_label_score | 100.0 |
| pp_las_score | 100.0 |

Run evaluation on a folder

```text
python src/main.py --gold_dir=data/samples_gold --parsed_dir=data/samples_parsed
```

| file | tokenization_f_score | tokenization_precision | tokenization_recall | word_accuracy | pos | uas | label | las | pp_uas_score | pp_label_score | pp_las_score |
|-|-|-|-|-|-|-|-|-|-|-|-|
| sample_4_norm | 80.0 | 80.0 | 80.0 | 75.0 | 80.0 | 80.0 | 80.0 | 80.0 | 50.0 | 50.0 | 50.0 |
| sample_2 | 90.385 | 90.385 | 90.385 | 97.222 | 86.538 | 65.385 | 75.0 | 57.692 | 0.0 | 0.0 | 0.0 |
| sample_1 | 100.0 | 100.0 | 100.0 | 100.0 | 81.579 | 55.263 | 65.789 | 44.737 | 0.0 | 0.0 | 0.0 |
| sample_3 | 80.0 | 80.0 | 80.0 | 75.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 | 100.0 |
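Directory mode depends on the two directories having matching file names. A minimal sketch of that pairing step follows. `pair_files` is a hypothetical helper, and raising an error for files without a counterpart is an assumption about how mismatches might be handled, not documented behavior of the evaluator.

```python
from pathlib import Path
from typing import List, Tuple


def pair_files(gold_dir: str, parsed_dir: str) -> List[Tuple[Path, Path]]:
    """Pair files from two directories by identical file name."""
    gold = {p.name: p for p in Path(gold_dir).iterdir() if p.is_file()}
    parsed = {p.name: p for p in Path(parsed_dir).iterdir() if p.is_file()}
    # Names present in only one of the two directories.
    unmatched = sorted(set(gold) ^ set(parsed))
    if unmatched:
        raise ValueError(f"files without a counterpart: {unmatched}")
    return [(gold[name], parsed[name]) for name in sorted(gold)]
```

Each returned pair can then be evaluated like a single gold/parsed file pair, giving one row per file as in the table above.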


License

conllx_evaluator is available under the MIT license. See the LICENSE file for more info.

Owner

  • Name: CAMeL Lab
  • Login: CAMeL-Lab
  • Kind: organization
  • Location: Abu Dhabi, UAE

The Computational Approaches to Modeling Language (CAMeL) Lab at New York University Abu Dhabi

GitHub Events

Total
  • Delete event: 1
  • Push event: 6
  • Pull request event: 2
  • Create event: 2
Last Year
  • Delete event: 1
  • Push event: 6
  • Pull request event: 2
  • Create event: 2

Dependencies

ced_word_alignment/requirements.txt pypi
  • docopt ==0.6.2
  • editdistance ==0.5.3
requirements.txt pypi
  • editdistance ==0.6.0
  • numpy ==1.22.4
  • pandas ==1.4.2
  • python-dateutil ==2.8.2
  • pytz ==2022.1
  • six ==1.16.0