annie
ANNIE - Annotation Interface for Named-entity & Information Extraction
Repository
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
ANNIE - Annotation Interface for Named-entity & Information Extraction
ANNIE is a lightweight, Python-based desktop application designed for annotating text files with named entities and directed relations. Built with Tkinter, it offers a user-friendly interface for researchers, linguists, and NLP practitioners to create high-quality annotated datasets for named entity recognition (NER) and relation extraction tasks.
Features
- Entity Annotation: Tag text spans with customizable labels (e.g., Person, Organization, Location).
- Relation Annotation: Define directed relations between entities (e.g., "spouse_of", "works_at").
- Batch Processing: Load and annotate multiple `.txt` files from a directory.
- Entity Propagation: Automatically annotate matching text spans across all files, with optional dictionary-based propagation.
- AI Pre-annotation: Use a pre-trained NER model (requires `transformers` and `torch`) for automated entity tagging.
- Entity Merging/Demerging: Merge multiple mentions of the same entity or separate them via right-click.
- Relation Flipping: Reverse the direction of relations with a single click.
- Multi-label & Overlapping Annotations: Optionally allow multiple tags or overlapping annotations.
- Session Management: Save and load annotation sessions to resume work.
- Export/Import: Support for CoNLL-2003 and spaCy JSONL formats for training data.
- Color-coded Visualization: Highlight entities with tag-specific colors; propagated entities are underlined.
- Read-only Text Area: Prevents accidental modifications.
- Hotkey Support: Use keys 0-9 for quick tag selection/relabeling and `a` for AI pre-annotation.
- Flexible Schema: Customize entity tags and relation types, with save/load functionality.
Getting Started
Prerequisites
- Python: 3.6 or higher.
- Required Libraries: `tkinter` (included with Python), `json`, `os`, `shutil`, `pathlib`, `uuid`, `itertools`, `re`, `time`, `threading`.
- Optional Libraries: `transformers` and `torch` for AI pre-annotation (`pip install transformers torch`).
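Before relying on AI pre-annotation, it can help to confirm that the optional packages are importable. The following is a minimal sketch, not part of ANNIE itself; the function name `missing_optional_packages` is hypothetical.

```python
import importlib.util

# Optional packages named in the prerequisites; only AI pre-annotation needs them.
OPTIONAL_PACKAGES = ("transformers", "torch")


def missing_optional_packages(packages=OPTIONAL_PACKAGES):
    """Return the names of packages that cannot be found on the import path."""
    return [name for name in packages if importlib.util.find_spec(name) is None]


missing = missing_optional_packages()
if missing:
    print("AI pre-annotation unavailable; missing:", ", ".join(missing))
else:
    print("AI pre-annotation dependencies found.")
```

`importlib.util.find_spec` only locates a package without importing it, so the check stays fast even when `torch` is installed.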
Installation
- Clone this repository or download the source code.
- Navigate to the project directory.
- Run the application:
bash
python annie.py
Basic Usage
1. Loading Files
- Go to File → Open Directory and select a folder with `.txt` files.
- Files load in alphabetical order; the first file displays automatically.
- Use Previous/Next buttons or click a file in the listbox to navigate.
- Add files to the session via File → Add File(s) to Session....
2. Entity Annotation
- Drag to select text or double-click a word in the text area.
- Choose a tag from the Entity Tag dropdown or press 0-9 to select a tag.
- Click Annotate Sel to tag the selection.
- Entities appear in the Entities list and are highlighted with tag-specific colors.
- Single-click an annotated span to remove it, or select entities and click Remove Sel.
- Enable Extend to word to snap selections to word boundaries.
- Relabel entities by selecting them and pressing 0-9.
3. Relation Annotation
- Select exactly two entities in the Entities list (first = head, second = tail).
- Choose a relation type from the Relation Type dropdown.
- Click Add Relation (Head→Tail) to create the relation.
- Relations appear in the Relations list.
- Select a relation and click Flip H/T to reverse it or Remove Relation to delete it.
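The head/tail logic above can be sketched in a few lines. This is an illustration under stated assumptions, not ANNIE's actual code: the field names follow the Data Format section, while the function names and the use of `uuid.uuid4().hex` for IDs are hypothetical.

```python
import uuid


def make_relation(head, tail, relation_type):
    """Build a directed relation from the first-selected (head) to the second (tail) entity."""
    return {
        "id": uuid.uuid4().hex,  # assumed ID scheme; the README only shows truncated IDs
        "type": relation_type,
        "head_id": head["id"],
        "tail_id": tail["id"],
    }


def flip_relation(relation):
    """Return a copy with head and tail swapped; ID and type stay unchanged."""
    flipped = dict(relation)
    flipped["head_id"], flipped["tail_id"] = relation["tail_id"], relation["head_id"]
    return flipped


person = {"id": "e1", "text": "John Smith", "tag": "Person"}
employer = {"id": "e2", "text": "University of Graz", "tag": "Organization"}
relation = make_relation(person, employer, "works_at")
print(flip_relation(relation)["head_id"])  # prints e2
```

Returning a copy from `flip_relation` rather than mutating in place keeps the original available, for example for an undo step.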
4. Saving Annotations
- Go to File → Save Annotations to export annotations as a JSON file.
- Use File → Save Session to save the entire session (files, annotations, tags).
- Load sessions via File → Load Session....
Advanced Features
Managing Entity Tags and Relation Types
- Use Settings → Manage Entity Tags... or Manage Relation Types... to add, remove, or edit tags/types.
- Save/load schemas via Settings → Save Tag/Relation Schema... or Load Tag/Relation Schema....
Entity Propagation
- Click Propagate Entities to copy entities from the current file to all files.
- Use Settings → Load Dictionary & Propagate Entities... to annotate from a dictionary file (format: `text tag`, e.g., `John Person`).
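A rough sketch of how dictionary-based propagation could work is shown below. It assumes the tag is the last whitespace-separated token on each line (so multi-word texts such as `New York Location` parse correctly) and that matching is case-sensitive on whole words. ANNIE's actual matching rules may differ, and both function names are hypothetical.

```python
import re


def parse_dictionary(lines):
    """Parse 'text tag' lines; the tag is assumed to be the last token."""
    entries = []
    for raw in lines:
        parts = raw.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # skip blank lines and lines without both text and tag
        entries.append((parts[0], parts[1]))
    return entries


def find_matches(document, entries):
    """Return sorted (start, end, text, tag) tuples for whole-word matches."""
    matches = []
    for text, tag in entries:
        # Lookarounds prevent matching inside longer words (e.g. "John" in "Johnson").
        pattern = r"(?<!\w)" + re.escape(text) + r"(?!\w)"
        for m in re.finditer(pattern, document):
            matches.append((m.start(), m.end(), text, tag))
    return sorted(matches)


entries = parse_dictionary(["John Person", "New York Location", ""])
print(find_matches("John moved to New York.", entries))
# [(0, 4, 'John', 'Person'), (14, 22, 'New York', 'Location')]
```

The offsets here are flat character positions; mapping them to the line/character fields used in the saved annotations would be a separate step.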
Entity Merging/Demerging
- Select multiple entities and click Merge Sel. to assign them the same ID.
- Right-click an annotated span and select Demerge This Instance to assign a new ID.
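Merging and demerging come down to ID bookkeeping, which the following sketch illustrates. It assumes the first selected entity's ID survives a merge and that a demerged mention receives a fresh `uuid`-based ID; the README does not specify either detail, and the function names are hypothetical.

```python
import uuid


def merge_entities(entities, indices):
    """Assign the ID of the first selected entity to every selected entity."""
    if not indices:
        return {}
    shared_id = entities[indices[0]]["id"]
    remap = {}
    for i in indices:
        old_id = entities[i]["id"]
        if old_id != shared_id:
            remap[old_id] = shared_id
            entities[i]["id"] = shared_id
    return remap  # old ID -> shared ID, e.g. for updating relations afterwards


def demerge_entity(entities, index):
    """Give one mention a fresh ID so it leaves its merged group."""
    entities[index]["id"] = uuid.uuid4().hex
    return entities[index]["id"]


mentions = [
    {"id": "a", "text": "John Smith"},
    {"id": "b", "text": "J. Smith"},
    {"id": "c", "text": "Graz"},
]
print(merge_entities(mentions, [0, 1]))  # prints {'b': 'a'}
```

The returned mapping matters because relations refer to entities by ID: any relation that pointed at a replaced ID would need its `head_id` or `tail_id` updated.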
AI Pre-annotation
- Press `a` or go to Settings → Pre-annotate with AI... to tag entities using a pre-trained NER model.
- Requires `transformers` and `torch`. Annotations are marked as propagated (underlined).
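For orientation, here is a sketch of how a Hugging Face token-classification pipeline could produce such pre-annotations. The model name is the one listed in the version history; the score threshold, the function names, and the flat `start`/`end` offsets in the output are assumptions for illustration, not ANNIE's internal representation.

```python
def to_propagated_entities(ner_results, min_score=0.5):
    """Map aggregated pipeline results to entity dicts flagged as propagated."""
    entities = []
    for item in ner_results:
        if item["score"] < min_score:
            continue  # drop low-confidence predictions (threshold is an assumption)
        entities.append({
            "start": item["start"],
            "end": item["end"],
            "text": item["word"],
            "tag": item["entity_group"],  # model labels such as PER, ORG, LOC
            "propagated": True,
        })
    return entities


def pre_annotate(text, model_name="Babelscape/wikineural-multilingual-ner"):
    """Run the NER model on text; the model is downloaded on first use."""
    from transformers import pipeline  # lazy import: optional dependency

    ner = pipeline("token-classification", model=model_name,
                   aggregation_strategy="simple")
    return to_propagated_entities(ner(text))
```

With `aggregation_strategy="simple"`, the pipeline groups word pieces into whole entities and reports `entity_group`, `score`, `word`, `start`, and `end` for each. The model's labels would still need to be mapped onto the tags defined in the annotation schema.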
Import/Export
- Export annotations via File → Export for Training... in CoNLL or JSONL format.
- Import annotations via File → Import Annotations... from CoNLL or JSONL files, creating new `.txt` files.
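Training formats usually expect flat character offsets, whereas the saved annotations use line and character positions. The sketch below shows one possible conversion. It assumes 1-based lines, 0-based characters, and exclusive end positions (the Tkinter text-index convention), and it writes a generic `{"text": ..., "entities": [[start, end, label]]}` record; the exact layout ANNIE exports is not documented here, so treat both the layout and the function names as hypothetical.

```python
import json


def to_offset(text, line, char):
    """Convert a 1-based line and 0-based column to a flat character offset."""
    lines = text.split("\n")
    # Each preceding line contributes its length plus one newline character.
    return sum(len(l) + 1 for l in lines[: line - 1]) + char


def to_jsonl_record(text, entities):
    """Serialize one document and its entity spans as a single JSONL line."""
    spans = []
    for e in entities:
        start = to_offset(text, e["start_line"], e["start_char"])
        end = to_offset(text, e["end_line"], e["end_char"])
        spans.append([start, end, e["tag"]])
    return json.dumps({"text": text, "entities": sorted(spans)}, ensure_ascii=False)


doc = "Yesterday John Smith arrived.\nHe works in Graz."
ents = [
    {"start_line": 1, "start_char": 10, "end_line": 1, "end_char": 20, "tag": "Person"},
    {"start_line": 2, "start_char": 12, "end_line": 2, "end_char": 16, "tag": "Location"},
]
print(to_jsonl_record(doc, ents))
```

A quick sanity check is to slice the text with the computed offsets and compare the result with each entity's stored `text` field.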
Multi-label Annotations
- Enable Settings → Allow Multi-label & Overlapping Annotations to permit overlapping tags.
Data Format
Annotations are stored in JSON format:
json
{
"file1.txt": {
"entities": [
{
"id": "a1b2c3...",
"start_line": 1,
"start_char": 10,
"end_line": 1,
"end_char": 20,
"text": "John Smith",
"tag": "Person",
"propagated": false
}
],
"relations": [
{
"id": "d4e5f6...",
"type": "works_at",
"head_id": "a1b2c3...",
"tail_id": "g7h8i9..."
}
]
}
}
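Because relations refer to entities by ID, reading them back requires a lookup step. The following sketch resolves each relation into a readable triple. The sample data, including the IDs `e1` and `e2` and the second entity, is invented for illustration, and the function name is hypothetical.

```python
import json


def relation_triples(annotations):
    """Return (file, head text, relation type, tail text) for resolvable relations."""
    triples = []
    for filename, data in annotations.items():
        # Merged mentions share an ID; keep the first text seen for each ID.
        text_by_id = {}
        for entity in data.get("entities", []):
            text_by_id.setdefault(entity["id"], entity["text"])
        for rel in data.get("relations", []):
            head = text_by_id.get(rel["head_id"])
            tail = text_by_id.get(rel["tail_id"])
            if head is not None and tail is not None:
                triples.append((filename, head, rel["type"], tail))
    return triples


sample = json.loads("""
{
  "file1.txt": {
    "entities": [
      {"id": "e1", "text": "John Smith", "tag": "Person"},
      {"id": "e2", "text": "University of Graz", "tag": "Organization"}
    ],
    "relations": [
      {"id": "r1", "type": "works_at", "head_id": "e1", "tail_id": "e2"}
    ]
  }
}
""")
print(relation_triples(sample))
# [('file1.txt', 'John Smith', 'works_at', 'University of Graz')]
```

Relations whose head or tail ID cannot be found are skipped rather than raising, which makes the function safe to run on partially edited files.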
Session files include additional metadata:
json
{
"version": "1.12",
"files_list": ["file1.txt", "file2.txt"],
"current_file_index": 0,
"entity_tags": ["Person", "Organization", ...],
"relation_types": ["spouse_of", "works_at", ...],
"tag_colors": {"Person": "#ffcccc", ...},
"annotations": {...},
"extend_to_word": true,
"allow_multilabel_overlap": true
}
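A session file can be inspected outside the application, for example to see which listed files are no longer on disk. This is a minimal sketch assuming `files_list` holds paths that are either absolute or relative to a known base directory; the README does not say which, and the function name and the file name `session.json` are hypothetical.

```python
import json
from pathlib import Path


def missing_session_files(session, base_dir="."):
    """Return entries of files_list that do not exist as files under base_dir."""
    base = Path(base_dir)
    # Joining with an absolute path yields that absolute path, so both cases work.
    return [name for name in session.get("files_list", [])
            if not (base / name).is_file()]


def load_session(path):
    """Read a session file saved as JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A typical call would be `missing_session_files(load_session("session.json"), base_dir="corpus")`, which returns an empty list when every file is present.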
Tips & Tricks
- Hotkeys: Use 0-9 to select/relabel tags, `a` for AI pre-annotation, and `Delete` to remove entities/relations.
- Navigation: Click column headers to sort Entities/Relations lists; type a letter to jump to matching items.
- Workflow: Annotate entities first, then relations; propagate entities early to save time.
- Dictionary Format: Use one entity per line (e.g., `New York Location`).
- Double-click: Selects a word for quick annotation.
- Read-only Text: Ensures no accidental edits; use mouse or hotkeys for actions.
Troubleshooting
- AI Pre-annotation Fails: Install `transformers` and `torch`; ensure a file is loaded.
- Missing Files: Session loading warns about missing files; continue with available ones.
- Overlap Issues: Enable multi-label support in Settings for overlapping annotations.
- Highlighting Issues: Switch files to refresh the display.
- Export Errors: Check write permissions and use `.conll` or `.jsonl` extensions.
Version History
- 1.12 (2025):
  - Added AI pre-annotation with `Babelscape/wikineural-multilingual-ner`.
  - Implemented multi-label and overlapping annotation support.
  - Added demerge functionality via right-click.
  - Made text area read-only to prevent accidental edits.
  - Improved propagation with whitespace handling and underlining for propagated entities.
  - Enhanced double-click/highlight annotation and single-click removal.
  - Added import/export for CoNLL and spaCy JSONL formats.
- 0.75: Double-click and highlight annotation, immutable text area.
- 0.70: Propagated entities flagged and underlined.
- 0.65: Entity search and sorting.
- 0.60: Session save/load for continuous work.
Cite
APA Style
Kovács, T. (2025). ANNIE: Annotation Interface for Named-entity & Information Extraction (Version 1.12) [Computer software]. GitHub. https://github.com/kreeedit/ANNIE
BibTeX
bibtex
@software{Kovacs_ANNIE_2025,
author = {Kovács, Tamás},
title = {{ANNIE: Annotation Interface for Named-entity \& Information Extraction}},
version = {1.12},
publisher = {Zenodo},
year = {2025},
doi = {10.5281/zenodo.15805548},
url = {https://github.com/kreeedit/ANNIE}
}
License
Apache 2.0
Owner
- Login: kreeedit
- Kind: user
- Location: Graz, Austria
- Company: University of Graz
- Website: https://orcid.org/0000-0002-3913-2946
- Twitter: tms_kovacs
- Repositories: 3
- Profile: https://github.com/kreeedit
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Kovács"
given-names: "Tamás"
orcid: "https://orcid.org/0000-0002-3913-2946"
title: "ANNIE: Annotation Interface for Named-entity & Information Extraction"
version: 0.75
doi: 10.5281/zenodo.15805548
date-released: 2025-07-04
url: "https://github.com/kreeedit/ANNIE"
GitHub Events
Total
- Release event: 1
- Watch event: 1
- Public event: 1
- Push event: 34
- Create event: 1
Last Year
- Release event: 1
- Watch event: 1
- Public event: 1
- Push event: 34
- Create event: 1