zensols.calamr
CALAMR: Component ALignment for Abstract Meaning Representation (LREC-COLING paper)
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (15.0%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
CALAMR: Component ALignment for Abstract Meaning Representation
This repository contains code for the paper CALAMR: Component ALignment for Abstract Meaning Representation and aligns the components of a bipartite source and summary AMR graph. To reproduce the results of the paper, see the paper repository.
The results are useful as a semantic graph similarity score (like SMATCH) or to find the summarized portion (as AMR nodes, edges and subgraphs) of a document or the portion of the source that represents the summary. If you use this library or the PropBank API/curated database, please cite our paper.
Features:
- Align source/summary AMR graphs.
- Scores for extent to which AMRs are summarized or represented in their source text.
- Rendering of the alignments.
- Support for four AMR corpora.
Documentation
The recommended reading order for this project:
- The conference slides
- The abstract and introduction of the paper CALAMR: Component ALignment for Abstract Meaning Representation
- Overview and implementation guide
- Full documentation
- API reference
Installing
Because this library has many dependencies and moving parts, it is best to create a new environment using conda:
```bash
wget https://github.com/plandes/calamr/raw/refs/heads/master/environment.yml
conda env create -f environment.yml
conda activate calamr
```
The library can also be installed with pip from the PyPI repository:
```bash
pip3 install zensols.calamr
```
See Installing the Gsii Model.
Corpora
This repository contains code to support the following corpora with source/summary AMR for alignment:
- LDC2020T02 Proxy Corpus
- ISI Little Prince
- ISI Bio AMR
- A micro corpus used in the paper examples and usage.
Usage
The command-line tool and API do not depend on the repository. However, the repository has a template configuration file that both the CLI and the API use. The examples also use data in the repository. Do the following to get started:
- Clone this repository and change the working directory to it:
  ```bash
  git clone https://github.com/plandes/calamr && cd calamr
  ```
- Copy the resource file:
  ```bash
  cp src/config/dot-calamrrc ~/.calamrrc
  ```
Command Line
The steps below show how to use the command-line tool. First set up the application environment:
- Edit the `~/.calamrrc` file to choose the corpus and visualization. Keep the `calamr_corpus` set to `adhoc` for these examples. (Note that you can also set the `CALAMRRC` environment variable to a file in a different location if you prefer.)
- Create the micro corpus:
  ```bash
  calamr mkadhoc --corpusfile corpus/micro/source.json
  ```
- Print the document keys of the corpus:
  ```bash
  calamr keys
  ```
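
The note about the `CALAMRRC` environment variable can be pictured with a minimal sketch. The helper name `resolve_calamrrc` is hypothetical, and the precedence it encodes (environment variable first, then `~/.calamrrc`) is an assumption drawn from the note above rather than the library's actual lookup code.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_calamrrc(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the configuration file path this sketch assumes is used.

    Hypothetical helper: prefers the CALAMRRC environment variable and
    falls back to ~/.calamrrc when the variable is unset or empty.
    """
    env = os.environ if env is None else env
    override = env.get('CALAMRRC')
    if override:
        # a user-supplied location; expand a leading "~" if present
        return Path(override).expanduser()
    return Path.home() / '.calamrrc'


if __name__ == '__main__':
    print(resolve_calamrrc())
```

Passing a mapping instead of reading `os.environ` directly keeps the function easy to try out without changing the real environment.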
Aligning Corpus Documents
AMR corpora that distinguish between source and summary documents are needed so the API knows what data to align. The following examples utilize preexisting corpora (including the last section's micro corpus):
- Generate the Liu et al. graph for the micro corpus in directory `example`:
  ```bash
  calamr aligncorp liu-example -f txt -o example
  ```
- Force the Little Prince AMR corpus download and confirm success with the single document key `1943`:
  ```bash
  calamr keys --override=calamr_corpus.name=little-prince
  ```
- Use the default AMR parser to extract sentence text from the Little Prince AMR corpus using the SPRING parser:
  ```bash
  calamr penman -o lp.txt --limit 5 \
    --override amr_default.parse_model=spring \
    ~/.cache/calamr/corpus/amr-rel/amr-bank-struct-v3.0.txt
  ```
- Score the parsed sentences using CALAMR, SMATCH and WLK:
  ```bash
  calamr score --parsed lp.txt \
    --methods calamr,smatch,wlk \
    ~/.cache/calamr/corpus/amr-rel/amr-bank-struct-v3.0.txt
  ```
Ad hoc Corpora
The micro corpus can be edited and rebuilt to add your own data to be aligned. However, there's an easier way to align ad hoc documents.
- Align a summarized document not included in any corpus. First create the annotated documents as the file `short-story.json`:
  ```json
  [
      {
          "id": "intro",
          "body": "The Dow Jones Industrial Average and other major indexes pared losses.",
          "summary": "Dow Jones and other major indexes reduced losses."
      },
      {
          "id": "dow-stats",
          "body": "The Dow ended 0.5% lower on Friday while the S&P 500 fell 0.7%. Among the S&P sectors, energy and utilities gained while technology and communication services lagged.",
          "summary": "Dow sank 0.5%, S&P 500 lost 0.7% and energy, utilities up, tech, comms came down."
      }
  ]
  ```
  Now align the documents using the `XFM Bart Base` AMR parser, rendering with the maximum number of steps (`-r 10`), and save the results to `example`:
  ```bash
  calamr align short-story.json --override amr_default.parse_model=xfm_bart_base -r 10 -o example -f txt
  ```

The `-r` option controls how many intermediate graphs are generated to show the iteration of the algorithm over all the steps (see the paper for details).
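
For larger ad hoc collections it may be easier to generate the JSON file than to write it by hand. The following is a minimal sketch using only the standard library. The helper names `check_docs` and `write_docs` are hypothetical, and the checks (every entry has `id`, `body` and `summary`, and IDs do not repeat) are assumptions inferred from the example file above, not rules taken from the library.

```python
import json
from pathlib import Path
from typing import Dict, List

# keys seen in the example file above
REQUIRED_KEYS = ('id', 'body', 'summary')


def check_docs(docs: List[Dict[str, str]]) -> None:
    """Raise ValueError if an entry lacks a key or an ID repeats."""
    seen = set()
    for doc in docs:
        missing = [key for key in REQUIRED_KEYS if key not in doc]
        if missing:
            raise ValueError(f'document is missing keys: {missing}')
        if doc['id'] in seen:
            raise ValueError(f"duplicate document id: {doc['id']}")
        seen.add(doc['id'])


def write_docs(docs: List[Dict[str, str]], path: Path) -> None:
    """Validate the entries and write them as a JSON array."""
    check_docs(docs)
    path.write_text(json.dumps(docs, indent=4), encoding='utf-8')


docs = [{
    'id': 'intro',
    'body': ('The Dow Jones Industrial Average and other major '
             'indexes pared losses.'),
    'summary': 'Dow Jones and other major indexes reduced losses.',
}]

if __name__ == '__main__':
    write_docs(docs, Path('short-story.json'))
```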
AMR Release 3.0 Corpus (LDC2020T02)
If you are using the AMR 3.0 corpus, a preprocessing step must be run before it can be used.
The Proxy Report corpus from AMR 3.0 does not include both the `alignments` (text-to-graph alignments) and `snt-type` (indicates whether a sentence is part of the source or the summary) metadata in one dataset. By default, this API expects both. To merge them into one dataset, do the following:
- Obtain or purchase the corpus.
- Move the file where the software can find it:
  ```bash
  mkdir ~/.cache/calamr/download
  cp /path/to/amr_annotation_3.0_LDC2020T02.tgz ~/.cache/calamr/download
  ```
- Merge the alignments and sentence descriptors:
  ```bash
  ./src/bin/merge-proxy-anons.py
  ```
- Confirm the merge was successful by printing the document keys and aligning a report:
  ```bash
  calamr keys --override=calamr_corpus.name=proxy-report
  calamr aligncorp 20041010_0024 -f txt -o example \
    --override calamr_corpus.name=proxy-report
  ```
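
The idea behind the merge step can be illustrated with a toy example. This is only a sketch of joining two metadata tables on sentence ID. It is not what `merge-proxy-anons.py` actually does, and the function name `merge_metadata`, the sentence ID and the metadata values below are all made up for illustration.

```python
from typing import Dict

Metadata = Dict[str, str]


def merge_metadata(aligned: Dict[str, Metadata],
                   typed: Dict[str, Metadata]) -> Dict[str, Metadata]:
    """Join two metadata tables on sentence ID.

    Only IDs present in both inputs are kept, so every merged sentence
    carries the fields of both tables.
    """
    merged = {}
    for sent_id in sorted(aligned.keys() & typed.keys()):
        merged[sent_id] = {**aligned[sent_id], **typed[sent_id]}
    return merged


# made-up rows: one table has alignments, the other has snt-type
with_alignments = {'sent.1': {'alignments': '0-1|0'}}
with_types = {'sent.1': {'snt-type': 'summary'},
              'sent.2': {'snt-type': 'body'}}

if __name__ == '__main__':
    print(merge_metadata(with_alignments, with_types))
```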
API
This section explains how to use the library's API directly in Python.
Aligning Ad hoc Documents
This is taken from the ad hoc API example.
1. Get the resource bundle:
   ```python
   from zensols.amr import AmrSentence, AmrDocument, AmrFeatureDocument
   from zensols.calamr import DocumentGraph, FlowGraphResult, Resource, ApplicationFactory

   # get the resource bundle
   res: Resource = ApplicationFactory.get_resource()
   ```
1. Create test data:
   ```python
   # create AMR sentences
   test_summary = AmrSentence("""\
   # ::snt Joe's dog was chasing a cat in the garden.
   # ::snt-type summary
   # ::id liu-example.0
   (c / chase-01
      :ARG0 (d / dog
         :poss (p / person
            :name (n / name
               :op1 "Joe")))
      :ARG1 (c2 / cat)
      :location (g / garden))""")
   test_body = AmrSentence("""\
   # ::snt I saw Joe's dog, which was running in the garden.
   # ::snt-type body
   # ::id liu-example.1
   (s / see-01
      :ARG0 (ii / i)
      :ARG1 (d / dog
         :poss (p / person
            :name (n / name
               :op1 "Joe"))
         :ARG0-of (r / run-02
            :location (g / garden))))""")
   # create the AMR document
   adoc = AmrDocument((test_summary, test_body))
   ```
1. Create the annotated document and align it:
   ```python
   # convert the AMR document to an AMR annotated document with NLP features
   fdoc: AmrFeatureDocument = res.to_annotated_doc(adoc)
   # create the bipartite source/summary graph
   graph: DocumentGraph = res.create_graph(fdoc)
   # align the graph
   flow: FlowGraphResult = res.align(graph)
   ```
1. Get and visualize the results:
   ```python
   # write the summarization metrics
   flow.write()
   # render the results as a graph in a web browser
   flow.render()
   ```
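
The AMR strings in the test data carry their metadata as `# ::key value` comment lines ahead of the graph. As a rough illustration of that layout, the sketch below pulls those fields out with the standard library only. It is not the `zensols.amr` API, the name `parse_metadata` is hypothetical, and it assumes one field per comment line as in the example.

```python
import re
from typing import Dict

# matches one metadata field such as "# ::snt-type summary"
META_LINE = re.compile(r'^#\s*::(\S+)\s+(.*)$')


def parse_metadata(amr: str) -> Dict[str, str]:
    """Collect the ``# ::key value`` comment lines of one AMR string."""
    meta = {}
    for line in amr.splitlines():
        match = META_LINE.match(line.strip())
        if match:
            meta[match.group(1)] = match.group(2).strip()
    return meta


summary_amr = """\
# ::snt Joe's dog was chasing a cat in the garden.
# ::snt-type summary
# ::id liu-example.0
(c / chase-01
   :ARG0 (d / dog
      :poss (p / person
         :name (n / name
            :op1 "Joe")))
   :ARG1 (c2 / cat)
   :location (g / garden))"""

if __name__ == '__main__':
    meta = parse_metadata(summary_amr)
    print(meta['snt-type'], meta['id'])
```

The `snt-type` field is the one that tells source sentences from summary sentences, which is why the alignment needs it on every sentence.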
Aligning Corpora Documents
To use an existing corpus (ad hoc "micro" corpus, The Little Prince, Biomedical Corpus, or Proxy report 3.0), use the following API to speed things up:
1. Get the resource bundle:
   ```python
   from pathlib import Path
   from zensols.amr import AmrFeatureDocument
   from zensols.calamr import DocumentGraph, Resource, ApplicationFactory

   # get the resource bundle
   res: Resource = ApplicationFactory.get_resource()
   ```
1. Get the Liu et al. AMR feature document example and print it:
   ```python
   doc: AmrFeatureDocument = res.get_corpus_document('liu-example')
   doc.write()
   ```
   output:
   ```yaml
   [T]: Joe's dog was chasing a cat in the garden. I saw Joe's dog, which was running in the garden. The dog was chasing a cat.
   sentences:
       [N]: Joe's dog was chasing a cat in the garden.
       (c0 / chase-01~e.4
             :location (g0 / garden~e.9)
             :ARG0 (d0 / dog~e.2
                   :poss (p0 / person
                         :name (n0 / name
                               :op1 "Joe"~e.0)))
             :ARG1 (c1 / cat~e.6))
       .
       .
       .
   amr:
       summary:
           Joe's dog was chasing a cat in the garden.
       sections:
           no section sentences
               I saw Joe's dog, which was running in the garden.
               The dog was chasing a cat.
   ```
1. Align (if not already aligned and cached) and get the flow results of the example:
   ```python
   flow = res.align_corpus_document('liu-example')
   flow.write()
   ```
   output:
   ```yaml
   summary:
       Joe's dog was chasing a cat in the garden.
   sections:
       no section sentences
           I saw Joe's dog, which was running in the garden.
           The dog was chasing a cat.
   statistics:
       agg:
           aligned_portion_hmean: 0.8695652173913044
           mean_flow: 0.7131309357900468
           tot_alignable: 21
           tot_aligned: 18
           aligned_portion: 0.8571428571428571
           reentrancies: 0
   ```
1. Parse the first document from the [ad hoc JSON file](#ad-hoc-corpora), align it, and give its statistics:
   ```python
   doc: AmrFeatureDocument = next(iter(res.parse_documents(Path('short-story.json'))))
   graph: DocumentGraph = res.create_graph(doc)
   flow = res.align(graph)
   flow.write()
   ```
   output:
   ```yaml
   summary:
       Dow Jones and other major indexes reduced losses.
   sections:
       no section sentences
           The Dow Jones Industrial Average and other major indexes pared losses.
   statistics:
       agg:
           aligned_portion_hmean: 1.0
           mean_flow: 0.9269955839429582
           tot_alignable: 24
           tot_aligned: 24
           aligned_portion: 1.0
           reentrancies: 0
   ...
   ```
1. Render the results of a flow:
   ```python
   flow = res.align_corpus_document('liu-example')
   flow.render()
   ```
1. Render all graphs of the flow results to directory `example`:
   ```python
   flow.render(
       contexts=flow.get_render_contexts(include_nascent=True),
       directory=Path('example'),
       display=False)
   ```
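
The aggregate statistics printed above report a count of alignable components, a count of aligned components, and an aligned portion. The numbers shown are consistent with the portion being the simple ratio of the two counts. The sketch below works that arithmetic through under that assumption. The function is illustrative only and is not part of the library.

```python
def aligned_portion(tot_aligned: int, tot_alignable: int) -> float:
    """Fraction of alignable components that were aligned.

    Assumes the reported portion is aligned / alignable, which agrees
    with both outputs shown above.
    """
    if tot_alignable <= 0:
        raise ValueError('tot_alignable must be positive')
    return tot_aligned / tot_alignable


if __name__ == '__main__':
    # Liu et al. example: 18 of 21 components aligned
    print(aligned_portion(18, 21))
    # short-story example: 24 of 24 components aligned
    print(aligned_portion(24, 24))
```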
Docker
A stand-alone Docker image is also available (see CALAMR Docker image). It provides a stand-alone container with all models, configuration and the adhoc micro corpus installed.
Example Graphs
The Liu et al. example graphs were created from the last step of the API examples, which is equivalent to the first step of the command-line example.
GraphViz
To create these graphs, set your ~/.calamrrc configuration to:
```ini
[calamr_default]
renderer = graphviz
```
The Nascent Graph (with flow data)
The Source Graph
Plotly
To create these graphs, set your ~/.calamrrc configuration to:
```ini
[calamr_default]
renderer = plotly
```
See the interactive version.
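
To check which renderer a configuration fragment selects, the `[calamr_default]` section can be read with the standard library. This is a minimal sketch: the helper `read_renderer` is hypothetical, it only knows the two renderer values shown in this section, and it assumes the fragment is plain INI. A full `~/.calamrrc` may contain syntax that this simple reader does not handle.

```python
from configparser import ConfigParser

# the two renderer values shown in this section
SUPPORTED = ('graphviz', 'plotly')


def read_renderer(text: str) -> str:
    """Return the renderer named in the [calamr_default] section."""
    parser = ConfigParser(interpolation=None)
    parser.read_string(text)
    renderer = parser.get('calamr_default', 'renderer')
    if renderer not in SUPPORTED:
        raise ValueError(f'unexpected renderer: {renderer}')
    return renderer


if __name__ == '__main__':
    print(read_renderer('[calamr_default]\nrenderer = plotly\n'))
```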

Attribution
This project, or reference model code, uses:
- Python 3.11
- amrlib for AMR parsing.
- amr_coref for AMR co-reference
- zensols.amr for AMR features and summarization data structures.
- Sentence-BERT embeddings
- zensols.propbankdb and zensols.deepnlp for PropBank embeddings
- zensols.nlparse for natural language features and NLP scoring
- Smatch and WLK for scoring.
Citation
If you use this project in your research please use the following BibTeX entry:
```bibtex
@inproceedings{landes-di-eugenio-2024-calamr-component,
    title = "{CALAMR}: Component {AL}ignment for {A}bstract {M}eaning {R}epresentation",
    author = "Landes, Paul  and
      Di Eugenio, Barbara",
    editor = "Calzolari, Nicoletta  and
      Kan, Min-Yen  and
      Hoste, Veronique  and
      Lenci, Alessandro  and
      Sakti, Sakriani  and
      Xue, Nianwen",
    booktitle = "Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)",
    month = may,
    year = "2024",
    address = "Torino, Italy",
    publisher = "ELRA and ICCL",
    url = "https://aclanthology.org/2024.lrec-main.236",
    pages = "2622--2637"
}
```
Changelog
An extensive changelog is available here.
License
Copyright (c) 2023 - 2025 Paul Landes
Owner
- Name: Paul Landes
- Login: plandes
- Kind: user
- Repositories: 90
- Profile: https://github.com/plandes
Citation (CITATION.cff)
cff-version: 1.2.0
title: 'CALAMR: Component ALignment for Abstract Meaning Representation'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
date-released: 2024-05-19
repository-code: https://github.com/uic-nlp-lab/calamr
authors:
  - given-names: Paul
    family-names: Landes
    email: landes@mailc.net
    affiliation: University of Illinois at Chicago
    orcid: 'https://orcid.org/0000-0003-0985-0864'
preferred-citation:
  type: conference-paper
  authors:
    - given-names: Paul
      family-names: Landes
      email: landes@mailc.net
      affiliation: University of Illinois at Chicago
      orcid: 'https://orcid.org/0000-0003-0985-0864'
    - given-names: Barbara
      family-names: Di Eugenio
      affiliation: University of Illinois at Chicago
  title: 'CALAMR: Component ALignment for Abstract Meaning Representation'
  url: https://aclanthology.org/2024.lrec-main.236/
  year: 2024
  conference:
    name: Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
    city: Torino
    country: IT
    date-start: 2024-05-20
    date-end: 2024-05-25
GitHub Events
Total
- Watch event: 2
- Push event: 10
- Create event: 1
Last Year
- Watch event: 2
- Push event: 10
- Create event: 1
Issues and Pull Requests
Last synced: 10 months ago
All Time
- Total issues: 6
- Total pull requests: 0
- Average time to close issues: about 18 hours
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 2.17
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 0
- Pull request authors: 0
- Average comments per issue: 0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- NyistMilan (6)
Packages
- Total packages: 1
- Total downloads:
  - pypi 75 last-month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 4
- Total maintainers: 1
pypi.org: zensols.calamr
CALAMR: Component ALignment for AMR
- Homepage: https://github.com/plandes/calamr
- Documentation: https://zensols.calamr.readthedocs.io/
- License: mit
- Latest release: 0.2.0 (published over 1 year ago)
Dependencies
- nvidia/cuda 12.3.1-runtime-ubuntu22.04 build
- plandes/calamr latest
- chart-studio *
- igraph *
- pyvis *
- zensols.amr *
- zensols.deepnlp *
- zensols.propbankdb *
- zensols.rend *
- zensols.util *
- actions/checkout v3 composite
- actions/setup-python v3 composite
- editdistance *
- pyemd *
- rouge-score *
- chart-studio ==1.1.0
- editdistance ==0.8.1
- igraph ==0.11.3
- pyemd ==1.0.0
- pyvis ==0.2.1
- rouge-score ==0.1.2
- torch ==2.1.2
- transformers ==4.45.2
- zensols.amr ==0.2.1
- zensols.calamr ==0.2.0
- zensols.deeplearn ==1.13.2
- zensols.deepnlp ==1.17.0
- zensols.nlp ==1.12.1
- zensols.propbankdb ==0.2.0
- zensols.rend *
- zensols.util ==1.15.2