frame-semantic-transformer

Frame Semantic Parser based on T5 and FrameNet

https://github.com/chanind/frame-semantic-transformer

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (10.1%) to scientific vocabulary

Keywords

framenet huggingface nlp semantic-parsing t5 transformers

Keywords from Contributors

spacy-extension abstract-meaning-representation amr first-order-logic logic
Last synced: 6 months ago

Repository

Frame Semantic Parser based on T5 and FrameNet

Basic Info
Statistics
  • Stars: 62
  • Watchers: 5
  • Forks: 13
  • Open Issues: 10
  • Releases: 18
Topics
framenet huggingface nlp semantic-parsing t5 transformers
Created almost 4 years ago · Last pushed over 2 years ago
Metadata Files
Readme Changelog License Citation

README.md

Frame Semantic Transformer


Frame-based semantic parsing library trained on FrameNet and built on HuggingFace's T5 Transformer

Live Demo: chanind.github.io/frame-semantic-transformer

Full docs: frame-semantic-transformer.readthedocs.io

About

This library draws heavily on Open-Sesame (paper) for inspiration on training and evaluation on FrameNet 1.7, and uses ideas from the paper Open-Domain Frame Semantic Parsing Using Transformers for using T5 as a frame-semantic parser. SimpleT5 was also used as a base for the initial training setup.

More details: FrameNet Parsing with Transformers Blog Post

Performance

This library uses the same train/dev/test documents and evaluation methodology as Open-Sesame, so results should be comparable between the two libraries. There are 2 pretrained models available, base and small, corresponding to t5-base and t5-small in HuggingFace, respectively.

| Task                   | Sesame F1 (dev/test) | Small Model F1 (dev/test) | Base Model F1 (dev/test) |
| ---------------------- | -------------------- | ------------------------- | ------------------------ |
| Trigger identification | 0.80 / 0.73          | 0.75 / 0.71               | 0.78 / 0.74              |
| Frame classification   | 0.90 / 0.87          | 0.87 / 0.86               | 0.91 / 0.89              |
| Argument extraction    | 0.61 / 0.61          | 0.76 / 0.73               | 0.78 / 0.75              |

The base model performs similarly to Open-Sesame on trigger identification and frame classification tasks, but outperforms it by a significant margin on argument extraction. The small pretrained model has lower F1 than base across the board, but is 1/4 the size and still outperforms Open-Sesame at argument extraction.

Installation

pip install frame-semantic-transformer

Usage

Inference

The main entry point for interacting with the library is the FrameSemanticTransformer class, as shown below. For inference, the detect_frames() method is likely all that is needed to perform frame parsing.

```python
from frame_semantic_transformer import FrameSemanticTransformer

frame_transformer = FrameSemanticTransformer()

result = frame_transformer.detect_frames("The hallway smelt of boiled cabbage and old rag mats.")

print(f"Results found in: {result.sentence}")
for frame in result.frames:
    print(f"FRAME: {frame.name}")
    for element in frame.frame_elements:
        print(f"{element.name}: {element.text}")
```

The result returned from detect_frames() is an object containing sentence, a parsed version of the original sentence text; trigger_locations, the indices within the sentence where frame triggers were detected; and frames, a list of all detected frames in the sentence. Within frames, each object contains name, which corresponds to the FrameNet name of the frame; trigger_location, indicating which trigger in the text this frame uses; and frame_elements, a list of all relevant frame elements found in the text.
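To make the shape of this result concrete, here is a minimal sketch of the structure using plain dataclasses. The field names come from the description above, but the classes and the sample values are purely illustrative stand-ins, not the library's actual types or real model output.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the library's result types, based on the
# fields described above; the real classes live inside
# frame_semantic_transformer itself.

@dataclass
class FrameElementResult:
    name: str  # FrameNet frame element name
    text: str  # the span of the sentence filling this element

@dataclass
class FrameResult:
    name: str              # FrameNet frame name
    trigger_location: int  # which detected trigger this frame uses
    frame_elements: list   # FrameElementResult items

@dataclass
class DetectFramesResult:
    sentence: str
    trigger_locations: list  # indices where triggers were detected
    frames: list             # FrameResult items

# Walking the structure mirrors the inference example above
# (values are made up for illustration, not actual model output)
result = DetectFramesResult(
    sentence="The hallway smelt of boiled cabbage.",
    trigger_locations=[12],
    frames=[
        FrameResult(
            name="ExampleFrame",
            trigger_location=12,
            frame_elements=[FrameElementResult(name="ExampleElement", text="boiled cabbage")],
        )
    ],
)
for frame in result.frames:
    print(frame.name, [(e.name, e.text) for e in frame.frame_elements])
```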

For more efficient bulk processing of text, there's a detect_frames_bulk() method which processes a list of sentences in batches. You can control the batch size using the batch_size param; the default is 8.

```python
frame_transformer = FrameSemanticTransformer(batch_size=16)

result = frame_transformer.detect_frames_bulk([
    "I'm getting quite hungry, but I can wait a bit longer.",
    "The chef gave the food to the customer.",
    "The hallway smelt of boiled cabbage and old rag mats.",
])
```

Note: It's not recommended to pass more than a single sentence per string to detect_frames() or detect_frames_bulk(). If you have a paragraph of text to process, it's best to split the paragraph into a list of sentences and pass the sentences as a list to detect_frames_bulk(). Only single sentences per string were used during training, so it's not clear how the model will handle multiple sentences in the same string.

```python
# ❌ Bad, don't do this
frame_transformer.detect_frames("Fuzzy Wuzzy was a bear. Fuzzy Wuzzy had no hair.")

# 👍 Do this instead
frame_transformer.detect_frames_bulk([
    "Fuzzy Wuzzy was a bear.",
    "Fuzzy Wuzzy had no hair.",
])
```
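Splitting a paragraph into sentences before calling detect_frames_bulk() can be sketched with a naive regex-based splitter like the one below. This helper is hypothetical and only an illustration; a real sentence tokenizer (e.g. nltk's sent_tokenize, and nltk is already a dependency of this project) handles abbreviations and other edge cases far better.

```python
import re

def naive_split_sentences(paragraph: str) -> list:
    # Split on sentence-ending punctuation followed by whitespace.
    # Rough illustration only: abbreviations like "Dr." will be split
    # incorrectly, so prefer a real sentence tokenizer in practice.
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [p for p in parts if p]

sentences = naive_split_sentences(
    "Fuzzy Wuzzy was a bear. Fuzzy Wuzzy had no hair."
)
print(sentences)
# The resulting list can then be passed to detect_frames_bulk()
```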

Running on GPU vs CPU

By default, FrameSemanticTransformer will attempt to use a GPU if one is available. If you'd like to explicitly set whether to run on GPU vs CPU, you can pass the use_gpu param.

```python
# force the model to run on the CPU
frame_transformer = FrameSemanticTransformer(use_gpu=False)
```

Loading Models

There are currently 2 pretrained models available for inference, called base and small, fine-tuned from HuggingFace's t5-base and t5-small models respectively. A local fine-tuned T5 model can be loaded as well by passing its path. If no model is specified, the base model will be used.

```python
base_transformer = FrameSemanticTransformer("base")  # this is also the default
small_transformer = FrameSemanticTransformer("small")  # a smaller pretrained model which is faster to run
custom_transformer = FrameSemanticTransformer("/path/to/model")  # load a custom t5 model
```

By default, models are lazily loaded when detect_frames() is first called. If you want to load the model sooner, you can call setup() on a FrameSemanticTransformer instance to load models immediately.

```python
frame_transformer = FrameSemanticTransformer()
frame_transformer.setup()  # load models immediately
```

Contributing

Any contributions to improve this project are welcome! Please open an issue or pull request in this repo with any bugfixes / changes / improvements you have!

This project uses Black for code formatting, Flake8 for linting, and Pytest for tests. Make sure any changes you submit pass these checks in your PR. If you have trouble getting them to run, feel free to open a pull request regardless, and we can discuss further in the PR.

License

The code contained in this repo is released under an MIT license; however, the pretrained models are released under an Apache 2.0 license in accordance with FrameNet training data and HuggingFace's T5 base models.

Citation

If you use Frame Semantic Transformer in your work, please cite the following:

```bibtex
@article{chanin2023opensource,
  title={Open-source Frame Semantic Parsing},
  author={Chanin, David},
  journal={arXiv preprint arXiv:2303.12788},
  year={2023}
}
```

Owner

  • Name: David Chanin
  • Login: chanind
  • Kind: user
  • Location: London, UK
  • Company: UCL

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Chanin"
  given-names: "David"
title: "Open-source Frame Semantic Parsing"
date-released: 2023-03-22
url: "https://arxiv.org/abs/2303.12788"

GitHub Events

Total
  • Issues event: 1
  • Watch event: 12
  • Fork event: 3
Last Year
  • Issues event: 1
  • Watch event: 12
  • Fork event: 3

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 152
  • Total Committers: 6
  • Avg Commits per committer: 25.333
  • Development Distribution Score (DDS): 0.145
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
David Chanin c****v@g****m 130
github-actions g****s@g****m 10
github-actions a****n@g****m 8
Jacob Striebel 2****l 2
Shubhashis Roy Dipta i****a@g****m 1
Curtis Ruck r****c 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 17
  • Total pull requests: 15
  • Average time to close issues: 20 days
  • Average time to close pull requests: 1 day
  • Total issue authors: 15
  • Total pull request authors: 4
  • Average comments per issue: 2.71
  • Average comments per pull request: 0.27
  • Merged pull requests: 15
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 2
  • Pull request authors: 0
  • Average comments per issue: 2.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • fatihbozdag (2)
  • sdspieg (2)
  • jerome-white (1)
  • yunxiaomr (1)
  • pholur (1)
  • RiverDong (1)
  • Zce1112zslx (1)
  • ruckc (1)
  • anon2014 (1)
  • aalgirdas (1)
  • sidsvash26 (1)
  • cbjrobertson (1)
  • Yangyi-Chen (1)
  • shafqatvirk (1)
  • neostrange (1)
Pull Request Authors
  • chanind (11)
  • striebel (2)
  • ruckc (1)
  • dipta007 (1)
Top Labels
Issue Labels
Pull Request Labels

Packages

  • Total packages: 1
  • Total downloads:
    • pypi 121 last-month
  • Total dependent packages: 0
  • Total dependent repositories: 1
  • Total versions: 19
  • Total maintainers: 1
pypi.org: frame-semantic-transformer

Frame Semantic Parser based on T5 and FrameNet

  • Versions: 19
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 121 Last month
Rankings
Dependent packages count: 10.0%
Stargazers count: 10.5%
Forks count: 11.4%
Average: 14.6%
Downloads: 19.3%
Dependent repos count: 21.8%
Maintainers (1)
Last synced: 6 months ago

Dependencies

demo/client/package-lock.json npm
  • 1153 dependencies
demo/client/package.json npm
  • prettier ^2.6.2 development
  • @heroicons/react ^1.0.6
  • @testing-library/jest-dom ^5.16.4
  • @testing-library/react ^13.2.0
  • @testing-library/user-event ^13.5.0
  • @types/jest ^27.5.1
  • @types/node ^16.11.36
  • @types/react ^18.0.9
  • @types/react-dom ^18.0.4
  • classnames ^2.3.1
  • react ^18.1.0
  • react-dom ^18.1.0
  • react-query ^3.39.0
  • react-router-dom ^6.3.0
  • react-scripts 5.0.1
  • react-spinners-kit ^1.9.1
  • typescript ^4.6.4
  • web-vitals ^2.1.4
demo/client/yarn.lock npm
  • 1191 dependencies
demo/server/requirements.txt pypi
  • Flask ==2.1.2
  • flask-cors ==3.0.10
  • gunicorn ==20.1.0
pyproject.toml pypi
  • black ^22.3.0 develop
  • flake8 ^4.0.1 develop
  • mypy ^0.950 develop
  • pytest ^5.2 develop
  • syrupy ^2.0.0 develop
  • nltk ^3.7
  • python ^3.7
  • pytorch-lightning ^1.6.2
  • sentencepiece ^0.1.96
  • torch ^1.11.0
  • tqdm ^4.64.0
  • transformers ^4.18.0
.github/workflows/ci.yaml actions
  • actions/checkout v3 composite
  • actions/setup-python v3 composite
  • relekang/python-semantic-release v7.28.1 composite
  • snok/install-poetry v1 composite
.github/workflows/website.yaml actions
  • actions/checkout v3 composite
  • actions/setup-node v3 composite
  • peaceiris/actions-gh-pages v3 composite