frame-semantic-transformer
Frame Semantic Parser based on T5 and FrameNet
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.1%) to scientific vocabulary
Repository
Frame Semantic Parser based on T5 and FrameNet
Basic Info
- Host: GitHub
- Owner: chanind
- License: mit
- Language: Python
- Default Branch: main
- Homepage: https://chanind.github.io/frame-semantic-transformer
- Size: 1.04 MB
Statistics
- Stars: 62
- Watchers: 5
- Forks: 13
- Open Issues: 10
- Releases: 18
Metadata Files
README.md
Frame Semantic Transformer
Frame-based semantic parsing library trained on FrameNet and built on HuggingFace's T5 Transformer
Live Demo: chanind.github.io/frame-semantic-transformer
Full docs: frame-semantic-transformer.readthedocs.io
About
This library draws heavily on Open-Sesame (paper) for inspiration on training and evaluation on FrameNet 1.7, and uses ideas from the paper Open-Domain Frame Semantic Parsing Using Transformers for using T5 as a frame-semantic parser. SimpleT5 was also used as a base for the initial training setup.
More details: FrameNet Parsing with Transformers Blog Post
Performance
This library uses the same train/dev/test documents and evaluation methodology as Open-Sesame, so results should be comparable between the two libraries. Two pretrained models are available, base and small, corresponding to t5-base and t5-small in HuggingFace, respectively.
| Task                   | Sesame F1 (dev/test) | Small Model F1 (dev/test) | Base Model F1 (dev/test) |
| ---------------------- | -------------------- | ------------------------- | ------------------------ |
| Trigger identification | 0.80 / 0.73          | 0.75 / 0.71               | 0.78 / 0.74              |
| Frame classification   | 0.90 / 0.87          | 0.87 / 0.86               | 0.91 / 0.89              |
| Argument extraction    | 0.61 / 0.61          | 0.76 / 0.73               | 0.78 / 0.75              |
The base model performs similarly to Open-Sesame on trigger identification and frame classification tasks, but outperforms it by a significant margin on argument extraction. The small pretrained model has lower F1 than base across the board, but is 1/4 the size and still outperforms Open-Sesame at argument extraction.
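To make the comparison concrete, the test-set gap between the base model and Open-Sesame can be computed directly from the scores reported above. The snippet below is plain arithmetic on those numbers; the dictionary layout and the `f1_gains` helper are illustrative names, not part of the library.

```python
# Test-set F1 scores copied from the table above: (Open-Sesame, base model).
TEST_F1 = {
    "Trigger identification": (0.73, 0.74),
    "Frame classification": (0.87, 0.89),
    "Argument extraction": (0.61, 0.75),
}


def f1_gains(scores):
    """Return the base model's F1 gain over Open-Sesame for each task."""
    return {
        task: round(base - sesame, 2)
        for task, (sesame, base) in scores.items()
    }


if __name__ == "__main__":
    for task, gain in f1_gains(TEST_F1).items():
        print(f"{task}: {gain:+.2f}")
```

On the test split this gives gains of 0.01 and 0.02 F1 for the first two tasks and 0.14 F1 for argument extraction.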
Installation
pip install frame-semantic-transformer
Usage
Inference
The main entry point for interacting with the library is the FrameSemanticTransformer class, as shown below. For inference, the detect_frames() method is likely all that is needed to perform frame parsing.
```python
from frame_semantic_transformer import FrameSemanticTransformer

frame_transformer = FrameSemanticTransformer()

result = frame_transformer.detect_frames("The hallway smelt of boiled cabbage and old rag mats.")

print(f"Results found in: {result.sentence}")
for frame in result.frames:
    print(f"FRAME: {frame.name}")
    for element in frame.frame_elements:
        print(f"{element.name}: {element.text}")
```
The result returned from detect_frames() is an object containing sentence, a parsed version of the original sentence text; trigger_locations, the indices within the sentence where frame triggers were detected; and frames, a list of all detected frames in the sentence. Within frames, each object contains name, which corresponds to the FrameNet name of the frame; trigger_location, which identifies the trigger in the text that this frame uses; and frame_elements, a list of all relevant frame elements found in the text.
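As a rough sketch of how these fields fit together, the helper below flattens a result into plain dictionaries. It relies only on the attribute names described above. The `summarize_frames` helper and the hand-built `example_result` are hypothetical stand-ins for illustration; the values are not real model output.

```python
from types import SimpleNamespace


def summarize_frames(result):
    """Flatten a detect_frames()-style result into plain dictionaries."""
    return [
        {
            "frame": frame.name,
            "trigger_location": frame.trigger_location,
            "elements": {el.name: el.text for el in frame.frame_elements},
        }
        for frame in result.frames
    ]


# Hand-built stand-in with the same attribute names as a real result.
# These values are made up for illustration and are NOT model output.
example_result = SimpleNamespace(
    sentence="The chef gave the food to the customer.",
    trigger_locations=[9],
    frames=[
        SimpleNamespace(
            name="Giving",
            trigger_location=9,
            frame_elements=[
                SimpleNamespace(name="Donor", text="The chef"),
                SimpleNamespace(name="Theme", text="the food"),
                SimpleNamespace(name="Recipient", text="to the customer"),
            ],
        )
    ],
)

if __name__ == "__main__":
    for summary in summarize_frames(example_result):
        print(summary)
```

The same helper should work on an object returned by detect_frames(), assuming its attributes match the description above.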
For more efficient bulk processing of text, there's a detect_frames_bulk method which will process a list of sentences in batches. You can control the batch size using the batch_size param. By default this is 8.
```python
frame_transformer = FrameSemanticTransformer(batch_size=16)

result = frame_transformer.detect_frames_bulk([
    "I'm getting quite hungry, but I can wait a bit longer.",
    "The chef gave the food to the customer.",
    "The hallway smelt of boiled cabbage and old rag mats.",
])
```
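To illustrate what processing "in batches" means, here is a minimal sketch of splitting a list of sentences into chunks of at most batch_size items. The `iter_batches` helper is hypothetical and is not taken from the library's source, which may batch differently internally.

```python
from typing import Iterator, List, Sequence


def iter_batches(sentences: Sequence[str], batch_size: int = 8) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most batch_size sentences."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(sentences), batch_size):
        yield list(sentences[start:start + batch_size])


if __name__ == "__main__":
    sentences = [f"Sentence number {i}." for i in range(20)]
    # With the default batch size of 8, 20 sentences form chunks of 8, 8 and 4.
    print([len(batch) for batch in iter_batches(sentences)])
```

Larger batches generally mean fewer model calls at the cost of more memory per call, so the best value depends on the hardware available.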
Note: It's not recommended to pass more than a single sentence per string to detect_frames() or detect_frames_bulk(). If you have a paragraph of text to process, it's best to split the paragraph into a list of sentences and pass the sentences as a list to detect_frames_bulk(). Only single sentences per string were used during training, so it's not clear how the model will handle multiple sentences in the same string.
```python
# ❌ Bad, don't do this
frame_transformer.detect_frames("Fuzzy Wuzzy was a bear. Fuzzy Wuzzy had no hair.")

# 👍 Do this instead
frame_transformer.detect_frames_bulk([
    "Fuzzy Wuzzy was a bear.",
    "Fuzzy Wuzzy had no hair.",
])
```
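If the input arrives as a paragraph, it has to be split into sentences first. The sketch below uses a deliberately naive regular expression and only the standard library. The `split_sentences` helper is a hypothetical name, and the rule will mis-split abbreviations such as "Dr."; a dedicated tokenizer such as NLTK's `sent_tokenize` is likely a better choice for real text.

```python
import re
from typing import List

# Naive rule: a sentence ends at ".", "!" or "?" followed by whitespace.
_SENTENCE_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def split_sentences(paragraph: str) -> List[str]:
    """Split a paragraph into sentences using a simple punctuation rule."""
    return [s for s in _SENTENCE_BOUNDARY.split(paragraph.strip()) if s]


if __name__ == "__main__":
    paragraph = "Fuzzy Wuzzy was a bear. Fuzzy Wuzzy had no hair."
    print(split_sentences(paragraph))
```

The resulting list can then be passed to detect_frames_bulk().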
Running on GPU vs CPU
By default, FrameSemanticTransformer will attempt to use a GPU if one is available. If you'd like to explicitly set whether to run on GPU vs CPU, you can pass the use_gpu param.
```python
# force the model to run on the CPU
frame_transformer = FrameSemanticTransformer(use_gpu=False)
```
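The "use a GPU if one is available" behaviour can be pictured roughly as follows. This is a sketch under assumptions, not the library's actual code: `choose_device` and `cuda_is_available` are hypothetical helpers, and the only real API used is PyTorch's `torch.cuda.is_available()`.

```python
from typing import Optional


def cuda_is_available() -> bool:
    """Report whether PyTorch can see a CUDA device; False if torch is missing."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


def choose_device(use_gpu: Optional[bool] = None) -> str:
    """Pick "cuda" or "cpu": an explicit use_gpu wins, otherwise auto-detect."""
    if use_gpu is None:
        use_gpu = cuda_is_available()
    return "cuda" if use_gpu else "cpu"


if __name__ == "__main__":
    print(choose_device())               # auto-detect
    print(choose_device(use_gpu=False))  # always "cpu"
```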
Loading Models
There are currently 2 pre-trained models available for inference, called base and small, fine-tuned from HuggingFace's t5-base and t5-small models, respectively. A locally fine-tuned T5 model can also be loaded by passing its path. If no model is specified, the base model will be used.
base_transformer = FrameSemanticTransformer("base") # this is also the default
small_transformer = FrameSemanticTransformer("small") # a smaller pretrained model which is faster to run
custom_transformer = FrameSemanticTransformer("/path/to/model") # load a custom t5 model
By default, models are lazily loaded when detect_frames() is first called. If you want to load the model sooner, you can call setup() on a FrameSemanticTransformer instance to load models immediately.
frame_transformer = FrameSemanticTransformer()
frame_transformer.setup() # load models immediately
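The lazy-loading behaviour described above follows a common pattern: defer the expensive load until it is first needed, and let an explicit setup() call trigger it earlier. The `LazyLoader` class below is a generic, hypothetical illustration of that pattern and is not the library's implementation.

```python
class LazyLoader:
    """Defer an expensive load until setup() or the first call to get()."""

    def __init__(self, load_fn):
        self._load_fn = load_fn
        self._value = None
        self._loaded = False

    def setup(self):
        """Load immediately; calling it again is a no-op."""
        if not self._loaded:
            self._value = self._load_fn()
            self._loaded = True

    def get(self):
        """Return the loaded value, loading it on first use."""
        self.setup()
        return self._value


if __name__ == "__main__":
    loader = LazyLoader(lambda: "pretend this is a large model")
    loader.setup()       # eager: pay the loading cost now
    print(loader.get())  # later calls reuse the loaded value
```

Calling setup() up front is useful in a server, for example, where the load cost is better paid at startup than on the first request.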
Contributing
Any contributions to improve this project are welcome! Please open an issue or pull request in this repo with any bugfixes / changes / improvements you have!
This project uses Black for code formatting, Flake8 for linting, and Pytest for tests. Make sure any changes you submit pass these code checks in your PR. If you have trouble getting these to run, feel free to open a pull request regardless and we can discuss further in the PR.
License
The code contained in this repo is released under an MIT license. The pretrained models, however, are released under an Apache 2.0 license in accordance with the FrameNet training data and HuggingFace's T5 base models.
Citation
If you use Frame Semantic Transformer in your work, please cite the following:
@article{chanin2023opensource,
title={Open-source Frame Semantic Parsing},
author={Chanin, David},
journal={arXiv preprint arXiv:2303.12788},
year={2023}
}
Owner
- Name: David Chanin
- Login: chanind
- Kind: user
- Location: London, UK
- Company: UCL
- Website: https://chanind.github.io
- Repositories: 97
- Profile: https://github.com/chanind
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Chanin"
    given-names: "David"
title: "Open-source Frame Semantic Parsing"
date-released: 2023-03-22
url: "https://arxiv.org/abs/2303.12788"
GitHub Events
Total
- Issues event: 1
- Watch event: 12
- Fork event: 3
Last Year
- Issues event: 1
- Watch event: 12
- Fork event: 3
Committers
Last synced: over 1 year ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| David Chanin | c****v@g****m | 130 |
| github-actions | g****s@g****m | 10 |
| github-actions | a****n@g****m | 8 |
| Jacob Striebel | 2****l | 2 |
| Shubhashis Roy Dipta | i****a@g****m | 1 |
| Curtis Ruck | r****c | 1 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 17
- Total pull requests: 15
- Average time to close issues: 20 days
- Average time to close pull requests: 1 day
- Total issue authors: 15
- Total pull request authors: 4
- Average comments per issue: 2.71
- Average comments per pull request: 0.27
- Merged pull requests: 15
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 2
- Pull request authors: 0
- Average comments per issue: 2.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- fatihbozdag (2)
- sdspieg (2)
- jerome-white (1)
- yunxiaomr (1)
- pholur (1)
- RiverDong (1)
- Zce1112zslx (1)
- ruckc (1)
- anon2014 (1)
- aalgirdas (1)
- sidsvash26 (1)
- cbjrobertson (1)
- Yangyi-Chen (1)
- shafqatvirk (1)
- neostrange (1)
Pull Request Authors
- chanind (11)
- striebel (2)
- ruckc (1)
- dipta007 (1)
Packages
- Total packages: 1
- Total downloads: pypi 121 last-month
- Total dependent packages: 0
- Total dependent repositories: 1
- Total versions: 19
- Total maintainers: 1
pypi.org: frame-semantic-transformer
Frame Semantic Parser based on T5 and FrameNet
- Homepage: https://github.com/chanind/frame-semantic-transformer
- Documentation: https://frame-semantic-transformer.readthedocs.io/
- License: MIT
- Latest release: 0.10.0 (published over 2 years ago)
Dependencies
- 1153 dependencies
- prettier ^2.6.2 development
- @heroicons/react ^1.0.6
- @testing-library/jest-dom ^5.16.4
- @testing-library/react ^13.2.0
- @testing-library/user-event ^13.5.0
- @types/jest ^27.5.1
- @types/node ^16.11.36
- @types/react ^18.0.9
- @types/react-dom ^18.0.4
- classnames ^2.3.1
- react ^18.1.0
- react-dom ^18.1.0
- react-query ^3.39.0
- react-router-dom ^6.3.0
- react-scripts 5.0.1
- react-spinners-kit ^1.9.1
- typescript ^4.6.4
- web-vitals ^2.1.4
- 1191 dependencies
- Flask ==2.1.2
- flask-cors ==3.0.10
- gunicorn ==20.1.0
- black ^22.3.0 develop
- flake8 ^4.0.1 develop
- mypy ^0.950 develop
- pytest ^5.2 develop
- syrupy ^2.0.0 develop
- nltk ^3.7
- python ^3.7
- pytorch-lightning ^1.6.2
- sentencepiece ^0.1.96
- torch ^1.11.0
- tqdm ^4.64.0
- transformers ^4.18.0
- actions/checkout v3 composite
- actions/setup-python v3 composite
- relekang/python-semantic-release v7.28.1 composite
- snok/install-poetry v1 composite
- actions/checkout v3 composite
- actions/setup-node v3 composite
- peaceiris/actions-gh-pages v3 composite