https://github.com/arekku21/progressivetransformersslp
Source code for "Progressive Transformers for End-to-End Sign Language Production" (ECCV 2020)
Repository
Basic Info
- Host: GitHub
- Owner: Arekku21
- License: other
- Default Branch: master
- Size: 1.59 MB
Statistics
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
- Releases: 0
Fork of BenSaunders27/ProgressiveTransformersSLP
Created about 3 years ago
· Last pushed over 3 years ago
https://github.com/Arekku21/ProgressiveTransformersSLP/blob/master/
# Progressive Transformers for End-to-End Sign Language Production
Source code for "Progressive Transformers for End-to-End Sign Language Production" (Ben Saunders, Necati Cihan Camgoz, Richard Bowden - ECCV 2020)
Conference video available at https://twitter.com/BenMSaunders/status/1336638886198521857
# Usage
Install required packages using the requirements.txt file.
`pip install -r requirements.txt`
To run, start `__main__.py` with the arguments `train` and `./Configs/Base.yaml`:
`python __main__.py train ./Configs/Base.yaml`
An example train.log file can be found at `./Configs/train.log`, and an example validation file at `./Configs/validations.txt`.
Back Translation model created from https://github.com/neccam/slt. Back Translation evaluation code coming soon.
# Data
Pre-processed Phoenix14T data can be requested via email at b.saunders@surrey.ac.uk. If you wish to create the data yourself, follow the steps below:
Phoenix14T data can be downloaded from https://www-i6.informatik.rwth-aachen.de/~koller/RWTH-PHOENIX-2014-T/ and skeleton joints can be extracted using OpenPose at https://github.com/CMU-Perceptual-Computing-Lab/openpose and lifted to 3D using the 2D to 3D Inverse Kinematics code at https://github.com/gopeith/SignLanguageProcessing under 3DposeEstimator.
Prepare Phoenix14T (or other sign language dataset) data as .txt files for .skel, .gloss, .txt and .files. The data format should be parallel .txt files for "src", "trg" and "files", with each line representing a new sequence:
- The "src" file contains the source sentences, with each line representing a new sentence.
- The "trg" file contains the skeleton data of each frame, with a space separating frames. The joint values should be divided by 3 to match the scaling I used. Each frame contains 150 joint values followed by a counter value, all separated by spaces. Each sequence should be on its own line. If your data contains 150 joint values per frame, please ensure that trg_size is set to 150 in the config file.
- The "files" file should contain the name of each sequence on a new line.
Examples can be found in /Data/tmp. The data path must be specified in the config file.
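As a rough illustration of the "trg" layout described above, the following Python sketch flattens one sequence of frames into a single line and checks that a line has the expected shape. The function names `format_trg_line` and `count_frames` are hypothetical and not part of this repository. The counter is assumed here to run linearly from 0 to 1 across the sequence, and the five-decimal formatting is an arbitrary choice; compare against the examples in /Data/tmp before relying on either.

```python
from typing import List, Sequence

JOINT_VALUES_PER_FRAME = 150  # from the README; should match trg_size in the config
SCALE_DIVISOR = 3.0           # the README asks for joint values to be divided by 3


def format_trg_line(frames: Sequence[Sequence[float]]) -> str:
    """Flatten one sequence of frames into a single space-separated line.

    Each frame contributes its scaled joint values followed by one counter value.
    ASSUMPTION: the counter increases linearly from 0 to 1 over the sequence.
    """
    n_frames = len(frames)
    values: List[str] = []
    for i, frame in enumerate(frames):
        if len(frame) != JOINT_VALUES_PER_FRAME:
            raise ValueError(
                f"frame {i} has {len(frame)} values, expected {JOINT_VALUES_PER_FRAME}"
            )
        # A single-frame sequence gets a counter of 0 to avoid dividing by zero.
        counter = i / (n_frames - 1) if n_frames > 1 else 0.0
        values.extend(f"{v / SCALE_DIVISOR:.5f}" for v in frame)
        values.append(f"{counter:.5f}")
    return " ".join(values)


def count_frames(trg_line: str) -> int:
    """Return the number of frames in a trg line, or raise if it is malformed."""
    per_frame = JOINT_VALUES_PER_FRAME + 1  # joint values plus the counter
    n_values = len(trg_line.split())
    if n_values == 0 or n_values % per_frame != 0:
        raise ValueError(
            f"line has {n_values} values, not a positive multiple of {per_frame}"
        )
    return n_values // per_frame
```

Writing one such line per sequence, in the same order as the "src" and "files" entries, keeps the three files parallel. Running `count_frames` over every line of an existing "trg" file is a cheap way to catch a frame with a missing counter or joint value before training.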
# Pre-Trained Model
A pre-trained Progressive Transformer checkpoint can be downloaded from https://www.dropbox.com/s/l4xmnybp7luz0l3/PreTrained_PTSLP_Model.ckpt?dl=0.
This model has a size of ```num_layers: 2```, ```num_heads: 4``` and ```embedding_dim: 512```, as outlined in ```./Configs/Base.yaml```. It has been pre-trained on the full PHOENIX14T dataset with the data format as above. The relevant train.log and validations.txt files can be found in ```./Configs```.
To initialise a model from this checkpoint, pass the ```--ckpt ./PreTrained_PTSLP_Model.ckpt``` argument to either ```train``` or ```test``` modes. Additionally, to initialise the correct src_embed size, the config argument ```src_vocab: "./Configs/src_vocab.txt"``` must be set to the location of the src_vocab.txt, found under ```./Configs```. Please open an issue if this checkpoint cannot be downloaded or loaded.
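For scripted runs, the invocation described above can be assembled in Python. This is only a sketch: `build_command` is a hypothetical helper, not part of the repository, and placing `--ckpt` after the config path is an assumption about how the argument parser accepts it.

```python
import sys
from typing import List, Optional


def build_command(mode: str, config_path: str, ckpt_path: Optional[str] = None) -> List[str]:
    """Assemble the argument list for __main__.py in train or test mode."""
    if mode not in ("train", "test"):
        raise ValueError(f"mode must be 'train' or 'test', got {mode!r}")
    # Use the interpreter that is running this script.
    cmd = [sys.executable, "__main__.py", mode, config_path]
    if ckpt_path is not None:
        # ASSUMPTION: --ckpt may follow the positional arguments.
        cmd += ["--ckpt", ckpt_path]
    return cmd
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Remember that `src_vocab` must still be set in the config file for the checkpoint to load.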
# Reference
If you use this code in your research, please cite the following [papers](https://arxiv.org/abs/2004.14874):
```
@inproceedings{saunders2020progressive,
title = {{Progressive Transformers for End-to-End Sign Language Production}},
author = {Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2020}}
@inproceedings{saunders2020adversarial,
title = {{Adversarial Training for Multi-Channel Sign Language Production}},
author = {Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
booktitle = {Proceedings of the British Machine Vision Conference (BMVC)},
year = {2020}}
@article{saunders2021continuous,
title = {{Continuous 3D Multi-Channel Sign Language Production via Progressive Transformers and Mixture Density Networks}},
author = {Saunders, Ben and Camgoz, Necati Cihan and Bowden, Richard},
journal = {International Journal of Computer Vision (IJCV)},
year = {2021}}
```
## Acknowledgements
This work received funding from the SNSF Sinergia project 'SMILE' (CRSII2 160811), the European Union's Horizon 2020 research and innovation programme under grant agreement no. 762021 'Content4All' and the EPSRC project 'ExTOL' (EP/R03298X/1). This work reflects only the authors' view and the Commission is not responsible for any use that may be made of the information it contains. We would also like to thank NVIDIA Corporation for their GPU grant.
Owner
- Name: Alec Vince
- Login: Arekku21
- Kind: user
- Location: At home snuggling with my cats
- Company: the interesting company
- Twitter: Arekku21
- Repositories: 4
- Profile: https://github.com/Arekku21
Cod-ing...A fish verb idk i am not studying that.