subaligner

Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers. https://subaligner.readthedocs.io/

https://github.com/baxtree/subaligner

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
✓ DOI references: Found 2 DOI reference(s) in README
✓ Academic publication links: Links to zenodo.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.4%) to scientific vocabulary

Keywords

advanced-substation-alpha alignment captions ebu-stl microdvd mpl2 sami sbv scc subrip substation-alpha subtitle-conversion subtitle-synchronization subtitle-translation subtitles tmp transcription ttml voice-activity-detection webvtt

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: baxtree
License: mit
Language: Python
Default Branch: master
Homepage: https://hub.docker.com/r/baxtree/subaligner
Size: 103 MB

Statistics

Stars: 481
Watchers: 14
Forks: 20
Open Issues: 1
Releases: 32

Created about 6 years ago · Last pushed 7 months ago

Metadata Files

Readme Contributing License Code of conduct Citation

README.md

Supported Formats

Subtitle: SubRip, TTML, WebVTT, (Advanced) SubStation Alpha, MicroDVD, MPL2, TMP, EBU STL, SAMI, SCC and SBV.

Video/Audio: MP4, WebM, Ogg, 3GP, FLV, MOV, Matroska, MPEG TS, WAV, MP3, AAC, FLAC, etc.

:information_source: Subaligner relies on file extensions as default hints to process a wide range of audiovisual or subtitle formats. It is recommended to use extensions widely accepted by the community to ensure compatibility.
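To illustrate what using the extension as a hint means, here is a minimal sketch that guesses a subtitle format from a file name. The mapping table and the function name guess_subtitle_format are hypothetical and chosen for illustration; they are not Subaligner's internal lookup, and the extensions listed are only common community conventions.

```python
from pathlib import Path
from typing import Optional

# Hypothetical extension-to-format table for illustration only;
# this is NOT Subaligner's internal mapping.
SUBTITLE_FORMATS = {
    ".srt": "SubRip",
    ".ttml": "TTML",
    ".vtt": "WebVTT",
    ".ssa": "SubStation Alpha",
    ".ass": "Advanced SubStation Alpha",
    ".smi": "SAMI",
    ".scc": "SCC",
    ".sbv": "SBV",
}


def guess_subtitle_format(path: str) -> Optional[str]:
    """Return a format name guessed from the file extension, or None."""
    return SUBTITLE_FORMATS.get(Path(path).suffix.lower())
```

A file named with an unusual extension returns None in this sketch, which is the situation the recommendation above is meant to avoid.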

Dependency

Required by the basic installation: FFmpeg

Install FFmpeg

apt-get install ffmpeg

brew install ffmpeg

Basic Installation

Install from PyPI

pip install -U pip && pip install -U setuptools wheel

pip install subaligner

Install from source

git clone git@github.com:baxtree/subaligner.git && cd subaligner

pip install -U pip && pip install -U setuptools

pip install .

:information_source: It is highly recommended to create a virtual environment prior to installation.
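One way to create such an environment is Python's standard venv module. The sketch below is a small wrapper around it; the function name create_virtualenv and the directory name .venv are assumptions made for the example, and any other environment manager would serve equally well.

```python
import venv
from pathlib import Path


def create_virtualenv(target: str, with_pip: bool = True) -> Path:
    """Create a virtual environment at `target` and return its path."""
    # EnvBuilder is the standard-library API behind `python -m venv`.
    venv.EnvBuilder(with_pip=with_pip).create(target)
    return Path(target)


if __name__ == "__main__":
    # ".venv" is an arbitrary directory name chosen for this example.
    print(create_virtualenv(".venv"))
```

After activating the environment, the pip commands above install into it rather than into the system Python.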

Installation with Optional Packages Supporting Additional Features

Install dependencies for enabling translation and transcription

pip install 'subaligner[llm]'

Install dependencies for enabling forced alignment

pip install 'setuptools<65.0.0'

pip install 'subaligner[stretch]'

Install dependencies for setting up the development environment

pip install 'setuptools<65.0.0'

pip install 'subaligner[dev]'

Install all extra dependencies

pip install 'setuptools<65.0.0'

pip install 'subaligner[harmony]'

Note that subaligner[stretch], subaligner[dev] and subaligner[harmony] require eSpeak to be pre-installed:

Install eSpeak

apt-get install espeak libespeak1 libespeak-dev espeak-data

brew install espeak

Also, if Python 3.12+ is used, you will need to install the following patch for those extras to fully function:

Install patched aeneas

pip install git+https://github.com/baxtree/aeneas.git@v1.7.3.1#egg=aeneas

Container Support

If you prefer using a containerised environment over installing everything locally:

Run subaligner with a container

docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner bash

Windows users can install Subaligner under Windows Subsystem for Linux (WSL). Alternatively, Docker Desktop can be used to pull and run the image. Assuming your media assets are stored under d:\media, open the built-in Command Prompt, PowerShell, or Windows Terminal:

Run the subaligner container on Windows

docker pull baxtree/subaligner

docker run -v "/d/media":/media -w "/media" -it baxtree/subaligner bash

Usage

Single-stage alignment (high-level shift with lower latency)

subaligner -m single -v video.mp4 -s subtitle.srt

subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt

Dual-stage alignment (low-level shift with higher latency)

subaligner -m dual -v video.mp4 -s subtitle.srt

subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt
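When alignment is driven from a script rather than typed by hand, the commands above can be assembled as an argument list. The following is a minimal sketch that uses only the -m, -v, -s and -o options shown above; the helper name build_align_command is hypothetical and not part of Subaligner.

```python
from typing import List, Optional


def build_align_command(mode: str, video: str, subtitle: str,
                        output: Optional[str] = None) -> List[str]:
    """Build the argument list for a single- or dual-stage alignment run."""
    if mode not in ("single", "dual"):
        raise ValueError("mode must be 'single' or 'dual'")
    argv = ["subaligner", "-m", mode, "-v", video, "-s", subtitle]
    if output is not None:
        # -o sets the path of the aligned subtitle file.
        argv += ["-o", output]
    return argv
```

The returned list can be passed to subprocess.run(argv, check=True) once Subaligner is installed; passing a list rather than a single string avoids shell-quoting problems with paths that contain spaces.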

Generate subtitles by transcribing audiovisual files

subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf small -o subtitle_aligned.srt

subaligner -m transcribe -v video.mp4 -ml zho -mr whisper -mf medium -o subtitle_aligned.srt

Pass in a global prompt for the entire audio transcription

subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" -o subtitle_aligned.srt

Use the full subtitle content as a prompt

subaligner -m transcribe -v video.mp4 -s subtitle.srt -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt

Use the previous subtitle segment as the prompt when transcribing the following segment

subaligner -m transcribe -v video.mp4 -s subtitle.srt --use_prior_prompting -ml eng -mr whisper -mf turbo -o subtitle_aligned.srt

(For details on crafting prompts for transcription, please refer to the Whisper prompting guide.)

Alignment on segmented plain texts (double newlines as the delimiter)

subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt

subaligner -m script -v https://example.com/video.mp4 -s https://example.com/subtitle.txt -o subtitle_aligned.srt
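The input for script mode is plain text whose segments are separated by double newlines. As a rough illustration of that convention, the sketch below splits such a text into segments. It is not Subaligner's own parser, and treating a run of several blank lines as one delimiter is an assumption of this example.

```python
import re
from typing import List


def split_script(text: str) -> List[str]:
    """Split plain text into segments separated by one or more blank lines."""
    normalised = text.replace("\r\n", "\n").strip()
    if not normalised:
        return []
    # A blank line (possibly containing whitespace) ends a segment.
    return [segment.strip() for segment in re.split(r"\n\s*\n", normalised)]
```

Each resulting segment would correspond to one cue in the aligned output, so a single newline inside a segment keeps the lines together.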

Generate JSON raw subtitle with per-word timings

subaligner -m transcribe -v video.mp4 -ml eng -mr whisper -mf turbo -ip "your initial prompt" --word_time_codes -o raw_subtitle.json

subaligner -m script -v video.mp4 -s subtitle.txt --word_time_codes -o raw_subtitle.json

Alignment on multiple subtitles against a single media file

subaligner -m script -v video.mp4 -s subtitle_lang_1.txt -s subtitle_lang_2.txt

subaligner -m script -v video.mp4 -s subtitle_lang_1.txt subtitle_lang_2.txt

Alignment on embedded subtitles

subaligner -m single -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt

subaligner -m dual -v video.mkv -s embedded:stream_index=0 -o subtitle_aligned.srt
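The embedded:stream_index=0 value in the commands above takes the place of a subtitle file path. A calling script may want to tell the two apart before building a command; the sketch below does so with a regular expression. The helper name parse_embedded_selector is hypothetical, and how Subaligner itself interprets the selector is not shown here.

```python
import re
from typing import Optional

# Matches the selector form used in the commands above, e.g. "embedded:stream_index=0".
_EMBEDDED = re.compile(r"^embedded:stream_index=(\d+)$")


def parse_embedded_selector(value: str) -> Optional[int]:
    """Return the stream index if `value` is an embedded selector, else None."""
    match = _EMBEDDED.match(value)
    return int(match.group(1)) if match else None
```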

Translative alignment with the ISO 639-3 language code pair (src,tgt)

subaligner --languages

subaligner -m single -v video.mp4 -s subtitle.srt -t src,tgt

subaligner -m dual -v video.mp4 -s subtitle.srt -t src,tgt

subaligner -m script -v video.mp4 -s subtitle.txt -o subtitle_aligned.srt -t src,tgt

subaligner -m dual -v video.mp4 -s subtitle.srt -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt

subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-mbart -tf large -o subtitle_aligned.srt -t src,tgt

subaligner -m dual -v video.mp4 -s subtitle.srt -tr facebook-m2m100 -tf small -o subtitle_aligned.srt -t src,tgt

subaligner -m dual -v video.mp4 -s subtitle.srt -tr whisper -tf small -o subtitle_aligned.srt -t src,eng
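The -t option expects a comma-separated pair of ISO 639-3 codes, which are three lowercase letters each. The sketch below checks only that shape before a command is run; it does not verify that a translation model exists for the pair (subaligner --languages remains the place to look that up). The function name parse_language_pair is an assumption of this example.

```python
import re
from typing import Tuple

# ISO 639-3 identifiers consist of exactly three lowercase letters.
_ISO_639_3 = re.compile(r"^[a-z]{3}$")


def parse_language_pair(pair: str) -> Tuple[str, str]:
    """Split 'src,tgt' into two codes, checking their shape only."""
    parts = [part.strip() for part in pair.split(",")]
    if len(parts) != 2 or not all(_ISO_639_3.match(part) for part in parts):
        raise ValueError(f"expected 'src,tgt' with three-letter codes, got {pair!r}")
    return parts[0], parts[1]
```

For instance, "eng,zho" passes the check, whereas the two-letter form "en,zh" is rejected.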

Transcribe audiovisual files and generate translated subtitles

subaligner -m transcribe -v video.mp4 -ml src -mr whisper -mf small -tr helsinki-nlp -o subtitle_aligned.srt -t src,tgt

Shift subtitle manually by offset in seconds

subaligner -m shift --subtitle_path subtitle.srt -os 5.5

subaligner -m shift --subtitle_path subtitle.srt -os -5.5 -o subtitle_shifted.srt
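Conceptually, a manual shift adds the same offset to every start and end time. The sketch below shows that arithmetic for SubRip timestamps using only the standard library. It is an illustration rather than Subaligner's implementation; the function name shift_srt and the choice to clamp negative results to zero are assumptions of this example.

```python
import re
from datetime import timedelta

# SubRip timestamps look like 00:01:02,345 (hours:minutes:seconds,milliseconds).
_TIMESTAMP = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")
_ONE_MS = timedelta(milliseconds=1)


def shift_srt(text: str, offset_seconds: float) -> str:
    """Shift every cue time in SubRip text by `offset_seconds`."""
    offset = timedelta(seconds=offset_seconds)

    def move(match: "re.Match[str]") -> str:
        h, m, s, ms = (int(group) for group in match.groups())
        moved = timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms) + offset
        total_ms = max(0, moved // _ONE_MS)  # assumption: clamp at zero
        hours, rest = divmod(total_ms, 3_600_000)
        minutes, rest = divmod(rest, 60_000)
        seconds, millis = divmod(rest, 1000)
        return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"

    shifted = []
    for line in text.splitlines(keepends=True):
        # Only timing lines contain "-->"; cue text is left untouched.
        shifted.append(_TIMESTAMP.sub(move, line) if "-->" in line else line)
    return "".join(shifted)
```

With an offset of 5.5, a cue running from 00:00:01,000 to 00:00:02,500 becomes 00:00:06,500 to 00:00:08,000.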

Run batch alignment against directories

subaligner_batch -m single -vd videos/ -sd subtitles/ -od aligned_subtitles/

subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/

subaligner_batch -m dual -vd videos/ -sd subtitles/ -od aligned_subtitles/ -of ttml
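Before launching a batch run, it can help to check which media files have a matching subtitle. The sketch below pairs files from two directories by base name. Matching on the file stem is an assumption made for this example and is not necessarily how subaligner_batch pairs its inputs; the helper name pair_by_stem is hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def pair_by_stem(video_dir: str, subtitle_dir: str) -> List[Tuple[Path, Path]]:
    """Pair each video with the subtitle file that shares its base name."""
    subtitles = {
        path.stem: path
        for path in sorted(Path(subtitle_dir).iterdir())
        if path.is_file()
    }
    pairs = []
    for video in sorted(Path(video_dir).iterdir()):
        # Videos without a same-named subtitle are skipped in this sketch.
        if video.is_file() and video.stem in subtitles:
            pairs.append((video, subtitles[video.stem]))
    return pairs
```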

Run alignments with pipx

pipx run subaligner -m single -v video.mp4 -s subtitle.srt

pipx run subaligner -m dual -v video.mp4 -s subtitle.srt

Run the module as a script

python -m subaligner -m single -v video.mp4 -s subtitle.srt

python -m subaligner -m dual -v video.mp4 -s subtitle.srt

Run alignments with the docker image

docker pull baxtree/subaligner

docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m single -v video.mp4 -s subtitle.srt

docker run -v `pwd`:`pwd` -w `pwd` -it baxtree/subaligner subaligner -m dual -v video.mp4 -s subtitle.srt

docker run -it baxtree/subaligner subaligner -m single -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt

docker run -it baxtree/subaligner subaligner -m dual -v https://example.com/video.mp4 -s https://example.com/subtitle.srt -o subtitle_aligned.srt

The aligned subtitle will be saved at subtitle_aligned.srt. To obtain the subtitle in raw JSON format for downstream processing, replace the output file extension with .json. For details on the CLIs, run subaligner -h or subaligner_batch -h; for the additional utilities, run subaligner_convert -h, subaligner_train -h and subaligner_tune -h. subaligner_1pass and subaligner_2pass are shortcuts for running subaligner with the -m single and -m dual options, respectively.

Advanced Usage

You can train a new model with your own audiovisual files and subtitle files:

Train a custom model

subaligner_train -vd VIDEO_DIRECTORY -sd SUBTITLE_DIRECTORY -tod TRAINING_OUTPUT_DIRECTORY

Then you can apply it to your subtitle synchronisation with the aforementioned commands. For more details on how to train and tune your own model, please refer to Subaligner Docs.

For larger media files taking longer to process, you can reconfigure various timeouts using the following:

Options for tuning timeouts

-mpt [Maximum waiting time in seconds when processing media files]
-sat [Maximum waiting time in seconds when aligning each segment]
-fet [Maximum waiting time in seconds when embedding features for training]
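The timeout options above each take a value in seconds. As a small sketch of how a wrapper script might append them to a command, the helper below turns optional values into arguments. The function name timeout_args and its parameter names are hypothetical; only the -mpt, -sat and -fet flags come from the list above.

```python
from typing import List, Optional


def timeout_args(media_process: Optional[int] = None,
                 segment_alignment: Optional[int] = None,
                 feature_embedding: Optional[int] = None) -> List[str]:
    """Turn optional timeouts (in seconds) into command-line arguments."""
    args: List[str] = []
    for flag, value in (("-mpt", media_process),
                        ("-sat", segment_alignment),
                        ("-fet", feature_embedding)):
        if value is None:
            continue  # leave the tool's default in place
        if value <= 0:
            raise ValueError(f"{flag} expects a positive number of seconds")
        args += [flag, str(value)]
    return args
```

For example, timeout_args(media_process=600, segment_alignment=60) yields the arguments -mpt 600 -sat 60, which can be appended to an alignment command; the values 600 and 60 are arbitrary.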

Anatomy

Subtitles can be out of sync with their companion audiovisual media files for a variety of reasons, including latency introduced by speech-to-text on live streams, or calibration and rectification involving human intervention during post-production.

A model has been trained with synchronised video and subtitle pairs and is later used for predicting shifting offsets and directions under the guidance of a dual-stage aligning approach.

First Stage (Global Alignment):

Second Stage (Parallelised Individual Alignment):
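The two stages can be pictured as two kinds of time shift: one offset applied to the whole subtitle, followed by a separate offset for each cue. The sketch below shows only that arithmetic. In Subaligner the offsets are predicted by the trained model, whereas here they are plain inputs; the Cue type and both function names are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# A cue is represented as (start_seconds, end_seconds) in this sketch.
Cue = Tuple[float, float]


def apply_global_shift(cues: Sequence[Cue], offset: float) -> List[Cue]:
    """First stage (sketch): move every cue by the same offset."""
    return [(start + offset, end + offset) for start, end in cues]


def apply_individual_shifts(cues: Sequence[Cue],
                            offsets: Sequence[float]) -> List[Cue]:
    """Second stage (sketch): move each cue by its own offset."""
    if len(cues) != len(offsets):
        raise ValueError("one offset is required per cue")
    return [(start + delta, end + delta)
            for (start, end), delta in zip(cues, offsets)]
```

Because each cue in the second function is handled independently of the others, that step lends itself to parallel execution, which fits the "Parallelised Individual Alignment" label above.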

Acknowledgement

This tool wouldn't be possible without the following packages: librosa, tensorflow, scikit-learn, pycaption, pysrt, pysubs2, aeneas, transformers and openai-whisper.

Thanks to Alan Robinson and Nigel Megitt for their invaluable feedback.

Owner

Name: Xi Bai
Login: baxtree
Kind: user

Website: http://baixi.info
Repositories: 14
Profile: https://github.com/baxtree

Citation (CITATION.cff)

cff-version: 1.2.0
message: "This repository contains the implementation of Subaligner, which provides a one-stop solution on automatic subtitle synchronisation and translation with pretrained deep neural networks, forced alignments and transformers. If you use this software, please cite it via 'Cite this repository' on GitHub."
authors:
  - family-names: Bai
    given-names: Xi
    orcid: https://orcid.org/0000-0002-2177-8458
title: "Subaligner: Towards Automated Subtitle Alignment"
doi: 10.5281/zenodo.5603083
date-released: 2021-10-28
url: "https://github.com/baxtree/subaligner"

GitHub Events

Total

Create event: 7
Release event: 4
Issues event: 13
Watch event: 42
Delete event: 5
Issue comment event: 26
Push event: 30
Pull request event: 4
Fork event: 2

Last Year

Create event: 7
Release event: 4
Issues event: 13
Watch event: 42
Delete event: 5
Issue comment event: 26
Push event: 30
Pull request event: 4
Fork event: 2

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 6
Total pull requests: 2
Average time to close issues: about 1 month
Average time to close pull requests: N/A
Total issue authors: 6
Total pull request authors: 1
Average comments per issue: 3.33
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 2

Past Year

Issues: 6
Pull requests: 2
Average time to close issues: about 1 month
Average time to close pull requests: N/A
Issue authors: 6
Pull request authors: 1
Average comments per issue: 3.33
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 2

Top Authors

Issue Authors

Zane5 (1)
starptr (1)
raeulein (1)
tdworz (1)
Tarzoq (1)
Johndirr (1)
ToThePointTechDev (1)
codefaux (1)
cleverestx (1)
fieri (1)
danbeibei (1)
ClaireCJS (1)
bobveringa (1)

Pull Request Authors

dependabot[bot] (2)

Top Labels

Issue Labels

Pull Request Labels

dependencies (2) python (2)

Packages

Total packages: 1
Total downloads:
- pypi 1,342 last-month
Total docker downloads: 144

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 33
Total maintainers: 1

pypi.org: subaligner

Automatically synchronize and translate subtitles, or create new ones by transcribing, using pre-trained DNNs, Forced Alignments and Transformers.

Homepage: https://github.com/baxtree/subaligner
Documentation: https://subaligner.readthedocs.io/en/latest/
License: MIT License
Latest release: 0.3.10
published 7 months ago

Versions: 33
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 1,342 Last month
Docker Downloads: 144

Rankings

Docker downloads count: 3.8%

Dependent packages count: 10.1%

Downloads: 11.8%

Average: 11.8%

Dependent repos count: 21.6%

Maintainers (1)

baxtree

Last synced: 6 months ago

Dependencies

Pipfile pypi

coverage ==4.5.1 develop
line-profiler ==3.0.2 develop
mock ==2.0.0 develop
mypy ==0.790 develop
parameterized ==0.8.1 develop
pex <=2.1.80 develop
pycodestyle ==2.5.0 develop
pygments ~=2.7.4 develop
pylint ~=2.8.2 develop
radish-bdd ~=0.13.3 develop
scikit-build ==0.11.1 develop
snakeviz ==2.1.0 develop
sphinx ==3.3.1 develop
sphinx-rtd-theme ==0.5.0 develop
tox ~=3.23.0 develop
twine >=3.1.1 develop
Cython ~=0.29.22
HeapDict ==1.0.0
Keras-Applications >=1.0.8
Keras-Preprocessing >=1.0.9
Markdown ==2.6.11
PyYAML >=4.2b1
Werkzeug >=0.15.3
absl-py ~=0.10
aeneas ==1.7.3.0
astor ==0.7.1
astroid ~=2.5.6
audioread ==2.1.5
beautifulsoup4 <4.9.0
bleach ==3.3.0
cachetools ==3.1.1
captionstransformer ~=1.2.1
certifi ==2019.11.28
chardet ==3.0.4
click ==5.1
cloudpickle ==0.5.3
cycler ==0.10.0
dask <2022.1.0
decorator ==4.3.0
distributed ==1.13.0
filelock <4.0.0
google-auth ==1.27.0
google-auth-oauthlib ==0.4.2
google-pasta ~=0.2
graphviz ==0.8.3
h5py <=3.6.0
html5lib ==1.0b9
hyperopt ==0.2.4
idna ==2.8
isort ==4.3.4
joblib ==0.11
kiwisolver ==1.0.1
lazy-object-proxy ==1.4.3
le-pycaption ==2.2.0a1
librosa >=0.8.0
locket ==0.2.0
mccabe ==0.6.1
msgpack-python ==0.5.6
numba >=0.50.0
numpy <1.24.0
oauthlib ==3.1.0
pbr ==4.0.2
pluggy ==0.13.1
psutil ==5.6.7
py ==1.10.0
pyasn1 ==0.4.8
pyasn1-modules ==0.2.7
pydot ==1.2.4
pydot-ng ==1.0.0
pydotplus ==2.0.2
pylint ==2.5.0
pyparsing ==2.2.0
pyprof2calltree ==1.4.3
pysrt ==1.1.1
pystack-debugger ==0.8.0
pysubs2 <=1.4.2
python-dateutil ==2.7.2
pytz ==2018.4
requests ~=2.25.1
requests-oauthlib ==1.3.0
rsa ==4.7
scikit-learn >=0.19.1
scipy <=1.8.1
sentencepiece ~=0.1.95
setuptools >=41.0.0
six ~=1.15.0
tblib ==1.3.2
tensorflow >=1.15.5,<2.9
termcolor ==1.1.0
toml ==0.10.0
toolz ==0.9.0
torch <=1.12.0
tornado ==5.1.0
transformers ~=4.5.1
typing-extensions ~=3.7.0
urllib3 ~=1.26.5
zict ==0.1.3
zipp ==0.6.0

requirements-dev.txt pypi

coverage ==5.5 development
line-profiler ==3.1.0 development
mock ==4.0.3 development
mypy ==0.931 development
parameterized ==0.8.1 development
pex <=2.1.80 development
pycodestyle ==2.5.0 development
pygments ==2.7.4 development
pylint * development
radish-bdd * development
scikit-build ==0.11.1 development
snakeviz ==2.1.0 development
tox * development
twine >=3.1.1 development
types-requests ==2.27.9 development
types-setuptools ==57.4.9 development
typing-extensions <4.0.0 development

requirements-site.txt pypi

docutils *
sphinx ==3.3.1
sphinx-rtd-theme ==0.5.0

requirements-stretch.txt pypi

aeneas *

requirements.txt pypi

Cython *
HeapDict ==1.0.0
Keras-Applications >=1.0.8
Keras-Preprocessing >=1.0.9
Markdown ==2.6.11
PyYAML >=4.2b1
Werkzeug >=0.15.3
absl-py *
astor ==0.7.1
audioread ==2.1.5
beautifulsoup4 <4.9.0
bleach ==3.3.0
cachetools ==3.1.1
captionstransformer *
cchardet ==2.1.7
certifi ==2019.11.28
chardet ==3.0.4
click ==5.1
cloudpickle *
cycler ==0.10.0
dask <2022.1.0
decorator ==4.3.0
distributed ==1.13.0
filelock <4.0.0
google-auth ==1.27.0
google-auth-oauthlib ==0.4.2
google-pasta *
graphviz ==0.8.3
h5py <=3.6.0
html5lib ==1.0b9
hyperopt ==0.2.4
idna ==2.8
isort ==4.3.4
kiwisolver ==1.0.1
lazy-object-proxy ==1.4.3
le-pycaption ==2.2.0a1
librosa >=0.8.0
locket ==0.2.0
mccabe ==0.6.1
msgpack-python ==0.5.6
networkx >=2.5.1
numba >=0.50.0
numpy <1.24.0
oauthlib ==3.1.0
pbr ==4.0.2
pluggy ==0.13.1
psutil ==5.6.7
py ==1.10.0
pyasn1 ==0.4.8
pyasn1-modules ==0.2.7
pydot ==1.2.4
pydot-ng ==1.0.0
pydotplus ==2.0.2
pyprof2calltree ==1.4.3
pysrt ==1.1.1
pystack-debugger ==0.8.0
pysubs2 <=1.4.2
pytz ==2018.4
requests *
requests-oauthlib ==1.3.0
rsa ==4.7
scikit-learn *
scipy <=1.8.1
setuptools >=41.0.0
six *
tblib ==1.3.2
tensorflow >=1.15.5,<2.9
termcolor ==1.1.0
toml ==0.10.0
toolz ==0.9.0
tornado ==5.1.0
urllib3 *
zict ==0.1.3
zipp ==0.6.0

.github/workflows/ci-pipeline.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/dockerhub.yml actions

actions/checkout v3 composite
docker/build-push-action v4 composite
docker/build-push-action v2 composite
docker/login-action v2 composite
docker/setup-buildx-action v2 composite
docker/setup-qemu-action v2 composite

.github/workflows/lint-charts.yml actions

actions/checkout v3 composite
mamezou-tech/setup-helmfile v1.2.0 composite

docker/docker-compose.yml docker

baxtree/subaligner ${SUBALIGNER_VERSION}.el7
baxtree/subaligner ${SUBALIGNER_VERSION}.u20
baxtree/subaligner ${SUBALIGNER_VERSION}.u22
baxtree/subaligner ${SUBALIGNER_VERSION}.arch
baxtree/subaligner ${SUBALIGNER_VERSION}.deb11
baxtree/subaligner ${SUBALIGNER_VERSION}.fed34

pyproject.toml pypi

requirements-arm64.txt pypi

HeapDict ==1.0.0
Markdown ==2.6.11
PyYAML >=4.2b1
Werkzeug >=0.15.3
astor ==0.7.1
beautifulsoup4 <4.9.0
bleach ==3.3.0
cachetools ==3.1.1
captionstransformer *
certifi ==2019.11.28
chardet ==3.0.4
click ==5.1
cloudpickle *
cycler ==0.10.0
decorator ==4.3.0
distributed ==1.13.0
filelock <4.0.0
google-auth-oauthlib ==0.4.2
google-pasta *
graphviz ==0.8.3
h5py <4.0.0
html5lib ==1.0b9
hyperopt ==0.2.4
idna ==2.8
isort ==4.3.4
joblib >=1.2.0
keras *
le-pycaption ==2.2.0a1
librosa <0.10.0
locket ==0.2.0
mccabe ==0.6.1
networkx >=2.5.1
numba >=0.50.0
numpy <1.24.0
oauthlib ==3.1.0
pbr ==4.0.2
pluggy ==0.13.1
protobuf <4.0
psutil ==5.6.7
py ==1.10.0
pyasn1 ==0.4.8
pyasn1-modules ==0.2.7
pycountry *
pydot ==1.2.4
pydot-ng ==1.0.0
pydotplus ==2.0.2
pyprof2calltree ==1.4.3
pysrt ==1.1.1
pystack-debugger ==0.8.0
pysubs2 <=1.4.2
pytz ==2018.4
rsa ==4.7
scikit-learn <1.2.0
scipy <=1.8.1
setuptools >=41.0.0
six *
tblib ==1.3.2
tensorflow-macos *
tensorflow-metal *
termcolor ==1.1.0
toml ==0.10.0
toolz ==0.9.0
tornado ==5.1.0
urllib3 *
zict ==0.1.3
zipp ==0.6.0

requirements-llm.txt pypi

openai-whisper ==20230314
sentencepiece *
torch <1.13.0
transformers <4.27.0

setup.py pypi