audim

An animation and video rendering engine for audio-based and voice-based podcast videos.

https://github.com/mratanusarkar/audim

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (12.6%) to scientific vocabulary

Last synced: 10 months ago

Repository

Basic Info

Host: GitHub
Owner: mratanusarkar
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://mratanusarkar.github.io/audim/
Size: 14.1 MB

Statistics

Stars: 5
Watchers: 1
Forks: 0
Open Issues: 3
Releases: 1

Created over 1 year ago · Last pushed about 1 year ago

Metadata Files

Readme License Citation

# Audim ✨

[![Documentation](https://img.shields.io/badge/docs-mkdocs-4baaaa.svg?logo=materialformkdocs&logoColor=white)](https://mratanusarkar.github.io/audim)
[![PyPI version](https://img.shields.io/pypi/v/audim.svg?color=blue&logo=pypi&logoColor=white)](https://pypi.org/project/audim/)
[![Python versions](https://img.shields.io/pypi/pyversions/audim.svg?color=blue&logo=python&logoColor=white)](https://pypi.org/project/audim/)
[![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/mratanusarkar/audim/deploy.yml?logo=githubactions&logoColor=white)](https://github.com/mratanusarkar/audim/actions)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-orange.svg?logo=apache&logoColor=white)](https://github.com/mratanusarkar/audim/blob/main/LICENSE)
[![Author: Atanu Sarkar](https://img.shields.io/badge/Author-Atanu%20Sarkar-708FCC?logo=github&logoColor=white)](https://github.com/mratanusarkar)
[![Citation](https://img.shields.io/badge/Cite%20this-Repository-green?logo=gitextensions&logoColor=white)](https://github.com/mratanusarkar/audim/blob/main/CITATION.cff)
[![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2Fmratanusarkar%2Faudim&label=view&labelColor=%235e5e5e&countColor=%237C8AA0&style=flat&labelStyle=lower)](https://visitorbadge.io/status?path=https%3A%2F%2Fgithub.com%2Fmratanusarkar%2Faudim)
[![PyPI Total Downloads](https://static.pepy.tech/badge/audim)](https://pepy.tech/projects/audim)
[![PyPI Monthly Downloads](https://img.shields.io/pypi/dm/audim?style=flat&color=%231F86BF)](https://pypistats.org/packages/audim)

**Au**dio Po**d**cast An**im**ation Engine

> _An animation and video rendering engine for audio-based and voice-based podcast videos._

| [Documentation](https://mratanusarkar.github.io/audim) | [Features](#-features) | [Getting Started](#-getting-started) | [Quick Links](#-quick-links) |

🚀 Demo

https://github.com/user-attachments/assets/634df0ca-77ee-448b-ac35-f4eb3e4261b9

A sample podcast video generated with Audim

[!NOTE]

For this example, we have transformed a conversation between Grant Sanderson (from 3Blue1Brown) and Sal Khan (from Khan Academy) from YouTube into a visually engaging podcast video using Audim.

See docs/devblog/v0.0.7 for more details on how this video was generated.

🔗 Quick Links

Getting Started
- See Setup and make sure your environment is set up correctly before use.
- For developers and contributors, see Development.
API Documentation
- See API Docs for the audim API documentation.
Usage and Examples
- See Usage for usage examples.
Dev Blog
- See Dev Blog for more insight into the project's development.
- See Changelog for the project's release history.

🎯 Introduction

Audim is an engine for precise programmatic animation and rendering of podcast videos from audio-based and voice-based file recordings.

✨ Features

💻 Precise programmatic animations.
🎬 Rendering of videos with layout-based scenes.
📝 Generate subtitles and transcripts from audio/video files.
🎤 Generate podcast videos from subtitles and scene elements.
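
To make the subtitle-to-video idea concrete, here is a minimal sketch of reading speaker-tagged subtitle cues using only the Python standard library. It does not use Audim's API. The `[Speaker]` tag convention, the sample text, and the names `Cue`, `to_seconds`, and `parse_srt` are assumptions made for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical sample: two SRT cues whose text starts with a "[Speaker]" tag.
SAMPLE_SRT = """\
1
00:00:00,000 --> 00:00:02,500
[Host] Welcome to the show.

2
00:00:02,500 --> 00:00:05,000
[Guest] Thanks for having me.
"""


@dataclass
class Cue:
    index: int
    start: float  # seconds
    end: float  # seconds
    speaker: str
    text: str


TIME_RE = re.compile(r"(\d+):(\d+):(\d+),(\d+)")


def to_seconds(stamp: str) -> float:
    """Convert an SRT timestamp such as 00:00:02,500 to seconds."""
    hours, minutes, seconds, millis = map(int, TIME_RE.fullmatch(stamp.strip()).groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000


def parse_srt(source: str) -> list:
    """Split SRT text into cues and pull out the optional speaker tag."""
    cues = []
    for block in re.split(r"\n\s*\n", source.strip()):
        lines = block.splitlines()
        start, end = lines[1].split("-->")
        body = " ".join(lines[2:])
        match = re.match(r"\[(.+?)\]\s*(.*)", body)
        speaker, text = match.groups() if match else ("", body)
        cues.append(Cue(int(lines[0]), to_seconds(start), to_seconds(end), speaker, text))
    return cues


for cue in parse_srt(SAMPLE_SRT):
    print(f"{cue.start:.1f}s to {cue.end:.1f}s  {cue.speaker}: {cue.text}")
```

Each cue carries the timing and speaker information that a layout would need to decide who is highlighted and which text is on screen at a given moment.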

🚀 Getting Started

Prerequisites

🐍 Python ≥ 3.10
🖥️ Conda or venv
🎥 FFmpeg (recommended, for faster video encoding)

Installation

1. Install Audim

It is recommended to install Audim from PyPI into a virtual environment (Conda or venv) running Python 3.10.

Install the `audim` package from PyPI:

```bash
pip install audim
```
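
After installing, a quick sanity check can confirm that the interpreter meets the Python ≥ 3.10 prerequisite and that the package is visible to it. This is a minimal sketch using only the standard library; the helper names `python_is_supported` and `installed_version` are illustrative and not part of Audim.

```python
import sys
from importlib import metadata
from typing import Optional


def python_is_supported(version_info=sys.version_info, minimum=(3, 10)) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python OK:", python_is_supported())
print("audim version:", installed_version("audim") or "not installed")
```

Run it with the same interpreter you used for `pip install`, so that it inspects the virtual environment you intend to use.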

Install from source

By installing `audim` from source, you can explore the latest features and enhancements that have not yet been officially released. Note that the latest changes may still be in development, so they may be unstable or contain bugs.

#### Install from source

```bash
pip install git+https://github.com/mratanusarkar/audim.git
```

Or, clone the repository and install the package from source:

#### Clone the repository

```bash
git clone https://github.com/mratanusarkar/audim.git
```

2. Install FFmpeg locally (recommended)

Using local FFmpeg is optional but recommended for speeding up the video encoding process.

On Ubuntu, install FFmpeg using:

```bash
sudo apt install ffmpeg libx264-dev
```

On Windows and other platforms, download and install FFmpeg from the official website:

- Download FFmpeg
- Ensure FFmpeg is in your system PATH
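
One way to confirm that FFmpeg is reachable from your PATH is to look it up from Python. This is a small sketch using the standard library; the helper names `find_ffmpeg` and `ffmpeg_version_line` are illustrative and not part of Audim.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg(executable: str = "ffmpeg") -> Optional[str]:
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


def ffmpeg_version_line(path: str) -> str:
    """Return the first line printed by `ffmpeg -version` (empty if none)."""
    result = subprocess.run([path, "-version"], capture_output=True, text=True)
    lines = result.stdout.splitlines()
    return lines[0] if lines else ""


path = find_ffmpeg()
if path is None:
    print("FFmpeg was not found on PATH")
else:
    print(f"Found FFmpeg at {path}")
    print(ffmpeg_version_line(path))
```

If the lookup fails right after installation, open a new terminal so the updated PATH is picked up, then try again.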

Virtual environment and project setup for development with uv

#### Install `uv` and set up the project environment

> **IMPORTANT**
>
> If you use the conda base environment as the default for your Python projects, run the command below to activate it. Otherwise, skip this step and continue with the next one.
>
> ```bash
> conda activate base
> ```

```bash
# Install uv
pip install uv

# Setup project environment
uv venv

source .venv/bin/activate   # on Linux
# .venv\Scripts\activate    # on Windows

uv pip install -e ".[dev,docs]"
```

#### Build and deploy documentation

You can build and serve the documentation by running:

```bash
uv pip install -e ".[docs]"
mkdocs serve
```

## Code Quality

Before committing, please ensure that the code is formatted and styled correctly. Run the following commands to check and fix code style issues:

```bash
# Check and fix code style issues
ruff format .
ruff check --fix .
```

See [Development](https://mratanusarkar.github.io/audim/setup/development/) for more details on how to set up the project environment and contribute to the project.

⚖️ License & Attribution

Audim is licensed under Apache 2.0. You can use it freely for personal and commercial projects.

Attribution is encouraged. If you use Audim, please:

- Keep the default watermark in videos, OR
- Add "Made with Audim" to video descriptions, OR
- Link to this repo in your project documentation

See the NOTICE file for complete attribution guidelines.

📄 Citation

If you use Audim in your project or research, please cite it as follows:

```bibtex
@software{audim,
  title = {Audim: Audio Podcast Animation Engine},
  author = {Sarkar, Atanu},
  year = {2025},
  url = {https://github.com/mratanusarkar/audim},
  version = {0.0.7}
}
```

You can also click the "Cite this repository" button on GitHub for other citation formats.

⚠️ Disclaimer

[!WARNING] Early Development Stage

This project is actively under development and may contain bugs or limitations.

While stable for basic use cases, the rendering engine requires further development and testing across diverse scenarios.

The API is subject to change, so keep an eye on the documentation for the latest updates.

[!TIP] We encourage you to:

Try Audim for your projects and podcast videos.

Report issues when encountered.

Feel free to raise a PR to contribute and improve the project.

Your feedback and contributions help make Audim better for everyone!

Owner

Name: Atanu Sarkar
Login: mratanusarkar
Kind: user
Location: Kolkata, West Bengal, India
Company: BGSW

Twitter: mratanusarkar
Repositories: 8
Profile: https://github.com/mratanusarkar

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
type: software
title: "Audim: Audio Podcast Animation Engine"
abstract: "An engine for precise programmatic animation and rendering of podcast videos from audio-based and voice-based file recordings. Audim provides audio-to-subtitle generation with speaker diarization and subtitle-to-podcast video generation with customizable layouts and effects."
authors:
  - family-names: "Sarkar"
    given-names: "Atanu"
    orcid: "https://orcid.org/0009-0009-6334-7312"
    email: "mratanusarkar@gmail.com"
repository-code: "https://github.com/mratanusarkar/audim"
url: "https://mratanusarkar.github.io/audim"
keywords:
  - podcast generation
  - video generation
  - animation engine
  - programmatic animation
  - multimedia
  - content creation
  - python
  - python library
license: Apache-2.0
version: "0.0.7"
date-released: "2025-05-25"
preferred-citation:
  type: software
  title: "Audim: Audio Podcast Animation Engine"
  authors:
    - family-names: "Sarkar"
      given-names: "Atanu"
  year: 2025
  url: "https://github.com/mratanusarkar/audim"
  version: "0.0.7"

GitHub Events

Total

Release event: 1
Watch event: 3
Delete event: 1
Issue comment event: 10
Push event: 7
Pull request review event: 16
Pull request event: 4
Create event: 2

Last Year

Release event: 1
Watch event: 3
Delete event: 1
Issue comment event: 10
Push event: 7
Pull request review event: 16
Pull request event: 4
Create event: 2

Committers

Last synced: about 1 year ago

All Time

Total Commits: 177
Total Committers: 1
Avg Commits per committer: 177.0
Development Distribution Score (DDS): 0.0

Past Year

Commits: 177
Committers: 1
Avg Commits per committer: 177.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Atanu Sarkar	m**r@g**m	177
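
The committer figures above can be reproduced from the table. This sketch assumes that the Development Distribution Score is one minus the top committer's share of all commits, which is a common definition but is not stated on this page. The function names are illustrative.

```python
def development_distribution_score(commits_per_committer: dict) -> float:
    """Assumed definition: 1 - (top committer's commits / total commits)."""
    total = sum(commits_per_committer.values())
    if total == 0:
        return 0.0
    return 1 - max(commits_per_committer.values()) / total


def average_commits(commits_per_committer: dict) -> float:
    """Mean number of commits per committer."""
    return sum(commits_per_committer.values()) / len(commits_per_committer)


# Data from the Top Committers table: a single committer with 177 commits.
audim_commits = {"Atanu Sarkar": 177}

print("DDS:", development_distribution_score(audim_commits))
print("Avg commits per committer:", average_commits(audim_commits))
```

With a single committer, the top committer's share is 177/177 = 1, so the score is 0.0, matching the value reported above. Under the same assumed definition, a hypothetical project split 75/25 between two committers would score 0.25.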

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 8
Total pull requests: 23
Average time to close issues: 18 days
Average time to close pull requests: about 7 hours
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 0.38
Average comments per pull request: 1.0
Merged pull requests: 23
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 8
Pull requests: 23
Average time to close issues: 18 days
Average time to close pull requests: about 7 hours
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.38
Average comments per pull request: 1.0
Merged pull requests: 23
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

mratanusarkar (8)

Pull Request Authors

mratanusarkar (25)

Top Labels

Issue Labels

Pull Request Labels

version tag (8) release tag (1)

Dependencies

.github/workflows/deploy.yml actions

actions/cache v3 composite
actions/checkout v4 composite
actions/setup-python v4 composite

pyproject.toml pypi

Pillow >=10.2.0
matplotlib >=3.8.0
moviepy ==2.0.0.dev2
numpy ==1.26.4
opencv-python >=4.9.0.80
pydub ==0.25.1
pysrt >=1.1.2
torch ==2.2.0
torchaudio ==2.2.0
whisperx ==3.3.1
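
Several of these dependencies are pinned to exact versions, so it can help to compare them with what is installed in your environment. The sketch below parses the requirement lines listed above and looks up installed versions with the standard library. The helper names `parse_requirement` and `report` are illustrative, and the parser only handles the simple `name ==version` and `name >=version` forms shown here.

```python
import re
from importlib import metadata

# Requirement lines copied from the pyproject.toml listing above.
REQUIREMENTS = [
    "Pillow >=10.2.0",
    "matplotlib >=3.8.0",
    "moviepy ==2.0.0.dev2",
    "numpy ==1.26.4",
    "opencv-python >=4.9.0.80",
    "pydub ==0.25.1",
    "pysrt >=1.1.2",
    "torch ==2.2.0",
    "torchaudio ==2.2.0",
    "whisperx ==3.3.1",
]

REQ_RE = re.compile(r"^\s*([A-Za-z0-9_.-]+)\s*(==|>=)\s*(\S+)\s*$")


def parse_requirement(line: str) -> tuple:
    """Split 'name ==1.2.3' into (name, operator, version)."""
    match = REQ_RE.match(line)
    if match is None:
        raise ValueError(f"Unsupported requirement format: {line!r}")
    return match.groups()


def report(requirements: list) -> list:
    """Return (name, operator, wanted, installed) rows; installed may be None."""
    rows = []
    for line in requirements:
        name, operator, wanted = parse_requirement(line)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        rows.append((name, operator, wanted, installed))
    return rows


for name, operator, wanted, installed in report(REQUIREMENTS):
    print(f"{name:15} wants {operator}{wanted:12} installed: {installed or 'missing'}")
```

The exact pins (`==`) are the ones most likely to conflict with packages already present in a shared environment, which is one reason to install into a fresh virtual environment.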
