audim

An animation and video rendering engine for audio-based and voice-based podcast videos.

https://github.com/mratanusarkar/audim

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.6%) to scientific vocabulary
Last synced: 7 months ago

Repository

Basic Info
Statistics
  • Stars: 5
  • Watchers: 1
  • Forks: 0
  • Open Issues: 3
  • Releases: 1
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme License Citation

README.md

# Audim ✨

[![Documentation](https://img.shields.io/badge/docs-mkdocs-4baaaa.svg?logo=materialformkdocs&logoColor=white)](https://mratanusarkar.github.io/audim) [![PyPI version](https://img.shields.io/pypi/v/audim.svg?color=blue&logo=pypi&logoColor=white)](https://pypi.org/project/audim/) [![Python versions](https://img.shields.io/pypi/pyversions/audim.svg?color=blue&logo=python&logoColor=white)](https://pypi.org/project/audim/) [![GitHub Actions Workflow Status](https://img.shields.io/github/actions/workflow/status/mratanusarkar/audim/deploy.yml?logo=githubactions&logoColor=white)](https://github.com/mratanusarkar/audim/actions)
[![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-orange.svg?logo=apache&logoColor=white)](https://github.com/mratanusarkar/audim/blob/main/LICENSE) [![Author: Atanu Sarkar](https://img.shields.io/badge/Author-Atanu%20Sarkar-708FCC?logo=github&logoColor=white)](https://github.com/mratanusarkar) [![Citation](https://img.shields.io/badge/Cite%20this-Repository-green?logo=gitextensions&logoColor=white)](https://github.com/mratanusarkar/audim/blob/main/CITATION.cff)
[![Visitors](https://api.visitorbadge.io/api/visitors?path=https%3A%2F%2Fgithub.com%2Fmratanusarkar%2Faudim&label=view&labelColor=%235e5e5e&countColor=%237C8AA0&style=flat&labelStyle=lower)](https://visitorbadge.io/status?path=https%3A%2F%2Fgithub.com%2Fmratanusarkar%2Faudim) [![PyPI Total Downloads](https://static.pepy.tech/badge/audim)](https://pepy.tech/projects/audim) [![PyPI Monthly Downloads](https://img.shields.io/pypi/dm/audim?style=flat&color=%231F86BF)](https://pypistats.org/packages/audim)

**Au**dio Po**d**cast An**im**ation Engine

> _An animation and video rendering engine for audio-based and voice-based podcast videos._

| [Documentation](https://mratanusarkar.github.io/audim) | [Features](#-features) | [Getting Started](#-getting-started) | [Quick Links](#-quick-links) |

🚀 Demo

https://github.com/user-attachments/assets/634df0ca-77ee-448b-ac35-f4eb3e4261b9

A sample podcast video generated with Audim

[!NOTE]

For this example, we have transformed a conversation between Grant Sanderson (from 3Blue1Brown) and Sal Khan (from Khan Academy) from YouTube into a visually engaging podcast video using Audim.

See docs/devblog/v0.0.7 for more details on how this video was generated.

🔗 Quick Links

  1. Getting Started
    • See Setup and make sure your environment is set up correctly before use.
    • For developers and contributors, see Development.
  2. API Documentation
    • See API Docs for the audim API documentation.
  3. Usage and Examples
    • See Usage for usage examples.
  4. Dev Blog
    • See Dev Blog for development notes and more insight into the project.
    • See Changelog for the project's release history.

🎯 Introduction

Audim is an engine for precise programmatic animation and rendering of podcast videos from audio-based and voice-based file recordings.

✨ Features

  • 💻 Precise programmatic animations.
  • 🎬 Rendering of videos with layout-based scenes.
  • 📝 Generate subtitles and transcripts from audio/video files.
  • 🎤 Generate podcast videos from subtitles and scene elements.
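
The last two features meet at a subtitle file: one step produces it and the next consumes it. The sketch below is only an illustration of that intermediate data, not Audim's API. It assumes the subtitles are in SRT format (the project lists `pysrt` as a dependency) and that each cue may start with a `[Speaker]` tag, which is a hypothetical convention chosen for this example. The names `Cue` and `parse_srt` are also made up here, and only the standard library is used.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Matches SRT timing lines such as "00:00:01,500 --> 00:00:04,000".
TIME_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)
# Hypothetical convention (an assumption): a cue may start with "[Speaker] ".
SPEAKER_RE = re.compile(r"^\[(?P<speaker>[^\]]+)\]\s*(?P<text>.*)$", re.DOTALL)


@dataclass
class Cue:
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio
    speaker: str | None
    text: str


def _seconds(h: str, m: str, s: str, ms: str) -> float:
    """Convert SRT timestamp fields to seconds."""
    return int(h) * 3600 + int(m) * 60 + int(s) + int(ms) / 1000


def parse_srt(content: str) -> list[Cue]:
    """Parse SRT text into cues, extracting an optional speaker tag."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip malformed blocks
        match = TIME_RE.search(lines[1])
        if match is None:
            continue
        parts = match.groups()
        body = "\n".join(lines[2:])
        tagged = SPEAKER_RE.match(body)
        speaker = tagged.group("speaker") if tagged else None
        text = tagged.group("text") if tagged else body
        cues.append(Cue(_seconds(*parts[:4]), _seconds(*parts[4:]), speaker, text))
    return cues


SAMPLE = """1
00:00:00,000 --> 00:00:02,500
[Host] Welcome to the show.

2
00:00:02,500 --> 00:00:05,000
[Guest] Thanks for having me.
"""

for cue in parse_srt(SAMPLE):
    print(f"{cue.start:6.2f}s to {cue.end:6.2f}s  {cue.speaker}: {cue.text}")
```

Each cue carries a time span, a speaker, and a line of text, which is the kind of information a renderer would need to decide what to show and when. See the API Docs for the format Audim actually expects.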

🚀 Getting Started

Prerequisites

  • 🐍 Python ≥ 3.10
  • 🖥️ Conda or venv
  • 🎥 FFmpeg (recommended, for faster video encoding)
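
The first two prerequisites can be checked from Python before installing anything. This is a minimal sketch using only the standard library. The function names are illustrative, and the environment check is a best-effort heuristic: it assumes a venv changes `sys.prefix` and that `conda activate` sets `CONDA_PREFIX`.

```python
import os
import sys

MIN_PYTHON = (3, 10)  # minimum version from the prerequisites list


def python_ok(version=None) -> bool:
    """Return True if the given (or running) Python version is at least 3.10."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def in_isolated_env() -> bool:
    """Best-effort check for an active venv or Conda environment."""
    in_venv = sys.prefix != sys.base_prefix           # differs inside a venv
    in_conda = bool(os.environ.get("CONDA_PREFIX"))   # set by `conda activate`
    return in_venv or in_conda


print("Python >= 3.10:", python_ok())
print("Isolated environment:", in_isolated_env())
```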

Installation

1. Install Audim

We recommend installing Audim from PyPI inside a virtual environment (venv or Conda) that uses Python 3.10.

Install the `audim` package from PyPI:

```bash
pip install audim
```

Install from source
By installing `audim` from source, you can explore the latest features and enhancements that have not yet been officially released. Note that the latest changes may still be in development, so they may be unstable and contain bugs.

#### Install from source

```bash
pip install git+https://github.com/mratanusarkar/audim.git
```

Alternatively, you can clone the repository and install the package from source:

#### Clone the repository

```bash
git clone https://github.com/mratanusarkar/audim.git
```

2. Install FFmpeg locally (recommended)

Using local FFmpeg is optional but recommended for speeding up the video encoding process.

On Ubuntu, install FFmpeg using:

```bash
sudo apt install ffmpeg libx264-dev
```

On Windows and other platforms, download and install FFmpeg from the official FFmpeg website.
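
After installing, it can be useful to confirm that FFmpeg is reachable and was built with an H.264 encoder. The sketch below looks for `ffmpeg` on `PATH` and scans the output of `ffmpeg -encoders` for `libx264`. Whether Audim uses that particular encoder is an assumption based on the `libx264-dev` package in the Ubuntu command above, and the function names are illustrative.

```python
import shutil
import subprocess


def has_encoder(encoders_output: str, name: str = "libx264") -> bool:
    """Return True if `name` is listed as an encoder in `ffmpeg -encoders` output."""
    for line in encoders_output.splitlines():
        fields = line.split()
        # Encoder rows look like: " V....D libx264   libx264 H.264 / AVC ..."
        if len(fields) >= 2 and fields[1] == name:
            return True
    return False


def ffmpeg_status() -> str:
    """Describe whether ffmpeg is on PATH and whether it lists libx264."""
    path = shutil.which("ffmpeg")
    if path is None:
        return "ffmpeg not found on PATH"
    result = subprocess.run(
        [path, "-hide_banner", "-encoders"],
        capture_output=True,
        text=True,
        check=False,
    )
    if has_encoder(result.stdout):
        return f"ffmpeg found at {path} with libx264"
    return f"ffmpeg found at {path}, but libx264 is not listed"


print(ffmpeg_status())
```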

Virtual environment and project setup for development with uv
#### Install `uv` and set up the project environment

> **IMPORTANT**
>
> If you use the conda base environment as the default for your Python projects, run the command below to activate it. Otherwise, skip this step and continue with the next one.
>
> ```bash
> conda activate base
> ```

```bash
# Install uv
pip install uv

# Set up the project environment
uv venv

source .venv/bin/activate   # on Linux
# .venv\Scripts\activate    # on Windows

uv pip install -e ".[dev,docs]"
```

#### Build and serve documentation

You can build and serve the documentation by running:

```bash
uv pip install -e ".[docs]"
mkdocs serve
```

## Code Quality

Before committing, please ensure that the code is formatted and styled correctly. Run the following commands to check and fix code style issues:

```bash
# Check and fix code style issues
ruff format .
ruff check --fix .
```

See [Development](https://mratanusarkar.github.io/audim/setup/development/) for more details on how to set up the project environment and contribute to the project.

⚖️ License & Attribution

Audim is licensed under Apache 2.0. You can use it freely for personal and commercial projects.

Attribution is encouraged. If you use Audim, please:

  • Keep the default watermark in videos, OR
  • Add "Made with Audim" to video descriptions, OR
  • Link to this repo in your project documentation

See the NOTICE file for complete attribution guidelines.

📄 Citation

If you use Audim in your project or research, please cite it as follows:

```bibtex
@software{audim,
  title = {Audim: Audio Podcast Animation Engine},
  author = {Sarkar, Atanu},
  year = {2025},
  url = {https://github.com/mratanusarkar/audim},
  version = {0.0.7}
}
```

You can also click the "Cite this repository" button on GitHub for other citation formats.

⚠️ Disclaimer

[!WARNING] Early Development Stage

  • This project is actively under development and may contain bugs or limitations.
  • While stable for basic use cases, the rendering engine requires further development and testing across diverse scenarios.
  • The API is subject to change, so keep an eye on the documentation for the latest updates.

[!TIP] We encourage you to:

  • Try Audim for your projects and podcast videos.
  • Report issues when encountered.
  • Feel free to raise a PR to contribute and improve the project.

Your feedback and contributions help make Audim better for everyone!

Owner

  • Name: Atanu Sarkar
  • Login: mratanusarkar
  • Kind: user
  • Location: Kolkata, West Bengal, India
  • Company: BGSW

"An Engineer by profession, Physics lover by passion" | Software Developer at BGSW | Embedded Systems | IoT | Robotics | Machine Learning | Computer Vision

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
type: software
title: "Audim: Audio Podcast Animation Engine"
abstract: "An engine for precise programmatic animation and rendering of podcast videos from audio-based and voice-based file recordings. Audim provides audio-to-subtitle generation with speaker diarization and subtitle-to-podcast video generation with customizable layouts and effects."
authors:
  - family-names: "Sarkar"
    given-names: "Atanu"
    orcid: "https://orcid.org/0009-0009-6334-7312"
    email: "mratanusarkar@gmail.com"
repository-code: "https://github.com/mratanusarkar/audim"
url: "https://mratanusarkar.github.io/audim"
keywords:
  - podcast generation
  - video generation
  - animation engine
  - programmatic animation
  - multimedia
  - content creation
  - python
  - python library
license: Apache-2.0
version: "0.0.7"
date-released: "2025-05-25"
preferred-citation:
  type: software
  title: "Audim: Audio Podcast Animation Engine"
  authors:
    - family-names: "Sarkar"
      given-names: "Atanu"
  year: 2025
  url: "https://github.com/mratanusarkar/audim"
  version: "0.0.7"

GitHub Events

Total
  • Release event: 1
  • Watch event: 3
  • Delete event: 1
  • Issue comment event: 10
  • Push event: 7
  • Pull request review event: 16
  • Pull request event: 4
  • Create event: 2
Last Year
  • Release event: 1
  • Watch event: 3
  • Delete event: 1
  • Issue comment event: 10
  • Push event: 7
  • Pull request review event: 16
  • Pull request event: 4
  • Create event: 2

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 177
  • Total Committers: 1
  • Avg Commits per committer: 177.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 177
  • Committers: 1
  • Avg Commits per committer: 177.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Atanu Sarkar m****r@g****m 177
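
The committer figures above can be reproduced with a few lines of arithmetic. This sketch assumes the commonly used definition of the Development Distribution Score, one minus the share of commits made by the most active committer. The page itself does not state the formula, so treat it as an assumption.

```python
def development_distribution_score(commits_per_committer: list[int]) -> float:
    """Return 1 - (top committer's commits / total commits); 0.0 if no commits."""
    total = sum(commits_per_committer)
    if total == 0:
        return 0.0
    return 1 - max(commits_per_committer) / total


commits = [177]  # one committer with 177 commits, as listed above

print("Avg commits per committer:", sum(commits) / len(commits))
print("DDS:", development_distribution_score(commits))
```

With a single committer the top share is 177/177, so the score is 0.0, matching the reported value. A project where commits are spread across many people would score closer to 1.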

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 8
  • Total pull requests: 23
  • Average time to close issues: 18 days
  • Average time to close pull requests: about 7 hours
  • Total issue authors: 1
  • Total pull request authors: 1
  • Average comments per issue: 0.38
  • Average comments per pull request: 1.0
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 8
  • Pull requests: 23
  • Average time to close issues: 18 days
  • Average time to close pull requests: about 7 hours
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.38
  • Average comments per pull request: 1.0
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • mratanusarkar (8)
Pull Request Authors
  • mratanusarkar (25)
Top Labels
Issue Labels
Pull Request Labels
  • version tag (8)
  • release tag (1)

Dependencies

.github/workflows/deploy.yml actions
  • actions/cache v3 composite
  • actions/checkout v4 composite
  • actions/setup-python v4 composite
pyproject.toml pypi
  • Pillow >=10.2.0
  • matplotlib >=3.8.0
  • moviepy ==2.0.0.dev2
  • numpy ==1.26.4
  • opencv-python >=4.9.0.80
  • pydub ==0.25.1
  • pysrt >=1.1.2
  • torch ==2.2.0
  • torchaudio ==2.2.0
  • whisperx ==3.3.1