https://github.com/google-deepmind/physics-iq-benchmark

Benchmarking physical understanding in generative video models


Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (9.9%) to scientific vocabulary

Keywords

benchmark generative-models physical-understanding video-generation
Last synced: 4 months ago

Repository

Benchmarking physical understanding in generative video models

Basic Info
Statistics
  • Stars: 199
  • Watchers: 8
  • Forks: 19
  • Open Issues: 3
  • Releases: 0
Topics
benchmark generative-models physical-understanding video-generation
Created about 1 year ago · Last pushed 5 months ago
Metadata Files
Readme Contributing License

README.md


Step A: Generating Videos | Step B: Evaluating Generated Videos | Leaderboard | Citation | License

Physics-IQ: Benchmarking physical understanding in generative video models

Physics-IQ is a high-quality, realistic, and comprehensive benchmark dataset for evaluating physical understanding in generative video models.

Project website: physics-iq.github.io

Key Features:

  • Real-world videos: All videos are captured with high-quality cameras, not rendered.
  • Diverse scenarios: Covers a wide range of physical phenomena, including collisions, fluid dynamics, gravity, material properties, light, shadows, magnetism, and more.
  • Multiple perspectives: Each scenario is filmed from 3 different angles.
  • Variations: Each scenario is recorded twice to capture natural physical variations.
  • High resolution and frame rate: Videos are recorded at 3840 × 2160 resolution and 30 frames per second.

[Teaser videos 1–8]


Leaderboard

The best possible score on Physics-IQ is 100.0%. This score would be achieved by physically realistic videos that differ from the ground truth only in physical randomness while adhering to all tested principles of physics.

If you test your model on Physics-IQ and would like your score/paper/model to be featured here in this table, feel free to open a pull request that adds a row to the table and we'll be happy to include it!

| # | Model | Input type | Physics-IQ score | Date added (YYYY-MM-DD) |
| -- | --- | --- | --- | --- |
| 1 | Magi-1 | multiframe (v2v) | 56.0 % :1st_place_medal: | 2025-04-21 |
| 2 | Video-GPT | multiframe (v2v) | 35.0 % :2nd_place_medal: | 2025-05-22 |
| 3 | Magi-1 | i2v | 30.2 % :3rd_place_medal: | 2025-04-21 |
| 4 | VideoPoet | multiframe (v2v) | 29.5 % | 2025-02-19 |
| 5 | Lumiere | multiframe (v2v) | 23.0 % | 2025-02-19 |
| 6 | Runway Gen 3 | i2v | 22.8 % | 2025-02-19 |
| 7 | VideoPoet | i2v | 20.3 % | 2025-02-19 |
| 8 | Lumiere | i2v | 19.0 % | 2025-02-19 |
| 9 | Stable Video Diffusion | i2v | 14.8 % | 2025-02-19 |
| 10 | Pika | i2v | 13.0 % | 2025-02-19 |
| 11 | Sora | i2v | 10.0 % | 2025-02-19 |

Note to early adopters of the benchmark: results from the paper were finalized on February 19, 2025. If you used the toolbox before that date, please re-run, since we changed and improved a few aspects. Likewise, if you downloaded the dataset before that date, we recommend re-downloading it to ensure the ground-truth video masks have a duration of five seconds.


Step A: Generating Videos for Physics-IQ Test Cases Based on Video Model

1. Download Benchmark Dataset

Visit the Google Cloud Storage link to download the dataset, or run the following script:

```bash
pip install gsutil
python3 ./code/download_physics_iq_data.py
```

  • If your desired FPS already exists in the dataset, it will be downloaded.
  • If it does not exist, the script will download 30 FPS files and generate your desired FPS videos by downsampling the 30 FPS version.
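The downsampling step can be sketched as pure frame selection: from a 30 FPS source, keep roughly every (30/target)-th frame. This is a hypothetical illustration, not the download script's actual code; `frames_to_keep` is an assumed helper name.

```python
# Hypothetical sketch of FPS downsampling by frame selection; the actual
# download script may implement this differently.

def frames_to_keep(n_frames: int, src_fps: int, target_fps: int) -> list[int]:
    """Indices of source frames to retain when converting src_fps -> target_fps."""
    step = src_fps / target_fps  # e.g. 30 -> 8 FPS keeps roughly every 3.75th frame
    kept, next_keep = [], 0.0
    for idx in range(n_frames):
        if idx >= next_keep:
            kept.append(idx)
            next_keep += step
    return kept
```

For example, one second of 30 FPS footage downsampled to 8 FPS retains 8 of the 30 frames; the retained frames would then be re-encoded at the target frame rate.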

2. Running Video Model on Test Cases from Benchmark

This section explains how to generate videos using the provided benchmark and save them in the required format. Follow the instructions below based on your model type:

2.1 Image-to-Video (i2v) Models

  1. Input Requirements:

    • Initial Frame: Use frames from physics-iq-benchmark/switch-frames.
    • Text Input (Optional): If required, use descriptions from descriptions.csv.
  2. Steps to Run:

    • Generate videos using the initial frame (and text condition, if applicable).
    • Save generated videos in the following structure, using any filename as long as the unique four-digit ID prefix from the test videos is kept (0001_, ..., 0198_): model_name/{ID}_{anything-you-like}.mp4
    • Refer to the generated_video_name column in descriptions.csv for file naming conventions.
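The i2v naming rule above can be sketched as a small helper (hypothetical, not part of the repo) that derives a valid output filename from a switch-frame path, keeping the four-digit ID prefix:

```python
# Illustrative helper: build an output filename of the form
# "{ID}_{anything-you-like}.mp4" from a switch-frame file such as
# "0001_switch-frames_anyFPS_perspective-left_trimmed-ball-and-block-fall.jpg".
import os
import re

def output_name(switch_frame_path: str, suffix: str = "my-model-output") -> str:
    stem = os.path.basename(switch_frame_path)
    m = re.match(r"(\d{4})_", stem)  # the unique four-digit ID prefix
    if m is None:
        raise ValueError(f"no 4-digit ID prefix in {stem!r}")
    return f"{m.group(1)}_{suffix}.mp4"
```

Here `suffix` stands in for whatever naming your pipeline prefers; only the ID prefix matters for evaluation.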

2.2 Multiframe-to-Video Models

  1. Input Requirements:

    • Conditioning Frames:
      • Available in physics-iq-benchmark/split-videos/conditioning-videos.
      • Ensure the correct frame rate: 30FPS, 24FPS, 16FPS, or 8FPS.
    • Text Input (Optional): Use descriptions.csv.
  2. Steps to Run:

    • Use conditioning frames to generate videos.
    • Save generated videos in the structure model_name/{ID}_{perspective}_{scenario_name}.mp4, for example: model_name/0001_perspective-left_trimmed-ball-and-block-fall.mp4
    • Refer to the generated_video_name column in descriptions.csv for file naming conventions.
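Before running the evaluation it can help to sanity-check the naming scheme. The sketch below is hypothetical (not shipped with the repo), and the set of perspective tokens is an assumption based on the example filenames in this README:

```python
# Hedged sanity check for the {ID}_{perspective}_{scenario_name}.mp4 scheme.
# The perspective tokens (left/center/right) are assumed from the examples above.
import re

NAME_RE = re.compile(r"^\d{4}_perspective-(left|center|right)_[A-Za-z0-9-]+\.mp4$")

def is_valid_output_name(filename: str) -> bool:
    """True if filename matches the multiframe-to-video naming convention."""
    return NAME_RE.match(filename) is not None
```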

Step B: Evaluating Generated Videos on Physics-IQ to Generate Benchmark Scores

1. Installation

Ensure you have Python 3 installed. Then, run the following command to install the necessary packages:

```bash
pip install -r requirements.txt
```

2. Dataset Placement

  • Ensure you have downloaded and placed the physics-iq-benchmark dataset in your working directory. This dataset must include 30FPS videos and optionally your desired FPS. If your desired FPS does not exist in our dataset already, it will be automatically generated. You should have the following structure:

```plaintext
physics-IQ-benchmark/
├── full-videos/
│   └── ...
├── split-videos/
│   ├── conditioning-videos/
│   │   └── 30FPS/
│   │       ├── 0001_conditioning-videos_30FPS_perspective-left_take-1_trimmed-ball-and-block-fall.mp4
│   │       ├── 0002_conditioning-videos_30FPS_perspective-center_take-1_trimmed-ball-and-block-fall.mp4
│   │       └── ...
│   └── testing-videos/
│       └── 30FPS/
│           ├── 0001_testing-videos_30FPS_perspective-left_take-1_trimmed-ball-and-block-fall.mp4
│           ├── 0002_testing-videos_30FPS_perspective-center_take-1_trimmed-ball-and-block-fall.mp4
│           └── ...
├── switch-frames/
│   ├── 0001_switch-frames_anyFPS_perspective-left_trimmed-ball-and-block-fall.jpg
│   ├── 0002_switch-frames_anyFPS_perspective-center_trimmed-ball-and-block-fall.jpg
│   └── ...
└── video-masks/
    └── real/
        └── 30FPS/
            ├── 0001_video-masks_30FPS_perspective-left_take-1_trimmed-ball-and-block-fall.mp4
            ├── 0002_video-masks_30FPS_perspective-center_take-1_trimmed-ball-and-block-fall.mp4
            └── ...
```

  • The descriptions file, which includes all file names and scenario descriptions, should be placed in your home directory as descriptions.csv.
  • Place your generated videos under a model_name directory.
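The expected layout can be verified up front with a small check; a hedged sketch, with the directory list taken from the structure shown above (`missing_folders` is a hypothetical helper, not part of the repo):

```python
# Sanity-check that the dataset folders from the structure above are in place
# before running the evaluation. Directory names mirror the tree in this README.
from pathlib import Path

EXPECTED = [
    "full-videos",
    "split-videos/conditioning-videos/30FPS",
    "split-videos/testing-videos/30FPS",
    "switch-frames",
    "video-masks/real/30FPS",
]

def missing_folders(root: str) -> list[str]:
    """Return the expected subdirectories that are absent under root."""
    base = Path(root)
    return [p for p in EXPECTED if not (base / p).is_dir()]
```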

⚠️ IMPORTANT: Note that this script evaluates the first 5 seconds of your generated videos. Hence, make sure these are the 5 seconds generated right after the switch frame.
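Since only the first 5 seconds are scored, it is worth confirming that each generated clip actually spans that window. A minimal sketch, assuming frame count and FPS are read elsewhere (e.g. via opencv-python, which is already in requirements.txt):

```python
# Hedged sketch: check that a clip of n_frames at fps covers the 5-second
# evaluation window. (n_frames and fps could be obtained from
# cv2.VideoCapture via CAP_PROP_FRAME_COUNT and CAP_PROP_FPS.)

def covers_eval_window(n_frames: int, fps: float, window_s: float = 5.0) -> bool:
    """True if the clip's duration is at least window_s seconds."""
    return fps > 0 and n_frames >= window_s * fps
```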

3. Generate benchmark scores and plots

```bash
python3 code/run_physics_iq.py \
  --input_folders <generated_videos_dirs> \
  --output_folder <output_dir> \
  --descriptions_file <descriptions_file>
```

Parameters:

  • --input_folders: Path to the directories containing the generated videos (in .mp4 format), with one directory per model (model_name/video.mp4).
  • --output_folder: Path to the directory where the output CSV files will be saved.
  • --descriptions_file: Path to the descriptions.csv file.


Citation

If you think this project is helpful, please feel free to leave a star ⭐️

```latex
@article{motamed2025physics,
  title={Do generative video models understand physical principles?},
  author={Saman Motamed and Laura Culp and Kevin Swersky and Priyank Jaini and Robert Geirhos},
  journal={arXiv preprint arXiv:2501.09038},
  year={2025}
}
```

License and disclaimer

Copyright 2024 DeepMind Technologies Limited

All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the Apache 2.0 license. You may obtain a copy of the Apache 2.0 license at: https://www.apache.org/licenses/LICENSE-2.0

All other materials are licensed under the Creative Commons Attribution 4.0 International License (CC-BY). You may obtain a copy of the CC-BY license at: https://creativecommons.org/licenses/by/4.0/legalcode

Unless required by applicable law or agreed to in writing, all software and materials distributed here under the Apache 2.0 or CC-BY licenses are distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the licenses for the specific language governing permissions and limitations under those licenses.

This is not an official Google product.

Owner

  • Name: Google DeepMind
  • Login: google-deepmind
  • Kind: organization

GitHub Events

Total
  • Create event: 4
  • Issues event: 15
  • Watch event: 179
  • Delete event: 4
  • Issue comment event: 52
  • Member event: 1
  • Public event: 1
  • Push event: 38
  • Pull request review comment event: 24
  • Pull request review event: 26
  • Pull request event: 30
  • Fork event: 14
Last Year
  • Create event: 4
  • Issues event: 15
  • Watch event: 179
  • Delete event: 4
  • Issue comment event: 52
  • Member event: 1
  • Public event: 1
  • Push event: 38
  • Pull request review comment event: 24
  • Pull request review event: 26
  • Pull request event: 30
  • Fork event: 14

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 56
  • Total Committers: 5
  • Avg Commits per committer: 11.2
  • Development Distribution Score (DDS): 0.232
Past Year
  • Commits: 56
  • Committers: 5
  • Avg Commits per committer: 11.2
  • Development Distribution Score (DDS): 0.232
Top Committers
Name Email Commits
Robert Geirhos 2****s 43
sam-motamed s****d@g****m 10
Jay Lokhande 9****e 1
小孩 8****5@q****m 1
veya2ztn z****0@g****m 1
Committer Domains (Top 20 + Academic)
qq.com: 1

Issues and Pull Requests

Last synced: 5 months ago

All Time
  • Total issues: 10
  • Total pull requests: 29
  • Average time to close issues: 4 days
  • Average time to close pull requests: 2 days
  • Total issue authors: 9
  • Total pull request authors: 7
  • Average comments per issue: 1.3
  • Average comments per pull request: 1.52
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 10
  • Pull requests: 29
  • Average time to close issues: 4 days
  • Average time to close pull requests: 2 days
  • Issue authors: 9
  • Pull request authors: 7
  • Average comments per issue: 1.3
  • Average comments per pull request: 1.52
  • Merged pull requests: 23
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • rgeirhos (2)
  • NielsRogge (1)
  • limpbot (1)
  • Steven-Xiong (1)
  • stefan-baumann (1)
  • wisdomikezogwo (1)
  • chloexiangyy (1)
  • veya2ztn (1)
  • yingShen-ys (1)
Pull Request Authors
  • rgeirhos (10)
  • sam-motamed (10)
  • zhuangshaobin (3)
  • Jay-Lokhande (2)
  • yfqiu98 (2)
  • veya2ztn (1)
  • david-klindt (1)
Top Labels
Issue Labels
enhancement (2)
Pull Request Labels

Dependencies

requirements.txt pypi
  • Pillow *
  • argparse *
  • matplotlib *
  • moviepy *
  • numpy *
  • opencv-python *
  • pandas *
  • tqdm *