https://github.com/aleph23/film

FILM: Frame Interpolation for Large Motion, In ECCV 2022.
Repository

Basic Info

Host: GitHub
Owner: aleph23
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://film-net.github.io
Size: 26.4 MB
Statistics

Stars: 0
Watchers: 0
Forks: 0
Open Issues: 2
Releases: 0
Fork of jford49/frame-interpolation
Created about 2 years ago · Last pushed almost 2 years ago
https://github.com/aleph23/film/blob/main/

# MasterClasses.01 pipeline

Features:
- frame detection - `python3 formatters/lsall.py`
- frame rename - `python3 formatters/rename.py`
- frame interpolation - `python3 frame_gen.py`
- mp4 generation - `python3 frame_combine.py`

Directories:
- interpolation configuration - `./config`
- storage for interpolation models - `./pretrained_models`

---

#### Original documentation

---

# FILM: Frame Interpolation for Large Motion

### [Website](https://film-net.github.io/) | [Paper](https://arxiv.org/pdf/2202.04901.pdf) | [Google AI Blog](https://ai.googleblog.com/2022/10/large-motion-frame-interpolation.html) | [Tensorflow Hub Colab](https://www.tensorflow.org/hub/tutorials/tf_hub_film_example) | [YouTube](https://www.youtube.com/watch?v=OAD-BieIjH4) 


The official Tensorflow 2 implementation of our high-quality frame interpolation neural network. We present a unified single-network approach that does not rely on additional pre-trained networks, such as optical flow or depth, and yet achieves state-of-the-art results. We use a multi-scale feature extractor that shares the same convolution weights across the scales. Our model is trainable from frame triplets alone.


[FILM: Frame Interpolation for Large Motion](https://arxiv.org/abs/2202.04901) 

[Fitsum Reda](https://fitsumreda.github.io/)¹, [Janne Kontkanen](https://scholar.google.com/citations?user=MnXc4JQAAAAJ&hl=en)¹, [Eric Tabellion](http://www.tabellion.org/et/)¹, [Deqing Sun](https://deqings.github.io/)¹, [Caroline Pantofaru](https://scholar.google.com/citations?user=vKAKE1gAAAAJ&hl=en)¹, [Brian Curless](https://homes.cs.washington.edu/~curless/)¹,²

¹Google Research, ²University of Washington

In ECCV 2022.

![A sample 2 seconds moment.](https://github.com/googlestaging/frame-interpolation/blob/main/moment.gif)
FILM transforms near-duplicate photos into slow-motion footage that looks like it was shot with a video camera.

## Web Demo

Integrated into [Hugging Face Spaces](https://huggingface.co/spaces) using [Gradio](https://github.com/gradio-app/gradio). Try out the Web Demo: [![Hugging Face Spaces](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Spaces-blue)](https://huggingface.co/spaces/johngoad/frame-interpolation)

Try the interpolation model with the Replicate web demo:
[![Replicate](https://replicate.com/google-research/frame-interpolation/badge)](https://replicate.com/google-research/frame-interpolation)

Try FILM to interpolate between two or more images with the PyTTI-Tools at [![PyTTI-Tools:FILM](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.sandbox.google.com/github/pytti-tools/frame-interpolation/blob/main/PyTTI_Tools_FiLM-colab.ipynb#scrollTo=-7TD7YZJbsy_)

An alternative Colab for running FILM on an arbitrary number of input images, not just two: [![FILM-Gdrive](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1NuaPPSvUhYafymUf2mEkvhnEtpD5oihs)

## Change Log
* **Nov 28, 2022**: Upgrade `eval.interpolator_cli` for **high resolution frame interpolation**. `--block_height` and `--block_width` determine the total number of patches (`block_height*block_width`) into which the input images are subdivided. By default, both arguments are set to 1, so no subdivision is done.
* **Mar 12, 2022**: Support for Windows, see [WINDOWS_INSTALLATION.md](https://github.com/google-research/frame-interpolation/blob/main/WINDOWS_INSTALLATION.md).
* **Mar 09, 2022**: Support for **high resolution frame interpolation**. Set `--block_height` and `--block_width` in `eval.interpolator_test` to extract patches from the inputs, and reconstruct the interpolated frame from the iteratively interpolated patches.
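To make the patch count concrete, here is a minimal sketch of how an image could be cut into a `block_height` by `block_width` grid. It only computes pixel boxes; how the actual CLI pads, overlaps, or reassembles patches is not described here, so the remainder handling below is an assumption. `patch_boxes` is a hypothetical helper, not part of the repository.

```python
def patch_boxes(height, width, block_height=1, block_width=1):
    """Return (top, bottom, left, right) boxes covering a height x width image.

    The image is cut into block_height rows and block_width columns, so
    block_height * block_width boxes are returned. Remainder pixels are
    absorbed by the last row and column (an assumption of this sketch).
    """
    if block_height < 1 or block_width < 1:
        raise ValueError("block counts must be >= 1")
    patch_h = height // block_height
    patch_w = width // block_width
    boxes = []
    for row in range(block_height):
        top = row * patch_h
        bottom = height if row == block_height - 1 else top + patch_h
        for col in range(block_width):
            left = col * patch_w
            right = width if col == block_width - 1 else left + patch_w
            boxes.append((top, bottom, left, right))
    return boxes
```

With the default of 1 for both arguments, the single box spans the whole image, which matches the "no subdivision" behavior described in the change log.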

## Installation

*   Get the Frame Interpolation source code

```
git clone https://github.com/google-research/frame-interpolation
cd frame-interpolation
```

*   Optionally, pull the recommended Docker base image

```
docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-6:latest
```

* If you do not use Docker, set up your NVIDIA GPU environment with:
    * [Anaconda Python 3.9](https://www.anaconda.com/products/individual)
    * [CUDA Toolkit 11.2.1](https://developer.nvidia.com/cuda-11.2.1-download-archive)
    * [cuDNN 8.1.0](https://developer.nvidia.com/rdp/cudnn-download)

*   Install frame interpolation dependencies

```
pip3 install -r requirements.txt
sudo apt-get install -y ffmpeg
```

### See [WINDOWS_INSTALLATION](https://github.com/google-research/frame-interpolation/blob/main/WINDOWS_INSTALLATION.md) for Windows Support

## Pre-trained Models

*   Create a directory where you can keep large files. Ideally, not in this
    directory.

```
mkdir -p 
```

*   Download pre-trained TF2 Saved Models from
    [google drive](https://drive.google.com/drive/folders/1q8110-qp225asX3DQvZnfLfJPkCHmDpy?usp=sharing)
    and put into ``.

The downloaded folder should have the following structure:

```
/
 film_net/
    L1/
    Style/
    VGG/
 vgg/
    imagenet-vgg-verydeep-19.mat
```
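Before running the interpolator, it can help to confirm that the download matches the layout above. The following is a small sketch using only the standard library; `missing_entries` is a hypothetical helper, and the root directory you pass in is whatever location you chose for the large files.

```python
from pathlib import Path

# Relative paths taken from the folder structure listed above.
EXPECTED_ENTRIES = [
    "film_net/L1",
    "film_net/Style",
    "film_net/VGG",
    "vgg/imagenet-vgg-verydeep-19.mat",
]


def missing_entries(model_root):
    """Return the expected entries that do not exist under model_root."""
    root = Path(model_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]
```

An empty list means every expected folder and file was found; anything else names what still needs to be downloaded.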

## Running the Code

The following instructions run the interpolator on the photos provided in
'frame-interpolation/photos'.

### One mid-frame interpolation

To generate an intermediate photo from the input near-duplicate photos, simply run:

```
python3 -m eval.interpolator_test \
   --frame1 photos/one.png \
   --frame2 photos/two.png \
   --model_path /film_net/Style/saved_model \
   --output_frame photos/output_middle.png
```

This will produce the sub-frame at `t=0.5` and save it as 'photos/output_middle.png'.

### Many in-between frames interpolation

`eval.interpolator_cli` takes a set of directories identified by a glob (`--pattern`). Each directory
is expected to contain at least two input frames, with each contiguous frame
pair treated as an input to generate in-between frames. Frames should be named so that sorting them naturally with `natsort` preserves their desired order.

```
python3 -m eval.interpolator_cli \
   --pattern "photos" \
   --model_path /film_net/Style/saved_model \
   --times_to_interpolate 6 \
   --output_video
```

You will find the interpolated frames (including the input frames) in
'photos/interpolated_frames/', and the interpolated video at
'photos/interpolated.mp4'.
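The naming requirement matters because natural ordering differs from plain string ordering once frame numbers reach two digits. The sketch below illustrates the difference with a standard-library sort key; it only approximates what `natsort` does and is not the code the CLI uses. `natural_key` and the sample filenames are hypothetical.

```python
import re


def natural_key(name):
    """Split a filename into text and integer chunks for natural ordering."""
    # re.split with a capturing group alternates text and digit chunks,
    # so text is always compared with text and numbers with numbers.
    return [int(chunk) if chunk.isdecimal() else chunk.lower()
            for chunk in re.split(r"(\d+)", name)]


frames = ["frame_10.png", "frame_2.png", "frame_1.png"]

print(sorted(frames))                   # plain lexicographic order
print(sorted(frames, key=natural_key))  # natural order
```

In lexicographic order `frame_10.png` lands before `frame_2.png`, which would pair the wrong frames; zero-padding the numbers (for example `frame_002.png`) makes both orderings agree.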

The number of frames is determined by `--times_to_interpolate`, which controls
the number of times the frame interpolator is invoked. When the number of frames
in a directory is `num_frames`, the number of output frames will be
`(2^times_to_interpolate+1)*(num_frames-1)`.
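As a worked example of the formula, the sketch below assumes each pass inserts one midpoint between every adjacent pair of frames, so a single input pair ends up with `2^times_to_interpolate + 1` frames including its two endpoints. Both function names are hypothetical, and the second function simply evaluates the formula exactly as stated above.

```python
def inbetween_times(times_to_interpolate):
    """Timestamps in [0, 1] produced by recursively bisecting one frame pair."""
    def bisect(start, end, depth):
        if depth == 0:
            return []
        mid = (start + end) / 2
        return bisect(start, mid, depth - 1) + [mid] + bisect(mid, end, depth - 1)

    # The two input frames sit at t=0 and t=1.
    return [0.0] + bisect(0.0, 1.0, times_to_interpolate) + [1.0]


def readme_frame_count(num_frames, times_to_interpolate):
    """Output frame count exactly as the formula above states it."""
    return (2 ** times_to_interpolate + 1) * (num_frames - 1)
```

For two input frames and `--times_to_interpolate 6`, both views agree on 65 frames. For longer sequences the formula counts `2^times_to_interpolate + 1` frames per adjacent pair; whether a frame shared by two neighboring pairs is written once or twice is not stated here, so check the output folder if the exact count matters.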

## Datasets

We use [Vimeo-90K](http://data.csail.mit.edu/tofu/dataset/vimeo_triplet.zip) as
our main training dataset. For quantitative evaluations, we rely on commonly
used benchmark datasets, specifically:

*   [Vimeo-90K](http://data.csail.mit.edu/tofu/testset/vimeo_interp_test.zip)
*   [Middlebury-Other](https://vision.middlebury.edu/flow/data)
*   [UCF101](https://people.cs.umass.edu/~hzjiang/projects/superslomo/UCF101_results.zip)
*   [Xiph](https://github.com/sniklaus/softmax-splatting/blob/master/benchmark.py)

### Creating a TFRecord

The training and benchmark evaluation scripts expect the frame triplets in the
[TFRecord](https://www.tensorflow.org/tutorials/load_data/tfrecord) storage format. 


We have included scripts that encode the relevant frame triplets into a
[tf.train.Example](https://www.tensorflow.org/api_docs/python/tf/train/Example)
data format, and export to a TFRecord file. 


You can use the commands `python3 -m
datasets.create__tfrecord --help` for more information.

For example, run the command below to create a TFRecord for the Middlebury-other
dataset. Download the [images](https://vision.middlebury.edu/flow/data) and point `--input_dir` to the unzipped folder path.

```
python3 -m datasets.create_middlebury_tfrecord \
  --input_dir= \
  --output_tfrecord_filepath= \
  --num_shards=3
```

The above command will output a TFRecord file with 3 shards as `@3`.
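Sharded outputs are commonly written as one file per shard with a numbered suffix. The sketch below assumes the widespread `-00000-of-00003` naming convention; the repository's scripts are not shown here, so treat both the suffix format and the example prefix as assumptions. `shard_names` is a hypothetical helper.

```python
def shard_names(prefix, num_shards):
    """List per-shard filenames under the assumed NNNNN-of-NNNNN convention."""
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    return [f"{prefix}-{index:05d}-of-{num_shards:05d}"
            for index in range(num_shards)]
```

For instance, `shard_names("middlebury.tfrecord", 3)` would list three filenames, which is a quick way to check that every shard was actually written.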

## Training

Below are our training gin configuration files for the different loss functions:

```
training/
 config/
    film_net-L1.gin
    film_net-VGG.gin
    film_net-Style.gin
```

To launch training, simply pass the configuration filepath of the desired
experiment.

By default, it uses all visible GPUs for training. To debug or train
on a CPU, append `--mode cpu`.

```
python3 -m training.train \
   --gin_config training/config/.gin \
   --base_folder  \
   --label 
```

*   When training finishes, the folder structure will look like this:

```
/
 /
    config.gin
    eval/
    train/
    saved_model/
```

### Build a SavedModel

Optionally, to build a
[SavedModel](https://www.tensorflow.org/guide/saved_model) from a folder of
trained checkpoints, you can use this command:

```
python3 -m training.build_saved_model_cli \
   --base_folder  \
   --label 
```

*   By default, a SavedModel is created when the training loop ends, and it will be saved at
    `//saved_model`.

## Evaluation on Benchmarks

Below are the evaluation gin configuration files for the benchmarks we
considered:

```
eval/
 config/
    middlebury.gin
    ucf101.gin
    vimeo_90K.gin
    xiph_2K.gin
    xiph_4K.gin
```

To run an evaluation, simply pass the configuration file of the desired evaluation dataset. 

If a GPU is visible, it runs on it.

```
python3 -m eval.eval_cli \
   --gin_config eval/config/.gin \
   --model_path /film_net/L1/saved_model
```

The above command will produce the PSNR and SSIM scores presented in the paper.
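For readers unfamiliar with the first metric, PSNR is defined as `10 * log10(MAX^2 / MSE)`. The sketch below implements that textbook definition on flat pixel lists; it is not the evaluation script's own implementation, and the `[0, 1]` pixel range is an assumption. `psnr` is a hypothetical helper.

```python
import math


def psnr(reference, prediction, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(prediction) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((r - p) ** 2 for r, p in zip(reference, prediction)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Higher values mean the interpolated frame is closer to the ground-truth middle frame; an error of 0.1 on every pixel in a `[0, 1]` range corresponds to about 20 dB.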

## Citation

If you find this implementation useful in your work, please acknowledge it
appropriately by citing:

```
@inproceedings{reda2022film,
 title = {FILM: Frame Interpolation for Large Motion},
 author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
 booktitle = {European Conference on Computer Vision (ECCV)},
 year = {2022}
}
```

```
@misc{film-tf,
  title = {Tensorflow 2 Implementation of "FILM: Frame Interpolation for Large Motion"},
  author = {Fitsum Reda and Janne Kontkanen and Eric Tabellion and Deqing Sun and Caroline Pantofaru and Brian Curless},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/google-research/frame-interpolation}}
}
```

## Acknowledgments

We would like to thank Richard Tucker, Jason Lai and David Minnen. We would also
like to thank Jamie Aspinall for the imagery included in this repository.

## Coding style

*   2 spaces for indentation
*   80 character line length
*   PEP8 formatting

## Disclaimer

This is not an officially supported Google product.
Owner

Login: aleph23
Kind: user
Location: Mojave
Company: Occasionally
Repositories: 5
Profile: https://github.com/aleph23
Primarily bipedal
GitHub Events

Total

Pull request review event: 1
Last Year

Pull request review event: 1