scenedetect
:movie_camera: Python and OpenCV-based scene cut/transition detection program & library.
awesome-virtual-try-on
A curated list of awesome research papers, projects, code, datasets, workshops, etc. related to virtual try-on.
yuview
The Free and Open Source Cross Platform YUV Viewer with an advanced analytics toolset
https://github.com/bytedance/xgplayer
An HTML5 video player with a parser that saves traffic
contivqaexp
A software tool for designing, conducting, and analyzing continuous subjective video quality assessment experiments.
bombuscv-rs
OpenCV-based motion detection/recording software built for research on bumblebees.
https://github.com/altunenes/scramblery
Desktop app for image and video scrambling/blurring with various methods, including Fourier phase scrambling, applied to the entire image/video or just the detected facial area.
https://github.com/opencast/opencast
The free and open source solution for automated video capture and distribution at scale.
skelly_synchronize
Synchronization tool for videos of the same event. Uses audio cross correlation to synchronize.
https://github.com/bytedance/salmonn
SALMONN family: A suite of advanced multi-modal LLMs
https://github.com/ai-forever/kandinsky-4
Text and image to video generation: Kandinsky 4.0 (2024)
https://github.com/akamhy/videohash
Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 64-bit comparable hash-value for any video.
https://github.com/simleek/displayarray
An OpenCV interface to display tensors, multiple cameras, and so on.
https://github.com/vpalmisano/webrtcperf
WebRTC performance and quality evaluation tool.
https://github.com/SocAIty/media-toolkit
Web-ready standardized file processing and serialization. Read, write, convert and send files. Including image, audio, video and any other file. Easily convert between numpy, base64, bytes and more.
ball-action-spotting
SoccerNet@CVPR | 1st place solution for Ball Action Spotting Challenge 2023
scribesalad
A collection of YouTube video transcripts: podcasts (Joe Rogan Experience, Tim Ferriss, Jocko Podcast, ...) and lectures (YaleCourses, MIT lectures, ...). A big transcript salad spanning history, geography, science, politics, filmmaking, and more.
https://github.com/alexkranias/sketchit
SketchIt is interactive media manipulation software that applies fundamental computer vision and edge detection algorithms to media for both educational and artistic purposes.
homepage
A simple Flask web app that pulls the audio track from almost any internet video using ytdl
https://github.com/awslabs/speke-reference-server
Secure Packager and Encoder Key Exchange (SPEKE) is part of the AWS Elemental content encryption protection strategy for media services customers. SPEKE defines the standard for communication between our media services and digital rights management (DRM) system key servers. This project provides the basic framework that partners can specialize and extend to support their specific method of Digital Rights Management while utilizing AWS' video streaming solutions.
manga-reader
Generate a video recap of any manga volume PDF with GPT Vision and ElevenLabs narration. Discord: https://discord.gg/MMqcuDe2WZ
https://github.com/chenzhaiyu/anomaly-detection
Intrusion detection and displacement monitoring from video sequences
image-processing-matlab
Time-lapse images acquired from the IncuCyte® and processed with MATLAB to create a time-lapse video of one particular aggregate of bacteria
https://github.com/amazon-science/gluonmm
A library of transformer models for computer vision and multi-modality research
argan
[Open Source]. ARGAN - The improved version of AnimeGAN. Landscape photos/videos to anime
open-in-mpv
Host-side of the extension to open any link or page URL in mpv via the browser context menu.
bmt
Source code for "Bi-modal Transformer for Dense Video Captioning" (BMVC 2020)
https://github.com/amirzenoozi/aparat-videos-dataset
Some Simple Information About Aparat Videos for Data Scientists
https://github.com/hcmlab/nova
NOVA is a tool for annotating and analyzing behaviours in social interactions. It supports annotators with machine learning during the coding process. It also features both discrete labels and continuous scores, as well as a visualization of streams recorded with the SSI Framework.
https://github.com/alexkranias/photessera
Photessera is Java-based video and image manipulation software that allows users to create video and image mosaics constructed from a separate group of user-selected images. The software supports the following file types as input: MP4, MOV, PNG, JPG. As of now, the software is only verified as functional on Windows devices. A video explanation of the software's functionality can be found at https://youtu.be/ftKO35jiCHQ
https://github.com/altunenes/gstreamer-parallelism-study
Tech experiment on parallel video decoding for my blog post (CPU-based, using crossbeam)
2d3mf
Code and models for the paper "2D3MF: Deepfake Detection using Multi Modal Middle Fusion"