https://github.com/akamhy/videohash

Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 64-bit comparable hash-value for any video.


Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: springer.com
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary

Keywords

duplicate-detection duplicate-video-finder duplicate-videos ffmpeg find-similar-videos-by-content ndvd ndvr near-duplicate-video near-duplicate-video-clip-detection python video video-deduplication video-similarity-search visual-claim
Last synced: 6 months ago

Repository


Basic Info
Statistics
  • Stars: 332
  • Watchers: 9
  • Forks: 54
  • Open Issues: 18
  • Releases: 24
Created about 5 years ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License

README.md


The Python package for near duplicate video detection



Introduction

Videohash is a Python package for detecting near-duplicate videos (perceptual video hashing). It takes any input video and generates a 64-bit hash value for it. Videohash is much faster than comparing the imagehash values of individual video frames and more reliable than hashing keyframes.

The hash values of identical or near-duplicate videos are the same or similar. If a video is resized (upscaled or downscaled), transcoded, stabilized, or cropped, if its colors, frame rate, or aspect ratio are changed, or if a watermark or black bars are added or removed, the hash value should remain unchanged or vary only slightly.

How the hash values are calculated

  • One frame is extracted from the input video every second, and each frame is shrunk to a 144x144 pixel square.
  • A collage containing all of the resized (square) frames is constructed. The bit-list of the collage's wavelet hash is the first bit-list.
  • The extracted frames are then stitched together horizontally, and the resulting image is divided into 64 equally sized images. The dominant color of each of these 64 images is detected and compared with a predefined pattern of dominant colors: if they match, the bit is set; otherwise it is unset. This gives the second bit-list.
  • The two bit-lists are combined with a bitwise XOR, and the resulting bits are joined to form the final 64-bit hash value of the input video.
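
The final combination step described above can be sketched in a few lines of plain Python. This is only an illustration, not the package's own code: the helper names `xor_bitlists` and `bits_to_int` are hypothetical, the two input bit-lists are made-up stand-ins rather than values computed from a real video, and joining the bits most-significant-first is an assumption.

```python
from typing import List


def xor_bitlists(first: List[int], second: List[int]) -> List[int]:
    """Bitwise XOR of two equal-length lists of 0/1 values."""
    if len(first) != len(second):
        raise ValueError("bit-lists must have the same length")
    return [a ^ b for a, b in zip(first, second)]


def bits_to_int(bits: List[int]) -> int:
    """Join bits into one integer (assumption: most significant bit first)."""
    value = 0
    for bit in bits:
        value = (value << 1) | bit
    return value


# Made-up 64-bit inputs, not derived from any real video.
wavelet_bits = [1, 0] * 32       # stand-in for the collage's wavelet-hash bits
color_bits = [1, 1, 0, 0] * 16   # stand-in for the dominant-color pattern bits

final_bits = xor_bitlists(wavelet_bits, color_bits)
video_hash = bits_to_int(final_bits)
print(f"0x{video_hash:016x}")  # 0x6666666666666666
```

Each group of four bits here is `1010 XOR 1100 = 0110`, which is why the illustrative result is a run of hex `6` digits.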

When not to use Videohash

  • Videohash cannot be used to verify whether one video is a part of another (video fingerprinting). If the video is reversed or rotated by a substantial angle (greater than 10 degrees), Videohash will not produce the same or a similar hash value. You can, however, reverse the video manually and generate the hash value for the reversed video.

How to compare the video hash values stored in a database
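
One possible way to compare stored hash values is to count the bits in which two 64-bit hashes differ (the Hamming distance) and treat a small distance as a near-duplicate. The sketch below assumes this approach and uses only the Python standard library. The table layout, the helper names `hamming_distance` and `find_similar`, the sample hash values, and the `max_distance=10` threshold are all hypothetical choices for illustration, not part of videohash.

```python
import sqlite3


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Number of bit positions in which two hash values differ."""
    return bin(hash_a ^ hash_b).count("1")


def find_similar(conn, query_hash: int, max_distance: int):
    """Return (name, distance) for every stored hash within max_distance bits."""
    matches = []
    for name, hash_hex in conn.execute("SELECT name, hash_hex FROM videos"):
        distance = hamming_distance(query_hash, int(hash_hex, 16))
        if distance <= max_distance:
            matches.append((name, distance))
    return sorted(matches, key=lambda item: item[1])


# Hashes are stored as hex text: an unsigned 64-bit value can exceed
# SQLite's signed 64-bit INTEGER range.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE videos (name TEXT PRIMARY KEY, hash_hex TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO videos VALUES (?, ?)",
    [
        ("original.mp4", "0f0f0f0f0f0f0f0f"),
        ("reencoded.mp4", "0f0f0f0f0f0f0f0c"),  # 2 bits away from original
        ("unrelated.mp4", "f0f0f0f0f0f0f0f0"),  # all 64 bits differ
    ],
)

matches = find_similar(conn, int("0f0f0f0f0f0f0f0f", 16), max_distance=10)
print(matches)  # [('original.mp4', 0), ('reencoded.mp4', 2)]
```

This linear scan visits every row, so it is only a starting point; a large collection would need an index structure suited to Hamming-distance search.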


Installation

To use this software, you must have FFmpeg installed. Please read how to install FFmpeg if you don't already know how.
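
Before installing, it can help to confirm that FFmpeg is actually reachable from Python. The following is a minimal check, assuming the executable is named `ffmpeg` and is expected on the `PATH`; the helper name `ffmpeg_version` is hypothetical and is not part of videohash.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version(executable: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<executable> -version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


version = ffmpeg_version()
print(version or "FFmpeg not found on PATH; install it before using videohash")
```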

Install videohash

Upgrade pip:

    python3 -m pip install --upgrade pip

If you do not want to upgrade pip and the installation fails, try appending --prefer-binary to the installation command(s) below.

Install from PyPI (recommended):

    pip install videohash

Using conda, from conda-forge (recommended):

Maintainer: @step21

    conda install -c conda-forge videohash

Install directly from the GitHub repository (NOT recommended):

    pip install git+https://github.com/akamhy/videohash.git


Features

  • Generate the videohash of a video directly from its URL (uses yt-dlp) or its path.
  • Can be used as the core of a scalable Near Duplicate Video Retrieval (NDVR) system.
  • The end user can access the image representation (the collage) of the video.
  • A videohash instance can be compared to a 64-bit stored hash, its hex representation, bitlist, and other videohash instances.
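
To illustrate why a hash can be compared across these representations, the sketch below normalizes an integer, a prefixed hex or binary string, or a bit-list to a plain integer before counting differing bits. It is a standalone illustration rather than videohash's own comparison logic: `to_int` and `bit_difference` are hypothetical helpers, and the requirement that strings carry a `0x` or `0b` prefix is an assumption of this sketch.

```python
from typing import List, Union

HashLike = Union[int, str, List[int]]


def to_int(value: HashLike) -> int:
    """Convert an int, a "0x..."/"0b..." string, or a list of 0/1 bits to an int."""
    if isinstance(value, int):
        result = value
    elif isinstance(value, str):
        result = int(value, 0)  # base 0 honors the 0x / 0b prefix
    else:
        result = int("".join(str(bit) for bit in value), 2)
    if not 0 <= result < 2**64:
        raise ValueError("hash does not fit in 64 bits")
    return result


def bit_difference(a: HashLike, b: HashLike) -> int:
    """Number of differing bits between two hashes in any supported form."""
    return bin(to_int(a) ^ to_int(b)).count("1")


print(bit_difference(0xF0, "0xf3"))                            # 2
print(bit_difference("0b11110000", [1, 1, 1, 1, 0, 0, 0, 0]))  # 0
```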

Usage

In the following usage example, the first, second, and fourth instances of the VideoHash class compute the hash of the same video (the same content, not necessarily the same checksum), and the third is a different video.

```python

from videohash import VideoHash

# Two copies of the same video content, hashed from their URLs.
url1 = "https://user-images.githubusercontent.com/64683866/168872267-7c6682f8-7294-4d9a-8a68-8c6f44c06df6.mp4"
videohash1 = VideoHash(url=url1)

url2 = "https://user-images.githubusercontent.com/64683866/168869109-1f77c839-6912-4e24-8738-42cb15f3ab47.mp4"
videohash2 = VideoHash(url=url2)
print(videohash2 - videohash1)            # 2 (number of differing bits)
print(videohash2.is_similar(videohash1))  # True

# A different video.
url3 = "https://user-images.githubusercontent.com/64683866/148960165-a210f2d2-6c41-4349-bd8d-a4cb673bc0af.mp4"
videohash3 = VideoHash(url=url3)
print(videohash3.is_similar(videohash1))   # False
print(videohash3.is_diffrent(videohash2))  # True (identifier spelled as in the source)
print(videohash3 - videohash1)             # 34
print(videohash3 - videohash2)             # 34

# A local copy of the first video, hashed from its path.
path4 = "/home/akamhy/Downloads/168872267-7c6682f8-7294-4d9a-8a68-8c6f44c06df6.mp4"
videohash4 = VideoHash(path=path4)
print(videohash4 == videohash1)            # True
print(videohash4 - videohash1)             # 0
print(videohash4.is_similar(videohash2))   # True
print(videohash4.is_similar(videohash4))   # True
print(videohash4.is_similar(videohash3))   # False

```

Extended Usage : https://github.com/akamhy/videohash/wiki/Extended-Usage

API Reference : https://github.com/akamhy/videohash/wiki/API-Reference


Credits


License

License: MIT

Copyright (c) 2021-2022 Akash Mahanty. See license for details.

The VideoHash logo was created by iconolocode. See license for details.

Videos are from NASA and are in the public domain.

NASA copyright policy states that "NASA material is not protected by copyright unless noted".

Owner

  • Name: Akash Mahanty
  • Login: akamhy
  • Kind: user
  • Location: Delhi, India


GitHub Events

Total
  • Issues event: 2
  • Watch event: 45
  • Issue comment event: 1
  • Pull request event: 1
  • Fork event: 4
Last Year
  • Issues event: 2
  • Watch event: 45
  • Issue comment event: 1
  • Pull request event: 1
  • Fork event: 4

Committers

Last synced: 6 months ago

All Time
  • Total Commits: 198
  • Total Committers: 6
  • Avg Commits per committer: 33.0
  • Development Distribution Score (DDS): 0.025
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name (email): commits
  • Akash Mahanty (a****y@y****m): 193
  • whitesource-bolt-for-github[bot] (4****]): 1
  • iconolocode (9****e): 1
  • Florian Idelberger (s****1): 1
  • Eddie Thokerunga (4****n): 1
  • Codacy Badger (b****r@c****m): 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 45
  • Total pull requests: 57
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 27 minutes
  • Total issue authors: 19
  • Total pull request authors: 12
  • Average comments per issue: 1.84
  • Average comments per pull request: 1.09
  • Merged pull requests: 49
  • Bot issues: 8
  • Bot pull requests: 1
Past Year
  • Issues: 1
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • akamhy (20)
  • mend-bolt-for-github[bot] (8)
  • ziczhu (1)
  • specky532 (1)
  • melyux (1)
  • lockywolf (1)
  • dale-wahl (1)
  • MikPisula (1)
  • runck (1)
  • christopherwingert (1)
  • CaileanMParker (1)
  • akamg (1)
  • trim21 (1)
  • hagemt (1)
  • Demmenie (1)
Pull Request Authors
  • akamhy (44)
  • jerrecode (2)
  • Demmenie (2)
  • mend-bolt-for-github[bot] (1)
  • dale-wahl (1)
  • codacy-badger (1)
  • iconolocode (1)
  • step21 (1)
  • Eddievin (1)
  • aryan6969 (1)
  • pritamsay (1)
  • albertopasqualetto (1)
Top Labels
Issue Labels
  • bug (9)
  • security vulnerability (8)
  • enhancement (3)
  • hacktoberfest (2)
  • good first issue (1)
  • help wanted (1)
Pull Request Labels
  • enhancement (3)
  • bug (2)
  • documentation (2)
  • hacktoberfest-accepted (2)

Packages

  • Total packages: 2
  • Total downloads:
    • pypi 9,908 last-month
  • Total docker downloads: 36
  • Total dependent packages: 0
    (may contain duplicates)
  • Total dependent repositories: 1
    (may contain duplicates)
  • Total versions: 26
  • Total maintainers: 1
pypi.org: videohash

Python package for Near Duplicate Video Detection (Perceptual Video Hashing) - Get a 64-bit comparable hash-value for any video.

  • Versions: 24
  • Dependent Packages: 0
  • Dependent Repositories: 1
  • Downloads: 9,908 Last month
  • Docker Downloads: 36
Rankings
Stargazers count: 4.6%
Downloads: 4.6%
Forks count: 7.7%
Average: 9.7%
Dependent packages count: 10.0%
Dependent repos count: 21.8%
Maintainers (1)
Last synced: 6 months ago
conda-forge.org: videohash
  • Versions: 2
  • Dependent Packages: 0
  • Dependent Repositories: 0
Rankings
Stargazers count: 27.8%
Dependent repos count: 34.0%
Average: 38.8%
Forks count: 42.2%
Dependent packages count: 51.2%
Last synced: 6 months ago

Dependencies

.github/workflows/ci_linux.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/ci_mac_os.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/ci_windows.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
.github/workflows/codeql-analysis.yml actions
  • actions/checkout v2 composite
  • github/codeql-action/analyze v1 composite
  • github/codeql-action/autobuild v1 composite
  • github/codeql-action/init v1 composite
.github/workflows/python-publish.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
requirements-test.txt pypi
  • black * test
  • codecov * test
  • flake8 * test
  • mypy * test
  • patool * test
  • pytest * test
  • pytest-cov * test
  • pyunpack * test
  • requests * test
  • types-Pillow * test
requirements.txt pypi
  • ImageHash *
  • Pillow *
  • imagedominantcolor *
  • yt-dlp *
setup.py pypi
  • ImageHash *
  • Pillow *
  • imagedominantcolor *
  • yt-dlp *