https://github.com/arya-gaj/your-phone-can-spot-fashion-v1

A lightweight yet powerful system that analyzes short-form videos in real time to identify fashion products by combining computer vision and natural language processing, all processed locally.

Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (6.5%) to scientific vocabulary

Keywords

colab faiss-cpu jupyter librosa matplotlib numpy openai-clip opencv-python os pandas python requests scikit-learn seaborn shutil tensorflow threadpoolexecutor torch tqdm ultralytics
Last synced: 5 months ago

Repository

A lightweight yet powerful system that analyzes short-form videos in real time to identify fashion products by combining computer vision and natural language processing, all processed locally.

Basic Info
  • Host: GitHub
  • Owner: arya-gaj
  • License: agpl-3.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 26.2 MB
Statistics
  • Stars: 0
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Topics
colab faiss-cpu jupyter librosa matplotlib numpy openai-clip opencv-python os pandas python requests scikit-learn seaborn shutil tensorflow threadpoolexecutor torch tqdm ultralytics
Created 9 months ago · Last pushed 6 months ago
Metadata Files
Readme · Contributing · License · Citation

README.md

Your Phone Can Spot Fashion

Abstract

This project introduces a comprehensive pipeline for fashion product discovery in short-form videos, such as those found on platforms like Instagram Reels and TikTok. It integrates both visual and textual cues: frames are extracted for object detection, while video captions and hashtags are leveraged to infer the underlying vibe. Visual-semantic embeddings are generated using the Contrastive Language-Image Pre-training (CLIP) model, enabling similarity search against a curated product catalog through Facebook AI Similarity Search (FAISS). Each detected item is then classified as an Exact Match, Similar Match, or No Match to improve retrieval precision, and performance is evaluated through detailed visualizations, including confidence score distributions, match type breakdowns, and product type frequency heatmaps. To address ethical concerns regarding user privacy and data protection, all video and product data are processed and stored locally without external transmission. This work lays essential groundwork for future research in automated tagging, vibe classification, and intelligent product discovery in video-driven e-commerce.
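The retrieval step described above — embedding lookup against a catalog followed by bucketing into match types — can be sketched as follows. This is a minimal illustration, not code from the repository: it uses plain NumPy inner-product search to stand in for FAISS's `IndexFlatIP`, and the toy embeddings and the `EXACT_THRESHOLD` / `SIMILAR_THRESHOLD` cutoffs are assumptions chosen for the example, not values taken from the project.

```python
import numpy as np

# Illustrative cutoffs for the three match types named in the README.
# The actual thresholds used by the project are not documented here.
EXACT_THRESHOLD = 0.90
SIMILAR_THRESHOLD = 0.70

def normalize(vectors: np.ndarray) -> np.ndarray:
    """L2-normalize rows so that inner product equals cosine similarity."""
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

def search_catalog(query: np.ndarray, catalog: np.ndarray) -> tuple[int, float]:
    """Return (best catalog index, similarity score) for one query embedding.

    Equivalent in spirit to searching a FAISS IndexFlatIP built over
    normalized catalog embeddings.
    """
    scores = normalize(catalog) @ normalize(query[None, :])[0]
    best = int(np.argmax(scores))
    return best, float(scores[best])

def classify_match(score: float) -> str:
    """Bucket a similarity score into the three match types from the README."""
    if score >= EXACT_THRESHOLD:
        return "Exact Match"
    if score >= SIMILAR_THRESHOLD:
        return "Similar Match"
    return "No Match"

# Toy 4-dimensional vectors standing in for CLIP image embeddings.
catalog = np.array([[1.0, 0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0, 0.0]])
query = np.array([0.95, 0.05, 0.0, 0.0])

idx, score = search_catalog(query, catalog)
print(idx, classify_match(score))  # closest item is index 0, an "Exact Match"
```

In a real CLIP + FAISS pipeline the embeddings would be 512-dimensional (or larger) vectors produced by the image encoder, and the catalog index would be built once and queried per detected item; the thresholding logic is the same shape regardless of embedding size.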

Owner

  • Name: Aryaman Gajrani
  • Login: arya-gaj
  • Kind: user

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Gajrani"
  given-names: "Aryaman"
  orcid: "https://orcid.org/0009-0009-7141-8707"
title: "your-phone-can-spot-fashion"
version: 1.0.0
doi: 10.5281/zenodo.1234
date-released: 2025-07-01
url: "https://github.com/arya-gaj/your-phone-can-spot-fashion"

GitHub Events

Total
  • Push event: 11
Last Year
  • Push event: 11