marqo-fashionclip
State-of-the-art CLIP/SigLIP embedding models finetuned for the fashion domain. +57% increase in evaluation metrics vs FashionCLIP 2.0.
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file (found)
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (4.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: marqo-ai
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://huggingface.co/Marqo
- Size: 11.2 MB
Statistics
- Stars: 95
- Watchers: 2
- Forks: 11
- Open Issues: 1
- Releases: 0
Metadata Files
README.md
Marqo-FashionCLIP
This repository is designed to evaluate Marqo-FashionCLIP and Marqo-FashionSigLIP across seven public benchmark datasets. Read more about the models on our blog.
Benchmark Results
We averaged the performance of three common tasks across the datasets: text-to-image, category-to-product, and sub-category-to-product. As demonstrated below, Marqo-FashionCLIP and Marqo-FashionSigLIP outperform both pretrained OpenCLIP models and the state-of-the-art fashion CLIP models. For a more comprehensive performance comparison, refer to the LEADERBOARD.
Text-To-Image (Averaged across 6 datasets)

| Model | AvgRecall | Recall@1 | Recall@10 | MRR |
|---|---|---|---|---|
| Marqo-FashionSigLIP | 0.231 | 0.121 | 0.340 | 0.239 |
| Marqo-FashionCLIP | 0.192 | 0.094 | 0.290 | 0.200 |
| FashionCLIP2.0 | 0.163 | 0.077 | 0.249 | 0.165 |
| OpenFashionCLIP | 0.132 | 0.060 | 0.204 | 0.135 |
| ViT-B-16-laion2b_s34b_b88k | 0.174 | 0.088 | 0.261 | 0.180 |
| ViT-B-16-SigLIP-webli | 0.212 | 0.111 | 0.314 | 0.214 |
Category-To-Product (Averaged across 5 datasets)

| Model | AvgP | P@1 | P@10 | MRR |
|---|---|---|---|---|
| Marqo-FashionSigLIP | 0.737 | 0.758 | 0.716 | 0.812 |
| Marqo-FashionCLIP | 0.705 | 0.734 | 0.676 | 0.776 |
| FashionCLIP2.0 | 0.684 | 0.681 | 0.686 | 0.741 |
| OpenFashionCLIP | 0.646 | 0.653 | 0.639 | 0.720 |
| ViT-B-16-laion2b_s34b_b88k | 0.662 | 0.673 | 0.652 | 0.743 |
| ViT-B-16-SigLIP-webli | 0.688 | 0.690 | 0.685 | 0.751 |
Sub-Category-To-Product (Averaged across 4 datasets)

| Model | AvgP | P@1 | P@10 | MRR |
|---|---|---|---|---|
| Marqo-FashionSigLIP | 0.725 | 0.767 | 0.683 | 0.811 |
| Marqo-FashionCLIP | 0.707 | 0.747 | 0.667 | 0.772 |
| FashionCLIP2.0 | 0.657 | 0.676 | 0.638 | 0.733 |
| OpenFashionCLIP | 0.598 | 0.619 | 0.578 | 0.689 |
| ViT-B-16-laion2b_s34b_b88k | 0.638 | 0.651 | 0.624 | 0.712 |
| ViT-B-16-SigLIP-webli | 0.643 | 0.643 | 0.643 | 0.726 |
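As a quick reference for reading these tables: Recall@K is the fraction of queries whose ground-truth item appears in the top K retrieved results, MRR averages the reciprocal rank of the first correct result, and P@K is the analogous precision at rank K. A minimal sketch of the first two (not the repository's eval.py implementation; the example ranks are made up):

```python
import numpy as np

def recall_at_k(ranks, k):
    # `ranks` holds the 1-based rank of the ground-truth item for each query
    return float((np.asarray(ranks) <= k).mean())

def mean_reciprocal_rank(ranks):
    # Average of 1/rank of the first correct result across queries
    return float((1.0 / np.asarray(ranks, dtype=float)).mean())

ranks = [1, 3, 12, 2, 7]             # hypothetical per-query ranks
print(recall_at_k(ranks, 10))        # 0.8
print(mean_reciprocal_rank(ranks))   # ~0.41
```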
Models
Hugging Face
We released our models on HuggingFace: Marqo-FashionCLIP and Marqo-FashionSigLIP. We also have a Hugging Face Space Demo of our models in action: Classification with Marqo-FashionSigLIP.
You can load the models with transformers. For Marqo-FashionCLIP:

```python
from transformers import AutoModel, AutoProcessor

model = AutoModel.from_pretrained('Marqo/marqo-fashionCLIP', trust_remote_code=True)
processor = AutoProcessor.from_pretrained('Marqo/marqo-fashionCLIP', trust_remote_code=True)
```

and for Marqo-FashionSigLIP:

```python
from transformers import AutoModel, AutoProcessor

model = AutoModel.from_pretrained('Marqo/marqo-fashionSigLIP', trust_remote_code=True)
processor = AutoProcessor.from_pretrained('Marqo/marqo-fashionSigLIP', trust_remote_code=True)
```
Then, to run inference:

```python
import torch
from PIL import Image

image = [Image.open("docs/fashion-hippo.png")]
text = ["a hat", "a t-shirt", "shoes"]
processed = processor(text=text, images=image, padding='max_length', return_tensors="pt")

with torch.no_grad():
    image_features = model.get_image_features(processed['pixel_values'], normalize=True)
    text_features = model.get_text_features(processed['input_ids'], normalize=True)

text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
print("Label probs:", text_probs)
```
If you want to see the model in action, we also released an article illustrating a simple ecommerce search over a fashion dataset.
OpenCLIP
You can load the models with open_clip. For Marqo-FashionCLIP:

```python
import open_clip

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:Marqo/marqo-fashionCLIP')
tokenizer = open_clip.get_tokenizer('hf-hub:Marqo/marqo-fashionCLIP')
```

and for Marqo-FashionSigLIP:

```python
import open_clip

model, preprocess_train, preprocess_val = open_clip.create_model_and_transforms('hf-hub:Marqo/marqo-fashionSigLIP')
tokenizer = open_clip.get_tokenizer('hf-hub:Marqo/marqo-fashionSigLIP')
```
Then, to run inference:

```python
import torch
from PIL import Image

image = preprocess_val(Image.open("docs/fashion-hippo.png")).unsqueeze(0)
text = tokenizer(["a hat", "a t-shirt", "shoes"])

with torch.no_grad(), torch.cuda.amp.autocast():
    image_features = model.encode_image(image, normalize=True)
    text_features = model.encode_text(text, normalize=True)

text_probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)
print("Label probs:", text_probs)
```
Marqo
To deploy on Marqo Cloud (recommended):

1. Sign up to Marqo Cloud.

2. Install Marqo and the Marqo python client:

```bash
pip install marqo
```

3. Create an index:

```python
import marqo

settings = {
    "type": "unstructured",
    "model": "marqo-fashion-clip",  # model name
    "modelProperties": {
        "name": "ViT-B-16",  # model architecture
        "dimensions": 512,  # embedding dimensions
        "url": "https://marqo-gcl-public.s3.us-west-2.amazonaws.com/marqo-fashionCLIP/marqo_fashionCLIP.pt",  # model weights
        "type": "open_clip",  # loading library
    },
}

api_key = "your_api_key"  # replace with your api key (https://www.marqo.ai/blog/finding-my-marqo-api-key)
mq = marqo.Client("https://api.marqo.ai", api_key=api_key)

mq.create_index("fashion-index", settings_dict=settings)

# triggers model download
mq.index("fashion-index").search("black dress")
```
See the full documentation for more details on adding documents and searching.
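As a rough sketch of what adding documents and searching might look like (the documents and field names below are hypothetical examples; the linked documentation is authoritative):

```python
# Continues from the Marqo block above: `mq` is an authenticated client.
# Documents and field names here are hypothetical examples.
mq.index("fashion-index").add_documents(
    [
        {"title": "Black evening dress", "image": "https://example.com/dress.jpg"},
        {"title": "White canvas sneakers", "image": "https://example.com/shoes.jpg"},
    ],
    tensor_fields=["image"],
)

results = mq.index("fashion-index").search("black dress")
for hit in results["hits"]:
    print(hit["title"], hit["_score"])
```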
Quick Start
Install PyTorch first, then run

```bash
pip install -r requirements.txt
```
To evaluate Marqo-FashionCLIP, run this command:

```bash
python eval.py \
    --dataset-config ./configs/${DATASET}.json \
    --model-name Marqo/marqo-fashionCLIP \
    --run-name Marqo-FashionCLIP
```
- DATASET can be one of ['deepfashion_inshop', 'deepfashion_multimodal', 'fashion200k', 'KAGL', 'atlas', 'polyvore', 'iMaterialist']
To evaluate Marqo-FashionSigLIP, run this command:

```bash
python eval.py \
    --dataset-config ./configs/${DATASET}.json \
    --model-name Marqo/marqo-fashionSigLIP \
    --run-name Marqo-FashionSigLIP
```
- DATASET can be one of ['deepfashion_inshop', 'deepfashion_multimodal', 'fashion200k', 'KAGL', 'atlas', 'polyvore', 'iMaterialist']
Scripts to evaluate other models, including FashionCLIP 2.0 and OpenFashionCLIP, can be found in the scripts directory.
Datasets
We collected 7 public multimodal fashion datasets and uploaded them to HuggingFace: Atlas, DeepFashion (In-shop), DeepFashion (Multimodal), Fashion200k, iMaterialist, KAGL, and Polyvore. Each dataset has different metadata available, so the tasks for each dataset are stored as JSON files in the scripts directory. Refer to our blog for more information about each dataset.
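To take a quick look at one of these datasets locally, the Hugging Face datasets library should work; a minimal sketch (the exact dataset id and split names are assumptions; check https://huggingface.co/Marqo for the actual ones):

```python
from datasets import load_dataset

# Dataset id is an assumption; see https://huggingface.co/Marqo for the real names
ds = load_dataset("Marqo/fashion200k")
print(ds)              # available splits and features
first_split = next(iter(ds.values()))
print(first_split[0])  # one record: image plus metadata fields
```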
Summarizing Results
To update LEADERBOARD.md and summarize the results of different models locally, run:

```bash
python summarize_results.py
```
Citation
```bibtex
@software{Jung_Marqo-FashionCLIP_and_Marqo-FashionSigLIP_2024,
  author = {Jung, Myong Chol and Clark, Jesse},
  month = aug,
  title = {{Marqo-FashionCLIP and Marqo-FashionSigLIP}},
  url = {https://github.com/marqo-ai/marqo-FashionCLIP},
  version = {1.0.0},
  year = {2024}
}
```
Owner
- Name: Marqo
- Login: marqo-ai
- Kind: organization
- Website: https://www.marqo.ai/
- Repositories: 5
- Profile: https://github.com/marqo-ai
Tensor search for humans.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Jung"
    given-names: "David"
    orcid:
  - family-names: "Clark"
    given-names: "Jesse"
    orcid:
title: "Marqo-FashionCLIP and Marqo-FashionSigLIP"
version: 1.0.0
doi:
date-released: 2024-08-14
url: "https://github.com/marqo-ai/marqo-FashionCLIP"
```
GitHub Events
Total
- Issues event: 5
- Watch event: 44
- Issue comment event: 6
- Fork event: 7
Last Year
- Issues event: 5
- Watch event: 44
- Issue comment event: 6
- Fork event: 7
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| David Jung | d****g@g****m | 9 |
| Jesse Clark | j****k | 5 |
| Ellie Sleightholm | 1****m | 4 |
| David Jung | d****d@m****i | 2 |
| Ikko Eltociear Ashimine | e****r@g****m | 1 |
| Ellie Sleightholm | e****m@E****e | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 4
- Total pull requests: 1
- Average time to close issues: 20 days
- Average time to close pull requests: about 5 hours
- Total issue authors: 4
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 1
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 4
- Pull requests: 0
- Average time to close issues: 20 days
- Average time to close pull requests: N/A
- Issue authors: 4
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- thanhtung2693 (1)
- yunbinmo (1)
- anilsathyan7 (1)
- joekendal (1)
Pull Request Authors
- eltociear (1)