https://github.com/ai-forever/ru-clip

CLIP implementation for Russian language

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    1 of 12 committers (8.3%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.3%) to scientific vocabulary

Keywords

clip computer-vision nlp

Keywords from Contributors

transformer russian dalle image-generation russian-language text-to-image deep-face-swap deepfake face-swap faceswap
Last synced: 9 months ago

Repository

CLIP implementation for Russian language

Basic Info
  • Host: GitHub
  • Owner: ai-forever
  • License: apache-2.0
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 3.97 MB
Statistics
  • Stars: 145
  • Watchers: 3
  • Forks: 40
  • Open Issues: 8
  • Releases: 0
Topics
clip computer-vision nlp
Created almost 5 years ago · Last pushed over 2 years ago
Metadata Files
Readme License

README.md

RuCLIP

Zero-shot image classification model for Russian language


RuCLIP (Russian Contrastive Language–Image Pretraining) is a multimodal model for computing similarities between images and texts and for ranking captions and pictures. RuCLIP builds on a large body of work on zero-shot transfer, computer vision, natural language processing and multimodal learning. This repo contains prototype models of a Russian version of OpenAI's CLIP, following this paper.

Models

Installing

pip install ruclip==0.0.2

Usage

Open In Colab Standard RuCLIP API

Open In Colab RuCLIP + SberVqgan

Open In Colab ONNX example

Init models

```python
import ruclip

device = 'cuda'
clip, processor = ruclip.load('ruclip-vit-base-patch32-384', device=device)
```

Zero-Shot Classification [Minimal Example]

```python
import torch
import base64
import requests
import matplotlib.pyplot as plt
from PIL import Image
from io import BytesIO

# prepare images
bs4_urls = requests.get('https://raw.githubusercontent.com/ai-forever/ru-dolph/master/pics/pipelines/cats_vs_dogs_bs4.json').json()
images = [Image.open(BytesIO(base64.b64decode(bs4_url))) for bs4_url in bs4_urls]

# prepare classes ('cat', 'dog') and prompt templates
# ('{}', 'this is {}', 'in the picture {}', 'this is {}, a pet')
classes = ['кошка', 'собака']
templates = ['{}', 'это {}', 'на картинке {}', 'это {}, домашнее животное']

# predict
predictor = ruclip.Predictor(clip, processor, device, bs=8, templates=templates)
with torch.no_grad():
    text_latents = predictor.get_text_latents(classes)
    pred_labels = predictor.run(images, text_latents)

# show results
f, ax = plt.subplots(2, 4, figsize=(12, 6))
for i, (pil_img, pred_label) in enumerate(zip(images, pred_labels)):
    ax[i//4, i%4].imshow(pil_img)
    ax[i//4, i%4].set_title(classes[pred_label])
```
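The `templates` argument suggests that each class name is substituted into several prompt templates before being encoded. Below is a minimal sketch of that expansion step, assuming plain `str.format` substitution. `expand_prompts` is a hypothetical helper, not part of the `ruclip` API, and how `Predictor` combines the per-template latents internally is not covered here.

```python
def expand_prompts(classes, templates):
    """Return {class_name: [prompt, ...]} with one prompt per template."""
    return {name: [t.format(name) for t in templates] for name in classes}


classes = ['кошка', 'собака']  # 'cat', 'dog'
templates = ['{}', 'это {}', 'на картинке {}', 'это {}, домашнее животное']

prompts = expand_prompts(classes, templates)
for name, variants in prompts.items():
    print(name, '->', variants)
```

Each class ends up with as many prompts as there are templates, so two classes and four templates give eight text inputs in total.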

Cosine similarity Visualization Example

Softmax Scores Visualization Example
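A rough sketch of how cosine similarities between one image latent and several text latents can be turned into softmax scores. The vectors are made-up illustration values, not RuCLIP outputs. CLIP-style models typically multiply the similarities by a learned scale before the softmax; the `logit_scale` value of 100.0 used here is an assumption.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy latents (assumed values for illustration only)
image_latent = [0.9, 0.1, 0.3]
text_latents = {'кошка': [1.0, 0.0, 0.2], 'собака': [0.1, 1.0, 0.4]}
logit_scale = 100.0  # assumed scale, not read from a RuCLIP checkpoint

similarities = [cosine_similarity(image_latent, t) for t in text_latents.values()]
scores = softmax([logit_scale * s for s in similarities])

for label, sim, score in zip(text_latents, similarities, scores):
    print(f'{label}: cosine={sim:.3f}, softmax={score:.3f}')
```

The cosine values stay in [-1, 1] and are comparable across images, while the softmax scores sum to 1 across the candidate classes for a single image.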

Linear Probe and ZeroShot Correlation Results

Linear Probe Example

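```python
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from torchvision.datasets import CIFAR100

# `root` is the dataset directory; `predictor` is the ruclip.Predictor created above
train = CIFAR100(root, download=True, train=True)
test = CIFAR100(root, download=True, train=False)

# encode the images into latents without tracking gradients
with torch.no_grad():
    X_train = predictor.get_image_latents((pil_img for pil_img, _ in train)).cpu().numpy()
    X_test = predictor.get_image_latents((pil_img for pil_img, _ in test)).cpu().numpy()
    y_train, y_test = np.array(train.targets), np.array(test.targets)

# fit a logistic-regression probe on the frozen image latents
clf = LogisticRegression(solver='lbfgs', penalty='l2', max_iter=1000, verbose=1)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
accuracy = np.mean((y_test == y_pred).astype(float)) * 100.
print(f"Accuracy = {accuracy:.3f}")
```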

>>> Accuracy = 75.680

Performance

We evaluated zero-shot image classification performance on the following datasets:

| Dataset | ruCLIP Base [vit-base-patch32-224] | ruCLIP Base [vit-base-patch16-224] | ruCLIP Large [vit-large-patch14-224] | ruCLIP Base [vit-base-patch32-384] | ruCLIP Large [vit-large-patch14-336] | ruCLIP Base [vit-base-patch16-384] | CLIP [vit-base-patch16-224] original + OPUS-MT | CLIP [vit-base-patch16-224] original |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Food101, acc | 0.505 | 0.552 | 0.597 | 0.642 | 0.712💥 | 0.689 | 0.664 | 0.883 |
| CIFAR10, acc | 0.818 | 0.810 | 0.878 | 0.862 | 0.906💥 | 0.845 | 0.859 | 0.893 |
| CIFAR100, acc | 0.504 | 0.496 | 0.511 | 0.529 | 0.591 | 0.569 | 0.603💥 | 0.647 |
| Birdsnap, acc | 0.115 | 0.117 | 0.172 | 0.161 | 0.213💥 | 0.195 | 0.126 | 0.396 |
| SUN397, acc | 0.452 | 0.462 | 0.484 | 0.510 | 0.523💥 | 0.521 | 0.447 | 0.631 |
| Stanford Cars, acc | 0.433 | 0.487 | 0.559 | 0.572 | 0.659💥 | 0.626 | 0.567 | 0.638 |
| DTD, acc | 0.380 | 0.401 | 0.370 | 0.390 | 0.408 | 0.421💥 | 0.243 | 0.432 |
| MNIST, acc | 0.447 | 0.464 | 0.337 | 0.404 | 0.242 | 0.478 | 0.559💥 | 0.559 |
| STL10, acc | 0.932 | 0.932 | 0.934 | 0.946 | 0.956 | 0.964 | 0.967💥 | 0.970 |
| PCam, acc | 0.501 | 0.505 | 0.520 | 0.506 | 0.554 | 0.501 | 0.603💥 | 0.573 |
| CLEVR, acc | 0.148 | 0.128 | 0.152 | 0.188 | 0.142 | 0.132 | 0.240💥 | 0.240 |
| Rendered SST2, acc | 0.489 | 0.527 | 0.529 | 0.508 | 0.539💥 | 0.525 | 0.484 | 0.484 |
| ImageNet, acc | 0.375 | 0.401 | 0.426 | 0.451 | 0.488💥 | 0.482 | 0.392 | 0.638 |
| FGVC Aircraft, mean-per-class | 0.033 | 0.043 | 0.046 | 0.053 | 0.075 | 0.046 | 0.220💥 | 0.244 |
| Oxford Pets, mean-per-class | 0.560 | 0.595 | 0.604 | 0.587 | 0.546 | 0.635💥 | 0.507 | 0.874 |
| Caltech101, mean-per-class | 0.786 | 0.775 | 0.777 | 0.834 | 0.835💥 | 0.835💥 | 0.792 | 0.883 |
| Flowers102, mean-per-class | 0.401 | 0.388 | 0.455 | 0.449 | 0.517💥 | 0.452 | 0.357 | 0.697 |
| Hateful Memes, roc-auc | 0.564 | 0.516 | 0.530 | 0.537 | 0.519 | 0.543 | 0.579💥 | 0.589 |
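The zero-shot table reports three different metrics: accuracy, mean-per-class accuracy, and ROC AUC. As a small illustration of why the first two can differ on imbalanced datasets, the sketch below computes both on made-up labels. The helper names are hypothetical, and this is not the evaluation code used to produce the table.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def mean_per_class_accuracy(y_true, y_pred):
    """Average of per-class recall, so every class counts equally."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += int(t == p)
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


# Imbalanced toy labels: eight samples of class 0, two of class 1
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

print(accuracy(y_true, y_pred))                 # 0.9
print(mean_per_class_accuracy(y_true, y_pred))  # 0.75
```

Plain accuracy is dominated by the frequent class, whereas the per-class mean penalizes the single mistake on the rare class much more heavily.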

And for linear-probe evaluation:

| Dataset | ruCLIP Base [vit-base-patch32-224] | ruCLIP Base [vit-base-patch16-224] | ruCLIP Large [vit-large-patch14-224] | ruCLIP Base [vit-base-patch32-384] | ruCLIP Large [vit-large-patch14-336] | ruCLIP Base [vit-base-patch16-384] | CLIP [vit-base-patch16-224] original |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Food101 | 0.765 | 0.827 | 0.840 | 0.851 | 0.896💥 | 0.890 | 0.901 |
| CIFAR10 | 0.917 | 0.922 | 0.927 | 0.934 | 0.943💥 | 0.942 | 0.953 |
| CIFAR100 | 0.716 | 0.739 | 0.734 | 0.745 | 0.770 | 0.773💥 | 0.808 |
| Birdsnap | 0.347 | 0.503 | 0.567 | 0.434 | 0.609 | 0.612💥 | 0.664 |
| SUN397 | 0.683 | 0.721 | 0.731 | 0.721 | 0.759💥 | 0.758 | 0.777 |
| Stanford Cars | 0.697 | 0.776 | 0.797 | 0.766 | 0.831 | 0.840💥 | 0.866 |
| DTD | 0.690 | 0.734 | 0.711 | 0.703 | 0.731 | 0.749💥 | 0.770 |
| MNIST | 0.963 | 0.974💥 | 0.949 | 0.965 | 0.949 | 0.971 | 0.989 |
| STL10 | 0.957 | 0.962 | 0.973 | 0.968 | 0.981💥 | 0.974 | 0.982 |
| PCam | 0.827 | 0.823 | 0.791 | 0.835 | 0.807 | 0.846💥 | 0.830 |
| CLEVR | 0.356 | 0.360 | 0.358 | 0.308 | 0.318 | 0.378💥 | 0.604 |
| Rendered SST2 | 0.603 | 0.655 | 0.651 | 0.651 | 0.637 | 0.661💥 | 0.606 |
| FGVC Aircraft | 0.254 | 0.312 | 0.290 | 0.283 | 0.341 | 0.362💥 | 0.604 |
| Oxford Pets | 0.774 | 0.820 | 0.819 | 0.730 | 0.753 | 0.856💥 | 0.931 |
| Caltech101 | 0.904 | 0.917 | 0.914 | 0.922 | 0.937💥 | 0.932 | 0.956 |
| HatefulMemes | 0.545 | 0.568 | 0.563 | 0.581 | 0.585💥 | 0.578 | 0.645 |

We also compared inference speed on the CIFAR100 dataset, using an Nvidia V100 for evaluation:

| | ruclip-vit-base-patch32-224 | ruclip-vit-base-patch16-224 | ruclip-vit-large-patch14-224 | ruclip-vit-base-patch32-384 | ruclip-vit-large-patch14-336 | ruclip-vit-base-patch16-384 |
|----------|-----------------------------|-----------------------------|------------------------------|-----------------------------|------------------------------|-----------------------------|
| iter/sec | 308.84 💥 | 155.35 | 49.95 | 147.26 | 22.11 | 61.79 |
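One way an iterations-per-second figure like this can be measured is to time a fixed number of calls after a short warm-up. The sketch below is a generic timing helper with a dummy CPU workload standing in for a model forward pass. `measure_iter_per_sec` and `dummy_step` are hypothetical names, and the original benchmark setup (batch size, what counts as one iteration) is not specified in this README.

```python
import time


def measure_iter_per_sec(step, n_iters=100, n_warmup=10):
    """Call `step` repeatedly and return the measured iterations per second."""
    # Warm-up calls are excluded from the timing
    for _ in range(n_warmup):
        step()
    start = time.perf_counter()
    for _ in range(n_iters):
        step()
    elapsed = time.perf_counter() - start
    return n_iters / elapsed


def dummy_step():
    """Stand-in workload; a real benchmark would run the model on a batch."""
    sum(i * i for i in range(1000))


rate = measure_iter_per_sec(dummy_step)
print(f'{rate:.2f} iter/sec')
```

When timing GPU work, kernels run asynchronously, so a synchronization call such as `torch.cuda.synchronize()` is usually needed before reading the clock.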

Authors

Supported by

Social Media

Owner

  • Name: AI Forever
  • Login: ai-forever
  • Kind: organization
  • Location: Armenia

Creating ML for the future. AI projects you already know. We are a non-profit organization with members from all over the world.

GitHub Events

Total
  • Watch event: 9
  • Fork event: 5
Last Year
  • Watch event: 9
  • Fork event: 5

Committers

Last synced: 12 months ago

All Time
  • Total Commits: 41
  • Total Committers: 12
  • Avg Commits per committer: 3.417
  • Development Distribution Score (DDS): 0.488
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
shonenkov s****v@p****u 21
Anton Emelyanov l****t@m****u 5
Anton Emelyanov k****n@g****m 4
Tatiana Shavrina r****s@g****m 2
Denis d****v@g****m 2
boomb0om i****1@g****m 1
Ilya Chernikov 4****v 1
Gerasimov Maxim 5****7 1
Danyache d****v@n****u 1
Andrey Kuznetsov k****y@g****m 1
Anastasia Maltseva n****a@m****u 1
Alex Wortega a****h@g****m 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 12 months ago

All Time
  • Total issues: 18
  • Total pull requests: 22
  • Average time to close issues: about 4 hours
  • Average time to close pull requests: about 2 months
  • Total issue authors: 9
  • Total pull request authors: 12
  • Average comments per issue: 0.83
  • Average comments per pull request: 0.05
  • Merged pull requests: 20
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • Lednik7 (2)
  • christophschuhmann (1)
  • breadfan (1)
  • potassium-chloride (1)
  • nomomon (1)
  • Petilia (1)
  • ternaus (1)
  • bitcoin5000 (1)
  • rom1504 (1)
Pull Request Authors
  • danielgafni (2)
  • shonenkov (2)
  • Danyache (2)
  • denndimitrov (2)
  • TatianaShavrina (1)
  • neverix (1)
  • NastyaMittseva (1)
  • AlexWortega (1)
  • boomb0om (1)
  • kuznetsoffandrey (1)
  • ichrnkv (1)
  • Lednik7 (1)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

requirements-dev.txt pypi
  • pre-commit * development
  • pytest * development
  • pytest-cov * development
requirements.txt pypi
  • huggingface_hub ==0.2.1
  • more_itertools ==8.12.0
  • torch *
  • torchvision *
  • youtokentome *
setup.py pypi