Recent Releases of uform
uform - Release v3.1.1
- Python
Published by ashvardanian about 1 year ago
uform - v3.1: Apple Neural Engine Optimizations
Apple chips provide several functional units capable of high-throughput matrix multiplication and AI inference. Those computeUnits include the CPU, GPU, and the Apple Neural Engine (ANE). A user may naively hope that any typical architecture, like BERT or ViT, should work fine on all of those chips in any of the common quantization forms, like switching from f32 single-precision to bf16 and f16 half-precision floats or i8 and u8 integers. That is not the case. Of all the backends that UForm has been tested on, quantizing the entire model for CoreML was the most challenging task, and Apple became the only platform where we distribute the models in the original precision, which is a pity given a fleet of 2 billion potential target devices running iOS worldwide, almost all of which are in the countries and language groups natively supported by UForm multimodal multilingual embeddings.
When using @unum-cloud UForm models in Swift, we pass computeUnits: .all, letting Apple's scheduler pick the target device and treating the choice as a black-box optimization. A better approach is to explicitly provide models tuned for the Apple Neural Engine. So, together with our friends at @TheStageAI, we've quantized our models to map cleanly onto ANE-supported operations with minimal loss in precision, shrinking the models 2-4x and accelerating inference up to 5x:
| Model | Text Encoder (GPU) | Text Encoder (ANE) | Image Encoder (GPU) | Image Encoder (ANE) |
| :------------------ | ----------: | ----------: | -----------: | -----------: |
| english-small | 2.53 ms | 0.53 ms | 6.57 ms | 1.23 ms |
| english-base | 2.54 ms | 0.61 ms | 18.90 ms | 3.79 ms |
| english-large | 2.30 ms | 0.61 ms | 79.68 ms | 20.94 ms |
| multilingual-base | 2.34 ms | 0.50 ms | 18.98 ms | 3.77 ms |
Measured on an Apple M4 iPad running iOS 18.2, with batch size 1 and the model pre-loaded into memory. The original encoders use `f32` single-precision numbers for maximum compatibility and rely mostly on the GPU for computation. The quantized encoders use a mixture of `i8`, `f16`, and `f32` numbers for maximum performance and rely mostly on the Apple Neural Engine (ANE). Median latency is reported.
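For readers curious what the `i8` part of such a mixed-precision scheme involves, here is a minimal NumPy sketch of symmetric per-tensor int8 quantization; the scale selection and rounding below are simplifying assumptions for illustration, not the exact recipe used for the ANE builds:

```python
import numpy as np

def quantize_i8(w):
    """Symmetric per-tensor int8 quantization: w ~= scale * q."""
    scale = float(np.abs(w).max()) / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_i8(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# A random weight matrix stands in for a real encoder layer.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 256)).astype(np.float32)

q, scale = quantize_i8(w)
max_error = float(np.abs(dequantize_i8(q, scale) - w).max())
print(q.dtype, round(max_error, 6))  # reconstruction error stays below scale / 2
```

Storing `q` instead of `w` cuts that tensor to a quarter of its `f32` size, which is where the 2-4x overall model-size reduction comes from when most layers quantize well.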
To use them in Swift, check out the docs at unum-cloud.github.io/uform/swift/ or the SwiftSemanticSearch repository for an integrated example with USearch.
Thanks to @ArnoldMSU, @b1n0, @Aydarkhan, and @AndreyAgeev from TheStage.ai for their help!
- Python
Published by ashvardanian about 1 year ago
uform - UForm v3 for 3 Platforms
Multimodal Embeddings for JavaScript, Swift, and Python
How many AI models can run on-device out of the box? UForm multimodal embeddings can!
| Model | Parameters | Languages | Architecture |
| :-------------------------------------------------- | ---------: | --------: | -------------------------------------------: |
| uform3-image-text-english-large | 365M | 1 | 6 text layers, ViT-L/14, 6 multimodal layers |
| uform3-image-text-english-base | 143M | 1 | 2 text layers, ViT-B/16, 2 multimodal layers |
| uform3-image-text-english-small | 79M | 1 | 2 text layers, ViT-S/16, 2 multimodal layers |
| uform3-image-text-multilingual-base | 206M | 21 | 8 text layers, ViT-B/16, 4 multimodal layers |
JavaScript
Load the models and preprocessors for different modalities:
```js
import { getModel, Modality, TextProcessor, TextEncoder, ImageEncoder, ImageProcessor } from '@unum-cloud/uform';

const { configPath, modalityPaths, tokenizerPath } = await getModel({
    modelId: 'unum-cloud/uform3-image-text-english-small',
    modalities: [Modality.TextEncoder, Modality.ImageEncoder],
});
```
Embed images:
```js
const imageProcessor = new ImageProcessor(configPath);
await imageProcessor.init();
const processedImages = await imageProcessor.process("path/to/image.png");

const imageEncoder = new ImageEncoder(modalityPaths.image_encoder, imageProcessor);
await imageEncoder.init();
const imageOutput = await imageEncoder.encode(processedImages);
assert(imageOutput.embeddings.dims.length === 2, "Output should be 2D");
```
Embed queries:
```js
const textProcessor = new TextProcessor(configPath, tokenizerPath);
await textProcessor.init();
const processedTexts = await textProcessor.process("a small red panda in a zoo");

const textEncoder = new TextEncoder(modalityPaths.text_encoder, textProcessor);
await textEncoder.init();
const textOutput = await textEncoder.encode(processedTexts);
assert(textOutput.embeddings.dims.length === 2, "Output should be 2D");
await textEncoder.dispose();
```
Swift
Embed images:
```swift
let imageModel = try await ImageEncoder(modelName: "unum-cloud/uform3-image-text-english-small")
let imageURL = "https://github.com/ashvardanian/ashvardanian/blob/master/demos/bbq-on-beach.jpg?raw=true"
guard let url = URL(string: imageURL),
    let imageSource = CGImageSourceCreateWithURL(url as CFURL, nil),
    let cgImage = CGImageSourceCreateImageAtIndex(imageSource, 0, nil) else {
    throw Exception("Could not load image from URL: \(imageURL)")
}

let imageEmbedding: Embedding = try imageModel.encode(cgImage)
let imageVector: [Float32] = imageEmbedding.asFloats()
```
Embed queries:
```swift
let textModel = try await TextEncoder(modelName: "unum-cloud/uform3-image-text-english-small")
let text = "A group of friends enjoy a barbecue on a sandy beach, with one person grilling over a large black grill, while the other sits nearby, laughing and enjoying the camaraderie."
let textEmbedding: Embedding = try textModel.encode(text)
let textVector: [Float32] = textEmbedding.asFloats()
```
Python
Load model:
```py
from uform import get_model, Modality

model_name = 'unum-cloud/uform3-image-text-english-small'
modalities = [Modality.TEXT_ENCODER, Modality.IMAGE_ENCODER]
processors, models = get_model(model_name, modalities=modalities)
```
Embed images:
```py
import requests
from io import BytesIO
from PIL import Image

image_url = 'https://media-cdn.tripadvisor.com/media/photo-s/1b/28/6b/53/lovely-armenia.jpg'
image = Image.open(BytesIO(requests.get(image_url).content))

processor_image = processors[Modality.IMAGE_ENCODER]
model_image = models[Modality.IMAGE_ENCODER]
image_data = processor_image(image)
image_features, image_embedding = model_image.encode(image_data, return_features=True)
```
Embed queries:
```py
text = 'a cityscape bathed in the warm glow of the sun, with varied architecture and a towering, snow-capped mountain rising majestically in the background'

model_text = models[Modality.TEXT_ENCODER]
processor_text = processors[Modality.TEXT_ENCODER]

text_data = processor_text(text)
text_features, text_embedding = model_text.encode(text_data, return_features=True)
```
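The text and image encoders project into a shared space, so a query can be scored against an image with cosine similarity. A minimal NumPy sketch; random vectors stand in here for the `image_embedding` and `text_embedding` arrays produced by the snippets above:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for the (1, d) embeddings returned by the encoders.
rng = np.random.default_rng(42)
image_embedding = rng.standard_normal((1, 256)).astype(np.float32)
text_embedding = rng.standard_normal((1, 256)).astype(np.float32)

score = cosine_similarity(image_embedding, text_embedding)
print(score)  # in [-1, 1]; higher means the caption matches the image better
```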
Thanks to @xenova and @sroussey for help with JavaScript! Thanks to @vmanot and @pcuenca for their work on Swift!
- Python
Published by ashvardanian almost 2 years ago
uform - Multimodal Matryoshka, Multimodal DPO, and ONNX

Today we are releasing a new batch of multimodal models, trained with Nebius and already available on Hugging Face:
- Matryoshka-style multimodal embeddings ranging from 64 to 256 and 768 dimensions
- Improved multimodal chat in 1.2B parameters, tuned with Direct Preference Optimization
- ONNX backend, making the PyTorch dependency optional for lightning-fast deployments
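Matryoshka-style embeddings can be truncated to a prefix of the vector and re-normalized, trading a little accuracy for much smaller indexes. A minimal NumPy sketch of that truncation, using a random unit vector in place of a real 768-dimensional UForm embedding:

```python
import numpy as np

def truncate_matryoshka(embedding, dim):
    """Keep the first `dim` dimensions and re-normalize to unit length."""
    head = embedding[..., :dim]
    return head / np.linalg.norm(head, axis=-1, keepdims=True)

# A random unit vector stands in for a real 768-dimensional embedding.
rng = np.random.default_rng(0)
full = rng.standard_normal((1, 768)).astype(np.float32)
full /= np.linalg.norm(full, axis=-1, keepdims=True)

for dim in (64, 256, 768):
    small = truncate_matryoshka(full, dim)
    print(dim, small.shape)  # each truncated vector is unit-length again
```

Because Matryoshka training front-loads the most informative dimensions, the 64- and 256-dimensional prefixes remain usable for retrieval while cutting storage by up to 12x.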
- Python
Published by ashvardanian almost 2 years ago
uform - v1.1.1: Polishing the Repo
Great thanks to @lmmx, @blackforestboi, and @kapulkin for their patches to the project!
- Performance observations for M2 CPUs (#56) (8374ef6), closes #56
- Passing labels to `text_decoder` to compute loss (#65) (f445a8b), closes #65
- Larger batch benchmarks (fdc8587)
- pre-commit config and linters (#62) (0a3efac), closes #62
- Python
Published by ashvardanian about 2 years ago
uform - UForm v1: Multimodal Chat in 1.5 Billion Parameters
The UForm family of tiny multimodal transformer models just got bigger! In addition to the existing CLIP-like embedding models, we now have a generative model useful for image captioning, visual question answering, and multimodal chats. All that in just 1.5 billion parameters, small enough to fit even on mobile devices.
Repository: https://github.com/unum-cloud/uform
Generative model: https://huggingface.co/unum-cloud/uform-gen
Chat model: https://huggingface.co/unum-cloud/uform-gen-chat
Evaluation Metrics

Being the smallest model of its kind, unum-cloud/uform-gen is hard to compare to others. Next in size are the 5x larger LLaVAs and InstructBLIP, with 7 billion parameters. LLaVA performs noticeably better on VQAv2: 78.5 vs 66.5. On captioning, CLIPScore and RefCLIPScore are relatively close across all models.
| Model | Size | Caption Length | CLIPScore | RefCLIPScore |
| :---------------------------------- | ---: | -------------: | --------: | -----------: |
| llava-hf/llava-1.5-7b-hf | 7B | Long | 0.878 | 0.529 |
| llava-hf/llava-1.5-7b-hf | 7B | Short | 0.886 | 0.531 |
| Salesforce/instructblip-vicuna-7b | 7B | Long | 0.902 | 0.534 |
| Salesforce/instructblip-vicuna-7b | 7B | Short | 0.848 | 0.523 |
| unum-cloud/uform-gen | 1.5B | Long | 0.847 | 0.523 |
| unum-cloud/uform-gen | 1.5B | Short | 0.842 | 0.522 |
| unum-cloud/uform-gen-chat | 1.5B | Long | 0.860 | 0.525 |
| unum-cloud/uform-gen-chat | 1.5B | Short | 0.858 | 0.525 |
Throughput
On an RTX 3090, using vanilla PyTorch for inference with bfloat16 arithmetic and greedy decoding, one can expect the following throughput:
| Model | Size | Speed | Speedup |
| :---------------------------------- | ---: | ------------------: | --------: |
| llava-hf/llava-1.5-7b-hf | 7B | ~ 40 tokens/second | 1x |
| Salesforce/instructblip-vicuna-7b | 7B | ~ 40 tokens/second | 1x |
| unum-cloud/uform-gen | 1.5B | ~ 140 tokens/second | 3.5x |
- Python
Published by ashvardanian about 2 years ago
uform - v0.0.6 Release
Mar 15, 2023
More convenient interface for batch encoding.
- Python
Published by kimihailv almost 3 years ago
uform - v0.0.5 Release
Feb 28, 2023
- Add the support of custom models from huggingface
- Minor bugfixes
- Python
Published by kimihailv almost 3 years ago
uform - v0.0.4 Release
Feb 24, 2023
- Changed model loading schema, each model has separate repository on Hugging Face.
- Small refactoring.
- Python
Published by kimihailv about 3 years ago
uform - v0.0.3 Release
Feb 24, 2023
- Fixed README on PyPi
- Python
Published by kimihailv about 3 years ago
uform - v0.0.2 Release
Feb 24, 2023
- First stable release.
- Updated model storing schema.
- Python
Published by kimihailv about 3 years ago