aniemore

Emotion recognition from audio and text files (Russian language only)

Keywords

artificial-intelligence deep-learning emotion-recognition machine-learning package python russian-language speech-recognition speech-to-text text-classification voice-classfication

Last synced: 6 months ago

Repository

Basic Info

Host: GitHub
Owner: aniemore
License: mit
Language: Python
Default Branch: master
Homepage:
Size: 2.11 MB

Statistics

Stars: 71
Watchers: 7
Forks: 8
Open Issues: 4
Releases: 0

Created almost 4 years ago · Last pushed 8 months ago

Metadata Files

Readme License Citation

README.md

Aniemore is an open-source artificial intelligence library for streaming analysis of the emotional tone of human speech.

Key technical parameters

The Russian Emotional Speech Dialogues dataset contains more than 3,000 audio fragments from 200 different speakers;
The models can recognize emotions in noisy audio files 3 seconds long;
Model processing and response time is no more than 5 seconds;
Word error rate (WER) of the model: 30%;
Overall model accuracy: 75%;
Recognized emotions: anger, disgust, fear, happiness, interest, sadness, neutral;
Acoustic capabilities: 3 levels.
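The README does not say how the WER figure was measured. As a minimal sketch, assuming the standard definition (word-level edit distance between the reference transcript and the recognized text, divided by the number of reference words), the metric can be computed as below. `word_error_rate` is a hypothetical helper written for illustration and is not part of aniemore.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under this definition, a WER of 30% corresponds to roughly three word-level errors (substitutions, deletions, or insertions) per ten reference words.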

Description

Aniemore is a Python library that lets you add to your software the ability to detect the emotional tone of human speech, both in voice and in text. For this purpose the library provides two corresponding modules: Voice and Text.

Aniemore includes its own dataset, RESD (Russian Emotional Speech Dialogues), and other datasets of various sizes that you can use to train your own models.

| Dataset        | Notes                                                                                        |
|----------------|----------------------------------------------------------------------------------------------|
| RESD           | 7 emotions, 4 hours of dialogue audio recordings, studio quality                             |
| RESD_Annotated | RESD + speech-to-text annotations                                                            |
| REPV           | 2,000 voice messages (.ogg), 200 actors, 2 neutral phrases, 5 emotions                       |
| REPV-S         | 140 voice messages (.ogg) of "Привет, как дела?" ("Hi, how are you?") with different emotions |

You can use ready-made pretrained models from the library:

| Model                                           | Accuracy |
|-------------------------------------------------|----------|
| Voice models                                    |          |
| wav2vec2-xlsr-53-russian-emotion-recognition    | 73%      |
| wav2vec2-emotion-russian-resd                   | 75%      |
| wavlm-emotion-russian-resd                      | 82%      |
| hubert-emotion-russian-resd                     | 75%      |
| unispeech-sat-emotion-russian-resd              | 72%      |
| wavlm-bert-base (multimodal)                    | 81%      |
| wavlm-bert-fusion (multimodal, improved)        | 83%      |
| Text models                                     |          |
| rubert-base-emotion-russian-cedr-m7             | 74%      |
| rubert-tiny2-russian-emotion-detection          | 85%      |
| rubert-large-emotion-russian-cedr-m7            | 76%      |
| rubert-tiny-emotion-russian-cedr-m7             | 72%      |
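When choosing a model programmatically, the reported accuracies can be kept as a small lookup. The sketch below copies the numbers from the table above; the `REPORTED_ACCURACY` dictionary, the `best_model` helper, and the decision to list the two multimodal models as their own group are illustrative assumptions, not part of aniemore.

```python
from typing import Tuple

# Accuracy (%) as reported in the table above, grouped by input type.
REPORTED_ACCURACY = {
    "voice": {
        "wav2vec2-xlsr-53-russian-emotion-recognition": 73,
        "wav2vec2-emotion-russian-resd": 75,
        "wavlm-emotion-russian-resd": 82,
        "hubert-emotion-russian-resd": 75,
        "unispeech-sat-emotion-russian-resd": 72,
    },
    "multimodal": {
        "wavlm-bert-base": 81,
        "wavlm-bert-fusion": 83,
    },
    "text": {
        "rubert-base-emotion-russian-cedr-m7": 74,
        "rubert-tiny2-russian-emotion-detection": 85,
        "rubert-large-emotion-russian-cedr-m7": 76,
        "rubert-tiny-emotion-russian-cedr-m7": 72,
    },
}


def best_model(kind: str) -> Tuple[str, int]:
    """Return the (name, accuracy) pair with the highest reported accuracy."""
    models = REPORTED_ACCURACY[kind]
    name = max(models, key=models.get)
    return name, models[name]
```

Accuracy is only one criterion; the hardware table further down shows that the multimodal models also need more memory.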

Model metrics by emotion

[Figure: показатели моделей.jpg]

Installation

```shell
pip install aniemore
```

Minimum hardware requirements

| Architecture       | CPU     | RAM   | SSD   |
|--------------------|---------|-------|-------|
| Wav2Vec2           | 2 cores | 8 GB  | 40 GB |
| WavLM              | 2 cores | 8 GB  | 40 GB |
| Hubert             | 2 cores | 8 GB  | 40 GB |
| UniSpeechSAT       | 2 cores | 8 GB  | 40 GB |
| BertTiny/BertTiny2 | 2 cores | 4 GB  | 40 GB |
| Bert_Base          | 2 cores | 4 GB  | 40 GB |
| Bert_Large         | 2 cores | 8 GB  | 40 GB |
| WavLM Bert Base    | 2 cores | 16 GB | 40 GB |
| WavLM Bert Fusion  | 2 cores | 16 GB | 40 GB |
| Whisper Tiny       | 2 cores | 4 GB  | 40 GB |
| Whisper Base       | 2 cores | 4 GB  | 40 GB |
| Whisper Small      | 2 cores | 4 GB  | 40 GB |
| Whisper Medium     | 2 cores | 8 GB  | 40 GB |
| Whisper Large      | 2 cores | 16 GB | 40 GB |
| TextEnhancer       | 2 cores | 4 GB  | 40 GB |
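The requirements table can be turned into a simple pre-flight check before loading a model. This is a minimal sketch: the `MIN_REQUIREMENTS` dictionary and the `meets_minimum` helper are hypothetical names, the keys follow the table rows (with Wav2Vec2 and WavLM spelled as in the model names), and the available RAM is passed in by the caller because the standard library has no portable way to query it.

```python
import os

# (CPU cores, RAM in GB) per architecture, copied from the table above.
# Every row also asks for 40 GB of SSD space.
MIN_REQUIREMENTS = {
    "Wav2Vec2": (2, 8),
    "WavLM": (2, 8),
    "Hubert": (2, 8),
    "UniSpeechSAT": (2, 8),
    "BertTiny/BertTiny2": (2, 4),
    "Bert_Base": (2, 4),
    "Bert_Large": (2, 8),
    "WavLM Bert Base": (2, 16),
    "WavLM Bert Fusion": (2, 16),
    "Whisper Tiny": (2, 4),
    "Whisper Base": (2, 4),
    "Whisper Small": (2, 4),
    "Whisper Medium": (2, 8),
    "Whisper Large": (2, 16),
    "TextEnhancer": (2, 4),
}


def meets_minimum(architecture: str, cpu_cores: int, ram_gb: float) -> bool:
    """Check a machine against the minimum CPU and RAM for one architecture."""
    min_cores, min_ram = MIN_REQUIREMENTS[architecture]
    return cpu_cores >= min_cores and ram_gb >= min_ram


if __name__ == "__main__":
    cores = os.cpu_count() or 1  # cpu_count() may return None
    print(meets_minimum("WavLM", cores, ram_gb=8))
```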

Usage examples

Below are simple examples of using the library. For more detailed examples, including loading your own model, see the dedicated Google Colab notebook.

Emotion recognition in text

```python
import torch
from aniemore.recognizers.text import TextRecognizer
from aniemore.models import HuggingFaceModel

# Pick the pretrained text model and use the GPU when one is available
model = HuggingFaceModel.Text.Bert_Tiny2
device = 'cuda' if torch.cuda.is_available() else 'cpu'
tr = TextRecognizer(model=model, device=device)

# Input text: "does this work? :("
tr.recognize('это работает? :(', return_single_label=True)
```

Emotion recognition in voice

```python
import torch
from aniemore.recognizers.voice import VoiceRecognizer
from aniemore.models import HuggingFaceModel

model = HuggingFaceModel.Voice.WavLM
device = 'cuda' if torch.cuda.is_available() else 'cpu'
vr = VoiceRecognizer(model=model, device=device)

# Replace the path with your own audio file
vr.recognize('/content/your-audio-file.wav', return_single_label=True)
```
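To label many recordings, the single-file call can be wrapped in a loop over a folder. The sketch below is an illustration rather than a library feature: `recognize_folder` is a hypothetical helper that accepts any callable taking a file path, so it makes no assumption about what the recognizer returns.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def recognize_folder(
    recognize: Callable[[str], Any],
    folder: str,
    pattern: str = "*.wav",
) -> Dict[str, Any]:
    """Apply `recognize` to every file in `folder` matching `pattern`.

    Results are keyed by file name; files are processed in sorted order.
    """
    results: Dict[str, Any] = {}
    for path in sorted(Path(folder).glob(pattern)):
        results[path.name] = recognize(str(path))
    return results


# With the recognizer from the example above (requires aniemore):
# labels = recognize_folder(
#     lambda p: vr.recognize(p, return_single_label=True), '/content'
# )
```

Passing the recognizer in as a callable keeps the model loaded once for the whole folder and lets the same helper serve the text and multimodal recognizers.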

Emotion recognition (multimodal method)

```python
import torch
from aniemore.recognizers.multimodal import VoiceTextRecognizer
from aniemore.utils.speech2text import SmallSpeech2Text
from aniemore.models import HuggingFaceModel

model = HuggingFaceModel.MultiModal.WavLMBertFusion
s2t_model = SmallSpeech2Text()

# Transcribe the audio first; call recognize on the instance, not the class
text = s2t_model.recognize('/content/your-audio-file.wav').text
device = 'cuda' if torch.cuda.is_available() else 'cpu'

vtr = VoiceTextRecognizer(model=model, device=device)
# The recognizer takes an (audio path, transcript) pair
vtr.recognize(('/content/your-audio-file.wav', text), return_single_label=True)
```

Emotion recognition (multimodal method with automatic speech recognition)

```python
import torch
from aniemore.recognizers.multimodal import MultiModalRecognizer
from aniemore.utils.speech2text import SmallSpeech2Text
from aniemore.models import HuggingFaceModel

model = HuggingFaceModel.MultiModal.WavLMBertFusion
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# The speech-to-text model is passed in, so no separate transcription step is needed
mr = MultiModalRecognizer(model=model, s2t_model=SmallSpeech2Text(), device=device)
mr.recognize('/content/your-audio-file.wav', return_single_label=True)
```
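The two multimodal examples differ in who runs speech-to-text: in the first the caller transcribes the audio and passes an (audio path, transcript) pair, in the second the recognizer is given the speech-to-text model. The data flow of the manual variant can be sketched with stand-in callables. `recognize_from_audio` is a hypothetical helper for illustration only and is not a description of how `MultiModalRecognizer` is implemented.

```python
from typing import Any, Callable, Tuple


def recognize_from_audio(
    audio_path: str,
    transcribe: Callable[[str], str],
    recognize_pair: Callable[[Tuple[str, str]], Any],
) -> Any:
    """Transcribe the audio, then classify the (audio path, transcript) pair."""
    text = transcribe(audio_path)
    return recognize_pair((audio_path, text))


# With the objects from the manual multimodal example (requires aniemore):
# recognize_from_audio(
#     '/content/your-audio-file.wav',
#     lambda p: s2t_model.recognize(p).text,
#     lambda pair: vtr.recognize(pair, return_single_label=True),
# )
```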

Additional links

All models and datasets, along with examples of their use, are available on our HuggingFace profile.

Affiliation

Aniemore (Artem Nikita Ilya EMOtion REcognition)

The open library was developed by a team of authors at ООО "Социальный код" (Social Code LLC). The work was funded by a grant from the Foundation for Assistance to Small Innovative Enterprises (Agreement No. 1ГУКодИИС12-D7/72697 dated 22.12.2021).

Citation

To cite this project, use the Cite this repository item in the About menu on the right of the project page, or copy the information below:

```bibtex
@software{Lubenets_Aniemore,
  author = {Lubenets, Ilya and Davidchuk, Nikita and Amentes, Artem},
  license = {MIT},
  title = {{Aniemore}},
  url = {https://github.com/aniemore/Aniemore}
}
```

Owner

Name: Aniemore
Login: aniemore
Kind: organization
Location: United Kingdom

Repositories: 1
Profile: https://github.com/aniemore

Citation (CITATION.cff)

# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!

cff-version: 1.2.0
title: Aniemore
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Ilya
    family-names: Lubenets
    email: lubenets.ilya.igorevich@gmail.com
    orcid: 'https://orcid.org/0009-0005-9507-6669'
  - given-names: Nikita
    family-names: Davidchuk
    email: davidchuk.no@gmail.com
    orcid: 'https://orcid.org/0009-0002-0016-1333'
  - given-names: Artem
    family-names: Amentes
    email: artem.amentes@gmail.com
    orcid: 'https://orcid.org/0000-0002-0468-3078'
identifiers:
  - type: url
    value: 'https://github.com/aniemore/Aniemore'
    description: GitHub
  - type: url
    value: 'https://huggingface.co/Aniemore'
    description: HuggingFace
repository-code: 'https://github.com/aniemore/Aniemore'
repository: 'https://huggingface.co/Aniemore'
abstract: >-
  Aniemore is an open source artificial intelligence library
  for streaming emotion analytics of human speech.
keywords:
  - emotion recognition
  - AI
  - machine learning
  - russian language
  - text classification
  - audio classification
  - SER
license: MIT

GitHub Events

Total

Issues event: 1
Watch event: 22
Push event: 1
Fork event: 1

Last Year

Issues event: 1
Watch event: 22
Push event: 1
Fork event: 1

Committers

Last synced: almost 3 years ago

All Time

Total Commits: 102
Total Committers: 7
Avg Commits per committer: 14.571
Development Distribution Score (DDS): 0.52

Top Committers

Name	Email	Commits
Ilya Lubenets	x**l@g**m	49
=	=	18
Nikita Davidchuk	b**8@g**m	15
Ilya	x**L@g**m	9
Ilya Lubenets	x**x@g**m	4
Artem Amentes	9**s@u**m	4
Ilya Lubenets	l**1@g**m	3

Issues and Pull Requests

Last synced: 8 months ago

All Time

Total issues: 5
Total pull requests: 10
Average time to close issues: 4 months
Average time to close pull requests: about 23 hours
Total issue authors: 5
Total pull request authors: 2
Average comments per issue: 1.0
Average comments per pull request: 0.1
Merged pull requests: 9
Bot issues: 0
Bot pull requests: 1

Past Year

Issues: 2
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 2
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 1

Top Authors

Issue Authors

KirillRnd (1)
kurkumaIntangible (1)
Mikhael-Danilov (1)
Ar4ikov (1)
dependabot[bot] (1)
7Soldier (1)

Pull Request Authors

Ar4ikov (7)
dependabot[bot] (1)

Top Labels

Issue Labels

enhancement (1) dependencies (1) invalid (1)

Pull Request Labels

dependencies (1)

Packages

Total packages: 1
Total downloads:
- pypi 171 last-month

Total dependent packages: 0
Total dependent repositories: 0
Total versions: 11
Total maintainers: 3

pypi.org: aniemore

Aniemore (Artem Nikita Ilya EMOtion REcognition) is a library for emotion recognition in voice and text for the Russian language.

Documentation: https://aniemore.readthedocs.io/
License: MIT
Latest release: 1.2.3
published about 2 years ago

Versions: 11
Dependent Packages: 0
Dependent Repositories: 0
Downloads: 171 Last month

Rankings

Dependent packages count: 6.6%

Downloads: 13.4%

Average: 18.1%

Forks count: 19.6%

Stargazers count: 20.5%

Dependent repos count: 30.6%

Maintainers (3)

Ar4ikov toiletsandpaper artemamentes

Last synced: 6 months ago

Science Score: 44.0%