https://github.com/asreview/asreview-multilingual-feature-extractor

A model extension for ASReview. ASReview multilingual feature extractor is a feature extractor based on distiluse-base-multilingual-cased-v1.

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.5%) to scientific vocabulary

Keywords

asreview feature-extraction multilingual sentence-transformers
Last synced: 5 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: asreview
  • License: MIT
  • Language: Python
  • Default Branch: main
  • Homepage: https://www.asreview.ai
  • Size: 14.6 KB
Statistics
  • Stars: 1
  • Watchers: 4
  • Forks: 1
  • Open Issues: 0
  • Releases: 2
Created over 4 years ago · Last pushed about 1 year ago
Metadata Files
  • Readme
  • License

README.md

ASReview multilingual feature extractor

This ASReview extension adds a multilingual feature extractor, which makes it possible to work with records written in any of the following languages:

Arabic, Chinese, Dutch, English, French, German, Italian, Korean, Polish, Portuguese, Russian, Spanish, Turkish.

The extension uses sentence-transformers/distiluse-base-multilingual-cased-v1, a multilingual sentence-transformers model that maps sentences to a 512-dimensional dense vector space. For more information about the feature extraction method, see:

Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. ArXiv, abs/1908.10084. https://arxiv.org/abs/1908.10084
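
As a rough illustration of the embedding step described above, the sketch below encodes records with the same sentence-transformers model. It is not the extension's own code: the helper names (load_encoder, record_text, embed_records) are hypothetical, and joining title and abstract into one string is an assumption about how a record becomes text. It requires the sentence-transformers package, and the model is downloaded on first use.

```python
from typing import Callable, Dict, List, Sequence

MODEL_NAME = "sentence-transformers/distiluse-base-multilingual-cased-v1"


def load_encoder() -> Callable[[List[str]], Sequence]:
    """Return the model's encode function (hypothetical helper).

    The import is done lazily so the other helpers work without
    sentence-transformers installed.
    """
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(MODEL_NAME)
    return model.encode


def record_text(title: str, abstract: str) -> str:
    """Join title and abstract into one string (an assumption, see lead-in)."""
    parts = (title, abstract)
    return " ".join(p.strip() for p in parts if p and p.strip())


def embed_records(records: List[Dict[str, str]],
                  encode: Callable[[List[str]], Sequence]) -> Sequence:
    """Turn each record into text and pass all texts to the encoder at once."""
    texts = [record_text(r.get("title", ""), r.get("abstract", "")) for r in records]
    return encode(texts)
```

Calling embed_records(records, load_encoder()) should return one 512-dimensional vector per record, in line with the model description above.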

Installation

Install the multilingual feature extractor from a local clone of the repository with:

    pip install .

or directly from GitHub with:

    pip install git+https://github.com/asreview/asreview-multilingual-feature-extractor.git

Usage

ASReview LAB

ASReview LAB users can select the model in the Model Selection step of the project setup. Select "Multilingual Sentence Transformer" under "Feature extraction".

Simulation

The Multilingual Sentence Transformer feature extractor is defined in asreviewcontrib/models/distiluse-base-multilingual.py. To use it in a simulation, pass multilingual to the -e option:

    asreview simulate example_data_file.csv -e multilingual

Test the feature extractor with:

    asreview simulate benchmark:van_de_Schoot_2017 -e multilingual -m svm
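
When several simulations need to be scripted, the commands above can also be assembled and launched from Python. The sketch below is one possible way to do that. The helper names build_simulate_command and run_simulation are hypothetical, only the -e and -m options shown above are used, and the asreview executable is assumed to be on the PATH.

```python
import shutil
import subprocess
from typing import List, Optional


def build_simulate_command(dataset: str,
                           feature_extractor: str = "multilingual",
                           classifier: Optional[str] = None) -> List[str]:
    """Build the argument list for `asreview simulate` (hypothetical helper)."""
    command = ["asreview", "simulate", dataset, "-e", feature_extractor]
    if classifier is not None:
        # -m selects the classifier, as in the svm example above.
        command += ["-m", classifier]
    return command


def run_simulation(dataset: str, classifier: Optional[str] = None) -> int:
    """Run the simulation and return the process exit code."""
    if shutil.which("asreview") is None:
        raise RuntimeError("asreview executable not found; install ASReview first")
    command = build_simulate_command(dataset, classifier=classifier)
    return subprocess.run(command, check=False).returncode
```

For example, run_simulation("benchmark:van_de_Schoot_2017", classifier="svm") launches the same command as the test line above.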

License

MIT license

Contact

For any questions or remarks, please send an email to asreview@uu.nl or open an issue.

Owner

  • Name: ASReview
  • Login: asreview
  • Kind: organization
  • Email: asreview@uu.nl
  • Location: Utrecht University

ASReview - Active learning for Systematic Reviews

GitHub Events

Total
  • Issues event: 2
  • Watch event: 1
  • Issue comment event: 1
  • Push event: 1
  • Pull request event: 1
  • Fork event: 1
Last Year
  • Issues event: 2
  • Watch event: 1
  • Issue comment event: 1
  • Push event: 1
  • Pull request event: 1
  • Fork event: 1