SCALAR - A Platform for Real-time Machine Learning Competitions on Data Streams

Published in JOSS (2020)

https://github.com/nedrad88/scalar

Science Score: 93.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in JOSS metadata
  • Academic publication links
    Links to: ieee.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
    Published in Journal of Open Source Software

Scientific Fields

Computer Science (Computer Science) - 44% confidence
Earth and Environmental Sciences (Physical Sciences) - 40% confidence
Last synced: 6 months ago

Repository

SCALAR - Streaming ChALlenge plAtfoRm

Basic Info
  • Host: GitHub
  • Owner: nedRad88
  • License: apache-2.0
  • Language: JavaScript
  • Default Branch: master
  • Size: 25.9 MB
Statistics
  • Stars: 4
  • Watchers: 1
  • Forks: 2
  • Open Issues: 3
  • Releases: 1
Created over 5 years ago · Last pushed over 2 years ago
Metadata Files
Readme Contributing License

README.md

SCALAR - Streaming ChALlenge plAtfoRm

Logo

SCALAR is the first platform of its kind for organizing machine learning competitions on data streams. It was used to run a real-time ML competition at the IEEE Big Data Cup Challenges 2019.

Features

Data stream mining competitions - With SCALAR you can organize real-time competitions on data streams. It is inspired by Kaggle, a platform for offline machine learning competitions.

Simple, user-friendly interface - SCALAR has a web application that lets you easily browse and subscribe to competitions. Its simple, intuitive design also makes it easy to upload datasets and create a competition.

Dialog competition

Live results and leaderboard - Since the competition is real-time, the results are updated in real time as well. During the competition, you can follow the performance of your model in the web application. The leaderboard shows how you compare to other users, and a live chart shows the comparison with the baseline model.

Secure, bi-directional streaming communication - We use a combination of gRPC and Protobuf to provide secure, low-latency, bi-directional streaming communication between the server and users.
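The exchange pattern can be sketched in plain Python, with no actual gRPC plumbing: the server streams instances, and the client answers each with a prediction message. The field names and the trivial "model" below are illustrative assumptions, not SCALAR's real API.

```python
# Sketch of the bi-directional streaming exchange (illustrative only):
# the server emits data-stream instances, the client replies per instance.

def server_stream():
    """Simulate the server side: yield one message per stream instance."""
    for i in range(3):
        yield {"instance_id": i, "features": [i, i + 1]}

def client(stream):
    """Consume instances and produce a prediction message for each one."""
    predictions = []
    for msg in stream:
        # Placeholder "model": predict the sum of the features.
        pred = sum(msg["features"])
        predictions.append({"instance_id": msg["instance_id"], "prediction": pred})
    return predictions

print(client(server_stream()))
```

In the real platform this loop runs over a single long-lived gRPC stream, so each instance gets exactly one prediction message in return without per-message connection overhead.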

Architecture

Freedom to choose a programming language - SCALAR lets users choose their preferred environment. The only requirement is the ability to communicate through gRPC and Protobuf, which are supported in many programming languages: Python, Java, C++, Go, and more. Additionally, SCALAR provides support for R. Beyond that, users are free to choose their own setup, environment, and additional resources to train better models.


Getting Started

The project is written in Python and organized into Docker containers; each service runs in its own container.
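Based on the images listed in the repository's docker-compose.yml, the service layout looks roughly like this (a sketch, not the actual file; service names are illustrative, image versions are taken from the dependency list):

```yaml
# Sketch of the container layout implied by docker-compose.yml
services:
  mongo:
    image: mongo:3.6.18
  mysql:
    image: mysql:5.7.21
  spark:
    image: nedeljkoradulovic88/spark:latest
  kafka:
    image: wurstmeister/kafka:latest
  zookeeper:
    image: wurstmeister/zookeeper:latest
  # ...plus the Python web application built from provider/Dockerfile
```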

Prerequisites

To run the platform locally, Docker is needed:

Install Docker

Docker Compose should also be installed:

Install Docker Compose

Running

The platform is run with Docker Compose.

Quick setup (to test the functionality of the platform)

  • Run setup.py and follow the instructions to set up the environment. The script sets the time zone and creates the Docker network for all containers.

    • Once setup.py has finished successfully, the platform can be started with: docker-compose up

In-depth setup (if you want to use the platform to organize competitions and enable registration):

  • Download the code locally and then adjust the config.json and docker-compose.yml files. More details are in config-ReadMe.txt and docker-compose-ReadMe.txt.

  • Set up an email account that will be used to send registration confirmation messages and authentication tokens. You will need to configure the account to allow access by less secure apps. For a quick start, update only the email information in config.json.

  • In docker-compose.yml, update only the local paths used to mount persistent volumes, following docker-compose-ReadMe.txt.

  • Run setup.py and follow the instructions to set up the environment. The script sets the time zone and creates the Docker network for all containers: python3 setup.py

  • Once setup.py has finished successfully, the platform can be started with: docker-compose up

This command pulls all necessary containers and runs them. When all services are up, the web application is available at localhost:80.

To log in to the platform, use the default credentials: admin:admin

For test purposes, a test competition is created automatically.

It is scheduled to start 5 minutes after the platform starts.

Once you log in, you can see the competition under the Competitions/Coming tab.

To subscribe to the competition, click on it and then click the Subscribe button. Subscribe to competition

Setting up the client

Navigate to the example_data directory and run:

python3 client_setup.py

Once the necessary packages have been installed, go to the Python client directory and edit client.py. Copy the Competition code and Secret key from the competition page and add them to client.py as shown in the figure below:

Client setup
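The edit amounts to pasting in the two values copied from the competition page. A minimal sketch of what this might look like (the variable names and the metadata helper are illustrative assumptions; use whatever names client.py actually defines):

```python
# Hypothetical placeholders for the two values copied from the competition
# page; the real client.py may use different variable names.
COMPETITION_CODE = "your_competition_code"
SECRET_KEY = "your_secret_key"

def auth_metadata(code, key):
    """gRPC calls typically attach credentials as key/value metadata pairs."""
    return [("competition_code", code), ("secret_key", key)]

print(auth_metadata(COMPETITION_CODE, SECRET_KEY))
```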

Once the competition has started, run client.py; you should see how messages and predictions are exchanged. The live chart and leaderboard then appear on the competition page. (You will have to refresh the page to get new measures.)

To get to know the platform, use the Quick Start Guide. To create and participate in a competition, use the provided examples.

Authors

  • Nedeljko Radulovic
  • Dihia Boulegane
  • Albert Bifet
  • Nenad Stojanovic

Acknowledgments

The following open-source Docker containers were used:

  • MongoDB
  • Spark Docker container by GettyImages
  • MySQL
  • Kafka by Wurstmeister
  • Zookeeper by Wurstmeister

Owner

  • Name: Nedeljko Radulovic
  • Login: nedRad88
  • Kind: user

JOSS Publication

SCALAR - A Platform for Real-time Machine Learning Competitions on Data Streams
Published
December 05, 2020
Volume 5, Issue 56, Page 2676
Authors
Nedeljko Radulovic
LTCI, Télécom Paris, IP-Paris, Paris, France
Dihia Boulegane
LTCI, Télécom Paris, IP-Paris, Paris, France, Orange Labs, Grenoble, France
Albert Bifet
LTCI, Télécom Paris, IP-Paris, Paris, France, University of Waikato, New Zealand
Editor
Gabriela Alessio Robles
Tags
Stream Data Mining · Real-time Machine Learning · Information Flow Processing

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 250
  • Total Committers: 5
  • Avg Commits per committer: 50.0
  • Development Distribution Score (DDS): 0.216
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
nedRad88 n****8@g****m 196
gradoslovar g****r@o****m 34
Gabby a****a@g****m 11
dihiaboulegane d****e@t****r 8
Albert Bifet a****t 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 5
  • Total pull requests: 9
  • Average time to close issues: about 2 months
  • Average time to close pull requests: 24 days
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 3.6
  • Average comments per pull request: 0.33
  • Merged pull requests: 4
  • Bot issues: 0
  • Bot pull requests: 5
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • GregaVrbancic (4)
  • atanikan (1)
Pull Request Authors
  • dependabot[bot] (5)
  • galessiorob (3)
  • nedRad88 (1)
Top Labels
Issue Labels: (none)
Pull Request Labels: dependencies (5) · java (1) · python (1)

Dependencies

provider/my_application/static/bower_components/angular-local-storage/bower.json bower
  • angular-mocks ^1.x development
  • angular ^1.x
provider/my_application/static/bower_components/angular-random-string/bower.json bower
  • angular-mocks ~1.3.x development
  • angular 1.3.x
provider/my_application/static/bower_components/angular-socket-io/bower.json bower
  • angular-mocks ~1.2.6 development
  • angular ~1.2.6
example_data/user_code/Code/Java/competition/pom.xml maven
  • io.grpc:grpc-netty 1.11.0
  • io.grpc:grpc-protobuf 1.11.0
  • io.grpc:grpc-stub 1.11.0
provider/my_application/static/bower_components/angular-local-storage/package.json npm
  • grunt ~0.4.2 development
  • grunt-contrib-concat * development
  • grunt-contrib-jshint ~0.12.0 development
  • grunt-contrib-uglify * development
  • grunt-karma latest development
  • jasmine-core ^2.4.1 development
  • karma ~0.13.19 development
  • karma-coverage ^0.5.3 development
  • karma-jasmine ~0.3.7 development
  • karma-phantomjs-launcher ~1.0.0 development
  • load-grunt-tasks ~3.4.0 development
  • phantomjs-prebuilt ^2.1.4 development
  • time-grunt ~1.3.0 development
provider/my_application/static/bower_components/angular-socket-io/package.json npm
  • karma ~0.10.2 development
  • karma-firefox-launcher ~0.1.0 development
  • karma-jasmine ~0.1.3 development
  • karma-coverage ~0.1.4
provider/my_application/requirements.txt pypi
  • Flask *
  • Flask-Mail *
  • Flask-SQLAlchemy *
  • Flask-WTF *
  • apscheduler *
  • confluent-kafka *
  • eventlet *
  • flask-socketio *
  • flask_sse *
  • gevent *
  • greenlet *
  • grpcio ==1.13.0
  • grpcio-tools ==1.13.0
  • hashids *
  • kafka-python *
  • orjson *
  • pause *
  • pyjwt *
  • pymongo *
  • pymysql *
  • pyspark ==2.4.0
  • python-dateutil *
  • redis *
  • scikit-multiflow *
  • sqlalchemy *
  • werkzeug *
docker-compose.yml docker
  • mongo 3.6.18
  • mysql 5.7.21
  • nedeljkoradulovic88/spark latest
  • wurstmeister/kafka latest
  • wurstmeister/zookeeper latest
provider/Dockerfile docker
  • nedeljkoradulovic88/base_debian latest build
setup.py pypi