eosc-recommender-metrics
A framework for evaluating Recommender Systems (EOSC Recommender System)
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (10.0%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: ARGOeu
- License: apache-2.0
- Language: JavaScript
- Default Branch: master
- Homepage: https://argoeu.github.io/eosc-recommender-metrics/
- Size: 16.3 MB
Statistics
- Stars: 0
- Watchers: 4
- Forks: 3
- Open Issues: 3
- Releases: 8
Metadata Files
README.md
Recommender Metrics Framework
A framework for generating statistics, metrics, KPIs, and graphs for Recommender Systems
Preprocessor
RS metrics
Dependencies
- Install Conda from here. Tested on conda v4.10.3.
- Run from terminal:
    conda env create -f environment.yml
- Run from terminal:
    conda activate rsmetrics
- Run from terminal:
    chmod +x ./preprocessor.py ./preprocessor_common.py ./rsmetrics.py
Usage
Usage of the Batch System
- Configure ./preprocessor_common.py, ./preprocessor.py and ./rsmetrics.py by editing the config.yaml or by providing another configuration file with -c.
- Run from terminal ./preprocessor_common.py in order to gather users and resources and store them in the Datastore:

    ./preprocessor_common.py # this will ingest users and resources [from scratch] by retrieving the data from the 'marketplace_rs' provider (which is specified in the config file)
    ./preprocessor_common.py -p marketplace_rs # equivalent to the first one
    ./preprocessor_common.py -p marketplace_rs --use-cache # equivalent to the first one, but uses the cache file to read resources instead of downloading them via the EOSC Marketplace
    ./preprocessor_common.py -p athena # currently not working, since the users collection only exists in 'marketplace_rs'

- Run from terminal ./preprocessor.py -p <provider> in order to gather user_actions and recommendations from the particular provider and store them in the Datastore:

    ./preprocessor.py # this will ingest user_actions and recommendations [from scratch] by retrieving the data from the 'marketplace_rs' provider (which is specified in the config file)
    ./preprocessor.py -p marketplace_rs # equivalent to the first one
    ./preprocessor.py -p athena # same procedure as the first one, but for the 'athena' provider

- Run from terminal ./rsmetrics.py -p <provider> in order to gather the respective data (users, resources, user_actions and recommendations), calculate statistics and metrics, and store them in the Datastore for that particular provider:

    ./rsmetrics.py # this will calculate and store statistics and metrics for the data (users, resources, user_actions and recommendations) of the specified provider (which by default is 'marketplace_rs')
    ./rsmetrics.py -p marketplace_rs # equivalent to the first one
    ./rsmetrics.py -p athena # same procedure as the first one, but for the 'athena' provider

A typical rsmetrics.py command for a monthly report would be:

    ./rsmetrics.py -p provider -s $(date +"%Y-%m-01") -e $(date +"%Y-%m-%d") -t "$(date +"%B %Y")"
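The `date` substitutions in the monthly report command can also be computed in Python, for instance when the run is launched from a scheduler script. The following is a minimal sketch that assumes only the `-p`, `-s`, `-e` and `-t` options shown above; `monthly_report_args` is a hypothetical helper and is not part of the repository.

```python
from datetime import date


def monthly_report_args(provider, today=None):
    """Build the rsmetrics.py argument list for a month-to-date report.

    Hypothetical helper: it mirrors the `date` substitutions of the shell
    command (first day of the month, today, and a "Month Year" title).
    """
    today = today or date.today()
    start = today.replace(day=1)  # equivalent of date +"%Y-%m-01"
    return [
        "./rsmetrics.py",
        "-p", provider,
        "-s", start.strftime("%Y-%m-%d"),
        "-e", today.strftime("%Y-%m-%d"),
        "-t", today.strftime("%B %Y"),  # month name follows the active locale
    ]


if __name__ == "__main__":
    # Print the command instead of running it; the list could be handed
    # to subprocess.run(...) to actually launch the report.
    print(" ".join(monthly_report_args("marketplace_rs")))
```

Returning a list rather than a single string keeps the title (which contains a space) as one argument without any shell quoting.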
Usage of the Streaming System
- Run from terminal ./rs-stream.py in order to listen to the stream for new data, process them, and store them in the Datastore for the given provider:

    ./rs-stream.py -a username:password -q host:port -t user_actions -d "mongodb://localhost:27017/datastore" -p provider_name
Reporting
The reporting script generates an evaluation report in HTML format. The report is served automatically from a spawned local server (default: localhost:8080), and the default browser is opened automatically to present it.
To execute the script issue:
chmod u+x ./report.py
./report.py
The script will automatically look for evaluation result files in the default folder ./data and will write the report to the default folder ./report.
Additional script usage with parameters
The report.py script accepts an --input parameter: the path to the folder where the results of the evaluation process have been generated (default folder: ./data). It also accepts an --output parameter: the path to the output folder from which the generated report will be served automatically.
Note: the script copies all the necessary files to the output folder, such as pre_metrics.json and metrics.json, as well as report.html.prototype renamed to index.html.
```
usage: report.py [-h] [-i STRING] [-o STRING] [-a STRING] [-p STRING]

Generate report

optional arguments:
  -h, --help            show this help message and exit
  -i STRING, --input STRING
                        Input folder
  -o STRING, --output STRING
                        Output report folder
  -a STRING, --address STRING
                        Address to bind and serve the report
  -p STRING, --port STRING
                        Port to bind and serve the report
```
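For readers who want to see how such a command line could be declared, here is a minimal `argparse` sketch that accepts the same options as the usage text above. It is a hypothetical reconstruction, not the actual source of report.py: the function name `build_parser` is made up, and the defaults are taken from the values mentioned in this README (./data, ./report, localhost and 8080).

```python
import argparse


def build_parser():
    """Hypothetical parser accepting the options listed in the usage text."""
    parser = argparse.ArgumentParser(prog="report.py", description="Generate report")
    parser.add_argument("-i", "--input", metavar="STRING", default="./data",
                        help="Input folder")
    parser.add_argument("-o", "--output", metavar="STRING", default="./report",
                        help="Output report folder")
    parser.add_argument("-a", "--address", metavar="STRING", default="localhost",
                        help="Address to bind and serve the report")
    # The usage text shows the port as STRING, so it is kept as a string here.
    parser.add_argument("-p", "--port", metavar="STRING", default="8080",
                        help="Port to bind and serve the report")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"reading {args.input}, writing {args.output}, "
          f"serving on {args.address}:{args.port}")
```

Note that the exact heading of the generated help (for example "optional arguments:") depends on the Python version in use.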
Utilities
Get item catalog script (./get_catalog.py)
This script contacts the remote EOSC Marketplace service API and generates a CSV file listing all available items of a specific catalog (e.g. services, datasets, trainings, publications, data_sources), with their name, id and URL.
To execute the script issue:
chmod u+x ./get_catalog.py
./get_catalog.py -u https://remote.example.foo -c service -b 100 -l 2000 -o my-catalog.csv
Arguments:
- -u or --url: the endpoint URL of the marketplace search service
- -o or --output: the output CSV file (e.g. ./service_catalog.csv or ./training_catalog.csv) - optional
- -b or --batch: the search service paginates its results, so this sets the batch size for each retrieval (number of items per request) - optional
- -l or --limit: the maximum number of items to retrieve (handy for large catalogs when only a subset is needed) - optional
- -c or --category: the category of items you want to retrieve
- -d or --datastore: MongoDB destination database URI to store the results into (e.g. mongodb://localhost:27017/rsmetrics) - optional
- -p or --providers: a comma-separated list stating which providers (engines) handle the items of the specific category
Currently supported category types for the marketplace:
- service
- training
- dataset (this is for items of the DATA catalog)
- data_source (this is for items of the DATASOURCES catalog)
- publication
- guideline (this is for items of the INTEROPERABILITY GUIDELINES catalog)
- software
- bundle
- other
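The interplay of the --batch and --limit arguments described above can be illustrated with a small pagination loop. This is a sketch under stated assumptions, not the code of get_catalog.py: `fetch_all` is a hypothetical name, and `fetch_page(offset, size)` stands in for one request to the search service, whose real paging parameters are not documented here.

```python
def fetch_all(fetch_page, batch=100, limit=None):
    """Collect items page by page until the source is exhausted or `limit` is hit.

    `fetch_page(offset, size)` is a hypothetical callable representing one
    request; it must return a list of at most `size` items.
    """
    items = []
    offset = 0
    while limit is None or len(items) < limit:
        # Never ask for more than what is still allowed by the limit.
        size = batch if limit is None else min(batch, limit - len(items))
        page = fetch_page(offset, size)
        items.extend(page)
        offset += len(page)
        if len(page) < size:  # short page: nothing left to fetch
            break
    return items


if __name__ == "__main__":
    # Fake in-memory catalog used instead of a real remote service.
    catalog = [{"id": i, "name": f"item-{i}"} for i in range(250)]

    def fake_page(offset, size):
        return catalog[offset:offset + size]

    print(len(fetch_all(fake_page, batch=100, limit=220)))  # prints 220
```

Shrinking the last request to the remaining allowance avoids downloading items that would be discarded once the limit is reached.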
Serve Evaluation Reports as a Service
The webservice folder hosts a simple web service implemented with the Flask framework, which can be used to host the report results.
Note: Please make sure you work in a virtual environment and have already installed the required dependencies by issuing:
pip install -r requirements.txt
The webservice application serves two endpoints:
- / : This is the frontend webpage that displays the Report Results in a UI
- /api : This api call returns the evaluation metrics in json format
To run the webservice issue:
cd ./webservice
flask run
The webservice runs on localhost:5000 by default. You can override this by issuing, for example:
flask run -h 127.0.0.1 -p 8080
The environment variable RS_EVAL_METRIC_SOURCE points the webservice to the metrics.json file produced by the evaluation process.
By default it follows this repository's folder structure and points to the /data/metrics.json path under the repository root.
You can override this by editing the .env file inside the /webservice folder, or by setting the RS_EVAL_METRIC_SOURCE variable before executing the flask run command.
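The lookup described above, an environment variable with a fallback to the repository layout, can be sketched with the standard library alone. This is an illustration and not the webservice's actual code: `load_metrics` and `DEFAULT_SOURCE` are hypothetical names, and the relative default path is an assumption about how the repository root is reached from the ./webservice folder.

```python
import json
import os
from pathlib import Path

# Assumed default: metrics.json under the repository's data folder,
# seen from inside ./webservice. The real default may be expressed differently.
DEFAULT_SOURCE = "../data/metrics.json"


def load_metrics(environ=os.environ):
    """Load the evaluation metrics from the file named by RS_EVAL_METRIC_SOURCE.

    Falls back to DEFAULT_SOURCE when the variable is not set.
    """
    source = environ.get("RS_EVAL_METRIC_SOURCE", DEFAULT_SOURCE)
    with Path(source).open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    # Example: RS_EVAL_METRIC_SOURCE=/path/to/metrics.json python this_file.py
    print(json.dumps(load_metrics(), indent=2))
```

Passing the environment as a parameter keeps the function easy to exercise without modifying the process environment.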
Tested with python 3.9
Monitor for entries in the MongoDB collections
A typical example that counts the documents found in user_actions, recommendations, and resources over the last day would be:

    ./monitor.py -d "mongodb://localhost:27017/rsmetrics" -s "$(date -u -d '1 day ago' '+%Y-%m-%d')" -e "$(date -u '+%Y-%m-%d')"
To send the result by e-mail over SMTP for the above example:
```bash
./monitor.py -d "mongodb://localhost:27017/rsmetrics" -s "$(date -u -d '1 day ago' '+%Y-%m-%d')" -e "$(date -u '+%Y-%m-%d')" --email "smtp://server:port" sender@domain recipient1@domain recipient2@domain
```
Export Capacity information for entries in the MongoDB collections
A typical example that counts the documents found in user_actions, recommendations, and resources over the last year would be:

    ./monitor.py -d "mongodb://localhost:27017/rsmetrics" -s "$(date -u -d '1 year ago' '+%Y-%m-%d')" -e "$(date -u '+%Y-%m-%d')" --capacity

which returns results in CSV format with the columns year,month,user_actions,recommendations.
Additionally, capacity can be plotted:

    ./monitor.py -d "mongodb://localhost:27017/rsmetrics" -s "$(date -u -d '1 year ago' '+%Y-%m-%d')" -e "$(date -u '+%Y-%m-%d')" --capacity --plot
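To make the year,month,user_actions,recommendations CSV layout concrete, the sketch below groups two lists of timestamps by month and writes them in that column order. How monitor.py aggregates its data internally is not described here, so this is an in-memory illustration only; `capacity_csv` is a hypothetical name and the timestamps are made-up sample values.

```python
import csv
import io
from collections import Counter
from datetime import datetime


def capacity_csv(user_actions, recommendations):
    """Return CSV text with per-month counts for two lists of datetimes."""
    ua = Counter((t.year, t.month) for t in user_actions)
    rec = Counter((t.year, t.month) for t in recommendations)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["year", "month", "user_actions", "recommendations"])
    # Months present in either list are reported; a missing count is 0.
    for year, month in sorted(set(ua) | set(rec)):
        writer.writerow([year, month, ua[(year, month)], rec[(year, month)]])
    return out.getvalue()


if __name__ == "__main__":
    actions = [datetime(2023, 1, 5), datetime(2023, 1, 20)]
    recs = [datetime(2023, 1, 7), datetime(2023, 2, 1)]
    print(capacity_csv(actions, recs), end="")
```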
Deployment docs
Installation and configuration documents can be found here.
Owner
- Name: ARGOeu
- Login: ARGOeu
- Kind: organization
- Repositories: 75
- Profile: https://github.com/ARGOeu
A team working with the latest technologies about accounting, monitoring, messaging and eSeal Capabilities
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: EOSC Recommender System Metrics
message: A framework for evaluating EOSC Recommender System. Use of additional diagnostic metrics and visualizations offering deeper and sometimes surprising insights about the models performance.
type: software
authors:
  - given-names: Kostas
    family-names: Kagkelidis
    email: kaggis@admin.grnet.gr
    affiliation: GRNET S.A.
  - given-names: Nikolaos
    family-names: Triantafyllis
    email: ntriantafyl@admin.grnet.gr
    affiliation: GRNET S.A.
  - given-names: Themis
    family-names: Zamani
    email: themis@admin.grnet.gr
    affiliation: GRNET S.A.
GitHub Events
Total
- Delete event: 8
- Issue comment event: 7
- Push event: 4
- Pull request event: 14
- Create event: 3
Last Year
- Delete event: 8
- Issue comment event: 7
- Push event: 4
- Pull request event: 14
- Create event: 3
Dependencies
- 1041 dependencies
- @docusaurus/module-type-aliases 2.2.0 development
- @docusaurus/core 2.2.0
- @docusaurus/preset-classic 2.2.0
- @easyops-cn/docusaurus-search-local ^0.33.5
- @mdx-js/react ^1.6.22
- clsx ^1.2.1
- hast-util-is-element 1.1.0
- katex ^0.16.3
- prism-react-renderer ^1.3.5
- react ^17.0.2
- react-dom ^17.0.2
- rehype-katex 5
- remark-math 3
- Flask ==2.1.2
- Jinja2 ==3.1.2
- MarkupSafe ==2.1.1
- PyYAML ==6.0
- Werkzeug ==2.1.2
- beautifulsoup4 ==4.10.0
- certifi ==2022.12.7
- charset-normalizer ==2.0.12
- click ==8.1.3
- flask-pymongo ==2.3.0
- idna ==3.3
- importlib-metadata ==4.11.4
- itsdangerous ==2.1.2
- joblib ==1.2.0
- natsort ==8.1.0
- numpy ==1.22.3
- pandas ==1.4.2
- pymongo ==4.1.0
- pymongoarrow ==0.6.2
- python-dateutil ==2.8.2
- python-dotenv ==0.20.0
- pytz ==2022.1
- requests ==2.27.1
- scikit-surprise ==1.1.1
- scipy ==1.8.0
- six ==1.16.0
- soupsieve ==2.3.2
- surprise ==0.1
- urllib3 ==1.26.9
- zipp ==3.8.0
- 1199 dependencies
- @testing-library/jest-dom ^5.16.5
- @testing-library/react ^13.4.0
- @testing-library/user-event ^13.5.0
- react ^18.2.0
- react-dom ^18.2.0
- react-router-dom ^6.6.2
- react-scripts 5.0.1
- web-vitals ^2.1.4
- _libgcc_mutex 0.1.*
- _openmp_mutex 4.5.*
- ca-certificates 2022.3.18.*
- ld_impl_linux-64 2.35.1.*
- libffi 3.3.*
- libgcc-ng 9.3.0.*
- libgomp 9.3.0.*
- libstdcxx-ng 9.3.0.*
- ncurses 6.3.*
- openssl 1.1.1n.*
- pip 21.2.4.*
- python 3.9.11.*
- readline 8.1.2.*
- setuptools 58.0.4.*
- sqlite 3.38.0.*
- tk 8.6.11.*
- tzdata 2021e.*
- wheel 0.37.1.*
- xz 5.2.5.*
- zlib 1.2.11.*