proj_higher-order-ranking

Efficient ranking algorithm and analysis of multi-body interactions

https://github.com/jackyeung99/proj_higher-order-ranking

Science Score: 67.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
    Links to: aps.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (18.9%) to scientific vocabulary
Last synced: 6 months ago

Repository

Efficient ranking algorithm and analysis of multi-body interactions

Basic Info
  • Host: GitHub
  • Owner: jackyeung99
  • License: mit
  • Language: Jupyter Notebook
  • Default Branch: main
  • Homepage:
  • Size: 1.58 GB
Statistics
  • Stars: 2
  • Watchers: 2
  • Forks: 0
  • Open Issues: 0
  • Releases: 1
Created almost 2 years ago · Last pushed 6 months ago
Metadata Files
Readme Contributing License Citation

README.md

Efficient inference of rankings from multi-body comparisons

This project provides the source code for the paper "Efficient inference of rankings from multi-body comparisons". The repository also contains the original scientific analyses developed by the authors (see below) for the paper.

If you use this codebase, please cite our work according to the CITATION.cff.

Contents

Getting Started

The code base for this project is written in Python; dependencies are listed in requirements.txt and installed with pip inside a virtual environment, as described below.

These instructions will give you a copy of the project up and running on your local machine for development, testing, and analysis purposes.

Prerequisites

A compatible Python installation is needed to begin; environment setup is described below.

  • Python 3.10+
  • GNU Make 4.2+
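A quick way to check both prerequisites from a terminal (purely illustrative):

```shell
# Verify the prerequisite tools are on PATH and new enough
python3 --version           # should report Python 3.10 or later
make --version | head -n 1  # should report GNU Make 4.2 or later
```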

A complete list of utilized packages is available in the requirements.txt file. Note, however, that there is a package dependency hierarchy: some packages in requirements.txt are not strictly necessary for using the package infrastructure, and the core requirements are listed as dependencies in the build instructions. Further instructions for creating a controlled environment from this manifest are available below, in the Installing section.

Installing

To (locally) reproduce this project, do the following:

  1. Download this code base. Notice that raw data are typically not included in the git-history and may need to be downloaded independently - see Reproducing Experiments for more information.
  2. (Optional) Open a terminal with Python installed and create a new virtual environment:

```bash
python -m venv .venv
source .venv/bin/activate
```

  3. Install the package:

```bash
pip install .
```

This will install all necessary packages for you to be able to run the scripts and everything should work out of the box.

Quick Start

This guide provides simple instructions for running the simulation on a chosen dataset.

  1. Compile source code

The core of the ranking calculations is written in an efficient C implementation, which must be compiled before the Python scripts running the experiments will work. You can use the provided makefiles in the C_Prog/ subdirectories to compile it out of the box on UNIX-based machines; on Windows you will need to edit some of the compiler flags within the makefiles. The compilation can be accomplished by running the following from the root directory:

```bash
cd C_Prog/Readfile
make
cd ../Convergence_Readfile
make
cd ../..
```

  2. Locate the Dataset

Find the ID of the dataset you want to use by checking the datasets/dataset_info.csv file. Each dataset is assigned a unique ID, formatted as a 5-digit number with leading zeroes if necessary (e.g., 00001). Ensure that the selected dataset has a file for both its edges and its nodes within datasets/Real_Data/.
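The zero-padding convention can be reproduced with ordinary Python string formatting; the helper below is illustrative and not part of the repository:

```python
# Illustrative helper: format a numeric dataset ID as the 5-digit,
# zero-padded string used in the dataset filenames (e.g., 00001_edges.txt).
def format_dataset_id(n: int) -> str:
    return f"{n:05d}"

print(format_dataset_id(1))   # -> 00001
print(format_dataset_id(42))  # -> 00042
```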

  3. Run the Model

If the dataset has true scores, set is_synthetic = 1; otherwise, set is_synthetic = 0.

Run the model on the selected dataset using the following command:

```bash
python3 src/test.py --dataset_number=00001 --is_synthetic=0
```
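For batch runs over several datasets, the same command line can be assembled programmatically. The helper below is hypothetical (not part of the repository); it only builds the argument list from the flags shown above:

```python
import subprocess  # needed only if you actually launch the run

# Hypothetical helper: assemble the argv for src/test.py, zero-padding
# the dataset ID as the repository expects.
def build_command(dataset_number: int, is_synthetic: bool) -> list:
    return [
        "python3", "src/test.py",
        f"--dataset_number={dataset_number:05d}",
        f"--is_synthetic={int(is_synthetic)}",
    ]

cmd = build_command(1, False)
print(" ".join(cmd))  # python3 src/test.py --dataset_number=00001 --is_synthetic=0
# From the repository root: subprocess.run(cmd, check=True)
```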

Usage

Reproducing experiments

Synthetic

To generate synthetic results (note that this will create a large number of files):

```bash
python3 datasets/utils/gen_synthetic_data.py
```

Accuracy

To run the experiments on the accuracy of all four models on the synthetic data:

```bash
python3 exp/ex01/ex01.py
```

This will write each result into the folder exp/ex01/data. To preprocess and visualize these results, run all cells within notebook/ex01_synthetic_accuracy.ipynb.
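A short snippet like the following can list whatever result CSVs an experiment has produced (the path follows the Package Structure section; this is illustrative and simply skips a missing directory):

```python
from pathlib import Path

# Illustrative: list any CSV results under exp/ex01
results_root = Path("exp/ex01")
if results_root.is_dir():
    for csv_path in sorted(results_root.rglob("*.csv")):
        print(csv_path)
else:
    print("no exp/ex01 directory here; run the experiment first")
```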

Convergence

To run the experiments on the convergence of our model and Zermelo's:

```bash
python3 exp/ex02/ex02.py
```

This will save a table into the folder exp/ex02/results. To visualize these results, run all cells within notebook/ex02_synthetic_convergence.ipynb.

Real Results

To run all datasets included in the paper:

Accuracy

To run the experiments on the accuracy of all four models on the real data:

```bash
python3 exp/ex03/ex03.py
```

This will write each result into the folder exp/ex03/data. To preprocess and visualize these results, run all cells within notebook/ex03_real_accuracy.ipynb.

Convergence

To run the experiments on the convergence of our model and Zermelo's:

```bash
python3 exp/ex04/ex04.py
```

This will save a table into the folder exp/ex04/results. To visualize these results, run all cells within notebook/ex04_real_convergence.ipynb.

Package Structure

```text
├── C_Prog                          * Efficient C implementation
│   ├── Convergence_Readfile        * Measure convergence results
│   │   ├── bt_functions.c
│   │   ├── bt_functions.h
│   │   ├── bt_model_data.c
│   │   ├── bt_model_data.out
│   │   ├── makefile
│   │   ├── mt19937-64.c
│   │   ├── mt64.h
│   │   ├── my_sort.c
│   │   └── my_sort.h
│   └── Readfile                    * Measure accuracy
│       ├── bt_functions.c
│       ├── bt_functions.h
│       ├── bt_model_data.c
│       ├── bt_model_data.out
│       ├── makefile
│       ├── mt19937-64.c
│       ├── mt64.h
│       ├── my_sort.c
│       └── my_sort.h
├── LICENSE
├── README.md
├── datasets
│   ├── Real_Data                   * Edges and nodes of datasets used in paper
│   │   ├── 00001_edges.txt
│   │   ├── 00001_nodes.txt
│   │   ├── 00002_edges.txt
│   │   ├── 00002_nodes.txt
│   │   ├── 00003_edges.txt
│   │   ├── 00003_nodes.txt
│   │   ├── 00004_edges.txt
│   │   ├── 00004_nodes.txt
│   │   ├── 00005_edges.txt
│   │   ├── 00005_nodes.txt
│   │   ├── 00006_edges.txt
│   │   ├── 00006_nodes.txt
│   │   ├── 00007_edges.txt
│   │   ├── 00007_nodes.txt
│   │   ├── 00008_edges.txt
│   │   ├── 00008_nodes.txt
│   │   ├── 00009_edges.txt
│   │   └── 00009_nodes.txt
│   ├── dataset_info.csv            * Edge size, number of players, number of games, and mappings of dataset names to IDs
│   └── utils                       * Preprocessing
│       ├── convert_raw_files.py
│       ├── dataset_info.py
│       ├── extract_ordered_games.py
│       ├── gen_synthetic_data.py
│       └── rename_datasets.py
├── doc
│   ├── experiment_descriptions.txt
│   └── sketch_experiment.txt
├── exp
│   ├── ex01
│   │   ├── ex01.py
│   │   └── results
│   │       ├── leadership_log_likelihood_summary.csv
│   │       ├── log_likelihood_summary.csv
│   │       ├── rho_summary.csv
│   │       └── tau_summary.csv
│   ├── ex02
│   │   ├── ex02.py
│   │   └── results
│   │       └── Convergence_Table.csv
│   ├── ex03
│   │   ├── ex03.py
│   │   └── results
│   │       ├── leadership_log_likelihood_summary.csv
│   │       └── log_likelihood_summary.csv
│   └── ex04
│       ├── ex04.py
│       └── results
│           └── Convergence_Table.csv
├── notebook
│   ├── comparison_models.ipynb
│   ├── convergence_behavior.ipynb
│   ├── ex01_synthetic_accuracy.ipynb
│   ├── ex02_synthetic_convergence.ipynb
│   ├── ex03_real_accuracy.ipynb
│   ├── ex04_real_convergence.ipynb
│   ├── figure_settings
│   │   ├── __init__.py
│   │   ├── ieee.mplstyle
│   │   ├── science.mplstyle
│   │   └── settings.py
│   └── training_size.ipynb
├── requirements.txt
├── src
│   ├── __init__.py
│   ├── archive
│   │   ├── weighted_bt.py
│   │   └── weighted_graph_helpers.py
│   ├── models                      * All models, including comparisons to Zermelo and other graph ranking algorithms
│   │   ├── BradleyTerry.py         * Python representation of our model
│   │   ├── SpringRank.py
│   │   ├── __init__.py
│   │   ├── page_rank.py
│   │   ├── point_wise.py
│   │   └── zermello.py
│   ├── utils
│   │   ├── __init__.py
│   │   ├── c_operation_helpers.py  * Run C code
│   │   ├── convergence_test_helpers.py
│   │   ├── file_handlers.py
│   │   ├── graph_tools.py          * Building hypergraphs
│   │   ├── metrics.py
│   │   └── operation_helpers.py    * Run Python implementations
│   └── test.py                     * Example run
└── tst
    ├── test_graph_tools.py
    ├── test_metrics.py
    ├── test_models.py
    ├── test_operation_helpers.py
    └── test_synthetic.py
```
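The ex01 results include rho_summary.csv and tau_summary.csv. Assuming these refer to the usual Spearman and Kendall rank correlations (my assumption; the README does not say), scipy — pinned in requirements.txt — computes both directly:

```python
from scipy.stats import kendalltau, spearmanr

# Toy example: a true ranking versus an estimate with one adjacent swap.
# (Assumes rho/tau are Spearman/Kendall correlations; not confirmed by the README.)
true_rank = [1, 2, 3, 4, 5]
est_rank = [1, 3, 2, 4, 5]

tau, _ = kendalltau(true_rank, est_rank)
rho, _ = spearmanr(true_rank, est_rank)
print(f"tau={tau:.2f} rho={rho:.2f}")  # tau=0.80 rho=0.90
```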

Documentation

This repository does not maintain extensive independent documentation for its source code. We do, however, include documentation and notes on scientific experiments we've conducted throughout the project. If you are interested in seeing these notes, please email Filippo Radicchi with your inquiry.

Tests

All unit tests are written with pytest.

Tests can be run directly with the commands:

```bash
pip install pytest
pytest tst/
```

Other Information

Contributing

Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.

Versioning

We use Semantic Versioning for versioning. For the versions available, see the tags on this repository.

Authors

All correspondence should be directed to Filippo Radicchi.

  • Jack Yeung
  • Daniel Kaiser
  • Filippo Radicchi

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • Billie Thompson - Provided README and CONTRIBUTING template - PurpleBooth
  • George Datseris - Published workshop on scientific code; inspired organization for reproducibility - GoodScientificCodeWorkshop

Owner

  • Login: jackyeung99
  • Kind: user

Citation (CITATION.cff)

# YAML 1.2
cff-version: "1.2.0"
authors:
- email: 
  family-names: 
  given-names: 
  orcid: "https://orcid.org/"
- family-names: Kaiser
  given-names: Daniel
  orcid: "https://orcid.org/"
- family-names: Iacopini
  given-names: Iacopo
  orcid: "https://orcid.org/0000-0001-8794-6410"
- family-names: Radicchi
  given-names: Filippo
  orcid: "https://orcid.org/"
contact:
- email: filrad@iu.edu
  family-names: Radicchi
  given-names: Filippo
  orcid: "https://orcid.org/"
doi: 
message: If you use this software, please cite our article in the
  Journal of Machine Learning Research.
preferred-citation:
  authors:
    - email: 
      family-names: 
      given-names: 
      orcid: "https://orcid.org/"
    - family-names: Kaiser
      given-names: Daniel
      orcid: "https://orcid.org/"
    - family-names: Iacopini
      given-names: Iacopo
      orcid: "https://orcid.org/0000-0001-8794-6410"
    - family-names: Radicchi
      given-names: Filippo
      orcid: "https://orcid.org/"
  date-published: 2025-XX-YY
  doi: XX
  issn: XX
  issue: XX
  journal: Journal of Machine Learning Research
  start: 
  title: "Efficient inference of rankings from multi-body comparisons"
  type: article
  url: "URL"
  volume: 
title: "Efficient inference of rankings from multi-body comparisons"

GitHub Events

Total
  • Watch event: 1
  • Push event: 2
Last Year
  • Watch event: 1
  • Push event: 2

Dependencies

requirements.txt pypi
  • Babel ==2.15.0
  • Jinja2 ==3.1.4
  • Markdown ==3.6
  • MarkupSafe ==2.1.5
  • PyYAML ==6.0.1
  • Pygments ==2.18.0
  • QtPy ==2.4.1
  • Send2Trash ==1.8.3
  • SpringRank ==0.0.7
  • Werkzeug ==3.0.3
  • absl-py ==2.1.0
  • anyio ==4.3.0
  • argon2-cffi ==23.1.0
  • argon2-cffi-bindings ==21.2.0
  • arrow ==1.3.0
  • asttokens ==2.4.1
  • astunparse ==1.6.3
  • async-lru ==2.0.4
  • attrs ==23.2.0
  • beautifulsoup4 ==4.12.3
  • bleach ==6.1.0
  • cachetools ==5.3.3
  • certifi ==2024.2.2
  • cffi ==1.16.0
  • charset-normalizer ==3.3.2
  • comm ==0.2.2
  • contourpy ==1.2.1
  • cycler ==0.12.1
  • debugpy ==1.8.1
  • decorator ==5.1.1
  • defusedxml ==0.7.1
  • exceptiongroup ==1.2.1
  • executing ==2.0.1
  • fastjsonschema ==2.19.1
  • flatbuffers ==24.3.25
  • fonttools ==4.51.0
  • fqdn ==1.5.1
  • gast ==0.5.4
  • google-auth ==2.30.0
  • google-auth-oauthlib ==1.2.0
  • google-pasta ==0.2.0
  • grpcio ==1.64.1
  • h11 ==0.14.0
  • h5py ==3.11.0
  • httpcore ==1.0.5
  • httpx ==0.27.0
  • idna ==3.7
  • iniconfig ==2.0.0
  • ipykernel ==6.29.4
  • ipython ==8.24.0
  • ipywidgets ==8.1.2
  • isoduration ==20.11.0
  • jedi ==0.19.1
  • joblib ==1.4.2
  • json5 ==0.9.25
  • jsonpointer ==2.4
  • jsonschema ==4.22.0
  • jsonschema-specifications ==2023.12.1
  • jupyter ==1.0.0
  • jupyter-console ==6.6.3
  • jupyter-events ==0.10.0
  • jupyter-lsp ==2.2.5
  • jupyter_client ==8.6.2
  • jupyter_core ==5.7.2
  • jupyter_server ==2.14.0
  • jupyter_server_terminals ==0.5.3
  • jupyterlab ==4.2.1
  • jupyterlab_pygments ==0.3.0
  • jupyterlab_server ==2.27.2
  • jupyterlab_widgets ==3.0.10
  • keras ==2.15.0
  • kiwisolver ==1.4.5
  • libclang ==18.1.1
  • llvmlite ==0.43.0
  • lxml ==5.2.2
  • markdown-it-py ==3.0.0
  • matplotlib ==3.9.0
  • matplotlib-inline ==0.1.7
  • mdurl ==0.1.2
  • mistune ==3.0.2
  • ml-dtypes ==0.3.2
  • namex ==0.0.8
  • nbclient ==0.10.0
  • nbconvert ==7.16.4
  • nbformat ==5.10.4
  • nest-asyncio ==1.6.0
  • notebook ==7.2.0
  • notebook_shim ==0.2.4
  • numba ==0.60.0
  • numpy ==1.26.4
  • oauthlib ==3.2.2
  • opt-einsum ==3.3.0
  • optree ==0.11.0
  • overrides ==7.7.0
  • packaging ==24.0
  • pandas ==2.2.2
  • pandocfilters ==1.5.1
  • parso ==0.8.4
  • pexpect ==4.9.0
  • pillow ==10.3.0
  • platformdirs ==4.2.2
  • pluggy ==1.5.0
  • prometheus_client ==0.20.0
  • prompt-toolkit ==3.0.43
  • protobuf ==4.25.3
  • psutil ==5.9.8
  • ptyprocess ==0.7.0
  • pure-eval ==0.2.2
  • pyasn1 ==0.6.0
  • pyasn1_modules ==0.4.0
  • pycparser ==2.22
  • pyparsing ==3.1.2
  • pytest ==8.2.1
  • python-dateutil ==2.9.0.post0
  • python-json-logger ==2.0.7
  • pytils ==0.4.1
  • pytz ==2024.1
  • pyzmq ==26.0.3
  • qtconsole ==5.5.2
  • referencing ==0.35.1
  • requests ==2.32.2
  • requests-oauthlib ==2.0.0
  • rfc3339-validator ==0.1.4
  • rfc3986-validator ==0.1.1
  • rich ==13.7.1
  • rpds-py ==0.18.1
  • rsa ==4.9
  • scikit-learn ==1.5.0
  • scipy ==1.13.1
  • six ==1.16.0
  • sniffio ==1.3.1
  • soupsieve ==2.5
  • stack-data ==0.6.3
  • tensorboard ==2.15.2
  • tensorboard-data-server ==0.7.2
  • tensorflow ==2.15.1
  • tensorflow-estimator ==2.15.0
  • tensorflow-io-gcs-filesystem ==0.37.0
  • tensorflow-ranking ==0.5.5
  • tensorflow-recommenders ==0.7.3
  • tensorflow-serving-api ==2.15.1
  • termcolor ==2.4.0
  • terminado ==0.18.1
  • threadpoolctl ==3.5.0
  • tinycss2 ==1.3.0
  • tomli ==2.0.1
  • tools ==0.1.9
  • tornado ==6.4
  • traitlets ==5.14.3
  • types-python-dateutil ==2.9.0.20240316
  • typing_extensions ==4.12.0
  • tzdata ==2024.1
  • uri-template ==1.3.0
  • urllib3 ==2.2.1
  • wcwidth ==0.2.13
  • webcolors ==1.13
  • webencodings ==0.5.1
  • websocket-client ==1.8.0
  • widgetsnbextension ==4.0.10
  • wrapt ==1.14.1