Choice-Learn
Choice-Learn: Large-scale choice modeling for operational contexts through the lens of machine learning - Published in JOSS (2024)
Science Score: 98.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ✓ DOI references: found 13 DOI reference(s) in README and JOSS metadata
- ✓ Academic publication links: arxiv.org, researchgate.net, sciencedirect.com, joss.theoj.org
- ○ Committers with academic emails
- ○ Institutional organization owner
- ✓ JOSS paper metadata: published in Journal of Open Source Software
Repository
Discrete choice modeling in Python with large datasets & models - Assortment & Pricing Optimization.
Basic Info
- Host: GitHub
- Owner: artefactory
- License: MIT
- Language: Python
- Default Branch: main
- Homepage: https://artefactory.github.io/choice-learn
- Size: 34.6 MB
Statistics
- Stars: 72
- Watchers: 3
- Forks: 7
- Open Issues: 25
- Releases: 14
Metadata Files
README.md
Choice-Learn is a Python package designed to help you formulate, estimate, and deploy discrete choice models, e.g., for assortment planning. The package provides ready-to-use datasets and models studied in the academic literature. It also offers a lower-level interface if you wish to customize the specification of a choice model or formulate your own model from scratch. Choice-Learn efficiently handles large-scale choice data by limiting RAM usage.
Choice-Learn uses NumPy and pandas as data backend engines and TensorFlow for models.
:trident: Table of Contents
- Introduction - Discrete Choice modeling
- What's in there ?
- Getting Started
- Installation
- Usage
- Documentation
- Contributing
- Citation
- References
:trident: Introduction - Discrete Choice modeling
Discrete choice models aim to explain or predict choices over a set of alternatives. Well-known use cases include analyzing people's choice of means of transport or product purchases in stores.
If you are new to choice modeling, you can check this resource. The notebooks from the Getting Started section can also help you understand choice modeling and, more importantly, apply it to your own use case.
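As a concrete illustration of predicting choices over a set of alternatives, the sketch below computes multinomial logit (MNL) choice probabilities from utilities in plain Python. It does not use Choice-Learn; the function name `mnl_probabilities` and the utility values are assumptions made for this example.

```python
import math


def mnl_probabilities(utilities):
    """Return multinomial-logit choice probabilities for one choice situation."""
    # Subtracting the maximum utility improves numerical stability
    # and leaves the probabilities unchanged.
    max_u = max(utilities)
    exp_u = [math.exp(u - max_u) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]


# Hypothetical utilities for three transport modes: car, bus, train.
probabilities = mnl_probabilities([1.0, 0.0, 0.5])
print([round(p, 3) for p in probabilities])
```

The alternative with the highest utility gets the highest probability, and the probabilities always sum to one.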
:trident: What's in there ?
Data
- The ChoiceDataset class can handle choice datasets with efficient memory management. It can be used on your own dataset. [Example]
- Many academic datasets are integrated in the library and ready to be used:
| Dataset | Raw Data | Origin | from choice_learn.datasets import | Doc |
| ---------- | :----: | ------ | ------ | :---: |
| SwissMetro | csv | Bierlaire et al. (2001) [2] | load_swissmetro | # |
| ModeCanada | csv | Forinash and Koppelman (1993) [3] | load_modecanada | # |
| Train | csv | Ben-Akiva et al. (1993) [5] | load_train | # |
| Heating | csv | Kenneth Train's website | load_heating | # |
| HC | csv | Kenneth Train's website | load_hc | # |
| Electricity | csv | Kenneth Train's website | load_electricity | # |
| Stated Car Preferences | csv | McFadden and Train (2000) [9] | load_car_preferences | # |
| TaFeng Grocery Dataset | csv | Kaggle | load_tafeng | # |
| ICDM-2013 Expedia | url | Ben Hamner and Friedman (2013) [6] | load_expedia | # |
| London Passenger Mode Choice | url | Hillel et al. (2018) [11] | load_londonpassenger | # |
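Most of these datasets come as "long" tables with one row per choice situation and alternative. The sketch below shows, in plain Python, how such rows can be regrouped into one record per choice with shared features, per-item features and the index of the chosen item. It is only an illustration of the data layout, not the ChoiceDataset implementation; the rows, the `group_by_choice` helper and the assumption that every item appears in every choice situation are made up for this example.

```python
from collections import defaultdict

# Hypothetical long-format rows: one row per (choice situation, alternative).
long_rows = [
    {"case": 1, "alt": "car", "cost": 30.0, "income": 45, "choice": 1},
    {"case": 1, "alt": "train", "cost": 50.0, "income": 45, "choice": 0},
    {"case": 2, "alt": "car", "cost": 35.0, "income": 70, "choice": 0},
    {"case": 2, "alt": "train", "cost": 40.0, "income": 70, "choice": 1},
]


def group_by_choice(rows, items):
    """Group long-format rows into one record per choice situation."""
    grouped = defaultdict(dict)
    for row in rows:
        grouped[row["case"]][row["alt"]] = row

    records = []
    for case_id in sorted(grouped):
        by_alt = grouped[case_id]
        first = next(iter(by_alt.values()))
        records.append({
            "case": case_id,
            # Shared features are identical for every alternative of a choice.
            "shared_features": {"income": first["income"]},
            # Item features follow the order given in `items`.
            "items_features": [[by_alt[item]["cost"]] for item in items],
            # Position of the chosen alternative in `items`.
            "choice": next(
                i for i, item in enumerate(items) if by_alt[item]["choice"] == 1
            ),
        })
    return records


records = group_by_choice(long_rows, items=["car", "train"])
print(records[0])
```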
Model estimation
- Different models are already implemented. You can import and parametrize the models for your own usage.
- Otherwise, custom modeling is made easy by subclassing the ChoiceModel class and specifying your own utility function. [Example]
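To make the idea of "specifying your own utility function" concrete, here is a stand-alone sketch of a model defined by its utility and evaluated with an average negative log-likelihood. It deliberately does not subclass the library's ChoiceModel; the class `LinearUtilityModel`, its parameters and the data are hypothetical and only illustrate the pattern.

```python
import math


class LinearUtilityModel:
    """Toy model with utility_i = intercept_i + beta_price * price_i."""

    def __init__(self, intercepts, beta_price):
        self.intercepts = intercepts
        self.beta_price = beta_price

    def utilities(self, prices):
        # The utility function is the only model-specific part.
        return [a + self.beta_price * p for a, p in zip(self.intercepts, prices)]

    def probabilities(self, prices):
        # Softmax over the utilities of the available items.
        u = self.utilities(prices)
        max_u = max(u)
        exp_u = [math.exp(x - max_u) for x in u]
        total = sum(exp_u)
        return [e / total for e in exp_u]

    def average_nll(self, prices_by_choice, choices):
        # Average negative log-likelihood of the observed choices.
        nll = 0.0
        for prices, chosen in zip(prices_by_choice, choices):
            nll -= math.log(self.probabilities(prices)[chosen])
        return nll / len(choices)


toy_model = LinearUtilityModel(intercepts=[0.0, 0.5], beta_price=-1.0)
loss = toy_model.average_nll(
    prices_by_choice=[[1.0, 2.0], [2.0, 1.0]], choices=[0, 1]
)
print("average negative log-likelihood:", loss)
```

Estimating such a model means searching for the parameters that minimize this loss; swapping in a different `utilities` method changes the model without touching the rest.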
List of implemented & ready-to-use models:
| Model | Example | Colab | Related Paper | from choice_learn.models import | Doc |
| ---------- | -------- | -------- | ------ | ------ | :---: |
| MNL | notebook | | | SimpleMNL | # |
| Conditional Logit | notebook | | Train et al. [4] | ConditionalLogit | # |
| Nested Logit | notebook | | McFadden [10] | NestedLogit | # |
| Latent Class MNL | notebook | | | LatentClassConditionalLogit | # |
| Halo MNL | notebook | | Maragheh et al. [14] | HaloMNL | # |
| Low-Rank Halo MNL | notebook | | Ko and Li [15] | LowRankHaloMNL | # |
| NN-based Model | Example | Colab | Related Paper | from choice_learn.models import | Doc |
| ---------- | -------- | ------ | ---- | ------ | :---: |
| RUMnet | notebook | | Aouad and Désir [1] | RUMnet | # |
| TasteNet | notebook | | Han et al. [7] | TasteNet | # |
| Learning-MNL | notebook | | Sifringer et al. [13] | LearningMNL | # |
| ResLogit | notebook | | Wong and Farooq [12] | ResLogit | # |
| Basket Model | Example | Colab | Related Paper | from choice_learn.basket_models import | Doc |
| ---------- | -------- | ------ | ---- | ------ | :---: |
| Shopper | notebook | | Ruiz et al. [16] | Shopper | # |
| Alea Carta | notebook | | Désir et al. [17] | AleaCarta | # |
| Base Attention | notebook | | Wang et al. [18] | AttentionBasedContextEmbedding | # |
Auxiliary tools
Algorithms leveraging choice models are integrated within the library:
- Assortment & Pricing optimization algorithms [Example] [8]
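To give an idea of what assortment optimization means, the sketch below enumerates every assortment of limited size and keeps the one with the highest expected revenue under an MNL model with a no-purchase option. This brute-force search is only an illustration for tiny catalogs and is not the library's algorithm, which relies on Gurobi or OR-Tools; the function names, utilities and prices are assumptions.

```python
import math
from itertools import combinations


def expected_revenue(assortment, utilities, prices):
    """Expected revenue under MNL with a no-purchase option of utility 0."""
    weights = {i: math.exp(utilities[i]) for i in assortment}
    denominator = 1.0 + sum(weights.values())  # 1.0 is exp(0) for no purchase
    return sum(prices[i] * w for i, w in weights.items()) / denominator


def best_assortment(utilities, prices, max_size):
    """Enumerate all assortments with at most `max_size` items."""
    best, best_revenue = (), 0.0
    for size in range(1, max_size + 1):
        for assortment in combinations(range(len(utilities)), size):
            revenue = expected_revenue(assortment, utilities, prices)
            if revenue > best_revenue:
                best, best_revenue = assortment, revenue
    return best, best_revenue


# Hypothetical utilities and prices for three products.
utilities = [1.0, 0.5, 0.0]
prices = [4.0, 6.0, 10.0]
assortment, revenue = best_assortment(utilities, prices, max_size=2)
print(assortment, round(revenue, 3))
```

The number of candidate assortments grows combinatorially with the catalog size, which is why dedicated solvers are used in practice.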
:trident: Getting Started
The following tutorials will help you get started with the package:
- Generic and simple introduction [notebook][doc]
- Detailed explanations of data handling depending on the data format [notebook][doc]
- A detailed example of conditional logit estimation [notebook][doc]
- Introduction to custom modeling and more complex parametrization [notebook][doc]
- All models and algorithms have a companion example in the notebook directory
:trident: Installation
User installation
To install the required packages in a virtual environment, run the following command:
```bash
make install
```
The easiest way is to install the package with pip:
```bash
pip install choice-learn
```
Otherwise, you can use the git repository to get the latest version:
```bash
git clone git@github.com:artefactory/choice-learn.git
```
Dependencies
For manual installation, Choice-Learn requires the following:
- Python (>=3.9, <3.13)
- NumPy (>=1.24)
- pandas (>=1.5)
For modeling you need:
- TensorFlow (>=2.14, <2.17)
:warning: Warning: If you are a Mac user with an M1 or M2 chip, importing TensorFlow might crash Python. In that case, use Anaconda to install TensorFlow with `conda install -c apple tensorflow`.
An optional requirement used for coefficient analysis and L-BFGS optimization is:
- TensorFlow Probability (>=0.22)
Finally, for pricing or assortment optimization, you need either Gurobi or OR-Tools:
- gurobipy (>=11.0)
- ortools (>=9.6)
:bulb: Tip: You can use the poetry.lock or requirements-complete.txt files with poetry or pip to install a fully predetermined and working environment.
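If you install the dependencies manually, a small script can confirm that the installed versions fall within the ranges listed above. The sketch below uses only the standard library; the `REQUIREMENTS` table and helper names are assumptions written for this example, and only the major and minor version numbers are compared.

```python
import re
from importlib import metadata

# Ranges copied from the list above: (minimum inclusive, maximum exclusive or None).
REQUIREMENTS = {
    "numpy": ((1, 24), None),
    "pandas": ((1, 5), None),
    "tensorflow": ((2, 14), (2, 17)),
}


def parse_major_minor(version):
    """Turn a version string such as '2.15.1' into the tuple (2, 15)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version}")
    return int(match.group(1)), int(match.group(2))


def in_range(version, minimum, maximum):
    parsed = parse_major_minor(version)
    return parsed >= minimum and (maximum is None or parsed < maximum)


def check_environment():
    """Return a short status message for each required package."""
    report = {}
    for package, (minimum, maximum) in REQUIREMENTS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        if in_range(installed, minimum, maximum):
            report[package] = "ok"
        else:
            report[package] = f"{installed} is out of range"
    return report


print(check_environment())
```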
:trident: Usage
Here is a short example of model parametrization to estimate a Conditional Logit on the ModeCanada dataset.
```python
from choice_learn.data import ChoiceDataset
from choice_learn.models import ConditionalLogit, RUMnet
from choice_learn.datasets import load_modecanada

transport_df = load_modecanada(as_frame=True)

# Instantiation of a ChoiceDataset from a pandas.DataFrame
dataset = ChoiceDataset.from_single_long_df(
    df=transport_df,
    items_id_column="alt",
    choices_id_column="case",
    choices_column="choice",
    shared_features_columns=["income"],
    items_features_columns=["cost", "freq", "ovt", "ivt"],
    choice_format="one_zero",
)

# Initialization of the model
model = ConditionalLogit()

# Creation of the different weights:
# add_coefficients adds one coefficient for each specified item index.
# intercept and income are added for each item except the first one, which needs to be zeroed.
model.add_coefficients(feature_name="intercept", items_indexes=[1, 2, 3])
model.add_coefficients(feature_name="income", items_indexes=[1, 2, 3])
model.add_coefficients(feature_name="ivt", items_indexes=[0, 1, 2, 3])

# add_shared_coefficient adds one coefficient that is used for all items specified in items_indexes.
# Here, the cost, freq and ovt coefficients are shared between all items.
model.add_shared_coefficient(feature_name="cost", items_indexes=[0, 1, 2, 3])
model.add_shared_coefficient(feature_name="freq", items_indexes=[0, 1, 2, 3])
model.add_shared_coefficient(feature_name="ovt", items_indexes=[0, 1, 2, 3])

history = model.fit(dataset, get_report=True)
print("The average neg-loglikelihood is:", model.evaluate(dataset).numpy())
print(model.report)
```
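The difference between item-specific and shared coefficients in the specification above can be spelled out as an explicit utility function. The sketch below is plain Python and does not call Choice-Learn; the coefficient values are invented purely to show the structure and are not estimates from the ModeCanada data.

```python
# Hypothetical coefficient values, chosen only to illustrate the structure.
intercept = {1: 0.7, 2: 1.2, 3: 3.0}  # item-specific, item 0 fixed to zero
beta_income = {1: 0.03, 2: 0.04, 3: 0.01}  # item-specific, item 0 fixed to zero
beta_ivt = {0: -0.01, 1: -0.01, 2: -0.02, 3: -0.002}  # one value per item
beta_cost, beta_freq, beta_ovt = -0.04, 0.09, -0.04  # shared by all items


def utility(item, income, cost, freq, ovt, ivt):
    """Utility of one item, following the specification of the example above."""
    return (
        intercept.get(item, 0.0)
        + beta_income.get(item, 0.0) * income
        + beta_ivt[item] * ivt
        + beta_cost * cost
        + beta_freq * freq
        + beta_ovt * ovt
    )


# Utilities of the four items for one traveler, with identical item features.
example = [
    utility(i, income=50, cost=60.0, freq=4, ovt=40.0, ivt=200.0) for i in range(4)
]
print(example)
```

With identical item features, the utilities differ only through the item-specific terms (intercept, income and ivt), while the shared coefficients contribute the same amount to every item.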
:trident: Documentation
Detailed documentation of this project is available at https://artefactory.github.io/choice-learn.
TensorFlow also has extensive documentation that can help you.
An academic paper has been published in the Journal of Open Source Software: https://doi.org/10.21105/joss.06899.
:trident: Contributing
You are welcome to contribute to the project! You can help in various ways:
- raise issues
- resolve issues already opened
- develop new features
- provide additional examples of use
- fix typos, improve code quality
- develop new tests
We recommend opening an issue first to discuss your ideas. More details are given here.
:trident: Citation
If you find this package or any of its features useful for your research, consider citing our paper:
```bibtex
@article{Auriau2024,
  doi = {10.21105/joss.06899},
  url = {https://doi.org/10.21105/joss.06899},
  year = {2024},
  publisher = {The Open Journal},
  volume = {9},
  number = {101},
  pages = {6899},
  author = {Vincent Auriau and Ali Aouad and Antoine Désir and Emmanuel Malherbe},
  title = {Choice-Learn: Large-scale choice modeling for operational contexts through the lens of machine learning},
  journal = {Journal of Open Source Software}
}
```
If you make use of the AleaCarta model [17], consider citing the corresponding paper:
```bibtex
@article{Desir2025,
  doi = {},
  url = {},
  year = {2025},
  publisher = {},
  volume = {},
  number = {},
  pages = {},
  author = {Jules Désir and Vincent Auriau and Martin Možina and Emmanuel Malherbe},
  title = {Better Capturing Interactions between Products in Retail: Revisited Negative Sampling for Basket Choice Modeling},
  journal = {}
}
```
License
The use of this software is under the MIT license, with no limitation of usage, including for commercial applications.
Affiliations
Choice-Learn has been developed through a collaboration between researchers at the Artefact Research Center and the MICS laboratory at CentraleSupélec, Université Paris-Saclay.
:trident: References
Papers
[1] Representing Random Utility Choice Models with Neural Networks, Aouad, A.; Désir, A. (2022)\
[2] The Acceptance of Model Innovation: The Case of Swissmetro, Bierlaire, M.; Axhausen, K. W.; Abay, G. (2001)\
[3] Applications and Interpretation of Nested Logit Models of Intercity Mode Choice, Forinash, C. V.; Koppelman, F. S. (1993)\
[4] The Demand for Local Telephone Service: A Fully Discrete Model of Residential Calling Patterns and Service Choices, Train, K. E.; McFadden, D. L.; Ben-Akiva, M. (1987)\
[5] Estimation of Travel Choice Models with Randomly Distributed Values of Time, Ben-Akiva, M.; Bolduc, D.; Bradley, M. (1993)\
[6] Personalize Expedia Hotel Searches - ICDM 2013, Ben Hamner, A.; Friedman, D.; SSA_Expedia. (2013)\
[7] A Neural-embedded Discrete Choice Model: Learning Taste Representation with Strengthened Interpretability, Han, Y.; Pereira, F. C.; Ben-Akiva, M.; Zegras, C. (2020)\
[8] A branch-and-cut algorithm for the latent-class logit assortment problem, Méndez-Díaz, I.; Miranda-Bront, J. J.; Vulcano, G.; Zabala, P. (2014)\
[9] Stated Preferences for Car Choice in Mixed MNL models for discrete response, McFadden, D.; Train, K. (2000)\
[10] Modeling the Choice of Residential Location, McFadden, D. (1978)\
[11] Recreating passenger mode choice-sets for transport simulation: A case study of London, UK, Hillel, T.; Elshafie, M. Z. E. B.; Jin, Y. (2018)\
[12] ResLogit: A residual neural network logit model for data-driven choice modelling, Wong, M.; Farooq, B. (2021)\
[13] Enhancing Discrete Choice Models with Representation Learning, Sifringer, B.; Lurkin, V.; Alahi, A. (2018)\
[14] A Customer Choice Model with HALO Effect, Maragheh, R. Y.; Chronopoulou, A.; Davis, J. M. (2018)\
[15] Modeling Choice via Self-Attention, Ko, J.; Li, A. A. (2023)\
[16] SHOPPER: A Probabilistic Model of Consumer Choice with Substitutes and Complements, Ruiz, F. J. R.; Athey, S.; Blei, D. M. (2019)\
[17] Better Capturing Interactions between Products in Retail: Revisited Negative Sampling for Basket Choice Modeling, Désir, J.; Auriau, V.; Možina, M.; Malherbe, E. (2025), ECML PKDD\
[18] Attention-based Transactional Context Embedding for Next-Item Recommendation, Wang, S.; Liang, H.; Longbing, C.; Xiaoshui, H.; Defu, L.; Wei, L. (2018)
Code and Repositories
Official models implementations:
[1] RUMnet\
[7] TasteNet [Repo1] [Repo2]\
[12] ResLogit\
[13] Learning-MNL\
[16] Shopper\
[17] AleaCarta
Owner
- Name: artefactory
- Login: artefactory
- Kind: organization
- Repositories: 12
- Profile: https://github.com/artefactory
JOSS Publication
Choice-Learn: Large-scale choice modeling for operational contexts through the lens of machine learning
Tags
choice, operations, machine learning
GitHub Events
Total
- Create event: 60
- Release event: 3
- Issues event: 30
- Watch event: 31
- Delete event: 58
- Member event: 2
- Issue comment event: 227
- Push event: 321
- Pull request review comment event: 165
- Pull request review event: 74
- Pull request event: 101
- Fork event: 5
Last Year
- Create event: 60
- Release event: 3
- Issues event: 30
- Watch event: 31
- Delete event: 58
- Member event: 2
- Issue comment event: 229
- Push event: 321
- Pull request review comment event: 174
- Pull request review event: 75
- Pull request event: 101
- Fork event: 5
Committers
Last synced: 5 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| VincentAuriau | a****t@g****m | 625 |
| Jules DÉSIR | d****s@g****m | 65 |
| pre-commit-ci[bot] | 6****] | 12 |
| Emmanuel MALHERBE | e****e@a****m | 9 |
| Emmanuel MALHERBE | e****e@F****l | 5 |
| ma-aouad | a****a@g****m | 2 |
| chicham | h****o@g****m | 2 |
| Luca Serra | l****a@h****r | 1 |
| Scaffolder | s****r@b****o | 1 |
| Emmanuel MALHERBE | e****e@F****l | 1 |
Issues and Pull Requests
Last synced: 4 months ago
All Time
- Total issues: 54
- Total pull requests: 238
- Average time to close issues: 24 days
- Average time to close pull requests: 3 days
- Total issue authors: 5
- Total pull request authors: 8
- Average comments per issue: 0.33
- Average comments per pull request: 1.86
- Merged pull requests: 214
- Bot issues: 1
- Bot pull requests: 25
Past Year
- Issues: 25
- Pull requests: 112
- Average time to close issues: 14 days
- Average time to close pull requests: 5 days
- Issue authors: 3
- Pull request authors: 5
- Average comments per issue: 0.32
- Average comments per pull request: 3.51
- Merged pull requests: 95
- Bot issues: 1
- Bot pull requests: 25
Top Authors
Issue Authors
- VincentAuriau (47)
- tmigot (2)
- chicham (2)
- samuelduchesne (2)
- pre-commit-ci[bot] (1)
Pull Request Authors
- VincentAuriau (190)
- pre-commit-ci[bot] (23)
- julesdesir (13)
- chicham (6)
- ma-aouad (2)
- dependabot[bot] (2)
- luca-serra (1)
- EmmanuelMalherbe (1)
Packages
- Total packages: 1
- Total downloads: 169 last month (pypi)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 11
- Total maintainers: 1
pypi.org: choice-learn
Large-scale choice modeling through the lens of machine learning.
- Homepage: https://github.com/artefactory/choice-learn
- Documentation: https://artefactory.github.io/choice-learn
- License: MIT
- Latest release: 1.2.0 (published 6 months ago)
Dependencies
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- actions/checkout v2 composite
- actions/setup-python v2 composite
- codespell ^2.2 develop
- ipykernel ^6.9 develop
- nbstripout ^0.7 develop
- pre-commit ^3.3 develop
- pytest ^7.3.2 develop
- pytest-cov ^4.1 develop
- python-markdown-math ^0.8 develop
- ruff ^0.1.2 develop
- bandit ^1.7.5 docs
- mkdocs ^1.5 docs
- mkdocs-material ^9.5.3 docs
- mkdocs-nbconvert ^0.2.1 docs
- mkdocstrings-python ^1.7.5 docs
- nbstripout ^0.6.1 docs
- python-markdown-math ^0.8 docs
- numpy ^1.24.3
- pandas ^1.5.3
- python ^3.8
- tensorflow ^2.11.0
- tensorflow-probability ^0.20.1
- tqdm ^4.0.0
- Bottleneck ==1.3.7
- Brotli ==1.0.9
- Keras-Preprocessing ==1.1.2
- Markdown ==3.4.1
- MarkupSafe ==2.1.3
- PyJWT ==2.4.0
- PySocks ==1.7.1
- Pygments ==2.17.2
- Werkzeug ==2.3.8
- absl-py ==1.4.0
- aiohttp ==3.9.3
- aiosignal ==1.2.0
- appnope ==0.1.4
- asttokens ==2.4.1
- astunparse ==1.6.3
- async-timeout ==4.0.3
- attrs ==23.1.0
- backcall ==0.2.0
- blinker ==1.6.2
- cachetools ==4.2.2
- certifi ==2024.2.2
- cffi ==1.16.0
- charset-normalizer ==2.0.4
- click ==8.1.7
- cloudpickle ==2.2.1
- comm ==0.2.2
- cryptography ==41.0.3
- debugpy ==1.6.7
- decorator ==5.1.1
- dm-tree ==0.1.7
- executing ==2.0.1
- flatbuffers ==2.0
- frozenlist ==1.4.0
- gast ==0.4.0
- google-auth ==2.6.0
- google-auth-oauthlib ==0.4.4
- google-pasta ==0.2.0
- grpcio ==1.42.0
- h5py ==3.9.0
- idna ==3.4
- importlib_metadata ==7.0.2
- ipykernel ==6.29.3
- ipython ==8.12.0
- jax ==0.3.25
- jaxlib ==0.3.25
- jedi ==0.19.1
- jupyter_client ==8.6.1
- jupyter_core ==5.7.2
- keras ==2.11.0
- matplotlib-inline ==0.1.6
- multidict ==6.0.4
- nest_asyncio ==1.6.0
- numexpr ==2.8.4
- numpy ==1.24.3
- oauthlib ==3.2.2
- opt-einsum ==3.3.0
- packaging ==24.0
- pandas ==2.0.3
- parso ==0.8.3
- pexpect ==4.9.0
- pickleshare ==0.7.5
- pip ==23.3.1
- platformdirs ==4.2.0
- pooch ==1.7.0
- prompt-toolkit ==3.0.42
- protobuf ==3.20.3
- psutil ==5.9.8
- ptyprocess ==0.7.0
- pure-eval ==0.2.2
- pyOpenSSL ==23.2.0
- pyasn1 ==0.4.8
- pyasn1-modules ==0.2.8
- pycparser ==2.21
- python-dateutil ==2.8.2
- pytz ==2023.3.post1
- pyzmq ==24.0.1
- requests ==2.31.0
- requests-oauthlib ==1.3.0
- rsa ==4.7.2
- scipy ==1.10.1
- setuptools ==68.2.2
- six ==1.16.0
- stack-data ==0.6.2
- tensorboard ==2.11.0
- tensorboard-data-server ==0.6.1
- tensorboard-plugin-wit ==1.6.0
- tensorflow ==2.11.0
- tensorflow-estimator ==2.11.0
- tensorflow-probability ==0.19.0
- termcolor ==2.1.0
- tornado ==6.4
- tqdm ==4.65.0
- traitlets ==5.14.2
- typing_extensions ==4.10.0
- tzdata ==2023.3
- urllib3 ==2.1.0
- wcwidth ==0.2.13
- wheel ==0.35.1
- wrapt ==1.14.1
- yarl ==1.9.3
- zipp ==3.17.0
- bandit ==1.7.5 development
- ipykernel ==6.24.0 development
- mkdocs ==1.5.3 development
- mkdocs-material ==9.5.3 development
- mkdocs-nbconvert ==0.2.1 development
- mkdocstrings-python ==1.7.5 development
- nbstripout ==0.6.1 development
- pre-commit ==3.3.3 development
- pytest ==7.3.2 development
- python-markdown-math * development
- ruff ==0.1.2 development
- numpy ==1.24.3
- pandas ==1.5.3
- tensorflow ==2.13.0
- tensorflow_probability ==0.20.1
- tqdm ==4.65.0
- actions/checkout v3 composite
- actions/setup-python v4 composite
- ./.github/actions/publish * composite
- actions/checkout v4 composite
- actions/setup-python v5 composite
- ./.github/actions/publish * composite
- actions/checkout v4 composite
- 138 dependencies
