skada
Domain adaptation toolbox compatible with scikit-learn and pytorch
Science Score: 77.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ✓ DOI references: Found 3 DOI reference(s) in README
- ✓ Academic publication links: Links to arxiv.org, scholar.google, zenodo.org
- ✓ Committers with academic emails: 3 of 17 committers (17.6%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.3%) to scientific vocabulary
Repository
Domain adaptation toolbox compatible with scikit-learn and pytorch
Basic Info
- Host: GitHub
- Owner: scikit-adaptation
- License: bsd-3-clause
- Language: Python
- Default Branch: main
- Homepage: https://scikit-adaptation.github.io/
- Size: 1.97 MB
Statistics
- Stars: 132
- Watchers: 6
- Forks: 34
- Open Issues: 73
- Releases: 5
Metadata Files
README.md
SKADA - Domain Adaptation with scikit-learn and PyTorch
SKADA is a library for domain adaptation (DA) with a scikit-learn and PyTorch/skorch compatible API with the following features:
- DA estimators and transformers with a scikit-learn compatible API (fit, transform, predict).
- PyTorch/skorch API for deep learning DA algorithms.
- Classifier/Regressor and data Adapter DA algorithms compatible with scikit-learn pipelines.
- Compatible with scikit-learn validation loops (`cross_val_score`, `GridSearchCV`, etc.).
Citation: If you use this library in your research, please cite the following reference:
Gnassounou T., Kachaiev O., Flamary R., Collas A., Lalou Y., de Mathelin A., Gramfort A., Bueno R., Michel F., Mellot A., Loison V., Odonnat A., Moreau T. (2024). SKADA : Scikit Adaptation (version 0.3.0). URL: https://scikit-adaptation.github.io/
or in BibTeX format:
```bibtex
@misc{gnassounou2024skada,
  author = {Gnassounou, Théo and Kachaiev, Oleksii and Flamary, Rémi and Collas, Antoine and Lalou, Yanis and de Mathelin, Antoine and Gramfort, Alexandre and Bueno, Ruben and Michel, Florent and Mellot, Apolline and Loison, Virginie and Odonnat, Ambroise and Moreau, Thomas},
  month = {7},
  title = {SKADA : Scikit Adaptation},
  url = {https://scikit-adaptation.github.io/},
  year = {2024}
}
```
Implemented algorithms
The following algorithms are currently implemented.
Domain adaptation algorithms
- Sample reweighting methods (Gaussian [1], Discriminant [2], KLIEPReweight [3], DensityRatio [4], TarS [21], KMMReweight [23])
- Sample mapping methods (CORAL [5], Optimal Transport DA OTDA [6], LinearMonge [7], LS-ConS [21])
- Subspace methods (SubspaceAlignment [8], TCA [9], Transfer Subspace Learning [27])
- Other methods (JDOT [10], DASVM [11], OT Label Propagation [28])
Any method that can be cast as an adaptation of the input data can be used in one of two ways:
- as a ready-to-use Classifier/Regressor estimator (for example `CORAL`),
- or as an Adapter, a scikit-learn transformer (for example `CORALAdapter`), that can be used in a DA pipeline with `make_da_pipeline`.
Refer to the examples below and visit the gallery for more details.
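To make the idea of a sample mapping method concrete, here is a minimal NumPy sketch of correlation alignment in the spirit of CORAL [5]: the source features are whitened with the source covariance and re-colored with the target covariance. The function names `coral_map` and `_sym_matrix_power` and the `reg` parameter are illustrative assumptions, not SKADA's API, and the library's own implementation may differ in its details.

```python
import numpy as np


def _sym_matrix_power(mat, power):
    """Raise a symmetric positive-definite matrix to a real power."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * eigvals**power) @ eigvecs.T


def coral_map(X_source, X_target, reg=1e-6):
    """Align the second-order statistics of the source with the target."""
    n_features = X_source.shape[1]
    # Regularized covariances keep both matrices invertible.
    cov_s = np.cov(X_source, rowvar=False) + reg * np.eye(n_features)
    cov_t = np.cov(X_target, rowvar=False) + reg * np.eye(n_features)
    # Whiten with the source covariance, then re-color with the target one.
    mapping = _sym_matrix_power(cov_s, -0.5) @ _sym_matrix_power(cov_t, 0.5)
    return X_source @ mapping


# Hypothetical toy data: the target features are a linear distortion of noise.
rng = np.random.default_rng(0)
X_source = rng.normal(size=(500, 3))
distortion = np.array([[2.0, 0.5, 0.0], [0.0, 1.0, 0.3], [0.0, 0.0, 0.5]])
X_target = rng.normal(size=(500, 3)) @ distortion

X_source_adapted = coral_map(X_source, X_target)
```

After the mapping, the covariance of `X_source_adapted` matches that of `X_target` up to the small regularization term, so a classifier trained on the adapted source sees features with target-like second-order statistics.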
Deep learning domain adaptation algorithms
- Deep Correlation alignment (DeepCORAL [12])
- Deep joint distribution optimal (DeepJDOT [13])
- Divergence minimization (MMD/DAN [14])
- Adversarial/discriminator based DA (DANN [15], CDAN [16])
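As an illustration of the divergence minimization family, the sketch below computes a biased estimate of the squared Maximum Mean Discrepancy (MMD) between two batches of features with a single Gaussian kernel. It is written in NumPy for readability; DAN [14] uses a multi-kernel variant inside a neural network, and the names `gaussian_kernel`, `mmd2` and `gamma` are assumptions for this example rather than SKADA functions.

```python
import numpy as np


def gaussian_kernel(A, B, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def mmd2(features_source, features_target, gamma=1.0):
    """Biased estimate of the squared MMD between two feature batches."""
    k_ss = gaussian_kernel(features_source, features_source, gamma)
    k_tt = gaussian_kernel(features_target, features_target, gamma)
    k_st = gaussian_kernel(features_source, features_target, gamma)
    return k_ss.mean() + k_tt.mean() - 2.0 * k_st.mean()


# Hypothetical feature batches: one target close to the source, one shifted.
rng = np.random.default_rng(0)
feats_s = rng.normal(size=(200, 4))
feats_t_close = rng.normal(size=(200, 4))
feats_t_far = rng.normal(loc=2.0, size=(200, 4))

mmd_same = mmd2(feats_s, feats_s)
mmd_close = mmd2(feats_s, feats_t_close)
mmd_far = mmd2(feats_s, feats_t_far)
```

In a deep DA method of this kind, a quantity like `mmd2` would be added to the classification loss so that the network learns features whose source and target distributions are close.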
DA metrics
- Importance Weighted [17]
- Prediction entropy [18]
- Soft neighborhood density [19]
- Deep Embedded Validation (DEV) [20]
- Circular Validation [11]
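These metrics matter because target labels are usually unavailable at validation time. As a rough illustration of the prediction entropy idea [18], the sketch below averages the Shannon entropy of predicted class probabilities on target samples: confident predictions give low entropy. The helper `mean_prediction_entropy` is hypothetical, and SKADA's scorer may use a different sign convention or normalization.

```python
import numpy as np


def mean_prediction_entropy(proba, eps=1e-12):
    """Average Shannon entropy (in nats) of per-sample class probabilities."""
    # Clip to avoid log(0) for one-hot predictions.
    proba = np.clip(proba, eps, 1.0)
    return float(np.mean(-np.sum(proba * np.log(proba), axis=1)))


# Hypothetical predicted probabilities on two target samples.
confident = np.array([[0.99, 0.01], [0.02, 0.98]])
uncertain = np.array([[0.5, 0.5], [0.5, 0.5]])

entropy_confident = mean_prediction_entropy(confident)
entropy_uncertain = mean_prediction_entropy(uncertain)
```

A model whose target predictions look like `confident` would be preferred over one that produces `uncertain`, without ever looking at target labels.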
Installation
The library is not yet available on PyPI. You can install it from the source code.
```bash
pip install git+https://github.com/scikit-adaptation/skada
```
Short examples
We provide here a few examples to illustrate the use of the library. For more details, please refer to this example, the quick start guide and the gallery.
First, the DA data in the SKADA API is stored in the following format:
```python
X, y, sample_domain
```
Where `X` is the input data, `y` is the target labels and `sample_domain` is the domain labels (positive for source and negative for target domains). We provide below an example of how to fit a DA estimator:
```python
from skada import CORAL

da = CORAL()
da.fit(X, y, sample_domain=sample_domain)  # sample_domain passed by name

ypred = da.predict(Xt)  # predict on test data
```
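The `X, y, sample_domain` triplet described above can be assembled by hand from separate source and target arrays. The following NumPy sketch uses made-up toy data; the specific domain labels (`1` and `-2`) and the use of `-1` to mark unknown target labels are assumptions for illustration, so check the library documentation for the exact conventions it expects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 100 labeled source samples, 80 unlabeled target samples.
X_source = rng.normal(loc=0.0, size=(100, 2))
y_source = rng.integers(0, 2, size=100)
X_target = rng.normal(loc=1.0, size=(80, 2))

X = np.concatenate([X_source, X_target])
# Assumption: -1 marks target labels as unknown during training.
y = np.concatenate([y_source, np.full(80, -1)])
# Positive domain label for source samples, negative for target samples.
sample_domain = np.concatenate([np.full(100, 1), np.full(80, -2)])

# The sign of sample_domain is enough to recover each domain.
source_mask = sample_domain > 0
Xt = X[~source_mask]
```

Keeping everything in three aligned arrays is what lets the same data flow through scikit-learn pipelines and cross-validation utilities.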
One can also use Adapter classes to create a full pipeline with DA:
```python
from skada import CORALAdapter, make_da_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

pipe = make_da_pipeline(StandardScaler(), CORALAdapter(), LogisticRegression())
pipe.fit(X, y, sample_domain=sample_domain)  # sample_domain passed by name
```
Please note that for Adapter classes that implement sample reweighting, the subsequent classifier/regressor must request `sample_weight` as input. This is done with the `set_fit_request` method. For instance, with `LogisticRegression`, you would use `LogisticRegression().set_fit_request(sample_weight=True)`:
```python
from sklearn.linear_model import LogisticRegression
from skada import GaussianReweightAdapter, make_da_pipeline

pipe = make_da_pipeline(
    GaussianReweightAdapter(),
    # The final estimator must explicitly request the computed weights.
    LogisticRegression().set_fit_request(sample_weight=True),
)
```
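To show what kind of weights such an Adapter hands to the downstream estimator, here is a minimal NumPy sketch of Gaussian reweighting in the spirit of [1]: one Gaussian is fitted to each domain and every source sample is weighted by the density ratio between target and source. The helpers `gaussian_log_density` and `gaussian_importance_weights` and the `reg` parameter are assumptions made for this example, not SKADA's implementation.

```python
import numpy as np


def gaussian_log_density(X, mean, cov):
    """Log-density of a multivariate Gaussian evaluated at the rows of X."""
    n_features = X.shape[1]
    diff = X - mean
    solved = np.linalg.solve(cov, diff.T).T
    mahalanobis = np.sum(diff * solved, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (mahalanobis + logdet + n_features * np.log(2.0 * np.pi))


def gaussian_importance_weights(X_source, X_target, reg=1e-6):
    """Density-ratio weights p_target(x) / p_source(x) for each source sample."""
    eye = reg * np.eye(X_source.shape[1])
    mean_s, cov_s = X_source.mean(axis=0), np.cov(X_source, rowvar=False) + eye
    mean_t, cov_t = X_target.mean(axis=0), np.cov(X_target, rowvar=False) + eye
    log_ratio = (
        gaussian_log_density(X_source, mean_t, cov_t)
        - gaussian_log_density(X_source, mean_s, cov_s)
    )
    return np.exp(log_ratio)


# Hypothetical toy data: the target domain is shifted by one unit per feature.
rng = np.random.default_rng(0)
X_source = rng.normal(loc=0.0, size=(300, 2))
X_target = rng.normal(loc=1.0, size=(300, 2))
weights = gaussian_importance_weights(X_source, X_target)
```

Source samples that look like target samples receive larger weights, which is why the final estimator has to accept `sample_weight` during `fit`.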
Finally, SKADA can be used for cross-validation score estimation and hyperparameter selection:
```python
from sklearn.model_selection import cross_val_score, GridSearchCV
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

from skada import CORALAdapter, make_da_pipeline
from skada.model_selection import SourceTargetShuffleSplit
from skada.metrics import PredictionEntropyScorer

# make pipeline
pipe = make_da_pipeline(StandardScaler(), CORALAdapter(), LogisticRegression())

# split and score
cv = SourceTargetShuffleSplit()
scorer = PredictionEntropyScorer()

# cross val score
scores = cross_val_score(pipe, X, y, params={'sample_domain': sample_domain}, cv=cv, scoring=scorer)

# grid search
param_grid = {'coraladapter__reg': [0.1, 0.5, 0.9]}
grid_search = GridSearchCV(estimator=pipe, param_grid=param_grid, cv=cv, scoring=scorer)

grid_search.fit(X, y, sample_domain=sample_domain)
```
Acknowledgements
This toolbox has been created and is maintained by the SKADA team that includes the following members:
- Théo Gnassounou
- Oleksii Kachaiev
- Rémi Flamary
- Antoine Collas
- Yanis Lalou
- Antoine de Mathelin
- Ruben Bueno
SKADA has benefited from the financing or manpower from the following partners:

License
The library is distributed under the 3-Clause BSD license.
References
[1] Shimodaira Hidetoshi. "Improving predictive inference under covariate shift by weighting the log-likelihood function." Journal of statistical planning and inference 90, no. 2 (2000): 227-244.
[2] Sugiyama Masashi, Taiji Suzuki, and Takafumi Kanamori. "Density-ratio matching under the Bregman divergence: a unified framework of density-ratio estimation." Annals of the Institute of Statistical Mathematics 64 (2012): 1009-1044.
[3] Sugiyama Masashi, Taiji Suzuki, Shinichi Nakajima, Hisashi Kashima, Paul Von Bünau, and Motoaki Kawanabe. "Direct importance estimation for covariate shift adaptation." Annals of the Institute of Statistical Mathematics 60 (2008): 699-746.
[4] Sugiyama Masashi, and Klaus-Robert Müller. "Input-dependent estimation of generalization error under covariate shift." (2005): 249-279.
[5] Sun Baochen, Jiashi Feng, and Kate Saenko. "Correlation alignment for unsupervised domain adaptation." Domain adaptation in computer vision applications (2017): 153-171.
[6] Courty Nicolas, Flamary Rémi, Tuia Devis, and Alain Rakotomamonjy. "Optimal transport for domain adaptation." IEEE Trans. Pattern Anal. Mach. Intell 1, no. 1-40 (2016): 2.
[7] Flamary, R., Lounici, K., & Ferrari, A. (2019). Concentration bounds for linear monge mapping estimation and optimal transport domain adaptation. arXiv preprint arXiv:1905.10155.
[8] Fernando, B., Habrard, A., Sebban, M., & Tuytelaars, T. (2013). Unsupervised visual domain adaptation using subspace alignment. In Proceedings of the IEEE international conference on computer vision (pp. 2960-2967).
[9] Pan, S. J., Tsang, I. W., Kwok, J. T., & Yang, Q. (2010). Domain adaptation via transfer component analysis. IEEE transactions on neural networks, 22(2), 199-210.
[10] Courty, N., Flamary, R., Habrard, A., & Rakotomamonjy, A. (2017). Joint distribution optimal transportation for domain adaptation. Advances in neural information processing systems, 30.
[11] Bruzzone, L., & Marconcini, M. (2009). Domain adaptation problems: A DASVM classification technique and a circular validation strategy. IEEE transactions on pattern analysis and machine intelligence, 32(5), 770-787.
[12] Sun, B., & Saenko, K. (2016). Deep coral: Correlation alignment for deep domain adaptation. In Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14 (pp. 443-450). Springer International Publishing.
[13] Damodaran, B. B., Kellenberger, B., Flamary, R., Tuia, D., & Courty, N. (2018). Deepjdot: Deep joint distribution optimal transport for unsupervised domain adaptation. In Proceedings of the European conference on computer vision (ECCV) (pp. 447-463).
[14] Long, M., Cao, Y., Wang, J., & Jordan, M. (2015, June). Learning transferable features with deep adaptation networks. In International conference on machine learning (pp. 97-105). PMLR.
[15] Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., ... & Lempitsky, V. (2016). Domain-adversarial training of neural networks. Journal of machine learning research, 17(59), 1-35.
[16] Long, M., Cao, Z., Wang, J., & Jordan, M. I. (2018). Conditional adversarial domain adaptation. Advances in neural information processing systems, 31.
[17] Sugiyama, M., Krauledat, M., & Müller, K. R. (2007). Covariate shift adaptation by importance weighted cross validation. Journal of Machine Learning Research, 8(5).
[18] Morerio, P., Cavazza, J., & Murino, V. (2017). Minimal-entropy correlation alignment for unsupervised deep domain adaptation. arXiv preprint arXiv:1711.10288.
[19] Saito, K., Kim, D., Teterwak, P., Sclaroff, S., Darrell, T., & Saenko, K. (2021). Tune it the right way: Unsupervised validation of domain adaptation via soft neighborhood density. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 9184-9193).
[20] You, K., Wang, X., Long, M., & Jordan, M. (2019, May). Towards accurate model selection in deep unsupervised domain adaptation. In International Conference on Machine Learning (pp. 7124-7133). PMLR.
[21] Zhang, K., Schölkopf, B., Muandet, K., Wang, Z. (2013). Domain Adaptation under Target and Conditional Shift. In International Conference on Machine Learning (pp. 819-827). PMLR.
[22] Loog, M. (2012). Nearest neighbor-based importance weighting. In 2012 IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6. IEEE (https://arxiv.org/pdf/2102.02291.pdf)
[23] Bruzzone, L., & Marconcini, M. Domain Adaptation Problems: A DASVM Classification Technique and a Circular Validation Strategy. (https://rslab.disi.unitn.it/papers/R82-PAMI.pdf)
[24] Loog, M. (2012). Nearest neighbor-based importance weighting. In 2012 IEEE International Workshop on Machine Learning for Signal Processing, pages 1–6. IEEE (https://arxiv.org/pdf/2102.02291.pdf)
[25] J. Huang, A. Gretton, K. Borgwardt, B. Schölkopf and A. J. Smola. Correcting sample selection bias by unlabeled data. In NIPS, 2007. (https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=07117994f0971b2fc2df95adb373c31c3d313442)
[26] Long, M., Wang, J., Ding, G., Sun, J., and Yu, P. (2014). Transfer joint matching for unsupervised domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1410–1417
[27] S. Si, D. Tao and B. Geng (2010). Bregman Divergence-Based Regularization for Transfer Subspace Learning. In IEEE Transactions on Knowledge and Data Engineering.
[28] Solomon, J., Rustamov, R., Guibas, L., & Butscher, A. (2014, January). Wasserstein propagation for semi-supervised learning. In International Conference on Machine Learning (pp. 306-314). PMLR.
[29] Montesuma, Eduardo Fernandes, and Fred Maurice Ngole Mboula. "Wasserstein barycenter for multi-source domain adaptation." In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp. 16785-16793. 2021.
[30] Gnassounou, Theo, Rémi Flamary, and Alexandre Gramfort. "Convolution Monge Mapping Normalization for learning on sleep data." Advances in Neural Information Processing Systems 36 (2024).
[31] Redko, Ievgen, Nicolas Courty, Rémi Flamary, and Devis Tuia. "Optimal transport for multi-source domain adaptation under target shift." In The 22nd International Conference on artificial intelligence and statistics, pp. 849-858. PMLR, 2019.
[32] Hu, D., Liang, J., Liew, J. H., Xue, C., Bai, S., & Wang, X. (2023). Mixed Samples as Probes for Unsupervised Model Selection in Domain Adaptation. Advances in Neural Information Processing Systems 36 (2024).
[33] Kang, G., Jiang, L., Yang, Y., & Hauptmann, A. G. (2019). Contrastive Adaptation Network for Unsupervised Domain Adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4893-4902).
[34] Jin, Ying, Wang, Ximei, Long, Mingsheng, Wang, Jianmin. Minimum Class Confusion for Versatile Domain Adaptation. ECCV, 2020.
[35] Zhang, Y., Liu, T., Long, M., & Jordan, M. I. (2019). Bridging Theory and Algorithm for Domain Adaptation. In Proceedings of the 36th International Conference on Machine Learning, (pp. 7404-7413).
[36] Xiao, Zhiqing, Wang, Haobo, Jin, Ying, Feng, Lei, Chen, Gang, Huang, Fei, Zhao, Junbo. SPA: A Graph Spectral Alignment Perspective for Domain Adaptation. In NeurIPS, 2023.
[37] Xie, Renchunzi, Odonnat, Ambroise, Feofanov, Vasilii, Deng, Weijian, Zhang, Jianfeng and An, Bo. MaNo: Exploiting Matrix Norm for Unsupervised Accuracy Estimation Under Distribution Shifts. In NeurIPS, 2024.
Owner
- Name: scikit-adaptation
- Login: scikit-adaptation
- Kind: organization
- Repositories: 1
- Profile: https://github.com/scikit-adaptation
Citation (CITATION.cff)
# This CITATION.cff file was generated with cffinit.
# Visit https://bit.ly/cffinit to generate yours today!
cff-version: 1.2.0
title: 'SKADA : Scikit Adaptation'
message: >-
If you use this software, please cite it using the
metadata from this file.
type: software
authors:
- given-names: Théo
family-names: Gnassounou
affiliation: University Paris-Saclay, Inria, CEA
- given-names: Oleksii
family-names: Kachaiev
- given-names: Rémi
family-names: Flamary
affiliation: 'École Polytechnique, IP Paris'
- given-names: Antoine
family-names: Collas
affiliation: 'University Paris-Saclay, Inria, CEA'
- given-names: Yanis
family-names: Lalou
affiliation: 'École Polytechnique, IP Paris'
- given-names: Antoine
name-particle: de
family-names: Mathelin
affiliation: 'ENS Paris Saclay'
- given-names: Alexandre
family-names: Gramfort
affiliation: 'University Paris-Saclay, Inria, CEA'
- given-names: Ruben
family-names: Bueno
affiliation: 'École Polytechnique, IP Paris'
- given-names: Florent
family-names: Michel
affiliation: 'University Paris-Saclay, Inria, CEA'
- given-names: Apolline
family-names: Mellot
affiliation: 'University Paris-Saclay, Inria, CEA'
- given-names: Virginie
family-names: 'Loison'
affiliation: 'University Paris-Saclay, Inria, CEA'
- given-names: Ambroise
family-names: Odonnat
affiliation: Huawei Noah’s Ark Lab
- given-names: Thomas
family-names: Moreau
affiliation: 'University Paris-Saclay, Inria, CEA'
repository-code: 'https://github.com/scikit-adaptation/skada/'
url: 'https://scikit-adaptation.github.io/'
abstract: >-
Python Domain Adaptation toolbox compatible with
scikit-learn and PyTorch
keywords:
- machine learning
- scikit-learn
- pytorch
- domain adaptation
license: BSD-3-Clause
version: 0.3.0
date-released: 2024-07-04
GitHub Events
Total
- Create event: 7
- Release event: 1
- Issues event: 21
- Watch event: 53
- Delete event: 16
- Issue comment event: 78
- Push event: 62
- Pull request review comment event: 148
- Pull request event: 94
- Pull request review event: 103
- Fork event: 13
Last Year
- Create event: 7
- Release event: 1
- Issues event: 21
- Watch event: 53
- Delete event: 16
- Issue comment event: 78
- Push event: 62
- Pull request review comment event: 148
- Pull request event: 94
- Pull request review event: 103
- Fork event: 13
Committers
Last synced: 9 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Theo Gnassounou | t****u@d****r | 141 |
| tgnassou | t****u@g****m | 115 |
| Oleksii Kachaiev | k****v@g****m | 103 |
| Yanis Lalou | 5****u | 51 |
| Rémi Flamary | r****y@g****m | 31 |
| Theo Gnassounou | t****u@d****r | 22 |
| Theo Gnassounou | t****u@d****r | 21 |
| Antoine Collas | c****t@a****r | 19 |
| antoinedemathelin | 4****n | 8 |
| Ruben Bueno | 6****n | 6 |
| Ambroise Odonnat | a****e@g****m | 4 |
| Florent-Michel | 6****l | 3 |
| Apolline Mellot | 8****t | 3 |
| Alexandre Gramfort | a****t@f****m | 1 |
| Max Barns | 1****e | 1 |
| Thomas Moreau | t****0@g****m | 1 |
| Virginie Loison | 4****n | 1 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 99
- Total pull requests: 310
- Average time to close issues: 2 months
- Average time to close pull requests: 9 days
- Total issue authors: 13
- Total pull request authors: 26
- Average comments per issue: 0.69
- Average comments per pull request: 1.27
- Merged pull requests: 222
- Bot issues: 0
- Bot pull requests: 2
Past Year
- Issues: 28
- Pull requests: 122
- Average time to close issues: about 2 months
- Average time to close pull requests: 5 days
- Issue authors: 10
- Pull request authors: 18
- Average comments per issue: 0.14
- Average comments per pull request: 0.69
- Merged pull requests: 80
- Bot issues: 0
- Bot pull requests: 2
Top Authors
Issue Authors
- kachayev (40)
- tgnassou (21)
- YanisLalou (11)
- antoinedemathelin (8)
- rflamary (8)
- antoinecollas (4)
- eddardd (1)
- ambroiseodt (1)
- BuenoRuben (1)
- calvinmccarter (1)
- flefebv (1)
- mbarneche (1)
- B-Analytics (1)
Pull Request Authors
- YanisLalou (82)
- kachayev (46)
- tgnassou (38)
- rflamary (33)
- antoinecollas (28)
- BuenoRuben (15)
- antoinedemathelin (12)
- mbarneche (10)
- lionelkusch (10)
- ambroiseodt (9)
- apmellot (4)
- arthurdrk (3)
- Florent-Michel (3)
- vloison (2)
- calvinmccarter (2)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 61 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 3
- Total maintainers: 3
pypi.org: skada
A Python package for domain adaptation compatible with scikit-learn and Pytorch.
- Documentation: https://skada.readthedocs.io/
- License: BSD 3-Clause License Copyright (c) 2023 The SKADA developers. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. * Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
- Latest release: 0.4.0 (published about 1 year ago)
Dependencies
- actions/cache v2 composite
- actions/checkout v2 composite
- actions/setup-python v4 composite
- numpydoc *
- sphinx_gallery *
- sphinx_rtd_theme *
- matplotlib *
- pandas *
- scikit-learn >=1.4.dev0
- scipy *
- larsoner/circleci-artifacts-redirector-action master composite
- POT >= 0.9.3
- numpy >= 1.24
- scikit-learn >= 1.4.0
- scipy >= 1.10
- skorch *
- torch *
- torchvision *