https://github.com/gsganden/model_inspector

A uniform interface to a curated set of methods for inspecting machine learning models

Science Score: 13.0%

This score indicates how likely this project is to be science-related based on various indicators:

- ○ CITATION.cff file
- ✓ codemeta.json file (found codemeta.json file)
- ○ .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (7.4%) to scientific vocabulary

Keywords

data-science machine-learning scikit-learn visualization

Last synced: 5 months ago

Repository

Basic Info

Host: GitHub
Owner: gsganden
License: apache-2.0
Language: Jupyter Notebook
Default Branch: main
Homepage: https://gsganden.github.io/model_inspector/
Size: 190 MB

Statistics

Stars: 4
Watchers: 2
Forks: 0
Open Issues: 18
Releases: 0

Topics

data-science machine-learning scikit-learn visualization

Created over 5 years ago · Last pushed over 2 years ago

Metadata Files

Readme Contributing License

Model Inspector

model_inspector aims to help you train better scikit-learn-compatible models by providing insights into their behavior.

Use

To use model_inspector, you create an Inspector object from a scikit-learn model, a feature DataFrame X, and a target Series y. Typically you will want to create it on held-out data, as shown below.

```python
import sklearn.datasets
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

from model_inspector import get_inspector
```

```python
X, y = sklearn.datasets.load_diabetes(return_X_y=True, as_frame=True)
```

```python
X
```

	age	sex	bmi	bp	s1	s2	s3	s4	s5	s6
0	0.038076	0.050680	0.061696	0.021872	-0.044223	-0.034821	-0.043401	-0.002592	0.019907	-0.017646
1	-0.001882	-0.044642	-0.051474	-0.026328	-0.008449	-0.019163	0.074412	-0.039493	-0.068332	-0.092204
2	0.085299	0.050680	0.044451	-0.005670	-0.045599	-0.034194	-0.032356	-0.002592	0.002861	-0.025930
3	-0.089063	-0.044642	-0.011595	-0.036656	0.012191	0.024991	-0.036038	0.034309	0.022688	-0.009362
4	0.005383	-0.044642	-0.036385	0.021872	0.003935	0.015596	0.008142	-0.002592	-0.031988	-0.046641
...	...	...	...	...	...	...	...	...	...	...
437	0.041708	0.050680	0.019662	0.059744	-0.005697	-0.002566	-0.028674	-0.002592	0.031193	0.007207
438	-0.005515	0.050680	-0.015906	-0.067642	0.049341	0.079165	-0.028674	0.034309	-0.018114	0.044485
439	0.041708	0.050680	-0.015906	0.017293	-0.037344	-0.013840	-0.024993	-0.011080	-0.046883	0.015491
440	-0.045472	-0.044642	0.039062	0.001215	0.016318	0.015283	-0.028674	0.026560	0.044529	-0.025930
441	-0.045472	-0.044642	-0.073030	-0.081413	0.083740	0.027809	0.173816	-0.039493	-0.004222	0.003064

442 rows × 10 columns

```python
y
```

0      151.0
1       75.0
2      141.0
3      206.0
4      135.0
       ...  
437    178.0
438    104.0
439    132.0
440    220.0
441     57.0
Name: target, Length: 442, dtype: float64

```python
X_train, X_test, y_train, y_test = train_test_split(X, y)
```

```python
rfr = RandomForestRegressor().fit(X_train, y_train)
```

```python
rfr.score(X_test, y_test)
```

0.4145806969881506

```python
inspector = get_inspector(rfr, X_test, y_test)
```
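The reason for passing held-out data (`X_test`, `y_test`) rather than training data can be illustrated with a deliberately extreme case. The sketch below uses only the standard library and a toy one-nearest-neighbor regressor that memorizes its training set. The class `OneNearestNeighbor`, the helper `r2_score`, and the synthetic data are hypothetical illustrations and are not part of model_inspector or scikit-learn.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


class OneNearestNeighbor:
    """Toy regressor that memorizes its training data (hypothetical)."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        # Return the target of the closest training point for each query.
        return [
            self.y_[min(range(len(self.X_)), key=lambda i: abs(self.X_[i] - x))]
            for x in X
        ]


rng = random.Random(0)
X = [rng.uniform(-3, 3) for _ in range(200)]
y = [x + rng.gauss(0, 1) for x in X]  # linear signal plus noise

X_train, y_train = X[:150], y[:150]
X_test, y_test = X[150:], y[150:]

model = OneNearestNeighbor().fit(X_train, y_train)
train_score = r2_score(y_train, model.predict(X_train))
test_score = r2_score(y_test, model.predict(X_test))
print(train_score, test_score)
```

On the training data the memorizing model looks perfect, so any inspection done there would say little about how it behaves on new data. The held-out score is the more honest picture.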

You can then use various methods of inspector to learn about how your model behaves on that data.

The methods available for a given inspector depend on the types of its estimator and its target y. The methods attribute lists them:

```python
inspector.methods
```

['plot_feature_clusters',
 'plot_partial_dependence',
 'permutation_importance',
 'plot_permutation_importance',
 'plot_pred_vs_act',
 'plot_residuals',
 'show_correlation']

```python
ax = inspector.plot_feature_clusters()
```

```python
most_important_features = inspector.permutation_importance().index[:2]
axes = inspector.plot_partial_dependence(
    features=[*most_important_features, most_important_features]
)
axes[0, 0].get_figure().set_size_inches(12, 3)
```

```python
inspector.permutation_importance()
```

bmi    0.241886
s5     0.153085
sex    0.003250
s3     0.000734
bp     0.000461
s4    -0.002687
s2    -0.004366
s1    -0.008953
s6    -0.018925
age   -0.022768
dtype: float64
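For readers unfamiliar with the quantity reported above, permutation importance is generally defined as the drop in a model's score when one feature's values are shuffled. The following standard-library sketch illustrates that general idea on toy data. It is not model_inspector's implementation, and the names `toy_permutation_importance`, `r2_score`, and `predict` are hypothetical.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def toy_permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in R^2 when each column of X is shuffled in turn.

    `predict` maps a list of rows to a list of predictions.
    """
    rng = random.Random(seed)
    baseline = r2_score(y, predict(X))
    importances = []
    for col in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            drops.append(baseline - r2_score(y, predict(X_perm)))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: the target depends strongly on feature 0 and not at all on feature 1.
rng = random.Random(42)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [3 * row[0] + rng.gauss(0, 0.1) for row in X]


def predict(rows):
    """Stand-in for a fitted model that uses only feature 0."""
    return [3 * row[0] for row in rows]


importances = toy_permutation_importance(predict, X, y)
print(importances)
```

Shuffling the feature the model relies on produces a large drop, while shuffling the ignored feature produces none. Values near zero or slightly negative, like several in the output above, generally indicate that shuffling the feature did not hurt the score on this data.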

```python
ax = inspector.plot_permutation_importance()
```

```python
ax = inspector.plot_pred_vs_act()
```

```python
axes = inspector.plot_residuals()
```

```python
inspector.show_correlation()
```

	age	sex	bmi	bp	s1	s2	s3	s4	s5	s6	target
age	1.00	0.22	0.18	0.19	0.23	0.18	-0.04	0.19	0.28	0.32	0.13
sex	0.22	1.00	0.29	0.31	-0.05	0.08	-0.41	0.30	0.13	0.27	0.27
bmi	0.18	0.29	1.00	0.55	0.16	0.18	-0.43	0.45	0.43	0.49	0.66
bp	0.19	0.31	0.55	1.00	0.09	0.04	-0.20	0.19	0.36	0.44	0.51
s1	0.23	-0.05	0.16	0.09	1.00	0.88	0.07	0.57	0.50	0.26	0.09
s2	0.18	0.08	0.18	0.04	0.88	1.00	-0.16	0.66	0.23	0.18	0.09
s3	-0.04	-0.41	-0.43	-0.20	0.07	-0.16	1.00	-0.72	-0.37	-0.30	-0.46
s4	0.19	0.30	0.45	0.19	0.57	0.66	-0.72	1.00	0.60	0.41	0.41
s5	0.28	0.13	0.43	0.36	0.50	0.23	-0.37	0.60	1.00	0.52	0.46
s6	0.32	0.27	0.49	0.44	0.26	0.18	-0.30	0.41	0.52	1.00	0.35
target	0.13	0.27	0.66	0.51	0.09	0.09	-0.46	0.41	0.46	0.35	1.00

Scope

model_inspector makes some attempt to support estimators from popular libraries other than scikit-learn that mimic the scikit-learn interface. The following estimators are specifically supported:

From catboost:
- CatBoostClassifier
- CatBoostRegressor
From lightgbm:
- LGBMClassifier
- LGBMRegressor
From xgboost:
- XGBClassifier
- XGBRegressor

Install

pip install model_inspector

Alternatives

Yellowbrick

Yellowbrick is similar to Model Inspector in that it provides tools for visualizing the behavior of scikit-learn models.

The two libraries have different designs. Yellowbrick uses Visualizer objects, each class of which corresponds to a single type of visualization. The Visualizer interface is similar to the scikit-learn transformer and estimator interfaces. In contrast, model_inspector uses Inspector objects that bundle together a scikit-learn model, an X feature DataFrame, and a y target Series. The Inspector object does the work of identifying appropriate visualization types for the specific model and dataset in question and exposing corresponding methods, making it easy to visualize a given model for a given dataset in a variety of ways.
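The Inspector design described above can be sketched in a few lines of plain Python. This is a hypothetical illustration of the pattern (bundle a model with its data, then expose only the methods that suit the target type), not model_inspector's actual code. Every name here (`ToyInspector`, `get_toy_inspector`, `ConstantModel`, and so on) is made up for the example, and the rule used to tell regression from classification is deliberately crude.

```python
from collections import Counter


class ToyInspector:
    """Bundles a model with the data it should be inspected on."""

    def __init__(self, model, X, y):
        self.model, self.X, self.y = model, X, y

    @property
    def methods(self):
        """Names of the public inspection methods defined on this class."""
        cls = type(self)
        return [
            name
            for name in dir(cls)
            if not name.startswith("_") and callable(getattr(cls, name))
        ]


class ToyRegressionInspector(ToyInspector):
    def residuals(self):
        preds = self.model.predict(self.X)
        return [actual - pred for actual, pred in zip(self.y, preds)]


class ToyClassificationInspector(ToyInspector):
    def confusion_counts(self):
        preds = self.model.predict(self.X)
        return Counter(zip(self.y, preds))


def get_toy_inspector(model, X, y):
    """Pick an inspector class from the target type (a crude, illustrative rule)."""
    is_regression = all(isinstance(value, float) for value in y)
    cls = ToyRegressionInspector if is_regression else ToyClassificationInspector
    return cls(model, X, y)


class ConstantModel:
    """Stand-in for a fitted estimator that always predicts the same value."""

    def __init__(self, value):
        self.value = value

    def predict(self, X):
        return [self.value for _ in X]


reg = get_toy_inspector(ConstantModel(2.0), [[0], [1], [2]], [1.0, 2.0, 3.0])
clf = get_toy_inspector(ConstantModel("a"), [[0], [1], [2]], ["a", "b", "a"])
print(reg.methods)  # ['residuals']
print(clf.methods)  # ['confusion_counts']
```

The caller writes the same factory call in both cases and then asks the returned object what it can do, which is the convenience the Inspector design aims for.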

Another fundamental difference is that Yellowbrick is framed as a machine learning visualization library, while Model Inspector treats visualization as just one approach to inspecting the behavior of machine learning models.

SHAP

SHAP is another library that provides a set of tools for understanding the behavior of machine learning models. It has a somewhat similar design to Model Inspector in that it uses Explainer objects to provide access to methods that are appropriate for a given model. It has broader scope than Model Inspector in that it supports models from frameworks such as PyTorch and TensorFlow. It has narrower scope in that it only implements methods based on Shapley values.

Acknowledgments

Many aspects of this library were inspired by FastAI courses, including bundling together a model with data in a class and providing certain specific visualization methods such as feature importance bar plots, feature cluster dendrograms, tree diagrams, waterfall plots, and partial dependence plots. This library's primary contribution is to make all of these methods available in a single convenient interface.

Owner

Name: Greg Gandenberger
Login: gsganden
Kind: user
Location: Chicago, IL
Company: @cruise-automation

Website: http://www.gandenberger.org/
Repositories: 46
Profile: https://github.com/gsganden

Data scientist modeling autonomous vehicle safety

GitHub Events

Total

Issues event: 1

Last Year

Issues event: 1

Committers

Last synced: over 1 year ago

All Time

Total Commits: 330
Total Committers: 4
Avg Commits per committer: 82.5
Development Distribution Score (DDS): 0.267

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
gsganden	g**n@g**m	242
greg.gandenberger	g**r@g**m	65
gsganden	g**g@g**g	22
Greg Gandenberger	g**r@s**m	1

Committer Domains (Top 20 + Academic)

shoprunner.com: 1, gandenberger.org: 1, getcruise.com: 1

Issues and Pull Requests

Last synced: 8 months ago

All Time

Total issues: 30
Total pull requests: 25
Average time to close issues: 8 months
Average time to close pull requests: 2 months
Total issue authors: 1
Total pull request authors: 2
Average comments per issue: 0.5
Average comments per pull request: 0.88
Merged pull requests: 12
Bot issues: 0
Bot pull requests: 12

Past Year

Issues: 1
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 0.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Top Authors

Issue Authors

gsganden (30)

Pull Request Authors

dependabot[bot] (12)
gsganden (12)

Top Labels

Issue Labels

enhancement (3), bug (1), good first issue (1)

Pull Request Labels

dependencies (12)

Packages

Total packages: 1
Total downloads:
- pypi 124 last-month

Total dependent packages: 0
Total dependent repositories: 1
Total versions: 99
Total maintainers: 1

pypi.org: model-inspector

Inspect machine learning models

Homepage: https://github.com/gsganden/model_inspector/
Documentation: https://model-inspector.readthedocs.io/
License: Apache Software License 2.0
Latest release: 0.27.4
published almost 3 years ago

Versions: 99
Dependent Packages: 0
Dependent Repositories: 1
Downloads: 124 Last month

Rankings

Dependent packages count: 10.1%

Downloads: 20.4%

Average: 21.0%

Dependent repos count: 21.6%

Stargazers count: 23.1%

Forks count: 29.8%

Maintainers (1)

gsganden

Last synced: 6 months ago

Dependencies

.github/workflows/deploy.yaml actions

fastai/workflows/quarto-ghp master composite

.github/workflows/test.yaml actions

fastai/workflows/nbdev-ci master composite

docker-compose.yml docker

pyproject.toml pypi

setup.py pypi
