msapy

Hopefully, a compact and general-purpose Python package for Multiperturbation Shapley value Analysis (MSA).

https://github.com/kuffmode/msa

Science Score: 49.0%

This score indicates how likely this project is to be science-related based on various indicators:

○
CITATION.cff file
✓
codemeta.json file
Found codemeta.json file
✓
.zenodo.json file
Found .zenodo.json file
✓
DOI references
Found 2 DOI reference(s) in README
✓
Academic publication links
Links to: sciencedirect.com, plos.org
○
Committers with academic emails
○
Institutional organization owner
○
JOSS paper metadata
○
Scientific vocabulary similarity
Low similarity (11.3%) to scientific vocabulary

Keywords

artificial-neural-networks brainmapping causality gametheory python shapley-value

Last synced: 6 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: kuffmode
License: mit
Language: Python
Default Branch: main
Homepage: https://kuffmode.github.io/msa/
Size: 94 MB

Statistics

Stars: 18
Watchers: 2
Forks: 1
Open Issues: 2
Releases: 8

Topics

artificial-neural-networks brainmapping causality gametheory python shapley-value

Created over 4 years ago · Last pushed 7 months ago

Metadata Files

Readme License Codemeta

TLDR: A game-theoretical approach for calculating the contribution of each element of a system (here, network models of the brain) to a system-wide description of that system. The classic neuroscience example: how much is each brain region causally relevant to an arbitrary cognitive function?

Motivation & such:

MSA was developed by Keinan and colleagues back in 2004. Their motivation was to obtain a causal picture of a system by lesioning its elements. The lesion method itself is not new; if not the first, it was among the earliest methods used by neuroscientists to understand the brain. The reasoning is quite simple: study broken systems to see what is missing from both the brain and the behavior (or cognition), and assume that the missing region was causally necessary for the emergence of that cognitive/behavioral state. What MSA does is treat this necessity as contribution. If a brain region is indeed the seed of a cognitive function (whatever this means), then its contribution should be very high while other regions will have near-zero contributions.

With this in mind, we can see the whole scenario as a cooperative game in which a coalition of players work together and obtain some divisible outcome. The question is then the same: how to divide the outcome among the players in a "fair" way, such that the most "important" player gets the biggest chunk. The Shapley value is that chunk! It is the result of a mathematically rigorous and axiomatic procedure that derives who should get how much from all possible coalitions and all orderings in which players can enter the game. Translated to neuroscience, it derives a ranking of contributions from a dataset of all possible combinations of lesions. This means 2^N lesions (assuming lesions are binary, either perturbed or not), where N is the number of brain regions.
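To make the "all orderings" idea concrete, here is a minimal sketch in plain Python (it does not use msapy). The three-region game and its payoffs are made up for illustration: region A alone produces most of the outcome, while B and C only add something when both are intact.

```python
from itertools import permutations
from math import factorial


def exact_shapley(players, value):
    """Average each player's marginal contribution over every ordering."""
    totals = {p: 0.0 for p in players}
    for order in permutations(players):
        coalition = frozenset()
        for p in order:
            grown = coalition | {p}
            # Marginal contribution of p when joining the current coalition.
            totals[p] += value(grown) - value(coalition)
            coalition = grown
    n_orders = factorial(len(players))
    return {p: total / n_orders for p, total in totals.items()}


def toy_value(coalition):
    """Hypothetical game: A is worth 10; B and C add 2 only together."""
    score = 10.0 if "A" in coalition else 0.0
    if {"B", "C"} <= coalition:
        score += 2.0
    return score


shapley = exact_shapley(["A", "B", "C"], toy_value)
print(shapley)
```

In this toy game A receives 10 and B and C receive 1 each, and the three values add up to the outcome of the intact system (12). Note that the loop visits every ordering of the players, which is only practical for a handful of them.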

As you probably noticed, this is not feasible to calculate. For example, it means a total of 4,503,599,627,370,496 lesion combinations, assuming the brain is organized as Brodmann said, i.e., with 52 regions. So we estimate! For a more detailed description, see:

Keinan, Alon, Claus C. Hilgetag, Isaac Meilijson, and Eytan Ruppin. 2004. “Causal Localization of Neural Function: The Shapley Value Method.” Neurocomputing 58-60 (June): 215–22.

Keinan, Alon, Ben Sandbank, Claus C. Hilgetag, Isaac Meilijson, and Eytan Ruppin. 2006. “Axiomatic Scalable Neurocontroller Analysis via the Shapley Value.” Artificial Life 12 (3): 333–52.

And our own recent work: Fakhar K, Hilgetag CC. Systematic perturbation of an artificial neural network: A step towards quantifying causal contributions in the brain. PLoS Comput Biol. 2022;18: e1010250. doi:10.1371/journal.pcbi.1010250
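To get a feel for how quickly the 2^N lesion count mentioned above grows, here is a small sketch. The rate of one million game evaluations per second is an arbitrary assumption, chosen only to turn the count into a time scale.

```python
def n_lesion_combinations(n_regions):
    """Number of binary lesion combinations for n_regions elements."""
    return 2 ** n_regions


EVALUATIONS_PER_SECOND = 1_000_000  # assumed rate, for illustration only
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

for n in (4, 10, 20, 52):
    count = n_lesion_combinations(n)
    years = count / EVALUATIONS_PER_SECOND / SECONDS_PER_YEAR
    print(f"{n:>2} regions: {count:,} combinations, about {years:.3g} years")
```

For 52 regions the count matches the 4,503,599,627,370,496 figure quoted above, which is why MSA samples permutations instead of enumerating everything.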

Installation:

The easiest way is to `pip install msapy`. This package is compatible with Python >=3.11; other versions might not work.

How it works:

Here you can see a schematic representation of how the algorithm works (interested in the math instead? Check the papers above). Briefly, all MSA needs from you is a list of players and a game function. The players can be your nodes (for example, brain regions or indices in a connectivity matrix) or the links between them as tuples. MSA then shuffles them to produce N orderings (permutations) in which they can join the game. This can end up with repeated permutations if the set is small, but that's fine, don't worry! MSA then produces a "combination space" containing all the coalitions the players form along those orderings. Then it uses your game function to fill in the contributions of those coalitions. The last step is to perform a Shapley integration and isolate each player's contribution in a given permutation. Repeating this for all permutations produces a contribution table (Shapley table), and you get your Shapley values by averaging over permutations, so the end result is one value per element/player. To get a better grasp of how this works in code, check the minimal example in the examples folder.
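The steps above can be sketched in plain Python. This is not msapy's implementation, only an illustration of the same idea under simple assumptions; the function names and the additive toy game are hypothetical.

```python
import random


def sample_permutations(elements, n_permutations, seed=None):
    """Shuffle the players to get orderings in which they join the game."""
    rng = random.Random(seed)
    return [tuple(rng.sample(elements, len(elements)))
            for _ in range(n_permutations)]


def coalitions_of(permutation):
    """Coalitions formed as players join one by one: (), (a,), (a, b), ..."""
    return [frozenset(permutation[:k]) for k in range(len(permutation) + 1)]


def estimate_shapley(elements, game, n_permutations=1000, seed=None):
    perms = sample_permutations(elements, n_permutations, seed)
    # "Combination space": the unique coalitions, each played only once.
    combination_space = {c for perm in perms for c in coalitions_of(perm)}
    contributions = {c: game(c) for c in combination_space}
    # One row per permutation: each player's marginal contribution in it.
    table = []
    for perm in perms:
        steps = coalitions_of(perm)
        table.append({player: contributions[steps[k + 1]] - contributions[steps[k]]
                      for k, player in enumerate(perm)})
    # Shapley estimate: average each player's column over the permutations.
    return {p: sum(row[p] for row in table) / len(table) for p in elements}


def additive_game(coalition):
    """Hypothetical game: every intact player adds a fixed weight."""
    weights = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}
    return sum(weights[p] for p in coalition)


values = estimate_shapley(["A", "B", "C", "D"], additive_game,
                          n_permutations=200, seed=0)
print(values)
```

Because the toy game is additive, every marginal contribution equals the player's own weight, so the estimate recovers the weights regardless of which permutations are sampled. With interacting players the estimate would only approach the exact values as the number of permutations grows.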

[Figure: msa unbiased sampling algorithm]

How it works in Python:

I tried to make the package compact and easy to use, but there are still a few things to keep in mind. Please take a look at the examples, but just to give a flavor, let's start working with the set ABCD that we have in the picture above.

Importing will be just:

```python
from msapy import msa, utils as ut
```

Then we define some elements and generate the permutation space:

```python
nodes = ['A', 'B', 'C', 'D']
permutation_space = msa.make_permutation_space(n_permutations=1000, elements=nodes)
```

This results in a list of tuples, our permutation space, which has 1000 permutations in it. Here are the first 5:

```python
[('D', 'C', 'A', 'B'),
 ('A', 'D', 'C', 'B'),
 ('D', 'A', 'B', 'C'),
 ('D', 'B', 'C', 'A'),
 ('A', 'D', 'C', 'B')]
```

Then we use this to produce our combination space:

```python
combination_space = msa.make_combination_space(permutation_space=permutation_space)
```

And a quick look at what's inside:

```python
[frozenset({'D'}),
 frozenset(),
 frozenset({'C', 'D'}),
 frozenset({'A', 'C', 'D'}),
 frozenset({'A', 'B', 'C', 'D'}),
 frozenset({'A'}),
 frozenset({'A', 'D'}),
 frozenset({'A', 'B', 'D'}),
 frozenset({'B', 'D'}),
 frozenset({'B', 'C', 'D'}),
 frozenset({'C'}),
 frozenset({'A', 'B'}),
 frozenset({'A', 'B', 'C'}),
 frozenset({'A', 'C'}),
 frozenset({'B', 'C'}),
 frozenset({'B'})]
```

As you can see, even though the permutation space has 1000 permutations, the combination space is exhausted because the total number of possible combinations is 2⁴, or 16. Now here's the trick: we need to assign values to these combinations (coalitions) by keeping them intact while every other element is perturbed. In other words, the contribution of coalition {'B', 'C'} is isolated if we lesion {'A', 'D'} before playing the game. So what we do (and this is not in the figure above) is produce the "complement space" of the combination space:

```python
complement_space = msa.make_complement_space(combination_space=combination_space, elements=nodes)
```

That is, for each coalition in the combination space, the elements that are not in it:

```python
[('C', 'B', 'A'),
 ('C', 'D', 'B', 'A'),
 ('B', 'A'),
 ('B',),
 (),
 ('C', 'D', 'B'),
 ('C', 'B'),
 ('C',),
 ('C', 'A'),
 ('A',),
 ('D', 'B', 'A'),
 ('C', 'D'),
 ('D',),
 ('D', 'B'),
 ('D', 'A'),
 ('C', 'D', 'A')]
```

As you can see, for example, when the combination is {'D'} the corresponding complement is ('C', 'B', 'A'). Note the difference in types: the combination space is an OrderedSet of frozensets so that the Shapley value calculations are quicker, while the complement space is an OrderedSet of tuples so that handling it in your objective function is easier. Speaking of which, let's make the worst objective function, one that just produces random values regardless of what's what (see the example on ground-truth models.ipynb for a more elaborate version):

```python
import numpy as np

def rnd(complements):
    # Ignores the lesioned elements and returns a random integer from 1 to 9.
    return np.random.randint(1, 10)
```

Next, we play the games and acquire the contributions as follows:

```python
contributions, lesion_effects = msa.take_contributions(elements=nodes,
                                                       combination_space=combination_space,
                                                       complement_space=complement_space,
                                                       objective_function=rnd)
```

Both contributions and lesion_effects hold the same values, just addressed differently. For example, if the contribution of coalition {'B', 'C'} is 5 points, then you can also say that the effect of lesioning coalition {'A', 'D'} is 5 points. This by itself is not that informative, but if you know the contribution of the grand coalition (the intact system), then you can claim that the effect of lesioning {'A', 'D'} is a drop in performance from x to 5.
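For a less arbitrary picture of how a coalition, its complement, and a lesion effect relate, here is a small deterministic sketch in plain Python. The node weights and both helper functions are hypothetical and not part of msapy; the game receives the elements to lesion, in the same spirit as the complements described above.

```python
# Hypothetical weights: how much each node adds when it is intact.
NODE_WEIGHTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0}


def complement_of(coalition, elements):
    """Elements to lesion so that only `coalition` stays intact."""
    return tuple(e for e in elements if e not in coalition)


def weighted_sum_game(complements):
    """Score of the system after lesioning every element in `complements`."""
    return sum(w for node, w in NODE_WEIGHTS.items() if node not in complements)


nodes = ["A", "B", "C", "D"]
coalition = frozenset({"B", "C"})

lesioned = complement_of(coalition, nodes)   # ('A', 'D')
contribution = weighted_sum_game(lesioned)   # score with only B and C intact
intact_score = weighted_sum_game(())         # grand coalition, nothing lesioned
drop = intact_score - contribution           # performance lost by the lesion

print(lesioned, contribution, intact_score, drop)
```

Here the contribution of {'B', 'C'} and the effect of lesioning ('A', 'D') are the same number (5), and comparing it with the intact system (10) is what turns it into a statement about a drop in performance.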

Lastly, you can calculate Shapley values like:

```python
from msapy import msa

shapley_table = msa.get_shapley_table(contributions=contributions, permutation_space=permutation_space)
```

This gives you a `ShapleyTable` data structure, which is a wrapper around `pd.DataFrame`, to work with.

The Interface:

To make things easier, msa comes with an interface function:

```python
shapley_table, contributions, lesion_effects = msa.interface(
    multiprocessing_method='joblib',
    elements=nodes,
    n_permutations=1000,
    objective_function=rnd,
    n_parallel_games=-1,
    random_seed=1)
```

For this one, all you have to do is provide your elements and the objective function, and specify some parameters. For example, you can choose between two different multiprocessing toolboxes, joblib and ray, to distribute msa.take_contributions over n_parallel_games. Specifying a random_seed is encouraged for reproducibility, but the default is None.

TODO (Interested in Contributing?):

More estimation methods, for example see: amiratag/neuronshapley.
GPU and HPC compatibility
Providing built-in objective functions for common use-cases.
Improved documentation
More Tests

Cite:

```
@misc{MSA,
  author = {Kayson Fakhar and Shrey Dixit},
  title = {MSA: A compact Python package for Multiperturbation Shapley value Analysis.},
  year = {2021},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/kuffmode/msa}},
}
```

Acknowledgement:

I thank my good friend and Python mentor Fabrizio Damicelli whom I learned a lot from. Without him this package would be a disaster to look at.

Owner

Name: Kayson Fakhar
Login: kuffmode
Kind: user
Location: Hamburg
Company: University Medical Center Hamburg-Eppendorf (UKE)

Website: kaysonfakhar.com
Twitter: kaysonfakhar
Repositories: 4
Profile: https://github.com/kuffmode

A Github leech since 2017

CodeMeta (codemeta.json)

{
  "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
  "@type": "SoftwareSourceCode",
  "license": "https://spdx.org/licenses/MIT",
  "codeRepository": "https://github.com/kuffmode/msa",
  "dateCreated": "2021-10-28",
  "datePublished": "2021-11-01",
  "dateModified": "2023-11-23",
  "downloadUrl": "https://github.com/kuffmode/msa/archive/refs/tags/v1.5.tar.gz",
  "issueTracker": "https://github.com/kuffmode/msa/issues",
  "name": "MSA",
  "version": "1.5",
  "description": "A compact and general-purpose Python package for Multi-perturbation Shapley value Analysis",
  "applicationCategory": "Computational Neuroscience",
  "releaseNotes": "Performance optimized. Works with Python 3.11",
  "referencePublication": "https://doi.org/10.1371/journal.pcbi.1010250",
  "funder": {
    "@type": "Organization",
    "name": "Institute of Computational Neuroscience, University Medical Center Hamburg"
  },
  "keywords": [
    "Neuroscience",
    "Artificial Intelligence",
    "Causal Inference",
    "Game-theory",
    "Shapley value"
  ],
  "programmingLanguage": [
    "Python"
  ],
  "operatingSystem": [
    "Linux",
    "Windows",
    "MacOs"
  ],
  "softwareRequirements": [
    "Python 3.9",
    "https://github.com/kuffmode/msa/blob/main/pyproject.toml"
  ],
  "relatedLink": [
    "https://kuffmode.github.io/msa/"
  ],
  "author": [
    {
      "@type": "Person",
      "@id": "https://kaysonfakhar.com",
      "givenName": "Kayson",
      "familyName": "Fakhar",
      "email": "kayson.fakhar@gmail.com",
      "affiliation": {
        "@type": "Organization",
        "name": "Institute of Computational Neuroscience, University Medical Center Hamburg, Hamburg, Germany"
      }
    },
    {
      "@type": "Person",
      "@id": "https://github.com/ShreyDixit",
      "givenName": "Shrey",
      "familyName": "Dixit",
      "email": "shrey.akshaj@gmail.com",
      "affiliation": {
        "@type": "Organization",
        "name": "Institute of Computational Neuroscience, University Medical Center Hamburg, Hamburg, Germany"
      }
    }
  ]
}

GitHub Events

Total

Watch event: 1
Issue comment event: 1
Push event: 4
Fork event: 1

Last Year

Watch event: 1
Issue comment event: 1
Push event: 4
Fork event: 1

Committers

Last synced: almost 3 years ago

All Time

Total Commits: 98
Total Committers: 3
Avg Commits per committer: 32.667
Development Distribution Score (DDS): 0.429

Top Committers

Name	Email	Commits
Shrey Dixit	s**j@g**m	56
Kayson Fakhar	k**r@g**m	35
Shrey Dixit	s**j@g**m	7

Issues and Pull Requests

Last synced: 8 months ago

All Time

Total issues: 6
Total pull requests: 25
Average time to close issues: 6 months
Average time to close pull requests: 24 days
Total issue authors: 3
Total pull request authors: 3
Average comments per issue: 1.0
Average comments per pull request: 0.12
Merged pull requests: 22
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 2
Pull requests: 1
Average time to close issues: N/A
Average time to close pull requests: 36 minutes
Issue authors: 1
Pull request authors: 1
Average comments per issue: 0.0
Average comments per pull request: 0.0
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 0

View more stats

Top Authors

Issue Authors

kuffmode (4)
ShreyDixit (1)
zhuyingqin (1)

Pull Request Authors

ShreyDixit (24)
kuffmode (2)
luisa-hoerauf (2)

Top Labels

Issue Labels

documentation (1)

Pull Request Labels

Packages

Total packages: 1
Total downloads:
- pypi 41 last-month

Total dependent packages: 1
Total dependent repositories: 1
Total versions: 18
Total maintainers: 2

pypi.org: msapy

Multi-perturbation Shapley value Analysis (MSA)

Homepage: https://github.com/kuffmode/msa
Documentation: https://kuffmode.github.io/msa/
License: OSI Approved :: MIT License
Latest release: 1.7.3
published 7 months ago

Versions: 18
Dependent Packages: 1
Dependent Repositories: 1
Downloads: 41 Last month

Rankings

Dependent packages count: 10.1%

Downloads: 14.7%

Stargazers count: 15.6%

Average: 16.9%

Dependent repos count: 21.6%

Forks count: 22.7%

Maintainers (2)

kuffmode ShreyDixit

Last synced: 6 months ago

Dependencies

.github/workflows/makefile.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/python-app.yml actions

actions/checkout v2 composite
actions/setup-python v2 composite

.github/workflows/publish_pypi.yml actions

actions/checkout v3 composite
actions/setup-python v3 composite
pypa/gh-action-pypi-publish release/v1 composite

pyproject.toml pypi

fastprogress ^1.0.3
joblib ^1.3.2
numpy ^1.26.4
ordered-set ^4.1.0
pandas ^2.2.1
pytest ^8.1.1
python >=3.9
toml ^0.10.2
tqdm ^4.66.2
tqdm-joblib ^0.0.3
typeguard >=2.13.0,<2.14.0
