ethicml

Package for evaluating the performance of methods which aim to increase fairness, accountability and/or transparency

https://github.com/wearepal/ethicml

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
    2 of 8 committers (25.0%) from academic institutions
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (17.0%) to scientific vocabulary

Keywords

algorithmic-fairness computer-vision data-science ethical-artificial-intelligence ethical-data-science fairness fairness-ai fairness-assessment fairness-awareness-model fairness-comparison fairness-ml machine-bias machine-learning pytorch responsible-ai toolkit

Keywords from Contributors

interactive mesh interpretability profiles sequences generic projection optim embedded particles
Last synced: 6 months ago

Repository

Package for evaluating the performance of methods which aim to increase fairness, accountability and/or transparency

Basic Info
Statistics
  • Stars: 23
  • Watchers: 3
  • Forks: 3
  • Open Issues: 43
  • Releases: 47
Topics
algorithmic-fairness computer-vision data-science ethical-artificial-intelligence ethical-data-science fairness fairness-ai fairness-assessment fairness-awareness-model fairness-comparison fairness-ml machine-bias machine-learning pytorch responsible-ai toolkit
Created about 7 years ago · Last pushed 8 months ago
Metadata Files
Readme Contributing License Citation

README.md

EthicML: A featureful framework for developing fair algorithms

EthicML is a library for performing and assessing algorithmic fairness. Unlike other libraries, EthicML isn't an education tool, but rather a researcher's toolkit.

Other algorithmic fairness packages are useful, but given that we primarily do research, a lot of the work we do doesn't fit into some nice box. For example, we might want to use a 'fair' pre-processing method on the data before training a classifier on it. We may still be experimenting and only want part of the framework to execute, or we may want to do hyper-parameter optimization. Whilst other frameworks can be modified for these tasks, you end up with hacked-together approaches that don't lend themselves to being built on in the future. Because of this, we built EthicML, a fairness toolkit for researchers.

Features include:

  • Support for multiple sensitive attributes
  • Vision datasets
  • Codebase typed with mypy
  • Tested code
  • Reproducible results

Why not use XXX?

There are an increasing number of other options: IBM's fair-360, Aequitas, EthicalML/XAI, Fairness-Comparison, and others. They're all great at what they do; they're just not right for us. We will, however, be influenced by them. Where appropriate, we even subsume some of these libraries.

Installation

EthicML requires Python >= 3.8. To install EthicML, just run pip3 install ethicml

If you want to use the method by Agarwal et al., you have to explicitly install all dependencies: pip3 install 'ethicml[all]' (The quotes are needed in zsh and will also work in bash.)

Attention: In order to use all features of EthicML, PyTorch needs to be installed separately. We are not including PyTorch as a requirement of EthicML, because there are many different versions for different systems.
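
Because PyTorch is left out of the declared requirements, features that need it typically guard the import. The following is a generic sketch of that pattern, not EthicML's actual code:

```python
# Generic optional-dependency guard (illustrative only, not EthicML's code).
try:
    import torch
except ImportError:  # PyTorch was not installed separately
    torch = None


def require_torch() -> None:
    """Raise a clear error if a torch-backed feature is used without PyTorch."""
    if torch is None:
        raise RuntimeError("This feature requires PyTorch; install the build for your system first.")
```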

Documentation

The documentation can be found here: https://wearepal.ai/EthicML/

Design Principles

```mermaid
flowchart LR
    A(Datasets) -- load --> B(Data tuples);
    B --> C[evaluate_models];
    G(Algorithms) --> C;
    C --> D(Metrics);
```

Keep things simple.
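
As a rough sketch of that flow (a minimal example; the loader, model and metric names adult, LR and Accuracy are assumptions here, so consult the documentation for the exact API):

```python
import ethicml as em

# Datasets -> data tuples -> evaluate_models -> metrics, as in the flowchart.
# adult, LR and Accuracy are assumed names; check the docs for the real API.
results = em.evaluate_models(
    datasets=[em.adult()],
    inprocess_models=[em.LR()],
    metrics=[em.Accuracy()],
)
print(results)
```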

The Triplet

Given that we're considering fairness, the base of the toolbox is the triplet {x, s, y}

  • X - Features
  • S - Sensitive Label
  • Y - Class Label

Developer note: All methods must assume S and Y are multi-class.

We use a DataTuple class to contain the triplet:

```python
triplet = DataTuple(x=x, s=s, y=y)  # x, s and y are each a pandas.DataFrame
```

In addition, we have a variation: the TestTuple, which contains the pair

```python
pair = TestTuple(x=x, s=s)  # x and s are each a pandas.DataFrame
```

This is to reduce the risk of a user accidentally evaluating performance on their training set.

Using dataframes may be a little inefficient, but given the amount of splicing on conditions that we're doing, it feels worth it.
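
For illustration, a complete toy triplet can be built straight from pandas DataFrames (the column names and values here are made up):

```python
import pandas as pd
from ethicml import DataTuple, TestTuple

# Toy data: two features, one binary sensitive attribute, one class label.
x = pd.DataFrame({"age": [25, 40, 31], "hours": [40, 50, 35]})
s = pd.DataFrame({"sex": [0, 1, 0]})
y = pd.DataFrame({"income": [0, 1, 0]})

train = DataTuple(x=x, s=s, y=y)  # the full {x, s, y} triplet
test = TestTuple(x=x, s=s)        # no y, so it can't be scored against labels
```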

Separation of Methods

We purposefully keep pre, during and post algorithm methods separate. This is because they have different return types.

```python
pre_algorithm.run(train: DataTuple, test: TestTuple)   # -> Tuple[DataTuple, TestTuple]
in_algorithm.run(train: DataTuple, test: TestTuple)    # -> Prediction
post_algorithm.run(
    train_prediction: Prediction,
    train: DataTuple,
    test_prediction: Prediction,
    test: TestTuple,
)                                                      # -> Prediction
```

where Prediction holds a pandas.Series of the class label. In the case of a "soft" output, SoftPrediction extends Prediction and provides a mapping from "soft" to "hard" labels. See the documentation for more details.
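
Chaining the stages then looks roughly like this. Upsampler and LR are assumed method names used purely for illustration; any pre- and in-process algorithms with the signatures above would slot in the same way:

```python
import pandas as pd
import ethicml as em

# A toy triplet; in practice this comes from a dataset loader.
x = pd.DataFrame({"f1": [0.1, 0.9, 0.4, 0.7]})
s = pd.DataFrame({"s": [0, 1, 0, 1]})
y = pd.DataFrame({"y": [0, 1, 0, 1]})
train = em.DataTuple(x=x, s=s, y=y)
test = em.TestTuple(x=x, s=s)

# Pre-process, then train; each run() call matches the signatures above.
fair_train, fair_test = em.Upsampler().run(train, test)
preds = em.LR().run(fair_train, fair_test)  # -> Prediction holding a pandas.Series
```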

General Rules of Thumb

  • Mutable data structures are bad.
  • At the very least, functions should be typed.
  • Readability > Efficiency.
  • Warnings must be addressed.
  • Always write tests first.

Future Plans

The aim is to make EthicML operate on two levels.

  1. We want a high-level API so that a user can define a new model or metric, then get publication-ready results in just a couple of lines of code.
  2. We understand that truly ground-breaking work sometimes involves tearing up the rulebook. Therefore, we want to also expose a lower-level API so that a user can make use of as much, or little of the library as is suitable for them.

We've built everything with this philosophy in mind, but acknowledge that we still have a way to go.

Contributing

If you're interested in this research area, we'd love to have you aboard. For more details, check out CONTRIBUTING.md. Whether your skills are in coding-up papers you've read, writing tutorials, or designing a logo, please reach out.

Development

Install development dependencies with pip install -e .[dev]

To use the pre-commit hooks, run pre-commit install

Owner

  • Name: Predictive Analytics Lab
  • Login: wearepal
  • Kind: organization
  • Location: University of Sussex, Brighton, UK

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Thomas"
  given-names: "Oliver"
  orcid: "https://orcid.org/0000-0001-8162-9633"
- family-names: "Kehrenberg"
  given-names: "Thomas"
  orcid: " https://orcid.org/0000-0002-2347-7165"
- family-names: "Bartlett"
  given-names: "Myles"
- family-names: "Quadrianto"
  given-names: "Novi"
title: "EthicML: A featureful framework for developing fair algorithms"
url: "https://github.com/predictive-analytics-lab/EthicML"
type: software

GitHub Events

Total
  • Delete event: 11
  • Issue comment event: 8
  • Push event: 12
  • Pull request event: 10
  • Create event: 8
Last Year
  • Delete event: 11
  • Issue comment event: 8
  • Push event: 12
  • Pull request event: 10
  • Create event: 8

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 1,772
  • Total Committers: 8
  • Avg Commits per committer: 221.5
  • Development Distribution Score (DDS): 0.695
Past Year
  • Commits: 77
  • Committers: 2
  • Avg Commits per committer: 38.5
  • Development Distribution Score (DDS): 0.169
Top Committers
Name Email Commits
Thomas MK t****e@p****t 540
Oliver Thomas o****4@s****k 489
dependabot[bot] 4****] 393
thomkeh 7****h 320
Myles Bartlett 4****t 27
azure-pipelines[bot] a****] 1
Myles Bartlett m****5@m****k 1
Anonymous a****n@y****s 1

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 2
  • Total pull requests: 382
  • Average time to close issues: N/A
  • Average time to close pull requests: 6 days
  • Total issue authors: 2
  • Total pull request authors: 3
  • Average comments per issue: 1.0
  • Average comments per pull request: 0.12
  • Merged pull requests: 361
  • Bot issues: 0
  • Bot pull requests: 297
Past Year
  • Issues: 0
  • Pull requests: 34
  • Average time to close issues: N/A
  • Average time to close pull requests: 9 days
  • Issue authors: 0
  • Pull request authors: 2
  • Average comments per issue: 0
  • Average comments per pull request: 0.35
  • Merged pull requests: 25
  • Bot issues: 0
  • Bot pull requests: 29
Top Authors
Issue Authors
  • MylesBartlett (1)
  • tmke8 (1)
Pull Request Authors
  • dependabot[bot] (398)
  • tmke8 (91)
  • olliethomas (2)
Top Labels
Issue Labels
new-method (2)
Pull Request Labels
dependencies (398) enhancement (3) code-change (2) bug (1) breaking (1)

Packages

  • Total packages: 1
  • Total downloads:
    • pypi: 93 last month
  • Total dependent packages: 1
  • Total dependent repositories: 4
  • Total versions: 49
  • Total maintainers: 2
pypi.org: ethicml

EthicML is a library for performing and assessing algorithmic fairness. Unlike other libraries, EthicML isn't an education tool, but rather a researcher's toolkit.

  • Versions: 49
  • Dependent Packages: 1
  • Dependent Repositories: 4
  • Downloads: 93 last month
Rankings
Dependent packages count: 4.8%
Dependent repos count: 7.5%
Average: 11.0%
Stargazers count: 12.5%
Downloads: 13.6%
Forks count: 16.9%
Maintainers (2)
Last synced: 6 months ago

Dependencies

.github/workflows/continuous_integration.yml actions
  • actions/checkout v3 composite
  • actions/setup-python v4 composite
.github/workflows/docs.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
  • peaceiris/actions-gh-pages v3 composite
.github/workflows/dummy_ci.yml actions
  • actions/checkout v2 composite
  • actions/setup-python v2 composite
docs/requirements.txt pypi
  • autodocsumm *
  • furo *
  • ipython-pygments *
  • sphinx *
  • toml *
  • typing_extensions *
.github/workflows/dependabot_auto.yml actions