https://github.com/csinva/matching-with-gans

Matching in GAN latent space for better bias benchmarking and semantic image editing. 👶🏻🧒🏾👩🏼‍🦰👱🏽‍♂️👴🏾

Science Score: 33.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
○ .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
✓ Committers with academic emails: 1 of 6 committers (16.7%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (10.0%) to scientific vocabulary

Keywords

ai bias causal-inference computer-vision deep-learning disentanglement facial-recognition fairness gan ml neural-network python3 pytorch stylegan2

Keywords from Contributors

blog projection archival interpretability interactive generic sequences profiles generative-adversarial-network transformer

Last synced: 5 months ago

Repository


Basic Info

Host: GitHub
Owner: csinva
Language: Jupyter Notebook
Default Branch: master
Homepage: https://arxiv.org/abs/2103.13455
Size: 421 MB

Statistics

Stars: 20
Watchers: 4
Forks: 2
Open Issues: 2
Releases: 2


Created almost 6 years ago · Last pushed almost 3 years ago

Metadata Files

Readme

Matching in GAN-Space

Code for using GANs to aid in matching, accompanying the paper "Overcoming confounding in face datasets via GAN-based matching" (arXiv)

Quickstart demo (manipulate and interpolate your own face images!):

Projection and manipulation

This code projects images into the GAN latent space, where they can be modified along certain attributes (e.g. age, gender, hair-length) and mixed with other faces (e.g. other people, or older/younger versions of the same person). All of this is handled by the projection_manipulation/project_and_manipulate.sh script. The easiest way to get started is the Colab notebook, where you can upload your own images and have them automatically cropped, aligned, projected, manipulated, and interpolated.

Start with 2 real images (higher-res photos work better, as well as photos where the face is front-facing and not obstructed by things like hats, scarves, etc.):

Interpolating between the images:

Manipulating an image along pre-specified attributes:

You can do a lot more, such as blending many faces together or interpolating between different faces of the same person!
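
The two operations described in this section, interpolating between faces and moving a face along an attribute, both reduce to simple arithmetic on latent codes once the images have been projected. The sketch below is an illustration only and is not taken from the repository: it assumes latent codes of shape (18, 512), the function names are hypothetical, and random arrays stand in for real projected latents and for a learned attribute direction.

```python
import numpy as np


def interpolate_latents(w_a, w_b, num_steps=5):
    """Linearly interpolate between two latent codes, endpoints included."""
    alphas = np.linspace(0.0, 1.0, num_steps)
    return np.stack([(1.0 - a) * w_a + a * w_b for a in alphas])


def shift_along_direction(w, direction, strength):
    """Move a latent code a given distance along a normalized attribute direction."""
    unit = direction / np.linalg.norm(direction)
    return w + strength * unit


# Random stand-ins (assumption): real codes would come from the projection step.
rng = np.random.default_rng(0)
w_a = rng.normal(size=(18, 512))
w_b = rng.normal(size=(18, 512))
age_direction = rng.normal(size=(18, 512))  # hypothetical attribute direction

frames = interpolate_latents(w_a, w_b, num_steps=5)  # 5 codes from w_a to w_b
older = shift_along_direction(w_a, age_direction, strength=2.0)
```

Each resulting code would then be passed through the generator to render an image; that step is omitted here because it depends on the StyleGAN2 setup.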

Matching and benchmarking

The matching code here finds images that match across a certain attribute (e.g. perceived gender). This is useful for removing confounding factors when doing downstream analyses of things like gender bias in facial recognition. Similarly, we can perform matching using other methods, such as propensity scores, using the GAN latent space as covariates. Some example matches:

Note: these annotations do not necessarily reflect the gender identity of the person; rather, they refer to binarized gender as perceived by a casual observer.
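
As a rough illustration of distance-based matching, the sketch below pairs each image that has a binary attribute with its nearest image that lacks it, using a precomputed pairwise distance matrix. It is a minimal sketch under stated assumptions rather than the repository's implementation: the function name is hypothetical, matching is done with replacement, and the toy 1-D positions stand in for distances in GAN latent space.

```python
import numpy as np


def match_across_attribute(dists, attribute):
    """Map each index with attribute == 1 to its closest index with attribute == 0."""
    attribute = np.asarray(attribute).astype(bool)
    treated = np.flatnonzero(attribute)
    control = np.flatnonzero(~attribute)
    # Restrict the full matrix to (treated rows, control columns).
    sub = dists[np.ix_(treated, control)]
    nearest = control[np.argmin(sub, axis=1)]
    return dict(zip(treated.tolist(), nearest.tolist()))


# Toy data (assumption): five "images" placed on a line.
positions = np.array([0.0, 1.0, 5.0, 0.2, 4.5])
dists = np.abs(positions[:, None] - positions[None, :])
attribute = [1, 0, 0, 1, 1]

matches = match_across_attribute(dists, attribute)
print(matches)  # {0: 1, 3: 1, 4: 2}
```

Because matching is with replacement, one image can serve as the match for several others (index 1 above). Matching without replacement would need an assignment step instead of a row-wise argmin.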

After performing matching, confounding is much lower on CelebA-HQ. This is illustrated by the fact that the mean values of several key (binary) attributes become much closer after matching:
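
One way to read this balance check: for a binary attribute, compare its mean in the two groups before and after restricting to matched images. The sketch below uses made-up toy values, not CelebA-HQ numbers, and the function name and the choice of kept indices are hypothetical.

```python
import numpy as np


def mean_gap(attr_values, group):
    """Absolute difference in the mean of a binary attribute between two groups."""
    attr_values = np.asarray(attr_values, dtype=float)
    group = np.asarray(group).astype(bool)
    return abs(attr_values[group].mean() - attr_values[~group].mean())


# Toy data (assumption): 8 images, group label and one binary attribute.
group = np.array([1, 1, 1, 1, 0, 0, 0, 0])
eyeglasses = np.array([1, 1, 1, 0, 0, 0, 0, 1])

gap_before = mean_gap(eyeglasses, group)  # 0.75 vs 0.25

# Hypothetical matched subset: indices kept after matching.
keep = np.array([0, 3, 4, 7])
gap_after = mean_gap(eyeglasses[keep], group[keep])  # 0.5 vs 0.5

print(gap_before, gap_after)  # 0.5 0.0
```

A gap near zero across many attributes suggests those attributes are no longer confounded with the group label in the matched sample.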

Reproducibility

Dependencies

- Uses tensorflow-gpu 1.14.0 (the GPU dependencies are only required for the projection / manipulation code, which uses StyleGAN2).
- The required dependencies can be set up on AWS by selecting a deep learning AMI, running source activate python3, and then running pip install tensorflow-gpu==1.14.0.

Data/cached outputs for reproducing pipeline in this gdrive folder

data/celeba-hq/ims folder
- unzip the images in celeba-hq dataset at 1024 x 1024 resolution into this folder
data/processed folder
- distances: dists_pairwise_gan.npy, dists_pairwise_vgg.npy, dists_pairwise_facial.npy, dists_pairwise_facial_facenet.npy, dists_pairwise_facial_facenet_casia.npy, dists_pairwise_facial_vgg2.npy - (30k x 30k) matrices storing the pairwise distances between all the images in celeba-hq using different distance measures
data/processed/gen/generated_images_0.1
- latents celeba_hq_latents_stylegan2.zip - these are used in downstream analysis and are required for the propensity score analysis
(already present in repo) - annotations (e.g. gender, smiling, eyeglasses) + predicted metrics (e.g. predicted yaw, roll, pitch, quality, race) for each image + latent StyleGAN2 directions for different attributes + precomputed match numbers
(optional) can download the raw annotations and annotated images as well
(optional) all these paths can be changed in the config.py file
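
A dense 30k x 30k distance matrix holds 900 million entries, which is about 3.6 GB at 32-bit precision and twice that at 64-bit, so it can help to memory-map the file rather than read it all into RAM. The sketch below is a hedged illustration, not code from the repository: the helper name is hypothetical, and a tiny matrix written to a temporary directory stands in for the real files in data/processed.

```python
import os
import tempfile

import numpy as np


def load_distance_matrix(path):
    """Memory-map a saved .npy distance matrix instead of loading it fully."""
    return np.load(path, mmap_mode="r")


# Stand-in file (assumption): a 4 x 4 matrix in place of a real 30k x 30k one.
tmp_dir = tempfile.mkdtemp()
demo_path = os.path.join(tmp_dir, "dists_pairwise_demo.npy")
np.save(demo_path, np.arange(16, dtype=np.float32).reshape(4, 4))

dists = load_distance_matrix(demo_path)
row = np.asarray(dists[2])  # copies only this row into memory
print(row)  # [ 8.  9. 10. 11.]
```

With the real data, the path would point at one of the cached files listed above, in the location set in config.py.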

Scripts

Both the matching_benchmarking folder and the projection_manipulation folder contain two types of files:
- .py files in the scripts subdirectories - these scripts are used to calculate the cached outputs in the gdrive folder. They do not need to be rerun, but they show how the cached outputs were generated and can be rerun on new datasets.
- .ipynb notebooks - these are used to reproduce the results from the cached outputs in the gdrive folder. Notebooks beginning with eda are for exploratory analysis, which can be useful but is not required to generate the final results in the paper.

Reference

This project builds on many wonderful open-source projects (see the readmes in the lib subfolders for more details), including:
- stylegan: stylegan2 and stylegan2 encoder
- facial recognition: dlib, python face_recognition, facenet
- gender/race prediction: fairface
- pose/background prediction: deepheadpose, face_segmentation, and faceQnet

@article{singh2021matched,
  title={Matched sample selection with GANs for mitigating attribute confounding},
  author={Chandan Singh and Guha Balakrishnan and Pietro Perona},
  journal={arXiv preprint arXiv:2103.13455},
  year={2021}
}

Owner

Name: Chandan Singh
Login: csinva
Kind: user
Location: Microsoft research
Company: Senior researcher

Website: csinva.io
Twitter: csinva_
Repositories: 29
Profile: https://github.com/csinva

Senior researcher @Microsoft interpreting ML models in science and medicine. PhD from UC Berkeley.

GitHub Events

Total

Fork event: 1

Last Year

Fork event: 1

Committers

Last synced: 7 months ago

All Time

Total Commits: 195
Total Committers: 6
Avg Commits per committer: 32.5
Development Distribution Score (DDS): 0.097

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Chandan Singh	c**h@b**u	176
Chandan Singh	s**x@a**m	13
Ubuntu	u**u@i**l	3
dependabot[bot]	4****]	1
Ubuntu	u**u@i**l	1
Singh	s**x@3**m	1

Committer Domains (Top 20 + Academic)

3c22fb126b9d.ant.amazon.com: 1
ip-172-31-17-87.us-west-2.compute.internal: 1
ip-172-31-28-224.us-west-2.compute.internal: 1
amazon.com: 1
berkeley.edu: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 1
Total pull requests: 19
Average time to close issues: about 3 hours
Average time to close pull requests: 3 months
Total issue authors: 1
Total pull request authors: 1
Average comments per issue: 2.0
Average comments per pull request: 0.84
Merged pull requests: 1
Bot issues: 0
Bot pull requests: 19

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0


Top Authors

Issue Authors

cpatrickalves (1)

Pull Request Authors

dependabot[bot] (19)

Top Labels

Issue Labels

Pull Request Labels

dependencies (19)

Dependencies

lib/facenet/requirements.txt pypi

Pillow *
h5py *
matplotlib *
opencv-python *
psutil *
requests *
scikit-learn *
scipy *
tensorflow ==1.15.4

lib/stylegan2/Dockerfile docker

tensorflow/tensorflow 1.15.0-gpu-py3 build

requirements.txt pypi

dlib >=19.20.0
face_recognition *
opencv-python >=3.4.0.12
scikit_learn >=0.20.0
tensorflow ==1.14.0
