ComputeFest 2019: Symposium on Data Science, Machine Learning, and Fairness in Computational Science

Table of Contents

ComputeFest 2019

Hosted by the Institute for Applied Computational Science (IACS), ComputeFest is an annual winter event of knowledge and skill-building activities in computational science, engineering, and data science. The workshop content complements the curriculum taught in DataFest.

IACS Symposium: "Data Science at the Frontier of Discovery: Machine Learning in the Physical World"

Tuesday, January 22nd, 2019 Harvard University Science Center, Hall B, 1 Oxford Street, Cambridge MA 02138

On Fairness and Interpretability

Model Agnostic Methods for Interpretability and Fairness

  • look at local perturbations
  • decision boundaries
  • Shapley values

Local Perturbations

  • LIME builds a local surrogate model by perturbing input values around the instance being explained, highlighting which features drive the prediction

https://github.com/marcotcr/lime
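The core LIME idea can be sketched without the library itself: perturb inputs near a point of interest, weight samples by proximity, and fit a simple local model. Below is a minimal one-dimensional sketch; the `black_box` model is a hypothetical stand-in, not anything from the workshop repo.

```python
import math
import random

def black_box(x):
    # Hypothetical opaque model: nonlinear in its single input.
    return 1.0 / (1.0 + math.exp(-(x - 2.0)))

def local_slope(f, x0, n_samples=500, width=0.5, seed=0):
    """LIME-style local explanation in one dimension: perturb around x0,
    weight samples by proximity to x0, and fit a weighted linear model."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0.0, width) for _ in range(n_samples)]
    ws = [math.exp(-((x - x0) ** 2) / (2 * width ** 2)) for x in xs]
    ys = [f(x) for x in xs]
    # Weighted least-squares slope of y ~ a + b*x (closed form).
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    return cov / var

# Near x0 = 2.0 the sigmoid is approximately linear with slope ~0.25,
# so the fitted local surrogate should recover a slope in that range.
slope = local_slope(black_box, x0=2.0)
```

The real library (`lime.lime_tabular`) extends this to many features, categorical inputs, and sparse linear explanations.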

  • input gradients around spend and volume, over a four-month sliding window
  • used to clarify the impact of each feature

https://arxiv.org/abs/1611.07634

  • hold all other factors constant
  • example: probability of default relative to debt-to-income ratio
  • the plot holds the remaining features fixed while one feature varies
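The hold-everything-else-constant idea above can be sketched directly. The scoring model and its coefficients here are hypothetical, purely to show one feature being swept while the other stays fixed:

```python
import math

def default_probability(debt_to_income, credit_age_years):
    # Hypothetical logistic scoring model (illustrative only).
    z = 4.0 * debt_to_income - 0.1 * credit_age_years - 1.0
    return 1.0 / (1.0 + math.exp(-z))

# Hold credit age constant and sweep debt-to-income through its range;
# each (dti, probability) pair is one point on the curve to plot.
fixed_age = 5.0
sweep = [i * 0.1 for i in range(11)]
curve = [(dti, default_probability(dti, fixed_age)) for dti in sweep]
```

Plotting `curve` shows how the probability of default responds to debt-to-income alone, with all other factors held fixed.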

BILE Decision Boundary

  • spend and lift -> SpendLiftV6

Shapley Values

  • exact computation requires training 2^F models (one per feature subset) to determine each feature's contribution
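The 2^F cost comes from evaluating a value function (e.g. model accuracy after retraining) on every feature subset. A minimal exact computation, using a toy value function with made-up accuracies rather than real retraining, looks like:

```python
from itertools import combinations
from math import factorial

def shapley_values(features, value):
    """Exact Shapley values: evaluates value() on every feature subset,
    i.e. 2**len(features) 'retrainings' -- the cost noted above."""
    n = len(features)
    phi = {}
    for i in features:
        others = [f for f in features if f != i]
        total = 0.0
        for k in range(n):
            for S in combinations(others, k):
                # Weight for subsets of size k in the Shapley formula.
                w = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += w * (value(set(S) | {i}) - value(set(S)))
        phi[i] = total
    return phi

# Toy value function: hypothetical model accuracy per feature subset.
acc = {frozenset(): 0.5, frozenset({'a'}): 0.7, frozenset({'b'}): 0.6,
       frozenset({'a', 'b'}): 0.9}
phi = shapley_values(['a', 'b'], lambda S: acc[frozenset(S)])
```

By the efficiency property, the attributions sum to the gain of the full model over the empty one (here 0.9 − 0.5); approximate methods like SHAP exist precisely because this exact enumeration is infeasible for many features.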

Workshop

  • load the training data
  • split the data into training and test sets based on the features and the labels
  • treat the heatmap as a 3D surface to which a facet plot could be applied
  • ICE (Individual Conditional Expectation) plots hold each training observation's feature values fixed, then sweep one feature through its range
  • consider looking at points near the decision boundary
  • ensure that fairness tests are part of the visualization, not the pipeline

Measuring Fairness

  • Statistical parity (same rate of favorable outcomes across groups)
  • Conditional parity (same rate across groups, conditioned on legitimate grouping attributes)
  • False positive rate
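These metrics are straightforward to compute by hand. A minimal sketch of statistical parity difference and per-group false positive rate, on hypothetical predictions for two groups:

```python
def statistical_parity_diff(y_pred, group):
    """Difference in positive-prediction rates between groups 1 and 0."""
    def rate(g):
        preds = [p for p, gr in zip(y_pred, group) if gr == g]
        return sum(preds) / len(preds)
    return rate(1) - rate(0)

def false_positive_rate(y_true, y_pred, group, g):
    """Share of a group's true negatives that were predicted positive."""
    negatives = [(t, p) for t, p, gr in zip(y_true, y_pred, group)
                 if gr == g and t == 0]
    return sum(p for t, p in negatives) / len(negatives)

# Hypothetical labels/predictions: first four rows group 0, last four group 1.
y_true = [0, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]
spd = statistical_parity_diff(y_pred, group)
```

A statistical parity difference of zero means both groups receive the favorable label at the same rate; comparing false positive rates across groups checks equality of errors rather than outcomes.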

https://github.com/pblankley/interp-workshop-2019

AI Fairness 360 toolkit (AIF360)

https://github.com/ibm/aif360

How does one increase trust in ML algorithms? By making them 1) fair, 2) repeatable, and 3) explainable.

Example: https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29

  • Bias unit tests
  • Create new explainers on the source dataset
  • Create a pre-processing algorithm
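A bias unit test can be as simple as asserting that a fairness metric stays within a tolerance, so a biased model fails CI like any other regression. A minimal sketch using statistical parity (the threshold and data here are illustrative, not from AIF360):

```python
def positive_rate(y_pred, group, g):
    """Rate of favorable predictions within group g."""
    preds = [p for p, gr in zip(y_pred, group) if gr == g]
    return sum(preds) / len(preds)

def bias_unit_test(y_pred, group, threshold=0.1):
    """Fails if the gap in favorable-label rates exceeds the threshold."""
    gap = abs(positive_rate(y_pred, group, 1)
              - positive_rate(y_pred, group, 0))
    assert gap <= threshold, f"statistical parity gap {gap:.2f} > {threshold}"

# Balanced predictions pass; heavily skewed ones would raise AssertionError.
bias_unit_test([1, 0, 1, 0], [0, 0, 1, 1], threshold=0.1)
```

AIF360 packages the metric side of this (e.g. its dataset metrics), so in practice the assertion would wrap a toolkit call rather than a hand-rolled rate.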

Toolbox

  • metrics
  • data set
  • algorithm

Glossary

  • favorable label
  • protected attribute
  • group vs. individual fairness
  • bias
  • fairness metric
  • explainers

Data Exploration Tools

  • Data Points
  • Describe DF
  • Describe feature + hist
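The "describe feature + hist" step can be sketched without pandas: summary statistics plus a text histogram for a single numeric feature (the binning scheme and sample values are illustrative):

```python
def describe_feature(values, bins=5):
    """Summary stats plus a text histogram for a numeric feature."""
    lo, hi = min(values), max(values)
    mean = sum(values) / len(values)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        # Clamp the top edge into the last bin.
        counts[min(int((v - lo) / width), bins - 1)] += 1
    hist = "\n".join(f"{lo + i * width:6.2f} | " + "#" * c
                     for i, c in enumerate(counts))
    return {"min": lo, "max": hi, "mean": mean, "hist": hist}

summary = describe_feature([1, 2, 2, 3, 3, 3, 4, 4, 5])
```

In pandas the same information comes from `df.describe()` and `df[col].hist()`; this version just makes the computation explicit.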

Author: Jason Walsh

j@wal.sh

Last Updated: 2025-07-30 13:45:27
