Keywords: Explainability, Equity, Tabular Data
Need: There is a great need in healthcare to understand, measure and correct for bias in our data. As we develop operational models further, we also need a clear understanding of model bias and how to measure and track it over time. Multiple metrics address fairness at the group, subgroup or individual level; however, these are not always compatible with one another, making it difficult to optimise a model in a way that deals with the underlying data biases. This project would investigate how we identify and talk about fairness in both our data and our models. In addition, we would look to demonstrate a robust approach to fairness by optimising a directed acyclic graph (DAG) for a variety of fairness measures, with a clear demonstration of the impact.
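To make the tension between fairness metrics concrete, here is a minimal Python sketch (the metric definitions are standard; the toy data is invented purely for illustration) showing that a classifier can satisfy demographic parity exactly while still violating equalised odds:

```python
def positive_rate(preds):
    """Fraction of predictions that are positive (1)."""
    return sum(preds) / len(preds)

def demographic_parity_diff(y_pred, group):
    """Absolute gap in positive-prediction rates between groups 0 and 1."""
    g0 = [p for p, g in zip(y_pred, group) if g == 0]
    g1 = [p for p, g in zip(y_pred, group) if g == 1]
    return abs(positive_rate(g0) - positive_rate(g1))

def equalised_odds_diff(y_true, y_pred, group):
    """Largest gap in true-positive or false-positive rate between groups."""
    diffs = []
    for label in (1, 0):  # label=1 compares TPRs, label=0 compares FPRs
        rates = []
        for g in (0, 1):
            preds = [p for t, p, gg in zip(y_true, y_pred, group)
                     if t == label and gg == g]
            rates.append(positive_rate(preds))
        diffs.append(abs(rates[0] - rates[1]))
    return max(diffs)

# Invented toy data: two groups with different base rates of the outcome.
group  = [0, 0, 0, 0, 1, 1, 1, 1]
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]

print(demographic_parity_diff(y_pred, group))        # 0.0 — parity holds
print(equalised_odds_diff(y_true, y_pred, group))    # ~0.33 — odds gap remains
```

Here both groups receive positive predictions at the same rate, yet group 1 has a higher false-positive rate, so optimising for one criterion does not guarantee the other — the incompatibility the project would need to navigate.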
Current Knowledge/Examples & Possible Techniques/Approaches: There is a wealth of papers and discussion around bias and fairness in data and models. In October 2021, a call went out: Addressing bias in big data and AI for health care.
During our previous projects we have come across several models that attempt to build fairness in, in order to create a fairer synthetic version of the base data. These include DECAF, FR-GAN and FairGAN.
Related Previous Internship Projects: n/a as first iteration of the project
Enables Future Work: Could feed into many ongoing models and pieces of analysis.
Outcome/Learning Objectives: Report, with accompanying open code, summarising the fairness-metric landscape and how it could apply to healthcare models.
Datasets: Public covid cases data, other large open datasets with known and unknown bias
Desired skill set: When applying, please highlight any experience around fairness, DAGs, coding experience (including any coding in the open) and any other data science experience you feel is relevant.