Keywords: Synthetic, VAE, Tabular
Need: Over the course of three internship projects, we have developed NHSSynth, a Variational AutoEncoder (VAE) with differential privacy built into a modular pipeline. It generates single-table tabular synthetic data and comes with an evaluation metric suite, a fairness toolset, and an adversarial attack suite.
This project would investigate expanding this tool to be able to generate multi-table, longitudinal, or multi-modal data using recent advances in the field.
Current Knowledge/Examples & Possible Techniques/Approaches:
In terms of:
Enables Future Work: Allows NHS England to generate a wider range of synthetic data for internal and external use.
Outcome/Learning Objectives: Extension of the toolset into a new functional area.
MIMIC-III is our standard dataset for this work.
Desired skill set: When applying, please highlight any experience with synthetic data, variational autoencoders, or other generative techniques; Python coding and software development (including any coding in the open); and any other data science experience you feel is relevant.