NHS England Data Science PhD Internships

Synthetic Data Exploration

Longitudinal

Keywords: Synthetic, Simulation, TabularData

Need: The ability to analyse patient pathways and build tools to support them is limited, often because of restricted data access. Synthetic patient pathways would be one way to ease the access problem, at least for uses where low-fidelity data is acceptable. Generating synthetic longitudinal data is harder than generating simple transactional data: the time element makes individuals much more identifiable, and models must handle many more degrees of freedom. This project would investigate generating synthetic patient pathways, or other longitudinal data, using agent-based simulations or alternative methods.

Current Knowledge/Examples & Possible Techniques/Approaches: A variety of methods have been suggested, including extrapolating graph models, agent-based simulations, TimeGANs and SOM-VAEs. Similar ideas around defining patient clinical pathways are demonstrated in tools such as Synthea (synthetichealth/synthea), a synthetic patient population simulator. The project could potentially build on or use SynPath.
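
As a rough illustration of the simulation-style approaches listed above, the sketch below generates toy patient pathways from a first-order Markov chain over care stages, with a random waiting time between events. It is a minimal sketch only: the stage names, transition probabilities, waiting-time range and function names are all hypothetical, chosen for illustration, and are not taken from Synthea, SynPath or any real data.

```python
import random

# Hypothetical care stages and transition probabilities.
# Illustrative assumptions only, not derived from any real data.
TRANSITIONS = {
    "GP referral": [("Outpatient", 0.7), ("Discharged", 0.3)],
    "Outpatient": [("Diagnostics", 0.5), ("Treatment", 0.3), ("Discharged", 0.2)],
    "Diagnostics": [("Treatment", 0.6), ("Discharged", 0.4)],
    "Treatment": [("Outpatient", 0.2), ("Discharged", 0.8)],
}
START, END = "GP referral", "Discharged"


def simulate_pathway(rng, max_steps=50):
    """Return one synthetic pathway as a list of (day, stage) events."""
    day, stage = 0, START
    events = [(day, stage)]
    for _ in range(max_steps):
        if stage == END:
            break
        # Pick the next stage according to the current stage's probabilities.
        next_stages, weights = zip(*TRANSITIONS[stage])
        stage = rng.choices(next_stages, weights=weights, k=1)[0]
        # Assumed waiting time of 1 to 30 days between events.
        day += rng.randint(1, 30)
        events.append((day, stage))
    return events


def simulate_cohort(n_patients, seed=0):
    """Simulate pathways for a cohort, keyed by a synthetic patient id."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    return {pid: simulate_pathway(rng) for pid in range(n_patients)}


cohort = simulate_cohort(n_patients=5, seed=42)
for pid, pathway in cohort.items():
    print(pid, pathway)
```

A real agent-based model would add patient attributes, capacity constraints and interactions between agents. The Markov chain here only shows the basic shape of the output: one time-stamped sequence of events per synthetic patient.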

Related Previous Internship Projects:

Enables Future Work: Resource for patient pathways

Outcome/Learning Objectives: Small demonstrable tool with accompanying paper outlining potential and limitations.

Datasets: n/a

Desired skill set: When applying, please highlight any experience with synthetic data generation, probability and graph models, agent-based modelling, and coding (including any coding in the open), as well as any other data science experience you feel is relevant.

