Keywords: PETs, Encryption, Tabular
Need: The NHS has substantial amounts of sensitive data distributed across many secure data silos. Privacy Enhancing Technologies (PETs) support the secure sharing of data and/or the appropriate application of a model across multiple data sources without exposing the underlying private data in those sources. The ICO has recently launched a consultation on PETs to understand how they can facilitate safe, legal, and valuable data sharing.
Additionally, a recent healthcare PETs prize challenge highlighted the opportunity a federated learning framework offers to healthcare. A feasibility study, using a simulated environment with synthetic or safely aggregated data, would be useful to ask: how robust are the tools, and how do we measure and validate solutions? This project would seek to demonstrate how different PETs can be combined to support analysis of disparate data sources while preserving privacy.
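As an illustration of how a PET's privacy/utility trade-off might be measured and validated, the sketch below applies the standard Laplace mechanism for differential privacy to a simple count query and reports the error the noise introduces. This is a minimal, hypothetical example using only the Python standard library; the toy dataset, epsilon value, and function names are assumptions for illustration, not part of the project.

```python
import math
import random

def laplace_noise(scale: float) -> float:
    """Draw one sample from Laplace(0, scale) via inverse-CDF sampling."""
    u = random.uniform(-0.5, 0.5)
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

def dp_count(records, predicate, epsilon: float) -> float:
    """Differentially private count: a count query has sensitivity 1,
    so adding Laplace(1/epsilon) noise gives epsilon-DP."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

random.seed(42)  # fixed seed so the demo is reproducible
ages = [34, 61, 47, 29, 73, 55, 40, 68]  # toy "patient age" records
noisy = dp_count(ages, lambda a: a >= 50, epsilon=1.0)
error = abs(noisy - 4)  # true count of ages >= 50 is 4
print(f"noisy count = {noisy:.2f}, absolute error = {error:.2f}")
```

One way to validate a tool against its stated guarantee is to repeat such queries many times and check that the empirical error distribution matches what the claimed epsilon implies.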
Current Knowledge/Examples & Possible Techniques/Approaches: For an introduction to PETs, see the CDEI PETs adoption guide.
The open-source library PySyft decouples private data from model training, combining Federated Learning, Differential Privacy, and Encrypted Computation (such as Multi-Party Computation (MPC) and Homomorphic Encryption (HE)) within mainstream deep learning frameworks such as PyTorch and TensorFlow.
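To make the MPC idea concrete, the sketch below shows additive secret sharing, one building block that encrypted computation of this kind relies on: each party's value is split into random shares that individually reveal nothing, yet the shares can be combined to compute a joint total. This is a hand-rolled illustration in plain Python, not the PySyft API; the variable names, party count, and choice of modulus are assumptions.

```python
import random

Q = 2**61 - 1  # large prime modulus; shares are uniform in [0, Q)

def share(secret: int, n_parties: int = 3):
    """Split `secret` into n additive shares modulo Q."""
    shares = [random.randrange(Q) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

def reconstruct(shares):
    """Recombine shares; any subset smaller than all n reveals nothing."""
    return sum(shares) % Q

# Two hospitals privately hold patient counts and want the total.
hospital_a, hospital_b = 1200, 850
shares_a = share(hospital_a)
shares_b = share(hospital_b)

# Each compute party adds together the shares it holds,
# never seeing either hospital's raw value.
sum_shares = [(a + b) % Q for a, b in zip(shares_a, shares_b)]
print(reconstruct(sum_shares))  # → 2050
```

Because addition distributes over the shares, the joint total is exact, while any single compute party only ever sees uniformly random values.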
Related Previous Internship Projects: N/A as first iteration of the project
Enables Future Work: Sharing and Analysis across silos; federated data analysis
Outcome/Learning Objectives: Demonstration of how available tools and algorithms can be used to create end-to-end private frameworks.
Datasets: We would look to work with either public or safe data held in two secure silos.
Desired skill set: When applying, please highlight any experience with privacy enhancing technologies, encryption, privacy accountants, cyber security, coding (including any coding in the open), and any other data science experience you feel is relevant.