NHS England Data Science PhD Internships

Building Multimodal Patient Representations for Clinical Prediction

Keywords: Machine Learning, Single Patient Record, Multimodal

Need: This project builds on previous NHS work such as mm-healthfair by moving beyond fairness diagnosis to focus on developing robust, task-relevant multimodal patient representations. Specifically, it will create patient-level embeddings across structured (EHR), unstructured (clinical notes), and image (e.g., X-ray or CT metadata) data, and assess their utility across a range of downstream use cases, including:

Patient similarity search / retrieval
Clinical prediction (e.g., length of stay, readmission)
Patient clustering (e.g., phenotype discovery)

Previous NHS projects, such as mm-healthfair, have demonstrated that combining data across multiple modalities (e.g., structured data, free text, and imaging) can both improve performance and introduce new fairness concerns. Those projects highlighted how models can become unfairly biased depending on the modality fusion strategy and the richness of the non-tabular data sources.

By using consistent embedding frameworks, this project creates a foundation for future fairness auditing, interpretability analysis, and explainability research, enabling NHS use cases with traceable model reasoning.

Current Knowledge/Examples & Possible Techniques/Approaches:

Prior NHS Work: mm-healthfair highlighted fusion fairness issues; this project progresses from evaluation to representation building.
Multimodal Fusion Strategies:
- Early Fusion: Feature concatenation or cross-attention (e.g., Perceiver IO)
- Late Fusion: Modality-specific encoders mapped into a unified embedding space
- Cross-modal Representation: Contrastive learning (e.g., CLIP, MedCLIP, or ALBEF-style models)
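To make the three strategies above concrete, the following is a minimal NumPy sketch, not a proposed implementation. All data, dimensions, and function names (`early_fusion`, `late_fusion`, `symmetric_info_nce`) are hypothetical, random matrices stand in for trained modality-specific encoders, and cross-attention fusion is not covered. The contrastive loss is a generic CLIP-style symmetric InfoNCE, assumed here only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: 4 patients, three modalities with different feature sizes.
n_patients = 4
structured = rng.normal(size=(n_patients, 6))   # e.g. tabular EHR features
text = rng.normal(size=(n_patients, 10))        # e.g. pooled clinical-note embeddings
image = rng.normal(size=(n_patients, 8))        # e.g. pooled imaging embeddings


def l2_normalise(x):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def early_fusion(modalities):
    """Early fusion: concatenate per-patient features before any joint encoder."""
    return np.concatenate(modalities, axis=1)


def late_fusion(modalities, projections):
    """Late fusion: project each modality separately, then average in a shared space."""
    projected = [l2_normalise(x @ w) for x, w in zip(modalities, projections)]
    return np.mean(projected, axis=0)


def symmetric_info_nce(a, b, temperature=0.1):
    """CLIP-style contrastive loss: row i of `a` should match row i of `b`."""
    logits = l2_normalise(a) @ l2_normalise(b).T / temperature

    def cross_entropy_diag(l):
        l = l - l.max(axis=1, keepdims=True)  # subtract row max for numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the a-to-b and b-to-a directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(logits.T))


embed_dim = 5
# Assumption: random projection matrices stand in for trained encoders.
projections = [rng.normal(size=(m.shape[1], embed_dim)) for m in (structured, text, image)]

early = early_fusion([structured, text, image])
late = late_fusion([structured, text, image], projections)
loss = symmetric_info_nce(text @ projections[1], image @ projections[2])
```

In this sketch the early-fusion output grows with the number of input features, whereas the late-fusion output has a fixed size regardless of which modalities are present, which is one reason a shared embedding space may be easier to reuse across downstream tasks.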
Relevant Literature and Frameworks:
- Hansen et al. (2024) Multimodal representation learning for medical analytics - a systematic literature review.
- BioBERT, ClinicalBERT for text
- DenseNet or ViT variants pretrained on medical imaging
- PyTorch Metric Learning for patient similarity retrieval tasks
- SHAP, Integrated Gradients for explainability

Related Previous Internship Projects:

https://nhsx.github.io/nhsx-internship-projects/mmbias/
Txt-Ray Align Project
Genomic + Clinical Integration (Upcoming)
Synthetic multimodal patient generation (e.g., NHSSynth)

Enables Future Work:

Modular patient embeddings that can be reused in multiple downstream projects (prediction, retrieval, fairness auditing)
Input for patient-similarity-based systems or recommendation tools
Foundation for cross-modal fairness evaluations and bias mitigation strategies
Improved architecture selection guidance for NHS multimodal ML

Outcome/Learning Objectives:

Design and implement at least one fusion strategy across structured and unstructured data
Generate patient-level embeddings across multiple modalities
Evaluate embeddings on one or more downstream tasks:
- Classification (e.g., 30-day readmission)
- Clustering (e.g., unsupervised phenotype discovery)
- Similarity retrieval (e.g., patient case recall)
Assess utility, explainability, and reusability of these embeddings
Document findings in a reusable open-source codebase and technical report
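As an illustration of the similarity-retrieval evaluation listed above, the sketch below computes recall@k over cosine similarity. It is a toy example under stated assumptions: the embeddings and labels are invented, `recall_at_k` is a hypothetical helper rather than part of any named library, and "relevant" is assumed to mean "shares the same label" (for example a phenotype or outcome).

```python
import numpy as np


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between all patient embeddings."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return unit @ unit.T


def recall_at_k(embeddings, labels, k):
    """Fraction of query patients with at least one same-label patient in their top-k neighbours."""
    sims = cosine_similarity_matrix(embeddings)
    np.fill_diagonal(sims, -np.inf)            # a patient must not retrieve themselves
    top_k = np.argsort(-sims, axis=1)[:, :k]   # indices of the k most similar patients
    hits = (labels[top_k] == labels[:, None]).any(axis=1)
    return float(hits.mean())


# Hypothetical toy embeddings: two well-separated groups of three patients each.
embeddings = np.array([
    [1.0, 0.1], [0.9, 0.2], [1.0, 0.0],
    [0.1, 1.0], [0.2, 0.9], [0.0, 1.0],
])
labels = np.array([0, 0, 0, 1, 1, 1])  # assumed phenotype or outcome labels

score = recall_at_k(embeddings, labels, k=1)
```

The same similarity matrix could also feed a clustering step, and the embeddings could be passed to a simple classifier for a task such as 30-day readmission, so that one set of embeddings is compared across all three evaluation types.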

Datasets: Accessible datasets such as MIMIC-IV

Desired skill set: When applying, please highlight any experience with healthcare data, multimodal embedding and learning, and coding (including any coding in the open), as well as any other data science experience you feel is relevant.
