Keywords: Knowledge Graphs, Node Embeddings, Tabular
Need: In healthcare, knowledge graphs (KGs) can unify structured EHR data, clinical notes, guidelines, and biomedical ontologies into a semantically rich, interoperable format. This enables powerful downstream applications such as reasoning, similarity search, and explainable machine learning. However, the way a knowledge graph is constructed (which nodes and edges are defined, how attributes are represented, and how clinical context is incorporated) can profoundly affect downstream utility.
This project will explore practical knowledge graph formation strategies for the NHS by developing a specific prototype: embedding a clinical knowledge graph linking patients, diagnoses, labs, and medications, and then comparing predictive accuracy and interpretability against models trained directly on raw tabular data. This will help identify how graph structure, ontology choices, and embedding methods affect performance, explainability, and generalisation.
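As a rough illustration of the first step in such a prototype, the sketch below turns a few tabular rows into patient-centred triples and then samples uniform random walks over the resulting graph. Everything in it is an assumption made for illustration: the rows are made-up toy data, and the column names, relation names (`has_diagnosis`, `has_lab`, `prescribed`) and helper functions are hypothetical rather than taken from MIMIC-IV or any NHS data model.

```python
import random
from collections import defaultdict

# Toy, made-up rows (hypothetical column names, not the MIMIC-IV schema).
ROWS = [
    {"patient": "p1", "diagnosis": "dx_diabetes", "lab": "lab_hba1c_high", "medication": "med_metformin"},
    {"patient": "p2", "diagnosis": "dx_diabetes", "lab": "lab_hba1c_high", "medication": "med_insulin"},
    {"patient": "p3", "diagnosis": "dx_asthma", "lab": "lab_eosinophils_high", "medication": "med_salbutamol"},
]

# Hypothetical edge types, one per non-patient column.
RELATIONS = {"diagnosis": "has_diagnosis", "lab": "has_lab", "medication": "prescribed"}


def rows_to_triples(rows):
    """Turn each table row into (patient, relation, entity) triples."""
    triples = []
    for row in rows:
        for column, relation in RELATIONS.items():
            triples.append((row["patient"], relation, row[column]))
    return triples


def build_adjacency(triples):
    """Build undirected neighbour sets; edge types are ignored when walking."""
    adj = defaultdict(set)
    for head, _, tail in triples:
        adj[head].add(tail)
        adj[tail].add(head)
    return adj


def random_walks(adj, walk_length=5, walks_per_node=2, seed=0):
    """Sample fixed-length uniform random walks starting from every node."""
    rng = random.Random(seed)
    walks = []
    for start in sorted(adj):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                # Sort neighbours so the walk is reproducible for a given seed.
                walk.append(rng.choice(sorted(adj[walk[-1]])))
            walks.append(walk)
    return walks


triples = rows_to_triples(ROWS)
adjacency = build_adjacency(triples)
walks = random_walks(adjacency)
```

One possible next step, in the style of DeepWalk, would be to treat each walk as a sentence and train a skip-gram model on the walks to obtain node embeddings, which could then be compared with features built directly from the tabular rows. The choice of which columns become nodes and which become edge types is exactly the kind of design decision the project would vary.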
Current Knowledge/Examples & Possible Techniques/Approaches:
Related Previous Internship Projects: N/A, as this is the first iteration of the project
Enables Future Work: Lays the groundwork for federated graph learning or hybrid search tools (e.g. retrieval-augmented generation with KGs), and informs projects that attempt to integrate clinical ontologies with NHS data models.
Outcome/Learning Objectives:
Datasets: MIMIC-IV
Desired skill set: When applying, please highlight any experience of working with graph structures (especially graph ML methods), clinical terminologies, and Python coding (including any coding in the open), as well as any other data science experience you feel is relevant.