Keywords: NLP, STM, Text
Need: Topic modelling is a useful approach for exploring collections of documents and surfacing insights from them. Structured Topic Models extend classical topic modelling by incorporating document metadata with the goal of improving the output topics, and this has proven useful in a variety of domains.
Further developing Structured Topic Modelling approaches to incorporate modern techniques available in open-source NLP frameworks (in both Python and R), and to make use of existing knowledge bases or supplementary information, could further improve their efficacy and use in different healthcare settings.
Alongside this, the project could explore combining the outputs of several sentiment analysis models into representative vectors, to make them more robust when dealing with healthcare text. This is especially relevant for short-text phrases, feedback forms, and other NHS-specific texts, where a single model can find it challenging to allocate a suitable sentiment score. Such an approach could help extract deeper insights.
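As a minimal sketch of what combining sentiment model outputs into a representative vector could look like, the Python below scores one comment with two toy models and summarises their agreement. Everything here is an assumption for illustration: the word lists, the two scoring functions, the `MODELS` mapping, and the `max_spread` threshold are hypothetical stand-ins, not part of the project or of any existing library. In practice each function would wrap a real sentiment model.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, List

# Hypothetical toy lexicons; a real project would use trained models instead.
POSITIVE = {"kind", "helpful", "excellent", "caring", "good"}
NEGATIVE = {"rude", "slow", "poor", "dirty"}
NEGATORS = {"not", "no", "never"}


def _tokens(text: str) -> List[str]:
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,!?").lower() for word in text.split()]


def lexicon_model(text: str) -> float:
    """Toy model 1: balance of positive and negative words, in [-1, 1]."""
    toks = _tokens(text)
    pos = sum(t in POSITIVE for t in toks)
    neg = sum(t in NEGATIVE for t in toks)
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)


def negation_aware_model(text: str) -> float:
    """Toy model 2: as model 1, but flips a word's polarity after a negator."""
    toks = _tokens(text)
    score, hits = 0, 0
    for i, tok in enumerate(toks):
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        if polarity == 0:
            continue
        if i > 0 and toks[i - 1] in NEGATORS:
            polarity = -polarity
        score += polarity
        hits += 1
    return 0.0 if hits == 0 else score / hits


# Hypothetical registry of models; the order fixes the vector's dimensions.
MODELS: Dict[str, Callable[[str], float]] = {
    "lexicon": lexicon_model,
    "negation_aware": negation_aware_model,
}


def sentiment_vector(text: str, models: Dict[str, Callable[[str], float]]) -> List[float]:
    """Return one score per model, giving a vector that represents the text."""
    return [model(text) for model in models.values()]


def summarise(vector: List[float], max_spread: float = 0.5) -> Dict[str, object]:
    """Average the scores and flag texts on which the models disagree."""
    spread = pstdev(vector)
    return {"mean": mean(vector), "spread": spread, "needs_review": spread > max_spread}


if __name__ == "__main__":
    for comment in ["Staff were kind and helpful.", "Staff were not helpful"]:
        vec = sentiment_vector(comment, MODELS)
        print(comment, vec, summarise(vec))
```

The design choice illustrated is to keep the per-model scores as a vector rather than collapsing them straight away: the spread across models gives a simple signal for short comments where a single score would be unreliable, such as the negated phrase in the second example.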
Current Knowledge/Examples & Possible Techniques/Approaches:
Related Previous Internship Projects: GitHub - nhsx/stm-survey-text
Enables Future Work: Provide alternative methods of extracting value from short-text prose using Structured Topic Modelling and enhancing the robustness of sentiment models when applied to healthcare text.
Outcome/Learning Objectives: Additional suite of functionality via e.g., semantic search analysis. Ability to combine multiple sentiment model outputs for robustness.
Datasets: Open healthcare text datasets, e.g. Nottinghamshire Healthcare NHS Foundation Trust Friends and Family Feedback
Desired skill set: When applying, please highlight any experience of working with healthcare texts, natural language processing (specifically topic modelling), knowledge graphs, coding (including any coding in the open), and any other data science experience you feel is relevant.