Artificial intelligence (AI) meets patient generated notes: A novel NLP method to improve self management and remote care for families having children with special healthcare needs (CSHCN)

by Emre Sezgin

Coauthors: Syed-Amad Hussain, Katelyn Krivchenia, John Luna, Steve Rust, Yungui Huang

Medical Devices & Digital Health

A large volume of patient health information is stored in electronic health record (EHR) systems. However, much of this information depends on the patient’s recall of personal health events occurring outside the clinic (symptoms, medication compliance, over-the-counter medicines, etc.) and on the provider’s interpretation before these events are recorded in the EHR. Patient-generated health data (PGHD), such as medical diaries that families use to track health information and events at home, have often been used to support clinical decision-making. To improve patient-reported outcomes and facilitate shared decision-making in clinical practice, we developed a natural language processing (NLP)-supported data synthesis pipeline for unstructured PGHD recorded at home, focusing on pediatric care of CSHCN, with cystic fibrosis (CF) as the case study.
The proposed information extraction (IE) pipeline extracts a broad range of health information by combining rule-based, ontology-linked approaches with pre-trained deep-learning models for entity recognition and sentence parsing. In particular, we build upon the scispaCy biomedical model suite, leveraging its syntactic parsing and named entity recognition capabilities. We additionally link these entities, via UMLS identifiers, to established ontologies such as SNOMED and RxNorm. We also introduce interactive customization so that the pipeline can be rapidly fitted to a specific patient or cohort. The pipeline is tested with simulated CF patient notes. Finally, we prototype a dashboard using Unity for data integration, sharing, and assessment, which presents aggregated data in a timeline/calendar view.
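To make the rule-based, ontology-linked side of such a pipeline concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation and does not use scispaCy: the lexicon, the placeholder concept IDs, the dose pattern, and the names `LEXICON`, `DOSE_PATTERN`, `Entity`, and `extract_entities` are all assumptions introduced for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical mini-lexicon: surface term -> (entity type, placeholder concept ID).
# The IDs are made-up placeholders, NOT real UMLS, SNOMED, or RxNorm identifiers.
LEXICON = {
    "albuterol": ("medication", "PLACEHOLDER-001"),
    "pancreatic enzymes": ("medication", "PLACEHOLDER-002"),
    "cough": ("symptom", "PLACEHOLDER-003"),
    "vest therapy": ("therapy", "PLACEHOLDER-004"),
}

# Assumed dose rule: a number followed by a unit, e.g. "2 puffs" or "2.5 mg".
DOSE_PATTERN = re.compile(
    r"\b(\d+(?:\.\d+)?)\s*(mg|ml|puffs?|capsules?)\b", re.IGNORECASE
)


@dataclass
class Entity:
    text: str                  # span as written in the note
    label: str                 # entity type, e.g. "medication" or "dose"
    concept_id: Optional[str]  # linked concept, or None for unlinked rules
    start: int                 # character offset in the note


def extract_entities(note: str) -> List[Entity]:
    """Find lexicon terms and dose expressions in a free-text note."""
    entities = []
    lowered = note.lower()  # assumes ASCII notes, so offsets are preserved
    for term, (label, concept_id) in LEXICON.items():
        for match in re.finditer(r"\b" + re.escape(term) + r"\b", lowered):
            span = note[match.start():match.end()]
            entities.append(Entity(span, label, concept_id, match.start()))
    for match in DOSE_PATTERN.finditer(note):
        entities.append(Entity(match.group(0), "dose", None, match.start()))
    # Return entities in the order they appear in the note.
    return sorted(entities, key=lambda e: e.start)


if __name__ == "__main__":
    for entity in extract_entities("Gave 2 puffs of albuterol after morning cough."):
        print(entity)
```

In a hybrid setup, a learned entity recognizer would supply candidate spans that rules like these miss, while the lexicon lookup supplies the ontology link.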
In this study, we demonstrated that a hybrid deep-learning and rule-based approach can operate over a variety of natural language note types and allow customization for a given patient or cohort. Viable data extraction is demonstrated on simulated CF notes, extracting entities for medication, dose, therapies, symptoms, bowel movements, and nutrition. The hybrid pipeline is robust to misspellings and varied word representations while providing interactivity to accommodate the needs of a specific patient, cohort, or clinician.
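The abstract does not say how robustness to misspellings is achieved. One common way to get this behavior is approximate string matching against a known vocabulary. The sketch below uses Python's standard-library `difflib` as a stand-in. The vocabulary `KNOWN_TERMS`, the function `normalize_token`, and the 0.8 similarity cutoff are assumptions for illustration, not details from the study.

```python
import difflib
from typing import Optional

# Hypothetical vocabulary of canonical terms; a real system would draw
# these from an ontology or a patient-specific customization step.
KNOWN_TERMS = ["albuterol", "tobramycin", "cough", "wheezing"]


def normalize_token(token: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a possibly misspelled token to its closest known term.

    Returns None when no known term is similar enough, so that
    unrelated words are not forced onto the vocabulary.
    """
    matches = difflib.get_close_matches(
        token.lower(), KNOWN_TERMS, n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


if __name__ == "__main__":
    for word in ["albutrol", "Wheezing", "xyz"]:
        print(word, "->", normalize_token(word))
```

A higher cutoff reduces false matches at the cost of missing heavier misspellings. The right value would need tuning on real caregiver notes.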