Identifying Disease with Natural Language Processing Technology


A recent study by Kaiser Permanente demonstrated the value of natural language processing (NLP) technology, which helped clinicians identify more than 50,000 patients with aortic stenosis, a common heart disease.

The study was conducted by Matthew Solomon, MD, PhD, a cardiologist at The Permanente Medical Group and a physician researcher at the Kaiser Permanente Division of Research in Oakland, California.

According to Solomon, while healthcare is currently in an era of big data and data analytics, it remains hard to identify patients with complex conditions such as valvular heart disease, making it difficult to study the disease, track practice patterns, and manage population health.

“Currently, health systems track patients using diagnosis or procedure codes, which are mostly created for billing purposes. These can be very non-specific and are not useful for clinical care or research,” Solomon told HealthITAnalytics.

“Without accurate and systematic case identification, population management and research on valvular heart conditions and many other complex conditions isn’t possible. We set out to tackle this problem by developing natural language processing algorithms that make it possible to teach a computer how to do this for us.”

Researchers trained the NLP algorithms to sort through more than a million electronic medical records (EMRs) and echocardiogram reports and identify certain abbreviations, words, and phrases associated with aortic stenosis.

Within minutes, the software recognized nearly 54,000 patients with the condition, a process that would likely have taken physicians years to perform manually.
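The article does not describe how the Kaiser Permanente algorithms work internally. As a rough illustration of the general idea of scanning report text for condition-related terms, the following minimal Python sketch uses simple pattern matching with a basic negation check. The term lists, the function names, and the sample reports are all hypothetical and are not taken from the study.

```python
import re

# Hypothetical term list; the study's actual vocabulary is not given in the article.
PHRASES = re.compile(
    r"\b(aortic\s+stenosis|aortic\s+valve\s+stenosis|stenotic\s+aortic\s+valve)\b",
    re.IGNORECASE,
)
# "AS" is matched case-sensitively so the ordinary word "as" is ignored.
ABBREVIATION = re.compile(r"\bAS\b")

# Very simple negation cues, checked in the part of the sentence before a match.
NEGATION = re.compile(
    r"\b(no|without|negative for|no evidence of)\b[^.]*$", re.IGNORECASE
)


def mentions_aortic_stenosis(report: str) -> bool:
    """Return True if any sentence contains a non-negated mention."""
    for sentence in re.split(r"(?<=[.!?])\s+", report):
        for pattern in (PHRASES, ABBREVIATION):
            for match in pattern.finditer(sentence):
                preceding = sentence[: match.start()]
                if not NEGATION.search(preceding):
                    return True
    return False


def find_patients(reports: dict[str, list[str]]) -> set[str]:
    """Map patient IDs to report texts and return the IDs with a mention."""
    return {
        patient_id
        for patient_id, texts in reports.items()
        if any(mentions_aortic_stenosis(text) for text in texts)
    }


# Made-up sample data for illustration only.
sample_reports = {
    "p1": ["Moderate aortic stenosis with calcified leaflets."],
    "p2": ["No evidence of aortic stenosis. Normal LV function."],
    "p3": ["Severe AS, mean gradient 45 mmHg."],
}
cohort = find_patients(sample_reports)
```

A production system would need a validated vocabulary, more robust negation and context handling, and review against manually labeled charts, which is the kind of validation Solomon refers to.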

“It was a magical moment when we were able to apply our developed and validated algorithms on our entire population and to then identify our large cohort of patients with aortic stenosis,” Solomon said.

“We could immediately imagine a not-too-distant future where these methods could be used to take population management, which Kaiser Permanente Northern California has excelled at for the past two decades, to the next level.”

With artificial intelligence-based technologies, researchers can improve efficiency and output. A vast amount of important clinical data exists in EMRs, Solomon explained. However, much of the data remains out of reach for most researchers because it is unstructured and cannot be easily searched.

“AI-based technologies like NLP can overcome this challenge, improving the efficiency and feasibility of research by allowing us to build large cohorts of patients to study without manual chart review. This process also yields much larger datasets that can produce more precise and generalizable results,” Solomon said.

NLP can also help researchers overcome the limitations of diagnosis and procedure codes. According to Solomon, these codes are currently not specific enough, as they are not designed to include detailed data about a specific medical condition.

“For example, a patient with moderate or severe aortic stenosis is entirely different than a patient with mild aortic valve disease. Those nuances are not included in diagnosis or procedure codes. In addition, some codes simply state ‘aortic valve disease,’ which could be applied to an entirely different clinical problem of the aortic valve,” Solomon said.
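To make the distinction concrete, here is a small hedged sketch of how a severity grade might be pulled from free-text report language, which a generic "aortic valve disease" code cannot express. The pattern, the grade ordering, and the function name are illustrative assumptions, not the study's method.

```python
import re
from typing import Optional

# Assumed ordering of grades, from least to most severe.
SEVERITY_ORDER = ["mild", "moderate", "severe"]

# Matches phrases such as "moderate aortic stenosis" or
# "moderate to severe aortic valve stenosis" (hypothetical pattern).
SEVERITY_PATTERN = re.compile(
    r"\b(mild|moderate|severe)"
    r"(?:[\s-]+to[\s-]+(mild|moderate|severe))?"
    r"\s+aortic\s+(?:valve\s+)?stenosis\b",
    re.IGNORECASE,
)


def extract_severity(report: str) -> Optional[str]:
    """Return the most severe grade mentioned in the text, or None."""
    grades = []
    for match in SEVERITY_PATTERN.finditer(report):
        grades.extend(g.lower() for g in match.groups() if g)
    if not grades:
        return None
    return max(grades, key=SEVERITY_ORDER.index)


# Made-up report snippets for illustration only.
graded = extract_severity("Moderate to severe aortic stenosis noted.")
ungraded = extract_severity("Aortic valve disease.")
```

In this sketch the second snippet yields no grade at all, mirroring the point that a non-specific label carries no severity information.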

Population health management and research on chronic conditions aren’t possible without accurate and systematic case identification. Because so much of the data in medical records and reports is unstructured, the most logical way to identify patients with these conditions is to develop NLP methods that comb through large volumes of EMRs.

While AI-based technologies are powerful tools for identifying patients with chronic diseases, they can also aid disease prevention efforts, an approach that Kaiser Permanente plans to pursue.

“Identifying patients with complex conditions that cannot be otherwise well-characterized is only the start. Once we are able to build large cohorts for patients with various cardiovascular conditions, we can begin to do risk-based population management,” Solomon continued.

“This will allow us not only to focus resources on the sickest patients but also identify trends and novel predictors that affect outcomes or disease progression so that we can identify and intervene earlier to help the patients who may be most likely to progress to more severe disease.”

