Improving Model Accuracy Using Unstructured Data - Predicting the Onset of Congestive Heart Failure

David A. West, IBM SPSS

ABSTRACT
The advent of the Electronic Medical Record provides a wealth of untapped data that, when used to build predictive models, significantly improves the quality of care. Much of this data takes the form of unstructured clinical notes, lab test results, and other textual information. Using unstructured data improves the accuracy of patient disease-status classification and, consequently, the accuracy of models used to predict the onset of critical diseases. Early prediction of diseases such as kidney disease, lung disease, and various cardiac conditions will improve longevity and quality of life while reducing the long-term cost of care. This presentation details a practical example in which detailed data from nearly 600,000 patients was used to identify the patients likely to develop congestive heart failure in the coming year. The techniques used in this example are directly applicable to similar patient-related predictions.

Updated 02/21/2015