Stojanovic, J., Gligorijevic, Dj., Radosavljevic, V., Djuric, N., Grbovic, M., Obradovic, Z. (2016) “Modeling Healthcare Quality via Compact Representations of Electronic Health Records,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, IEEE, 2016, July 14, doi 10.1109/TCBB.2016.2591523.
(Impact Factor: 1.609)


Increased availability of Electronic Health Record (EHR) data provides unique opportunities for improving the quality of health services. In this study, we couple EHRs with advanced machine learning tools to predict three important parameters of healthcare quality. More specifically, we describe how to learn low-dimensional vector representations of patient conditions and clinical procedures in an unsupervised manner, and how to generate feature vectors of hospitalized patients that are useful for predicting their length of stay, total incurred charges, and mortality rates. To learn the vector representations, we propose to employ state-of-the-art language models specifically designed for modeling the co-occurrence of diseases and applied clinical procedures. The proposed model is trained on a large-scale EHR database comprising more than 35 million hospitalizations in California over a period of nine years. We compared the proposed approach to several alternatives and evaluated their effectiveness by measuring the accuracy of the regression and classification models used for the three predictive tasks considered in this study. Our model outperformed the baseline models on all tasks, indicating a strong potential of the proposed approach for advancing the quality of the healthcare system.

Data (disease&procedures2vec vectors):