TY - JOUR
T1 - Analysis of healthcare coverage
T2 - A data mining approach
AU - Delen, Dursun
AU - Fuller, Christie
AU - McCann, Charles
AU - Ray, Deepa
PY - 2009/3
Y1 - 2009/3
N2 - The existing disparity in healthcare coverage is a pressing issue in the United States. Many in the US do not have healthcare coverage, and much research is needed to identify the factors leading to this phenomenon. Hence, this study aims to examine the healthcare coverage of individuals by applying popular machine learning techniques to a wide variety of predictive factors. Twenty-three variables and 193,373 records from the 2004 Behavioral Risk Factor Surveillance System survey data were used for this study. Artificial neural network and decision tree models were developed and compared for predictive ability. Sensitivity analysis and variable importance measures were calculated to analyze the importance of the predictive factors. The experimental results indicated that the most accurate classifier for this phenomenon was the multi-layer perceptron type artificial neural network model, which had an overall classification accuracy of 78.45% on the holdout sample. The most important predictive factors were income, employment status, education, and marital status. Using two popular machine learning techniques, this study identified the factors that can be used to accurately classify those with and without healthcare coverage. The ability to identify those likely to be without healthcare coverage, and to explain the reasoning behind that classification, through the application of accurate classification models can potentially be used in reducing the disparity in healthcare coverage.
KW - Classification
KW - Data mining
KW - Decision trees
KW - Healthcare coverage
KW - Neural networks
KW - Prediction
UR - http://www.scopus.com/inward/record.url?scp=56349092023&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2007.10.041
DO - 10.1016/j.eswa.2007.10.041
M3 - Article
AN - SCOPUS:56349092023
SN - 0957-4174
VL - 36
SP - 995
EP - 1003
JO - Expert Systems with Applications
JF - Expert Systems with Applications
IS - 2 PART 1
ER -