J Korean Soc Med Inform.  2009 Mar;15(1):13-23.

Prediction of Hospital Charges for the Cancer Patients with Data Mining Techniques

Affiliations
  • 1 Department of Radiation Oncology, School of Medicine, Kyung Hee University, Seoul, Korea. kangjino@khmc.or.kr
  • 2 Graduate School of Business, Korea University, Seoul, Korea.

Abstract


OBJECTIVE
Predicting hospital charges for cancer patients is important because such predictions provide a basis for allocating medical resources within a hospital and for establishing national medical policies. Previous studies, however, relied mainly on statistical analysis that used only a small portion of the large volume of available medical data, which limited their predictive power. We therefore developed four data mining models, two artificial neural network (ANN) models and two classification and regression tree (CART) models, to predict both the total hospital charges and the amount paid by insurance for cancer patients, and compared their performance.
METHODS
The data were drawn from 400,625 medical records of 1,605 cancer patients who were hospitalized at Kyung Hee University Hospital from March 1, 2003 to February 29, 2004. Clementine 8.1 was used to build four data mining prediction models, two for the total amount and two for the amount paid by insurance. The variables included all data fields of the standard Korean medical record form. The neural network models used a feed-forward back-propagation method with 2 hidden layers. For the decision tree models, the RELIEFF method was used and the maximum tree depth was set to 30. We divided the dataset into a training set (67%) and a test set (33%) using stratified sampling. The models were compared on linear correlation coefficient and gain charts.
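The 67%/33% stratified split described above can be illustrated with a minimal sketch. The abstract does not state which variable defined the strata or how Clementine performed the sampling, so the `stratum_of` function, the `charge_band` field, and the toy records below are all hypothetical and serve only to show the idea.

```python
import random
from collections import defaultdict


def stratified_split(records, stratum_of, train_fraction=0.67, seed=0):
    """Split records so that each stratum keeps roughly the same train share.

    `stratum_of` maps a record to its stratum label (an assumption here;
    the source does not say which field was used for stratification).
    """
    rng = random.Random(seed)
    by_stratum = defaultdict(list)
    for rec in records:
        by_stratum[stratum_of(rec)].append(rec)

    train, test = [], []
    for key in sorted(by_stratum):
        group = by_stratum[key][:]
        rng.shuffle(group)  # random order within the stratum
        cut = round(len(group) * train_fraction)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


# Toy records of the form (patient_id, charge_band); both fields are made up.
toy_records = [(i, "high" if i % 4 == 0 else "low") for i in range(100)]
train_set, test_set = stratified_split(toy_records, stratum_of=lambda r: r[1])
```

Splitting within each stratum, rather than over the whole dataset at once, keeps the proportion of each group similar in the training and test sets.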
RESULTS
The ANN models showed higher linear correlation coefficients than the CART models in predicting both the total amount (0.824 vs. 0.791) and the amount paid by insurance (0.838 vs. 0.699). The estimated accuracy of the ANN models was more than 98% for both the total amount and the amount paid by insurance. In the CART model for the total amount, the most important variables by relative importance were duration of admission (0.073), number of consultations (0.061), and treatment group 16 (0.06). In the CART model for the amount paid by insurance, they were duration of admission (0.09), number of ICU admissions (0.063), and number of consultations (0.062). In the gain charts, the ANN model showed better percent gain than the CART model for the total amount, but the two models showed similar patterns for the amount paid by insurance.
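The linear correlation coefficient used to compare the models is presumably the Pearson correlation between actual and predicted charges on the test set, although the abstract does not spell this out. A minimal sketch under that assumption follows; the function name and the toy values are illustrative and are not taken from the study.

```python
import math


def pearson_r(actual, predicted):
    """Pearson linear correlation between actual and predicted values."""
    n = len(actual)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_a * var_p)


# Made-up charges: predictions that are an exact linear function of the actuals.
toy_actual = [1.0, 2.0, 3.0, 4.0]
toy_predicted = [2.0, 4.0, 6.0, 8.0]
r_perfect = pearson_r(toy_actual, toy_predicted)
```

A value near 1 means predictions rise and fall linearly with the actual charges. It does not by itself show that the predicted amounts are close in absolute terms.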
CONCLUSION
The ANN models showed better prediction accuracy than the CART models. However, the CART models, which provide different information from the ANN models, can be used to allocate limited medical resources effectively and efficiently. Using the two types of models together is warranted for establishing medical policies and strategies.

Keywords

Cost; Cancer; Data Mining; Neural Network Models; Decision Tree Models

MeSH Terms

Classification
Data Mining*
Dataset
Decision Trees
Hospital Charges*
Humans
Insurance
Korea
Medical Records
Neural Networks (Computer)
Referral and Consultation

Figure

  • Figure 1 Structure of Artificial Neural Network. The ANN model had two hidden layers. The ANN model for total amount included 56 input neurons and the model for amount paid by insurance included 53 neurons.

  • Figure 2 Decision Tree Model for Total Amount. The duration of admission was the most important variable to split.

  • Figure 3 Decision Tree Model for Amount Paid by Insurance. The duration of admission was the most important variable to split.

  • Figure 4 The y-axis Shows the Percentage of Gain. The x-axis shows the percentage of samples selected by the data mining model, as a fraction of the total samples. The ANN model shows better %gain than CART in predicting the total amount (upper), but ANN and CART show similar patterns in predicting the amount paid by insurance (lower).
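The points of a gain chart like the one described for Figure 4 can be computed as in the following sketch. It assumes each test sample has a model score and a binary "hit" flag; how hits were defined for a continuous charge target in Clementine is not stated in the abstract, so the flags, the function name, and the toy data are hypothetical.

```python
def cumulative_gain(scores, hits, fractions=(0.25, 0.5, 0.75, 1.0)):
    """Return (fraction_selected, percent_of_hits_captured) points.

    Samples are ranked by model score, highest first. For each fraction of
    samples selected, the gain is the share of all hits found in that slice.
    """
    total_hits = sum(hits)
    if total_hits == 0:
        raise ValueError("gain is undefined when there are no hits")
    ranked = sorted(zip(scores, hits), key=lambda pair: pair[0], reverse=True)
    points = []
    for frac in fractions:
        k = round(frac * len(ranked))  # number of top-ranked samples selected
        captured = sum(hit for _, hit in ranked[:k])
        points.append((frac, 100.0 * captured / total_hits))
    return points


# Made-up scores and hit flags for eight test samples.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_hits = [1, 1, 0, 1, 0, 0, 1, 0]
gain_points = cumulative_gain(toy_scores, toy_hits)
```

A model whose curve rises faster captures more of the hits in the first samples selected, which is the sense in which one model shows "better %gain" than another.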

