J Korean Soc Med Inform.  2007 Jun;13(2):177-180.

Predicting Breast Cancer Survivability: Comparison of Five Data Mining Techniques

Affiliations
  • 1. Graduate School of Medicine and Dentistry, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo, Tokyo, Japan. aendo@bioinfo.tmd.ac.jp
  • 2. Tokai Univ. School of Medicine, Bouseidai, Kanagawa, Japan.
  • 3. Center for Information Medicine, Tokyo Medical and Dental University, Japan.

Abstract


OBJECTIVE
In the United States, about one in eight women is affected by breast cancer during her lifetime. Several prediction models built on SEER (Surveillance, Epidemiology, and End Results) datasets have been proposed in past studies. However, an appropriate method for predicting the 5-year survival of breast cancer patients has not been established. In this study, we evaluate these models for predicting the survival of breast cancer patients.
METHODS
Four data mining algorithms (Artificial Neural Network, Naive Bayes, Decision Trees (ID3), and Decision Trees (J48)) and the most widely used statistical method (Logistic Regression), five techniques in total, were used to build prediction models from a dataset of 37,256 follow-up cases from 1992 to 1997. We used 10-fold cross-validation to obtain an unbiased estimate of each model's performance and to compare the five methods.
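As an illustration of the 10-fold cross-validation described here, the following is a minimal sketch in plain Python, not the authors' implementation. The data are synthetic, `MajorityClassifier` is a hypothetical placeholder for any of the five techniques, and the helper names `k_fold_indices` and `cross_validate` are assumptions introduced for this example. The abstract does not state what the "±" values represent; the sketch assumes the sample standard deviation across folds.

```python
import random
import statistics
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


class MajorityClassifier:
    """Hypothetical placeholder model: predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def cross_validate(make_model, X, y, k=10, seed=0):
    """Return the per-fold test accuracies from k-fold cross-validation."""
    folds = k_fold_indices(len(X), k, seed)
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        # Train on the other k-1 folds, then score on the held-out fold.
        model = make_model().fit([X[i] for i in train_idx],
                                 [y[i] for i in train_idx])
        predictions = model.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(predictions, test_idx))
        accuracies.append(correct / len(test_idx))
    return accuracies


# Synthetic stand-in data (assumption): 1,000 records, about 80% labeled 1.
rng = random.Random(42)
X = [[rng.random()] for _ in range(1000)]
y = [1 if rng.random() < 0.8 else 0 for _ in range(1000)]

scores = cross_validate(MajorityClassifier, X, y, k=10)
mean_acc = statistics.mean(scores)
sd_acc = statistics.stdev(scores)  # sample SD across the 10 folds (assumed meaning of "±")
print(f"accuracy: {100 * mean_acc:.1f} +/- {100 * sd_acc:.1f}%")
```

Replacing `MajorityClassifier` with any object exposing `fit` and `predict` would let the same loop compare several models on identical folds, which is what makes the per-method accuracies comparable.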
RESULTS
The accuracy was 85.8 ± 0.2%, 84.3 ± 1.4%, 83.9 ± 0.2%, 82.3 ± 0.2%, and 75.1 ± 0.2% for Logistic Regression, Artificial Neural Network, Naive Bayes, Decision Trees (ID3), and Decision Trees (J48), respectively. Logistic Regression achieved the highest accuracy and Decision Trees (J48) the lowest.
CONCLUSIONS
Logistic Regression achieved the best accuracy, whereas Decision Trees (J48) performed worst. Artificial Neural Network showed relatively high performance.

Keywords

Artificial Neural Network; Decision Tree; Naive Bayes; Breast Cancer; SEER Program

MeSH Terms

Bays
Breast Neoplasms*
Breast*
Data Mining*
Dataset
Decision Trees
Epidemiology
Female
Follow-Up Studies
Hand
Humans
Logistic Models
SEER Program
Survival Rate
United States