Precis Future Med. 2024 Sep;8(3):92-104. doi: 10.23838/pfm.2024.00030.

Computationally efficient and stable real-world synthetic emergency room electronic health record data generation: high similarity and privacy preserving diffusion model approach

Affiliations
  • 1. Smart Health Lab, Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, Korea
  • 2. Department of Biomedical System Informatics, Yonsei University College of Medicine, Seoul, Korea
  • 3. Data Sciences Institute, Research Institute for Future Medicine, Samsung Medical Center, Seoul, Korea
  • 4. Cloud AI Research, Google Cloud, Google, Mountain View, CA, USA
  • 5. Department of Digital Health, Samsung Advanced Institute for Health Sciences & Technology, Sungkyunkwan University, Seoul, Korea
  • 6. Department of Emergency Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, Seoul, Korea
  • 7. Digital Innovation, Samsung Medical Center, Seoul, Korea

Abstract

Purpose
This study aimed to develop real-world synthetic electronic health record (EHR) data for emergency departments using computationally efficient and stable diffusion probabilistic models.
Methods
We compared the performance of diffusion models and state-of-the-art generative adversarial networks (GANs) in terms of statistical similarity, privacy, medical usefulness, and the feasibility of using synthetic data for machine learning purposes.
Results
Diffusion models were significantly more computationally efficient than GANs and performed comparably or slightly better in terms of similarity, privacy, and utility. The synthetic data generated by the diffusion model were statistically very similar to the real data for both categorical and continuous values, and the model addressed class imbalance precisely. Moreover, the usefulness of the synthetic data was almost identical to that of real EHR data. Our privacy analysis showed that the synthetic data generated by the diffusion models preserved privacy.
Conclusion
These findings have significant implications for improving the efficiency of emergency settings and enabling real-time emergency room data modeling. Diffusion models can generate real-world synthetic EHRs that are computationally efficient to produce, private, and of high quality, and that can be used for machine learning purposes in emergency settings.

Keywords

Electronic health records; Emergency room; Privacy