Healthc Inform Res. 2022 Jul;28(3):256-266. doi:10.4258/hir.2022.28.3.256.

Unsupervised Machine Learning to Identify Depressive Subtypes

Affiliations
  • 1 Carbon Health, San Mateo, CA, USA
  • 2 Institute of Psychiatry, Psychology and Neuroscience, King’s College London, London, UK
  • 3 NIHR Maudsley BRC, London, UK
  • 4 South London and Maudsley NHS Foundation Trust, Beckenham, UK

Abstract


Objectives
This study evaluated latent Dirichlet allocation (LDA), an unsupervised machine learning method, for identifying subtypes of depression within symptom data.
Methods
Data from 18,314 depressed patients were used to create LDA models. The outcomes included future emergency presentations, crisis events, and behavioral problems. One model was chosen for further analysis based on its potential as a clinically meaningful construct. The associations between the patient groups created with the final LDA model and the outcomes were then tested. These steps were repeated with a commonly used latent variable model to provide additional context for the LDA results.
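The modeling step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes a binary patient-by-symptom matrix, uses synthetic data in place of the clinical records, relies on scikit-learn's `LatentDirichletAllocation`, and assigns each patient to the subtype with the largest mixture weight. The abstract does not state the library or the assignment rule, so all of these are assumptions, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(0)
n_patients, n_symptoms, n_subtypes = 500, 12, 5  # illustrative sizes only

# Synthetic stand-in for symptom data: 1 = symptom recorded, 0 = not recorded.
X = rng.binomial(1, 0.3, size=(n_patients, n_symptoms))
# Assumption: every depressed patient has at least one recorded symptom.
X[np.arange(n_patients), rng.integers(0, n_symptoms, size=n_patients)] = 1

# Fit LDA, treating patients as "documents" and symptoms as "words".
lda = LatentDirichletAllocation(n_components=n_subtypes, random_state=0)
theta = lda.fit_transform(X)  # per-patient subtype mixture, shape (500, 5)

# Normalize the topic-word weights into per-subtype symptom distributions.
phi = lda.components_ / lda.components_.sum(axis=1, keepdims=True)

# Assumed assignment rule: each patient joins their dominant subtype.
groups = theta.argmax(axis=1)

# The two highest-probability symptoms per subtype (symptom indices).
top_two = np.argsort(phi, axis=1)[:, ::-1][:, :2]
```

Inspecting `phi` (or `top_two`) is one way to label subtypes by their symptom distributions before any outcome analysis, and `groups` gives the patient groups that could then be tested against outcomes.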
Results
Five subtypes were identified using the final LDA model. Prior to the outcome analysis, the subtypes were labeled based upon the symptom distributions they produced: psychotic, severe, mild, agitated, and anergic-apathetic. The patient groups largely aligned with the outcome data. For example, the psychotic and severe subgroups were more likely to have emergency presentations (odds ratio [OR] = 1.29; 95% confidence interval [CI], 1.17–1.43 and OR = 1.16; 95% CI, 1.05–1.29, respectively), whereas these outcomes were less likely in the mild subgroup (OR = 0.86; 95% CI, 0.78–0.94). We found that the LDA subtypes were characterized by clusters of unique symptoms. This contrasted with the latent variable model subtypes, which were largely stratified by severity.
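To make the reported effect sizes concrete, the sketch below computes an unadjusted odds ratio and a Wald 95% confidence interval from a 2x2 table. The abstract does not say how the odds ratios were estimated (they may come from adjusted regression models), so this is only an illustration of the quantity being reported. The function name and the counts are hypothetical and are not taken from the study.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: subgroup members with the outcome
    b: subgroup members without the outcome
    c: all other patients with the outcome
    d: all other patients without the outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(a=20, b=80, c=10, d=90)
```

An interval that lies entirely above 1 (as for the psychotic and severe subgroups) indicates higher odds of the outcome, and one entirely below 1 (as for the mild subgroup) indicates lower odds.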
Conclusions
This study suggests that LDA can surface clinically meaningful, qualitative subtypes. Future work could incorporate these subtypes into studies of the biological bases of depression, thereby contributing to the development of new psychiatric therapeutics.

Keywords

Psychiatry; Depression; Mental Health; Machine Learning; Medical Informatics

Figures

  • Figure 1 Five-topic latent Dirichlet allocation (LDA) symptom distribution. Column colors represent individual subtypes. Symptoms were included here if they were one of the two most common symptoms for a subtype. The red column corresponds to the “Severe” group, blue to “Psychotic,” yellow to “Mild,” green to “Agitated,” and pink to “Anergic-apathetic.”

  • Figure 2 Three-class latent class analysis (LCA) symptom likelihoods. Column colors represent individual subtypes. The top 10 most common symptoms in the dataset were included here. The red and yellow columns can be viewed as severe subtypes, where the latter is distinguished by psychotic features. The blue column, overall, forms a mild subtype.

  • Figure 3 Four-class LCA symptom likelihoods. Column colors represent individual subtypes. The top 10 most common symptoms in the dataset were included here. The red column corresponds to the “Severe” group, blue to “Psychotic,” yellow to “Moderate,” and green to “Mild.”

  • Figure 4 Symptom likelihoods for the latent Dirichlet allocation (LDA) patient groups. Symptoms were included here if they were one of the top 10 most common symptoms and were one of the top two symptoms in an LDA subtype.

  • Figure 5 Symptom likelihoods for the latent class analysis (LCA) patient groups. Symptoms were included here if they were one of the top 10 most common symptoms and were one of the top two symptoms in a latent Dirichlet allocation (LDA) subtype.

