Uncovering psychiatric phenotypes using unsupervised machine learning: A data-driven symptoms approach

Amy Hofman, Isabelle Lier, M Arfan Ikram, Marijn van Wingerden, Annemarie I Luik

Research output: Contribution to journal › Article › Scientific › peer-review


Background: Current categorical classification systems of psychiatric diagnoses lead to heterogeneity of symptoms within disorders and frequent co-occurrence of disorders. Using unsupervised machine learning approaches, we investigated the heterogeneous and overlapping nature of symptom endorsement in a population-based sample across three of the most common categories of psychiatric disorders: depressive disorders, anxiety disorders, and sleep-wake disorders.

Methods: We assessed a total of 43 symptoms in a discovery sample of 6,602 participants of the population-based Rotterdam Study between 2009 and 2013, and in a replication sample of 3,005 participants between 2016 and 2020. Symptoms were assessed using the Center for Epidemiologic Studies Depression Scale, the Hospital Anxiety and Depression Scale, and the Pittsburgh Sleep Quality Index. Hierarchical clustering analysis was applied to test items and participants to investigate common patterns of symptom co-occurrence, and these patterns were further investigated quantitatively with clustering methods to find groups that may represent similar psychiatric phenotypes.

Results: First, clustering analyses of the questionnaire items suggested a three-cluster solution representing clusters of mixed symptoms, depressed affect and nervousness, and troubled sleep and interpersonal problems. A highly similar clustering solution was independently established in the replication sample. Second, four groups of participants could be separated, and these groups scored differently on the item clusters.

Conclusions: We identified three clusters of psychiatric symptoms that most commonly co-occur in a population-based sample. These symptoms clustered stably across samples, but cut across the topics of depression, anxiety, and poor sleep. We identified four groups of participants who share (sub)clinical symptoms and might benefit from similar prevention or treatment strategies, despite potentially diverging diagnoses or the lack of a diagnosis.
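As an illustration of the kind of item-level hierarchical clustering described in the Methods, the following minimal sketch groups questionnaire items by how strongly they correlate. It is not the authors' pipeline: the data are synthetic, and the distance measure (1 minus Pearson correlation), the average-linkage rule, and all function names (`simulate_items`, `average_linkage`, `pearson`) are assumptions chosen for the example.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def average_linkage(dist, n_clusters):
    """Agglomerative clustering: repeatedly merge the two clusters with the
    smallest mean pairwise distance until n_clusters remain."""
    clusters = [[i] for i in range(len(dist))]

    def mean_dist(pair):
        a, b = pair
        total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
        return total / (len(clusters[a]) * len(clusters[b]))

    while len(clusters) > n_clusters:
        a, b = min(combinations(range(len(clusters)), 2), key=mean_dist)
        clusters[a] = clusters[a] + clusters[b]  # a < b, so index a stays valid
        del clusters[b]
    return [sorted(c) for c in clusters]


def simulate_items(n_participants=300, items_per_factor=4, n_factors=3, seed=0):
    """Synthetic data (assumption): each item reflects one latent factor plus noise.
    Returns one list of scores per item."""
    rng = random.Random(seed)
    columns = [[] for _ in range(n_factors * items_per_factor)]
    for _ in range(n_participants):
        latent = [rng.gauss(0, 1) for _ in range(n_factors)]
        for j, column in enumerate(columns):
            column.append(latent[j // items_per_factor] + rng.gauss(0, 0.5))
    return columns


items = simulate_items()
# Items that correlate strongly are "close"; uncorrelated items are "far".
dist = [[1 - pearson(a, b) for b in items] for a in items]
item_clusters = sorted(average_linkage(dist, n_clusters=3))
print(item_clusters)
```

The same procedure could in principle be run on participants instead of items by clustering rows rather than columns. The number of clusters is fixed by hand here, whereas a real analysis would need a principled way to choose it.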

Original language: English
Article number: e27
Pages (from-to): 1-9
Number of pages: 9
Journal: European Psychiatry
Issue number: 1
Early online date: 2023
Publication status: Published - 21 Feb 2023


  • Anxiety Disorders
  • Classification
  • Cluster Analysis
  • Depression Scale
  • Disorder
  • Hierarchical Taxonomy
  • Hospital Anxiety
  • Psychopathology HiTOP
  • Reliability
  • Depression
  • Machine Learning
  • Sleep-wake Disorders

