Abstract
Predicting recovery after trauma is important to provide patients a perspective on their estimated future health, to engage in shared decision making and target interventions to relevant patient groups. In the present study, several unsupervised techniques are employed to cluster patients based on longitudinal recovery profiles. Subsequently, these data-driven clusters were assessed on clinical validity by experts and used as targets in supervised machine learning models. We present a formalised analysis of the obtained clusters that incorporates evaluation of (i) statistical and machine learning metrics, (ii) clusters clinical validity with descriptive statistics and medical expertise. Clusters quality assessment revealed that clusters obtained through a Bayesian method (High Dimensional Supervised Classification and Clustering) and a Deep Gaussian Mixture model, in combination with oversampling and a Random Forest for supervised learning of the cluster assignments provided among the most clinically sensible partitioning of patients. Other methods that obtained higher classification accuracy suffered from cluster solutions with large majority classes or clinically less sensible classes. Models that used just physical or a mix of physical and psychological outcomes proved to be among the most sensible, suggesting that clustering on psychological outcomes alone yields recovery profiles that do not conform to known risk factors.
Original language | English |
---|---|
Article number | 16990 |
Number of pages | 13 |
Journal | Scientific Reports |
Volume | 12 |
Issue number | 1 |
DOIs | |
Publication status | Published - Dec 2022 |
Keywords
- Bayes Theorem
- Cluster Analysis
- Humans
- Machine Learning
- Risk Factors
- Supervised Machine Learning