A variable selection method for simultaneous component based data integration

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

The integration of multiblock high throughput data from multiple sources is one of the major challenges in several disciplines including metabolomics, computational biology, genomics, and clinical psychology. A main challenge in this line of research is to obtain interpretable results 1) that give an insight into the common and distinctive sources of variations associated to the multiple and heterogeneous data blocks and 2) that facilitate the identification of relevant variables. We present a novel variable selection method for performing data integration, providing easily interpretable results, and recovering underlying data structure such as common and distinctive components. The flexibility and applicability of this method are showcased via numerical simulations and an application to metabolomics data.
Original language: English
Pages (from-to): 187-199
Journal: Chemometrics & Intelligent Laboratory Systems
Volume: 158
DOI: 10.1016/j.chemolab.2016.07.013
Publication status: Published - 15 Nov 2016

Fingerprint

Data integration
Data structures
Throughput
Computer simulation
Metabolomics
Genomics

Cite this

@article{da23be6ff6ef4ce2a8631ba732118ca9,
title = "A variable selection method for simultaneous component based data integration",
abstract = "The integration of multiblock high throughput data from multiple sources is one of the major challenges in several disciplines including metabolomics, computational biology, genomics, and clinical psychology. A main challenge in this line of research is to obtain interpretable results 1) that give an insight into the common and distinctive sources of variations associated to the multiple and heterogeneous data blocks and 2) that facilitate the identification of relevant variables. We present a novel variable selection method for performing data integration, providing easily interpretable results, and recovering underlying data structure such as common and distinctive components. The flexibility and applicability of this method are showcased via numerical simulations and an application to metabolomics data.",
author = "Z. Gu and {Van Deun}, K.",
year = "2016",
month = "11",
day = "15",
doi = "10.1016/j.chemolab.2016.07.013",
language = "English",
volume = "158",
pages = "187--199",
journal = "Chemometrics \& Intelligent Laboratory Systems",
issn = "0169-7439",
publisher = "Elsevier Science BV",

}

A variable selection method for simultaneous component based data integration. / Gu, Z.; Van Deun, K.

In: Chemometrics & Intelligent Laboratory Systems, Vol. 158, 15.11.2016, p. 187-199.

TY  - JOUR
T1  - A variable selection method for simultaneous component based data integration
AU  - Gu, Z.
AU  - Van Deun, K.
PY  - 2016/11/15
Y1  - 2016/11/15
N2  - The integration of multiblock high throughput data from multiple sources is one of the major challenges in several disciplines including metabolomics, computational biology, genomics, and clinical psychology. A main challenge in this line of research is to obtain interpretable results 1) that give an insight into the common and distinctive sources of variations associated to the multiple and heterogeneous data blocks and 2) that facilitate the identification of relevant variables. We present a novel variable selection method for performing data integration, providing easily interpretable results, and recovering underlying data structure such as common and distinctive components. The flexibility and applicability of this method are showcased via numerical simulations and an application to metabolomics data.
AB  - The integration of multiblock high throughput data from multiple sources is one of the major challenges in several disciplines including metabolomics, computational biology, genomics, and clinical psychology. A main challenge in this line of research is to obtain interpretable results 1) that give an insight into the common and distinctive sources of variations associated to the multiple and heterogeneous data blocks and 2) that facilitate the identification of relevant variables. We present a novel variable selection method for performing data integration, providing easily interpretable results, and recovering underlying data structure such as common and distinctive components. The flexibility and applicability of this method are showcased via numerical simulations and an application to metabolomics data.
U2  - 10.1016/j.chemolab.2016.07.013
DO  - 10.1016/j.chemolab.2016.07.013
M3  - Article
VL  - 158
SP  - 187
EP  - 199
JO  - Chemometrics & Intelligent Laboratory Systems
JF  - Chemometrics & Intelligent Laboratory Systems
SN  - 0169-7439
ER  - 