Optimal rank-based tests for common principal components.

M. Hallin, D. Paindaveine, T. Verdebout

    Research output: Contribution to journal › Article › Scientific › peer-review

    Abstract

    This paper provides optimal testing procedures for the m-sample null hypothesis of Common Principal Components (CPC) under possibly non-Gaussian and heterogeneous elliptical densities. We first establish, under very mild assumptions that do not require finite moments of order four, the local asymptotic normality (LAN) of the model. Based on that result, we show that the pseudo-Gaussian test proposed in Hallin et al. (J. Nonparametr. Stat. 22 (2010) 879–895) is locally and asymptotically optimal under Gaussian densities, and show how to compute its local powers. A numerical evaluation of those powers, however, reveals that, while remaining valid, this test is poorly efficient away from the Gaussian. Moreover, it still requires finite moments of order four. We therefore propose rank-based procedures that remain valid under any possibly heterogeneous m-tuple of elliptical densities, irrespective of the existence of any moments. In elliptical families, indeed, principal components naturally can be based on the scatter matrices characterizing the density contours, hence do not require finite variances. Those rank-based tests, as usual, involve score functions, which may or may not be associated with a reference density at which they achieve optimality. A major advantage of our rank tests is that they are not only validity-robust, in the sense of surviving arbitrary elliptical population densities: unlike their pseudo-Gaussian counterparts, they also are efficiency-robust, in the sense that their local powers do not deteriorate away from the reference density at which they are optimal. We show, in particular, that in the homokurtic case, their normal-score version uniformly dominates, in the Pitman sense, the aforementioned pseudo-Gaussian generalization of Flury’s test. Theoretical results are obtained via a nonstandard application of Le Cam’s methodology in the context of curved LAN experiments. The finite-sample properties of the proposed tests are investigated via simulations.
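
For orientation, the CPC null hypothesis mentioned in the abstract can be written out as below. This is only a sketch in the usual Flury-style formulation: the symbols (the scatter matrices \Sigma_i, the common eigenvector matrix \beta, the diagonal matrices \Lambda_i, and the dimension k) are notation assumed here for illustration and do not appear in the abstract itself.

```latex
% CPC null hypothesis for m populations in dimension k (notation assumed, not from the abstract):
% all m scatter matrices share the same eigenvectors, while eigenvalues may differ across populations.
\[
  \mathcal{H}_0 :\quad
  \boldsymbol{\Sigma}_i \;=\; \boldsymbol{\beta}\,\boldsymbol{\Lambda}_i\,\boldsymbol{\beta}^{\prime},
  \qquad i = 1, \ldots, m,
\]
\[
  \text{where } \boldsymbol{\beta} \text{ is a } k \times k \text{ orthogonal matrix and }
  \boldsymbol{\Lambda}_i = \operatorname{diag}(\lambda_{i1}, \ldots, \lambda_{ik})
  \text{ with } \lambda_{ij} > 0 .
\]
```

Under this formulation, the columns of \beta are the common principal directions, and only the \Lambda_i vary from one population to the next.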
    Original language: English
    Pages (from-to): 2153-2779
    Journal: Bernoulli
    Volume: 19
    Issue number: 5B
    Publication status: Published - 2013

    Fingerprint

    Common Principal Components
    Local Asymptotic Normality
    Local Power
    Moment
    Valid
    Rank Test
    Score Function
    Principal Components
    Asymptotically Optimal
    Scatter
    Null hypothesis
    Optimality
    Testing
    Methodology

    Cite this

    Hallin, M., Paindaveine, D., & Verdebout, T. (2013). Optimal rank-based tests for common principal components. Bernoulli, 19(5B), 2153-2779.
    @article{f681e932d6854143816d14c2456e0992,
    title = "Optimal rank-based tests for common principal components.",
    author = "M. Hallin and D. Paindaveine and T. Verdebout",
    year = "2013",
    language = "English",
    volume = "19",
    pages = "2153--2779",
    journal = "Bernoulli",
    issn = "1350-7265",
    publisher = "International Statistical Institute",
    number = "5B",

    }