Posterior calibration of posterior predictive p-values

G.H. van Kollenburg, J. Mulder, J.K. Vermunt

Research output: Contribution to journal › Article › Scientific › peer-review

Abstract

In order to accurately control the Type I error rate (typically .05), a p value should be uniformly distributed under the null model. The posterior predictive p value (ppp), which is commonly used in Bayesian data analysis, generally does not satisfy this property. For example, there have been reports where the sampling distribution of the ppp under the null model was highly concentrated around .50. In this case, a ppp of .20 would indicate model misfit, but when comparing it with a significance level of .05, which is standard statistical practice, the null model would not be rejected. Therefore, the ppp has very little power to detect model misfit. A solution has been proposed in the literature, which involves calibrating the ppp using the prior distribution of the parameters under the null model. A disadvantage of this “prior-cppp” is that it is very sensitive to the prior of the model parameters. In this article, an alternative solution is proposed where the ppp is calibrated using the posterior under the null model. This “posterior-cppp” (a) can be used when prior information is absent, (b) allows one to test any type of misfit by choosing an appropriate discrepancy measure, and (c) has a uniform distribution under the null model. The methodology is applied in various testing problems: testing independence of dichotomous variables, checking misfit of linear regression models in the presence of outliers, and assessing misfit in latent class analysis.
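The calibration idea described in the abstract can be illustrated with a small simulation. The following is a minimal sketch of one plausible reading, not the authors' procedure: it assumes a toy null model y ~ Normal(mu, 1) with a flat prior on mu, uses the sample variance as a hypothetical discrepancy measure, and takes the calibrated value to be the share of null-model ppp values (from data sets simulated under posterior draws of mu) that are at most as large as the observed ppp. The function names `discrepancy`, `ppp`, and `posterior_cppp` and all default settings are assumptions made for illustration.

```python
import numpy as np


def discrepancy(y):
    # Hypothetical discrepancy measure: the sample variance, which reacts to
    # over-dispersion relative to the unit variance assumed by the null model.
    return np.var(y, axis=-1, ddof=1)


def ppp(y, rng, n_rep=500):
    """Posterior predictive p value under y ~ Normal(mu, 1), flat prior on mu."""
    n = y.size
    # Posterior of mu under this toy null model: Normal(mean(y), 1 / n).
    mu = rng.normal(y.mean(), 1.0 / np.sqrt(n), size=n_rep)
    # One replicated data set of size n per posterior draw.
    y_rep = rng.normal(mu[:, None], 1.0, size=(n_rep, n))
    # Share of replicated data sets at least as extreme as the observed one.
    return float(np.mean(discrepancy(y_rep) >= discrepancy(y)))


def posterior_cppp(y, rng, n_cal=200, n_rep=500):
    """Calibrate the observed ppp against ppp values simulated under the null.

    Assumption: null data sets are generated from posterior draws of mu,
    and each one gets its own ppp computed exactly like the observed one.
    """
    n = y.size
    ppp_obs = ppp(y, rng, n_rep)
    mu = rng.normal(y.mean(), 1.0 / np.sqrt(n), size=n_cal)
    ppp_null = np.array(
        [ppp(rng.normal(m, 1.0, size=n), rng, n_rep) for m in mu]
    )
    # Calibrated value: how unusual the observed ppp is among null ppp values.
    return ppp_obs, float(np.mean(ppp_null <= ppp_obs))


rng = np.random.default_rng(0)
y_misfit = rng.normal(0.0, 2.5, size=30)  # variance far above the null's 1
print(posterior_cppp(y_misfit, rng))
```

The nested simulation costs `n_cal * n_rep` replicated data sets, so the defaults are kept small here; in a real analysis the posterior draws would usually come from an MCMC sampler rather than a closed-form posterior.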
Original language: English
Pages (from-to): 382-396
Journal: Psychological Methods
Volume: 22
Issue number: 2
DOI: 10.1037/met0000142
Publication status: Published - 2017

Cite this

@article{9fbb4ea62a4e4252973adaa18726ca26,
title = "Posterior calibration of posterior predictive p-values",
abstract = "In order to accurately control the Type I error rate (typically .05), a p value should be uniformly distributed under the null model. The posterior predictive p value (ppp), which is commonly used in Bayesian data analysis, generally does not satisfy this property. For example there have been reports where the sampling distribution of the ppp under the null model was highly concentrated around .50. In this case, a ppp of .20 would indicate model misfit, but when comparing it with a significance level of .05, which is standard statistical practice, the null model would not be rejected. Therefore, the ppp has very little power to detect model misfit. A solution has been proposed in the literature, which involves calibrating the ppp using the prior distribution of the parameters under the null model. A disadvantage of this “prior-cppp” is that it is very sensitive to the prior of the model parameters. In this article, an alternative solution is proposed where the ppp is calibrated using the posterior under the null model. This “posterior-cppp” (a) can be used when prior information is absent, (b) allows one to test any type of misfit by choosing an appropriate discrepancy measure, and (c) has a uniform distribution under the null model. The methodology is applied in various testing problems: testing independence of dichotomous variables, checking misfit of linear regression models in the presence of outliers, and assessing misfit in latent class analysis.",
author = "van Kollenburg, G.H. and Mulder, J. and Vermunt, J.K.",
year = "2017",
doi = "10.1037/met0000142",
language = "English",
volume = "22",
pages = "382--396",
journal = "Psychological Methods",
issn = "1082-989X",
publisher = "American Psychological Association",
number = "2",
}

Posterior calibration of posterior predictive p-values. / van Kollenburg, G.H.; Mulder, J.; Vermunt, J.K.

In: Psychological Methods, Vol. 22, No. 2, 2017, p. 382-396.

TY - JOUR
T1 - Posterior calibration of posterior predictive p-values
AU - van Kollenburg, G.H.
AU - Mulder, J.
AU - Vermunt, J.K.
PY - 2017
Y1 - 2017
AB - In order to accurately control the Type I error rate (typically .05), a p value should be uniformly distributed under the null model. The posterior predictive p value (ppp), which is commonly used in Bayesian data analysis, generally does not satisfy this property. For example there have been reports where the sampling distribution of the ppp under the null model was highly concentrated around .50. In this case, a ppp of .20 would indicate model misfit, but when comparing it with a significance level of .05, which is standard statistical practice, the null model would not be rejected. Therefore, the ppp has very little power to detect model misfit. A solution has been proposed in the literature, which involves calibrating the ppp using the prior distribution of the parameters under the null model. A disadvantage of this “prior-cppp” is that it is very sensitive to the prior of the model parameters. In this article, an alternative solution is proposed where the ppp is calibrated using the posterior under the null model. This “posterior-cppp” (a) can be used when prior information is absent, (b) allows one to test any type of misfit by choosing an appropriate discrepancy measure, and (c) has a uniform distribution under the null model. The methodology is applied in various testing problems: testing independence of dichotomous variables, checking misfit of linear regression models in the presence of outliers, and assessing misfit in latent class analysis.

U2 - 10.1037/met0000142
DO - 10.1037/met0000142
M3 - Article
VL - 22
SP - 382
EP - 396
JO - Psychological Methods
JF - Psychological Methods
SN - 1082-989X
IS - 2
ER -