TY - UNPB
T1 - A tutorial on safe anytime-valid inference
T2 - Practical maximally flexible sampling designs for experiments based on e-values
AU - Ly, Alexander
AU - Böhm, Udo
AU - Grünwald, Peter
AU - Ramdas, Aaditya
AU - van Ravenzwaaij, Don
PY - 2025
Y1 - 2025
N2 - We demonstrate how e-values simplify both experimental design and the inference process. With e-values, researchers can perform anytime-valid tests and construct confidence intervals that maintain type I error control regardless of the sample size. This enables real-time monitoring of evidence as data are collected, permitting early termination of experiments without intolerably inflating the risk of false discoveries. Early stopping not only conserves resources, but also mitigates risk for participants in clinical settings. Anytime-valid tests allow for optional continuation, that is, the extension of an experiment, for instance if more funds become available, or even if the evidence looks promising and the funding agency, a reviewer, or an editor urges the experimenter to collect more data. Analogously, a researcher can be assured that a 95% anytime-valid confidence interval will, with at least 95% probability, cover the true effect size regardless of how, or even if, data collection is stopped. We use the free and open-source software package safestats implemented in R to illustrate the practical benefits of this novel inference framework.
AB - We demonstrate how e-values simplify both experimental design and the inference process. With e-values, researchers can perform anytime-valid tests and construct confidence intervals that maintain type I error control regardless of the sample size. This enables real-time monitoring of evidence as data are collected, permitting early termination of experiments without intolerably inflating the risk of false discoveries. Early stopping not only conserves resources, but also mitigates risk for participants in clinical settings. Anytime-valid tests allow for optional continuation, that is, the extension of an experiment, for instance if more funds become available, or even if the evidence looks promising and the funding agency, a reviewer, or an editor urges the experimenter to collect more data. Analogously, a researcher can be assured that a 95% anytime-valid confidence interval will, with at least 95% probability, cover the true effect size regardless of how, or even if, data collection is stopped. We use the free and open-source software package safestats implemented in R to illustrate the practical benefits of this novel inference framework.
KW - adaptive sampling design
KW - evidence
KW - reproducible science
KW - research waste reduction
KW - sequential analysis
UR - https://osf.io/mdbqe/files/osfstorage
U2 - 10.31234/osf.io/h5vae_v3
DO - 10.31234/osf.io/h5vae_v3
M3 - Working paper
T3 - Behavior Research Methods
BT - A tutorial on safe anytime-valid inference
PB - PsyArXiv Preprints
ER -