TY - JOUR
T1 - Generic E-variables for exact sequential k-sample tests that allow for optional stopping
AU - Turner, Rosanne J.
AU - Ly, Alexander
AU - Grünwald, Peter D.
N1 - Publisher Copyright: © 2023 The Author(s)
PY - 2024/5
Y1 - 2024/5
AB - We develop E-variables for testing whether two or more data streams come from the same source or not, and more generally, whether the difference between the sources is larger than some minimal effect size. These E-variables lead to exact, nonasymptotic tests that remain safe, i.e., keep their type-I error guarantees, under flexible sampling scenarios such as optional stopping and continuation. In special cases our E-variables also have an optimal ‘growth’ property under the alternative. While the construction is generic, we illustrate it through the special case of k×2 contingency tables, i.e. k Bernoulli streams, allowing for the incorporation of different restrictions on the composite alternative. Comparison to p-value analysis in simulations and a real-world 2 × 2 contingency table example show that E-variables, through their flexibility, often allow for early stopping of data collection — thereby retaining similar power as classical methods — while also retaining the option of extending or combining data afterwards.
KW - Composite hypothesis
KW - E-variables
KW - Hypothesis testing
KW - Sequential test
KW - Test martingale
KW - Type-I error control
UR - http://www.scopus.com/inward/record.url?scp=85175262787&partnerID=8YFLogxK
U2 - 10.1016/j.jspi.2023.106116
DO - 10.1016/j.jspi.2023.106116
M3 - Article
AN - SCOPUS:85175262787
SN - 0378-3758
VL - 230
JO - Journal of Statistical Planning and Inference
JF - Journal of Statistical Planning and Inference
M1 - 106116
ER -