Beyond Neyman-Pearson
Peter Grünwald, CWI and Leiden University
A standard practice in statistical hypothesis testing is to report the p-value alongside the accept/reject decision. We show the advantages of reporting an e-value instead, an alternative measure of evidence that has recently started to attract attention. With p-values, an extreme observation (e.g. p << alpha) cannot be used to obtain better frequentist decisions. With e-values it can, since they provide Type-I risk control in a generalized Neyman-Pearson setting in which the decision task (a general loss function) is determined post-hoc, after observation of the data, thereby providing a handle on the perennial issue of ‘roving alphas’. When Type-II risks are taken into consideration, the only admissible decision rules in this post-hoc setting turn out to be e-value-based. This provides e-values with an additional interpretation on top of their original one in terms of bets.
We also propose to replace confidence intervals and distributions with the *e-posterior*, which provides valid post-hoc frequentist uncertainty assessments irrespective of prior correctness: if the prior is chosen badly, e-intervals get wide rather than wrong, suggesting e-posterior credible intervals as a safer alternative to Bayesian credible intervals. The resulting *quasi-conditional paradigm* addresses foundational and practical issues in statistical inference.