Matt Lapa's Blog

Category: Statistics

Sequential Probability Ratio Test Part 3

Wald’s Identity and the Expected Stopping Time

This is the third post in our ongoing series on the Sequential Probability Ratio Test (SPRT). In the first post in this series we introduced the idea of sequential hypothesis testing and then gave an introduction to the SPRT. In the second post we started to explore the…

April 28, 2025
Sequential Probability Ratio Test Part 2

How to prove that the stopping time is finite

In the first post in this series we introduced the concept of sequential hypothesis testing and then gave an introduction to the simplest sequential test, the Sequential Probability Ratio Test (SPRT). That post was intended to provide the basic information that you would need to start…

November 3, 2024
The Sequential Probability Ratio Test

This will be the first post in a series of posts on the topic of sequential hypothesis testing. Specifically, these posts will focus on the Sequential Probability Ratio Test (SPRT), which is one of the simplest and most well-known examples of a sequential test. When conducting a standard statistical test, we first need to decide…

July 8, 2024
The Markov chain approach to CUSUM

Note: the code used to do the calculations in this post can be found in the “change-point-detection” repository on my GitHub page. In the last post we introduced the problem of online change point detection and the CUSUM method for solving that problem. In this post we’ll dive deeper into the math you need…

November 20, 2023
Online change point detection and CUSUM

Note: the code used to generate the figures in this post can be found in the “change-point-detection” repository on my GitHub page. In this post we’ll start to look at change point detection, which is the problem of detecting a sudden change in a parameter that characterizes some ongoing process. There are actually two…

October 2, 2023
Multiple hypothesis testing part 3: how to prove that the Benjamini-Hochberg method works

In our last post we introduced the approach to the Multiple Comparisons Problem based on control of the false discovery rate (FDR). As we discussed there, the FDR is, roughly speaking, equal to the proportion of false positives among the null hypotheses that we decide to reject when testing multiple hypotheses at the same time.…

June 21, 2023
Multiple hypothesis testing part 2: the false discovery rate and the Benjamini-Hochberg method

In the first post on this blog we introduced the Multiple Comparisons Problem (MCP), which is the increased risk of false positives (a.k.a. type 1 errors) that we face when testing multiple hypotheses at the same time. In that post we mainly discussed the idea of solving this problem by controlling the family-wise error rate…

May 23, 2023
Using data to bound the probability of a rare event

Note: the code used to generate the numbers and figures in this post can be found in the “rare-event-probability” repository on my GitHub page. In this post we’ll continue exploring the theme of the last post, which was about the probability that a new data point will differ significantly from a set of data…

May 9, 2023
Chebyshev’s inequality with sample mean and sample variance

In this post we’ll look at a very interesting fundamental result in statistics that deals with the following situation. Suppose we are studying a system that is producing data according to some unknown process, and we have already observed \(n\) data points. We are about to observe the next data point, and we’d like to…

April 25, 2023
A/B testing with small samples

Note: the code used to generate the figures in this post can be found in the “AB-testing-small-samples” repository on my GitHub page. In this post we look at the problem of A/B testing with small sample sizes. This is a tricky situation for several reasons. First, the statistical test that is commonly used to…

April 11, 2023