Today we’re going to look at another nonparametric test: the single-sample runs test.
When Would You Use It?
The runs test is a nonparametric test used in a single-sample situation to determine whether a series of binary events is distributed randomly in the population.
What Type of Data?
The single-sample runs test requires categorical (nominal) data with exactly two possible outcomes.
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that the events in the underlying population (represented by the sample series) are distributed randomly. The alternative hypothesis claims that the events in the underlying population are distributed nonrandomly.
Step 2: Count the number of times each of the two alternatives appears in the sample series (n1 and n2) and the number of runs, r, in the series. A run is a sequence within the series in which one of the alternatives occurs on consecutive trials. For example, the series H H T H T T contains four runs: HH, T, H, and TT.
Step 3: Obtain the critical values. Unlike most of the tests we’ve done so far, you don’t get a precise p-value when you run the test this way. Rather, you calculate your r and then compare it to a lower and an upper limit for your specific n1 and n2 values. This is done using a table (such as the one here*). For the values of n1 and n2 in your sample (labeled on this table as n and m), find the cell where the two values intersect; it contains the two limits.
Step 4: Determine the conclusion. If your r is greater than or equal to the larger number or less than or equal to the smaller number in that cell, you have a statistically significant result, meaning you reject the null (that is, reject the claim that the distribution of the binary events in the population is random). If your r is strictly between the smaller and larger numbers, fail to reject the null.
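The decision rule in Step 4 boils down to a single comparison. Here's a minimal sketch in Python (not R). The function name `runs_test_decision` and the numbers in the usage lines are made up for illustration and do not come from any table; the two limits are assumed to have been read off the table by hand.

```python
def runs_test_decision(r, lower, upper):
    """Apply the table-based decision rule from Step 4.

    lower and upper are the two limits read from the table cell for the
    sample's n1 and n2. This function does not look them up itself.
    """
    # Significant if r is at or beyond either limit
    if r <= lower or r >= upper:
        return "reject H0"
    return "fail to reject H0"


# Hypothetical values, only to exercise the three cases
print(runs_test_decision(5, 5, 12))   # r equals the lower limit
print(runs_test_decision(8, 5, 12))   # r strictly between the limits
print(runs_test_decision(12, 5, 12))  # r equals the upper limit
```

Note that the limits themselves count as significant, which is why the comparisons use `<=` and `>=` rather than strict inequalities.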
For this example, I decided to see if the coin flips from the website Just Flip a Coin were, in fact, random. I “flipped” the coin a total of 30 times and recorded my results as “H” for heads and “T” for tails. This series is recorded below.
H0: The distribution of heads and tails in the population is random.
Ha: The distribution of heads and tails in the population is nonrandom.
T H TT HHHHHH T H TT HH TT HH T HHH T HHHH T
n1 = 19 (heads)
n2 = 11 (tails)
r = 15
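If you'd rather not count by hand, the tallies above can be double-checked with a few lines of Python (a sketch, not R; the function name `summarize_runs` is just an illustrative choice). It assumes the series is typed in as a string of H and T, and uses `itertools.groupby` to collapse each stretch of identical consecutive flips into one run.

```python
from itertools import groupby

# Series copied from the post; the spaces only separate the runs visually
series = "T H TT HHHHHH T H TT HH TT HH T HHH T HHHH T".replace(" ", "")


def summarize_runs(seq):
    """Return (number of heads, number of tails, number of runs)."""
    n_heads = seq.count("H")
    n_tails = seq.count("T")
    # groupby yields one group per stretch of identical consecutive symbols
    n_runs = sum(1 for _ in groupby(seq))
    return n_heads, n_tails, n_runs


n1, n2, r = summarize_runs(series)
print(n1, n2, r)  # 19 11 15
```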
According to the table, my lower bound is 9 and my upper bound is 21. Since my r is in between these two values, I do not have a statistically significant result. I fail to reject H0, so I have no evidence that the distribution of heads and tails in the population is anything other than random.
*Note that this table, like many others, only has a maximum of 20 for either n or m, and is constructed with α = 0.05 for the two-sided test and α = 0.025 for the one-sided test.
Example in R
No R example this week, since it’s probably more work to do this in R than it is to do it by hand, haha.