Today we’re going to talk about our first nonparametric test: the Wilcoxon signed-ranks test!
When Would You Use It?
The Wilcoxon signed-ranks test is a nonparametric test used in a single-sample situation to determine whether the sample originates from a population with a specific median θ.
What Type of Data?
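The Wilcoxon signed-ranks test works on ranks, so it is classed as a test for ordinal data, but those ranks are derived from interval or ratio scores. The test assumes the following: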
- The sample is a simple random sample from the population of interest.
- The original scores obtained for each of the individuals in the sample are in the format of interval or ratio data.
- The underlying population distribution is symmetrical.
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that the median in the population is equal to a specific value; the alternative hypothesis claims otherwise (the population median is greater than, less than, or not equal to the value specified in the null hypothesis).
Step 2: Compute the test statistic. This is easiest to show with data, so the calculation is walked through in the example below.
Step 3: Obtain the critical value. Unlike most of the tests we’ve done so far, you don’t get a precise p-value when computing the results here. Rather, you calculate your test statistic and then compare it to a specific value. This is done using a table (such as the one here). Find the number at the intersection of your sample size n and the specified alpha-level. Compare this value with your test statistic.
Step 4: Determine the conclusion. If your test statistic is less than or equal to the table value, reject the null hypothesis. If your test statistic is greater than the table value, fail to reject the null (that is, the data do not give sufficient evidence that the population median differs from the value specified in the null hypothesis).
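The decision rule in Step 4 runs in the opposite direction from many tests: a small statistic, not a large one, leads to rejection. Here is a minimal sketch of that rule in Python. The function name `reject_null` is made up for illustration, and the critical value is assumed to have been looked up in a table beforehand.

```python
def reject_null(test_statistic, critical_value):
    """Return True when the null hypothesis is rejected.

    For the table-based Wilcoxon signed-ranks test, the null is rejected
    when the statistic is less than or equal to the tabled critical value.
    """
    return test_statistic <= critical_value


# Values from the worked example later in this post: T = 37.5, critical value 13.
print(reject_null(37.5, 13))  # False, so we fail to reject the null
```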
The data for this example come from a little analysis of my Facebook friends’ birthdays I did a while ago. In that analysis, I recorded the birth months for the 97 friends who had their birthdays visible. All I did at the time was see how many birthdays there were per month. Now, however, I want to see if the median number of birthdays per month is equal to a certain value, say, 8. The observations for the test are the 12 monthly counts, so n = 12. Set α = 0.05.
H0: θ = 8
Ha: θ ≠ 8
The following table shows several different columns of information. I will explain the columns below.
Column 1 is just the name of the month.
Column 2 is the number of observed birthdays in my sample for that corresponding month.
Column 3 is calculated by taking the number of observed birthdays minus the hypothesized median, which is θ = 8 in this case.
Column 4 is the absolute value of the difference in Column 3.
Column 5 is the rank of the value in Column 4. Ranking is done as follows: rank the values of Column 4 from smallest to largest. If there are ties, sum the rank places that the tied values occupy, and then divide that sum by the number of tied values. For example, there are five months that have a |D| = 1. That means that this tied value takes up the rank places of the 1st, 2nd, 3rd, 4th, and 5th observations, the places these months would have filled had they not all been tied at 1. Thus, I sum these rank places and divide by 5 to get (1+2+3+4+5)/5 = 3, then assign all five of these months the rank of 3.
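The tie-handling rule for Column 5 can be sketched in a few lines of Python. This is only an illustration: the function name `average_ranks` and the list of absolute differences are made up, and the list is not the actual birthday data.

```python
def average_ranks(values):
    """Rank values from smallest to largest; tied values share the mean
    of the rank places they occupy."""
    # Indices of the values, ordered from smallest to largest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # Sorted positions start..end hold the 1-based rank places start+1..end+1.
        mean_rank = (start + 1 + end + 1) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Hypothetical |D| values: five observations tied at the smallest value.
abs_diffs = [1, 1, 1, 1, 1, 2, 4]
print(average_ranks(abs_diffs))  # [3.0, 3.0, 3.0, 3.0, 3.0, 6.0, 7.0]
```

The five tied values take rank places 1 through 5 and each gets (1+2+3+4+5)/5 = 3, which matches the hand calculation.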
Column 6 contains the same values as Column 5, but each rank carries the sign of the corresponding difference in Column 3.
The next step is to sum all the positive ranks in Column 6 and sum all the negative ranks in Column 6. Doing so, we get:
The test statistic itself is the smaller of the two sums in absolute value; in this case, we get T = 37.5. In the table, the critical value for n = 12 and α = 0.05 for a two-tailed test is 13. Since T > 13, we fail to reject the null; the data do not give sufficient evidence that the population median differs from 8.
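For anyone who would rather check the arithmetic by machine, here is a rough Python sketch of the whole calculation, from differences to decision. Several things here are assumptions rather than part of the analysis above: the monthly counts are made up (they are not my actual birthday data), the helper names `midranks` and `signed_rank_sums` are hypothetical, and observations equal to the hypothesized median are dropped, which is a common convention that the steps above do not cover. The critical value of 13 is the table value quoted above for n = 12 and a two-tailed α = 0.05.

```python
def midranks(values):
    """Rank values from smallest to largest, averaging the ranks of ties."""
    sorted_vals = sorted(values)
    rank_of = {}
    for v in set(values):
        first = sorted_vals.index(v) + 1   # first 1-based rank place taken by v
        count = sorted_vals.count(v)       # how many rank places v takes up
        rank_of[v] = first + (count - 1) / 2
    return [rank_of[v] for v in values]


def signed_rank_sums(scores, hypothesized_median):
    """Return (sum of positive ranks, sum of negative ranks as a magnitude, n)."""
    # Assumption: scores equal to the hypothesized median are dropped.
    diffs = [x - hypothesized_median for x in scores if x != hypothesized_median]
    ranks = midranks([abs(d) for d in diffs])
    sum_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    sum_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return sum_pos, sum_neg, len(diffs)


# Made-up monthly counts for illustration only.
counts = [9, 7, 11, 6, 9, 12, 7, 10, 4, 9, 13, 5]
sum_pos, sum_neg, n = signed_rank_sums(counts, 8)

# Sanity check: the ranks 1..n always total n(n+1)/2.
assert sum_pos + sum_neg == n * (n + 1) / 2

T = min(sum_pos, sum_neg)
critical_value = 13  # table value for n = 12, two-tailed alpha = 0.05
print(sum_pos, sum_neg, T)   # 46.5 31.5 31.5
print(T <= critical_value)   # False, so we fail to reject the null
```

The sanity check is handy for hand calculations too. With n = 12, the ranks must total 12 × 13 / 2 = 78, so a smaller sum of 37.5 goes with a larger sum of 40.5 in absolute value.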
Example in R
No R example this week; most of this is easy enough to do by hand for a small-ish sample.