
Week 40: Kendall’s Tau

Let’s do another measure of correlation, shall we? Kendall’s tau!

When Would You Use It?
Kendall’s tau is a nonparametric test used to determine whether, in the population represented by a sample, the correlation between subjects’ scores on two variables is equal to some value other than zero.

What Type of Data?
Kendall’s tau requires both variables to be ordinal data.

Test Assumptions
No assumptions listed.

Test Process
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that in the population, the correlation between the ranks of subjects on variable X and variable Y is equal to zero. The alternative hypothesis claims otherwise (that the correlation is less than, greater than, or simply not equal to zero).

Step 2: Compute the test statistic, a z-value. To do so, Kendall’s tau must be computed first. The following steps must be employed:

  1. Arrange the data by the ranking on the X variable (smallest to largest ranking).
  2. Begin with the first Y rank, the one paired with the first (smallest) X rank, and compare it with each Y rank that follows it in the ordered list. For every later Y rank that is smaller than it, note a “D” for discordant. For every later Y rank that is larger than it, note a “C” for concordant.
  3. Once this is done for all comparisons for the first Y rank, move on to the second Y rank and repeat steps 2 and 3 until all ranks are considered.
  4. For each Y rank, sum the number of Cs and the number of Ds. The sum of all the Cs across all rankings gives you nC, the total number of C entries, and the sum of all the Ds across all rankings gives you nD, the total number of D entries.
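The counting procedure above can be sketched in Python. This is a minimal illustration that assumes there are no tied ranks; the function name and the five-subject ranks are hypothetical and are not the data from this post.

```python
def count_concordant_discordant(x_ranks, y_ranks):
    """Count concordant (C) and discordant (D) pairs, assuming no tied ranks."""
    # Step 1: order the Y ranks by their X rank (smallest X rank first).
    y_sorted = [y for _, y in sorted(zip(x_ranks, y_ranks))]
    n_c = 0
    n_d = 0
    # Steps 2-3: compare each Y rank with every Y rank that follows it.
    for i, y_i in enumerate(y_sorted):
        for y_j in y_sorted[i + 1:]:
            if y_j > y_i:
                n_c += 1  # later Y rank is larger: concordant
            elif y_j < y_i:
                n_d += 1  # later Y rank is smaller: discordant
    # Step 4: the running totals are nC and nD.
    return n_c, n_d


# Hypothetical ranks for five subjects (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(count_concordant_discordant(x, y))  # (8, 2)
```

With no ties, nC + nD should always equal n(n − 1)/2, the total number of pairs, which makes a handy check on a hand count.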

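Compute Kendall’s tau as follows:

$$\tau = \frac{n_C - n_D}{n(n-1)/2}$$

where nC and nD are as defined above and n is the total number of data points in the sample. The denominator is the total number of pairs that can be compared, so τ ranges from −1 to +1.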

The test statistic itself is calculated, using the large-sample normal approximation, as:

$$z = \frac{3\tau\sqrt{n(n-1)}}{\sqrt{2(2n+5)}}$$

Step 3: Obtain the p-value associated with the calculated z-score. The p-value indicates the probability of observing a correlation as extreme or more extreme than the observed sample correlation, under the assumption that the null hypothesis is true.
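Steps 2 and 3 can be sketched together in Python. This is a rough illustration that assumes untied ranks, τ = (nC − nD) / [n(n − 1)/2], and the large-sample normal approximation z = 3τ√(n(n − 1)) / √(2(2n + 5)); the function name is hypothetical.

```python
import math


def kendall_tau_z_test(n_c, n_d, n):
    """Return (tau, z, two-sided p-value) from the pair counts and sample size.

    Assumes no tied ranks and the large-sample normal approximation.
    """
    # Kendall's tau: net concordance divided by the total number of pairs.
    tau = (n_c - n_d) / (n * (n - 1) / 2)
    # Normal-approximation test statistic.
    z = 3 * tau * math.sqrt(n * (n - 1)) / math.sqrt(2 * (2 * n + 5))
    # Two-sided p-value under the standard normal: P(|Z| >= |z|).
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return tau, z, p_two_sided


# Hypothetical counts for a five-subject sample (illustration only).
tau, z, p = kendall_tau_z_test(n_c=8, n_d=2, n=5)
print(f"tau = {tau:.2f}, z = {z:.2f}, p = {p:.3f}")
```

For a one-sided alternative, halve the two-sided p-value when the sign of z matches the direction of the alternative hypothesis.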

Step 4: Determine the conclusion. If the p-value is larger than the prespecified α-level, fail to reject the null hypothesis (that is, retain the claim that the correlation between the ranks in the population is zero). If the p-value is smaller than the prespecified α-level, reject the null hypothesis in favor of the alternative.

Example
I want to see if there’s a correlation between my ranking of 12 of my songs from 2009 and the ranking of those same 12 songs in 2016. Let X be the ranking in 2009 and Y be the ranking in 2016. Let’s use α = 0.05. I actually have no idea if this will end up a positive or negative correlation, so let’s go with the most general hypotheses:

H0: τ = 0
Ha: τ ≠ 0

The table below shows the rankings of the 12 songs for 2009 and 2016, as well as the method to obtain the sums of Cs and sums of Ds.

nC = 42 and nD = 24

Kendall’s tau and the test statistic:
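As a worked sketch, assuming the standard definition of τ for untied ranks and the large-sample normal approximation for z, plugging in nC = 42, nD = 24, and n = 12 gives roughly the following (values rounded; Φ is the standard normal cumulative distribution function, and the p-value is two-sided to match Ha):

```latex
\tau = \frac{n_C - n_D}{n(n-1)/2}
     = \frac{42 - 24}{12 \cdot 11 / 2}
     = \frac{18}{66} \approx 0.27

z = \frac{3\tau\sqrt{n(n-1)}}{\sqrt{2(2n+5)}}
  = \frac{3(0.2727)\sqrt{132}}{\sqrt{58}} \approx 1.23

p = 2\,[1 - \Phi(1.23)] \approx 0.217
```

Note that nC + nD = 66 = 12(11)/2, so every pair of songs was counted exactly once.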

Since our calculated p-value is larger than our α-level, we fail to reject H0 and conclude that the correlation between the ranks in the population is not significantly different from zero.
