Week 40: Kendall’s Tau


Let’s do another measure of correlation, shall we? Kendall’s tau!

When Would You Use It?
Kendall’s tau is a nonparametric test used to determine, in the population represented by a sample, if the correlation between subject’s scores on two variables equal to a value other than zero.

What Type of Data?
Kendall’s tau requires both variables to be ordinal data.

Test Assumptions
No assumptions listed.

Test Process
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that in the population, the correlation between the ranks of subjects on variable X and variable Y is equal to zero. The alternative hypothesis claims otherwise (that the correlation is less than, greater than, or simply not equal to zero).

Step 2: Compute the test statistic, a z-value. To do so, Kendall’s tau must be computed first. The following steps must be employed:

  1. Arrange the data by the ranking on the X variable (smallest to largest ranking).
  2. Begin with the first Y rank corresponding to the first (smallest) X rank. If the Y rank for the smallest X ranking is larger than any Y ranks corresponding to any of the other X ranks, note it with a “D” for discordant. If the Y rank for the smallest X ranking is smaller than any Y ranks corresponding to any other X ranks, note it with a “C” for concordant.
  3. Once this is done for all comparisons for the first Y rank, move on to the second Y rank and repeat steps 2 and 3 until all ranks are considered.
  4. For each Y rank, sum the number of Cs and the number of Ds. The sum of all the Cs across all rankings gives you nC, the total number of C entries, and the sum of all the Ds across all rankings gives you nD, the total number of D entries.

Compute Kendall’s tau as follows:

where nC and nD are as defined above and n is the total number of data points in the sample.

The test statistic itself is calculated as:

Step 3: Obtain the p-value associated with the calculated z-score. The p-value indicates the probability of observing a correlation as extreme or more extreme than the observed sample correlation, under the assumption that the null hypothesis is true.

Step 4: Determine the conclusion. If the p-value is larger than the prespecified α-level, fail to reject the null hypothesis (that is, retain the claim that the correlation between the ranks in the population is zero). If the p-value is smaller than the prespecified α-level, reject the null hypothesis in favor of the alternative.

Example
I want to see if there’s a correlation between my ranking of 12 of my songs from 2009 and the ranking of those 12 same songs in 2016. Let X be the ranking in 2009 and Y be the ranking in 2016. Let’s use α = 0.05. I actually have no idea if this will end up a positive or negative correlation, so let’s go with the most general hypotheses:

H0: τ = 0
Ha: τ ≠ 0

The table below shows the rankings of the 12 songs for 2009 and 2016, as well as the method to obtain the sums of Cs and sums of Ds.

nC = 42 and nD = 24

Kendall’s tau and the test statistic:

Since our calculated p-value is larger than our α-level, we fail to reject H0 and conclude that the correlation between the ranks in the population is not significantly different from zero.

Advertisements

What sayest thou? Speak!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: