Week 39: Spearman’s Rank-Order Correlation Coefficient

Let’s talk about Spearman’s rank-order correlation coefficient today!

When Would You Use It?
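Spearman’s rank-order correlation coefficient is a nonparametric test used to determine, in the population, if the correlation between values on two variables is some value other than zero. More specifically, it is used to determine if there is a significant monotonic relationship between the two variables, that is, whether one variable tends to consistently increase (or consistently decrease) as the other increases. Because it works on ranks, the relationship does not need to be linear.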

What Type of Data?
Spearman’s rank-order correlation coefficient requires both variables to be ordinal data.

Test Assumptions
No assumptions listed.

Test Process
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that in the population, the correlation between the scores on variable X and variable Y is equal to zero. The alternative hypothesis claims otherwise (that the correlation is less than, greater than, or simply not equal to zero).

Step 2: Compute the test statistic, a t value. To do so, Spearman’s rank-order correlation coefficient, r_s, must be computed first. The following steps must be employed:

Rank both variables in order from smallest to largest, assigning a value of “1” to the smallest value for each variable, a “2” for the second-smallest value for each variable, etc.
For each pair of observations (that is, for each paired value of X and Y), compute d_i, the difference between the ranked values of X_i and Y_i.
Compute d_i², the squared difference of the ranks of X_i and Y_i.
Compute r_s as follows:

r_s = 1 – (6 Σ d_i²) / (n(n² – 1))

where the sum runs over all n pairs of observations.
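
The ranking steps above can be sketched in a few lines of Python. This is only an illustration, not necessarily how the numbers in this post were produced. It assumes the usual shortcut formula r_s = 1 – 6 Σ d_i² / (n(n² – 1)), which is exact only when there are no tied values, and it assumes that tied values share the average of the ranks they would otherwise occupy. The function names and the small data set are hypothetical.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest; ties share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman_rs(x, y):
    """Shortcut formula: r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))  # d_i = R_x - R_y
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical toy data (not the songs from the example in this post).
x = [10, 20, 30, 40, 50]
y = [1, 3, 2, 5, 4]
print(spearman_rs(x, y))
```

For this toy data the ranks of X are 1 through 5, the ranks of Y are 1, 3, 2, 5, 4, and Σ d_i² = 4, so r_s comes out to roughly 0.8.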

The test statistic itself is calculated as:

t = r_s √(n – 2) / √(1 – r_s²)

which is a t-value with degrees of freedom n – 2. Here, r_s is the Spearman rank-order correlation coefficient and n is the sample size.
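
As a rough sketch of that calculation, assuming the standard form t = r_s √(n – 2) / √(1 – r_s²), the conversion from r_s to t might look like this in Python. The function name spearman_t is hypothetical.

```python
import math


def spearman_t(r_s, n):
    """Convert a Spearman coefficient r_s into a t statistic with n - 2 df."""
    if n < 3:
        raise ValueError("need at least 3 pairs so that n - 2 > 0")
    if abs(r_s) >= 1:
        raise ValueError("t is undefined when r_s is exactly +1 or -1")
    return r_s * math.sqrt(n - 2) / math.sqrt(1 - r_s ** 2)


# Hypothetical values: r_s = 0.8 computed from n = 5 pairs.
print(spearman_t(0.8, 5))
```

With these made-up inputs the result is about 2.31 on 3 degrees of freedom. Note that the statistic blows up as r_s approaches ±1, which is why the sketch refuses those inputs.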

Step 3: Obtain the p-value associated with the calculated t-value. The p-value indicates the probability of observing a correlation as extreme or more extreme than the observed sample correlation, under the assumption that the null hypothesis is true.
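
In practice the p-value comes from a t table or statistical software. Purely as an illustration, here is a minimal standard-library sketch that approximates the upper-tail probability by numerically integrating the Student t density with Simpson’s rule. The function names and the number of integration steps are my own assumptions, and the approach is only meant for modest degrees of freedom (math.gamma overflows for very large ones).

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """Approximate P(T >= t) using Simpson's rule on [0, |t|]; steps must be even."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3   # area under the density between 0 and |t|
    tail = 0.5 - area      # the density is symmetric around 0
    return tail if t >= 0 else 1 - tail


# One-tailed (greater than) p-value for a hypothetical t = 1.86 with 8 df.
print(upper_tail_p(1.86, 8))
```

For a "less than" alternative the p-value would be 1 minus this quantity, and for a two-tailed alternative it would be twice the upper-tail probability of |t|.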

Step 4: Determine the conclusion. If the p-value is larger than the prespecified α-level, fail to reject the null hypothesis (that is, retain the claim that the correlation in the population is zero). If the p-value is smaller than the prespecified α-level, reject the null hypothesis in favor of the alternative.

Example
Let’s look at a random selection of 10 of my songs and see if there is a significant correlation between the number of stars a song has (its “rating”) and the number of times it has been played (its “playcount”). Let the X variable be the song’s rating and the Y variable be its playcount. I suspect a positive correlation between rating and playcount (or else my rating system is highly flawed!). Here, n = 10, and let α = 0.05.

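H₀: ρ_s = 0
H_a: ρ_s > 0

Here ρ_s denotes the population value of Spearman’s correlation, which the sample coefficient r_s estimates.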

The following table shows the raw data and the rankings needed to compute r_s.

Here R_x and R_y represent the ranks of X and Y, respectively, d represents the difference R_x – R_y, and d² is the squared difference.

Since our calculated p-value is smaller than our α-level, we reject H₀ and conclude that the correlation in the population is significantly greater than zero.

Eigenblogger
