Week 4: The Single-Sample Test for Evaluating Population Skewness


Last week we did a test for population variance, which represents the second moment about the mean. Today we’re going to go one moment further and do a single-sample test to evaluate population skewness (which represents the third moment about the mean)!

When Would You Use It?
The single-sample test for population skewness is a parametric test used in a single-sample situation to determine whether a sample originates from a population that is symmetrical (that is, not skewed).

What Type of Data?
The test for skewness requires interval or ratio data.

Test Assumptions
None listed.

Test Process
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that the skewness parameter γ in the population is equal to 0, which corresponds to symmetry; the alternative hypothesis claims otherwise (the population skewness parameter is greater than, less than, or not equal to the value specified in the null hypothesis, suggesting there is some skew).

Step 2: Compute the test statistic value, a z-score. Getting to the test statistic takes several intermediate calculations, which are as follows:

[Image: formulas for the test statistic calculations]
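Read off the R code at the end of this post, the calculations can be written out as follows. This is a transcription of that code, not the original formula image. The symbols follow the R variable names, with two exceptions that are my own assumptions about notation: the sample standard deviation is written s (the code calls it s3), and the quantity the code calls b1 is written as √b1.

```latex
\begin{align*}
m_3 &= \frac{n \sum_{i=1}^{n} (X_i - \bar{X})^3}{(n-1)(n-2)} \\
s &= \sqrt{\frac{\sum X_i^2 - \frac{\left(\sum X_i\right)^2}{n}}{n-1}} \\
g_1 &= \frac{m_3}{s^3} \\
\sqrt{b_1} &= \frac{(n-2)\, g_1}{\sqrt{n(n-1)}} \\
A &= \sqrt{b_1}\, \sqrt{\frac{(n+1)(n+3)}{6(n-2)}} \\
B &= \frac{3\,(n^2 + 27n - 70)(n+1)(n+3)}{(n-2)(n+5)(n+7)(n+9)} \\
C &= \sqrt{2(B-1)} - 1 \\
D &= \sqrt{C} \\
E &= \frac{1}{\sqrt{\ln D}} \\
F &= \frac{A}{\sqrt{2/(C-1)}} \\
z &= E \, \ln\!\left(F + \sqrt{F^2 + 1}\right)
\end{align*}
```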

Step 3: Obtain the p-value associated with the calculated z-score. The p-value indicates the probability of observing a skew as extreme or more extreme than the observed sample skew, under the assumption that the null hypothesis is true.

Step 4: Determine the conclusion. If the p-value is larger than the prespecified α-level, fail to reject the null hypothesis (that is, retain the claim that there is symmetry (no skew) in the population). If the p-value is smaller than the prespecified α-level, reject the null hypothesis in favor of the alternative.
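Steps 3 and 4 can be sketched in R roughly like this. The helper name `skew_pvalue`, its `alternative` argument, and the z-score of 2.10 are all made up for illustration; the only real machinery is `pnorm`, the standard normal distribution function.

```r
# Hypothetical helper: p-value for a z statistic under each alternative.
skew_pvalue <- function(z, alternative = c("two.sided", "greater", "less")) {
  alternative <- match.arg(alternative)
  switch(alternative,
    two.sided = 2 * pnorm(-abs(z)),   # Ha: gamma != 0
    greater   = 1 - pnorm(z),         # Ha: gamma > 0 (right skew)
    less      = pnorm(z)              # Ha: gamma < 0 (left skew)
  )
}

alpha <- 0.05
z     <- 2.10                         # made-up z-score, for illustration only
pval  <- skew_pvalue(z, "greater")

# Step 4: compare the p-value with the prespecified alpha-level.
decision <- if (pval < alpha) "reject H0" else "fail to reject H0"
cat("p-value:", round(pval, 4), "->", decision, "\n")
```

Which tail to use depends entirely on the alternative hypothesis chosen in Step 1, so it should be fixed before looking at the sign of the sample skew.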

Example
As in the last test, the data for this example come from my n = 365 song downloads from 2010. I want to create a hypothesis test regarding the skew of the distribution of song lengths (in seconds). Based on the following histogram, I’m going to say that this distribution has a right skew.

[Figure: histogram of song lengths (in seconds)]

Thus,

H0: γ = 0
Ha: γ > 0

Set α = 0.05.

Computations:

[Image: computations for the example]

Since our p-value is basically zero, it is smaller than our α-level, so we reject H0 and claim that the population is indeed positively skewed (γ > 0).

Example in R

dat = read.table('clipboard',header=TRUE)[[1]] #'dat' holds the imported raw data
                                       #([[1]] assumes song lengths are in the first column)
hist(dat)                              #creates histogram of data
n = length(dat)                        #sample size (365 in this example)
m3 = (n*sum((dat-mean(dat))^3))/((n-1)*(n-2))
s3 = sqrt((sum(dat^2)-(((sum(dat))^2)/n))/(n-1)) #sample standard deviation, same as sd(dat)
g1 = m3/(s3)^3
b1 = ((n-2)*g1)/(sqrt(n*(n-1)))
A = b1*sqrt(((n+1)*(n+3))/(6*(n-2)))
B = (3*((n^2)+(27*n)-70)*(n+1)*(n+3))/((n-2)*(n+5)*(n+7)*(n+9))
C = sqrt(2*(B-1))-1
D = sqrt(C)
E = 1/sqrt((log(D)))                   #log() is the natural log
F = A/(sqrt(2/(C-1)))                  #note: this masks R's shorthand F for FALSE
z = E*log(F+sqrt((F^2)+1))             #test statistic
pval = (1-pnorm(z))                    #upper-tail p-value, matching Ha: gamma > 0
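Since the song data aren't reproduced here, one way to sanity-check the calculations is to wrap them in a function and try it on simulated data. Treat this as a sketch: the function name `skew_z` and the simulated samples are my own assumptions for illustration, `sd()` stands in for the hand-computed standard deviation, and the last intermediate value is named `Fv` so it doesn't mask R's `F`.

```r
# Hypothetical wrapper around the calculations above; returns the z statistic.
skew_z <- function(x) {
  n  <- length(x)
  m3 <- n * sum((x - mean(x))^3) / ((n - 1) * (n - 2))
  g1 <- m3 / sd(x)^3
  b1 <- (n - 2) * g1 / sqrt(n * (n - 1))
  A  <- b1 * sqrt((n + 1) * (n + 3) / (6 * (n - 2)))
  B  <- 3 * (n^2 + 27 * n - 70) * (n + 1) * (n + 3) /
        ((n - 2) * (n + 5) * (n + 7) * (n + 9))
  C  <- sqrt(2 * (B - 1)) - 1
  D  <- sqrt(C)
  E  <- 1 / sqrt(log(D))
  Fv <- A / sqrt(2 / (C - 1))
  E * log(Fv + sqrt(Fv^2 + 1))
}

set.seed(1)
x_sym  <- rnorm(365)   # simulated sample from a symmetric population
x_skew <- rexp(365)    # simulated sample from a right-skewed population

z_sym  <- skew_z(x_sym)
z_skew <- skew_z(x_skew)
cat("symmetric sample:    z =", round(z_sym, 3),  " p =", 1 - pnorm(z_sym),  "\n")
cat("right-skewed sample: z =", round(z_skew, 3), " p =", 1 - pnorm(z_skew), "\n")
```

A perfectly symmetric input such as `-4:4` gives z = 0, and flipping the sign of the data flips the sign of z, which is a handy check that nothing got mistyped along the way.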

 
