So today is kind of a big deal for me. Ten years ago, on May 1, 2006, I gave in to peer pressure and started this blog with the intent of posting every day. Because it was 2006, this blog was started on MySpace and was basically just me rambling about my daily activities.
Since then, however, my blog has grown up a little bit, gotten its own domain name, and has become a little archive of my life since 2006, with a post for every day of every year since the blog was started.
So I figured it’s appropriate to acknowledge that today is my 10-year anniversary of blogging! Yup, believe it or not, there’s been a decade of this nonsense.
This coming week will be all about blog statistics and looking back on the past 10 years to see what’s changed for me (and what’s stayed the same). So that’s something to look forward to!
As always, I appreciate everyone who reads this (subscribers and passers-by), and I hope I’ve kept you entertained enough so that you’ll want to keep reading long into the next decade of my blatherings.
Today we’re going to talk about another nonparametric test: the Siegel-Tukey test for equal variability!
When Would You Use It?
The Siegel-Tukey test for equal variability is a nonparametric test used to determine if two independent samples represent two populations with different variances.
What Type of Data?
The Siegel-Tukey test for equal variability requires ordinal data.
- Each sample is a simple random sample from the population it represents.
- The two samples are independent.
- The underlying distributions of the samples have equal medians.
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that the two population variances are equal. The alternative hypothesis claims otherwise (either one variance is greater than the other, or they are simply not equal).
[Note that from here on out, the calculations are exactly the same as for the Mann-Whitney U test. The only thing that differs is how the data are ranked.]
Step 2: Compute the test statistics: U1 and U2. Since this is best done with data, please see the example shown below to see how this is done.
Step 3: Obtain the critical value. Unlike most of the tests we’ve done so far, you don’t get a precise p-value when computing the results here. Rather, you calculate your U values and then compare them to a specific value. This is done using a table (such as the one here). Find the number at the intersection of your sample sizes for both samples at the specified alpha-level. Compare this value with the smaller of your U1 and U2 values.
Step 4: Determine the conclusion. If your test statistic is equal to or less than the table value, reject the null hypothesis. If your test statistic is greater than the table value, fail to reject the null (that is, claim that the variances are equal in the population).
Today’s data come from my 2012 music selection. I wanted to see if the variability in play counts for two genres, pop and electronic, was the same. I chose these two because I think most of my favorite songs are of one of the two genres. To keep things relatively simple for the example, I sampled n = 8 electronic songs and n = 8 pop songs. Set α = 0.05.
$H_0: \sigma^2_{\text{pop}} = \sigma^2_{\text{electronic}}$
$H_a: \sigma^2_{\text{pop}} \neq \sigma^2_{\text{electronic}}$
The following table shows several different columns of information. I will explain the columns below.
Column 1 is the genre of each song.
Column 2 is the play count for each song, ordered from least to greatest.
Column 3 is the rank of each play count. In order to obtain the ranks for this test, start by giving a rank of “1” to the lowest play count value. Then a rank of “2” to the highest play count value, a rank of “3” to the second highest play count value, a rank of “4” to the second lowest play count value, etc. (that is, assign ranks by alternating from one extreme to the other).
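The alternating ranking described above can be sketched in a few lines of Python. This is just an illustration, not the way the table was actually built: the function name `siegel_tukey_ranks` is made up, and the sketch assumes there are no tied play counts (ties would need extra handling).

```python
def siegel_tukey_ranks(values):
    """Return Siegel-Tukey ranks for `values`, in the original order.

    Assumes no ties. Rank 1 goes to the lowest value, ranks 2 and 3 to the
    two highest, ranks 4 and 5 to the next two lowest, and so on.
    """
    # Indices of the observations, ordered from lowest to highest value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    lo, hi = 0, len(order) - 1
    rank = 1
    take_low = True  # the very first rank goes to the low end
    burst = 1        # one rank first, then two at a time per end
    while lo <= hi:
        for _ in range(burst):
            if lo > hi:
                break
            if take_low:
                ranks[order[lo]] = rank
                lo += 1
            else:
                ranks[order[hi]] = rank
                hi -= 1
            rank += 1
        take_low = not take_low
        burst = 2
    return ranks


# Made-up play counts, already in increasing order.
example_ranks = siegel_tukey_ranks([10, 20, 30, 40, 50, 60, 70, 80])
print(example_ranks)
```

Values in the middle of the ordered list end up with the largest ranks, so a sample that is more spread out collects the small ranks from both extremes.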
To compute U1 and U2, use the following equations:
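Since the calculations are the same as for the Mann-Whitney U test, the equations should be the standard Mann-Whitney ones. Written out, with $\sum R_1$ and $\sum R_2$ as the sums of the ranks in each sample (that notation is my assumption here):

```latex
U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - \sum R_1
\qquad
U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - \sum R_2
```

A handy check is that $U_1 + U_2 = n_1 n_2$, which is 64 when both samples have 8 songs.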
The test statistic itself is the smaller of the above values; in this case, they’re both the same, so we get U = 32. In the table, the critical value for n1 = 8 and n2 = 8 and α = 0.05 for a two-tailed test is 13. Since U > 13, we fail to reject the null and retain the claim that the population variances are equal.
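If you'd rather let a computer do the arithmetic, here is a minimal Python sketch of Steps 2 through 4, assuming the standard Mann-Whitney formulas for U1 and U2. The function name and the two rank lists are hypothetical; the ranks are not the real ones from my table, just one split of the ranks 1 to 16 in which each genre's ranks sum to 68, which is what U1 = U2 = 32 implies.

```python
def mann_whitney_u(ranks1, ranks2):
    """Return (U1, U2) from the ranks assigned to each sample."""
    n1, n2 = len(ranks1), len(ranks2)
    u1 = n1 * n2 + n1 * (n1 + 1) // 2 - sum(ranks1)
    u2 = n1 * n2 + n2 * (n2 + 1) // 2 - sum(ranks2)
    return u1, u2


# Hypothetical Siegel-Tukey ranks (NOT the real data); each list sums to 68.
pop_ranks = [1, 4, 5, 8, 9, 12, 13, 16]
electronic_ranks = [2, 3, 6, 7, 10, 11, 14, 15]

u1, u2 = mann_whitney_u(pop_ranks, electronic_ranks)
u = min(u1, u2)           # the test statistic is the smaller U

critical_value = 13       # table value for n1 = n2 = 8, alpha = 0.05, two-tailed
reject_null = u <= critical_value

print(u1, u2, reject_null)
```

With these ranks the sketch gives U1 = U2 = 32 and does not reject the null, matching the conclusion above.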
Example in R
No R example this week; most of this is easy enough to do by hand for a small-ish sample.
It’s time for some Watership Down!
Have I read this before: Yes, the summer after high school. However, I read it while I was recovering from having my wisdom teeth yanked out, so I was kind of loopy and don’t remember much.
Review: This is a fantastic book, yo. In case you’ve never read it (or know nothing about it), it’s about rabbits. I love the way Adams writes the rabbits. It’s very natural—you get their behaviors and attitudes and fear. And it’s basically impossible to not sympathize with them as they go through their troubles. If you’ve never read this, read it. If you’ve read it, read it again.
Favorite part: This is going to sound weird, but my favorite part is the epilogue. I love the way it’s written and I love how it gives us closure with Hazel. I think it’s very beautiful and I remember it making me cry when I first read this. But I was on drugs then.
A while back (2011), I decided to try taking an online practice version of the AP Statistics exam. I think it went okay (I unfortunately didn’t record my exact final score on the multiple choice, ‘cause I’m a loser), but I decided to try it again because, you know, I’ve actually got a better background for it now.
Well, not the same practice test. I found a newer one (2010 I think?) on which there were fewer multiple choice questions but the questions were decidedly more difficult. This time I ended up getting 16/18 correct (I shouldn’t have missed one of the ones I did; I just completely blanked on how to do it, even though I’ve taught it in lab like 20 times).
Anyway. If you want to give it a shot, it’s located here!
Hello, fellow humans!
Have some Tumblr crap, ‘cause I’m like in the weeks between “school panic” and “relaxing with my mom” right now, so not much is going on.
Except “I’m in the weeks between “school panic” and “relaxing with my mom” right now” panic.
I miss Ray!
I am seriously addicted to this song. This is the fastest a song has been promoted to a Five Star in a long time.
Today I had to go invigilate a MATH 277 final as part of my TA requirements (we each have to invigilate/proctor two final exams; sometimes we get ones we’ve actually been TAs for and sometimes we don’t. This was a case of the latter). It turns out that MATH 277 is University of Calgary’s version of MATH 275, or multivariate calculus. The test involved about 20 or so questions.
Our job as TAs, apart from making everybody sign in on the little attendance sheet, was mainly to just walk around in order to discourage cheating and to help anybody out who raised their hand.
So let me just quickly set the scene for you: a large gym full of 250+ students, a 2-hour exam, and lots and lots of calculus.
I bet you can guess what I was thinking about.
I was thinking about Leibniz!
I was wondering, as I walked down the aisles of seats, watching students write the elongated “s” for integration and the dx/dy (or variations of that) for differentiation, what Leibniz would think if he saw a roomful of people, in 2016, still using some of his original symbols. Like, how ridiculous is that? Calculus has been studied, expanded upon, and extended to a ton of different fields/uses since it was first developed, but we’re still using some of Leibniz’ original symbols.
And what would he think about calculus being taught as basically standard curriculum at universities? What would he think about the tons of different uses of calculus today?
I know I kind of talked about this in a previous post, but I actually think about this quite a bit. Especially today.
Yay calculus! Yay Leibniz!
Holy crap, I did waaaaay better on that STAT 723 final than I thought I did. He must have curved it.
But anyway, that was the last final I’ll ever have to take for the last class I’ll ever have to take.
I say that, but then again, I’ve said that several times since 2008, so…
Hello, reader(s)! So this is going to be the TMI post of all TMI posts (at least until I post the thing I plan on posting sometime this summer), so if you don’t want to read about my nasty-ass body, feel free to skip this.
I’m very close to hitting 1,000 walking miles for the year (it should happen either the last week of April or the first week of May, if all goes well). Since I take approximately two days off from walking per week (one of the weekend days, and my “I’m a slug, deal with it” Fridays), that means I walk approximately 11 to 12 miles five days a week. Some weeks it’s more, some weeks it’s less, but on average, I’d say approximately 12 per day. Here’s some of the weird shit that this walking regimen has done to my body:
- Two of my toenails are completely black. Like, always. I think there’s blood under the nail from them being pushed against the insides of my shoes. This is, however, most likely due to the fact that I had to revert to Kinvara 5s when I murdered without remorse (er, wore out) my Kinvara 6s. I wore an 8 in the 6s and went back to a 7.5 in the 5s; I think my feet had gotten used to a little more room.
- My big toes’ toenails, on the other hand, are shiny enough to be used to signal a passing plane from a remote island. I think my socks are polishing them.
- My calf muscles are super visible when I flex them. It makes me very happy.
- My knee-high socks do not appreciate my calves, though.
- After my longer walks (15+ miles) or walks when I’m carrying groceries home (12 miles, 20+ pounds or so in my backpack?) I’ve started getting hematuria. Apparently it’s fairly common with strenuous and/or long bouts of exercise (and particularly for runners) and not too big of a deal. Which is good, ‘cause it scared the hell out of me at first. It also goes away within about 5 hours.
- I have the most godawful tan pattern going on, especially on my arms. It’s mainly on my left arm, ‘cause I wear my extra hairbands and Fitbit on that arm, but all of them have been worn for different amounts of time, so my skin’s in like 30 different shades on my left arm. It’s weird. I’m going to look awful at my wedding; I should get a long-sleeved dress.
You know you’re bad at sleeping like a normal human being when, after the first two questions in a sleep survey, you get this:
HELLO, FOOL MACHINES!
So I’ve been really into posting links to recipes over the past year or so, right? Well, last night I decided to just make a giant Word document containing the ingredients/instructions for my favorite of said posted recipes, just in case they ever get taken down.
I shall share this document with you here!
The original sources for the recipes are linked, and the pictures belong to the corresponding sources. Recipes that do not have sources (or pictures) are recipes I learned from my mom.
[This is coming out on a Monday ’cause I was super busy yesterday and had no time to make this/post it.]
Today’s test is a non-parametric test for two samples: the Kolmogorov-Smirnov test for two independent samples!
When Would You Use It?
The Kolmogorov-Smirnov test for two independent samples is a nonparametric test used to determine if two independent samples represent two different populations.
What Type of Data?
The Kolmogorov-Smirnov test for two independent samples requires ordinal data.
- All of the observations in the samples are randomly selected and independent of one another.
- The scale of the measurement is ordinal.
Step 1: Formulate the null and alternative hypotheses. The null hypothesis claims that the distribution underlying the population for one sample is the same as the distribution underlying the population for the other sample. The alternative claims that the distributions are not the same.
Step 2: Compute the test statistic. For this test, the test statistic is the greatest vertical distance, at any point, between the cumulative probability distribution constructed from the first sample and the one constructed from the second sample. I will refer you to the example shown below to show how these calculations are done in a specific testing situation.
Step 3: Obtain the critical value. Unlike most of the tests we’ve done so far, you don’t get a precise p-value when computing the results here. Rather, you calculate your test statistic and then compare it to a specific value. This is done using a table. Find the number at the intersection of your sample sizes for your specified alpha-level. Compare this value with your test statistic.
Step 4: Determine the conclusion. If your test statistic is equal to or larger than the table value, reject the null hypothesis (that is, claim that the two samples come from populations with different distributions). If your test statistic is less than the table value, fail to reject the null.
For this test’s example, I want to use some of my music data from 2012. I know that I tend to listen to music from the “electronic” genre and from the “dance” genre fairly equally, so I want to determine, based on play count, if I can say that the population distributions for these genres are similar. To keep things simple, I will use $n_{\text{electronic}} = 6$ and $n_{\text{dance}} = 6$.
$H_0: F_{\text{electronic}}(X) = F_{\text{dance}}(X)$ for all values of X
$H_a: F_{\text{electronic}}(X) \neq F_{\text{dance}}(X)$ for at least one value of X
For the computations section of this test, I will display a table of values for the data and describe what the values are and how the test statistic is obtained.
Column A and Column C, together, show the ranked values of the play counts for electronic (Column A) and dance (Column C).
Column B represents the cumulative proportion in the sample for each play count in Column A. For example, for the play count = 7, the cumulative proportion of that value is just 1/6, since there is no smaller value in Column A.
Column D represents the same thing as column B, except for Column C.
Column E is Column B – Column D.
The test statistic is the largest absolute value in Column E. Here, the test statistic is .5. This value is compared to the critical value at α = 0.05, n1 = 6, n2 = 6, which is .667. Since our test statistic is smaller than our critical value, we fail to reject the null and claim that the distributions of play counts for electronic and dance are similar.
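The column-by-column calculation can also be sketched in Python. Everything named here is an assumption for illustration: the function name is made up, and the play counts are invented (not my real data), chosen only so the largest gap comes out to .5 like in the example. The sketch compares the two cumulative proportions at every observed play count, which is the usual way to get the greatest vertical distance.

```python
from fractions import Fraction


def ks_two_sample_statistic(sample1, sample2):
    """Largest absolute gap between the two cumulative proportions."""
    n1, n2 = len(sample1), len(sample2)
    d = Fraction(0)
    # Check the gap at every play count that appears in either sample.
    for x in sorted(set(sample1) | set(sample2)):
        cum1 = Fraction(sum(v <= x for v in sample1), n1)  # like Column B
        cum2 = Fraction(sum(v <= x for v in sample2), n2)  # like Column D
        d = max(d, abs(cum1 - cum2))                        # like |Column E|
    return d


# Invented play counts (NOT the real data from the table).
electronic = [7, 9, 12, 15, 20, 24]
dance = [10, 16, 18, 21, 25, 30]

d = ks_two_sample_statistic(electronic, dance)

critical_value = 0.667    # table value quoted above for n1 = n2 = 6, alpha = 0.05
reject_null = float(d) >= critical_value

print(d, reject_null)
```

Fractions are used instead of decimals so that a gap like 4/6 - 1/6 comes out as exactly 1/2 rather than a rounding-error neighbor of it.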
Example in R
No R example this week, as this is pretty easy to do by hand, especially with having to rank things.
I AM DEAD from that freaking test, man.
Have some of my favorite bookmarks, ‘cause I’m pretty much worthless for anything else today.
- 5 Second Films. Because they’re the best.
- This site gives you letters one by one and you have to make words out of them in a timed environment.
- Dogwood Ceramics. Want clay and glaze and related art tools? This is the place!
- Want to look at recipes? Foodgawker!
- I’ve linked to this before, but if you want to read fiction with a mathematical theme, go here!
- Powder Game. I remember Aaron playing this for HOURS when we first discovered it.
- A list of good redwood hikes, in case you ever get to northern California and want to see some amazing trees.
- Wind map for the US.
UGH, that test was brutal. Like, the problems were all very similar to the homework questions, but they were all similar to the six most difficult homework questions. The ones that required weird-ass tricks that were somewhat unrelated to the material we needed to know for the class.
But whatever, you know? Either I make it through this class or I have to stay an extra year and torment those who don’t want me around.
Those who shall not be named.
It’s the April List and the LAST LIST BEFORE A DECADE OF BLOGGING!
- Since I’ve been in study mode, I’ve been listening to a lot of brown noise. It’s like white noise, but a little bit lower (?) and sounds a lot like the ocean. It really helps me concentrate. Here’s brown noise for 8 hours.
- A few of our practice problems for the exam involve the use of integration. I’m always careful to take the time to draw my integral sign as nicely as possible, ‘cause it’s a freaking awesome symbol and ‘cause of Leibniz.
- (Mostly ‘cause of Leibniz.)
- I did a thing! I don’t want to mention specifically what it was, since I don’t want to jinx it, but I did a thing. Hopefully it will turn out well.
- Pretty jewelry! If I had money (and wore jewelry), I’d get the necklace with the constellation Aquarius on it.
- I want tomorrow to be over already. I’m so done with all this nonsense right now, man, you have no idea.
- Sorry this is so short.
THE TIME FOR DISGUSTING FEET!
I have no freaking idea how my feet get so dirty. I mean, I get that I’m out walking for like 3+ hours, but it’s not like I’m walking on dusty paths. I’m on the sidewalks. Is Calgary really that dusty?
I won’t spam you with my nasty feets this summer, I promise.
Do you know what time of the year it is?
It’s FAKE UI CLASS SCHEDULE TIME!!!
Let’s do it.
HIST 411: Colonial North America (10:30 – 11:20)
MATH 310: Ordinary Differential Equations (11:30 – 12:20)
BIOL 120: Human Anatomy (12:30 – 1:20)
MATH 579: Combinatorics (1:30 – 2:20)
CS 360: Database Systems (12:30 – 1:45)
GEOG 301: Meteorology (2:00 – 3:15)
MUSA 121: Concert Band (4:30 – 5:20)
BIOL 120: Human Anatomy Lab (8:30 – 11:20)
ENGL 582: Techniques of Fiction (5:00 – 7:50)
So here’s something interesting.
I was looking at my old 23andMe results that I got back in like 2012, ‘cause I wanted to show them to Nate. Back when I had first gotten the results back, I was more interested in the medical results (things I have more of a risk of getting, things I am low risk for, etc.). But today, I decided to look in more detail at the Ancestry Composition information. Here’s what I’ve got for my composition:
First off, I thought I was basically 100% European. Which is apparently not the case. I have no idea where that relatively large (in my opinion—remember I thought I was like 100% European) Native American percentage is coming from. Or that tiny bit of West African. Like…have you seen my family?
Also, something I didn’t know: Ashkenazi is a Jewish ethnic division, mainly from Germany, so that’s cool.
So I am mainly European, but not as European as I thought I was.