Tag Archives: statistician

Take a Gauss

Every mathematician needs a site like this.


  • Gauss didn’t discover the normal distribution, nature conformed to his will.
  • Gauss can recite all of pi – backwards.
  • Gauss doesn’t understand stochastic processes because he can predict random numbers.
  • Gauss is an exclusive member of an empty set.
  • Gauss can calculate the determinant of a non-square matrix.
  • When Gauss was thirsty, he used the Banach–Tarski paradox to get more orange juice.

Said the statistician with the small sample size to the statistician with the large one: “I’m ‘n’-vious!”

POP QUIZ GO: What Englishman was it that Anders Hald called “a genius who almost single-handedly created the foundations for modern statistical science”?

It’s the same guy who Richard Dawkins labeled “the greatest biologist since Darwin.”


An evolutionary biologist, geneticist, and statistician, Fisher lived from 1890 to 1962. He had plans to enter the British Army upon his graduation from the University of Cambridge (where he studied mathematics and got interested in eugenics) in 1912, but he had horrible eyesight and failed the vision test. So what did he do instead? He worked as a statistician in the City of London, among other things. He also started to write for the Eugenics Review, which only increased his interest in stat methods.

In 1918, his paper The Correlation Between Relatives on the Supposition of Mendelian Inheritance was published. In it he introduced the term “variance” and laid the groundwork for the analysis of variance (yup, ANOVA!). A year later, after taking a job with an agricultural station, he began to gather numerous sets of data, both large and small, which allowed him to develop methods of experimental design as well as small sample statistics. Throughout his professional career, he continued to develop ANOVA, promoted ML estimation, described the z-distribution (now used in the form of the F-distribution), and pretty much set up the foundation for the field of population genetics. He also (and I didn’t know this until I read more about him) opposed Bayesian statistics quite vehemently.

Anyway. Thought he deserved a bit of a mention today, since he died on this day in 1962.

TWSB: Happy birthday, Sir Ronald Fisher!

Happy birthday to one of the greatest statisticians ever: Sir Ronald Fisher!

Fisher (1890 – 1962) was an English statistician/biologist/geneticist who did a few cool things…you know…like CREATING FREAKING ANALYSIS OF VARIANCE.

Yes, that’s right, kids. Fisher’s the guy that came up with ANOVA. In fact, he’s known as the father of modern statistics. Apart from ANOVA, he’s also responsible for the term “null hypothesis”, the F-distribution (F for “Fisher!”), and maximum likelihood.

Seriously. This guy was like a bundle of statistical genius. What would it be like to be the dude who popularized maximum likelihood? “Oh hey guys, I’ve got this idea for parameter estimation in a statistical model. All you do is select the values of the parameters in the model such that the likelihood function is maximized. No big deal or anything, it just maximizes the probability of the observed data under the distribution.”
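To make that a little less hand-wavy, here is a minimal sketch of the idea in Python. The setup is my own toy example, not anything from Fisher: a coin with unknown probability of heads `p`, and made-up data of 7 heads in 10 flips. The code tries a grid of candidate values for `p` and keeps the one with the largest log-likelihood.

```python
import math

def log_likelihood(p, heads, flips):
    """Log-likelihood of p for `heads` heads in `flips` independent coin flips.

    The binomial coefficient is left out because it does not depend on p,
    so it cannot change which p gives the maximum.
    """
    return heads * math.log(p) + (flips - heads) * math.log(1 - p)

# Made-up data for illustration: 7 heads in 10 flips.
heads, flips = 7, 10

# Candidate values of p from 0.001 to 0.999 (0 and 1 are skipped: log(0) is undefined).
grid = [i / 1000 for i in range(1, 1000)]

# The maximum likelihood estimate is the candidate with the largest log-likelihood.
p_hat = max(grid, key=lambda p: log_likelihood(p, heads, flips))

print("grid-search estimate:", p_hat)
print("sample proportion:   ", heads / flips)
```

For this simple model the grid search lands on the sample proportion of heads, which is the known closed-form answer. Real models rarely work out that neatly, which is why an optimizer usually does the searching instead of a grid.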

I dealt with ML quite a bit for my thesis and I’m still kinda shaky with it.

I would love to get into the heads of these incredibly smart individuals who come up with this stuff. Very, very cool.

TWSB: The Plane Truth

Happy New Year, everyone!

I just realized that I only moved once in 2012. A good year indeed.
Actually, 2012 wasn’t too bad. At least the latter half. I think Vancouver Karma is finally reversing itself. I really hope 2013 is as good or better.

Anyway. To the blog!

Abraham Wald was a mathematician born in Austria-Hungary (present-day Romania) in 1902. He studied mathematics and statistics and worked for the Statistical Research Group (SRG) during WWII. Wald’s job was to estimate the vulnerability of aircraft returning from battle.

To do so, he made note of the location of bullet holes on a ton of returning Allied aircraft to determine the best places to reinforce the planes to promote survival. He made several diagrams showing where the planes were most bullet-ridden (which was pretty much everywhere but the cockpit and the tail).

When Wald showed these diagrams to his supervisors, they concluded something a lot of us would probably expect: that the best way to increase the planes’ rate of survival was to reinforce the areas that were the most damaged.

But Wald came to a different conclusion. He stated that rather than adding reinforcing armor to the bullet-ridden areas of the planes, the plane manufacturers should instead reinforce the areas that were bullet free. His reasoning behind this? The planes survived the battles because the cockpit and tail were undamaged. That is, the parts most vital for the planes’ survival were untouched by bullets. The planes that had been damaged to the point of being destroyed, of course, would not be able to make it back and be observed by Wald and his team. Since only planes whose cockpits and tails were undamaged were returning to be sampled, Wald concluded that it was likely planes sustaining damage to the cockpits and tails were the ones that were not surviving the battles—thus, those two parts of the airplane were the most vital to the survival of the plane overall. The wings/body/etc. were sustaining damage, but the planes were able to return even after sustaining this damage. Wald concluded, therefore, that extra armor should be added to the components of the plane that had to remain undamaged for the planes to survive.

Wald’s observation helped keep the SRG from drawing conclusions under the influence of “survival bias” (more commonly called survivorship bias): sampling only the aircraft that survived the battles and leaving out the planes that were damaged beyond repair and never returned to be included in the sample.
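The effect is easy to reproduce with a toy simulation. To be clear, this is only a sketch under assumptions I made up, and it is not the math from Wald’s actual work: every plane takes three hits spread evenly over four sections, and a hit to the cockpit or tail is much more likely to bring the plane down than a hit to the wings or body. All of the numbers below are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

SECTIONS = ["cockpit", "tail", "wings", "body"]

# Hypothetical chance that a single hit to a section brings the plane down.
LOSS_CHANCE = {"cockpit": 0.8, "tail": 0.8, "wings": 0.1, "body": 0.1}

all_hits = {s: 0 for s in SECTIONS}       # hits on every plane that flew
observed_hits = {s: 0 for s in SECTIONS}  # hits on planes that made it back

for _ in range(10_000):
    # Each plane takes three hits, each equally likely to land on any section.
    hits = [random.choice(SECTIONS) for _ in range(3)]
    # The plane returns only if it survives every one of its hits.
    survived = all(random.random() > LOSS_CHANCE[s] for s in hits)
    for s in hits:
        all_hits[s] += 1
        if survived:
            observed_hits[s] += 1

for s in SECTIONS:
    print(f"{s:8s} hits on all planes: {all_hits[s]:6d}   hits on returned planes: {observed_hits[s]:6d}")
```

Across all planes the hits are spread about evenly, but on the planes that come back the cockpit and tail look nearly untouched. Anyone who only sees the returning planes would get exactly the picture in Wald’s diagrams.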

How cool is that??

There is a paper by Mangel and Samaniego discussing Wald’s findings and the math behind them. It gets pretty technical pretty quickly, but if anyone’s interested, here you go!


Watch this right now

This gentleman is my new favorite living human being.


I’d add that linear algebra is an important middle step as well. A lot of stuff that I really enjoy in the field of statistics is stuff I wouldn’t understand nearly as well had I not taken linear algebra.

Ideal universe:
Basic statistics (like the stuff I’m teaching) –> Linear algebra –> more advanced statistics (FA, PCA, SEM) –> calculus (or taught concurrently with the previous) –> mathematical statistics

In my personal experience, I was able to get to SEM-level without calculus. I took calculus, but I never really used it in the context of stats.

But now that I’m taking it again, even at the basic level of 170, I’m seeing how this will apply to statistics (especially mathematical stats). And that’s super exciting.

So I don’t think this idea of “stats before calc” discounts the importance of calculus. Rather, I think it focuses on this idea of “practical versus theoretical” understanding. Statistics, especially very basic statistics, is something I think everyone should know. It’s practical, it’s applicable in every field. Calculus gives you a stronger understanding of WHY it’s so practical and applicable (at least in my opinion).

So yeah. Dr. Benjamin was also on the Colbert Report some time ago. I’ll have to find that vid.

Haha, speaking of the Report, I’m going to go watch the Maurice Sendak interviews again.

Meet a Statistician: Charles Spearman

So in my researching for my English essay, I read quite a bit about the Royal Society (or, the Royal Society of London for Improving Natural Knowledge as its full title stands).

Well, today I found out that one of the main developers of my favorite statistical test EVER (factor analysis) was also a part of the Society for a while: Charles Spearman!

So let’s check him out, shall we?

Charles Spearman (1863 – 1945) resigned from 15 years of service in the British Army to pursue a PhD in experimental psychology. By the time he obtained his degree he had already published a paper on the factor analysis of human intelligence. This paper impressed many of his fellow psychologists at the time, mainly because of Spearman’s rigorous application of mathematical techniques and models (factor analysis!) to the analysis of the human mind.

In fact, his work was so impressive that it earned him a place in the Royal Society in 1924. Spearman continued his work, focusing mainly on developing new statistical techniques that could be applied to, among other things, psychological constructs and concepts. He was especially influenced by Galton (developer of correlation) and worked to create a nonparametric version of Pearson’s method of calculating correlation.*
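For the curious, the nonparametric version is what we now call Spearman’s rank correlation: replace each value with its rank, then compute the ordinary Pearson correlation on the ranks. Here is a rough sketch in plain Python. The function names and the data are made up for illustration, and tied values get their average rank.

```python
def ranks(values):
    """Rank values from 1 upward, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the whole run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # sorted positions i..j share one rank
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result

def pearson(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    spread_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    return cov / (spread_x * spread_y)

def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Made-up data: y always increases with x, but not in a straight line.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 100]

print("Pearson: ", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because the ranks only care about order, the rank correlation treats this relationship as perfectly monotone, while Pearson’s version gets dragged down by the one huge value.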

But probably his greatest contribution had to be the part he played in the development of factor analysis. Even today, it’s probably one of the most used statistical techniques in the realm of the social sciences, particularly in psychology.

So there you go! A little bit about one of the founders of the super awesome factor analysis. Cool, huh?


*Actually, this ended up as another “two smart dudes can’t get along feud” between himself and Pearson, the latter not appreciating the nonparametric adaptation of his technique. What do they put in that Royal Society water, anyway?

HOT DAMN, Tukey Sandwiches!

No, that is not a misspelling.

NNNNH I have such a freakish urge to cook.
Hence these.

They’re a tribute to John Tukey, American statistician and source of horrible, horrible lunch meat puns. Yeah, I know, I made the joke two days ago and I haven’t been able to go 15 minutes without thinking, “exactly what would a Tukey sandwich entail?”
Ingredients, process, and general apology to Mr. Tukey as follows (I didn’t really measure stuff as I made this, so fair warning).

You will need:

  • Bread. A small loaf works perfectly. You’ll need six pieces.
  • Butter. About a tablespoon will work fine.
  • Cheese. Colby Jack is preferred. Make sure it’s in a block so you can slice it.
  • Turkey. Clean pieces of breast meat are best/neatest.
  • Bacon. Three long slices will suffice.
  • Mayo. 2-3 tablespoons.
  • Cinnamon. A teaspoon sounds about right.
  • Corn bread (or muffin) dry mix. Three or four tablespoons will be fine.
  • Oil. Just a bit, maybe a teaspoon.
  • Mrs. Dash.
  • Water.

What you need to do to make this awesomeness happen:

1. Cut the crust off the six pieces of bread so that you have nice little squares.

2. Take three of said squares and coat one side of each lightly in butter.

3. Take the other three slices and toast them lightly, just enough to get them a little brown and provide them with a bit of structural integrity.

4. Mix the teaspoon of cinnamon with the mayonnaise. Add more cinnamon if you’d like. Mine looked pinkish when I was done. Once the three toasting pieces are done toasting, spread the mayo/cinnamon mix on one side of each of the three slices and set aside.

5. Place the buttered slices butter-side down onto a frying pan and turn on to low heat. Cut three medium-thin slices of Colby Jack cheese and put a square onto each piece of bread as they heat up (note: they have up here in Canada land these cute little rectangular cuboids of cheese. They’re smaller in area than the bread, but I think it works fine that way). Sprinkle the cheese and bread with Mrs. Dash and let it cook until the underside of the bread is golden brown and/or the cheese is gooey.

6. Cook bacon (I’m lazy, so mine was precooked and all I did was heat it up in the microwave). Tear the strips in half and position them in an “X” position on the mayo/cinnamon bread.

7. Now it gets fun. Take the corn bread dry mix and mix it with the teaspoon of oil and some water. I really didn’t measure this, but you’ll want a consistency similar to that of the mayo/cinnamon. Don’t make it too moist, but don’t make it dry enough to crumble.

8. Lay out the turkey meat and spread it with the corn meal mix. It looks gross, I know, but it tastes good.

9. Fold the turkey into nice little square packets and place each packet onto the bacon and mayo/cinnamon bread.

10. Complete the sandwich by putting the cheese/Mrs. Dash bread on top of the turkey and securing with a pretty frill. In my opinion, these taste equally good hot and cool, so if you made a super mess out of your kitchen like I did, go ahead and clean before you try them.


So why does this qualify as a tribute to Mr. Tukey again?
– There are six pieces of bread because he came up with the Six Pack Test.
– The sandwiches are square because he came up with the boxplot.
– They’re small because he coined the word “bit.”
– They’ve got turkey in them because DUH.
– Cinnamon is brown. He went to Brown University. I’m funny.
– He was born in Massachusetts. Corn muffins are the state’s official muffin.
– The wild turkey is also Massachusetts’ state game bird, which is hilarious.
– He made significant contributions to jackknife estimation, hence Colby Jack cheese. It’s a stretch, but so is this entire thing.
– And I just assumed he liked bacon.

So yeah. This is why I should not be allowed to have free time.

You know what I’m really tempted to do?


ETS, the company I want to work for when I’m done with all this school business, has a position open right now that would be absolutely PERFECT for me. It’s like my freaking dream job, listen to this:

Arrange for and perform routine statistical analysis and data-processing tasks using GENASYS, user-oriented computer packages, and statistical software packages. Create datasets, enter computer job control information, code parameters, and submit programs for execution. Draft standard statistical reports and assist in the preparation of complex reports. Prepare and check critical information for score reporting, tables, and figures for statistical procedures, documentation, and reports. Update textual material for such documents. Update and run routine statistical analyses using SAS. Perform a wide variety of statistical calculations (e.g., mean, percentiles, standard error of measurement, and reliability estimates).


So I applied just on a whim…If I get the job, I’m outta here. Screw grad school for a few years. If it’s meant to be, it’s meant to be.


I don’t think they’ll take me though, ‘cause I think they’ll think I’m too far away to relocate “ASAP.”

Top Ten Reasons to Become a Statistician


10. Deviation is considered normal.
9. We feel complete and sufficient.
8. We are mean lovers.
7. Statisticians do it discretely and continuously.
6. We are right 95% of the time.
5. We can safely comment on someone’s posterior distribution.
4. We may not be normal but we are transformable.
3. We never have to say we are certain.
2. We are honestly significantly different.
1. No one wants our jobs.