# Said the statistician with the small sample size to the statistician with the large one: “I’m ‘n’-vious!”

**POP QUIZ GO:** Which Englishman did Anders Hald call “a genius who almost single-handedly created the foundations for modern statistical science”?

It’s the same guy whom Richard Dawkins labeled “the greatest biologist since Darwin.”

Give up? It’s **SIR RONALD FISHER!**

An evolutionary biologist, geneticist, and statistician, Fisher lived from 1890 to 1962. He had planned to enter the British Army upon graduating from the University of Cambridge (where he studied mathematics) in 1912, but his eyesight was terrible and he failed the vision test. So what did he do instead? He worked as a statistician in the City of London, among other things. He also started writing for the Eugenics Review, which only increased his interest in stat methods.

In 1918, his paper “The Correlation Between Relatives on the Supposition of Mendelian Inheritance” was published, in which he introduced the method of analysis of variance (yup, ANOVA!). A year later, after taking a job at an agricultural research station (Rothamsted), he began to gather numerous sets of data—both large and small—which allowed him to develop methods of experimental design as well as small-sample statistics. Throughout his professional career, he continued to develop ANOVA, promoted maximum likelihood estimation, described the z-distribution (now used in the form of the F-distribution), and pretty much laid the foundation for the field of population genetics. He also (and I didn’t know this until I read more about him) opposed Bayesian statistics quite vehemently.

Anyway. Thought he deserved a bit of a mention today, since he died on this day in 1962.

# TWSB: Happy birthday, Sir Ronald Fisher!

Happy birthday to one of the greatest statisticians ever: **Sir Ronald Fisher!**

Fisher (1890 – 1962) was an English statistician/biologist/geneticist who did a few cool things…you know…like **CREATING FREAKING ANALYSIS OF VARIANCE.**

Yes, that’s right, kids. Fisher’s the guy who came up with ANOVA. In fact, he’s known as the father of modern statistics. Apart from ANOVA, he also coined the term “null hypothesis”, gave us the distribution now called the F-distribution (the F is for “Fisher!”), and championed maximum likelihood.

Seriously. This guy was like a bundle of statistical genius. What would it be like to be the dude who popularized maximum likelihood? “Oh hey guys, I’ve got this idea for parameter estimation in a statistical model. All you do is select the values of the parameters in the model such that the likelihood function is maximized. No big deal or anything, it just maximizes the probability of the observed data under the distribution.”
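To see the idea in action, here’s a minimal sketch (my own toy example, not Fisher’s notation): for a normal distribution the parameter values that maximize the log-likelihood have closed forms—the sample mean and the (biased) sample standard deviation—and any other choice of parameters scores a lower log-likelihood on the same data.

```python
import math
import random

random.seed(42)
data = [random.gauss(5.0, 2.0) for _ in range(1000)]  # simulated sample

def normal_log_likelihood(data, mu, sigma):
    # Sum of log densities of N(mu, sigma^2) evaluated at each observation.
    return sum(-0.5 * math.log(2 * math.pi * sigma**2)
               - (x - mu)**2 / (2 * sigma**2) for x in data)

# For the normal distribution, the maximum likelihood estimates are exact:
mu_hat = sum(data) / len(data)
sigma_hat = math.sqrt(sum((x - mu_hat)**2 for x in data) / len(data))

# Any nearby parameter choice yields a lower log-likelihood:
best = normal_log_likelihood(data, mu_hat, sigma_hat)
assert best >= normal_log_likelihood(data, mu_hat + 0.5, sigma_hat)
assert best >= normal_log_likelihood(data, mu_hat, sigma_hat * 1.2)
```

The estimates land close to the true values (5 and 2) used to simulate the data, which is exactly the “maximizes the probability of the observed data” property described above.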

I dealt with ML quite a bit for my thesis and I’m still kinda shaky with it.

I would love to get into the heads of these incredibly smart individuals who come up with this stuff. Very, very cool.

# Good lord…

I think Sir Ronald Fisher is the statistics equivalent to Leibniz in my mind. Check it:

*…this paper laid the foundation for what came to be known as biometrical genetics, and introduced the very important methodology of the analysis of variance, which was a considerable advance over the correlation methods used previously.*

The freaking ANOVA, people.

*In addition to “analysis of variance”, Fisher invented the technique of maximum likelihood and originated the concepts of sufficiency, ancillarity, Fisher’s linear discriminator and Fisher information. His 1924 article “On a distribution yielding the error functions of several well known statistics” presented Karl Pearson’s chi-squared and Student’s t in the same framework as the Gaussian distribution, and his own “analysis of variance” distribution Z (more commonly used today in the form of the F distribution). These contributions easily made him a major figure in 20th century statistics.*

Do you know what that z is? Fisher actually used the letter z for two things: the z-distribution in the quote above is his analysis-of-variance distribution (essentially half the log of the variance ratio F), while his separate z-transformation is what’s used for setting confidence intervals around correlation estimates. Since a correlation is bounded by -1 and 1, the sampling distribution of any correlation other than zero is skewed, and thus requires asymmetrical confidence intervals. Fisher’s z-transformation is a non-linear transformation of correlations that MAKES THEM APPROXIMATELY NORMAL, so you can set symmetrical confidence intervals on the transformed scale, and then you can TRANSFORM THOSE LIMITS BACK and get confidence intervals on the original correlation scale.
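The procedure above can be sketched in a few lines (the function name and the 95% cutoff of 1.96 are my choices, not anything canonical): transform r with arctanh, build a symmetric interval using the standard error 1/sqrt(n - 3), then map the limits back with tanh.

```python
import math

def fisher_ci(r, n, z_crit=1.96):
    # Fisher z-transformation: z = arctanh(r) is approximately normal
    # with standard error 1 / sqrt(n - 3).
    z = math.atanh(r)
    se = 1 / math.sqrt(n - 3)
    lo, hi = z - z_crit * se, z + z_crit * se
    # Transform the symmetric limits back to the correlation scale.
    return math.tanh(lo), math.tanh(hi)

lo, hi = fisher_ci(0.8, 50)
```

Note that the resulting interval is asymmetric around r = 0.8—the upper limit sits closer to 0.8 than the lower one does, because the correlation scale is squeezed near 1. That’s exactly the skewness the transformation is designed to handle.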

Seriously, that’s pretty freaking amazing. This guy rocked. Go look him up on Wikipedia and see the massive list of “see also” pages.