Although I currently work as a statistician, my original training was in mathematics. In many mathematical fields there is a result that is so profound that it earns the name "The Fundamental Theorem of [Topic Area]." A fundamental theorem is a deep (often surprising) result that connects two or more seemingly unrelated mathematical ideas.

It is interesting that statistical textbooks do not usually highlight a "fundamental theorem of statistics." In this article I briefly and informally discuss some of my favorite fundamental theorems in mathematics and cast my vote for the fundamental theorem of statistics.

The fundamental theorem of arithmetic

The fundamental theorem of arithmetic connects the natural numbers with primes. The theorem states that every integer greater than one can be represented uniquely as a product of primes.

This theorem connects something ordinary and common (the natural numbers) with something rare and unusual (primes). It is trivial to enumerate the natural numbers, but each natural number is "built" from prime numbers, which defy enumeration. The natural numbers are regularly spaced, but the gap between consecutive prime numbers is extremely variable. If p is a prime number, sometimes p+2 is also prime (the so-called twin primes), but sometimes there is a huge gap before the next prime.
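
To make the contrast concrete, here is a minimal Python sketch that factors an integer by trial division and lists the gaps between consecutive primes below 100. The function names `prime_factors` and `primes_up_to` are my own illustrative choices, and trial division is only practical for small numbers.

```python
def prime_factors(n):
    """Return the prime factors of n > 1 in increasing order, with repetition."""
    if n < 2:
        raise ValueError("n must be an integer greater than one")
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # divide out each prime as often as it occurs
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever remains is itself prime
        factors.append(n)
    return factors


def primes_up_to(limit):
    """Return all primes <= limit by using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for i in range(2, int(limit ** 0.5) + 1):
        if is_prime[i]:
            for j in range(i * i, limit + 1, i):
                is_prime[j] = False
    return [i for i, flag in enumerate(is_prime) if flag]


print(prime_factors(360))      # [2, 2, 2, 3, 3, 5]

primes = primes_up_to(100)
gaps = [q - p for p, q in zip(primes, primes[1:])]
print(gaps)                    # irregular gaps between consecutive primes
```

The natural numbers advance by one at every step, whereas the printed gaps jump around with no obvious pattern.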

The fundamental theorem of algebra

The fundamental theorem of algebra connects polynomials with their roots (or zeros). Along the way it informs us that the real numbers are not sufficient for solving algebraic equations, a fact known to every child who has pondered the solution to the equation x^2 = -1. The fundamental theorem of algebra tells us that we need complex numbers to be able to find all roots. The theorem states that every nonconstant polynomial of degree n has exactly n roots in the complex number system, provided that repeated roots are counted according to their multiplicity. Like the fundamental theorem of arithmetic, this is an "existence" theorem: it tells you the roots are there, but doesn't help you to find them.
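
For degree 2 the roots can be written down explicitly, so the claim is easy to check. The following Python sketch uses the quadratic formula with complex arithmetic; the helper name `quadratic_roots` is hypothetical. It illustrates the theorem for one small case and is not a general root finder.

```python
import cmath


def quadratic_roots(a, b, c):
    """Return the two complex roots of a*x^2 + b*x + c = 0, where a != 0."""
    if a == 0:
        raise ValueError("the leading coefficient must be nonzero")
    disc = cmath.sqrt(b * b - 4 * a * c)   # complex square root, defined for negative input
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))


# x^2 + 1 = 0 has no real solution but has two complex roots.
print(quadratic_roots(1, 0, 1))

# x^2 - 2x + 1 = 0 has the root 1 twice; both copies count toward n = 2.
print(quadratic_roots(1, -2, 1))
```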

The fundamental theorem of calculus

The fundamental theorem of calculus (FTC) connects derivatives and integrals. Derivatives tell us about the rate at which something changes; integrals tell us how to accumulate some quantity. That these should be related is not obvious, but the FTC says that the rate of change for a certain integral is given by the function whose values are being accumulated. Specifically, if f is any continuous function on the interval [a, b], then for every value of x in [a, b] you can compute the following function: F(x) = ∫_a^x f(t) dt. The FTC states that F'(x) = f(x). That is, derivatives and integrals are inverse operations.

Unlike the previous theorems, the fundamental theorem of calculus provides a computational tool. It shows that you can solve integrals by constructing "antiderivatives."
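
You can check the FTC numerically. The sketch below accumulates f(t) = cos(t) from a = 0 with a midpoint rule, differentiates the accumulated function with a central difference, and compares the result with f. The choice of f, the step sizes, and the helper names are all assumptions made for illustration.

```python
import math


def integrate(f, a, x, steps=10_000):
    """Approximate the integral of f from a to x with the composite midpoint rule."""
    h = (x - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def F(x, f=math.cos, a=0.0):
    """Accumulation function F(x) = integral of f from a to x."""
    return integrate(f, a, x)


def derivative(g, x, h=1e-4):
    """Approximate g'(x) with a central difference."""
    return (g(x + h) - g(x - h)) / (2 * h)


x = 1.0
print(derivative(F, x), math.cos(x))   # the two values should nearly agree
print(F(x), math.sin(x) - math.sin(0)) # the antiderivative sin gives the same integral
```

The second line of output reflects the computational side of the theorem: evaluating the antiderivative at the endpoints replaces the summation.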

The fundamental theorem of linear algebra

Not everyone knows about the fundamental theorem of linear algebra, but there is an excellent 1993 article by Gil Strang that describes its importance. For an m x n matrix A, the theorem relates the dimensions of the row space of A (R(A)) and the nullspace of A (N(A)). The result is that dim(R(A)) + dim(N(A)) = n.
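
The dimension count can be verified on a small example. The following Python sketch row-reduces a 3 x 4 matrix with exact fractions, takes the number of pivots as dim(R(A)), and builds one nullspace vector per free column. The matrix and the helper names `rref` and `nullspace_basis` are made up for illustration; this is not a numerically robust routine.

```python
from fractions import Fraction


def rref(matrix):
    """Return (reduced row echelon form, list of pivot columns) using exact arithmetic."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_rows, n_cols = len(rows), len(rows[0])
    pivots, r = [], 0
    for c in range(n_cols):
        # Look for a row at or below r that has a nonzero entry in column c.
        pivot_row = next((i for i in range(r, n_rows) if rows[i][c] != 0), None)
        if pivot_row is None:
            continue
        rows[r], rows[pivot_row] = rows[pivot_row], rows[r]
        pivot = rows[r][c]
        rows[r] = [v / pivot for v in rows[r]]
        for i in range(n_rows):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == n_rows:
            break
    return rows, pivots


def nullspace_basis(matrix):
    """Return a basis of N(A), one vector for each free (non-pivot) column."""
    reduced, pivots = rref(matrix)
    n_cols = len(matrix[0])
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -reduced[row_idx][free]
        basis.append(vec)
    return basis


A = [[1, 2, 3, 4],
     [2, 4, 6, 8],   # twice the first row, so it adds nothing to the row space
     [1, 0, 1, 0]]

_, pivots = rref(A)
basis = nullspace_basis(A)
rank, nullity, n = len(pivots), len(basis), len(A[0])
print(rank, nullity, n)   # 2 2 4

# Every basis vector v satisfies A v = 0.
residuals = [[sum(a * v for a, v in zip(row, vec)) for row in A] for vec in basis]
print(residuals)
```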

The theorem also describes four important subspaces and describes the geometry of A and A^T when thought of as linear transformations. The theorem shows that some subspaces are orthogonal to others. (Strang actually combines four theorems into his statement of the Fundamental Theorem, including a theorem that motivates the statistical practice of ordinary least squares.)

The fundamental theorem of statistics

Although most statistical textbooks do not single out a result as THE fundamental theorem of statistics, I can think of two results that could make a claim to the title. These results are based in probability theory, so perhaps they are more aptly named fundamental theorems of probability.

• The Law of Large Numbers (LLN) provides the mathematical basis for understanding random events. The LLN says that if you repeat a trial many times, then the average of the observed values tends to be close to the expected value. (In general, the more trials you run, the better the estimates.) For example, suppose you toss a fair die many times and compute the average of the numbers that appear. The average should converge to 3.5, which is the expected value of the roll because (1+2+3+4+5+6)/6 = 3.5. The same theorem ensures that about one-sixth of the faces are 1s, one-sixth are 2s, and so forth.
• The Central Limit Theorem (CLT) states that the mean of a sample of size n, drawn from a population with finite variance, is approximately normally distributed when n is large. Perhaps more importantly, the CLT provides the mean and the standard deviation of the sampling distribution in terms of the sample size, the population mean μ, and the population variance σ^2. Specifically, the sampling distribution of the mean is approximately normally distributed with mean μ and standard deviation σ/sqrt(n).
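
Both results are easy to see in a simulation. The Python sketch below rolls a fair die 100,000 times for the LLN and, for the CLT, draws 5,000 samples of size n = 30 from an exponential population with μ = 1 and σ = 1. The seed, the sample sizes, and the choice of a skewed population are my own assumptions for illustration.

```python
import math
import random
import statistics

random.seed(12345)   # fixed seed so that the run is repeatable

# Law of Large Numbers: the average of many rolls of a fair die.
rolls = [random.randint(1, 6) for _ in range(100_000)]
die_average = statistics.fmean(rolls)
share_of_ones = rolls.count(1) / len(rolls)
print(die_average)      # should be close to 3.5
print(share_of_ones)    # should be close to 1/6

# Central Limit Theorem: means of samples from a skewed population.
n = 30
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(5_000)
]
mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)
print(mean_of_means)                  # should be close to mu = 1
print(sd_of_means, 1 / math.sqrt(n))  # should be close to sigma/sqrt(n)
```

A histogram of `sample_means` would look roughly bell-shaped even though the exponential population is strongly skewed.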

Of these, the Central Limit Theorem gets my vote for being the Fundamental Theorem of Statistics. The LLN is important, but hardly surprising. It is the basis for frequentist statistics and assures us that large random samples tend to reflect the population. In contrast, the CLT is surprising because the sampling distribution of the mean is approximately normal regardless of the distribution of the original data! As a bonus, the CLT can be used computationally. It forms the basis for many statistical tests because it quantifies the accuracy of a statistical estimate. Lastly, the CLT connects important concepts in statistics: means, variances, sample size, and accuracy of point estimates.

Do you have a favorite "Fundamental Theorem"? Do you marvel at an applied theorem such as the fundamental theorem of linear programming or chuckle at a pseudo-theorem such as the fundamental theorem of software engineering? Share your thoughts in the comments.


Rick Wicklin, PhD, is a distinguished researcher in computational statistics at SAS and is a principal developer of PROC IML and SAS/IML Studio. His areas of expertise include computational statistics, simulation, statistical graphics, and modern methods in statistical data analysis. Rick is author of the books Statistical Programming with SAS/IML Software and Simulating Data with SAS.

1. It's a reflection of my own ignorance, but it really stuck with me when ET Jaynes pointed out that it's the Central (Limit Theorem), not the (Central Limit) Theorem. My brain always heard it the second way.

• Me, too! Great tip.

2. My favorite is the fundamental theorem of stochastic calculus:
http://en.wikipedia.org/wiki/Ito's_lemma

3. The fundamental theorem of arithmetic is very cool indeed and, as you doubtless know, so much has been done with the primes (although no one has yet proved that the twin primes go on forever). One nice result is that there will be gaps in the series that are as large as you want. For any N ≥ 2, the numbers N! + 2 through N! + N are all composite.

The fundamental theorem of statistics is one thing. How about a fundamental theorem of data analysis?

Perhaps "All models are wrong, but some are useful"? From George Box

or

"There are no routine statistical questions, only questionable statistical routines" (From David Cox)

Or maybe they could be combined into a Box Cox theorem!
Or maybe

4. Even though I work in optimization, in particular linear programming, I've never heard of that result referred to as the fundamental theorem of linear programming. For me the fundamental theorem of linear programming would be the strong duality theorem.

5. Well, I know a couple of people who would bet on Bayes' theorem as the fundamental theorem of statistics :)

6. Sarbarup Banerjee on

I love the Theorem on Matrix Factorization, the main backbone for Factor and Principal Component analysis

• I assume you are referring to the Singular Value Decomposition. That is an awesome theorem, and Strang talks about it in his paper on the fundamental theorem of linear algebra.

7. I must have first heard of the law of large numbers as an undergrad, but I remember later in a more advanced course learning properties of estimators, unbiasedness, efficiency, and consistency etc. I never made the connection, but isn't the LLN basically just a statement that the sample mean is a *consistent* estimator of the population mean?

Also, I doubt it makes the list for being a fundamental theorem, but in graduate school the series of 'Slutsky' theorems presented in Goldberger's 'A Course in Econometrics' were pretty interesting tools.

• Yes, and in fact a stronger version of the theorem is that the empirical CDF is a consistent estimator of the population distribution.

• Hi Rick,

I think that you are referring to the Glivenko-Cantelli theorem, and I'm happy that you brought it up. Numerous sources (including my own mathematical statistics professor in my graduate studies) call it the fundamental theorem of statistics!

See, for example, Page 261 of John Taylor's An introduction to measure and probability (Springer, 1997).

• Yes. The Glivenko-Cantelli theorem says that the empirical distribution converges uniformly to the population distribution.

8. My favourite one is Stokes' theorem.

9. The CLT is definitely profound and extremely useful. It encapsulates the heart of many fundamental theorems in mathematics in its ability to connect so many different results in statistics. It's also a nice result to have when co-workers dispute your analyses, and you have a handful of statistical tests that back you up (you can't argue with mathematical proof when you have many observations to rely upon!)

10. Lyudmil Antonov on

In statistics, there is the Fundamental Lemma of Neyman-Pearson (see, e.g. Erich Lehmann. Fisher, Neyman, and the creation of classical statistics; E. Lehmann & J. Romano. Testing statistical hypotheses, 3.2. The Neyman-Pearson Fundamental Lemma), though I fail to see how it is "fundamental". Moreover, in different books this lemma is formulated in different ways.

11. nazim noueihed on

The LLN is a very powerful theorem. It states that the sequence of relative frequencies of an event A converges, even though the terms of the sequence have no explicit formula and are not defined recursively, and that the limit of this sequence is the true probability of A. Equivalently, it states that the true probability space that models the random experiment or phenomenon is obtained as a limit of a sequence of probability spaces.