STA 2023: Statistics: The Central Limit Theorem

This guide is designed to introduce students to the fundamentals of statistics with special emphasis on the major topics covered in their STA 2023 class including methods for analyzing sets of data, probability, probability distributions and more.

Central Limit Theorem

The central limit theorem states that the sampling distribution of the mean of any independent random variable will be normal or nearly normal if the sample size is large enough.

How large is "large enough"? The answer depends on two factors.

  • Requirements for accuracy. The more closely the sampling distribution needs to resemble a normal distribution, the more sample points will be required.
  • The shape of the underlying population. The more closely the original population resembles a normal distribution, the fewer sample points will be required.

In practice, some statisticians say that a sample size of 30 is large enough when the population distribution is roughly bell-shaped. Others recommend a sample size of at least 40. But if the original population is distinctly not normal (e.g., is badly skewed, has multiple peaks, and/or has outliers), researchers like the sample size to be even larger.
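The effect of sample size can be checked with a small simulation. The sketch below is an illustration only: the exponential population (chosen because it is badly skewed), the sample sizes, the number of repetitions, and the function name sample_means are assumptions, not part of the original guide. It uses only the Python standard library.

```python
import random
import statistics


def sample_means(n, reps, seed=0):
    """Draw `reps` samples of size `n` from a skewed (exponential, mean 1)
    population and return the list of sample means."""
    rng = random.Random(seed)
    means = []
    for _ in range(reps):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


for n in (2, 10, 50):
    means = sample_means(n, reps=5000)
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    # Share of sample means within one standard deviation of the center.
    # For a normal distribution this share is about 0.68.
    within = sum(abs(m - center) <= spread for m in means) / len(means)
    print(f"n={n:3d}  mean={center:.3f}  sd={spread:.3f}  "
          f"sigma/sqrt(n)={1 / n ** 0.5:.3f}  within 1 sd={within:.3f}")
```

As n grows, the spread of the sample means should track σ / sqrt(n) and the share within one standard deviation should move toward the normal value, even though the population itself is far from bell-shaped.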

See also:   Statistics Tutorial: Sampling Distributions

T-Distribution vs. Normal Distribution

The t-distribution and the normal distribution can both be used with statistics that have a bell-shaped distribution. This suggests that we might use either the t-distribution or the normal distribution to analyze sampling distributions. Which should we choose?

Guidelines exist to help you make that choice. Some focus on the population standard deviation.

  • If the population standard deviation is known, use the normal distribution.
  • If the population standard deviation is unknown, use the t-distribution.

Other guidelines focus on sample size.

  • If the sample size is large, use the normal distribution. (See the discussion above in the section on the Central Limit Theorem to understand what is meant by a "large" sample.)
  • If the sample size is small, use the t-distribution.

In practice, researchers employ a mix of the above guidelines. On this site, we use the normal distribution when the population standard deviation is known and the sample size is large. We might use either distribution when the standard deviation is unknown and the sample size is very large. We use the t-distribution when the sample size is small, unless the underlying distribution is not normal. The t-distribution should not be used with small samples from populations that are not approximately normal.

Test Your Understanding

In this section, we offer two examples that illustrate how sampling distributions are used to solve common statistical problems. In each of these problems, the population standard deviation is known, and the sample size is large. So you should use the Normal Distribution Calculator, rather than the t-Distribution Calculator, to compute probabilities for these problems.

Normal Distribution Calculator

The normal calculator solves common statistical problems, based on the normal distribution. The calculator computes cumulative probabilities, based on three simple inputs. Simple instructions guide you to an accurate solution, quickly and easily. If anything is unclear, frequently-asked questions and sample problems provide straightforward explanations. The calculator is free. It can be found under the Stat Tables tab, which appears in the header of every Stat Trek web page.

Example 1

Assume that a school district has 10,000 6th graders. In this district, the average weight of a 6th grader is 80 pounds, with a standard deviation of 20 pounds. Suppose you draw a random sample of 50 students. What is the probability that the average weight of a sampled student will be less than 75 pounds?

Solution: To solve this problem, we need to define the sampling distribution of the mean. Because our sample size is greater than 30, the Central Limit Theorem tells us that the sampling distribution will approximate a normal distribution.

To define our normal distribution, we need to know both the mean of the sampling distribution and the standard deviation. Finding the mean of the sampling distribution is easy, since it is equal to the mean of the population. Thus, the mean of the sampling distribution is equal to 80.

The standard deviation of the sampling distribution can be computed using the following formula.

σx = [ σ / sqrt(n) ] * sqrt[ (N - n ) / (N - 1) ] 
σx = [ 20 / sqrt(50) ] * sqrt[ (10,000 - 50 ) / (10,000 - 1) ] = (20/7.071) * (0.9975) = 2.82

Let's review what we know and what we want to know. We know that the sampling distribution of the mean is normally distributed with a mean of 80 and a standard deviation of 2.82. We want to know the probability that a sample mean is less than 75 pounds.

Because we know the population standard deviation and the sample size is large, we'll use the normal distribution to find probability. To solve the problem, we plug these inputs into the Normal Probability Calculator: mean = 80, standard deviation = 2.82, and normal random variable = 75. The Calculator tells us that the probability that the average weight of a sampled student is less than 75 pounds is equal to 0.038.

Note: Since the population size is more than 20 times greater than the sample size, we could have used the "approximate" formula σx = [ σ / sqrt(n) ] to compute the standard error. Had we done that, we would have found a standard error equal to [ 20 / sqrt(50) ] or 2.83.
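For readers without access to the calculator, the arithmetic in Example 1 can be reproduced in a few lines. This is a minimal sketch using Python's standard library; the function name standard_error_mean and its parameter names are hypothetical, chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def standard_error_mean(sigma, n, N=None):
    """Standard error of the sample mean.

    sigma: population standard deviation
    n:     sample size
    N:     population size; if given, the finite population
           correction sqrt((N - n) / (N - 1)) is applied
    """
    se = sigma / sqrt(n)
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return se


# Example 1: 10,000 students, mean 80 lb, standard deviation 20 lb, sample of 50.
se_exact = standard_error_mean(20, 50, N=10_000)   # with the correction
se_approx = standard_error_mean(20, 50)            # "approximate" formula

# Probability that the sample mean falls below 75 pounds.
p_below_75 = NormalDist(mu=80, sigma=se_exact).cdf(75)

print(f"standard error (with correction): {se_exact:.2f}")
print(f"standard error (approximate):     {se_approx:.2f}")
print(f"P(sample mean < 75):              {p_below_75:.3f}")
```

The two standard errors differ only in the second decimal place, which is why the correction is often dropped when the population is much larger than the sample.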

Example 2

Find the probability that of the next 120 births, no more than 40% will be boys. Assume equal probabilities for the births of boys and girls. Assume also that the number of births in the population (N) is very large, essentially infinite.

Solution: The Central Limit Theorem tells us that the proportion of boys in 120 births will be approximately normally distributed.

The mean of the sampling distribution will be equal to the mean of the population distribution. In the population, half of the births result in boys; and half, in girls. Therefore, the probability of boy births in the population is 0.50. Thus, the mean proportion in the sampling distribution should also be 0.50.

The standard deviation of the sampling distribution (i.e., the standard error) can be computed using the following formula.

σp = sqrt[ PQ/n ] * sqrt[ (N - n ) / (N - 1) ]

Here, the finite population correction is equal to 1.0, since the population size (N) was assumed to be infinite. Therefore, the standard error formula reduces to:

σp = sqrt[ PQ/n ] 
σp = sqrt[ (0.5)(0.5)/120 ] = sqrt[0.25/120 ] = 0.04564

Let's review what we know and what we want to know. We know that the sampling distribution of the proportion is normally distributed with a mean of 0.50 and a standard deviation of 0.04564. We want to know the probability that no more than 40% of the sampled births are boys.

Because we know the population standard deviation and the sample size is large, we'll use the normal distribution to find probability. To solve the problem, we plug these inputs into the Normal Probability Calculator: mean = 0.5, standard deviation = 0.04564, and the normal random variable = 0.4. The Calculator tells us that the probability that no more than 40% of the sampled births are boys is equal to 0.014.

Note: This problem can also be treated as a binomial experiment. Elsewhere, we showed how to analyze a binomial experiment. The binomial experiment is actually the more exact analysis. It produces a probability of 0.018 (versus a probability of 0.014 that we found using the normal distribution). Without a computer, the binomial approach is computationally demanding. Therefore, many statistics texts emphasize the approach presented above, which uses the normal distribution to approximate the binomial.
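With a computer, both analyses of Example 2 take only a few lines. The sketch below is an illustration using Python's standard library; the variable names are assumptions made here, and 40% of 120 births is taken to mean at most 48 boys.

```python
from math import comb, sqrt
from statistics import NormalDist

n = 120      # number of births
p = 0.5      # probability that a birth is a boy

# Normal approximation: the sample proportion has mean p and
# standard error sqrt(PQ/n), with Q = 1 - P.
se = sqrt(p * (1 - p) / n)
normal_approx = NormalDist(mu=p, sigma=se).cdf(0.40)

# Exact binomial: add up P(X = k) for k = 0, 1, ..., 48 boys.
max_boys = 48  # 40% of 120
binomial_exact = sum(
    comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(max_boys + 1)
)

print(f"standard error:       {se:.5f}")
print(f"normal approximation: {normal_approx:.3f}")
print(f"exact binomial:       {binomial_exact:.3f}")
```

The gap between the two answers comes from approximating a discrete count with a continuous curve; the normal result is still close enough for most classroom purposes.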

Attribution

All the information on this page comes from Stat Trek:  http://stattrek.com/sampling/sampling-distribution.aspx