What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

The formula you show is an estimate of how large a sample you need to have a chance of $(1-\alpha)$ to find the sample mean within "error" of the true mean. That depends, as you say, on knowing the actual standard deviation $\sigma$ in the population (or the variance $\sigma^2$).

You can use the sample standard deviation as an estimate of the population standard deviation similarly to how you use the sample mean to estimate the population mean. There are two potential problems, however.

First, the sample standard deviation is a biased estimate of the population standard deviation. It is the square root of an unbiased estimate of the population variance.

Second, the sample standard deviation itself has a wide distribution. For a normal distribution, the variance of the sample variance is:

$$ \frac{2\sigma^4}{n-1},$$

where $\sigma^2$ is the population variance and $n$ is the sample size. There isn't any break at a particular value of $n$; the variance of the sample variance just decreases inversely with $(n-1)$.

To see the relative advantages of the 2 proposed methods for estimating $\sigma$, you can try a simulation. Here's an example in R. I made a population of 1000 normally distributed values having mean of 2 and standard deviation of 1.

set.seed(1234) normData <- rnorm(1000,mean=2,sd=1) sd(normData) # [1] 0.9973377

I then took 1000 samples of size $n=10$ from that population and collected the distribution of the SD estimates based on the 2 methods, sample SD and range/4.

sdDist10 <-NULL rangeDist10 <- NULL for(i in 1:1000) {sam <- sample(normData,10,replace=TRUE); sdDist10<-c(sdDist10,sd(sam)); rangeDist10 <- c(rangeDist10,(max(sam)-min(sam))/4)}

I plotted the distributions of the SD estimates obtained those ways:

plot(density(rangeDist10), col="red", xlab="SD estimate", main="For n = 10", [1]bty="n") lines(density(sdDist10)) legend("topleft",legend="red, from range\nblack, from SD",bty="n")

What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

As expected with a small sample, the estimates based on the sample SD had a wide distribution but were fairly symmetric about the true value. The estimates based on the range (in red) tended to be low and with a similarly wide distribution. I don't see that using the second option, based on range, is an improvement at all.

Play with this approach for different values of sample size $n$. If you repeat the above for 1000 samples of size $n = 100$, the sample SD values have a much narrower distribution around the peak near SD = 1 while the range-based estimates tend to over-estimate the true SD.

A take-home point is that you can need a surprisingly large sample size just to get a reliable estimate of the variance, an estimate you need to perform power calculations for study design. You can also use this approach to see how frequently your sample mean values at different sample sizes are within a specified "error" of the actual mean.

It's good to learn how to do quick simulations like this. In complex designs, simulations are often the best way to estimate the sample size needed for a study.

In order to continue enjoying our site, we ask that you confirm your identity as a human. Thank you very much for your cooperation.

Recommended textbooks for you

  • What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    MATLAB: An Introduction with Applications

    Publisher:John Wiley & Sons Inc

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Probability and Statistics for Engineering and th...

    Publisher:Cengage Learning

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Statistics for The Behavioral Sciences (MindTap C...

    Author:Frederick J Gravetter, Larry B. Wallnau

    Publisher:Cengage Learning

  • What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Elementary Statistics: Picturing the World (7th E...

    Author:Ron Larson, Betsy Farber

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    The Basic Practice of Statistics

    Author:David S. Moore, William I. Notz, Michael A. Fligner

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Introduction to the Practice of Statistics

    Author:David S. Moore, George P. McCabe, Bruce A. Craig

  • What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    MATLAB: An Introduction with Applications

    ISBN:9781119256830

    Author:Amos Gilat

    Publisher:John Wiley & Sons Inc

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Probability and Statistics for Engineering and th...

    ISBN:9781305251809

    Author:Jay L. Devore

    Publisher:Cengage Learning

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Statistics for The Behavioral Sciences (MindTap C...

    ISBN:9781305504912

    Author:Frederick J Gravetter, Larry B. Wallnau

    Publisher:Cengage Learning

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Elementary Statistics: Picturing the World (7th E...

    ISBN:9780134683416

    Author:Ron Larson, Betsy Farber

    Publisher:PEARSON

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    The Basic Practice of Statistics

    ISBN:9781319042578

    Author:David S. Moore, William I. Notz, Michael A. Fligner

    Publisher:W. H. Freeman

    What is the appropriate distribution ifn 30 and the population standard deviation is unknown?

    Introduction to the Practice of Statistics

    ISBN:9781319013387

    Author:David S. Moore, George P. McCabe, Bruce A. Craig

    Publisher:W. H. Freeman