Suppose you took a series of samples from a population. You would not expect the means of those samples to be identical to each other, nor would you expect their variances to match perfectly. In this video we examine how the means and variances of a set of samples from the same population are distributed.

How do you think the means will be distributed? Maybe like this? And how do you think the standard deviations will be distributed? Maybe like this? I think it will look like this.

We will need lots of samples in order to construct these distributions, and we will obtain them using a spreadsheet. Let's assume that the population has a Normal distribution with known mean and standard deviation. Based on the population mean and standard deviation that we specify here, the spreadsheet generates n elements for each sample and calculates the sample mean and variance. Analyses that use a computer to generate random data and process it, like we are doing here, are called Monte Carlo simulations.

To start, let's use a sample size of three and generate 500 separate samples. Sometimes the spreadsheet takes a while to run, and this window lets you monitor the progress of the simulation. The spreadsheet generates histograms for the means and variances that it calculates. These histograms, and any equations that we might find to describe them, are called sampling distributions. The means look like they are Normally distributed, but the variances are visibly skewed.

It is not difficult to work out the distribution of sample means mathematically. Recall that the mean of a sample of size n is a random variable produced by adding the random variables that describe each element in the sample and dividing by n. If the population is Normally distributed, each of these xi values will be too, and their means and standard deviations will be the same as those of the population from which they are taken. This new random variable x-bar will have a Normal distribution, and its mean will be 1/n times the sum of the means of the sample elements, which simplifies to mu. The variance of x-bar will be equal to 1/n squared times the sum of the variances of the sample elements, which simplifies to sigma squared over n. Thus, the sample means will be distributed according to a Normal distribution with a mean of mu and a standard deviation of sigma over root n. That is to say, the probability density function for x-bar will be a Normal distribution with mean mu and standard deviation sigma over root n, where the function N is defined as shown. Activating the "superimpose normal" box overlays this Normal curve on the histogram. If you generate enough samples of size n, the histogram agrees well with the theoretical answer.

Understanding how the variance values from a series of samples are distributed takes a little more work. Let's define our sample variance as follows: s-squared is the sum of xi minus x-bar, squared, divided by n-1, as this definition gives an unbiased estimate of the population variance. The sample variance is built from a sum of squared Normally distributed quantities, and sums of that kind are described by a family of so-called chi-squared distributions, the subject of one of our other videos. It turns out that the chi-squared distribution we need is the one with k=n-1 degrees of freedom. The variance sampling distribution turns out to be this: the probability density of s-squared is equal to n-1 divided by sigma squared, times the chi-squared distribution with n-1 degrees of freedom, whose argument is s-squared times n-1 divided by sigma squared, where the chi-squared distribution is defined by this function.

Let's get back to the spreadsheet and use it to explore what happens when we change the sample size.
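For readers who want to try the experiment without the spreadsheet, here is a minimal Python sketch of the same kind of Monte Carlo simulation. It is not the spreadsheet from the video: the function names `simulate` and `variance_pdf`, the seed, and the population values (mu = 10, sigma = 2) are assumptions chosen for illustration. Only the sample size of three and the 500 samples come from the video. The sketch draws Normal samples, computes each sample's mean and n-1 variance, and compares the results with the theoretical values quoted above.

```python
import math
import random
import statistics


def simulate(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a Normal(mu, sigma) population.

    Returns two lists: the sample means and the sample variances.
    The variances use the n-1 divisor, matching the definition in the video.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    means, variances = [], []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
        variances.append(statistics.variance(sample))  # divides by n-1
    return means, variances


def variance_pdf(s2, n, sigma):
    """Theoretical density of the sample variance s2 for Normal data.

    This is (n-1)/sigma^2 times the chi-squared density with k = n-1
    degrees of freedom, evaluated at s2 * (n-1) / sigma^2.
    """
    k = n - 1
    scale = k / sigma**2
    x = s2 * scale
    chi2 = x ** (k / 2 - 1) * math.exp(-x / 2) / (2 ** (k / 2) * math.gamma(k / 2))
    return scale * chi2


if __name__ == "__main__":
    mu, sigma, n = 10.0, 2.0, 3  # illustrative population values (assumed)
    means, variances = simulate(mu, sigma, n, num_samples=500)
    print("mean of sample means    :", statistics.fmean(means), "theory:", mu)
    print("std dev of sample means :", statistics.stdev(means),
          "theory:", sigma / math.sqrt(n))
    print("mean of sample variances:", statistics.fmean(variances),
          "theory:", sigma**2)
```

With only 500 samples the simulated values scatter noticeably around the theoretical ones. Raising `num_samples` tightens the agreement, and changing `n` shows how the spread of the sample means shrinks as sigma over root n.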