Estimating Parameters Web Page
©Copyright 1997, 2000 Tom Malloy
This is the text of the
in-class lecture which accompanied the Authorware visual graphics
on this topic. You may print this text out and use it as a textbook.
Or you may read it online. In either case it is coordinated with
the online Authorware graphics.
PRINT:
You may print this web page on your local printer if you wish. Then
you can read the hard copy of the lecture text as you look at the
Authorware graphics.
Beginning of Text explaining how we Estimate Population Parameters
How can we use sample data to make guesses about population parameters?

The next topic is estimating
parameters. Let's continue with the example we used when we studied
the Sampling Distribution of the Mean (SDM). As can be seen on the
graphic, we have a population of SAQ scores. It is normally distributed
with mu = 150 and sigma = 30; that is, it is N(150, 30). You
take a random sample from this population.
Why we have to estimate
population parameters. Now, what if you actually don't know
the population mu? In fact, in the usual research setting you never
know population parameters like mu and sigma. You have to make guesses
about them from your sample data.
In the reality of science,
you don't know things like the true mean (mu) of the population
of SAQ scores. You only have your data (which is a group of SAQ scores).
Or, in another example, you don't know the true water absorption
rate in an Amazonian rainforest. So you have to go measure water
absorption rates on different plots of land in the rainforest. From
this sample data you can make a guess about the true water absorption
rate.
When we study probability
distributions such as the binomial or normal, we simply assume we
know the true parameter values like mu or sigma. But in actual research
you don't know. And so we're getting to a more scientifically realistic
situation now, where you only have data, but do not know the truth.
From your data you're going to want to make guesses about the population
which generated the data.
In terms of homework
and tests, instead of the word problem giving you mu and sigma,
it will give you a data set. And you will have to calculate statistics
which are good guesses about population parameters. How do we estimate
or guess what the value of mu is?

Estimating the mu
of the population. This graphic shows an overview of all the
relationships. In step 1 (in the upper left-hand corner of the graphic)
you can see that the dependent variable, SAQ, has been modeled as
a normal distribution. In step 2 we do a research project on spatial
ability; this is equivalent to taking a sample of a certain size,
n, from this population of SAQ scores. So we've got a sample. In
step 3 we calculate a statistic on the sample data. In this case
we calculate the mean.
Estimated mu = M.
The estimate of the population mean, mu, is the sample mean. That's
as simple as it can be. Unfortunately, it's going to be messier
when we get to estimating population variance.
The sample mean is our
best guess as to what the population mean is. On the graphic, we've
used a blue line to connect the sample mean with the population
mean.
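
To make the idea concrete, here is a minimal Python sketch. It pretends, as the running example does, that the SAQ population is N(150, 30). The sample size of 25, the seed, and the function name draw_sample are illustrative assumptions, not part of the lecture.

```python
import random
import statistics

# Assumed values from the running example: SAQ scores are N(150, 30).
MU, SIGMA = 150, 30

def draw_sample(n, seed=None):
    """Draw n simulated SAQ scores from the N(150, 30) population."""
    rng = random.Random(seed)
    return [rng.gauss(MU, SIGMA) for _ in range(n)]

# n = 25 and seed = 1 are arbitrary choices made for this illustration.
sample = draw_sample(n=25, seed=1)

# Estimated mu = M: the sample mean is our guess about the population mean.
M = statistics.mean(sample)
print(f"sample mean M = {M:.2f} (true mu = {MU})")
```

In real research you would not know MU; you would only have the list of scores, and M would be your best guess.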

Estimating sigma.
The second population parameter we want to estimate is the standard
deviation, sigma. The current screen gives the formula for calculating
an estimate of population standard deviation (sigma) from sample
data.

Little s versus Big S: As a start, let's establish
some symbols.
Little s. In the text you are now reading, we will
use little s as a symbol for the statistical formula
that estimates the population sigma.
On the graphic
screens we're going to use a little script s as a symbol
for the same thing.
So any little s, either typed or script, will stand for the
formula for guessing the population sigma.
Big S. In contrast, we will continue to
use big S as a symbol for the sample standard deviation.
When we studied descriptive statistics, we worked a lot with
big S, so you should be familiar with its formula.

Definitional formula
for little s. On the graphic screen (repeated immediately above)
you can see that big S and little s are closely related. At the top
of the graphic you can see that one way to find little s is to multiply
big S by the square root of [n divided by n-1]. Below you can see
that another way to calculate little s is to find the square root
of [the sum of the squared deviations divided by n-1]. Examine these
two formulas carefully and write them in your workbook.
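
The two definitional formulas can be checked against each other with a short Python sketch. The five scores and the function names below are hypothetical, chosen only so the arithmetic is easy to follow by hand.

```python
import math

def big_s(data):
    """Sample standard deviation: divide the sum of squared deviations by n."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / n)

def little_s_from_big_s(data):
    """Little s as big S times the square root of [n divided by n-1]."""
    n = len(data)
    return big_s(data) * math.sqrt(n / (n - 1))

def little_s_from_deviations(data):
    """Little s as the square root of [sum of squared deviations / (n-1)]."""
    n = len(data)
    mean = sum(data) / n
    return math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))

# Hypothetical scores: the mean is 150 and the squared deviations sum to 2250.
scores = [120, 135, 150, 165, 180]

print(big_s(scores))                     # square root of 2250 / 5
print(little_s_from_big_s(scores))       # square root of 2250 / 4
print(little_s_from_deviations(scores))  # same value by the other route
```

Both routes to little s give the same number (up to floating-point rounding), and little s is always a bit larger than big S.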
Little s is our estimated
sigma. We won't give a worked example of little s here because the
computations are so similar to the computations for the sample standard
deviation (big S).
Recall, as a review,
big S is the square root of [the sum of the squared deviations divided
by n].
The only computational
difference between big S and little s is whether you divide the sum
of the squared deviations by n or by n-1.
Inferential versus
Descriptive Statistics. While the formulas for big S and little
s are very similar, the two formulas represent two very different
concepts. Little s is an inferential statistic. We use little s
to infer from the sample data what the population standard deviation
might be. In contrast, big S is a descriptive statistic. We use
the formula for big S to describe the standard deviation of a sample.
Many books don't make
the distinction between big S and little s. This is because, when
n is large, there is essentially no practical difference between
big S and little s. Dividing something by n = 50 or dividing
it by n-1 = 49 may not make a noticeable difference in the answer.
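
A quick way to see how fast the difference shrinks is to compute the ratio of little s to big S, which is the square root of [n divided by n-1]. The sketch below is only an illustration; the sample sizes and the name inflation_factor are arbitrary choices.

```python
import math

def inflation_factor(n):
    """Ratio little s / big S, which is the square root of [n / (n-1)]."""
    return math.sqrt(n / (n - 1))

# Arbitrary sample sizes, from small to large.
for n in (5, 10, 50, 500):
    print(f"n = {n:3d}: little s is {inflation_factor(n):.4f} times big S")
```

With n = 5 little s is roughly 12 percent larger than big S; with n = 50 it is only about 1 percent larger.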
But not making the distinction
can lead to a great deal of confusion for students who see formulas
that require dividing by n in one place and require dividing by
n-1 in another place. Moreover, there is a real and important distinction
between inferential and descriptive statistics.
A description of sample
variability or dispersion is accurately given by big S.
An inference about population
variability is accurately given by little s.
On
the graphic screens we will use this little script s for
our estimate of population sigma. In the online text you are reading
we will use a little s to indicate the same thing.

Computational formula
for little s. The next screen graphic shows the computational
formula for little s. The computational formula is pretty easy to
use on a hand calculator. Some people call it the sum of squares
formula.
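
The formula itself appears only on the graphic, so the following Python sketch assumes it is the usual sum of squares form: subtract [the square of the sum of the scores divided by n] from the sum of the squared scores, divide by n-1, and take the square root. The scores and the function name are hypothetical.

```python
import math

def little_s_computational(data):
    """Little s from running totals (assumed sum of squares form)."""
    n = len(data)
    sum_x = sum(data)                      # sum of the scores
    sum_x2 = sum(x * x for x in data)      # sum of the squared scores
    sum_of_squares = sum_x2 - sum_x ** 2 / n
    return math.sqrt(sum_of_squares / (n - 1))

# Hypothetical scores: sum = 750, sum of squared scores = 114750.
scores = [120, 135, 150, 165, 180]
print(little_s_computational(scores))  # square root of 2250 / 4
```

This form needs only two running totals, which is why it is convenient on a hand calculator. It gives the same answer as the definitional formula.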

Overview. The
next graphic shows the overall relationships among population, sample
and little s. Theoretically, there is a population from which you
take some sample. The sample data can be used to make a guess (estimate)
about the population sigma. Little s is our best guess, or estimate,
as to what the population sigma is.
In terms of the specific
example shown on the graphic, we pretend that we know the population
is normal and that it has mu of 150 and sigma of 30. We take a sample
of a certain size n from the population. From that sample we calculate
a statistic, little s, which is the best sample estimate of sigma.
Now the population sigma is 30, but little s is very unlikely to be
exactly 30. It is just a guess. It should, however, be near 30. Depending
on all the random factors in sampling, the data differ from sample
to sample. So little s will differ from sample to sample.
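
The sample-to-sample variation can be simulated. This sketch again pretends the population is N(150, 30); the sample size of 25, the seed, and the number of repetitions are assumptions made only for illustration.

```python
import random
import statistics

MU, SIGMA, N = 150, 30, 25   # N = 25 is an arbitrary illustrative sample size
rng = random.Random(7)       # fixed seed so the run is repeatable

def little_s_of_new_sample():
    """Draw a fresh sample of N scores and return its little s."""
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    return statistics.stdev(sample)   # stdev divides by n-1, i.e. little s

# Five independent samples give five different guesses about sigma.
estimates = [little_s_of_new_sample() for _ in range(5)]
for i, s in enumerate(estimates, start=1):
    print(f"sample {i}: little s = {s:.2f} (true sigma = {SIGMA})")
```

Each run of the loop is a different research project on the same population, and each produces its own guess about sigma.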
Summary. One important
point to get is that there are two formulas, little s and big S.
Big S is the standard deviation of the sample. It describes
the amount of spread of the sample data around their mean. Little
s is a guess or estimate about the value of the population sigma.

Estimated SEM.
Next we will consider how to make a guess about the value of the
standard error of the mean (SEM). Recall that the SEM is the
standard deviation (sigma) of the sampling distribution of the mean
(SDM). We will begin with a review of the formulas and conceptual
difference between big S and little s.

Big S. Big S
is a descriptive statistic; it's the sample standard deviation.
As a formula it is the square root of [the sum of the squared deviations
around the mean divided by n].

Little s. In contrast,
little s is an inferential statistic because we're not describing
the data now, we're making an inference about a population parameter,
sigma.

Take a sample.
To begin the estimation process, we take a sample of human beings
from a population. Using our running example we take a sample of
humans from the SAQ population. Now that we have the sample data
we want to define a statistic that will be a good estimate of the
SEM (standard error of the mean).

How to estimate SEM.
We have two formulas; they both work equally well. As you can see
on the graphic, you can estimate the standard error of the mean
by dividing little s by the square root of n. Or you can divide
big S by the square root of n-1. Both formulas will give you the
same result (within rounding error). The blue line on the graphic
shows that these two formulas estimate the standard deviation of
the sampling distribution of the mean.
It's useful to have both
these formulas because at certain times you may have big S available
and at other times you may have little s available.
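
Here is a small Python sketch showing that the two formulas agree on the same data. The eight scores are made up for illustration. The sketch relies on Python's statistics.pstdev (which divides by n, like big S) and statistics.stdev (which divides by n-1, like little s).

```python
import math
import statistics

# Hypothetical sample of SAQ scores, made up for illustration only.
scores = [112, 148, 171, 139, 160, 155, 127, 183]
n = len(scores)

big_s = statistics.pstdev(scores)    # divides by n (descriptive)
little_s = statistics.stdev(scores)  # divides by n-1 (inferential)

# Two routes to the estimated standard error of the mean.
sem_from_little_s = little_s / math.sqrt(n)
sem_from_big_s = big_s / math.sqrt(n - 1)

print(sem_from_little_s, sem_from_big_s)
```

The two printed values match because both formulas reduce to the square root of [the sum of the squared deviations divided by n times (n-1)].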
Example. In the
SAQ example we are using, the population is normal with a sigma
of 30. Suppose we took a sample of n = 25 people from the SAQ
population. As a review, remember that the Sampling Distribution
of the Mean (SDM) is also normal with a standard deviation equal
to the population sigma divided by the square root of n. The standard
deviation of the SDM is called the standard error of the mean
(SEM). In our example, the SEM is equal to 30 over the square root
of 25. So the true SEM is equal to 6.
Using big S. Suppose
it turns out that our sample standard deviation (big S) is equal
to 27.8. Then our estimated SEM = 27.8 divided by the square root
of 24. This gives us an estimated SEM of 5.674.
Using little s.
Now let's use our sample estimate of the SAQ population sigma. If
big S = 27.8, then little s would be equal to 28.37 (See FYI, below).
So our estimated SEM = 28.37 divided by the square root of 25. Estimated
SEM = 5.674.
FYI: If you want
to work on the numbers in the example above, recall that little
s = big S times the square root of [n divided by n-1].
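
If you would like to check the numbers in this example by machine, the following sketch repeats the arithmetic. It uses only the values given in the example (sigma = 30, n = 25, big S = 27.8); the variable names are arbitrary.

```python
import math

n = 25          # sample size from the example
sigma = 30      # population sigma (known only because we are pretending)
big_s = 27.8    # sample standard deviation given in the example

true_sem = sigma / math.sqrt(n)                # 30 over the square root of 25

little_s = big_s * math.sqrt(n / (n - 1))      # about 28.37
sem_using_big_s = big_s / math.sqrt(n - 1)     # big S over the square root of 24
sem_using_little_s = little_s / math.sqrt(n)   # little s over the square root of 25

print(true_sem, little_s, sem_using_big_s, sem_using_little_s)
```

Both estimates come out at about 5.67, a little below the true SEM of 6. Any small disagreement in the last decimal place of a hand calculation comes from rounding little s to 28.37 before dividing.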
We are finished with
our discussion of how to estimate population parameters. We have
developed estimates of the population mu and sigma as well as an
estimate of the standard error of the mean.