Variability
Lecture Web Page
©Copyright 1997, 2000 Tom
Malloy
This
is the text of the in-class lecture which accompanied the Authorware
visual graphics on this topic. You may print this text out and use
it as a textbook. Or you may read it online. In either case it is
coordinated with the online Authorware teaching program.

Variability.
Variability
is the statistical concept that describes and assigns a numeric
value to the degree of spread-out-ness or grouped-together-ness
of a set of scores.

Variability
(also called dispersion) addresses the issue of
how spread out the scores are rather than what the center
of the numbers is. Numbers in a set can be tight and compact around
their center or they can be dispersed and spread out a long way
from their center. We're going to look at several descriptive statistics
which will describe the degree of spread in a set of numbers around
their center.

Example.
Here is an example of two basketball players, with their shooting
percentages. You can translate this into anything else that you
want, but I assume basketball is a cultural experience that's common
to most people.
We
have Player #1 and Player #2. Let's say they each play in five games.
In these games, Player #1 shoots many shots in each game. His game-by-game
shooting percentages are: 48%, 53%, 47%, 52%, and 50%. If
we average these, we'll see that this player is shooting well, averaging
about 50 percent. That's a good percentage in basketball, especially
if you play outside rather than inside. Let's say that we have another
player and this player shoots 60%, 57%, 50%, 43%, and 40% for the
same five games. If you find the mean for this player, you'll find
that he is also averaging 50% for the 5 games. Each one of these
players is shooting 50% in five games. So for these five games they
appear to be about the same on the average. They're indistinguishable
on the average and yet there's something very different about the
play of these two players.
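As a quick check of the arithmetic, the two averages can be computed with a few lines of Python. This is only a sketch; the variable and function names are made up for illustration and are not part of the lecture.

```python
# Game-by-game shooting percentages from the example above.
player_1 = [48, 53, 47, 52, 50]
player_2 = [60, 57, 50, 43, 40]

def mean(scores):
    """Arithmetic mean: the sum of the scores divided by how many there are."""
    return sum(scores) / len(scores)

print(mean(player_1))  # 50.0
print(mean(player_2))  # 50.0
```

Both players come out at 50, so the mean alone cannot tell them apart.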
Most
followers of the sport know these two types of players, one is very
dependable and always shoots about the same percentage, and the
other type is hot and cold with a wide range of different percentages.
What we would say about Player #1 is that his shooting percentages
have low variability, that is the scores don't vary much around
their central tendency. They're not very dispersed, as opposed to
Player #2 who shows high variability. This second person has shooting
percentages that are quite varied, or dispersed around their center.
This
is a basic example of the idea of variability. This idea is different
from the notion of central tendency. Central tendency statistics
provide a set of measures that describe the center of a set of numbers.
Statistics also has a set of other measures that we'll get into
now, that describe how spread out things are from the center, or
how diverse they are. Variability is a measure of diversity, which
is an interesting thing to know about.
Effects
of Stress Example. The details of this example of the
effects of stress on performance are given in the Notes posted under
the Variability topic.

Let's
say we're doing a study on the effects of stress on performance.
Our hypothesis might be that stress is not going to affect the average
performance of a group, but that it will affect the variability
of performance in a group, because different people react to stress
differently. Some people do better under stress and other people
do worse.
Let's
say as our research project we take two groups of people and ask
them to perform a difficult task. Then we stress the people in one
group but we do not stress the people in the other group. We have
a Stress group and a No-Stress group. Our hypothesis is that some
people in the Stress group will do better than normal under stress
and some people will do worse. Stress may not affect the average
performance because, if our hypothesis is right, some people will
do better and some will do worse. What we'd expect the stress to
do, in a large group, is affect the variability of the scores, but
not necessarily affect the average.
The Independent Variable
in this example is Stress versus No-stress. The Dependent Variable
is task performance, which we will say can vary from 1 (poor performance)
to 7 (excellent performance).


Range.
There are several measures of variability. We will start
with a very simple one that's not particularly important, but it's
very common. This measure of variability is called the range.
The
range simply describes the difference between the highest score
and the lowest score. To use the range as a measure of variability,
you take the lowest score and subtract it from the highest score.
The range uses information from only two scores out of the whole
sample. You could have a million scores and all you're going to
do is take the lowest one and subtract it from the highest one.
You throw away all the information from all the other scores. This
is not desirable, but it is simple. The range gives you a very quick
kind of measure of variability, but it obviously throws away a lot
of information.

Let's
return to our hypothesis about the impact of stress on performance:
Stress will affect the variability of the scores, but not necessarily
affect the average. On the graphic just above this text, you will
see some results from the stress experiment. The Stress Group has
the performance scores 6, 2, 1, and 7, and the No-Stress Group has
4, 3, 4, and 5. With so few numbers, it's easy to see that the
Stress Group does seem to be more spread out than the No-Stress
Group.
Let's
make some formal calculations on the data. In the Stress Group,
the range is 7 minus 1 which equals 6. For the No-Stress Group,
it is 5 minus 3, which is 2. We can see that the range fits our
intuition as a measure of variability or dispersion because it gives
a larger result for the group that appears to have more variability
in it than it gives to the group that appears to have less variability.
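The same range calculation can be sketched in Python. The function name score_range is an illustrative choice, not something from the lecture graphics.

```python
def score_range(scores):
    """Range: the highest score minus the lowest score."""
    return max(scores) - min(scores)

# Performance scores from the stress experiment.
stress = [6, 2, 1, 7]
no_stress = [4, 3, 4, 5]

print(score_range(stress))     # 6
print(score_range(no_stress))  # 2
```

Notice that only the largest and smallest scores enter the calculation; every other score is ignored.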

Average
Deviation.
The next measure of variability we will discuss is the average deviation.
The formula for the average deviation, AD, is given on the graphic
below. The average deviation is the sum of the absolute value of
the deviation of each score from the mean divided by the total number
of scores. The two straight bars in the formula are the mathematical
symbol for the absolute value, and in mathematics they indicate
that we're to use the absolute value of whatever is inside
the bars. The Absolute Value function
is an instruction to disregard all minus signs and treat all numbers
as pluses.

In the AD formula, what's inside the absolute value signs, the large
straight bars, are the deviations of the scores from their mean,
that is, Xi minus the mean. You are familiar
with deviations around the mean from studying the mean. What we're
going to do is calculate a deviation of each score from the mean,
and then we're going to make them all plus or positive numbers.
That is, we're going to take the absolute value of the deviations,
which is simply to ignore all the signs and treat them all like
positive numbers. Next, we're going to average the deviations. We're
going to sum up all of the absolute values of the deviations and
divide by the number of scores. Let's go ahead and do that with
our stress example.

Let's
calculate the deviation for each of the individual scores in the
Stress Group. The mean for this group is 4. For X1,
which is 6, the deviation is 6 minus 4 which is positive 2. For
X2, the deviation is 2 minus 4, which is negative
2. And so forth.
You'll
notice that if we sum up all the deviations we'll get zero. Remember
that one important property of the mean is that the
sum of the deviations around the mean is always 0. This
means that simply adding up the deviations won't tell us anything.
It will always give us zero whether variability is high or low.
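A short Python sketch makes the point concrete. It computes the deviations for both groups of the stress example and adds them up; the helper name deviations is an assumption made for this illustration.

```python
stress = [6, 2, 1, 7]
no_stress = [4, 3, 4, 5]

def deviations(scores):
    """Deviation of each score from the mean of the set."""
    m = sum(scores) / len(scores)
    return [x - m for x in scores]

print(deviations(stress))          # [2.0, -2.0, -3.0, 3.0]
print(sum(deviations(stress)))     # 0.0
print(sum(deviations(no_stress)))  # 0.0
```

The high-variability group and the low-variability group give the same sum, zero, so the raw sum of deviations cannot measure spread.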
To measure variability,
what we want to know is how far away things are from the mean on
the average. We don't care about the pluses and minuses.
So
what we're going to do is create another column next to the deviation
column, which records the absolute values of the deviations. Basically
we make all of these deviations positive numbers. For example, the
absolute value of 6 minus 4 is 2, and the absolute value of 2 minus
4 is 2, etc. Now when we add all of the absolute values of the deviations
up we get 10 instead of zero. The obvious reason for taking the absolute
value of the deviations is that we avoid getting the trivial result of
zero every time.
The
average deviation will be 10 divided by 4, because we have 4 scores.
This gives us AD = 2.5. The average deviation gives us the average
distance scores are away from the mean. That's an intuitive way
to talk about dispersion or variability.

Turning
to the No-Stress Group, we see that the sum of the absolute deviations
is 2, with a total of 4 scores. So the average deviation is one-half.
Like the range, the average deviation statistic supports our intuitive
feeling from looking at the raw scores that the variability in the
Stress Group is greater than in the No-Stress Group. The average
deviation for the Stress Group is 2.5 compared to only .5 for the
No-Stress Group.
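Here is a minimal Python sketch of the average deviation as described above: take each score's deviation from the mean, make it positive, and average. The function name average_deviation is illustrative only.

```python
def average_deviation(scores):
    """Mean of the absolute deviations of the scores from their mean."""
    n = len(scores)
    m = sum(scores) / n
    return sum(abs(x - m) for x in scores) / n

print(average_deviation([6, 2, 1, 7]))  # 2.5  (Stress Group)
print(average_deviation([4, 3, 4, 5]))  # 0.5  (No-Stress Group)
```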
The
average deviation is intuitively appealing to beginning students,
but is seldom used in statistics.
The reason it is included is that it makes a nice bridge to the
next concept, variance, which is really useful. Average Deviation
is
a nice way to warm up into the fact that if you're going to look
at variability, then a pretty sensible thing is to look at deviations
of scores around the mean. Deviations are central to variability
concepts because variability addresses how far things are away from
the center. The Average Deviation will give you some sense of that.
If the deviations are all small, there's not much variability. If
the deviations are large, there is a lot of variability.
The
problem that has to be dealt with (which average deviation does
with absolute values) is that if you just add up the deviations
around the mean they always come out to be zero. The sum of the
deviations around the mean can't distinguish a case with high variability
from low variability because the sum of the deviations always generates
zero. So while it is intuitively appealing to use deviations as
a measure of variability the fact that they always sum to zero ruins
it. One solution is just to turn them all positive as in the Average
Deviation. A better solution which has many uses in statistics is
described in the next section.


Variance.
Variance
is one of the two most important measures of variability. Because
of this, we will spend some time looking at what it is and what
it means. We'll start with the formula which defines the variance
of a set of numbers.
Conceptual
Formula for Variance. On the graphic above is the formula for
variance. We will use a capital or big S squared as our symbol for
variance. As you write this formula down in your lecture notes,
look at it carefully. While it may look a little complicated, it
actually is not. If you look at the heart of it, you just have Xi
minus the mean. This is the deviation of a score from the mean,
which we've just talked about under average deviation. This time
the formula asks us to square each deviation (multiply each deviation
by itself) before we add them together. Variance is based on the
squared deviations of the individual scores from the mean. Once
we have each squared deviation, we calculate their average. That
is, we add them all up and divide by n. So really the concept of
the variance is the average squared deviation.
Sum
of Squares Formula. Also, notice that, on the bottom of the
graphic above, there is a formula for the Sum of Squares (SS). The
sum of squares is the sum of the squared deviations around the mean.
We will come back to this formula many times during the course.
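Written out in symbols, with M standing for the mean and n for the number of scores, the two formulas described above look like this. The notation is a transcription of the verbal description, so the layout on the graphic may differ.

```latex
S^2 = \frac{\sum_{i=1}^{n} (X_i - M)^2}{n}
\qquad\qquad
SS = \sum_{i=1}^{n} (X_i - M)^2
```

Put together, the variance is simply SS divided by n.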

Deviations.
Let's return to the data from the Stress Study which we've been
working with. In the Stress Group there are four scores, 6, 2, 1,
and 7. And as we've seen in the past, we can get a deviation from
the mean for each individual score. To repeat, the first score is
6, the mean is 4, and 6 minus 4 gives us a deviation of plus 2.
Remember, if we sum these deviations, we will always get zero regardless
of the spread of the scores.
Squared
deviations. We will get a much more satisfying result if we
go over to the next column and square each of these deviations.
If we square the deviations, all the values become positive. A negative
number squared is positive, and a positive number squared is positive--no
squared (real) number is negative. The column in the graphic above
shows what happens when all the deviations are squared--2 squared
is 4 and minus 2 squared is 4. 3 squared is 9 and minus 3 squared
is 9. Now you'll notice that the formula for the variance uses squared
deviations and it asks you to sum them all up. So once we square
all of the deviations, we want to add them all up and put them down
here at the bottom of that column.
Sum
of Squares. The numerator of the variance formula asks you to
sum up the squared deviations. This part of the formula is called
the sum of squares. When we add up the squared deviations for the
Stress Group we get 26. That takes care of the top part of the formula--the
sum of the squared deviations is 26.
Average
squared deviations. Now all we have to do is divide by n. 26
divided by 4 gives us 6.5. That's the variance for the Stress Group
or the average squared deviation. To get the variance, sum up all
the squared deviations and then divide by n.

Let's
calculate the variance for the No-Stress Group. The graphic above
works out the full example. We can see that the squared deviations
are small because the deviations are small and the sum of the squared
deviations is only 2. The variance is 2 divided by 4, or one half.
Variance,
like AD, gives a higher number to the group of scores which is more
variable, the Stress Group, than to the group of scores which is
less variable, the No-Stress Group.
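The steps of the conceptual formula can be sketched in Python as follows. The function name variance is an illustrative choice; it divides by n, as the lecture does.

```python
def variance(scores):
    """Average squared deviation from the mean (divides by n)."""
    n = len(scores)
    m = sum(scores) / n
    ss = sum((x - m) ** 2 for x in scores)  # sum of squares
    return ss / n

print(variance([6, 2, 1, 7]))  # 6.5  (Stress Group)
print(variance([4, 3, 4, 5]))  # 0.5  (No-Stress Group)
```

If you prefer a library routine, Python's standard statistics module has pvariance, which also divides by n. Its variance function divides by n - 1 instead, so it will not match the numbers in this lecture.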
Practice.
Let's run through a couple of examples. What is the variance of
the following scores: 3, 2, 1, 2? (Answer: .5.) What is the variance
of 3, 3, 0? (Answer: 2.) What is the variance of 3, 3, 3? (Answer:
0.) Why did a variance of 0 make sense in the previous example?
(Answer: Because there is no variability in the data.)

WHY
THIS FORMULA? Now
we'll go through the formula piece by piece. If you look at the
formula you wrote down or picture it in your mind, you'll notice
that the very heart of the formula is Xi minus the Mean. In other
words the heart of the variance formula is the deviation of the
score from its mean. Deviation is what is interesting about variance.
The deviation of the scores generates variance.

If
the deviations are small then the numbers are compact around the
mean. If the deviations are large, that means the scores are widely
dispersed around the mean. If deviations are small then the scores
are not spread out; if deviations are big then they are spread out.
Deviation from the mean simply gives units away from the mean.
Now,
as we pointed out before a couple of times, the trouble with deviations
is they sum to 0. It's an algebraic property of the mean, so adding
them up doesn't do us any good. We've got to do something else with
all of these deviations besides add them up.

In
the case of the average deviation, what we did was use absolute
values to turn all deviations positive. In the case of variance,
we squared them. So one of the things that squaring does is turn
everything into a positive number. A negative number when it's squared
becomes positive, and a positive number stays positive. Squaring
the numbers accomplishes the same function as using the absolute
value, but it also does something else.


Squares
amplify larger deviations more than small deviations. If our deviation
is 0, 0 squared is 0. If we have a deviation of 1, then 1 squared
is 1. A deviation of 2 becomes 2 squared, which is 4. A deviation
of 3 becomes 3 squared, which is 9, and of course 4 squared would be 16,
which is off the top of the graph below.

This
means that as the deviations get larger, the squares of the deviations
get even larger. Let's
say we have a set of numbers, 0, 1, and 5. And the mean is 2, and
we calculate our deviations. They are a minus 2, a minus 1, and
a plus 3.
 
Let's
put these deviations onto a number line in the graphic above. We
can see that the mean is 2, the number 1 is one unit below the mean,
zero is two units below the mean, and the number 5 is three units
above the mean. The three deviations are indicated by three blue
lines going from the mean to the score.
What
happens when we square these deviations? The next graphic shows
visually what happens when you square a line. Geometrically, it
turns into a square. So 2 turns into a square with an area of 4.
A linear 1 turns into an area of 1. And a linear 3 turns into an
area of 9.
A
deviation is one-dimensional; it is a length (away from the mean).
When length increases it increases only in one dimension. Length
squared is two-dimensional; it is an area. When the length of a
side of a square is increased, we are talking about a one-dimensional
increase. But as the length of a square's side increases, the area
of the square increases in two dimensions. So the area is increasing
much faster than the length of a side.

The
visual equivalent of summing the squared deviations is putting the
three areas together. What you'll notice is that the squared 1 is
rather small. The squared 2 is somewhat larger, and the biggest
one by far is the squared 3. In fact the squared 3 accounts for
more of the area of the sum of the squares than the other two combined.
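The same comparison of areas can be checked numerically. This is a small Python sketch using the scores 0, 1, and 5 from the example; the variable names are illustrative.

```python
scores = [0, 1, 5]
m = sum(scores) / len(scores)      # mean is 2.0
devs = [x - m for x in scores]     # [-2.0, -1.0, 3.0]
squares = [d ** 2 for d in devs]   # [4.0, 1.0, 9.0]

largest = max(squares)
rest = sum(squares) - largest      # 4.0 + 1.0

print(squares)         # [4.0, 1.0, 9.0]
print(largest > rest)  # True: 9 is more than 4 + 1
```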

This
is a visual way of seeing what squaring does to deviations. Because
of the 2-D squaring function, it amplifies large deviations more
than small deviations.
In
short, we're describing a property whereby, if you square the deviations,
you're amplifying larger deviations more than you are smaller deviations.
Squaring is especially sensitive to large deviations. Why is that
good? The answer has to do with measurement error.
Measurement
error. In the Interface to Science lecture we pointed out that
scientists measure things and turn them into numbers. They identify
some kind of natural event or phenomenon of interest. They create
measurement operations to measure some aspect of the event or phenomenon.
When the phenomenon is a person, we might measure things such as height,
weight, IQ, aggressiveness, personality.
In general, scientists recognize that most of their measurements
probably contain some level of error. If I were to measure the length
or the width of a table, I would never think that I could measure
it exactly. I would be likely to get a different result each time
I tried. Each result could be ten thousandths of an inch off from
the previous measurement, even though the length of the table stayed
the same.
All
measurements produce some kind of error. However, we are not
very concerned if our error is small. If I'm one ten-thousandth
of an inch off, it's not going to make much difference for building
a table. But if I'm a long way off, if I'm two inches off, that
will make a difference in the final product. A two inch difference
in the length of the legs will make the table wobbly.

The
same is true when we're measuring someone's personality, intelligence,
aggressiveness, or their memory in psychology. We assume small errors
don't really matter but big errors do. Metaphorically, big measurement
errors will create wobbly theoretical constructs.
Squared
error. Variance is one of the statistics which is sensitive
to big errors. Variance will get big really fast if there is even
one large error in measurement. But it doesn't get big very fast
if we just have small errors. So this property of amplifying large
deviations from the mean more than small deviations is considered
to be a good measurement property. Variance is important
in this particular way, because it tends to minimize or even to
ignore differences that are small. But if you get one measurement
that's a long way off from the mean, then the variance will increase
dramatically. Variance will help you detect that you've got some
measurement that's a long way off the mean.
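To see this sensitivity in numbers, here is a Python sketch comparing two made-up data sets. The scores are hypothetical and chosen only for illustration: both sets have the same mean and the same average deviation, but in the second set the deviation is concentrated in fewer, larger misses.

```python
def average_deviation(scores):
    n = len(scores)
    m = sum(scores) / n
    return sum(abs(x - m) for x in scores) / n

def variance(scores):
    n = len(scores)
    m = sum(scores) / n
    return sum((x - m) ** 2 for x in scores) / n

# Hypothetical measurements, not data from the lecture.
many_small_misses = [11, 9, 11, 9]   # every score is 1 away from the mean of 10
few_large_misses = [10, 10, 12, 8]   # two scores are 2 away, two are exact

print(average_deviation(many_small_misses), variance(many_small_misses))  # 1.0 1.0
print(average_deviation(few_large_misses), variance(few_large_misses))    # 1.0 2.0
```

The average deviation treats the two sets as equally spread out, while the variance doubles for the set with the larger misses.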
Least
squared error criterion. Big errors are so much more costly
than small errors. The basic idea of least squared error is that,
based on the wobbly table argument, we would like statistical formulas
that give us the least amount of squared error. That is, we
don't mind small errors so much, but large errors can make things
wobbly. Since squaring errors amplifies large errors the most, then
any procedure that gives us the least amount of squared error is
the one without large errors. So
statisticians use Least Squared Error as a criterion for choosing
statistics. We will evaluate the worthwhileness of a statistic (like
the variance formula) by how much squared error it gives us. The
less the better.
This
is just a warm-up. We will return to the least squared error ideas
in the Regression lecture.
Review.
For now, the main points you need to think about are that variance
does a couple of things: (1)
it turns the deviations
positive so they don't cancel out and (2)
it amplifies
large deviations. Those are both considered to be good operating
characteristics.
Let's
return to the formula for the variance. Now we add up all of the
squared deviations and then we average them.

Next,
divide by n to get the average squared deviation. This is the variance.

Summary.
Deviations from the mean are appealing as a measure of dispersion
because they show us the distance scores are from the mean. But
deviations can't be used because they cancel themselves out and
always sum to 0. Variance is based on squared deviations for a couple
of reasons. First, it makes all the deviations positive, and second
it amplifies large deviations more than small ones. Variance is
the sum of the squared deviations divided by n. The variance is
the average squared deviation.

Computational
Formula for Variance.
Previously I gave you the conceptual formula for the variance
because that formula is easier to understand conceptually. But there's
an easier formula to use when you are actually calculating the variance.
In
the computational formula the variance is equal to the sum of the
scores squared, divided by n, minus the mean squared.

What
the computational formula asks you to do is to square each score,
then add them up and divide by n. Once you have that figure, you
subtract the Mean squared. With a large data set, this formula is
much easier to use than the conceptual formula. It also avoids a
lot of rounding error that you get when you are using the conceptual
formula. I highly recommend that you use
the computational formula rather than the conceptual formula when
you are calculating the variance. You should note that the
computational and conceptual formulas are identical algebraically.
They give the same result (within the limits of rounding error).
If you are interested you can prove their equivalence for yourself.
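For readers who want to see it, one way to show the equivalence is sketched below, writing M for the mean and using the fact that the sum of the scores divided by n is M. The symbols follow the verbal description in this text rather than the graphic.

```latex
\begin{aligned}
S^2 &= \frac{1}{n}\sum_{i=1}^{n} (X_i - M)^2 \\
    &= \frac{1}{n}\sum_{i=1}^{n} X_i^2 \;-\; 2M \cdot \frac{1}{n}\sum_{i=1}^{n} X_i \;+\; \frac{1}{n}\, n M^2 \\
    &= \frac{\sum X_i^2}{n} - 2M^2 + M^2 \\
    &= \frac{\sum X_i^2}{n} - M^2
\end{aligned}
```

The first line is the conceptual formula and the last line is the computational formula.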
The
conceptual formula is good for explaining variance because it shows
the idea behind the variance--it is the average of the squared deviations
around the mean. The concept behind it is relatively easy to see.
With the computational formula it is hard to explain the concept
of variance but it's a lot more convenient for you in doing calculations.

Calculations.
Let's go back to our Stress study example and calculate the variance.
Remember that the Stress Group has scores of 6, 2, 1, and 7. The
first thing that you have to do to use this formula is square each
of the scores. So we do that … 6 turns into 36, 2 into 4, and so
forth. We see this squaring amplifies the larger numbers more than
smaller numbers. The next thing you have to do is to sum them. The
squared values are 36, 4, 1, 49, and summing those squares we get
90. We've already calculated the mean to be 4, and so substituting
it in the formula, we get 90 divided by 4 minus 4 squared is equal
to 22.5 minus 16. That gives us 6.5, which is the same number we
got the last time using the conceptual formula for variance.

Let's
work through the calculations for the No-Stress Group to give you
a second example. We've squared each score, summed the squares to
get 66, and then substituted into the formula, which gives us 66
over 4 minus 4 squared. We get .5 as an answer. This is, again,
the same answer we got with the conceptual formula.
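The computational formula can be sketched in Python like this. The function name variance_computational is an assumption made for the example.

```python
def variance_computational(scores):
    """Sum of the squared scores divided by n, minus the mean squared."""
    n = len(scores)
    m = sum(scores) / n
    return sum(x ** 2 for x in scores) / n - m ** 2

print(variance_computational([6, 2, 1, 7]))  # 6.5  (90/4 - 16)
print(variance_computational([4, 3, 4, 5]))  # 0.5  (66/4 - 16)
```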
Practice.
What is the variance of 10, 8, 9, 4? (Answer: 5.19. There may be
rounding error.)


There
are many perfectly good formulas for calculating the variance which
are all algebraically identical to the conceptual formula and the
computational formula we are using. Different people and different
texts prefer different formulas but they all give the same answer.
Sum
of Squares Formula. The Sum of Squares formula is used in many
different areas in statistics and can be used as a first step toward
getting the variance. The current lecture graphic (above) shows
the computational formula for the sum of the squared deviations
around the mean.

SS
and Variance.
If you want to use the SS formula, then you can find the variance
by the formula variance equals SS/n. This, of course, is the average
squared deviation.
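Here is a Python sketch of going from SS to the variance. The text does not spell out the computational SS formula, so this sketch assumes a commonly used form, the sum of the squared scores minus the squared sum of the scores divided by n; the graphic may show a different but equivalent version.

```python
def sum_of_squares(scores):
    """SS via an assumed computational form: sum(X^2) - (sum(X))^2 / n."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

stress = [6, 2, 1, 7]
ss = sum_of_squares(stress)

print(ss)                # 26.0
print(ss / len(stress))  # 6.5, the variance as SS/n
```

The result of 26 matches the sum of squared deviations we found earlier for the Stress Group.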

The
current series of lecture graphics show how to calculate the variance
from the SS formula. We won't go through the details in the text
since they are laid out clearly on the lecture graphic.
You've
now had several formulas for calculating the variance. This is not
meant to confuse you, but to let you know that if you read other
texts or talk to other statisticians, you may find their formulas
different. At least you'll be aware of the issue and have seen a
few of them.
Practice.
If SS = 103 and n = 14, find the variance. (Answer: 7.357...)


Standard
Deviation.
The next measure of dispersion or variability we will discuss is
the standard deviation. The variance and the standard deviation
are essentially the same idea. They're highly related and they're
both central to the kinds of inferential statistics that we're going
to discuss later in the class.
First,
we will introduce the formulas for the standard deviation, and then
explain them.

The
standard deviation is simply the square root of the variance. The
symbol that I will use for the standard deviation is capital S.
[NOTE: Not everyone nor all books use this same symbol for standard
deviation, so be careful if you read different texts. Make sure
you notice what symbol goes with what concept.]
The
variance by its very nature is a squared entity. It
is the average squared deviation. When you take the
square root of the variance, you get the standard deviation.
Symbols.
Because the variance is the average squared deviation, its symbol is
S squared. When you take the square root of S squared, you
naturally get S--the standard deviation. So the symbols make
the relationship obvious. The square root of a number squared is
just the number.
Conversely,
the variance is the standard deviation squared. If you have the
standard deviation and you want to find the variance, just think of
the symbol for the variance -- it just says to square the standard
deviation (S). Since the standard deviation is the square root of
the variance, you can find the standard deviation by simply putting
a square root sign over the formula you just wrote for the variance.
The standard deviation is the square root of the sum of the squared
deviations divided by n. If you want to use the computational form,
the standard deviation is whatever the formula you have used for
the variance, with a square root symbol over it.
Let's
look at some formulas for calculating S. In each case it will be
exactly the same as the variance formula, except that it will be
under a square root sign.

Conceptual
Formula for S. This is just the square root of the conceptual
formula for variance.
SS
Formula for S. This is just the square root of the SS formula
for the variance.

Computational
formula for S.
Again this is just the square root of the computational formula
for variance.

Calculations.
We've done most of the calculations already by calculating the variance.
Let's look at the data again for the Stress Group and the No-Stress
Group--the same data as before. In the Stress Group the variance
was 6.5, so if you wanted the standard deviation of the Stress Group,
you would simply take the square root of 6.5 which is 2.55.
Similarly
the variance of the No-Stress Group is .5, so the standard deviation
is the square root of .5. As you can see S = .7071 in the No-Stress
Group.
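As a sketch, the standard deviation can be computed in Python by taking the square root of the variance. The function name standard_deviation is illustrative; it divides by n, as in this lecture.

```python
import math

def standard_deviation(scores):
    """Square root of the average squared deviation (divides by n)."""
    n = len(scores)
    m = sum(scores) / n
    return math.sqrt(sum((x - m) ** 2 for x in scores) / n)

print(round(standard_deviation([6, 2, 1, 7]), 2))  # 2.55    (Stress Group)
print(round(standard_deviation([4, 3, 4, 5]), 4))  # 0.7071  (No-Stress Group)
```

Python's standard statistics module offers pstdev, which uses the same divide-by-n definition.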
Practice.
Given the data set 7, 2, 4, 6; find S. (Answer: 1.92. There may
be rounding error.)


Rationale
for S. Let me talk a little about the relationship between the
standard deviation (S) and the variance (S squared).
Difficulty
interpreting variance. The variance, of course, is a squared
entity. You'll notice that its formula uses the squared deviations
around the mean. Squared measures sometimes make sense, especially
in physical measurement. If you have measured things in feet and
you want to know the variability, the variance will be measured
in square feet.
For
example, suppose you measured the height of every house in the neighborhood
to the top of the chimney. Then the mean height would be in
feet and the variance in height would be in square feet. Perhaps
the mean height would be 30 feet, and the variance in height
might be 25 square feet. What does that mean? The meaning of variance
in terms of its dimensionality is difficult to understand intuitively.
The variance, by its nature, is going to be in squared feet.
It's not going to be in feet because it is based on the squared
deviations from the mean.
S
is easier to understand. Suppose we take the square root of
the variance and get S = 5 feet. The average height of houses is
30 feet and the standard deviation is 5 feet. It's a lot
easier to interpret standard deviation than it is to interpret
variance. The standard deviation gets us back to the same units
that we started in when we measured. Rather than being in squared
feet, we have returned to feet which are much easier to understand.
Squared
IQ. When you move into psychology or other more abstract topics
the squaring function is even more difficult to interpret. Measure
somebody's IQ; then the variance is in squared IQ. We know what
a square foot is, but we have no interpretation of a squared IQ.
Squared IQ doesn't correspond to anything we know of the way that
square feet does. It no longer makes any sense.
So
taking the square root of the variance puts our measurements back
into their original scale. It takes square feet and puts them back
into feet; it takes squared IQ and puts it back into IQ units. That's
one of the reasons why the standard deviation is important.
Both the concepts of variance and standard deviation will be important
throughout the course.


Examples.
The lecture graphic shows the calculation of S when you already
know the variance. This is straightforward. The details are on the
graphic so we will not describe them in the text.

Z-scores.
z-scores are also called standard scores. The formula for z uses
the standard deviation around the mean.

The
distance that a particular score is from the mean is measured by
Xi minus M (mean). Look at the formula for z. It is a score's
deviation from the mean divided by the standard deviation.
Deviation
divided by Standard Deviation. The concept is simple. Every
score has a deviation from the mean. A z score is simply the score's
deviation divided by the standard deviation.
Let's
take an example.

Example.
Suppose that 300 students took a midterm in Psych 1010. Suppose
those 300 midterm scores had a mean of 65 and a standard deviation
of 5. Suppose a student receives a score of 72 on the midterm. What
is the z-score for a raw score of 72 on this exam? The z for our raw
score of 72 is the score minus the mean, divided by the standard
deviation, which is 72 minus 65 divided by 5, or 7 divided by 5.
That turns out to be a positive 1.4. Notice that z-scores can have
either positive or negative signs. The sign of the z-score matters.
Interpretation.
The person who got a raw score of 72 has a z-score of 1.4. Actually
when we think about it, what this means is that this student is
1.4 standard deviations above the mean. What has happened is we've
divided the student's deviation by the standard deviation. The student's
deviation is 7, the standard deviation is 5. 5 goes into that 7
1.4 times. What we're saying is that the raw score of 72 is 1.4
standard deviations above the mean.
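The calculation is short enough to write as a one-line function. The sketch below is in Python; the name z_score and its argument names are chosen for this illustration. The second call, with a raw score of 60, is an extra case added here to show a negative sign.

```python
def z_score(raw, mean, sd):
    """A score's deviation from the mean divided by the standard deviation."""
    return (raw - mean) / sd

# Psych 1010 midterm: mean 65, standard deviation 5
print(z_score(72, 65, 5))  # 1.4  (1.4 standard deviations above the mean)
print(z_score(60, 65, 5))  # -1.0 (one standard deviation below the mean)
```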
Let's
take another example and see how this is useful.

Which
is better, 36 or 72?
Let's say that a student takes two exams receiving a 36 in math
and a 72 in English. On which exam did the student do better? How
do we answer that question? 72 is a higher number than 36, in fact,
72 is twice as high as 36. However, you've taken enough exams to
know that a good score is a relative concept that depends on the exam.
36 and 72 are just raw scores. We have no idea how many questions
there were on each exam, how close the student is to the top score,
and how well all the other students did. So we can't really say
on which exam the student did better.
How
to evaluate what score is a good score. Let's say that we have
some basic statistics, that the mean in math is 25 and the mean
in English is 87, now we can begin to answer the question.
We see that in math the person is above the mean and in English
they're below the mean. We can begin to say the math score is a
better score because at least it's above average in the class. But
we still don't know how far above and below the mean 36 and 72 are.
Maybe
they are both very close to the mean and so the student got a C in
both classes. Or maybe one of them is exceptionally far from the
mean. In order to find out those kinds of things we need the standard
deviation. The standard deviation will give you some standard of
what a deviation is worth.
Let's
say that the standard deviation for math is equal to 5 and the standard
deviation for English is 10.

z
of 36. The
z for the raw score of 36 on the math test is 36 minus the mean,
25, divided by the standard deviation of 5. That gives us 11 divided
by 5, or a z-score of positive 2.2. This person is 2.2 standard
deviations above the mean on his/her math test.
z
of 72. On the English test the standard deviation is 10. The
z-score for 72 is (72 - 87)/10 = -1.5. So the student's z score
on the English test is negative 1.5.
The
Positive/Negative sign shows direction. The sign of a z-score
matters. A positive z-score indicates that the raw score is above
the mean; a negative z-score indicates that the raw score is below
the mean.
The
z-score is the number of standard deviations above or below the
mean. The exact value of the z-score gives you the number of
standard deviations the score is either above (+) or below (-) the
mean. The student
had a z score of +2.2 in math. That means s/he scored more than
2 standard deviations above the mean. A raw score of 36 was a very
good score on the math exam. However, the
student had a z score of -1.5 in English. That means s/he scored
1 and 1/2 standard deviations below the mean. A raw score of 72
is a poor score on the English exam.
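Here is the same comparison as a small Python sketch. The dictionary layout and the name z_score are assumptions made for this illustration; the means and standard deviations are the ones given above.

```python
def z_score(raw, mean, sd):
    return (raw - mean) / sd

# (raw score, class mean, class standard deviation)
exams = {
    "math": (36, 25, 5),
    "English": (72, 87, 10),
}

z_by_exam = {name: z_score(*values) for name, values in exams.items()}
for name, z in z_by_exam.items():
    direction = "above" if z > 0 else "below"
    print(f"{name}: z = {z:+.1f} ({abs(z):.1f} S {direction} the mean)")

# The better performance is the one with the larger z-score,
# not the one with the larger raw score.
best = max(z_by_exam, key=z_by_exam.get)
print("Better exam:", best)
```

Putting both scores on the z scale is what makes them comparable, even though the two exams have different means and different spreads.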
Practice.
If the mean of a set of data is 10 and the standard deviation is
3, what is the z-score of a raw score of 14? (Answer: +1.333...).
For the data set 6, 4, 5, 5, 0, what is the z score of the raw score
0? (Answer: M = 4, S = 2.0976..., so z of 0 = -1.9069...)
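To check practice answers like these, Python's standard statistics module can do the work. This sketch assumes S is the population standard deviation (statistics.pstdev, which divides by N), since that matches the value of S given in the answer.

```python
import statistics

# First problem: the mean and standard deviation are given.
z_14 = (14 - 10) / 3
print(round(z_14, 3))  # 1.333

# Second problem: find M and S from the data, then z for the raw score 0.
data = [6, 4, 5, 5, 0]
m = statistics.mean(data)      # 4
s = statistics.pstdev(data)    # about 2.0976
z_0 = (0 - m) / s
print(round(z_0, 4))  # -1.9069
```

The raw score 0 is below the mean of 4, so its z-score carries a negative sign.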

Bell
curve. We will
now move on to the Bell Curve. We've already studied the normal
curve which is one kind of bell curve, but there's a whole class
of distributions that are called bell curves. All of these are symmetrical--one
side is just like the other side. There is a big bell-shaped bulge
in the middle and the curve tapers off into two tails, one on each
side. This kind of curve is important in the history and evolution
of statistical thought.
Normal
Curve. The normal curve is the specific bell curve we will work
with in this class. You can find very precise probabilities using
the Normal Tool. The discussion below gives some approximations
and rules of thumb you can use when you don't have access to probability
tools.

Rules
of Thumb on a Bell-shaped Curve.
The lecture graphic above shows a bell-shaped probability distribution
with the mean in the center. It has standard deviations marked out
along the horizontal axis.
50% below and 50% above. The probability of being below the mean
(the lower 1/2 of the curve) is .5 and the probability of being
above the mean (the upper 1/2 of the curve) is also .5. This means
that we would expect about half the students' scores on an exam
to be above the mean and half below it.
Two-thirds
within 1 S. Two-thirds (66%) of the probability falls between
-1 and +1 standard deviation of the mean. This means that we expect
about 2/3 of all student scores to cluster within 1 S of the mean.
95%
within 2 S. 95% of the probability falls between -2 and +2 standard
deviations of the mean. This means we expect 95% (19 out of 20)
of student scores to be within 2 standard deviations of the mean.
99%
within 3 S. Essentially all (99%) of the probability falls between
-3 and +3 standard deviations of the mean. This means we expect
essentially all student scores to be within 3 standard deviations of the mean.
It would be a very unusual score that was beyond 3 standard deviations
from the mean.
Since
the curve is symmetrical, and 66% is within 1 S of M, you can divide
that in half and say 33% are between the mean and plus one standard
deviation. Of course, then, 33% are between the mean and minus one
standard deviation.
Again,
these are just rough rules of thumb.
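If you are curious how rough these rules are, you can compare them with exact normal-curve probabilities. The sketch below uses NormalDist from Python's standard statistics module (Python 3.8 or later) as a stand-in for the Normal Tool; the rule-of-thumb values are the ones listed above.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Rule-of-thumb probability of falling within k standard deviations of the mean
rules_of_thumb = {1: 0.66, 2: 0.95, 3: 0.99}

exact_within = {}
for k, rough in rules_of_thumb.items():
    # Area under the normal curve between -k and +k standard deviations
    exact = standard_normal.cdf(k) - standard_normal.cdf(-k)
    exact_within[k] = exact
    print(f"within {k} S: rule of thumb {rough:.2f}, exact {exact:.4f}")
```

The exact values come out near .68, .95, and .997, so the rules of thumb are close enough for quick estimates.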

Back
to PSYCH 1010 Example.
Let's return to the Psychology 1010 exam that had a mean of 65 and
standard deviation of 5. A student earned a raw score of 72 on the
exam. As we previously calculated, a raw score of 72 corresponds
to a z-score of +1.4.
Before
we interpret the z score, let's use the example to get less abstract
about what all these plus and minus standard deviations are. Let's
follow the example through on the following graphic.

M
= 65. First,
we put the mean = 65 right in the center of the curve where it says
"M" on the bell curve graphic.
+1
S. To find out the value for +1 standard deviation, we take
the mean, 65, and add one standard deviation to it. 65 plus 5 is
70. A raw score of 70 is exactly +1 S above M in this example. To
make this clear visually, put 70 right below +1 S on the graphic.
+2
S. Similarly, to calculate a +2 standard deviation in this example
we take the mean, 65, and add two standard deviations to it. (S
= 5, so 2S = 10.) 65 plus 10 is 75. A raw score of 75 is exactly
+2 S above M in this example. To make this clear visually, put 75
right below +2 S on the graphic.
+3
S. To calculate a +3 standard deviation in this example we take
the mean, 65, and add three standard deviations to it. (S = 5, so
3S = 15.) 65 plus 15 is 80. A raw score of 80 is exactly +3 S above
M in this example. To make this clear visually, put 80 right below
+3 S on the graphic.
-1
S. To find
the value for -1 standard deviation, we take the mean, 65, and subtract
one standard deviation from it. 65 minus 5 is 60. A raw score of
60 is exactly -1 S below M in this example. To make this clear visually,
put 60 right below -1 S on the graphic.
And
so on. Finish calculating and assigning -2 and -3 standard deviations
in this example.
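The labels for the whole horizontal axis follow one pattern: the mean plus k standard deviations, for k from -3 to +3. A minimal Python sketch, using the Psych 1010 values (the variable names are made up for this illustration):

```python
mean, sd = 65, 5

# Raw score that sits at each standard-deviation mark on the bell curve
axis_marks = {k: mean + k * sd for k in range(-3, 4)}

for k, raw in axis_marks.items():
    label = "M" if k == 0 else f"{k:+d} S"
    print(f"{label:>4}: {raw}")
```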
Where
does 72 fall?
Now we can take the score of 72 earned on the Psych exam and put
it on the horizontal axis. It goes between 70 (+1 S) and 75 (+2
S). The z score for the raw score of 72 is +1.4; that means that
72 is 1.4 standard deviations above the mean. You can see that by
looking at the lecture graphic.
How
well did the student do? Let's look at the graphic and use our
rules of thumb to get a sense of how well the student did. Remember
we expect 50% of the students to fall below 65 (the mean). We also
figured out that another 33% of the students fall between the mean
(65) and +1 S (70). Therefore 50% + 33% = 83% of the students are
expected to fall below a score of 70 (+1 S). So more than 83% of
the students fall below a score of 72 (or a z score of +1.4). Being
above 83% of the class is doing pretty well.
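The rule of thumb only tells us the student beat more than 83% of the class. If we are willing to assume the exam scores follow a normal curve exactly, we can get a sharper figure. This Python sketch uses NormalDist from the standard statistics module as a stand-in for the Normal Tool.

```python
from statistics import NormalDist

# Assumption: exam scores are exactly normal with M = 65 and S = 5
exam = NormalDist(mu=65, sigma=5)

below_70 = exam.cdf(70)   # probability below +1 S
below_72 = exam.cdf(72)   # probability below z = +1.4

print(f"below 70: {below_70:.3f}")
print(f"below 72: {below_72:.3f}")
```

Under that assumption about 84% of scores fall below 70 and about 92% fall below 72, which agrees with the rule-of-thumb conclusion of more than 83%.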
Making
a Curve. Teachers often generate a curve to assign grades. How
do they do that? One way is to use the mean and standard deviation
(and implicitly, at least, z scores). Look at the graphic with the
Psychology 1010 example. Typically all students who are within 1
S of M receive a C. That assures that about 2/3 of the class get
a C. In this example scores between 60 and 70 would be given a C.
Typically
a B is given to students who get between +1 S and +2 S. In this
example all students who get between 70 and 75 would get a B. So
the student with a score of 72 would get a B.
Typically
an A is given to students who score above +2 S. So any score above
75 on the Psych exam would get an A.
In
the same way, scores between -1 and -2 S are given a D and scores
below -2 S are given an E. So on the Psych exam, 55 to 60 is a D
and below 55 is an E.
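The grading scheme just described can be written as a small function of the z-score. This Python sketch is one way to do it. The function names are invented for this illustration, and how to treat a score that lands exactly on a boundary (for example, exactly +1 S) is an assumption, since the lecture does not say.

```python
def grade_from_z(z):
    """Letter grade from a z-score using the curve described above."""
    if z > 2:
        return "A"
    if z > 1:
        return "B"
    if z >= -1:
        return "C"   # within 1 S of the mean (boundaries assumed to count as C)
    if z >= -2:
        return "D"
    return "E"

def grade_from_raw(raw, mean, sd):
    """Convert a raw score to a z-score, then look up the grade."""
    return grade_from_z((raw - mean) / sd)

# Psych 1010 exam: mean 65, standard deviation 5
print(grade_from_raw(72, 65, 5))  # B
print(grade_from_raw(59, 65, 5))  # D
print(grade_from_z(2.1))          # A
```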
Practice.
To solve these, use our rules of thumb (NOT exact probabilities
found with the Normal Tool). What grade would a score of 59
get on the Psych exam? (Answer: D.)
If
a student received a score of 114 on a history test and this raw
score had a z-score of +2.1, what grade would s/he get? (Answer:
Using our way of generating a curve, any student who gets higher
than +2 S above the mean would receive an A.)
What
percent of scores fall between M and +1 S? (Answer: 33%.) What %
of scores fall between M and +2 S? (Answer: Look at the lecture
graphic. 95% fall between -2 and +2 S from M. The distribution is
symmetrical around M, so we can deduce that 95/2 = 47.5% fall between
M and +2 S.) What percent of scores fall between +1 and +2 S? (Answer:
47.5% fall between M and +2 S. 33% fall between M and +1 S. So
the probability between +1 and +2 is 47.5 - 33 = 14.5%.) What is
the probability below +2 S? (Answer: 47.5 % fall between M and +2
S. Also, 50% fall below M. Therefore, 50 + 47.5 = 97.5% fall below
+2 S.) All these answers are based on rules of thumb and not
on exact normal probabilities.
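Every answer in this last set comes from two rule-of-thumb numbers plus symmetry. A short Python sketch of that bookkeeping (the variable names are made up for this illustration):

```python
# Rule-of-thumb percentages from the lecture
within_1s = 66.0    # between -1 S and +1 S
within_2s = 95.0    # between -2 S and +2 S
below_mean = 50.0   # the lower half of the curve

# Symmetry: half of each band lies above the mean
mean_to_plus1 = within_1s / 2                     # 33.0
mean_to_plus2 = within_2s / 2                     # 47.5
plus1_to_plus2 = mean_to_plus2 - mean_to_plus1    # 14.5
below_plus2 = below_mean + mean_to_plus2          # 97.5

print(mean_to_plus1, mean_to_plus2, plus1_to_plus2, below_plus2)
```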
End
of lecture!
|