Normal
Probability Distribution Web Page
©Copyright 1997, 2000 Tom
Malloy
This
is the text of the in-class lecture which accompanied the Authorware
visual graphics on this topic. You may print this text out and use
it as a textbook. Or you may read it online. In either case it is
coordinated with the online Authorware graphics.
Topic
Locator Map



This map allows you to--
- Jump directly to a topic which interests you.
- Co-ordinate the dynamic visual Authorware presentations with the
corresponding text available on this web page.
1. To find a topic which interests you: Look at the map of menus
above. Choose a menu that interests you. Notice that the menu buttons
have topics printed on them. Click on any button (topic) on the
menu; you will jump directly to the text that corresponds to the
topic printed on the button.
2. To coordinate this web page with Authorware presentations: The
corresponding Authorware program should already be open. Go to the
menu of your choice in the Authorware program and click any
button which interests you. Then on the topic locator map above
click on the same button on the same menu; you will
jump to the text that corresponds to the Authorware presentation.
End
of Topic Locator Map
Begin
Text Explaining the Normal Distribution


Back to
Menu Locator Map
The Normal Probability
Distribution. The most generally used bell shaped curve is called
the normal probability distribution. You can see its shape on the
screen. Many of the inferential statistics we will study later assume,
rightly or wrongly, that the normal probability distribution is
a good model of the dependent variable measurement operations.
Horizontal Axis: The
horizontal axis (see blue arrow on graphic) gives you the values
of the dependent variable, whatever that happens to be in a particular
research project. It might be values of IQ or blood pressure or
highway safety. The values of the DV run along the horizontal axis.
Vertical Axis:
The height of the Normal Probability curve indicates how likely the
values of the DV (on the horizontal axis) are to occur. Notice that
there is more probability in the center of the curve where the big
bump or bell is. The probability tapers off from the center in both
directions. So out in either direction the probability gets smaller
and smaller until it is nearly zero.
Tails. In both
directions away from the center, where the probability gets very
small, are what are called the tails of the distribution. This is
because in both directions (toward negative and positive infinity)
the distribution gets very thin like tails.

Parameters. The
normal probability distribution has two parameters. One is symbolized
by the Greek letter mu; and the other is symbolized by the Greek
letter sigma. These two symbols will gain meaning as we go through
the course. But we will start by getting a general sense of mu and
sigma now.

Mu. Mu is the
value at the exact center of the normal distribution. I've
drawn an arbitrary example in which mu equals 200. So you can see
at the very center of the distribution, right below its highest
point, I've put the value 200. Mu is also called the mean
of the normal distribution.

Sigma. The second
parameter of the normal distribution is what's called its standard
deviation, and it is symbolized by the small Greek letter sigma.
As we go through the course we'll develop insight about what standard
deviation means. As an arbitrary example, I've given sigma a value
of 25 so that we have an example to work with.
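For readers who like to check things numerically, here is a minimal Python sketch of the curve's height, using the example values mu = 200 and sigma = 25. The formula is the standard density of a normal distribution; the function name normal_height is only an illustrative choice and is not part of StatCenter.

```python
import math

def normal_height(x, mu, sigma):
    """Height of the normal curve N(mu, sigma) above the DV value x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 200, 25  # the lecture's arbitrary example values

# Heights at the center, and then farther and farther out into the upper tail.
for x in (200, 225, 250, 300):
    print(x, round(normal_height(x, mu, sigma), 6))

center = normal_height(200, mu, sigma)
left = normal_height(175, mu, sigma)
right = normal_height(225, mu, sigma)
far = normal_height(300, mu, sigma)
```

The curve is tallest at mu, has the same height at equal distances on either side of mu, and shrinks toward zero in the tails. Strictly speaking the height is a probability density, so actual probabilities come from areas under the curve rather than from single heights.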

The concept behind sigma
is central to the rest of the class so we will spend some time now
starting to understand it.
Inflection Points.
The easiest way, I think, of giving you some beginning understanding
of sigma or standard deviation is to approach the idea visually.
To do this I have to introduce some jargon: inflection points. An
inflection point is where a curve changes from concave upward
to concave downward or vice versa. On the graphic you can see that
two inflection points are circled. Starting from the left (negative
infinity) the curve is at first cupped upward. Then, after the first
circled inflection point the curve is cupped downward. The curve
stays cupped downward until the second inflection point where it
returns to being cupped upward as it goes out to positive infinity.
Okay so we can see there
are two inflection points on the normal curve, one below the mean
(mu) and one above the mean.

How is an Inflection
Point related to Sigma?
For a normal distribution, each inflection point is always one sigma
away from the mean (mu). The graphic shows arrows going down from
the normal curve to the horizontal axis (DV). Where that arrow touches
the horizontal axis will be a value of the DV which is one sigma
away from mu. It's best just to look at the graphic and figure out
the relationships before going on. The next paragraph talks through
all the details, but grasping the big picture first will make the
discussion easier.
One sigma above mu.
In our example we've let mu (mean) equal 200 and sigma (standard
deviation) equal 25. As we've just said, the inflection points are
one sigma away from the mean. Since sigma = 25, the upper inflection
point (to the right, toward positive infinity) will be 25 units
above 200 (mu). So the arrow cuts the axis at 225 (which is 200
+ 25). The score 225 would be exactly one sigma above mu. 225 would
be the value right below the upper inflection point. We add 25 (sigma)
to 200 (mu) to get 225.

One sigma below mu.
With sigma equal to 25 and mu = 200, what score would be one sigma
below mu? It would be 200 minus 25, which is 175. So the arrow coming
down from the lower inflection point cuts the axis at 175.
What we're doing is creating
a correspondence between the graphical representation of the normal
distribution, which has two inflection points, and an arithmetic formula
in which we add sigma to mu or subtract sigma from mu.
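The link between inflection points and mu plus or minus sigma can be checked numerically. The sketch below assumes the example N(200, 25) and estimates the curve's "cupping" (its second derivative) with a simple finite difference; the helper names are hypothetical and chosen only for this illustration.

```python
import math

def normal_height(x, mu=200.0, sigma=25.0):
    """Height of the normal curve N(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def curvature(x, h=0.01):
    """Central-difference estimate of the second derivative at x.

    Positive means cupped upward, negative means cupped downward.
    """
    return (normal_height(x + h) - 2 * normal_height(x) + normal_height(x - h)) / (h * h)

# Just either side of 175 (mu - sigma) and of 225 (mu + sigma).
for x in (174, 176, 224, 226):
    direction = "upward" if curvature(x) > 0 else "downward"
    print(x, "cupped", direction)
```

The sign of the curvature flips as you cross 175 and flips back as you cross 225, which is what the circled inflection points on the graphic show.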

What happens if we
change sigma? Let's see some implications of what we are learning.
Let's compare three normal distributions which all have the same
center (mu) but which have three different sigmas. The top one has
a sigma = 15, the middle one has sigma equal to 25 and the lower
one has sigma equal to 50. The question is what's the effect of
changing sigma. By inspecting the graphic you can see the answer
for yourself. The bigger the sigma, the shorter and wider will be
the normal distribution.
We will continue through
the details.

Sigma = 15. Look
at the top distribution where sigma = 15. The inflection points
are very close to mu because sigma is small. To find the values
below the inflection points we simply add or subtract sigma from
mu. So the upper inflection point is above 200 + 15 = 215. The value
215 is quite close to 200.
Likewise, when we look
at the inflection point below mu we subtract 15 from 200 and we
get 185. So you can see that with sigma = 15, the inflection points
are very close to 200.

Sigma = 25. Now
when we look at the middle distribution where sigma is 25, the normal
distribution is shorter and wider than the top distribution.
The inflection points are farther from mu than they were when sigma
was 15.
To calculate the value
of the upper inflection point we add 25 to 200 and get 225. When
we want to get one sigma below the mean we subtract 25 from 200
to get 175.
By now you should be
able to find one sigma above and below the mean when sigma = 50,
before you go on.

Sigma = 50. When
sigma is 50, the normal distribution is relatively short and relatively
spread out, and the upper inflection point, or one sigma above mu
is at 250. One sigma below mu is 150.

Summary. If we
look at all three distributions, you can see something about the
meaning of sigma, about what its effect is on the normal distribution.
Sigma (standard deviation) is a measure, or determiner, of how spread
out the normal distribution is. As sigma increases the normal distribution
becomes more spread out.
Whereas mu gives you
the center of the distribution, sigma gives you spread-out-ness.
Together, mu and sigma are the two characteristics of a normal distribution.
Area under normal
curves is the same. As we mentioned in the probability lecture,
the area under the normal curve is used to represent probability.
And since the probability of the Sure Event (that the DV
value will fall between negative and positive infinity) is always
1, then the total area under all normal curves must always equal
1.
And... because the area
under all three curves is the same, if something gets more spread
out, then it is going to have to get shorter.
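The trade-off between spread and height can be sketched in Python for the three example curves. This is an illustration only: the area is approximated with a simple midpoint rule over mu plus or minus 8 sigma (an arbitrary but generous range), and the function names are hypothetical.

```python
import math

def normal_height(x, mu, sigma):
    """Height of the normal curve N(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu, sigma, steps=20000):
    """Midpoint-rule estimate of the area from mu - 8*sigma to mu + 8*sigma."""
    lo, hi = mu - 8 * sigma, mu + 8 * sigma
    width = (hi - lo) / steps
    total = sum(normal_height(lo + (i + 0.5) * width, mu, sigma) for i in range(steps))
    return total * width

results = {}
for sigma in (15, 25, 50):
    peak = normal_height(200, 200, sigma)   # height at the center, mu = 200
    area = area_under_curve(200, sigma)
    results[sigma] = (peak, area)
    print(f"sigma={sigma}: peak height={peak:.5f}, area={area:.4f}")
```

The peak gets lower as sigma grows, while the area stays at 1 for all three curves.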

Mu and sigma are called
parameters of the normal distribution. The normal distribution is
completely specified by mu and sigma; once you know mu and sigma,
you know everything there is to know about a normal distribution.
Notation: Because
it'll make things quicker and easier to write down in your notes,
there is a standard notation for specifying normal distributions.
If a distribution is normal, it is usually noted by a capital N,
and then, in parentheses, the values of mu and sigma. If we want
to specify a particular normal distribution, say the top one, which
has mu = 200 and sigma = 15, then we just write N(200, 15). The
middle distribution would be N(200, 25) and the bottom one would
be N(200, 50).
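If you want to mirror this notation in code, one option is Python's standard-library statistics.NormalDist class, which, like the notation here, takes mu and sigma. This is only a sketch of one possible way to do it; StatCenter itself does not use Python. Be aware that some textbooks write the second number as the variance (sigma squared) rather than sigma; this lecture uses sigma.

```python
from statistics import NormalDist

# N(mu, sigma) in the lecture's notation maps onto NormalDist(mu, sigma).
top = NormalDist(200, 15)
middle = NormalDist(200, 25)
bottom = NormalDist(200, 50)

for dist in (top, middle, bottom):
    print(f"N({dist.mean:g}, {dist.stdev:g})")
```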
That's an introduction
to the normal distribution. Now let's go back up to the main menu
and review the idea of the normal distribution as a dependent variable
model.


Abduction review.
Remember the modeling process we've discussed before. We take an
infinite process in nature, a person in the current example, and
we reduce that person to a single number via measurement operations.
In this case the measurement operations are an IQ test. The particular
person in our example scored an IQ = 103. Then we model the DV numbers
as a normal probability distribution. Typically IQ is set with a
mean (or mu) of 100 and a standard deviation (or sigma) of 15. This
example gives pretty typical parameters for an IQ test. Using our
compressed notation, we could describe the test as N(100, 15).
This normal probability
distribution (or random variable) is often called a normal population.
You take some independent
process in nature, then you do some kind of scientific reduction
of that process, to numbers via measurement operations and finally
you model those numbers in terms of some kind of probability distribution.
The next topic will be
about taking samples from normal populations.


Achievement Test Example.
Let's start again with an example of abduction. Let's say we have
an infinite process, a child, happily playing in a tree, not knowing
what's waiting for her, and then somebody shows up with a standardized
achievement test at the end of second grade and she gets welcomed
to the corporate world. So, she has to take a test which is designed
to measure her achievement level for various culturally relevant
school skills. When she's done the test is scored and she receives
a number purported to measure her level of achievement. You are
probably familiar with these sorts of tests since they are commonly
given in most schools on a yearly basis. And most likely you had
to take the SAT or the ACT to apply for college.
We model the results
of her test with the normal distribution. The graphic shows that
the test has mu = 200 and sigma = 10. So we can summarize all this
information as N(200, 10). For the moment you simply have to accept
these parameters (mu = 200 and sigma = 10) which I've made up more
or less arbitrarily for this example. The scientific procedures
required for the test makers to determine these parameters are long
and complex and beyond the scope of this lecture.
We will call this normal
distribution a population.

The
Sampling Process. Statistically, we think of a population of
people as distributed normally with some mean, mu, and some standard
deviation, sigma. The number of people in a population is so large
that it might as well be infinite.
When we
do research we randomly draw a small number of people from the population.
That is, using some sampling process (e.g., randomly choosing names
from a voter registration list or randomly choosing a small number
of property owners from the county records) we select a small portion
of the population to study.
The people
we draw are called the sample. In the graphic 4 people have been
drawn from the population.
Now we measure
our DV. That is, we turn each person into a number. This gives us
our sample data. In the graphic, n = 4 pieces of sample data.
Scientific
procedures. Suppose we want to give an achievement test to 10
second graders. As scientists we have to arrange to go to a school
and get permission from the school, parents and children to do our
research. Then we have to arrange a time and place to give the test.
We have to find 10 volunteers to take the test. We have to administer
the test carefully, making sure that time limits and other procedures
are followed exactly. Then we have to score the 10 tests. This gives
us 10 numbers which we call our data. Suppose that the first
student gives us a score of 205, the second student gives us 198,
and so on until the last student gives us 201. Collecting data is
a lot of work and generally takes months or even years.
Statistical
models. Look at the graphic. The population of achievement scores
has been modeled as N(200, 10). The arrow coming out of that population
indicates that we have randomly taken a sample of 10 scores from
the population. The first score is 205, the second is 198, and so
on until the last score, which is 201. This is, of course, the same
data which we generated by our scientific procedures above. But
in statistics, when we say that we "take a random sample of
achievement scores" we summarize in that single phrase all
the work involved in collecting scientific data.
So for statistical models
we summarize scientific data collection simply as sampling from
a population.
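In code, "sampling from a population" can be sketched in a few lines. The example below assumes the achievement test population N(200, 10) and a sample size of 10; the function name draw_sample and the seed value are arbitrary choices for illustration, and the scores it produces are simulated, not the scores shown on the graphic.

```python
import random

def draw_sample(mu, sigma, n, seed=None):
    """Draw n scores at random from a normal population N(mu, sigma)."""
    rng = random.Random(seed)
    return [rng.gauss(mu, sigma) for _ in range(n)]

# The seed is arbitrary; it only makes this run repeatable.
sample = draw_sample(200, 10, n=10, seed=1)
print([round(score) for score in sample])
```

Leaving the seed out gives a different sample on every run, which is closer to what happens each time a researcher collects new data.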
Next, we will make a
vocabulary distinction between "statistics" and "parameters."


Population Parameters.
In the statistical model we think that there is a population which
is N(200, 10). We randomly sample 10 scores from that population
to get our data. Mu and sigma are said to be the parameters
of the population. Recall that we also said that we can call mu
the "mean" of the population and we can call sigma the
"standard deviation" of the population. We learned that
the mean is the center of the population and the standard deviation
indicates how spread out the population is.
Unfortunately we also
use the terms mean and standard deviation in a related but distinct
way. This use of mean and standard deviation to refer to different
things can cause confusion unless a clear distinction is drawn.
Sample Statistics.
A little later in the course we will discuss how to find the mean
and the standard deviation of the sample data. We haven't done that
yet, so don't expect yourself to know how. I'm simply giving you
a heads up warning that the terms mean and standard deviation are
used for both the sample data and for the population. And it will
eventually be important to know which of these two we are talking
about.
On the graphic I've shown
the sample mean to be 198.665. I didn't show how I calculated it
so don't worry about how to find the sample mean. Just notice that
the sample mean is a little different than the population mean.
Mu is 200 but the sample mean (symbolized by M) is equal to 198.665.
The population mean and the sample mean are highly related but distinct
concepts.
Notice also on the graphic
that the sample standard deviation (S) is equal to 8.530. Again,
I've not shown how to calculate the sample standard deviation, so
you don't need to know that right now. But S, the sample standard
deviation, has a slightly different value (8.530) than does the
population standard deviation (10). The population standard deviation
and the sample standard deviation are related but distinct concepts.
The symbol we will use
for the sample mean is M. The symbol we will use for the sample
standard deviation is S.
Parameters refer
to probability distributions (populations).
Statistics refer
to sample data.
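The distinction can be made concrete with a small simulation. The sketch below fixes the parameters (mu = 200, sigma = 10), draws a simulated sample of n = 10, and computes the statistics M and S from the sample. It assumes Python's statistics.stdev, which divides by n - 1; the lecture has not yet said which formula it uses for S, so treat that as an assumption. The seed is arbitrary.

```python
import random
import statistics

mu, sigma, n = 200, 10, 10           # population parameters and sample size
rng = random.Random(2)               # arbitrary seed so the run is repeatable
sample = [rng.gauss(mu, sigma) for _ in range(n)]

M = statistics.mean(sample)          # sample mean (a statistic)
S = statistics.stdev(sample)         # sample standard deviation (a statistic)

print(f"parameters: mu = {mu}, sigma = {sigma}")
print(f"statistics: M = {M:.3f}, S = {S:.3f}")
```

M and S will usually land near mu and sigma without matching them exactly, just as on the graphic.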
Now we are going to
turn to a StatCenter tool which allows you to collect samples from
a normal distribution.


Finding "Normal
Sample Tool". You can find this tool from either the Desk,
the Ducks, or the Course Menu interfaces. From the Desk click on
the Interactive Learning icon and look for Normal Sample Tool. From
Ducks, just click on the Normal Sample Tool link under the Interact
& Integrate section that follows the Normal Distribution Lecture.
And from the Menu, just open the Work and Learn folder and click
on Interactive Learning and choose Normal Sample Tool.
A tool for creating random
samples from a normal population will then pop up. This is a very
useful tool. I recommend opening and using the Normal Sample Tool
as you think about the material in this section.
The current graphic (above)
shows the Normal Sample Tool along with notes on how to use it.
It will allow you to generate samples from any Normal Probability
Distribution.
Setting Mu and Sigma.
First, the Normal Sample Tool allows you to set the population parameters,
mu and sigma. The graphic shows you where to type in the values
of mu and sigma. Because our Achievement Test example uses N(200,
10), I have already typed in mu = 200 and sigma = 10. But if you
have opened up the tool, you need to type in the correct parameters.
Do that now.
Setting Sample Size
(n). Next, the Normal Sample Tool allows you to set the number
of data points in your sample. We use "n" to indicate
how many scores we have in our sample. On the lecture graphic, I've
set n to be 10.
Getting a sample.
Simply clicking on the "Get Sample" button will give you
a sample of the size you asked for from the normal population you
defined. On the right hand side, upper panel, a normal distribution
will appear with the mu and sigma you have set. On the right hand
side, lower panel, a sample of scores will appear. The number of
scores you get will depend on n, the sample size you set. The current
lecture graphic shows a sample of size 10 taken from a population
which is N(200, 10).
Each time you click the
"Get Sample" button you will get a new sample with a different
set of scores.
Sample Statistics.
Notice that below the sample data the Normal Sample Tool automatically
calculates the sample mean and standard deviation for you. Right
now we haven't covered those topics yet, so just notice that the
tool will make them available to you when, in the future, you will
need them.
Click on the "Get Sample"
button several times and notice how the sample data (as well as
the sample statistics) change each time you take a sample. You are
exploring a statistical model in which you assume that data comes
from normal probability distributions and that collecting data amounts
to taking a sample from a normal population.
Now go back to the Normal
Probability Distribution menu and select "Areas under the Normal
Curve," which will be our next topic.


We are now
going to find out how to use the Normal Distribution to find probabilities.
The Area
between two scores.
We will now learn how to find the probability that a score will
fall between any two values on a normal probability distribution.
If the previous sentence didn't make a lot of sense to you that's
OK; we'll talk about what it means in some detail. For the moment
you may recall that in the Probability lecture we mentioned that
one interpretation of probability is that it can be represented
as the area beneath a curve.

Just to
make sure we stay grounded in the natural curiosity of science,
recall that we find some interesting phenomenon in nature, reduce
it to numbers by measurement operations, and then model those numbers
as a random variable. The random variable we use most often is the
normal probability distribution.

Height Example.
Just to make sure that we don't focus too much on the details of
a single example, let's change the example again. In this example
we will be interested in the heights of northern European males.
We take such a person and reduce them to a single number via the
usual operations for measuring someone's height. Then we model the
height of northern European males as a normal population with mu
= 150 cm and sigma = 30 cm. In other words, our model is N(150,
30).

Finding "Normal
Tool". You can find this tool from either the Desk, the
Ducks, or the Course Menu interfaces. From the Desk click on the
Interactive Learning icon and look for Normal Tool. From Ducks,
just click on the Normal Tool link under the Normal Distribution
Lecture. And from the Menu, just open the Work and Learn folder
and click on Interactive Learning and choose Normal Tool.
The Normal Tool menu
will appear. Click on the top button. Now we'll go on to explain
the tool.

What is the Probability
Between 140 and 170? We have modeled the heights of northern
European males as N(150, 30). If that model is true, and if we sample
one man from that population, what are the chances he has a height
between 140 cm and 170 cm? We can answer questions like that
with StatCenter's Normal Tool. And...such questions will be common
on homework and exams.

Total Area under the
Normal curve.
Remember that we can interpret the area below a normal curve as
probability. The total area below the normal curve (from negative
infinity up to positive infinity) is assumed to be 1. That is, the
probability that a man's height will fall between negative and positive
infinity is 1. The previous statement should make sense. All possible
heights must be between negative and positive infinity. And the
probability of all possibilities is 1.
Area Between.
First off, the current question we are asking is about the probability
between 140 and 170 cm. Since the total area under the curve
is 1, the area between 140 and 170 must be some fraction of 1. As
we have said, we can interpret this area under the curve (which
is some fraction of 1) as probability. But how do we use the tool
to find this area (probability)?
On the Normal Tool the
first thing you must do is make sure that the little icon indicating
"area between" is clicked (see lecture graphic). "Between"
is the default setting for the Normal Tool, so when you
open it up it automatically gives you the area between two values.
Set mu. On the
lecture graphic, arrows point to little boxes where you can set
mu and sigma. First type in the mu which is relevant to whatever
example you are working on. Then click the "Enter mu (50 -
500)" button right next to the box where you entered the value
of mu. (Note: The Normal Probability Tool only accepts values of
mu between 50 and 500.) For our height example, I have entered mu
= 150.
Set sigma. The
lecture graphic also shows where to enter the value of sigma (toward
the lower right-hand corner of the tool). For our height example,
I have entered sigma = 30. You must type in the value of sigma and
then press the "Enter sigma" button next to it.
Set lower value.
We are looking for the area (probability) between two values. The
lecture graphic shows you where you can enter the lower of the two
values. Once you type in the number, click on the button which says
"Enter the lower score." For the height example, the lower
value is 140 cm, so on the lecture graphic I have set the lower
value to 140.
Set upper value.
Similarly, as you can see on the lecture graphic, there's a box
where you can enter the upper score. Following the height example,
I have set the upper score to 170 on the lecture graphic.
Find probability.
You have entered mu, sigma, upper score and lower score. Now you
are ready to find the answer to the question. The lecture graphic
points to a box where the probability will appear. All you have
to do is read it and record it. For the height example, the probability
that a northern European man's height will fall between 140 and
170 cm is .3747.
Black Area.
Probability is represented by the black area under the curve. Look
at the normal distribution on Normal Probability Tool. The black
area between 140 and 170 represents a probability of .3747.
Area and Probability
again. Conceptually what we are doing is interpreting the area
under the normal curve as probability. We set the total area (from
negative to positive infinity) to be 1. Then the area between any
two values is some proportion of 1. In our case, the area under
the curve between 140 and 170 was .3747 parts of 1. This area corresponds
to the probability of .3747.
In other words.
If we sample one man from our population, N(150, 30), the probability
that he will have a height between 140 and 170 is .3747.
We have set up a correspondence
between area on a picture we can see and the concept of probability.
This allows us to picture probability clearly and simply.
That's how the Normal
Tool works for finding the probability between two values.
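If you do not have the Normal Tool at hand, the same "area between" idea can be sketched with Python's standard-library statistics.NormalDist. This is not how StatCenter is implemented; it is just an independent check, and the helper name prob_between is hypothetical.

```python
from statistics import NormalDist

heights = NormalDist(mu=150, sigma=30)   # the lecture's model N(150, 30)

def prob_between(dist, lower, upper):
    """Area under the normal curve between two scores."""
    return dist.cdf(upper) - dist.cdf(lower)

p = prob_between(heights, 140, 170)
print(round(p, 4))
```

An exact calculation like this one gives a value of about .378, a little different from the .3747 read off the tool. The gap is small and may come from rounding inside the tool, though that is only a guess. When a homework answer is meant to match the tool, use the tool's figure.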
Practice. I recommend
that you practice using the Normal Probability Tool now. For example,
what is the probability that the height of a northern European
male is between 110 and 140 cm? (Answer: .2789.) What is the probability
that a man's height will fall between 120 and 180 cm? (Answer: .6827.)
What is the probability that a man's height will be between 90 and
210 cm? (Answer: .9545.) You can make up more questions for yourself.
Thought Problem:
What is the probability that a randomly sampled northern European
male will be within one standard deviation (sigma) of the mean (mu)?
This way of asking a question is new to us and we'll be asking the
question this way throughout the course. So for now let's just introduce
the idea. If it seems a little confusing that's OK, just work along
with this example. Your experience will be useful to you as we go
along. First, let sigma = 30 and the mean = 150. When we say "within
one sigma" in this example we mean between 120 and 180. The
value 120 is one sigma (standard deviation) below the mean (mu).
The value 180 is one sigma above the mean. So to find the probability
that a male will be within one sigma of the mean we have to find
the probability that he is between 120 and 180 cm. You've already
done that in the practice problems above. The probability of being
within one sigma of the mean is .6827.
Click and drag.
Play with the Normal Tool. You'll notice that there are two blue
pointers just below the normal curve. One is labeled "lower
score" and the other "upper score." If you click
on either of them, you can drag the black area to whatever value
you want. The upper or lower score changes accordingly. The probability
also changes accordingly. Try it and watch how the black area and
the probability change together.
Positive and Negative
Infinity. Play with the Normal Tool some more. You'll notice
that to the right of the white boxes where you enter the upper and
lower scores there are buttons labeled "-oo" and "+oo."
This is as close as we could get to the symbols for negative infinity
(-oo) and positive infinity (+oo). If you click on the minus infinity
button (-oo) the lower score will become minus infinity. If you
click on the plus infinity button (+oo) the upper score will become
plus infinity. Try this out now. Find the probability that a height
will fall between minus and plus infinity. (Answer: 1.) What is
the probability that a height will fall between minus infinity and
150 cm? (Answer: .5.)
Now we will turn to a
related question--what is the area (probability) outside of
two values?


What is the Probability
Outside 140 and 170? If we sample one northern European
male, what's the probability that his height will fall outside of
140 and 170? In other words, what are the chances that he'll be
either below 140, or he'll be above 170 in height? That's what we
mean by the word "outside."

Area Outside.
As you can see by watching the moving graphics on the Authorware
program or the static graphics printed on this page, the first thing
you have to do is click the icon for "Area Outside" on
the Normal Tool. The Normal Tool will now show you the area outside
140 and 170. It will also change the probability.
And then you do exactly
the same thing that you did before. For our current example, you
set mu at 150, set sigma at 30, set the
lower value at 140, set the upper value at 170.
Find probability.
Then you simply read the probability. This time it is .6253. The
probability that a height will fall outside (above or below) 140
and 170 cm is .6253.

Black Areas. The
probability of .6253 is represented by those two black areas under
the normal curve. Again, we are creating a correspondence between
the idea of probability and the area under a curve.
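The "outside" area is simply the two tails added together, which is the same as 1 minus the "between" area. Here is a minimal Python sketch of that relationship, again using the standard-library NormalDist as a stand-in for the Normal Tool; prob_outside is a hypothetical helper name.

```python
from statistics import NormalDist

heights = NormalDist(mu=150, sigma=30)   # the lecture's model N(150, 30)

def prob_outside(dist, lower, upper):
    """Area in both tails: below `lower` plus above `upper`."""
    below = dist.cdf(lower)
    above = 1.0 - dist.cdf(upper)
    return below + above

p_out = prob_outside(heights, 140, 170)
p_in = heights.cdf(170) - heights.cdf(140)
print(round(p_out, 4), round(p_in + p_out, 4))
```

The inside and outside areas always add up to 1. As with the "between" example, an exact calculation gives about .622 rather than the tool's .6253, most likely because of rounding inside the tool (an assumption on my part).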
Practice. Once
again, I recommend that you practice using the Normal Probability
Tool now. For example what is the probability that the height of
a northern European male is outside 145 and 185 cm? (Answer: .5595.)
What is the probability that the height will fall outside 120 and
180 cm? (Answer: .3173.) What is the probability that the height
will fall outside 90 and 210 cm? (Answer: .0455.) You can make up
questions for yourself. Play with the Normal Tool.
Now we will turn to finding
the area above a certain value.


What is the Probability
Above 170? Perhaps a basketball coach is interested in
tall men. We have modeled the heights of northern European males
as N(150, 30). If that model is true, and if we sample one man from
that population, what are the chances he has a height above
170 cm? This question implies that the lower score will be 170 and
the upper score will be plus infinity. All scores above 170 will
fall between 170 (on the low end) and plus infinity (on the high
end).

Set mu: 150.
Set sigma: 30.
Click Between Icon.
Set lower score: 170.
Set upper score: +oo.
Read probability: .2546. There's about a 25% chance that the man would have a height
above 170 cm. That's represented by the black area under the normal
curve.
Practice: Play
some more with the Normal Tool. What's the probability that a height
will be above 150 cm? (Answer: .5.) What's the probability that
a height will be above 210 cm? (Answer: .0228.) What's the probability
that a height will be below 140 cm? (Answer: .3707.)
[Note: Set lower score to -oo and upper score to 140.] What's the
probability that a height will be below 150 cm? (Answer: .5.) What's
the probability that a height will be below 210 cm? (Answer: .9772).
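"Above" and "below" questions are just "between" questions with one end set to infinity. The sketch below shows this in Python, using float("inf") and float("-inf") as stand-ins for the tool's +oo and -oo buttons. The names are illustrative, and this is an independent check rather than the tool's own method.

```python
from statistics import NormalDist

heights = NormalDist(mu=150, sigma=30)           # the lecture's model N(150, 30)
NEG_INF, POS_INF = float("-inf"), float("inf")   # stand-ins for -oo and +oo

def prob_between(dist, lower, upper):
    """Area under the normal curve between two scores."""
    return dist.cdf(upper) - dist.cdf(lower)

above_170 = prob_between(heights, 170, POS_INF)
above_150 = prob_between(heights, 150, POS_INF)
above_210 = prob_between(heights, 210, POS_INF)
below_140 = prob_between(heights, NEG_INF, 140)
everything = prob_between(heights, NEG_INF, POS_INF)

for label, p in [("above 170", above_170), ("above 150", above_150),
                 ("above 210", above_210), ("below 140", below_140),
                 ("-oo to +oo", everything)]:
    print(label, round(p, 4))
```

The results for 150 and 210 agree with the answers above. For 170 and 140 an exact calculation gives roughly .2525 and .3694, slightly different from the tool's .2546 and .3707; the difference is probably rounding inside the tool, which is an assumption.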
Now we will turn to a
specialized topic called the Unit Normal.


N(0, 1): There
is a particular form of the normal distribution which is very commonly
used in statistics. It is called unit normal or the standard normal
or the z distribution. The unit normal is simply a normal distribution
which has a mean (mu) = 0, and a standard deviation (sigma) = 1.
In more compressed symbols the unit normal is N(0, 1).
Everything works exactly
the same with the unit normal as it does for any normal. So everything
we've already learned applies to this topic. We will just be using
a particular member of the normal family of distributions. This
member of the family has mu = 0 and sigma = 1 and is sometimes called
the z distribution.

z-Tables in Stat Books.
The unit normal is the particular form of the normal that is found
in z-tables in the back of stat books. "In the old days"
before we had interactive programs like Normal Tool, we had to convert
all questions to z scores and look up probabilities in z-tables.
For that reason the unit normal has historical importance. So we'll
study it here a little bit. But we will use the Normal Tool to find
probabilities. We won't have to learn to look up probabilities in
stat book tables.
Finding the Standard
Normal Option on the Normal Tool. Open the Standard Normal
(z) tool by clicking the lower button on the Normal Tool menu.
The standard normal is
also called the unit normal or the z-distribution.
Question. Suppose
that we have N(0, 1) as our probability model. What is the probability
of a score between -1 and +1 on N(0, 1)?

Don't need to set
mu and sigma. On the unit normal, N(0, 1), mu is always 0 and
sigma is always 1. So you don't need to set them.
Click on the Area
Between Icon.
Set lower and upper
scores. Set the lower and upper score as we did above. In this
case the lower score is -1 and the upper score is +1. When you start
the Unit Normal option, it
will come up with minus one
and plus one as the lower and upper scores. So we don't have to
do anything to solve the particular question we have asked.
Read the probability.
The answer is .6827. This should be familiar to you. If it's not,
it soon will be.
Connections. Notice
that since the standard deviation of N(0, 1) is 1,
the score "-1" is one standard deviation below the mean,
and the score "+1" is one standard deviation above the mean.
That's going to
be the same probability we got when we solved the thought problem
above. In that thought problem where the model for northern European
heights was N(150, 30) you were asked to find the probability a
height was within one standard deviation of the mean (150) which
we translated into asking what is the probability between 120 (minus
one standard deviation) and 180 (plus one standard deviation).
So if you did that problem,
you'll notice that it came out exactly the same: .6827.
On N(150, 30) the scores
120 and 180 are one standard deviation below and above the mean.
On N(0, 1), the scores -1 and +1 are one standard deviation below
and above the mean. The probability of being within one standard deviation
of the mean is .6827 for all normal distributions.
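The claim that this probability is the same for every normal distribution can be checked numerically. The sketch below assumes Python 3.8 or later and the standard-library `statistics.NormalDist`; the helper name `prob_within_one_sigma` is made up for illustration.

```python
from statistics import NormalDist

def prob_within_one_sigma(mu, sigma):
    """Probability of a score between mu - sigma and mu + sigma on N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + sigma) - dist.cdf(mu - sigma)

print(round(prob_within_one_sigma(0, 1), 4))     # unit normal: -1 to +1
print(round(prob_within_one_sigma(150, 30), 4))  # heights: 120 to 180
```

Both lines print 0.6827, because the area depends only on how many standard deviations the scores lie from the mean.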
Practice. Play
with the Unit Normal option. What is the probability of a score
falling between -.25 and +1.96 on N(0, 1)? (Answer: .5737.) What
is the probability of a score falling below +1.96? (Answer: .9750.)
[Note: Set the lower score to -oo and the upper score to 1.96.]
What is the probability of a score falling above 1.96? (Answer:
.025.) What is the probability of a score falling between -1.96
and +1.96? What is the probability of a score falling outside -1.96
and +1.96?
So for the unit normal
(z distribution),
mu is always 0 and sigma is always 1. That is convenient: you don't
have to set either one.
All you do have to do
is set the lower score and the upper score and decide if you are
looking for an area between or outside the upper and lower scores.
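The two choices, area between or area outside the scores, can be sketched in code as well. This is an illustrative sketch, not the Normal Tool's own implementation: it assumes Python 3.8 or later with `statistics.NormalDist`, and `area_between` and `area_outside` are hypothetical helper names. Minus infinity is written as `float("-inf")`.

```python
from statistics import NormalDist

unit_normal = NormalDist(0, 1)  # mu = 0 and sigma = 1, so nothing to set

def area_between(lower, upper, dist=unit_normal):
    """Probability of a score between lower and upper."""
    return dist.cdf(upper) - dist.cdf(lower)

def area_outside(lower, upper, dist=unit_normal):
    """Probability of a score below lower or above upper."""
    return 1.0 - area_between(lower, upper, dist)

print(round(area_between(-0.25, 1.96), 4))           # between -.25 and +1.96
print(round(area_between(float("-inf"), 1.96), 4))   # below +1.96
```

Passing a different distribution as `dist` reuses the same helpers for any N(mu, sigma).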
Now the question is
why do we call this the z distribution? We will go on and examine
z scores.

z Scores Conceptually.
Conceptually, z scores are used to convert any Normal
Distribution to the Unit Normal, N(0, 1).
This is our first encounter with z scores, but this idea will be
used throughout the class.
Height example.
Let's go back to our height example. We modeled height as N(150,
30). Suppose we have a man from that population who has a height
of 135 cm. What is his z score? In other words we want to convert
a score of 135 cm from our population to a z score from the standardized
normal, N(0, 1). Just to have a useful name, we will call 135 cm
the "raw score." Typically, this raw score will be symbolized
by X. We will convert this raw score (X) into a z score.
z Formula. As
you can see on the lecture graphic, z = the difference between a
raw score and the mean of its population divided by the standard
deviation of the population. Our raw score (X) is 135.
The graphic shows that
a raw score of 135 on N(150, 30) has a z score = -.5 on N(0, 1).
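Written out in symbols, with mu = 150 and sigma = 30 taken from the height model, the arithmetic behind that result is:

```latex
z = \frac{X - \mu}{\sigma} = \frac{135 - 150}{30} = \frac{-15}{30} = -0.5
```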
Practice. If a
population is modeled as N(100, 10), what is the z score of a raw
score of 80? (Answer: z = -2.) What is the z score of a raw score
of 115? (Answer: z = 1.5.) If you have a z score of 2, what would its
raw score be on a population modeled as N(100, 10)? (Answer: X =
120.) [Note: Write down the z equation in symbols. Then rewrite
it with all the information I just gave you plugged in. The only
symbol left will be X. Solve for X.]
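The conversion in both directions is short enough to sketch in code. The function names `z_score` and `raw_score` are illustrative choices; the second function is just the z equation solved for X, as the note above suggests.

```python
def z_score(x, mu, sigma):
    """Convert a raw score X on N(mu, sigma) to a z score on N(0, 1)."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Convert a z score back to a raw score: the z equation solved for X."""
    return mu + z * sigma

print(z_score(135, 150, 30))   # height example
print(z_score(80, 100, 10))    # practice questions on N(100, 10)
print(z_score(115, 100, 10))
print(raw_score(2, 100, 10))
```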

Review of Inflection
Points, sigma, and the z score conversion. Remember that inflection
points are where a curve shifts from concave downward to
concave upward, or vice versa. The normal curve has two inflection
points. The lower one is exactly one standard deviation (sigma)
below the mean (mu). The upper one is exactly one standard deviation
above the mean. Let's integrate that idea with converting from any
population to the standardized normal.
New Example: Suppose
we have a population which is modeled by N(270, 20). On the
picture, what population values are directly under the two inflection
points? Well, we know that the lower inflection point is one sigma
below the mean. So it will be at 270 - 20 = 250. The upper
inflection point will be one sigma above the mean. So it will be
at 270 + 20 = 290.

Conversion to z distribution.
The z formula will convert any score, X, from any normal distribution
to the standardized normal, N(0, 1). Let's convert the two inflection
points on N(270, 20) to z scores on N(0, 1). The two inflection
points are 250 and 290. They are the scores which are one standard
deviation from the mean (270).
As you can see from the
graphic, the raw score 250 converts to a z score of -1. And the
raw score of 290 converts to a z score of +1.
The graphic also shows
that -1 and +1 are the two inflection points of the unit normal.
That is because the unit normal has a standard deviation of 1 and
mean of 0. So -1 is one sigma below the mean and +1 is one sigma
above the mean.
Now we will go on and
start to foreshadow some material we will cover much later. We will
examine a case
in which the probability of being above a score is .05.


Converse question.
Suppose that we go back to our example about the heights of northern
European males. We modeled height as N(150, 30). We can ask a question
which is the converse of the type of questions we have been asking.
We have been asking questions like what is the probability that
a height will be above 160 cm? Now we ask: what is the height which
has a certain probability above it?
.05. Let's find
a height above which there is a .05 probability of sampling a man.
Or, above what height does .05 of the probability lie? This probability
(.05) will be of considerable interest to us later; so we will start
playing with it now.

In the previous topic
we were using the Normal Tool for the Unit Normal. Go back to the
Normal Tool's menu and choose the Normal Tool for any normal
population.
The question is "above
what height does .05 of the probability lie?" We've got
a probability and now have to find the height. It's always a little
harder to answer that kind of question because neither the tables
in the backs of books nor the StatCenter probability tools are set
up to give you that information very well.
Set up. When we
have our Normal Tool up and running, we press the Between Icon,
and then we set our mu = 150, our sigma = 30, and our upper score
to plus infinity. Again, the question is above what height (clear
up to +oo) does .05 of the area lie?
Drag lower score.
Drag the blue lower score pointer and watch the probability output
window down here in the lower right corner. Drag the lower score
pointer until you get close to a probability of .05. You may have
to go back and forth. Sometimes probability will be bigger than
.05, sometimes smaller than .05. By trial and error you'll finally
get down to 2 numbers that kind of bracket .05. But it'll never
fall exactly on .05. The Normal Tool is not that exact. So we can
only be approximate.
On the graphic I stopped
dragging when I got a probability of .0485. This is very close to
.05. So I got a probability as close to .05 as I could possibly
get.
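The back-and-forth dragging is a search by trial and error, and the same idea can be sketched as a bisection search. This is not how the Normal Tool works internally; it is an illustration that assumes Python 3.8 or later with `statistics.NormalDist`. The function name `lower_score_for_tail`, the search window of ten standard deviations on either side of the mean, and the tolerance are all assumptions made for the sketch.

```python
from statistics import NormalDist

def lower_score_for_tail(target, mu, sigma, tol=1e-6):
    """Find the lower score whose area up to +oo is approximately target."""
    dist = NormalDist(mu, sigma)
    lo, hi = mu - 10 * sigma, mu + 10 * sigma  # search window (an assumption)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        tail = 1.0 - dist.cdf(mid)  # probability above the pointer
        if tail > target:
            lo = mid  # too much area above: move the pointer to the right
        else:
            hi = mid  # too little area above: move the pointer to the left
    return (lo + hi) / 2

print(round(lower_score_for_tail(0.05, 150, 30), 1))
```

Each pass halves the interval that brackets the answer, which is the systematic version of dragging until two nearby values bracket .05.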

Solution. Once
you have the probability as close to .05 as you can get, stop dragging
the pointer. Look down at the window which shows the value of the
lower score. In this example the lower score will come out to be
200.
The height which has
.05 of the probability above it is 200 cm. Or we can say that the
probability of sampling a man taller than 200 cm is .05. Actually,
the probability is .0485, which is as close to .05 as we can get.
About 5% of all heights
in this population fall above 200 cm.
Practice. Above
what height does 10% of the population lie? (Answer: 189 cm.) Above
what z score on the standardized normal does .05 of the probability
lie? (Answer: 1.64 or 1.65.)
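Software with an inverse normal function can answer converse questions directly, without dragging. The sketch below assumes Python 3.8 or later, where the standard-library `statistics.NormalDist` provides `inv_cdf`; the helper name `score_with_tail_above` is illustrative. Because `inv_cdf` works with the area below a score, the area above is subtracted from 1 first.

```python
from statistics import NormalDist

def score_with_tail_above(p_above, mu=0.0, sigma=1.0):
    """Score on N(mu, sigma) that has probability p_above lying above it."""
    return NormalDist(mu, sigma).inv_cdf(1.0 - p_above)

print(round(score_with_tail_above(0.05, 150, 30), 1))  # height with .05 above
print(round(score_with_tail_above(0.10, 150, 30), 1))  # height with .10 above
print(round(score_with_tail_above(0.05), 3))           # z score with .05 above
```

The results land within rounding distance of the answers read off the Normal Tool, which can only be positioned approximately.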