Normal Probability Distribution Web Page
This is the text of the in-class lecture which accompanied the Authorware visual graphics on this topic. You may print this text out and use it as a textbook. Or you may read it online. In either case it is coordinated with the online Authorware graphics.
The topic locator map can be used in two ways:
1. To find a topic which interests you: Look at the map of menus above. Choose a menu that interests you. Notice that the menu buttons have topics printed on them. Click on any button (topic) on the menu; you will jump directly to the text that corresponds to the topic printed on the button.
2. To coordinate this web page with Authorware presentations: The corresponding Authorware program should already be open. Go to the menu of your choice in the Authorware program and click any button which interests you. Then on the topic locator map above click on the same button on the same menu; you will jump to the text that corresponds to the Authorware presentation.
End of Topic Locator Map
Begin Text Explaining the Normal Distribution
Back to Menu Locator Map
The Normal Probability Distribution. The most generally used bell shaped curve is called the normal probability distribution. You can see its shape on the screen. Many of the inferential statistics we will study later assume, rightly or wrongly, that the normal probability distribution is a good model of the dependent variable measurement operations.
Horizontal Axis: The horizontal axis (see blue arrow on graphic) gives you the values of the dependent variable, whatever that happens to be in a particular research project. It might be values of IQ or blood pressure or highway safety. The values of the DV run along the horizontal axis.
Vertical Axis: The height of the Normal Probability curve shows how probable the values of the DV (on the horizontal axis) are. Notice that there is more probability in the center of the curve where the big bump or bell is. The probability tapers off from the center in both directions. So out in either direction the probability gets smaller and smaller until it is nearly zero.
Tails. In both directions away from the center, where the probability gets very small, are what are called the tails of the distribution. This is because in both directions (toward negative and positive infinity) the distribution gets very thin like tails.
Parameters. The normal probability distribution has two parameters. One is symbolized by the Greek letter mu; and the other is symbolized by the Greek letter sigma. These two symbols will gain meaning as we go through the course. But we will start by getting a general sense of mu and sigma now.
Mu. Mu is the value at the exact center of the normal distribution. I've drawn an arbitrary example in which mu equals 200. So you can see at the very center of the distribution, right below its highest point, I've put the value 200. Mu is also called the mean of the normal distribution.
Sigma. The second parameter of the normal distribution is what's called its standard deviation, and it is symbolized by the small Greek letter sigma. As we go through the course we'll develop insight about what standard deviation means. As an arbitrary example, I've given sigma a value of 25 so that we have an example to work with.
The concept behind sigma is central to the rest of the class so we will spend some time now starting to understand it.
Inflection Points. The easiest way, I think, of giving you some beginning understanding of sigma or standard deviation is to approach the idea visually. To do this I have to introduce some jargon: inflection points. An inflection point is where a curve changes from concave upward to concave downward or vice versa. On the graphic you can see that two inflection points are circled. Starting from the left (negative infinity) the curve is at first cupped upward. Then, after the first circled inflection point, the curve is cupped downward. The curve stays cupped downward until the second inflection point, where it returns to being cupped upward as it goes out to positive infinity.
Okay so we can see there are two inflection points on the normal curve, one below the mean (mu) and one above the mean.
How is an Inflection Point related to Sigma? For a normal distribution, each inflection point is always one sigma away from the mean (mu). The graphic shows arrows going down from the normal curve to the horizontal axis (DV). Where that arrow touches the horizontal axis will be a value of the DV which is one sigma away from mu. It's best just to look at the graphic and figure out the relationships before going on. The next paragraph talks through all the details, but grasping the big picture first will make the discussion easier.
One sigma above mu. In our example we've let mu (mean) equal 200 and sigma (standard deviation) equal 25. As we've just said, the inflection points are one sigma away from the mean. Since sigma = 25, the upper inflection point (to the right, toward positive infinity) will be 25 units above 200 (mu). So the arrow cuts the axis at 225 (which is 200 + 25). The score 225 would be exactly one sigma above mu. 225 would be the value right below the upper inflection point. We add 25 (sigma) to 200 (mu) to get 225.
One sigma below mu. With sigma equal to 25 and mu = 200, what score would be one sigma below mu? It would be 200 minus 25, which is 175. So the arrow coming down from the lower inflection point cuts the axis at 175.
What we're doing is creating a correspondence between the graphical representation of the normal distribution which has two inflection points and an arithmetic formula in which we add to or subtract the value of sigma from mu.
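If you would like to check this correspondence with numbers rather than by eye, here is a minimal Python sketch. It is only an illustration: the helper names `normal_pdf` and `second_derivative` are made up for this example. It estimates whether the N(200, 25) curve is cupped upward or downward just to either side of 175 and 225.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve N(mu, sigma) at the value x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def second_derivative(x, mu, sigma, h=0.01):
    """Central-difference estimate of the curvature at x (illustrative helper)."""
    return (normal_pdf(x + h, mu, sigma)
            - 2 * normal_pdf(x, mu, sigma)
            + normal_pdf(x - h, mu, sigma)) / (h * h)

mu, sigma = 200, 25
lower, upper = mu - sigma, mu + sigma   # 175 and 225

# Positive curvature means cupped upward; negative means cupped downward.
curvature = {x: second_derivative(x, mu, sigma)
             for x in (lower - 1, lower + 1, upper - 1, upper + 1)}
for x, c in curvature.items():
    print(x, "cupped upward" if c > 0 else "cupped downward")
```

The shape should flip between 174 and 176 and again between 224 and 226, which is what we expect if the inflection points sit one sigma either side of mu.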
What happens if we change sigma? Let's see some implications of what we are learning. Let's compare three normal distributions which all have the same center (mu) but which have three different sigmas. The top one has a sigma = 15, the middle one has sigma equal to 25 and the lower one has sigma equal to 50. The question is what's the effect of changing sigma. By inspecting the graphic you can see the answer for yourself. The bigger the sigma, the shorter and wider will be the normal distribution.
We will continue through the details.
Sigma = 15. Look at the top distribution where sigma = 15. The inflection points are very close to mu because sigma is small. To find the values below the inflection points we simply add or subtract sigma from mu. So the upper inflection point is above 200 + 15 = 215. The value 215 is quite close to 200.
Conversely when we look at the inflection point below mu we subtract 15 from 200 and we get 185. So you can see that with sigma = 15, the inflection points are very close to 200.
Sigma = 25. Now when we look at the middle distribution where sigma is 25, the normal distribution is less tall and less narrow than the top distribution. The inflection points are farther from mu than they were when sigma was 15.
To calculate the value of the upper inflection point we add 25 to 200 and get 225. When we want to get one sigma below the mean we subtract 25 from 200 to get 175.
By now you should be able to find one sigma above and below the mean when sigma = 50, before you go on.
Sigma = 50. When sigma is 50, the normal distribution is relatively short and relatively spread out, and the upper inflection point, or one sigma above mu is at 250. One sigma below mu is 150.
Summary. If we look at all three distributions, you can see something about the meaning of sigma, about what its effect is on the normal distribution. Sigma (standard deviation) is a measure, or determiner, of how spread out the normal distribution is. As sigma increases the normal distribution becomes more spread out.
Whereas mu gives you the center of the distribution, sigma gives you spread-out-ness. Together, mu and sigma are the two characteristics of a normal distribution.
Area under normal curves is the same. As we mentioned in the probability lecture, the area under the normal curve is used to represent probability. And since the probability of the Sure Event (that the DV value will fall between negative and positive infinity) is always 1, then the total area under all normal curves must always equal 1.
And... because the area under all three curves is the same, if something gets more spread out, then it is going to have to get shorter.
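The trade-off between height and spread can be checked with a short Python sketch. This is only an illustration under stated assumptions: `normal_pdf` and `area_under_curve` are made-up helper names, and the area is approximated by adding up thin rectangles from ten sigmas below mu to ten sigmas above it rather than going all the way to infinity.

```python
import math

def normal_pdf(x, mu, sigma):
    """Height of the normal curve N(mu, sigma) at the value x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def area_under_curve(mu, sigma, width=10, steps=20000):
    """Midpoint-rule area from mu - width*sigma up to mu + width*sigma."""
    lo = mu - width * sigma
    dx = 2 * width * sigma / steps
    return sum(normal_pdf(lo + (i + 0.5) * dx, mu, sigma)
               for i in range(steps)) * dx

mu = 200
results = {}
for sigma in (15, 25, 50):
    peak = normal_pdf(mu, mu, sigma)       # height right above mu
    area = area_under_curve(mu, sigma)     # should be very close to 1
    results[sigma] = (peak, area)
    print(f"sigma={sigma}: peak height={peak:.4f}, area={area:.4f}")
```

As sigma goes from 15 to 25 to 50 the peak height drops while the area stays at 1, which matches the three curves on the graphic.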
Mu and sigma are called parameters of the normal distribution. The normal distribution is completely specified by mu and sigma; once you know mu and sigma, you know everything there is to know about a normal distribution.
Notation: Because it'll make things quicker and easier to write down in your notes, there is a standard notation for specifying normal distributions. If a distribution is normal, it is usually noted by a capital N, and then, in parentheses, the values of mu and sigma. If we want to specify a particular normal distribution, say the top one, which has mu = 200 and sigma = 15, then we just write N(200, 15). The middle distribution would be N(200, 25) and the bottom one would be N(200, 50).
That's an introduction to the normal distribution. Now let's go back up to the main menu and review the idea of the normal distribution as a dependent variable model.
Abduction review. Remember the modeling process we've discussed before. We take an infinite process in nature, a person in the current example, and we reduce that person to a single number via measurement operations. In this case the measurement operations are an IQ test. The particular person in our example scored an IQ = 103. Then we model the DV numbers as a normal probability distribution. Typically IQ is set with a mean (or mu) of 100 and a standard deviation (or sigma) of 15. This example gives pretty typical parameters for an IQ test. Using our compressed notation, we could describe the test as N(100, 15).
This normal probability distribution (or random variable) is often called a normal population.
You take some independent process in nature, then you do some kind of scientific reduction of that process to numbers via measurement operations, and finally you model those numbers in terms of some kind of probability distribution.
The next topic will be about taking samples from normal populations.
Achievement Test Example. Let's start again with an example of abduction. Let's say we have an infinite process, a child, happily playing in a tree, not knowing what's waiting for her, and then somebody shows up with a standardized achievement test at the end of second grade and she gets welcomed to the corporate world. So, she has to take a test which is designed to measure her achievement level for various culturally relevant school skills. When she's done the test is scored and she receives a number purported to measure her level of achievement. You are probably familiar with these sorts of tests since they are commonly given in most schools on a yearly basis. And most likely you had to take the SAT or the ACT to apply for college.
We model the results of her test with the normal distribution. The graphic shows that the test has mu = 200 and sigma = 10. So we can summarize all this information as N(200, 10). For the moment you simply have to accept these parameters (mu = 200 and sigma = 10) which I've made up more or less arbitrarily for this example. The scientific procedures required for the test makers to determine these parameters are long and complex and beyond the scope of this lecture.
We will call this normal distribution a population.
The Sampling Process. Statistically, we think of a population of people as distributed normally with some mean, mu, and some standard deviation, sigma. The number of people in a population is so large that it might as well be infinite.
When we do research we randomly draw a small number of people from the population. That is, using some sampling process (e.g., randomly choosing names from a voter registration list or randomly choosing a small number of property owners from the county records) we select a small portion of the population to study.
The people we draw are called the sample. In the graphic 4 people have been drawn from the population.
Now we measure our DV. That is, we turn each person into a number. This gives us our sample data. In the graphic, n = 4 pieces of sample data.
Scientific procedures. Suppose we want to give an achievement test to 10 second graders. As scientists we have to arrange to go to a school and get permission from the school, parents and children to do our research. Then we have to arrange a time and place to give the test. We have to find 10 volunteers to take the test. We have to administer the test carefully, making sure that time limits and other procedures are followed exactly. Then we have to score the 10 tests. This gives us 10 numbers which we call our data. Suppose that the first student gives us a score of 205, the second student gives us 198, and so on until the last student gives us 201. Collecting data is a lot of work and generally takes months or even years.
Statistical models. Look at the graphic. The population of achievement scores has been modeled as N(200, 10). The arrow coming out of that population indicates that we have randomly taken a sample of 10 scores from the population. The first score is 205, the second is 198, and so on until the last score, which is 201. This is, of course, the same data which we generated by our scientific procedures above. But in statistics, when we say that we "take a random sample of achievement scores" we summarize in that single phrase all the work involved in collecting scientific data.
So for statistical models we summarize scientific data collection simply as sampling from a population.
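In code, "sampling from a population" can be sketched in a few lines of Python. This is an illustration only: the function name `take_sample` and the seed value are assumptions made for this example, and the scores it produces are random draws, not the data shown on the graphic.

```python
import random

def take_sample(mu, sigma, n, rng):
    """Draw n independent scores from the normal population N(mu, sigma)."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

rng = random.Random(1)                    # arbitrary seed so a rerun gives the same scores
sample = take_sample(200, 10, 10, rng)    # n = 10 scores from N(200, 10)
print([round(score) for score in sample])
```

All the months of scientific work collapse into the single call to `take_sample`, which is exactly the simplification the statistical model makes.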
Next, we will make a vocabulary distinction between "statistics" and "parameters."
Population Parameters. In the statistical model we think that there is a population which is N(200, 10). We randomly sample 10 scores from that population to get our data. Mu and sigma are said to be the parameters of the population. Recall that we also said that we can call mu the "mean" of the population and we can call sigma the "standard deviation" of the population. We learned that the mean is the center of the population and the standard deviation indicates how spread out the population is.
Unfortunately we also use the terms mean and standard deviation in a related but distinct way. This use of mean and standard deviation to refer to different things can cause confusion unless a clear distinction is drawn.
Sample Statistics. A little later in the course we will discuss how to find the mean and the standard deviation of the sample data. We haven't done that yet, so don't expect yourself to know how. I'm simply giving you a heads up warning that the terms mean and standard deviation are used for both the sample data and for the population. And it will eventually be important to know which of these two we are talking about.
On the graphic I've shown the sample mean to be 198.665. I didn't show how I calculated it so don't worry about how to find the sample mean. Just notice that the sample mean is a little different than the population mean. Mu is 200 but the sample mean (symbolized by M) is equal to 198.665. The population mean and the sample mean are highly related but distinct concepts.
Notice also on the graphic that the sample standard deviation (S) is equal to 8.530. Again, I've not shown how to calculate the sample standard deviation, so you don't need to know that right now. But S, the sample standard deviation, has a slightly different value (8.530) than does the population standard deviation (10). The population standard deviation and the sample standard deviation are related but distinct concepts.
The symbol we will use for the sample mean is M. The symbol we will use for the sample standard deviation is S.
Parameters refer to probability distributions (populations).
Statistics refer to sample data.
Now we are going to turn to a StatCenter tool which allows you to collect samples from a normal distribution.
Finding "Normal Sample Tool". You can find this tool from either the Desk, the Ducks, or the Course Menu interfaces. From the Desk click on the Interactive Learning icon and look for Normal Sample Tool. From Ducks, just click on the Normal Sample Tool link under the Interact & Integrate section that follows the Normal Distribution Lecture. And from the Menu, just open the Work and Learn folder and click on Interactive Learning and choose Normal Sample Tool.
A tool for creating random samples from a normal population will then pop up. This is a very useful tool. I recommend opening and using the Normal Sample Tool as you think about the material in this section.
The current graphic (above) shows the Normal Sample Tool along with notes on how to use it. It will allow you to generate samples from any Normal Probability Distribution.
Setting Mu and Sigma. First, the Normal Sample Tool allows you to set the population parameters, mu and sigma. The graphic shows you where to type in the values of mu and sigma. Because our Achievement Test example uses N(200, 10), I have already typed in mu = 200 and sigma = 10. But if you have opened up the tool, you need to type in the correct parameters. Do that now.
Setting Sample Size (n). Next, the Normal Sample Tool allows you to set the number of data points in your sample. We use "n" to indicate how many scores we have in our sample. On the lecture graphic, I've set n to be 10.
Getting a sample. Simply clicking on the "Get Sample" button will give you a sample of the size you asked for from the normal population you defined. On the right hand side, upper panel, a normal distribution will appear with the mu and sigma you have set. On the right hand side, lower panel, a sample of scores will appear. The number of scores you get will depend on n, the sample size you set. The current lecture graphic shows a sample of size 10 taken from a population which is N(200, 10).
Each time you click the "Get Sample" button you will get a new sample with a different set of scores.
Sample Statistics. Notice that below the sample data the Normal Sample Tool automatically calculates the sample mean and standard deviation for you. Right now we haven't covered those topics yet, so just notice that the tool will make them available to you when, in the future, you will need them.
Click on the "Get Sample" button several times and notice how the sample data (as well as the sample statistics) change each time you take a sample. You are exploring a statistical model in which you assume that data comes from normal probability distributions and that collecting data amounts to taking a sample from a normal population.
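If you don't have the tool open, the same experiment can be roughly imitated in Python. This is a sketch under assumptions, not the tool's actual code: `get_sample` is a hypothetical stand-in for the button, the seed is arbitrary, and `statistics.stdev` divides by n - 1, which may or may not be the formula the tool uses.

```python
import random
import statistics

def get_sample(mu, sigma, n, rng):
    """Hypothetical stand-in for one click of the tool's sample button."""
    scores = [rng.gauss(mu, sigma) for _ in range(n)]
    return scores, statistics.mean(scores), statistics.stdev(scores)

rng = random.Random(3)       # arbitrary seed, assumed for repeatability
sample_means = []
for click in range(1, 6):    # five "clicks"
    scores, M, S = get_sample(200, 10, 10, rng)
    sample_means.append(M)
    print(f"click {click}: M={M:.2f}, S={S:.2f}")
```

Each pass through the loop gives a different M and S, even though mu and sigma never change.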
Now go back to the Normal Probability Distribution menu and select "Areas under the Normal Curve," which will be our next topic.
We are now going to find out how to use the Normal Distribution to find probabilities.
The Area between two scores. We will now learn how to find the probability that a score will fall between any two values on a normal probability distribution. If the previous sentence didn't make a lot of sense to you that's OK; we'll talk about what it means in some detail. For the moment you may recall that in the Probability lecture we mentioned that one interpretation of probability is that it can be represented as the area beneath a curve.
Just to make sure we stay grounded in the natural curiosity of science, recall that we find some interesting phenomenon in nature, reduce it to numbers by measurement operations, and then model those numbers as a random variable. The random variable we use most often is the normal probability distribution.
Height Example. Just to make sure that we don't focus too much on the details of a single example, let's change the example again. In this example we will be interested in the heights of northern European males. We take such a person and reduce them to a single number via the usual operations for measuring someone's height. Then we model the height of northern European males as a normal population with mu = 150 cm and sigma = 30 cm. In other words, our model is N(150, 30).
Finding "Normal Tool". You can find this tool from either the Desk, the Ducks, or the Course Menu interfaces. From the Desk click on the Interactive Learning icon and look for Normal Tool. From Ducks, just click on the Normal Tool link under the Normal Distribution Lecture. And from the Menu, just open the Work and Learn folder and click on Interactive Learning and choose Normal Tool.
The Normal Tool menu will appear. Click on the top button. Now we'll go on to explain the tool.
What is the Probability Between 140 and 170? We have modeled the heights of northern European males as N(150, 30). If that model is true, and if we sample one man from that population, what are the chances he has a height between 140 cm and 170 cm? We can answer questions like that with StatCenter's Normal Tool. And...such questions will be common on homeworks and exams.
Total Area under the Normal curve. Remember that we can interpret the area below a normal curve as probability. The total area below the normal curve (from negative infinity up to positive infinity) is assumed to be 1. That is, the probability that a man's height will fall between negative and positive infinity is 1. The previous statement should make sense. All possible heights must be between negative and positive infinity. And the probability of all possibilities is 1.
Area Between. First off, the current question we are asking is about the probability between 140 and 170 cm. Since the total area under the curve is 1, the area between 140 and 170 must be some fraction of 1. As we have said, we can interpret this area under the curve (which is some fraction of 1) as probability. But how do we use the tool to find this area (probability)?
On the Normal Tool the first thing you must do is make sure that the little icon indicating "area between" is clicked (see lecture graphic). "Between" is the default setting for the Normal Tool, so when you open it up it automatically gives you the area between two values.
Set mu. On the lecture graphic, arrows point to little boxes where you can set mu and sigma. First type in the mu which is relevant to whatever example you are working on. Then click the "Enter mu (50 - 500)" button right next to the box where you entered the value of mu. (Note: The Normal Probability Tool only accepts values of mu between 50 and 500.) For our height example, I have entered mu = 150.
Set sigma. The lecture graphic also shows where to enter the value of sigma (toward the lower right-hand corner of the tool). For our height example, I have entered sigma = 30. You must type in the value of sigma and then press the "Enter sigma" button next to it.
Set lower value. We are looking for the area (probability) between two values. The lecture graphic shows you where you can enter the lower of the two values. Once you type in the number, click on the button which says "Enter the lower score." For the height example, the lower value is 140 cm, so on the lecture graphic I have set the lower value to 140.
Set upper value. Similarly, as you can see on the lecture graphic, there's a box where you can enter the upper score. Following the height example, I have set the upper score to 170 on the lecture graphic.
Find probability. You have entered mu, sigma, upper score and lower score. Now you are ready to find the answer to the question. The lecture graphic points to a box where the probability will appear. All you have to do is read it and record it. For the height example, the probability that a northern European man's height will fall between 140 and 170 cm is .3747.
Black Area. Probability is represented by the black area under the curve. Look at the normal distribution on Normal Probability Tool. The black area between 140 and 170 represents a probability of .3747.
Area and Probability again. Conceptually what we are doing is interpreting the area under the normal curve as probability. We set the total area (from negative to positive infinity) to be 1. Then the area between any two values is some proportion of 1. In our case, the area under the curve between 140 and 170 was .3747 parts of 1. This area corresponds to the probability of .3747.
In other words. If we sample one man from our population, N(150, 30), the probability that he will have a height between 140 and 170 is .3747.
We have set up a correspondence between area on a picture we can see and the concept of probability. This allows us to picture probability clearly and simply.
That's how the Normal Tool works for finding the probability between two values.
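The same "area between" idea can be sketched in Python using the error function from the standard library. The helper names `normal_cdf`, `area_between` and `truncate_z` are made up for this illustration. Run as written, the exact area between 140 and 170 comes out near .378, a little above the .3747 shown on the tool. Cutting each z score off after two decimals (z = -.33 and z = .66) reproduces .3747, so the tool may be working that way; that explanation is a guess, not something the lecture states.

```python
import math

def normal_cdf(x, mu, sigma):
    """Probability that a score from N(mu, sigma) falls below x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def area_between(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between the lower and upper scores."""
    return normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)

def truncate_z(x, mu, sigma):
    """z score cut off after two decimals (an assumed table-style rounding)."""
    return math.trunc((x - mu) / sigma * 100) / 100

mu, sigma = 150, 30
exact = area_between(140, 170, mu, sigma)

# Same question asked on N(0, 1) with the truncated z scores.
z_low = truncate_z(140, mu, sigma)
z_high = truncate_z(170, mu, sigma)
table_like = area_between(z_low, z_high, 0, 1)

print(f"exact: {exact:.4f}, with truncated z: {table_like:.4f}")
```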
Practice. I recommend that you practice using the Normal Probability Tool now. For example what is the probability that the height of a northern European male is between 110 and 140 cm? (Answer: .2789.) What is the probability that a man's height will fall between 120 and 180 cm? (Answer: .6827.) What is the probability that a man's height will be between 90 and 210 cm? (Answer: .9545.) You can make up more questions for yourself.
Thought Problem: What is the probability that a randomly sampled northern European male will be within one standard deviation (sigma) of the mean (mu)? This way of asking a question is new to us and we'll be asking the question this way throughout the course. So for now let's just introduce the idea. If it seems a little confusing that's OK, just work along with this example. Your experience will be useful to you as we go along. First, let sigma = 30 and the mean = 150. When we say "within one sigma" in this example we mean between 120 and 180. The value 120 is one sigma (standard deviation) below the mean (mu). The value 180 is one sigma above the mean. So to find the probability that a male will be within one sigma of the mean we have to find the probability that he is between 120 and 180 cm. You've already done that in the practice problems above. The probability of being within one sigma of the mean is .6827.
Click and drag. Play with the Normal Tool. You'll notice that there are two blue pointers just below the normal curve. One is labeled "lower score" and the other "upper score." If you click on either of them, you can drag the black area to whatever value you want. The upper or lower score changes accordingly. The probability also changes accordingly. Try it and watch how the black area and the probability change together.
Positive and Negative Infinity. Play with the Normal Tool some more. You'll notice that to the right of the white boxes where you enter the upper and lower scores there are buttons labeled "-oo" and "+oo." This is as close as we could get to the symbols for negative infinity (-oo) and positive infinity (+oo). If you click on the minus infinity button (-oo) the lower score will become minus infinity. If you click on the plus infinity button (+oo) the upper score will become plus infinity. Try this out now. Find the probability that a height will fall between minus and plus infinity. (Answer: 1.) What is the probability that a height will fall between minus infinity and 150 cm? (Answer: .5.)
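The two infinity buttons have a direct counterpart in Python, where `math.inf` plays the role of "+oo" and `-math.inf` the role of "-oo". The sketch below is only an illustration; `normal_cdf` and `area_between` are made-up helper names.

```python
import math

def normal_cdf(x, mu, sigma):
    """Probability that a score from N(mu, sigma) falls below x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def area_between(lower, upper, mu, sigma):
    """Area under N(mu, sigma) between the lower and upper scores."""
    return normal_cdf(upper, mu, sigma) - normal_cdf(lower, mu, sigma)

mu, sigma = 150, 30
whole_curve = area_between(-math.inf, math.inf, mu, sigma)   # minus to plus infinity
lower_half = area_between(-math.inf, 150, mu, sigma)         # everything below mu
print(whole_curve, lower_half)
```

The first value is the total area under the curve and the second is the half of it that lies below mu, matching the two answers above.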
Now we will turn to a related question: what is the area (probability) outside of two values?
What is the Probability Outside 140 and 170? If we sample one northern European male, what's the probability that his height will fall outside of 140 and 170? In other words, what are the chances that he'll be either below 140, or he'll be above 170 in height? That's what we mean by the word "outside."
Area Outside. As you can see by watching the moving graphics on the Authorware program or the static graphics printed on this page, the first thing you have to do is click the icon for "Area Outside" on the Normal Tool. The Normal Tool will now show you the area outside 140 and 170. It will also change the probability.
And then you do exactly the same thing that you did before. For our current example, you set mu at 150, set sigma at 30, set the lower value at 140, set the upper value at 170.
Find probability. Then you simply read the probability. This time it is .6253. The probability that a height will fall outside (above or below) 140 and 170 cm is .6253.
Black Areas. The probability of .6253 is represented by those two black areas under the normal curve. Again, we are creating a correspondence between the idea of probability and the area under a curve.
Practice. Once again, I recommend that you practice using the Normal Probability Tool now. For example what is the probability that the height of a northern European male is outside 145 and 185 cm? (Answer: .5595.) What is the probability that the height will fall outside 120 and 180 cm? (Answer: .3173.) What is the probability that the height will fall outside 90 and 210 cm? (Answer: .0455.) You can make up questions for yourself. Play with the Normal Tool.
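The "area outside" questions can also be checked with a short Python sketch. The helper names `normal_cdf` and `area_outside` are made up for this illustration. For cut points that sit a whole number of sigmas from mu (120 and 180, or 90 and 210) the results should agree with the answers above. For 140 and 170 the exact value is about .622 rather than the tool's .6253; that small gap would be consistent with the tool working from z scores cut to two decimals, though that is an assumption.

```python
import math

def normal_cdf(x, mu, sigma):
    """Probability that a score from N(mu, sigma) falls below x."""
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def area_outside(lower, upper, mu, sigma):
    """Both tails together: the area below lower plus the area above upper."""
    below = normal_cdf(lower, mu, sigma)
    above = 1 - normal_cdf(upper, mu, sigma)
    return below + above

mu, sigma = 150, 30
outside_one_sigma = area_outside(120, 180, mu, sigma)
outside_two_sigma = area_outside(90, 210, mu, sigma)
outside_example = area_outside(140, 170, mu, sigma)

print(f"{outside_one_sigma:.4f} {outside_two_sigma:.4f} {outside_example:.4f}")
```

Notice that the area outside two values is always 1 minus the area between them, because the two black tails and the middle region together make up the whole curve.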
Now we will turn to finding the area above a certain value.
What is the Probability Above 170? Perhaps a basketball coach is interested in tall men. We have modeled the heights of northern European males as N(150, 30). If that model is true, and if we sample one man from that population, what are the chances he has a height above 170 cm? This question implies that the lower score will be 170 and the upper score will be plus infinity. All scores above 170 will fall between 170 (on the low end) and plus infinity (on the high end).
Set mu: 150.
Set sigma: 30.
Click Between Icon.
Set lower score: 170
Set upper score: +oo.
Read probability: .2546. There's about a 25% chance that the man would have a height above 170 cm. That's represented by the black area under the normal curve.
Practice: Play some more with the Normal Tool. What's the probability that a height will be above 150 cm? (Answer: .5.) What's the probability that a height will be above 210 cm? (Answer: .0228.) What's the probability that a height will be below 140 cm? (Answer: .3707.) [Note: Set lower score to -oo and upper score to 140.] What's the probability that a height will be below 150 cm? (Answer: .5.) What's the probability that a height will be below 210 cm? (Answer: .9772).
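Areas above or below a single value can be sketched with Python's built-in `statistics.NormalDist` class (available in Python 3.8 and later). The variable name `heights` and the helpers `area_above` and `area_below` are made up for this illustration. The exact area above 170 comes out near .2525, slightly different from the tool's .2546; one possible reason, assumed here rather than stated in the lecture, is that the tool cuts z scores to two decimals.

```python
from statistics import NormalDist

heights = NormalDist(mu=150, sigma=30)   # the height model N(150, 30)

def area_above(x, dist):
    """Area from x up to plus infinity."""
    return 1 - dist.cdf(x)

def area_below(x, dist):
    """Area from minus infinity up to x."""
    return dist.cdf(x)

above_170 = area_above(170, heights)
above_210 = area_above(210, heights)
below_150 = area_below(150, heights)
below_210 = area_below(210, heights)

print(f"{above_170:.4f} {above_210:.4f} {below_150:.4f} {below_210:.4f}")
```

The areas above and below the same value always add up to 1, which is why the answers for 210 cm pair up as .0228 and .9772.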
Now we will turn to a specialized topic called the Unit Normal.
N(0, 1): There is a particular form of the normal distribution which is very commonly used in statistics. It is called the unit normal, the standard normal, or the z distribution. The unit normal is simply a normal distribution which has a mean (mu) = 0, and a standard deviation (sigma) = 1. In more compressed symbols the unit normal is N(0, 1).
Everything works exactly the same with the unit normal as it does for any normal. So everything we've already learned applies to this topic. We will just be using a particular member of the normal family of distributions. This member of the family has mu = 0 and sigma = 1 and is sometimes called the z distribution.
z-Tables in Stat Books. The unit normal is the particular form of the normal that is found in z-tables in the back of stat books. "In the old days" before we had interactive programs like Normal Tool, we had to convert all questions to z scores and look up probabilities in z-tables. For that reason the unit normal has historical importance. So we'll study it here a little bit. But we will use the Normal Tool to find probabilities. We won't have to learn to look up probabilities in stat book tables.
Finding the Standard Normal Option on the Normal Tool. You can open the Standard Normal (z) tool by clicking the lower button on the Normal Tool menu.
The standard normal is also called the unit normal or the z-distribution.
Question. Suppose that we have N(0, 1) as our probability model. What is the probability of a score between -1 and +1 on N(0, 1)?
Don't need to set mu and sigma. On the unit normal, N(0, 1), mu is always 0 and sigma is always 1. So you don't need to set them.
Click on the Area Between Icon.
Set lower and upper scores. Set the lower and upper score as we did above. In this case the lower score is -1 and the upper score is +1. When you start the Unit Normal option, it will come up with minus one and plus one as the lower and upper scores. So we don't have to do anything to solve the particular question we have asked.
Read the probability. The answer is .6827. This should be familiar to you. If it's not, it soon will be.
Connections. Notice that since the standard deviation of N(0, 1) is 1, the score "-1" is one standard deviation below the mean, and the score "+1" is one standard deviation above the mean. So the probability between -1 and +1 is the same probability we got when we solved the thought problem above. In that thought problem, where the model for northern European heights was N(150, 30), you were asked to find the probability that a height was within one standard deviation of the mean (150). We translated that into asking for the probability between 120 (minus one standard deviation) and 180 (plus one standard deviation).
So if you did that problem, you'll notice that it came out exactly the same: .6827.
On N(150, 30) the scores 120 and 180 are one standard deviation below and above the mean. On N(0, 1), the scores -1 and +1 are one standard deviation below and above the mean. The probability of being within one standard deviation of the mean is .6827 for all normal distributions.
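The connection can be checked numerically. The sketch below assumes Python's standard `statistics.NormalDist` class; the helper name `prob_between` is invented for this illustration and is not part of the Normal Tool.

```python
from statistics import NormalDist

def prob_between(dist, lower, upper):
    """Area under the curve between `lower` and `upper` (hypothetical helper)."""
    return dist.cdf(upper) - dist.cdf(lower)

unit = NormalDist(mu=0, sigma=1)        # the unit normal, N(0, 1)
heights = NormalDist(mu=150, sigma=30)  # the height model, N(150, 30)

# One standard deviation either side of the mean on each distribution.
p_unit = prob_between(unit, -1, 1)
p_heights = prob_between(heights, 120, 180)

print(round(p_unit, 4), round(p_heights, 4))  # 0.6827 0.6827
```

Any other mu and sigma would give the same result, as long as the lower and upper scores are placed one sigma below and above mu.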
Practice. Play with the Unit Normal option. What is the probability of a score falling between -.25 and +1.96 on N(0, 1)? (Answer: .5737.) What is the probability of a score falling below +1.96? (Answer: .9750.) [Note: Set the lower score to -oo and the upper score to 1.96.] What is the probability of a score falling above 1.96? (Answer: .025.) What is the probability of a score falling between -1.96 and +1.96? What is the probability of a score falling outside -1.96 and +1.96?
So for the unit normal (z distribution), mu is always 0 and sigma is always 1. That is very convenient: you don't have to set either one.
All you do have to do is set the lower score and the upper score and decide if you are looking for an area between or outside the upper and lower scores.
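Those two choices (area between or area outside) can be sketched in Python as follows. This assumes the standard `statistics.NormalDist` class; the function names `area_between` and `area_outside` are hypothetical and only mirror the two icons on the tool.

```python
from statistics import NormalDist

# On the unit normal mu and sigma are fixed, so there is nothing else to set.
unit = NormalDist(mu=0, sigma=1)

def area_between(lower, upper):
    """Probability of a score falling between the lower and upper scores."""
    return unit.cdf(upper) - unit.cdf(lower)

def area_outside(lower, upper):
    """Probability of a score falling below the lower or above the upper score."""
    return 1.0 - area_between(lower, upper)

print(round(area_between(-0.25, 1.96), 4))          # 0.5737
print(round(area_between(float("-inf"), 1.96), 4))  # 0.975
```

Setting the lower score to `float("-inf")` plays the same role as setting it to -oo on the tool. The between and outside areas for the same pair of scores always add up to 1.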
Now the question is why do we call this the z distribution? We will go on and examine z scores.
z Scores Conceptually. Conceptually, z scores are used to convert any Normal Distribution to the Unit Normal, N(0, 1). This is our first encounter with z scores, but this idea will be used throughout the class.
Height example. Let's go back to our height example. We modeled height as N(150, 30). Suppose we have a man from that population who has a height of 135 cm. What is his z score? In other words we want to convert a score of 135 cm from our population to a z score from the standardized normal, N(0, 1). Just to have a useful name, we will call 135 cm the "raw score." Typically, this raw score will be symbolized by X. We will convert this raw score (X) into a z score.
z Formula. As you can see on the lecture graphic, z = the difference between a raw score and the mean of its population divided by the standard deviation of the population. In symbols, z = (X - mu) / sigma. Our raw score (X) is 135, so z = (135 - 150) / 30 = -15 / 30 = -.5.
The graphic shows that a raw score of 135 on N(150, 30) has a z score = -.5 on N(0, 1).
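The conversion is simple enough to write as a short Python function. The names `z_score` and `raw_score` are made up for this sketch; they are not part of any StatCenter tool.

```python
def z_score(x, mu, sigma):
    """Distance of raw score x from the mean, measured in standard deviations."""
    return (x - mu) / sigma

def raw_score(z, mu, sigma):
    """Undo the conversion: go from a z score back to the raw score."""
    return mu + z * sigma

# The lecture's example: a height of 135 cm on N(150, 30).
print(z_score(135, 150, 30))     # -0.5
print(raw_score(-0.5, 150, 30))  # 135.0
```

The second function is just the z equation solved for X, so converting a raw score to z and back returns the score you started with.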
Practice. If a population is modeled as N(100, 10), what is the z score of a raw score of 80? (Answer: z = -2.) What is the z score of a raw score 115? (Answer: z = 1.5.) If you have a z score of 2, what would its raw score be on a population modeled as N(100, 10)? (Answer: X = 120.) [Note: Write down the z equation in symbols. Then rewrite it with all the information I just gave you plugged in. The only symbol left will be X. Solve for X.]
Review of Inflection Points, sigma, and the z score conversion. Remember that inflection points are where a curve shifts from facing concave downward to concave upward, or vice versa. The normal curve has two inflection points. The lower one is exactly one standard deviation (sigma) below the mean (mu). The upper one is exactly one standard deviation above the mean. Let's integrate that idea with converting from any population to the standardized normal.
New Example: Suppose we have a population which is modeled by N(270, 20). On the picture, what population values are directly under the two inflection points? Well, we know that the lower inflection point is one sigma below the mean. So it will be at 270 - 20 which is = 250. The upper inflection point will be one sigma above the mean. So it will be at 270 + 20 which is = 290.
Conversion to z distribution. The z formula will convert any score, X, from any normal distribution to the standardized normal, N(0, 1). Let's convert the two inflection points on N(270, 20) to z scores on N(0, 1). The two inflection points are 250 and 290. They are the scores which are one standard deviation from the mean (270).
As you can see from the graphic, the raw score 250 converts to a z score of -1. And the raw score of 290 converts to a z score of +1.
The graphic also shows that -1 and +1 are the two inflection points of the unit normal. That is because the unit normal has a standard deviation of 1 and mean of 0. So -1 is one sigma below the mean and +1 is one sigma above the mean.
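As a rough numerical check of this example, the sketch below converts the two inflection points to z scores and then tests the curvature on either side of the lower one. It assumes Python's standard `statistics.NormalDist` class; `second_difference` is a hypothetical helper that approximates the curvature of the curve (negative means concave downward, positive means concave upward).

```python
from statistics import NormalDist

pop = NormalDist(mu=270, sigma=20)

# The inflection points sit one sigma below and one sigma above the mean.
lower = pop.mean - pop.stdev   # 250.0
upper = pop.mean + pop.stdev   # 290.0

# Convert both to z scores with z = (X - mu) / sigma.
z_lower = (lower - pop.mean) / pop.stdev
z_upper = (upper - pop.mean) / pop.stdev
print(z_lower, z_upper)  # -1.0 1.0

def second_difference(dist, x, h=0.5):
    """Approximate curvature of the curve at x (hypothetical helper)."""
    return dist.pdf(x - h) - 2 * dist.pdf(x) + dist.pdf(x + h)

# Inside the inflection point the curve is concave downward,
# outside it the curve is concave upward.
print(second_difference(pop, 260) < 0)  # True
print(second_difference(pop, 240) > 0)  # True
```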
Now we will go on and start to foreshadow some material we will cover much later. We will examine a case in which the probability of being above a score is .05.
Converse question. Suppose that we go back to our example about the heights of northern European males. We modeled height as N(150, 30). We can ask a question which is the converse of the type of questions we have been asking. We have been asking questions like what is the probability that a height will be above 160 cm? Now we ask what is the height which has a certain probability above it.
.05. Let's find a height above which there is a .05 probability of sampling a man. Or, above what height does .05 of the probability lie? This probability (.05) will be of considerable interest to us later; so we will start playing with it now.
In the previous topic we were using the Normal Tool for the Unit Normal. Go back to the Normal Tool's menu and choose the Normal Tool for any normal population.
The question is "above what height does .05 of the probability lie?" We've got a probability and now have to find the height. It's always a little harder to answer that kind of question because neither the tables in the backs of books nor the StatCenter probability tools are set up to give you that information very well.
Set up. When we have our Normal Tool up and running, we press the Between Icon, and then we set our mu = 150, our sigma = 30, and our upper score to plus infinity. Again, the question is above what height (clear up to +oo) does .05 of the area lie?
Drag lower score. Drag the blue lower score pointer and watch the probability output window down here in the lower right corner. Drag the lower score pointer until you get close to a probability of .05. You may have to go back and forth. Sometimes probability will be bigger than .05, sometimes smaller than .05. By trial and error you'll finally get down to 2 numbers that kind of bracket .05. But it'll never fall exactly on .05. The Normal Tool is not that exact. So we can only be approximate.
On the graphic I stopped dragging when I got a probability of .0485. This is very close to .05. So I got a probability as close to .05 as I could possibly get.
Solution. Once you have the probability as close to .05 as you can get, stop dragging the pointer. Look down at the window which shows the value of the lower score. In this example the lower score will come out to be 200.
The height which has .05 of the probability above it is 200 cm. Or we can say that the probability of sampling a man taller than 200 cm is .05. Actually, the probability is .0485, which is as close to .05 as we can get.
About 5% of all the heights in this population fall above 200 cm.
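The same converse question can be sketched in Python in two ways. Both assume the standard `statistics.NormalDist` class and neither is how the Normal Tool works internally. The first asks directly for the score with .95 of the probability below it. The second, using a made-up function named `height_with_tail`, imitates the dragging procedure: it keeps narrowing a bracket around the answer until the area above the pointer is as close to .05 as we like.

```python
from statistics import NormalDist

heights = NormalDist(mu=150, sigma=30)

# Direct route: .05 above a height means .95 below it.
cutoff = heights.inv_cdf(1 - 0.05)
print(round(cutoff, 1))  # 199.3

def height_with_tail(dist, tail, lo, hi, tol=1e-6):
    """Trial and error by bisection (hypothetical helper).

    Assumes the answer lies somewhere between lo and hi.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if 1.0 - dist.cdf(mid) > tail:
            lo = mid  # too much area above: move the pointer to the right
        else:
            hi = mid  # too little area above: move the pointer to the left
    return (lo + hi) / 2

print(round(height_with_tail(heights, 0.05, 150, 300), 1))  # 199.3
```

Both routes land a little under 200 cm, which agrees with the approximate answer read off the Normal Tool.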
Practice. Above what height does 10% of the population lie? (Answer: 189 cm.) Above what z score on the standardized normal does .05 of the probability lie? (Answer: 1.64 or 1.65.)