Binomial
Probability Distribution Web Page
This
is the text of the in-class lecture which accompanied the Authorware
visual graphics on this topic. You may print this text out and use
it as a textbook. Or you may read it online. In either case it is
coordinated with the online Authorware graphics.
Topic
Locator Map


This map allows you to:
- Jump directly to a topic which interests you.
- Coordinate the dynamic visual Authorware presentations with the
corresponding text available on this web page.
1. To find a topic which interests you: Look at the map of menus
above. Choose a menu that interests you. Notice that the menu buttons
have topics printed on them. Click on any button (topic) on the
menu; you will jump directly to the text that corresponds to the
topic printed on the button.
2. To coordinate this web page with Authorware presentations: The
corresponding Authorware program should already be open. Go to the
menu of your choice in the Authorware program and click any
button which interests you. Then on the topic locator map above
click on the same button on the same menu; you will
jump to the text that corresponds to the Authorware presentation.
End
of Topic Locator Map
Begin
Text Explaining the Binomial Probability Distribution

In the last lecture we
studied the Normal Probability Distribution. The Binomial is another
very important probability distribution which is deeply involved
in the theory of statistics. We will begin this topic by reviewing
what we mean by a Bernoulli Trial. We touched on Bernoulli
Trials previously in the Probability Lecture. It's an idea which
is not generally familiar to people and it involves some jargon,
so it's worth going through again.

Back to Menu Locator
Map
Abduction: It's a
girl. Let's take an infinite process, the birth of a human child.
There are many dependent measurement operations that are commonly
applied to a birth. We could measure the weight at birth or the
length of the child in inches. We could measure the number of hours
of labor, and so on. There's an infinite number of things we could
measure about a birth. None of them nor all of them together, of
course, would fully capture the experience of the woman or child
during birth.
One operation we seem
to be utterly fascinated with is to categorize the child by gender.
So one of the things we do, both conversationally and in research,
is reduce the person to their gender, male or female.
Once we have categorized
the birth by gender, we can model gender in terms of probability.
We say things like "the probability of a girl is .5."
Abduction: It's a
head. Another thing that we like to do is flip coins. As we've
noted before, flipping a coin is an infinite process. We could measure
the amount of time the coin is in the air, we could count the number
of times the coin turns over, we could make any number of measurements
about this process. Of course we generally just look at the side
facing up when the coin comes to rest. By convention we name the
two sides Heads and Tails. So we reduce the process of a coin flip
to one of two categories: head or tail.
As with birth, we model
the result of the coin flip in terms of probability. We say things
like "the probability of a head is .5."

Bernoulli Trials.
There are processes in nature which are reduced by human operations
to two categories and then modeled in terms of probability. This
is what we mean by a Bernoulli Trial. A Bernoulli Trial is a process
that can have only two outcomes.
Now, as I keep stressing,
most processes are infinite and the two outcomes are a result of
our measurement operation. So obviously our child at birth is an
infinite process, and the two outcomes exist only because we choose
to talk about or do research on gender. This is fine; but
for both human and scientific reasons, it is important to remember
the infinity behind the operations.

Success and Failure.
Traditionally in the jargon of probability theory one of the two
possible outcomes of a Bernoulli Trial is called a success and the
other one a failure. There's no value judgment implied by the use
of these words in the probability context. Success and failure are
being used as conventions. They are merely names which indicate
the two outcomes of the Bernoulli Trial. Whatever you decide to
call a success and a failure is completely arbitrary.
A head could be called
a success. In that case we would say that the probability of a success
is .5. Of course that makes a tail a failure. And the probability
of a failure would also be .5.

p and q. The probability
of a success is typically denoted by a small p. And the probability
of a failure is denoted by a small q.
Okay, so probability
of a success is equal to p, probability of a failure is equal to
q. That's the main information here.
Since there are two outcomes,
we might as well notice that p plus q must be equal to 1 (because
the total probability in any system must be 1). [Remember that the
sure event in flipping a coin is that you're going to get a head
or a tail, and the probability of the sure event is 1. So p plus
q must be equal to 1.]
That means that if you
have p you can find q because q will be 1 - p.
We should note that p
and q do not always have to be .5. We just happen to have used two
examples where p and q are both .5. But they can be other probabilities.
In general, p might very well be .8, and then q would be .2.
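The relationship between p and q can be checked with a short simulation. This is only a minimal sketch and is not part of StatCenter: the function name `bernoulli_trial`, the seed, and the number of repetitions are assumptions chosen for illustration.

```python
import random

def bernoulli_trial(p, rng):
    """Return 1 (a success) with probability p, otherwise 0 (a failure)."""
    return 1 if rng.random() < p else 0

p = 0.8      # probability of a success
q = 1 - p    # probability of a failure, because p + q must equal 1

rng = random.Random(1)   # fixed seed so the run is repeatable
trials = [bernoulli_trial(p, rng) for _ in range(100_000)]
success_rate = sum(trials) / len(trials)

print("q =", round(q, 2))
print("observed proportion of successes:", round(success_rate, 3))
```

The observed proportion of successes should land close to p = .8, which leaves a proportion close to q = .2 for the failures.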
Now we're going to define
the binomial distribution.


Binomial
Distribution
N Independent Bernoulli
Trials. To define a binomial distribution, we need two things.
First, we need some number, N, of independent Bernoulli trials.
By independent I mean events that are not correlated with each other.
When we flip two coins, the outcome of one flip has no effect on
the outcome of the other. If the first coin is a head, the probability
of a head on the second coin is still .5. There is no connection
or correlation between the results of the two flips. So flipping
two coins can be thought of as 2 Bernoulli Trials. The outcome of
the second trial is independent of the outcome of the first trial.
In this case N = 2.
Flipping 5 coins can
be thought of as 5 Bernoulli Trials. The result of each flip is
independent of the results of the other flips. In this case N =
5. If we flip 10 coins then N = 10 because we have 10 Bernoulli
Trials.
r Successes in N Trials.
Second, we need to count some number of successes in the N Bernoulli
Trials. We use r as the symbol for the number of successes.
Suppose we flip 10 coins and call a head a success; then N = 10.
We might wonder what the chances are of getting 8 successes (heads)
in 10 flips. In this case r = 8 and N = 10. We might wonder what
the chances are of getting 4 successes in 10 flips. In this case
r = 4 and N = 10.
Let's say that there
is a neighborhood in which, of 20 children, 11 are girls. We might
wonder what the probability of getting 11 girls in 20 births is.
Here r = 11 and N = 20.
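One way to get a feel for "r successes in N trials" is to simulate it. The sketch below is an illustration only: the function name `count_successes`, the seed, and the number of repeats are assumptions, not anything taken from StatCenter. It flips 10 simulated coins many times and counts how often exactly 8 heads turn up.

```python
import random

def count_successes(N, p, rng):
    """Run N independent Bernoulli trials and return r, the number of successes."""
    return sum(1 for _ in range(N) if rng.random() < p)

rng = random.Random(2)   # fixed seed so the run is repeatable
N, p, r = 10, 0.5, 8
repeats = 50_000

# How often does a set of N flips give exactly r heads?
hits = sum(1 for _ in range(repeats) if count_successes(N, p, rng) == r)
estimate = hits / repeats
print("estimated chance of", r, "heads in", N, "flips:", round(estimate, 3))
```

The estimate should come out near .044. The exact value is 45/1024, because 45 of the 1024 equally likely head/tail sequences contain exactly 8 heads.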
Binomial Probability
Distribution. When you have a question that asks the probability
of a certain number of successes (r) in a certain number (N) of
Bernoulli trials then that probability can be calculated from what's
called the Binomial Distribution.
The Binomial gives you
the probability of r successes in N trials.
P(r; p, N). To
work with the Binomial we must specify three things: r, p, and N.
We must know how many successes we are interested in; we must know
the probability of a success; and we must know how many trials we
are talking about. The symbols for these three are, of course, r,
p, and N. The standard notation for expressing the probability of
r successes in N trials with p as probability of a success is P(r;
p, N).
For example, suppose
we have 8 independent births and we define a success as a girl.
Suppose p = .5. We might want to know the probability of 4 girls
(successes) in 8 births. This would be expressed as P(4; .5, 8).
The probability of 11 girls in 20 births would be P(11; .5, 20).
The standard notation
allows you to write a big sentence in a little symbol.
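The formula behind the notation is not spelled out in this text. The standard binomial formula is P(r; p, N) = C(N, r) * p^r * q^(N - r), where C(N, r) is the number of ways to choose which r of the N trials are the successes. A minimal Python sketch follows; the function name `binomial_prob` is an assumption for illustration.

```python
from math import comb

def binomial_prob(r, p, N):
    """P(r; p, N): probability of exactly r successes in N independent trials."""
    q = 1 - p
    return comb(N, r) * p**r * q**(N - r)

print(round(binomial_prob(4, 0.5, 8), 4))    # P(4; .5, 8)   -> 0.2734
print(round(binomial_prob(11, 0.5, 20), 4))  # P(11; .5, 20) -> 0.1602
```

The argument order matches the notation, so `binomial_prob(4, 0.5, 8)` reads the same way as P(4; .5, 8).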
Between. Just
as with the Normal Distribution, with the Binomial we will distinguish
"probabilities between values" from "probabilities
outside values." Suppose N = 8, p = .5 and we want to know
what the chances are of getting between 2 and 5 girls in
8 births.
The convention is that
"between" is inclusive. When I say between 2 and 5 girls
I mean 2 or 3 or 4 or 5 girls. Both 2 and 5 are included in the
potential number of successes.
Outside. Conversely,
the convention is that the probability of outside 2 and 5 girls
excludes 2 and 5. If we have N = 8 births then outside 2 and 5 means
0 or 1 or 6 or 7 or 8 girls. Both 2 and 5 are excluded.
Between is inclusive.
Outside is exclusive.
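The two conventions can be written as two sums over the number of successes. This is a sketch under the assumption that the standard binomial formula is used; the function names `prob_between` and `prob_outside` are made up for illustration.

```python
from math import comb

def binomial_prob(r, p, N):
    """Probability of exactly r successes in N independent trials."""
    return comb(N, r) * p**r * (1 - p)**(N - r)

def prob_between(lower, upper, p, N):
    """Inclusive: adds up r = lower, lower + 1, ..., upper."""
    return sum(binomial_prob(r, p, N) for r in range(lower, upper + 1))

def prob_outside(lower, upper, p, N):
    """Exclusive: adds up every r below lower or above upper."""
    return sum(binomial_prob(r, p, N)
               for r in range(N + 1) if r < lower or r > upper)

print(round(prob_between(2, 5, 0.5, 8), 4))  # 2, 3, 4, or 5 girls    -> 0.8203
print(round(prob_outside(2, 5, 0.5, 8), 4))  # 0, 1, 6, 7, or 8 girls -> 0.1797
```

Every possible number of girls falls into exactly one of the two sums, so the two probabilities add up to 1.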

The last graphic on this
topic shows how the Binomial Distribution looks when drawn by StatCenter's
Binomial Tool. We will learn how to use that tool in the next topic.
Notice that the number
of successes (r) runs along the horizontal axis. The probability
for each number of successes goes up the vertical axis--the higher
the black area the higher the probability. And, circled in green,
in the top left corner, the standard notation appears. In this case
it says P(r; p = 0.5, N = 10).
In the example shown
in the graphic N = 10 and p = .5. The number of successes (horizontal
axis) runs from 0 to 10. This of course is all the possible numbers
of successes we could get in 10 trials. Notice that the probability
of 5 successes is relatively high compared to, say, the probability
of 2 successes. If that's not obvious yet (or you can't read the
graphic very well), that's OK. We'll practice a lot with this distribution.
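If the graphic is hard to read, a rough text version of the same picture can be printed. This sketch is not the Binomial Tool; it assumes the standard binomial formula and draws one `#` per percentage point of probability.

```python
from math import comb

N, p = 10, 0.5

# Probability of each possible number of successes, r = 0, 1, ..., N
probs = [comb(N, r) * p**r * (1 - p)**(N - r) for r in range(N + 1)]

for r, prob in enumerate(probs):
    bar = "#" * round(prob * 100)   # one '#' per percentage point
    print(f"r = {r:2d}  {prob:.4f}  {bar}")
```

The row for r = 5 (probability .2461) is much longer than the row for r = 2 (probability .0439), which is the pattern described above.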


Suppose
we flip a fair coin 8 times and define a success as a head.
Because the coin is fair, p = .5. The Binomial Distribution
allows us to answer questions like: What's the probability of 4 heads
in 8 flips? What's the probability of between 2 and 5 heads
in 8 flips? What's the probability of getting outside 2 and
5 heads in 8 flips? Use StatCenter's Binomial Tool to find these
probabilities.

Finding "Binomial
Tool". You can find this tool from either the Desk, the
Ducks, or the Course Menu interfaces. From the Desk click on the
Interactive Learning icon and look for Binomial Tool. From Ducks,
just click on the Binomial Sample Tool link under the Binomial Distribution
Lecture. And from the Menu, just open the Work and Learn folder
and click on Interactive Learning and choose Binomial Tool.

Click on the middle button
(Probability Tools for Binomial Distribution). The Binomial Tool
will appear. Alternate between the Binomial Tool and this web page
and, perhaps, the Authorware program. Best to minimize or close
all other programs except these three. For example, if StatCenter's
Main Menu is on the screen, minimize it.

Now let's use the Binomial
Tool. Suppose we flip a fair coin 8 times. Define a head as a success,
with p = .5. What is the probability of r = 4 successes in 8 trials
when p = .5?

Set N. The graphic
points where to set the number of trials, N. In this case enter
8, and click the Enter N button.
Set p. The graphic
also points out where to set the probability of a success, p. In
this case p should already read .5. If not, enter .5 and click the
Enter Probability button.
Set Between. Click
on the "Between Icon."
Set upper and lower
scores. In this case we want to know the probability of exactly
r = 4 successes in 8 trials. So we will set both the upper
and the lower score to 4. Remember that between is inclusive. So
if both the lower and upper score are set to 4, the probability
will include 4 (and only 4). This may seem a bit odd at first, but
it works.
Read the Probability.
The small white window in the lower right corner should read 0.2734.
This is the probability of 4 successes in 8 trials when p = .5.
Another way to write this is P(4; .5, 8) = 0.2734.
Black Area. As
with the Normal Distribution, with the Binomial the black area represents
the relevant probability.
That's all you have to
do. After your experience with the Normal Tool, this procedure should
seem familiar.
Practice. What
is the probability of exactly 3 successes in 8 trials with p = .5?
(Answer: .2188.) What is the probability of 3 successes in 8 trials
when p = .4? (Answer: .2787.) What is P(4; .4, 8)? (Answer:
.2322. You have to change the upper value first.) What is P(12;
.8, 20)? (Answer: .0222.)


Now let's answer another
kind of question. What's the probability of getting between
2 and 5 heads in 8 flips of a fair coin? Here again, you use the
binomial tool.
Input information
given in the question. Set N = 8, set p = .5, set the lower
value of r = 2, and set the upper value of r = 5. Make sure the Between
Icon is clicked.
Then simply read the
probability which is .8203. The probability will again be represented
by the black area. As usual with both the Normal and Binomial distributions,
we're interpreting
probability as an area.
Practice. What
is the probability of getting between 0 and 3 heads in 8 flips of
a fair coin? (Answer: .3633.) What is the probability of getting
between 0 and 8 heads in 8 flips of a fair coin? (Answer: 1. Between
0 and 8 successes in 8 trials includes all possible outcomes. And
all possible outcomes must have a probability of 1.) What is the probability
of between 3 and 14 heads in 20 flips of a fair coin? (Answer: .9791.)
What is the probability of getting between 3 and 14 successes if
p = .8? (Answer: .1958.)


Now let's find the probability
of getting outside 2 and 5 heads in 8 tosses of a fair coin. This
will be the same as the example we just finished, of course,
except that you click on the Outside Icon instead of the Between
Icon.
Input information
from the question. Click on the Outside Icon. Enter p = .5.
Enter N = 8. Enter lower and upper scores of 2 and 5. Remember that
"outside" is exclusive and does not include 2 and 5.
Read the probability.
In the probability output window in the lower right corner you will
find the answer. The probability of getting outside 2 and 5 heads
is .1797.
Black Area. Once
again the black area represents probability. Click back and forth
between the Outside Icon and the Between Icon so you can see how
both the black area and the probability change.
Notice that the probability outside plus the probability between
equals 1. That is, .8203+.1797=1.
Practice. What
is the probability of getting outside 6 and 8 successes in
12 trials if p = .6? (Answer: .3835.) What is the probability of
getting between 6 and 8 successes in 12 trials if p = .6?
(Answer: .6165.)
Play with the Binomial
Tool, entering various different parameters to discover what happens.


Catching Cold.
Let's say that the probability of catching a cold in Salt Lake City
in January and February is known to be .5. In this example we're
keeping the probability at .5 so that it's like the familiar flip
of a coin. Suppose also that some researchers develop a cold vaccine
and therefore are on the threshold of becoming famous and perhaps
rich. But before they get famous, they will have to convince people
that their vaccine works. One way to convince them is to do some
research. They run a study with 10 volunteers to test whether the
vaccine is effective or not. They administer the vaccine to the
10 volunteers and then determine if each volunteer gets a cold or
not during January and February. If the vaccine works, the chances
of catching cold among the volunteers should be less than .5.
Science. The scientific
hypothesis is that the vaccine will improve health (that is, it
will reduce the chances of a cold). The independent variable (IV)
then is the vaccine, and the dependent variable (DV) is health,
which they will measure by categorizing whether or not each volunteer
gets a cold.
No Effect. The
scientists expect the vaccine (IV) will affect the DV (health).
But for the moment we're going to assume that (unknown to researchers)
the vaccine is completely ineffective. Despite the researchers'
belief that it's effective, it's actually worthless. Therefore the
vaccine is going to have no effect on the chances of getting a cold.
Therefore the 10
volunteers have the same probability of catching a cold as anyone
else. P(Cold) = .5. That's the set up.

Abduction.
We take volunteers and we reduce them via measurement operations
to a variable, call it X. X will be equal to 0 if they don't
get a cold during January and February. X will equal 1 if they do
get a cold. During January and February each volunteer is going
to have a lot of experiences. But we've reduced this person to a
0 or a 1. Perhaps the person will sign up for a really bad class
and wish that their life wasn't ruined by it, or maybe he or she
will meet someone really interesting. There's all kinds of things
that are going to happen during January and February. But we've
reduced her or him down to 0 or 1--don't get a cold or get a cold.
That's a massive reduction but of some use for the question we're
asking about vaccines.
Once we
have X defined we can model it as a Bernoulli trial. We'll
call getting a cold a success and not getting a cold a failure.
(Remember, success and failure are just arbitrary names.) The probability
of a success, p, is .5 because we are assuming that the vaccine
is ineffective.

10 Bernoulli
Trials. Now each person in the study has been modeled as a Bernoulli
trial. Since we have 10 volunteers we have 10 Bernoulli trials with
p = .5.
Binomial Distribution.
If we want to know the probability of a certain number of successes
(colds) in 10 volunteers we can use the Binomial Distribution.

Questions. Now
we can answer questions like what are the chances of 0 colds in
our sample of 10 subjects? That is, what is the probability that
none of our subjects will get a cold? Or what are the chances that
just 0 or 1 subject will get a cold? Or what are the chances of
0, 1, or 2 colds in our 10 subjects?
To answer the questions,
open up the Binomial Tool. Set N = 10 because there are 10 subjects.
Enter .5 as a probability of success.
If you want to know the
probability of 0 colds in our 10 volunteers, you click the Between
Icon, and enter 0 as both the lower and upper values. Read the probability,
which is .0010, or about one in a thousand. There is about one
chance in a thousand that you would get no colds out of 10 people
in this two month period, assuming that the probability of a cold
is one half.
That's the same probability
as flipping no heads in ten flips of a fair coin. It could happen
but the probability is low, just one in a thousand.
The next question is
what is the probability of 0 or 1 colds in our 10 volunteers? Keep
all other information the same and enter 1 as the upper value. The
probability is .0107, which is pretty close to one in a hundred.
There's about a one in a hundred chance that we would get either 0
or 1 colds in our sample of 10 volunteers.
What's the probability
of 0 or 1 or 2 colds in our sample? Just enter 2 as the upper value
and read the probability, which is .0547. So the chances of getting
0 or 1 or 2 colds in this group of 10 subjects is .0547.
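The three readings from the Binomial Tool can be cross-checked by adding up binomial probabilities from 0 colds upward. This is a sketch assuming the standard binomial formula; the function name `prob_at_most` is an assumption for illustration.

```python
from math import comb

def prob_at_most(k, p, N):
    """Probability of 0, 1, ..., k successes in N independent trials."""
    return sum(comb(N, r) * p**r * (1 - p)**(N - r) for r in range(k + 1))

N, p = 10, 0.5   # 10 volunteers; P(Cold) = .5 if the vaccine has no effect
for k in (0, 1, 2):
    print(f"P(between 0 and {k} colds) = {prob_at_most(k, p, N):.4f}")
```

The three printed values are 0.0010, 0.0107, and 0.0547, which correspond to 1, 11, and 56 of the 1024 equally likely outcomes for the 10 volunteers.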
Wrapping up the Vaccine
example. That's all we have to say about the Vaccine example
for now. We will return to it later in the Hypothesis Testing lecture.
Make sure you understand it well, because we will build on it at
that time.


Normal approximation.
An interesting result in probability theory is that as N becomes
large, and for mathematicians becoming large means approaching infinity,
the shape of the binomial approaches the shape of the normal probability
distribution. In
the limiting case of an infinite number of Bernoulli trials, when
N is infinite, there's no difference between the shape of the binomial
distribution and the shape of the normal distribution.
In the current graphic
I've set p = .5, and N equal to 100. You can see that with N merely
equal to a hundred (which is far below infinity) the binomial is
giving us an approximation of what the normal distribution looks
like.
Practice. Set
p = .5 and N = 250. Look at the resulting Binomial Distribution.
Is it even smoother and more like the Normal Distribution than when
N was 100?
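The approach to the normal shape can also be checked numerically. The sketch below rests on assumptions not stated in this text: it compares each binomial probability with the height of a normal curve that has mean Np and standard deviation sqrt(Npq), which is the usual choice of matching curve. The function names are made up for illustration.

```python
from math import comb, exp, pi, sqrt

def binomial_prob(r, p, N):
    """Probability of exactly r successes in N independent trials."""
    return comb(N, r) * p**r * (1 - p)**(N - r)

def normal_density(x, mean, sd):
    """Height of the normal curve at x."""
    return exp(-((x - mean) ** 2) / (2 * sd**2)) / (sd * sqrt(2 * pi))

def largest_gap(p, N):
    """Largest difference between a binomial bar and the matching normal curve."""
    mean, sd = N * p, sqrt(N * p * (1 - p))
    return max(abs(binomial_prob(r, p, N) - normal_density(r, mean, sd))
               for r in range(N + 1))

for N in (10, 100, 250):
    print(N, largest_gap(0.5, N))
```

The printed gap shrinks as N goes from 10 to 100 to 250, which is the numerical counterpart of the graphic looking smoother and more normal.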