Detect Difference & Double Sample Lecture
©Copyright, 2000 Tom Malloy
This is the text of the
in-class lecture explaining the Double Sample Tool and the Detect
Difference Game. You may print this text out and use it as a textbook.
Or you may read it online.
There are no web homeworks
or practice homeworks that accompany this topic. But the Detect
Difference game will record your performance and enter it into the
database as part of your grade.

Interactive Simulation
Quick Review. In the Normal Lecture we learned about the Normal Probability
Distribution. We learned that the Normal Distribution has two
parameters, mu and sigma. Mu sets where the center of a Normal
Distribution is. Sigma sets how compact or spread out a Normal
Distribution is.
We also learned about
sampling from a Normal Probability Distribution. When scientists
do research they develop measurement operations which generate
numbers (data). We learned that scientists model the
data they collect as a sample from a Normal Distribution.
In reality, collecting data is a lot of work, often taking years.
In the probability model data collection is a simple process of
sampling from a probability distribution.
Interactive Simulation: A New Way to Learn
You've
already learned to use the Normal Sample Tool to take samples
from a Normal Distribution. The Normal Sample Tool simulates
with just the push of a button the weeks, months or years of work
it might take to collect data. The Normal Sample Tool is interactive: it
changes its output depending on what we input for mu, sigma, and
sample size.
Simulation allows
us to take an explicit model (such as the Normal Probability Distribution)
and learn what should happen in the world if the model is true.
We can simulate what the sample data should be and how it should
behave if the Normal Distribution is the correct model.
Simulation has a tremendous
advantage--it allows us to take many samples quickly and to learn
what happens to the sample data when we change the population parameters
(mu and sigma). We quickly learn how the population and sample data
go together: how sample data are related to what is happening in
the population, and how what is happening in the population is related
to the sample data.
Interactivity is crucial.
Our ability to interact with computer programs allows us to set
variables and parameters (like mu and sigma) in our model so that
we can simulate a wide variety of practical and theoretical situations.
For example we can simulate a small research project in which the
IQ's of 7 participants are measured. In theory IQ is distributed
normally with mu = 100. Different IQ tests have different sigma's,
so say that the particular test we are using has a sigma of 10.
All we have to do is set the parameters in Normal Sample Tool (mu
= 100, sigma = 10, n = 7) and press the Sample button. We will get
a random sample of 7 IQ scores. This simulates doing the actual
research. Moreover, we can easily press the sample button many times
and look for important meta-patterns in the data, things like how
the sample mean corresponds to the population mu. That's the direction
we are going in this lecture. We are going to gain extensive experience
with interactive simulations to learn important relationships between
sample data and population parameters assumed to be true by our
statistical model.
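For readers who want to try the same idea outside the Tool, here is a minimal sketch in Python of the IQ example above. It is not the Normal Sample Tool itself; the function name normal_sample and the fixed seed are my own assumptions, chosen only so the run is repeatable. Each pass through the loop plays the role of one press of the Sample button.

```python
import random
import statistics

def normal_sample(mu, sigma, n, rng=random):
    """Draw n values from a Normal distribution with the given mu and sigma."""
    return [rng.gauss(mu, sigma) for _ in range(n)]

# Hypothetical seed, used only so that repeated runs give the same numbers.
rng = random.Random(2000)

# Five simulated "research projects", each measuring the IQ of 7 participants.
for press in range(1, 6):
    scores = normal_sample(mu=100, sigma=10, n=7, rng=rng)
    m = statistics.mean(scores)
    rounded = [round(x) for x in scores]
    print(f"press {press}: M = {m:6.2f}  scores = {rounded}")
```

Watch how M moves around from press to press while staying in the neighborhood of mu = 100.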
New avenue of learning.
Learning about these kinds of relationships between populations
and samples is one of the most difficult and subtle parts of learning
statistics. That's because in the past this kind of learning only
occurred through mathematical proofs or years of experience collecting
data. Generally, introductory statistics students did not have the
mathematical sophistication to gain insight by mathematical proof;
nor did they have the research background to gain insight that way
either. But interactive computer simulation opens up a new avenue
of learning. The series of Tools and Games you are going to use
in this class are designed to open up for you this new way to learn.
Let us begin by laying
out the theoretical framework which underlies this new way of learning.
A Process Learning Framework.
Every process proceeds in accordance with its structure. By structure
I don't mean material things like foundations and girders. I mean
structure in the sense of logical structure, mathematical structure,
or grammatical structure. Through repeatedly engaging a process
humans learn the structure of that process. Let me restate that
more fully. While repeatedly engaging external processes we structure
our own internal processes in ways which enable us to relate to
the structure of external processes. What we are calling Process
Learning here is frequently called Procedural Learning in
cognitive psychology. Procedural learning refers to the learning
of motor and mental skills. Classic research in cognitive psychology
indicates that procedural or process learning is often, but not
necessarily, highly skilled, automatic and unconscious. Every process
proceeds in accordance with its structure; and, through process
learning, we proceed in accordance with our learned internal structures.

Learning
the structure of language. Well known examples of process learning
include language and music. By engaging the language which they
hear, small children build internal structures which allow them
to relate in useful ways to the structure of that language. The
structures of Chinese, Spanish, and English are very different.
By engaging the language process over many years, children learn
the deep structure of their native language. The knowledge of linguistic
deep structure is typically so automatic that it is brought to consciousness
only with effort. People can speak; but they seldom consciously
know the grammatical rules by which they speak unless or until they
study grammar formally. Similarly, a civilization's music is defined
by its musical scales and other musical forms. As prospective musicians
repeatedly play instruments to the point of fluidity they incorporate
structures within themselves which enable them to use their instruments
to play the rhythmic and melodic structures of their culture's music.
Depending on how they learn music, this implicit musical structural
knowledge may or may not be consciously accessible.
Learning the Structure of Computer Programs.
More prosaic
but still familiar examples of process learning come from the experiences
of computer users. Every user has learned a favorite application
(a word processing program, a draw program, a data analysis program)
and an operating system (Mac OS, Windows 3.1, Windows 98, UNIX,
LINUX). Moreover, we have all learned applications and operating
systems over a long period of time to the point of unconscious fluidity.
We have generated internal structures which allow our thought processes
to relate to the structure of these programs. We think a thought
and it appears on the screen with little or no awareness of the
menu or key or icon structures of the program. It is as if the program
has become intuitive and natural.
And, alas,
almost all users have also had the experience of changing programs
and operating systems. Frequently this is challenging, inconvenient,
and aggravating. Why? Through process learning, we proceed in accordance
with our learned internal structures and our learned structures
do not relate well to the external structure of the new program.
Let's look in more detail at how the process learning which results
from using computers can be a powerful learning context.
Learning
to structure processes. In building applications and operating
systems, designers and programmers have a plan, a schema, an overall
logic of how a program works and of how its various functions relate
to each other. Every process proceeds in accordance with its structure;
and frequently the structure of a computer program is expressed
as a flow chart. Long term users of a program must in some deep
sense learn to structure their internal processes to fit the program's
structure. Users' structures need not be, almost certainly are not,
isomorphic with the programmers' ideas, nor do they need to conform
in some strict way to the flow chart. It would be a rare user who
could accurately draw a flow chart of a familiar program. Nevertheless,
it is proposed here that long term users must structure their internal
processes in a way that lets them usefully relate to the program's
structure. Notice that this sentence does not imply that their internal
processes will be the same as or a representation of external structure.
It says that, through learning, their internal structures come to
relate to and fit with external structures imbedded in the program.
This learning typically takes considerable experiential commitment.
Process
learning is powerful. There are enormous advantages resulting from
having learned to structure our own processes to relate automatically
to the structure of some external process (Shiffrin and Schneider,
1977). But there are also disadvantages. One disadvantage is felt
when external processes change and we keep proceeding by the structure
of our learned processes. We proceed in ways which no longer relate
well to external process. For example, sometimes circumstances entice
or require us to change familiar computer applications or operating
systems. Changing from WordPerfect to Word or vice versa can be
demanding, even disorienting. All the internal structures that were
built up over months or years no longer relate so well to the structure
of the new program. Sometimes they do not relate at all. The new
program can seem unnatural, counterintuitive, even stupid and certainly
confusing. Switching from the Mac OS to Windows or vice versa can
feel all wrong. It's not so much the specific menus and keystrokes
that have to be relearned, rather it is a deeper level of how we
organize our thinking in a way that fits with subtle underlying
logic in the structure of the operating system which must be learned.
I want to
address how the structural knowledge resulting from the use of computers
can be used to great educational benefit. But first, I want to make
a couple more conceptual distinctions and agree upon some terminology.
Old
Media. In the electronic context, I use "old media"
to include written text, pictures, illustrations, recorded music,
movies, videos, television, and animations. Text, pictures, and
illustrations are best for static concepts; videos and animations
are best for dynamic concepts where movement is crucial. What characterizes
the old media is that the audience can change little, if anything,
about the information coming at it; the audience is receptive, even
inclined to be passive. Whether it be from watching endless hours
of television or from sitting, still and quiet, in a classroom listening
to lectures for eight to sixteen years, one of the consequences
of old media process learning is the development of learning strategies
which are passive and which produce the perception of knowledge
as external to the learner and passed down from experts. When a
person uses old media to quench the desire for knowledge, the internal
structures which result are primarily receptive. Another consequence
of learning through old media is the atrophy of (or at least the lack
of opportunity to develop) active discovery processes as a basis
of learning. Any activity on the part of the person (a library or
Internet search) comes before reading the book or watching the video.
The medium is the message at the deepest levels: Old media audiences
structure their learning processes to be receptive, and at times
even passive. Like people changing operating systems, they can find
it challenging, inconvenient, and aggravating to engage processes
which require them actively to seek and to discover knowledge. Frequently
students accept reading an assigned chapter in their text but dislike
choosing, even don't know how to choose, their own topic and do
an active library search. This is one deep form of learning accomplished
by processing old media.
Multimedia.
In the electronic context, educators frequently combine different
types of old media as a way of gaining educational advantages. For
example, Mayer and Anderson (1991) found that combining media produced
better subsequent problem solving than the use of one medium alone.
New
Media: Interactive Simulations. The emerging new media are
based on the interactive nature of computers. My candidates for
the defining examples of new media are found among scientific computer
simulation models. More and more frequently, simulation is being
used as a powerful scientific tool enabling discoveries that were
impossible with other methods. But I include in new media also computer
games, draw programs, word processors, the Internet as a whole,
any sort of program whose interactivity allows the user to be active
and creative and to change the output of the program. The direction
I want to take this discussion is that if computer simulations can
be used as scientific discovery processes, then surely they can
be used in a parallel manner by students to learn to develop their
own discovery processes. For example, a little later in the course
you will learn to use StatCenter's Virtual Lab, which allows you
to use a scientific lab to do research to discover the principles
of a simulated reality.
Audience
versus User. The term "audience," so appropriate with
old media, no longer seems right with new media; we prefer terms
like "user," "player," or "gamer,"
and so on, depending on the specific medium. The difference between
"audience" and "user" points at a crucial difference
between old and new media. An audience receives experience. Users
create experience by interacting with the medium. More accurately,
an old media audience creates the highest quality experience for
itself by being attentive and receptive to input. Users of new media
create the highest quality experience for themselves by attending
to and actively altering input to fit their own goals and desires.
This distinction parallels the distinction in the development of
knowledge in children between conceptual cognition and participatory
cognition (Fogel 1993, in press).
Computer
Games. Computer games constitute a special case of computer
simulation. Alex Garland (1998) in The Beach, p. 139, describes
(to those who play little or not at all) what video gamers all know,
"In video games, play occurs in levels of increasing difficulty.
The term 'boss' refers to the ultimate challenge a player meets,
blocking the way to the next level of play. Until you get past the
boss you cannot play the next level. Most bosses have a pattern;
crack the pattern, kill the boss. A typical pattern is illustrated
by Dr. Robotnik during his first incarnation in Sonic One, Megadrive
version, Green hills zone. As he descends from the top of the screen,
you jump at him from the left platform. Then, as he starts swinging
toward you, you duck under and jump at him from the right. As he
swings back, you repeat the process in reverse until, eight hits
later, he explodes and runs away. That's an easy boss." From
Garland's description, it is clear that learning the game's strategic
structure is as important as eye-hand coordination. In fact, eye-hand
coordination is not enough; a very fast and accurate player can't
win without learning strategic game structures. One of the themes
Garland explores in The Beach is the darker consequences
of some of the subtle strategies learned by video gamers.
Many people
have noticed pretty much the same thing that Alex Garland described.
By playing or, more likely in many cases, by watching our children
play computer games we noticed that players must learn strategic
ways of thinking to succeed at the game. Much recent media analysis
has focused on media content; but there are important societal consequences
of what people learn from media processes as well. A game's structure
is as important as, perhaps more important than, that game's content.
Players are learning HOW to think as much as learning WHAT to think.
For
the educational purpose of this lecture let's use the following
definition: Process learning is the structuring of our internal
processes in ways that allow us to relate to external processes.
Both implicit memory and procedural memory can reflect those changes
of structure that result from process learning. Based on phenomena
such as implicit and procedural memory, one intriguing question,
a question which is really prior to memory for process, is how to
change the structure of process. What I am proposing is
that students and teachers can take advantage of process learning.
We can identify structures of thought that we believe are valuable.
We can then use or build interactive computer programs whose structures
set the context for people to learn useful thought structures.

Relationships between Populations and Samples
Double Sample Tool
The Double Sample Tool allows you to take samples from two Normal Populations
simultaneously. You can begin to learn how the two samples are affected
when you change the characteristics of the two populations.
The Double Sample Tool
also allows you to simulate data collection in a research
project with two groups.
From the Menu or Ducks
or Desk find the
Detect Difference Game and click on its link.
A menu like the one in
the graphic will pop up. Select Double Sample.
You will see a Quick
Reference Instruction page like the one shown below. The Double
Sample Tool is pretty straightforward, so if you play with computers
a lot you may simply want to look over the Quick Reference Instructions
and then play with the tool. Return to this lecture/tutorial a little
way down in the section called Playing with Data Simulations.
Instructions.
The next bit of text will be explicit comments on how to use Double
Sample Tool. If you haven't already done so, open up the Detect
Difference Game and click on the Double Sample button. Refer
to the actual tool as you go through this lecture/tutorial text.
That is, look at the Quick Reference Instruction page and then click the
blue Start button so you go past the instructions to the actual Tool.
Lavender,
White and Yellow Areas. As you can see on the graphic (and even
better on the actual Tool) in the lavender area are two distributions:
a red and a green distribution. Each distribution has a mu and sigma.
So we have a red mu and a red sigma as well as a green mu and a
green sigma.
Below the distribution
and on the left is a white area with 4 little boxes. These
boxes allow you to see (and to change by typing in new values) the
4 parameters (red mu, green mu, red sigma, and green sigma).
Below the distributions
and on the right is a pale yellow area with a large Get Data
button. When you press Get Data you will get two samples of data
(one from the red and one from the green distribution.)
Changing mu's.
The red and green triangles are pointers which you can click and
drag to move the red and green Normal Distributions. Drag
the two distributions back and forth (on the actual Tool, not on
the lecture graphic). Notice
that, as you move the populations, the values of mu change (in the
two little boxes below the populations). Dragging a population with
its pointer changes where its center is, so dragging changes the
value of mu.
You can also change the
mu of either population by typing in an exact value in one of the
boxes. (But you have to click Get Data before the populations change.)
Changing sigma's.
The two sigma's (red and green) can only be changed by typing in
values in the little boxes below the distributions. Change the sigma
on one of the distributions to see what happens (you have to click
Get Data to see the change). Notice how the shape of the Normal Distributions
changes with changing values of sigma.
Lock: When you
press the lock button it turns yellow and locks the two population
sigmas to the same value. This is because many famous statistics
which we will study later assume that the two population sigma's
are equal. This assumption is called Homogeneity of Variance
(and we will study it later). For now just know that pressing the
lock button ensures that the data you get conforms to an important
assumption in statistics. (Don't worry about understanding that
assumption at the present). Of course, you can press the Lock button
a second time to unlock the sigma's.
Get Data: When
you press the Get Data button two columns of data appear, one from
the red distribution, one from the green distribution. Observe how
the data changes as you change the parameters of the populations.
Some
statistics. Below the data columns are a bunch of statistics.
You've learned about the mean (M) and the standard deviation. In
lecture we used the symbol S for standard deviation. On Double Sample
Tool we use the symbol SD for standard deviation. (This is not uncommon.
I suppose there is a little irony in the fact that there is a lot
of variability in the symbols for standard deviation in statistics.)
You haven't learned anything in this course yet about the two other
statistics given by Double Sample Tool (SEM and t). Even without
knowing a thing about them you can still notice their behavior as
you change the population parameters and get data. I'm going to
ask you to notice the behavior of t, even though you don't
know about it yet. This may seem a little backwards, but actually
knowing something about the behavior of a statistic can make understanding
it much easier later. (As we go along, we'll make explicit what
we mean by the "behavior" of a statistic, but basically
it just means what value did it take on in a certain circumstance
and how does that value change when you change the circumstances.)
For now, watch the behavior
of the statistics (i.e., what values they take on). Notice any patterns
in their relationship to the populations. The mean (M) is
particularly useful. You should also notice the behavior of t
(the bottom statistic). How does it change as you change the distance
between the two populations? (FYI: The rule governing t is
that it depends on how far apart the two distributions are relative
to how spread out they are. So t depends on the sigma's as
well as the mu's. It's an interesting and useful statistic that
we will study later.)
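If you like to see a rule written out, here is a rough sketch in Python of how statistics like M, SD, SEM and t could be computed from two columns of data. The lecture does not give the Tool's exact formulas, so this is an assumption on my part: I use the sample standard deviation (n - 1 denominator), SEM = SD divided by the square root of n, and the standard pooled-variance t for two independent samples. The data values are made up for illustration.

```python
import math
import statistics

def describe(sample):
    """Return (M, SD, SEM) for one sample; SD uses the n - 1 denominator."""
    m = statistics.mean(sample)
    sd = statistics.stdev(sample)
    sem = sd / math.sqrt(len(sample))
    return m, sd, sem

def two_sample_t(a, b):
    """Pooled-variance t: the difference between the two means divided by
    the standard error of that difference (assumes equal population sigma's)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff

# Made-up numbers standing in for a green column and a red column.
green = [98, 105, 110, 92, 101]
red = [171, 185, 190, 176, 183]

print("green M, SD, SEM:", describe(green))
print("red   M, SD, SEM:", describe(red))
print("t (red minus green):", two_sample_t(red, green))
```

The numerator of t is the distance between the two sample means, and the denominator grows with the spread of the data. That is the sense in which t depends on how far apart the distributions are relative to how spread out they are.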
Playing
with Data Simulations
Let's wander along the
new avenue of interactive learning. We will set various parameters
in our model and do repeated simulations looking for patterns.
Playing
with mu
(Set the two sigma's
to about 21, which is where they default to.) Let's start by seeing
what happens when we change the center (mu) of the two distributions.
Distributions
far apart. Drag the green triangle to the left so that green
mu (look for its value in the little box) is at
100. Also drag the red triangle to the right so that red
mu is at 180. (You can also enter exact values in the
little boxes.) As you click Get Data many times, watch the two columns
of data and answer the following questions for yourself.
Data.
How does the green data differ
from the red data? How much
do the data values in the green column overlap with the values
in the red column? How do the data in each column relate to the
value of mu for their respective populations (i.e., are the data
points in the same neighborhood as their respective population mean,
mu)?
Statistics. Look
at the statistics as you click on the Get Data button repeatedly.
How does the value of green M
differ from the value of red
M? Are two M's close together or far apart? Is
the green M always smaller than the red M? As you sample repeatedly
(i.e., click the Get Data button), the particulars change, but the
overall relationships stay the same. The generalizations you can
make from observing the answers to these questions are straightforward.
They are also important. Just put them down in your own
words on a sheet of note paper. (Currently, May 25, 2000, there
is no "Notes" outline for this lecture, so you have to
make your own notes on your own paper.)
What about t?
As you replicate your data (that is, repeatedly take samples), just
notice the range of values that t takes on.
Distributions
close together. Drag the red and green triangles so that the
green mu is about 120 and the red mu is about 150. (Keep the same
values of sigma you used before.)
Data
and Statistics. Now click Get Data (that is, take two samples
from the two distributions). Notice the two columns of data.
Do they overlap more or less than when the mu's were very far apart?
(The answer is they do overlap more. In fact, other than their color,
they might be rather difficult to tell apart just by eyeballing
the numbers.)
Look at the red M and
the green M. How is each related to the value of the corresponding
mu? Replicate (Get Data) your samples several times as you answer
the last question. How do the red M and the green M relate to each
other, that is, relative to the previous example (when the mu's
were 100 and 180), are the means close together or far apart? Again,
these relationships are as straightforward as they are important.
How often is the green
M lower than the red M?
What about t?
As you replicate your samples, what kind of range are the t values
in? How does this range of t values relate to the t values you got
when the mu's were far apart (at 100 and 180)?
Distributions very
close. Move the two distributions very close to each other;
let green mu = 250 and red mu = 255. Sample several times, noticing
how much the data overlap. Notice also what happens to the values
of green M and red M. How often is green M lower than red M? What
is the range of t values that come up? (One
general observation you should have discovered by now is that the
closer the two distributions are together, the smaller the value of
t tends to be, ignoring its sign.)
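The pattern in t across the three settings above can also be checked by brute force. The sketch below is not the Double Sample Tool; it assumes a sample size of 10 per group (the lecture does not say what the Tool uses) and the standard pooled-variance t formula. For each pair of mu's it draws many pairs of samples and reports the median size of t, ignoring its sign.

```python
import math
import random
import statistics

def two_sample_t(a, b):
    """Pooled-variance t for two independent samples (an assumed formula)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    se_diff = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / se_diff

def typical_abs_t(mu_green, mu_red, sigma=21, n=10, reps=2000, seed=0):
    """Median of |t| over many simulated presses of Get Data.
    n, reps and seed are hypothetical choices, not values from the Tool."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(reps):
        green = [rng.gauss(mu_green, sigma) for _ in range(n)]
        red = [rng.gauss(mu_red, sigma) for _ in range(n)]
        sizes.append(abs(two_sample_t(red, green)))
    return statistics.median(sizes)

for mu_green, mu_red in [(100, 180), (120, 150), (250, 255)]:
    size = typical_abs_t(mu_green, mu_red)
    print(f"green mu = {mu_green}, red mu = {mu_red}: median |t| = {size:.2f}")
```

Try changing sigma or n in the call to typical_abs_t and see whether the ordering of the three settings holds up.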
The
Extreme Case. Set the green and red distributions to have exactly
the same parameter values. For example you might set both mu's to
145 and both sigma's to 21.
Data
and Statistics. Now draw repeated samples. How much data overlap
is there in the two data sets now? What's the behavior of the means
(e.g., do they frequently reverse which is higher and which is lower)?
What's the behavior of t?
The extreme case reduces
to one population. In the extreme case, when we set both the
red and the green populations to the same parameters, we are basically
saying that there is only one population. We are just taking
two samples from the same probability distribution. The Normal
Curve has only two parameters, mu and sigma. When two normal curves
both have the same mu and sigma, then they are the same distribution.
This is a simple but very important point.
Notice that I've directed
your attention to M and t as much as I have to the data itself.
That's because subtle patterns are hard to pick up from the (raw)
data alone. Some patterns are easier to discover in statistics like
M and t than in the data. One reason to go through all the effort
and bother of learning statistics is that they help us discover
patterns. That may not be fully evident yet, but it should become
so as we go through the course.
Some
Generalizations
After your experience
with the Double Sample Tool up to this point, the following generalizations
should make sense to you. With population parameters held constant,
the sample data change (vary) from sample to sample. Therefore,
the sample statistics change (vary) from sample to sample. Although
the sample M's vary from sample to sample, they stay in the neighborhood
of the population mu's. Sometimes the data in the two samples don't
overlap at all; other times they overlap a lot. Data overlap is
related to population overlap. That is, when the populations are
far apart, the samples tend not to overlap. As the populations get closer
together the data in the two samples are more likely to overlap.
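The link between population overlap and data overlap can be counted directly. Here is a small sketch, again assuming 10 scores per sample and sigma = 21. I define "overlap" to mean that the lowest-to-highest ranges of the two samples share at least some values; that definition is my own simplification of what you judge by eye on the Tool.

```python
import random

def ranges_overlap(a, b):
    """True if the lowest-to-highest spans of the two samples share any values."""
    return max(a) >= min(b) and max(b) >= min(a)

def overlap_rate(mu_green, mu_red, sigma=21, n=10, reps=2000, seed=0):
    """Proportion of simulated sample pairs whose ranges overlap.
    n, reps and seed are hypothetical choices, not values from the Tool."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        green = [rng.gauss(mu_green, sigma) for _ in range(n)]
        red = [rng.gauss(mu_red, sigma) for _ in range(n)]
        hits += ranges_overlap(green, red)
    return hits / reps

for mu_green, mu_red in [(100, 180), (120, 150), (250, 255), (145, 145)]:
    rate = overlap_rate(mu_green, mu_red)
    print(f"green mu = {mu_green}, red mu = {mu_red}: overlap in {rate:.0%} of pairs")
```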
Playing with sigma.
I suggest you play with changing sigma a little bit on your own,
looking for obvious patterns. I don't want to overload you right
now with abstractions about relationships between populations and
sample data. So we're going to go on to a concrete example and then
play the Detect Difference game so that you can consolidate your
experience. But simulating what happens when you change the sigma
parameter in the Normal Model is important and we will return to
it toward the end of this lecture.
A
Puzzle
The
heart can be compared to a pump. When it contracts, it pumps blood
through the blood vessels and pressure increases. This is systolic
blood pressure. When the heart relaxes between beats, the pressure
decreases. This is diastolic pressure. It is generally thought that
a "normal" (i.e., healthy) blood pressure reading for an adult is
120/80. BP is measured in mm of mercury.
Here is a specific example
of a very common research puzzle. Suppose a group of scientists
is working on developing a new chemical formula for a pill to lower
blood pressure. Suppose they have two groups of volunteer participants.
(They plan to test this new chemical first on people with normal
blood pressures.) They give one group (Control Group) a placebo
pill and the other group (Experimental Group) a pill based on the
newly developed chemical that might lower blood pressure. No one
knows if it lowers blood pressure or not; that's why the scientists
are doing research--to find out.
At the end of the study
they have two samples of numbers, one sample from each group. They
have to figure out from these two sets of numbers if this chemical
is worth pursuing, or whether they would be better off pursuing
another line of research.
Constructing
a model of the puzzle
Recall
that in the Interface to Science lecture, we talked about the mysteries
of the universe (such as human beings and their hearts). We
talked about how, for various practical and theoretical reasons,
not the least of which is to engage the mysteries more deeply, scientists
reduce these mysteries to numbers (like blood pressures) by means
of measurement operations. We also talked about how scientists
then often create statistical models of these numbers. We
are now about to do that last step in some detail.
In the Normal lecture
we learned to model measurement operations as a random variable called
the Normal Probability Distribution. As a first step, since we have
research participants with healthy blood pressures, suppose that
we model systolic blood pressure as a normal distribution
with mu = 120 and sigma = 20. That's the heart (so
to speak) of our model. But we need to add to the model a bit if
we are going to simulate the research puzzle. There are two groups
of participants, each generating its own sample of data. (Double
Sample Tool comes to mind, but first let's introduce some scientific
terminology.)
Treatments
and Treatment effects. The experimental blood pressure pill
is called a treatment. Treatment is more or less a synonym for independent
variable (IV). We expect the treatment (or IV) to have an effect
on the DV (data). In this case we expect the experimental pill to
lower blood pressure.
Treatment effect size.
How big an effect will the pill have on blood pressure? Will the
pill lower blood pressure by 10 points, 5 points, 2 points? That's
what we mean by treatment effect size or simply effect size.
If the pill is ineffective then its effect size is zero.
Construct a model.
As we said, we will model systolic BP as normal with mu = 120 and
sigma = 20. This is a good model for where the data come from in
the Control Group since the placebo is not expected to have an effect.
But
how do we create a model of where the Experimental Group data come
from? Suppose that in reality the experimental pill has an effect
size of -10, that is, it should lower BP by 10 points. (We need
a minus sign on the effect size because we are expecting the pill
to lower (subtract from) normal BP.) In that case we can use a normal
distribution with mu = 110 and sigma = 20 as a simulator for the Experimental
Group data (because the experimental pill would lower BP by ten
points).
If the effect size were
-5, then we would model the Experimental Group with N(115, 20).
If the effect size were -2, then we would model the Experimental
Data with N(118, 20). So we have to assume an effect size to create
our model fully.
Homogeneity of Variance.
Remember, our statistics generally assume that the two populations
have the same sigma. This is generally an assumption that scientists
use to model data. Understanding why at this point is not important,
just notice that it is a standard assumption.
Summary. For the
moment, let's say the effect size is -10. We model the Control Group
with N(120, 20)
and the Experimental Group with N(110,
20). That is, we will use Double Sample Tool's green (control) and
red (experimental) distributions to simulate the experiment.
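If you would like to see this model outside the tool, here is a minimal Python sketch of it. It is not the Double Sample Tool's actual code; the function names, the seed, and the sample size of 10 per group are assumptions made only for illustration.

```python
import random
import statistics

CONTROL_MU = 120   # from the lecture: model of healthy systolic BP
SIGMA = 20         # same sigma for both groups (homogeneity of variance)


def treatment_models(effect_size):
    """Return the (control, experimental) models as (mu, sigma) pairs."""
    control = (CONTROL_MU, SIGMA)
    experimental = (CONTROL_MU + effect_size, SIGMA)
    return control, experimental


def draw_sample(model, n, rng):
    """Draw n simulated BP readings from a normal model."""
    mu, sigma = model
    return [rng.gauss(mu, sigma) for _ in range(n)]


rng = random.Random(1)                       # arbitrary seed (assumption)
control, experimental = treatment_models(-10)
green = draw_sample(control, 10, rng)        # n = 10 is an assumption
red = draw_sample(experimental, 10, rng)

print("models:", control, experimental)
print("green (control) M:", round(statistics.mean(green), 1))
print("red (experimental) M:", round(statistics.mean(red), 1))
```

Changing the argument of `treatment_models` to -5 or -2 gives the N(115, 20) and N(118, 20) models described above; an argument of 0 makes the two models identical.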
Simulating
the Experiment
The scientists
only get to do the research once, and they must make all their decisions
based on the two samples of data they get. But we can simulate the
experiment as many times as we want. Maybe we will notice some patterns
in our simulations that will help them make a decision.
Effect
Size = 0. Setting effect size to 0 corresponds to saying the
experimental pill is ineffective. It has no effect. The way to simulate
an ineffective treatment is to make both populations identical.
So set the green population to N(120, 20) and the red population
to N(120, 20). Now take repeated samples. What do you notice? When
I do that I notice that the data in the two samples overlap a great
deal. I also notice that, for a given simulation, the mean of the
green (control) BP might be higher or lower than the mean of the
red (experimental) BP. For example, of the last 10 simulations I
did, 6 times the experimental group had a higher mean BP than did
the control group. I also found t values to be quite low, below 2.
That gives
us an idea of how the data and statistics behave when there is no
treatment effect. For example, I found that in 10 simulations the
experimental group BP was higher than the control BP 6 times. This
pattern of results is certainly consistent with the idea that there
is no effect of the pill. (If the pill were effective, the mean
experimental group BP should be lower than the mean control group
BP.)
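The same experiment can be repeated many more than 10 times in a few lines of Python. This is only a sketch under stated assumptions: the sample size of 10 per group and the pooled two-sample t formula are my choices, and the tool may compute its t differently.

```python
import math
import random
import statistics


def pooled_t(a, b):
    """Two-sample t with pooled variance (assumed form of the tool's t)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))


def simulate_no_effect(runs, n, rng):
    """Sample both groups from N(120, 20) and summarize what happens."""
    red_higher = 0
    large_t = 0
    for _ in range(runs):
        green = [rng.gauss(120, 20) for _ in range(n)]   # control
        red = [rng.gauss(120, 20) for _ in range(n)]     # experimental
        if statistics.mean(red) > statistics.mean(green):
            red_higher += 1
        if abs(pooled_t(green, red)) >= 2:
            large_t += 1
    return red_higher / runs, large_t / runs


share_higher, share_big_t = simulate_no_effect(2000, 10, random.Random(0))
print("share of runs with red M above green M:", share_higher)
print("share of runs with |t| of 2 or more:", share_big_t)
```

With no treatment effect, the red mean should come out above the green mean in roughly half of the runs, and t values of 2 or more should be uncommon.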
0 Effect
Size means there is only One Population.
When there is no effect, we have the extreme case we talked of before.
No treatment effect means we are just taking two samples from the
same distribution. (Remember, by the assumption of homogeneity of
variance, the sigma's must be equal.)
Now let's
put a treatment effect into our simulation.
Effect
Size = -10. Set the green population's parameters to N(120, 20)
and the red population's values to N(110, 20). Look at the two distributions,
notice whether they are far apart or close, and whether they overlap
a lot or not. Now click Get Data as many times as you like. Notice
the two samples of data on each simulation. If the experimental
pill works, then BP's in the Experimental (red) sample should be
lower than those in the Control (green) sample. Do the data in the
two samples overlap? That is, in any given sample, are some of the
BP values in the red (experimental) sample higher than those in
the green (Control) sample? How much overlap is there? When I do
the simulations I find that there is some overlap in the two samples.
Some of the scores in the red group are higher than some of the
scores in the green group.
What
about the means? Over repeated simulations, is the control (green)
M always or nearly always higher than the red M? If the pill is
effective, then the mean control BP should be higher than the mean
experimental BP. When I do the simulations, I find that yes the
green M is almost always higher and the red M is almost always lower.
That is, the average BP in the experimental group is generally lower
than the average BP in the control group. On my latest run of ten
simulations, I found that the experimental (red) mean BP was lower
than the control (green) mean BP 9 out of 10 times. This is what
I should find if the pill is effective. Still that one reversal
is a bit troubling, since the scientists only get one set of data.
What if they got the one where mean experimental BP was HIGHER than
the mean control BP (even though the pill lowers BP by 10 units)?
They might conclude that the pill raised BP when in fact it lowers
BP.
Effect
Size = -5. Set the two distributions to N(120, 20) and N(115,
20). Now simulate the data several times. Make your own observations.
What I find is that now there is a lot of overlap in the two data
sets for each simulation. Another important pattern that I find
is that the red M still generally stays below the green M but sometimes
the red M is higher than the green M. For example, I ran 10 simulations
and the mean experimental BP was lower than the mean control BP
7 times. Three times it was higher. Remember that the scientists
only get to see one set of data. Even though the pill lowers BP
by 5 units, they have some good chance of getting data that make
it look like the pill raises BP. Think about the implications of
this pattern for science.
Effect
Size = -2. Set the two distributions to N(120, 20) and N(118,
20). Simulate the research. When I do multiple simulations, I find
lots of overlap in the data sets. I also find lots of reversals
in the means. It is not likely that this research project would
find the pill to be effective.
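The chance of a reversal can also be worked out directly from the model rather than counted over simulations. The sketch below assumes 10 participants per group (the tool's actual sample size is not stated here) and uses the fact that the difference between two independent sample means is itself normal, with standard deviation sigma * sqrt(2 / n).

```python
import math
from statistics import NormalDist


def reversal_probability(effect_size, sigma=20, n=10):
    """P(red M > green M) when green ~ N(120, sigma), red ~ N(120 + effect, sigma).

    n = 10 per group is an assumption made for illustration.
    """
    se_diff = sigma * math.sqrt(2 / n)     # SD of (red M - green M)
    diff = NormalDist(mu=effect_size, sigma=se_diff)
    return 1 - diff.cdf(0)                 # chance the difference is positive


for effect in (-10, -5, -2, 0):
    print(effect, round(reversal_probability(effect), 2))
```

Under these assumptions the reversal chance is about 0.13 for an effect size of -10, about 0.29 for -5, about 0.41 for -2, and exactly one half for 0. That is in line with the 1 in 10 and 3 in 10 reversals reported above.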
Effect
size not equal to 0 means there are Two Populations.
Notice that whatever the effect size is, if it's not 0, then there
are two populations, each with a different mu.
Treatment
Populations. Here is some useful jargon. When there is a treatment
effect, then the two populations that result are called Treatment
Populations.
Wrap
up of Double Sample Tool
Your
Experience. We've covered a lot of ground up to this point.
What counts is the experience you had simulating the scientists'
research. I've introduced some terminology and jargon. I've guided
you toward some general observations by asking questions. But these
observations are not really to be hoarded. It's worth writing them
down so you can remember the big picture. But we're not going to
ask you to memorize a list of the "correct" answers to
every question I've asked in this lecture. It's a little more Socratic
than that; I'm trying to get you to discover stuff. What counts
is your experience. If you followed along carefully, using the tool
to answer questions and make observations, you have been learning
a very powerful and subtle perspective about how science works when
it uses probability theory and statistical models to describe the
research process. (It's worth taking notes, of course.)
Summary.
We've used the Double Sample Tool to simulate a simple, two-group
research design. But the patterns you've discovered generalize to
a very broad range of experimental research.
Let's go back to our
scientists. They have run their two-group research project, yielding
two sets of numbers. One set is from the Control Group who were
given an inactive placebo. The other set is from the Experimental
Group who were given the experimental pill with the active ingredient.
They look at the numbers. If the new chemical is ineffective
(as most are) then what they are looking at is two samples drawn
from the same population. If the new chemical is effective
at lowering blood pressure then the two samples are drawn from two
different populations. That is, if the chemical works, then the
group which received the experimental pill is a sample from a population
of people with low blood pressure--and the placebo group is a sample
from a population with normal blood pressure.
They look at the two
sets of numbers. They have to decide whether these two samples lead
to the conclusion that there are two populations (the chemical is
effective) or that there is only one population (the chemical is
ineffective).
Now we will
turn to a game which puts you in the place of the scientists. What
if you couldn't see the populations, only the data, and you had
to decide on the basis of the data whether a treatment is effective
or not?
All the
knowledge you gained with Double Sample Tool about how populations,
data, and statistics relate to each other will start to be useful
as you play at being the researcher.
Getting
to the Detect Difference Game:
Press the "Back to Menu" button in the white area on the
lower left side of the Double Sample Tool. It is just below the
"Lock" and the "Help" buttons. It will take
you back to the menu shown in the graphic below.
[The "Exit"
button below the tool will close the screen. Use it if you don't
want to go on to the Detect Difference Game.]


Click
on "Detect Difference."
Simulating
a Scientific Puzzle
Detect
Difference Game
The Detect
Difference Game works very much like the Double Sample Tool. One
crucial difference is that when you play, your score will be recorded
and will count toward your grade.
Conceptually,
the biggest difference between the two is that you now no longer
see the Normal Distribution(s). A grey screen has come up from the
bottom like the door of an industrial elevator. You can see black
and yellow stripes along the leading edge of the firmly shut door.
It hides the populations. This, of course, simulates the fact that
scientists don't get to see the populations, just the data. Imagine
a two-group study, just like in the Double Sample Tool. As a scientist,
your job is to decide if the Treatment had an effect or not. When
we studied the simulation model with the Double Sample Tool, we
found that if the Treatment has no effect the data you see are just
two samples from one distribution. But if the treatment is effective,
then the data you see are two samples, one from each of two distributions.
If the Treatment
is effective, you are sampling from two distributions. If the treatment
is ineffective, you are sampling from one distribution. So the fundamental
scientific decision can be boiled down to this: Am I sampling from
one or two distributions?
Instructions.
Open the Detect Difference Game if you have not done so. 1) Begin
by choosing your level of play. This is done by looking for a white
box (circled in red on the graphic) that says "Please Select."
Click on the arrow next to the box and choose Easy, Medium, or Hard
from the drop down menu. 2) Next you must click "Get Data."
You will get two samples of data. You must decide if these data
came from two different distributions (the treatment is effective)
or from one distribution (the treatment is ineffective). 3) Make
your decision. Press either the "One Distribution" or
"Two Distributions" button (circled in blue). That's it,
for game play.
Score.
In the green box on the graphic, you can see where your score will
be shown as a percentage. To submit your score to the database for
a grade, press the submit button (just to left of the green box
on the graphic). You cannot submit a score until you have made at
least 10 decisions. The "Submit" button will send your
percentage for the last 10 decisions you made. You can make as many
decisions as you like, but when you submit, you will be given
as your score the percentage correct out of the last 10 decisions
you made. In other words, your score is percent correct of the last
10.
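To make the scoring rule concrete, here is a hypothetical sketch of a "percent correct of the last 10" score. The class and method names are made up for illustration, and this is not the game's actual code.

```python
from collections import deque


class ScoreKeeper:
    """Hypothetical scorer: percent correct over the most recent decisions."""

    def __init__(self, window=10):
        self.window = window
        self.recent = deque(maxlen=window)   # older decisions drop off

    def record(self, correct):
        """Store one decision as right (True) or wrong (False)."""
        self.recent.append(bool(correct))

    def can_submit(self):
        """Submitting requires at least `window` decisions."""
        return len(self.recent) == self.window

    def score(self):
        """Percent correct among the stored decisions (0.0 if none yet)."""
        if not self.recent:
            return 0.0
        return 100 * sum(self.recent) / len(self.recent)


keeper = ScoreKeeper()
for outcome in [False, False] + [True] * 10:   # 2 misses, then 10 hits
    keeper.record(outcome)

print(keeper.can_submit(), keeper.score())
```

In this example the two early misses have dropped out of the window, so the score is based only on the 10 most recent decisions.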
You can
log off and log back on and return to Detect Difference to improve
your score as often as you like.
Statistical
Resources. On the graphic, above the green box and below the
blue box, you will see 4 buttons (t=, SD, M, and SEM). Click on
whichever button you want to see the corresponding statistic calculated
on your current two samples. You can use what you have learned about
the behavior of various statistics to help you make decisions. You
can open a second copy of Netscape and use it to open Double Sample
Tool simultaneously, so that you can play with it as you make
your decisions. The learning goal is to discover how statistical
models work while you play the game.
Feedback.
After you make your decision (by pressing the one or two distribution
button), you'll get immediate feedback. The grey door with yellow
and black stripes will disappear and you'll see where the data came
from (either one or two distributions). You'll also get verbal confirmation
that you are right or wrong. And your score will be updated (but
not submitted; only you can submit your score).
Philosophical
Aside. This game creates the pretense that the population(s)
is (are) covered up. In fact, we must remember that the populations
are only a probability model. The model is something that humans
have made up. The normal distribution model has been built carefully
by mathematicians and scientists over decades. It's not really "out
there." There is only the mysterious universe. There are only
our attempts to reduce the universe to numbers through our measurement
operations. There is only the data we collect. In contrast, the
model we made up for this data is only a fantasy; a well-constructed
fantasy, but it is just something we have thought up to make sense
of the universe. So, it's not that the populations are covered up;
it's that there really are no populations to see when we are doing
"real world" science.
What we
have in science is the data. And, as we look at the data, we can
use our fantasies (models) to give meaning to the data and to make
decisions. That will be your job in this game. Look at the two data
samples. Use your own experience with the normal distribution model
to make decisions about what the data mean.
That's the
game.
Levels
of difficulty. The database will keep a separate score for you
for the Easy, Medium, and Hard levels of the game. So you have to
play all three levels.
In terms of the Normal
Distribution Model:
Easy = large effect size
Medium = moderate effect size
Hard = small effect size
The hard
level of difficulty is rather, well, hard. Even statistical aids
may not help you to get a good score. Think about your experience
in the Double Sample Tool Blood Pressure example, where the effect
size was very small. It is next to impossible to detect
a small effect size in your sample.
But how
do I get a decent grade at the hard level? Good question. First,
I strongly suggest that you play the Hard Level just as scientists
have to play it. Get one data set and do your best to make a decision.
Do this 10 or so times to see what kind of score you get.
Then use
the "Get more data" button (below the Get Data button).
Watch what happens to the statistics, particularly M and t, as you
get many samples. Scientists rarely have this option, but you do.
Based on your knowledge of how the model works in Double Sample
Tool, figure out the puzzle through multiple replications. You should
be able to get a good grade; but you'll have to think about the
patterns you are learning (which is the point).
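Here is a rough sketch of why replications help. It assumes that each replication is a fresh pair of samples from the same hidden populations, and it borrows the -2 effect size and N(120, 20) model from the blood pressure example. The game's actual Hard-level settings and sample size are not known to me, so treat every number as an assumption.

```python
import random
import statistics


def mean_difference(effect, n, rng, mu=120, sigma=20):
    """Green M minus red M for one simulated experiment."""
    green = [rng.gauss(mu, sigma) for _ in range(n)]            # control
    red = [rng.gauss(mu + effect, sigma) for _ in range(n)]     # experimental
    return statistics.mean(green) - statistics.mean(red)


rng = random.Random(7)                                  # arbitrary seed
diffs = [mean_difference(-2, 10, rng) for _ in range(1000)]

# How often does a single experiment point in the right direction?
right_sign = sum(d > 0 for d in diffs) / len(diffs)

# What does the average over all replications look like?
average_diff = statistics.mean(diffs)

print("share of single runs with green M above red M:", right_sign)
print("average difference over 1000 replications:", round(average_diff, 2))
```

A single run points the right way only a little more often than a coin flip, but the average over many replications settles near the true difference of 2 points. That is the pattern to look for as you watch M and t across repeated samples.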
Playing
with sigma in Double Sample Tool
The
following is an outline to follow if you want to learn about Population
sigmas and how they affect sample data.
Go
back to Double Sample Tool
You can play with the
tool to make discoveries on your own. I've outlined below some useful
places to find patterns. I've also pointed you at interesting patterns.
Compact (narrow) distributions
Sigma controls how compact
or spread out a distribution is. Set the green mu to 250 and the
red mu to 275. Now set green sigma
to 4 and red sigma to 4. Notice
that the two populations get very narrow and do not overlap much.
Collect a series of data samples while thinking about the following
questions. How much do the data values in the green sample overlap
with the data values in the red sample? (That is, if you were shown
a set of data without any color, could you make a good guess as to
which population it was sampled from?)
Sample Standard Deviation.
Look at another statistic, SD (or S). Notice the values of the two
SD's (red and green) listed below the sample data. We've studied
this statistic in some depth in the Variability lecture (although
in the lecture we called it S). SD or S is the sample equivalent
of the population sigma. SD or S applies to sample data; sigma applies
to populations. What is the range of values of the red and green
SD's as you take multiple samples?
Sample Means and Population
Sigma's. The current simulation has very narrow distributions
(both sigma's = 4). Notice how far away the sample M's are from
the population mu's. Are the M's very close to or far off from the
population mu's? How do red M and green M relate to each other (for
example, how often is the green M lower than the red M)?
What kinds of values
of t are you getting?
Spread out (wide)
distributions
Keeping the two mu's
the same, change both the red and green sigma to 30. (This is the
highest value of sigma allowable by this program.) Sample several
times, again thinking about the kinds of questions we have been
asking. Notice that now the data values overlap a great deal more
in the two samples than when the population sigma's were set to
4. Can you make sense of this in terms of how much the populations
overlap?
Sample Means and Population
Sigma's. Watch the values of the red M and the green M. Notice
that now both of them vary a great deal more than when the population
sigma's were both 4. Notice also that now (with population sigma's
equal to 30) the sample M's tend to wander farther from the population
mu's than they did when the population sigma's were only 4.
Notice that the sample
SD's are in the neighborhood of the population sigma's. Notice
that sample M's wander farther from their respective population
mu's when population sigma's are large than when they are small.
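You can check this pattern numerically with a short simulation. The sketch below draws many samples from a narrow population (sigma = 4) and a wide one (sigma = 30) and measures how much the sample M's wander. The sample size of 10, the number of runs, and the seed are assumptions chosen for illustration.

```python
import random
import statistics


def spread_of_means(mu, sigma, n, runs, rng):
    """Standard deviation of the sample M's over many simulated samples."""
    means = []
    for _ in range(runs):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)


rng = random.Random(3)                                  # arbitrary seed
narrow = spread_of_means(250, 4, 10, 1000, rng)         # sigma = 4
wide = spread_of_means(250, 30, 10, 1000, rng)          # sigma = 30

print("spread of M's when sigma = 4:", round(narrow, 2))
print("spread of M's when sigma = 30:", round(wide, 2))
```

The spread of the M's should come out several times larger for sigma = 30 than for sigma = 4, while in both cases the M's stay much closer to mu than individual data points do.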
Population
Sigma's and Sample Means
Some interesting generalizations.
Sample M's vary. Sample M's stay in the neighborhood of population
mu's. And sample M's vary more when sigma is high than when sigma
is low.
Sample M's are less variable
than sample data points
SEM. Sample means vary.
SEM (the standard error of the mean) is like SD, only it measures
how much M's vary rather than how much data points vary.
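As a small illustration, SEM is usually computed from a single sample as SD divided by the square root of the sample size. The sketch below uses that standard formula; whether the tool's SEM button computes it exactly this way is an assumption, and the blood pressure values are made up.

```python
import math
import statistics


def sem(sample):
    """Standard error of the mean: sample SD divided by sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


# Made-up systolic BP readings, for illustration only.
sample = [118, 131, 104, 125, 97, 140, 122, 109, 115, 129]

print("M:", statistics.mean(sample))
print("SD:", round(statistics.stdev(sample), 2))
print("SEM:", round(sem(sample), 2))
```

Because SD is divided by the square root of n, SEM is always smaller than SD for any sample with more than one data point, which matches the observation that sample M's are less variable than sample data points.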