ANOVA - Correlated for 1 Independent Variable
Copyright 2000, Tom Malloy. All rights reserved.
Let's
start with the Experimental Situation which is appropriate for this
ANOVA.
Experimental Situation
The
research contexts where you would want to use a one-way analysis
of variance for correlated measures are the same as those contexts
where you might use a t-test for correlated means. We have already
discussed and worked with the t for correlated means so this should
be a straightforward generalization.
One of the advantages of ANOVA
for correlated means over a t-test for correlated means is that
a researcher can have more than two experimental conditions. The
t-test for correlated means can only analyze two conditions.
One-to-one. As
we discussed in the t for correlated means lecture, there are
several research procedures which would produce correlations or
one-to-one correspondences between the scores in different conditions.
Repeated Measures.
One procedure that gives us a one-to-one correspondence between
numbers in different columns of the data matrix is to have repeated
measures on each research participant so that there's just one group
of participants and you measure them at several points in time.
Within reason you can measure them however many times that you want,
so that every participant is measured multiple times. This sort
of design is called a repeated measures
design.
Matched Groups.
Another way to get a one-to-one correspondence or correlation between
columns of scores is to have matched groups.
For instance, in the experiment we talked about in the t correlated
lecture, researchers were interested in teaching reading to first
graders in Salt Lake City. Suppose there are three new reading programs
developed in three different regions of the U.S. These three
reading programs are based on differing regional beliefs about how
best to teach reading. The researchers run a four-group study. Group
1 is a control group based on the current first grade curriculum
that's already established in Salt Lake City schools. Group 2 is
a special West Coast enriched reading program. Group 3 is an East
Coast enriched reading program. The fourth and final group is
a Midwest training program. So the researchers have four different
treatments that they want to evaluate.
Next the researchers
do an experiment. The four programs are going to be taught to different
groups of children so that the research team can determine which
would be the best program to implement next year in their local
school district. One way of evaluating the results would be to use
a one way ANOVA for independent groups in which case they would
just randomly assign kids to the different groups. There is a problem
with this strategy, however. Reading scores in the first grade tend
to be extremely variable: some kids come in being quite accomplished
readers and other kids haven't even started on the project. There's
a vast difference between children on reading achievement going
into the first grade. A way to deal with the problem of a lot of
initial variability among the participants is to match the participants
in every group on their initial reading ability.
The researchers could
give some kind of evaluation of reading comprehension prior to the
beginning of the study. Based on these reading comprehension scores,
they would list everyone who is going to be in the study from the
best reader down to the least accomplished reader. Next, since the
design calls for four groups, the researchers would take each of
the top four scores and randomly assign them to one of the four
groups. So the top four readers would be randomly divided among
the four treatment groups.
Then the researchers
would take the next four scores and randomly assign the students
with those scores to the four groups. And so on... This kind of
procedure helps to ensure that these four groups are carefully matched
with each other. Each group has one of the top four readers, each
group has one of the next four readers, and so on down to the least
accomplished readers. In short, these groups are carefully matched
with one another.
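The matching procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name matched_assignment, the participant labels, and the pretest scores are all made up for the example and are not data from the reading study.

```python
import random

def matched_assignment(pretest_scores, group_names, seed=None):
    """Assign participants to groups in matched blocks.

    pretest_scores: dict mapping participant id -> pretest reading score.
    group_names: list of treatment labels.
    Returns a dict mapping group name -> list of participant ids.
    """
    rng = random.Random(seed)
    k = len(group_names)
    # List everyone from the best reader down to the least accomplished reader.
    ranked = sorted(pretest_scores, key=pretest_scores.get, reverse=True)
    if len(ranked) % k != 0:
        raise ValueError("participants must be a multiple of the number of groups")
    groups = {name: [] for name in group_names}
    # Take the ranked list one block of k participants at a time.
    for start in range(0, len(ranked), k):
        block = ranked[start:start + k]
        rng.shuffle(block)  # random assignment within the block
        for name, participant in zip(group_names, block):
            groups[name].append(participant)
    return groups

# Hypothetical pretest scores for 8 children (illustrative only).
scores = {"A": 95, "B": 90, "C": 88, "D": 85, "E": 40, "F": 35, "G": 30, "H": 20}
programs = ["Old", "West Coast", "East Coast", "Midwest"]
assignment = matched_assignment(scores, programs, seed=1)
for program, kids in assignment.items():
    print(program, kids)
```

Whatever the random shuffle does, each program ends up with exactly one of the top four readers and exactly one of the bottom four, which is the whole point of matching.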
Matching is a common
research procedure. It creates a correlation between the scores
in the various groups.
We have now generalized
the example in the t-correlated lecture to four groups from two
groups.
Your IV is Type
of Program and its levels are Old, West Coast, East Coast, and
Midwest. Your DV is reading performance on a standardized
reading test. The four programs are going to be taught to different
groups of children so that the research team can determine which
would be the best program to implement in their local school district.
CONTRAST: Independent
Groups? Let's contrast what we just did with an independent
groups design. One way of conducting the research would have been
to assign volunteer children to the 4 different groups randomly
without matching them. Then the researchers would evaluate the results
by using a one way ANOVA for independent groups. That would be a
fine way to do the study. But there is a problem with that strategy.
Reading scores in the first grade tend to be extremely variable:
some kids come in being quite accomplished readers and other kids
haven't even started on the project. There's a vast difference between
children on reading achievement going into the first grade. A common
research strategy for addressing this problem of high variability in
the DV is to match the participants in the groups.
Therapy Study
Here
is another example of a study with correlated scores. The independent
variable is the Time Course of a new kind of psychotherapy
called "Therapia Nueva" or simply TN. The volunteers receive
this new kind of psychotherapy (TN) and are measured for mental
health at various different times.
IV
and DV. The IV is the Time Course
of the effects of Therapia Nueva. The
researchers are interested in how the effects of psychotherapy hold
up over time. Does the effect of psychotherapy dissipate quickly
or does it last a long time? The participants are given a particular
kind of psychotherapy. Their Mental Health
(DV) is measured at various time intervals. We are going to look
at how mental health changes with the passage of time.
In this
example, the researchers are not interested in comparing the therapy
with any other kind of treatment. Rather, they want to know how
the effects of the therapy on mental health change over time. To
keep the example simple, we'll assume that there are only n = 5
participants. (Normally you would want a lot more participants).
Every
participant is measured four times. Prior to therapy each is given
a mental health pre-test. You can see the pre-test scores of the
5 participants, varying from 30 down to 5, on the first column in
the graphic.
The
big red bar (right after psychotherapy) on the graphic indicates
when in time the treatment (psychotherapy) was given.
Immediately
after psychotherapy the researchers post-test every participant.
Notice that their mental health scores go up from pre-test to post-test.
Now their scores vary between 90 and 50.
The
researchers also want to know how permanent the effect of psychotherapy
is. Is this improvement in mental health just something that happens
for a week or so, or does it last longer than that? So the research
team will do a follow-up measurement on all of the participants
at six months. Notice from the data that the participants appear to
have lost a little bit of the therapeutic effect, but their mental
health is still better than it was at pretest.
The
research team does the final follow-up measurement at two years.
At two years, (just eyeballing the data) it appears that perhaps
participants lost a little bit more of therapeutic effect, but they
still have fairly substantial gains over the pretest.
Correlated Scores
This
psychotherapy example is one in which the researchers made repeated
measurements on each participant. Every participant is measured
four different times: at pretest, at post-test, at six months, and
then finally at two years. So each participant has four scores on
the dependent variable.
Because
people are consistent across time, we would expect a correlation
between pre-test and post-test scores. Notice that the people who
have the highest scores at pre-test tend to have the highest scores
at post-test, and vice versa. To generalize, we would expect to
find a correlation between scores in any two of the four columns
(pre-test, post-test, 6 months, and 2 years). The way we have conducted
our research has created correlations among the columns of our DV
measures. Thus, we have to run a data analysis that takes this correlation
into account. We need an ANOVA for correlated DV measures.
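To see what a correlation between two columns looks like in numbers, here is a small sketch that computes Pearson's r between pre-test and post-test scores. The ten scores are hypothetical. They were chosen only to fit the ranges mentioned in the lecture (pre-test from 30 down to 5, post-test from 90 down to 50) and are not the actual data on the graphic.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and the two sums of squares.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)

# Hypothetical scores for the same 5 participants, in the same order.
pre_test = [30, 25, 20, 10, 5]
post_test = [90, 80, 70, 60, 50]

r = pearson_r(pre_test, post_test)
print(f"r between pre-test and post-test = {r:.2f}")
```

With scores like these, where the participants keep roughly the same rank order from one measurement to the next, r comes out strongly positive. That is the kind of dependence between columns that an ANOVA for independent groups would ignore.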
A Rose by any other Name
Sometimes
repeated measures designs are called within-subjects designs
because the levels of the independent variable occur within the
participants (subjects). The time course of psychotherapy is something
that occurs within the life-experience of every participant. We
measure the effects of the time course of psychotherapy within each
participant.
The
terms within-subjects and repeated
measures are synonyms; which one you see changes from stat book to stat book.
Different people tend to use different names for this sort of research
design.
In fact,
there are a lot of different names for this kind of analysis of
variance: repeated measures ANOVA, within-subjects ANOVA. Some people
call it treatment by subjects or
t by s ANOVA, and other people call it dependent
groups ANOVA. You'll see all of those terms in the research
literature. I mention all these synonyms so that if you need to
use this research design or talk about it in some kind of conversation,
you will know that you can generalize from the terminology in this
lecture.
NOTE:
In StatCenter's Virtual Lab, there is a switch on the dependent
variable tool. On the bottom left corner of the tool's window it
says either independent means or correlated
means. So read the story problems carefully; some of them require
you to use a design with correlated measurements and other story
problems require you to use a design with independent groups.
You may have to flip
the correlated versus independent means switch depending on how the
Virtual Lab story problem asks you to design your study. The switch
defaults to independent. So if you want correlated groups
like the ones we are discussing here, then you must flip that switch
and indicate that your groups of scores are correlated.
Statistical Hypotheses
In any study there will
be more than one, usually several, groups. The index for groups
is j. In our reading program example there are 4 groups so j can
vary from 1 to 4.
The
statistical hypotheses are the same as the ones we used for ANOVA
independent. If we are skeptical that the IV will have an effect
on the DV, we write the null hypothesis. H0 assumes that there is
no treatment effect for any level of the independent variable. Alpha-j
= 0 for every value of j. 
In contrast,
if we're the scientist and writing the alternative hypothesis, we
expect that there are some treatment effects somewhere in this study.
Alpha-j is not equal to 0, for at least some values of j.
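Written in symbols for the four-group reading example, the two hypotheses look like this. The notation is just a typeset version of the "Alpha-j" statements above, with j running from 1 to 4.

```latex
\begin{align*}
H_0 &: \alpha_j = 0 \quad \text{for every } j = 1, 2, 3, 4 \\
H_1 &: \alpha_j \neq 0 \quad \text{for at least one } j
\end{align*}
```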
So,
returning to the teaching reading to first graders matched group
example which we started this lecture with, the skeptic says, "None
of these reading programs work. Whatever program is currently in
your curriculum in Salt Lake City is going to be just the same as
the West Coast program, and that'll be the same as the East Coast
program, and that'll be the same as the Midwest program. There won't
be any treatment effects." On the other hand, the people
who make up these programs will say "Oh no, mine is really
good. Mine will have an effect." They all would
believe that somewhere in the four groups, one of these programs,
and maybe all of them, will have some effect.
Statistical Conclusion Validity

Let's discuss statistical
conclusion validity. In other words, let's look at the plausibility
of chance as a way of explaining the data. If the data pattern
fit with the scientific hypothesis, is it plausible to argue that
the data pattern occurred by chance alone?
No
Calculations. You
are not going to calculate an ANOVA for correlated means in this
course. So we won't present formulas for you to learn nor to use.
You can use StatTool to calculate the Mean Squares and the
F's.
The focus in this lecture
is twofold. The first focus we've already completed: Learning what
kind of scientific context is appropriate for this ANOVA. The second
focus we will now take up: How to understand the results of an ANOVA
for correlated groups for purposes of statistical conclusion validity.
You need to be able to look at the output of a computer program
like StatTool on StatCenter, SPSS, or any other statistical
package, and know what the data analysis means.
Let's look at the degrees
of freedom and the summary table, because what you do need to do,
rather than calculate the F for correlated means, is to be able
to read the output of a computer program like StatTool on
StatCenter or SPSS. The outputs of such programs generally are organized
around the ANOVA summary table.
ANOVA Summary Table
I
am going to present two different ways of setting up a summary table
for this kind of ANOVA design.
Source
of Variance. The first column is called "Source of Variance"
and it divides up the Total Variance into subjects,
treatment, and treatments
by subjects. Often, the treatment by subjects variance
is also called error or residual. The synonyms for
these terms aren't my fault. I'm not making them up to torture you.
They just happen to be around, and so I want you to be able to
generalize from whatever we say here to a different conversation.
Degrees of Freedom
Let's
look at the degrees of freedom. This is the second column in the
summary table. In the time course of psychotherapy example there
were five participants and there were four measurements on each
participant. So in that example j is four and n is 5.
The
total degrees of freedom is j times
n, minus 1, written as (jn - 1),
where j is the number of measurements
on each participant and n is the number
of participants.
Four
times 5, minus 1, is 19 (total degrees of freedom).
STUDENT QUESTION:
"Can you increase the degrees of freedom by having more
measures on your participants?" The answer is yes. You
are really starting to think like a researcher.
The
degrees of freedom for subjects is n minus 1 or (n-1).
And
the degrees of freedom for treatments is j minus 1
or (j-1). The error term, or
treatment by subjects term, is j minus 1 times n minus 1 or [(j-1)(n-1)].
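As a quick check on the arithmetic, here is a short sketch that computes all four degrees of freedom from j and n. The function name correlated_anova_df is only an illustrative label. The total is j times n, minus 1, and the three component values should add up to it.

```python
def correlated_anova_df(j, n):
    """Degrees of freedom for a one-way ANOVA for correlated measures.

    j: number of measurements (treatment levels) on each participant.
    n: number of participants.
    """
    return {
        "subjects": n - 1,
        "treatments": j - 1,
        "error": (j - 1) * (n - 1),  # treatment by subjects
        "total": j * n - 1,
    }

# Time course of psychotherapy example: 4 measurements on 5 participants.
df = correlated_anova_df(j=4, n=5)
print(df)
# The component degrees of freedom account for the whole total.
assert df["subjects"] + df["treatments"] + df["error"] == df["total"]
```

For the psychotherapy example this gives 4 for subjects, 3 for treatments, 12 for error, and 19 in total.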
Notice
on the graphic, I've highlighted in brighter yellow the degrees of
freedom that you actually need to look up the critical value of
F. In looking up the critical value in the F table, you use treatment
degrees of freedom (j minus 1) across the top of the table. You
will use the error degrees of freedom to move down the table. "Error"
is also called "residual" or "T by S."
In terms
of our experience with a One-way ANOVA for independent groups, the
Treatment term here plays the same role as the between-groups
term there, and the Error term here plays the same role as the
within-groups term there.
Sum
of Squares
column. The next column across the top of the ANOVA summary table
is the sum of squares. In that column on a computer print out will
appear the values of the SS that need to be calculated for this
ANOVA.
Mean
Squares column. The next column of a computer print out will report
the mean squares.
F.
The next (and sometimes final) column in a print out will be the
calculated value of the F ratio.
p
value. Many computer programs calculate a significance level
and place it in the final column. StatCenter's StatTool does not
calculate a significance level; you have to look it up in the tables
using the degrees of freedom.
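You will not calculate this ANOVA by hand in this course, but it can help to see where the numbers in each column of a print out come from. The sketch below fills in a summary table using the standard partition of the total sum of squares into subjects, treatments, and error. The 20 scores are hypothetical (they only mimic the shape of the psychotherapy example), and the layout of the returned table is an assumption for illustration, not the output format of StatTool or SPSS.

```python
def correlated_anova(data):
    """One-way ANOVA for correlated measures.

    data: list of rows, one row per participant, one score per condition.
    Returns a dict that mirrors the rows of the summary table.
    """
    n = len(data)      # number of participants
    j = len(data[0])   # number of measurements per participant
    grand_mean = sum(sum(row) for row in data) / (n * j)
    subject_means = [sum(row) / j for row in data]
    treatment_means = [sum(row[c] for row in data) / n for c in range(j)]

    # Sums of squares: total, subjects, treatments, and what is left over.
    ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
    ss_subjects = j * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_treatments = n * sum((m - grand_mean) ** 2 for m in treatment_means)
    ss_error = ss_total - ss_subjects - ss_treatments

    df_subjects, df_treatments = n - 1, j - 1
    df_error = df_subjects * df_treatments
    ms_treatments = ss_treatments / df_treatments
    ms_error = ss_error / df_error
    return {
        "Subjects": {"df": df_subjects, "SS": ss_subjects},
        "Treatments": {"df": df_treatments, "SS": ss_treatments,
                       "MS": ms_treatments, "F": ms_treatments / ms_error},
        "Error (T by S)": {"df": df_error, "SS": ss_error, "MS": ms_error},
        "Total": {"df": n * j - 1, "SS": ss_total},
    }

# Hypothetical scores: columns are pre-test, post-test, 6 months, 2 years.
hypothetical_scores = [
    [30, 90, 80, 75],
    [25, 85, 80, 70],
    [20, 70, 60, 55],
    [10, 60, 55, 50],
    [5, 50, 50, 40],
]
table = correlated_anova(hypothetical_scores)
for source, row in table.items():
    print(source, {key: round(value, 2) for key, value in row.items()})
```

Notice that the F ratio uses the treatment mean square on top and the error (treatment by subjects) mean square on the bottom. The subjects row soaks up the stable differences between participants, so they do not end up in the error term.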
There
are different conventions for writing out the summary table for
ANOVA correlated. Depending on what computer program you're using,
the output may look a little different. Now I'm going to show you
another summary table that's a bit different than the previous one.
Another Version of the ANOVA Summary Table
Here
the total variance is broken into two major categories - between
subjects and within subjects.
The
between subjects category has just one subcategory, called
subjects.
The
within category has two subcategories - treatment and
treatment by subjects, or error. This version is really
exactly the same as the first one, except that this one analyzes
the variance into intermediate categories (Between and Within) and
then analyzes each of those intermediate categories so that we end
up with the same sources of variance and degrees of freedom as the
previous summary table had.
Degrees of Freedom
Notice
that the degrees of freedom in this version of the ANOVA summary
table are the same as in the previous summary table. Only the form
of the Summary Table is different.
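One way to convince yourself that the two versions agree is to write the degrees of freedom out as a partition. The labels between and within below stand for the intermediate categories of this second table. The numbers use the psychotherapy example with j = 4 and n = 5.

```latex
\begin{align*}
df_{\text{total}} &= jn - 1 = df_{\text{between}} + df_{\text{within}} \\
df_{\text{between}} &= n - 1 \\
df_{\text{within}} &= n(j - 1) = (j - 1) + (j - 1)(n - 1) \\[4pt]
\text{With } j = 4,\ n = 5:\quad 19 &= 4 + 15, \qquad 15 = 3 + 12
\end{align*}
```

The 3 and the 12 are the same treatment and error degrees of freedom that appeared in the first version of the table.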
When
you use the StatTool in StatCenter, you'll see the first of these
two versions of the Summary Table as your printout. Of course, StatTool
will also print out the actual sum of squares, the mean square,
and the F value. StatTool doesn't give you a p value (probability
value). You have to look that up yourself.
Rejection Regions
The
rejection region logic is the same as before when we went over ANOVA
for independent groups. There's nothing really new here.
Shown
on the graphic is a representation of the sampling distribution
of F. The expected value of F if H0 is true is somewhere
in the neighborhood of one.
As usual,
you need to look up the F critical based on two degrees of
freedom: the degrees of freedom for treatments and the degrees
of freedom for treatment by subjects. Next you examine the
calculated F to see whether the calculated F value
falls into the rejection region or not.
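The decision rule itself is simple enough to sketch in code. In the example below, 3.49 is the tabled critical value of F at alpha = .05 for 3 and 12 degrees of freedom (the treatment and error degrees of freedom from the psychotherapy example). The two calculated F values are hypothetical and are there only to show both outcomes.

```python
def decide(f_calculated, f_critical):
    """Compare a calculated F with the critical F and report the decision."""
    if f_calculated >= f_critical:
        return "reject H0"          # calculated F falls in the rejection region
    return "fail to reject H0"      # calculated F falls short of the critical value

# Critical value looked up in an F table: alpha = .05, df = (3, 12).
F_CRITICAL = 3.49

# Hypothetical calculated F values, for illustration only.
print(decide(5.20, F_CRITICAL))
print(decide(2.10, F_CRITICAL))
```

The first hypothetical F is beyond the critical value, so it lands in the rejection region. The second one does not, so chance remains a plausible explanation of the data.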
Sampling Distribution of F

By
now you should be familiar with the basic steps of this process
of statistical hypothesis testing. The first step is to assume the
population we are studying is a normal population. Next, you take
a random sample with some number of measurements on the dependent
variable. Next you use the statistical formulas to arrive at a calculated
F value. Then you determine where the calculated F
value falls on the sampling distribution of F in relation
to the critical value of F. Finally you decide whether or
not to reject H0.
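If you want to watch a sampling distribution of F take shape, a small simulation can walk through those steps many times over. The sketch below assumes a normal population with no treatment effect at all (H0 true), 5 participants, and 4 measurements each. The population mean and standard deviations, the number of repetitions, and the function names are all arbitrary choices made for illustration.

```python
import random

def correlated_f(data):
    """Calculated F for a one-way ANOVA for correlated measures."""
    n, j = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * j)
    subject_means = [sum(row) / j for row in data]
    treatment_means = [sum(row[c] for row in data) / n for c in range(j)]
    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_subjects = j * sum((m - grand) ** 2 for m in subject_means)
    ss_treatments = n * sum((m - grand) ** 2 for m in treatment_means)
    ss_error = ss_total - ss_subjects - ss_treatments
    ms_treatments = ss_treatments / (j - 1)
    ms_error = ss_error / ((j - 1) * (n - 1))
    return ms_treatments / ms_error

def simulate_null_f(n=5, j=4, reps=2000, seed=0):
    """Draw many random samples with H0 true and collect the calculated F's."""
    rng = random.Random(seed)
    f_values = []
    for _ in range(reps):
        data = []
        for _ in range(n):
            # Each participant has a stable level (this creates the correlation)...
            level = rng.gauss(50, 15)
            # ...plus random noise at each measurement, with no treatment effect.
            data.append([level + rng.gauss(0, 5) for _ in range(j)])
        f_values.append(correlated_f(data))
    return f_values

f_values = simulate_null_f()
mean_f = sum(f_values) / len(f_values)
share_beyond = sum(f >= 3.49 for f in f_values) / len(f_values)
print(f"average calculated F when H0 is true: {mean_f:.2f}")
print(f"proportion of F's at or beyond 3.49:  {share_beyond:.3f}")
```

When H0 is true, the calculated F's should pile up in the neighborhood of one, and only about 5 percent of them should land at or beyond 3.49, the tabled critical value for alpha = .05 with 3 and 12 degrees of freedom. That small tail is the rejection region.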