Web Lecture:
t for correlated means
©Copyright 1997, 2000 Tom
Malloy

Press the "Context for use" button

Use the "Next," "Back," and "Continue"
Buttons to navigate through the material presented on the screens

Definition of Context. What we want to do now is define just when
you would use the "t test for correlated means" ("dependent
means t") in contrast to the "t for independent means,"
which we've already learned.
As we have learned, the t for independent means is used when there
is no one to one correspondence between scores in one group and
scores in the other group. In contrast, the t for correlated means
is used when the two groups of scores have been constructed in a
way that naturally leads a particular score in one group to "go
with" a particular score in the other group.
Let's look at some common ways to establish this correspondence
between scores in two groups.


One way to get one-to-one correspondence between scores in groups
is to make "Repeated Measures" of the Dependent Variable
on the same subject. For example, say a scientist
wants to evaluate the effectiveness of a Diet. The scientific hypothesis
is that following the Diet, which is the IV, a person's weight,
which is the DV, will be less than before the diet. So we weigh
study participants before the diet and then after the diet.
What we have done is measure each subject's
weight twice (before and after the diet). This is what we mean by
making repeated measures on the same subjects. Notice that the weight
of a subject before the diet "goes with" the weight of
that same subject after the diet. There is a natural correspondence
between scores (weights) in the before and after columns.

Whenever there are repeated measurements of the DV
on the same subjects (e.g., before and after the IV) the two sets
of measurements are correlated and it is necessary to use the t
test for correlated means.


Another way to create one-to-one correspondences between scores
in two groups is to carefully match each subject in one group to a
subject in the second group.
For example, suppose a group of scientists
have developed an educational program for teaching reading. They
hypothesize that second graders taught by this New Program will
read better than second graders taught by the Old Program. The IV
is Type of Program (Old vs New). The DV is standardized reading
scores.


First measure all the students on reading ability. Then put them
into pairs. The two top readers are one pair. The next two are another
pair.

Because reading scores in second grade vary so much between
children, the scientists want to reduce the variance. One way to do
this is to take the top two readers in the whole sample and then
randomly assign one of them to one group and the other to the other
group. Then take the next two best readers and randomly assign them
to the two groups. And so on. When you have done this the two groups
will be very similar.
The screen shows the subjects moving from the ordered list, each
randomly assigned to one group or the other, along with their
reading scores in the study.

One student in each group will be carefully matched to a corresponding
student in the other group.
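
The matching procedure described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in any actual study: the student names and pretest scores are invented, and the function name assign_matched_pairs is hypothetical.

```python
import random

def assign_matched_pairs(scores, seed=None):
    """Rank students by pretest score, pair neighbors, flip a coin within each pair."""
    rng = random.Random(seed)
    # Best reader first, so neighbors in the list have similar reading ability.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    pairs = []
    # Step through the ranked list two at a time; with an odd number of
    # students the last one is left unpaired.
    for i in range(0, len(ranked) - 1, 2):
        first, second = ranked[i][0], ranked[i + 1][0]
        if rng.random() < 0.5:  # the coin flip
            first, second = second, first
        pairs.append((first, second))  # (New Program, Old Program)
    return pairs

# Hypothetical pretest reading scores, invented for illustration only.
pretest = {"Ana": 98, "Ben": 95, "Cal": 90, "Dee": 88, "Eli": 75, "Fay": 74}
pairs = assign_matched_pairs(pretest, seed=1)
for new_student, old_student in pairs:
    print(new_student, "<->", old_student)
```

Each printed line is one matched pair. Those are the one-to-one correspondences that call for the t for correlated means.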


Repeated measures and matched groups are only two of many research
procedures which will generate one-to-one correspondences between
groups. Any time there is a natural relationship between subjects
in groups (such as husband and wife, or in twin studies) you will
need to use the t for correlated means.

Since there is no formula or rule which tells you to use the dependent
t, you will have to remain alert for research procedures that create
correlations between groups.

Now on the Navigation Panel press (twice) the "up to
menu" arrow button and go back up to the "CORRELATED t MENU."
Press the "EXAMPLE: 1-TAILED" button
Use the "Next," "Back," and "Continue"
Buttons to navigate through the material presented on the screens

In the example, we have a salesperson from the Jazzy Ergonomic
Keyboard company who approaches a buyer at Consolidated Markets
Corporation (CMC). The Jer-Key salesperson says, "I've
got a super, new kind of keyboard, designed to take advantage of
all the latest ergonomic discoveries. I'd like you to try out
a few free sample keyboards with some of your staff. You'll
see that the new keyboard is going to increase productivity. People
will be able to type faster on this than they did on your old keyboards."


To evaluate the Jer-Key salesperson's claim, we're going
to do a little experiment. The independent variable will be the
kind of keyboard, either the old keyboard or the new keyboard. The
dependent variable will be people's typing speed, measured
in words per minute.
Research Design: Because typing speed varies so drastically among
people we decide to use each participant in the study as his or
her own control: The participants will type on one of the keyboards
for some amount of time while we measure their typing speed; then
they type on the other keyboard for the same amount of time while
we measure their typing speed. Which keyboard a person types on
first and which second will be determined by the flip of a coin
(randomly). Because we measure each person's typing speed twice
(once on each keyboard), this will be a repeated measures design;
that is why we're going to need a t for correlated means
(dependent t).
The actual dependent variable which we will use is going to be
the difference between a person's typing speed on the old keyboard
and the new keyboard. We are going to take people's words per
minute on the old keyboard and subtract their words per minute on
the new keyboard.


Let's make some scientific hypotheses. The Jer-Key salesperson
promotes the scientific hypothesis that the new keyboard will cause
higher typing speeds. That's a directional scientific hypothesis.
So the salesperson is expecting negative difference scores because
we are subtracting new typing speeds (which should be higher) from
old typing speeds (which should be lower).
We can imagine a different scientific hypothesis, one made by the
CMC accountant who's the one to decide whether or not to buy
these keyboards. Perhaps the accountant might say, "I don't
know, sometimes when you change technologies bad things happen,
other times good things happen. If we change technologies surely
there's going to be an effect, but I think it could either
improve productivity or it could make it worse." So the CMC
accountant has a different point of view on buying these keyboards;
this point of view is that the new keyboard might cause an increase
or a decrease in typing speed. That's a non-directional scientific
hypothesis. The differences between typing speeds (Old minus New)
might be negative or positive.
Finally, of course, the skeptic would say that the Independent
Variable (IV) has no effect on the Dependent Variable (DV), that
is, there's no difference in typing speed between the two keyboards.
Of course, if you do get a difference, then the skeptic is going
to propose the PCH (plausible competing hypothesis) of chance. The
PCH of chance is that any difference
found between the typing speeds on the two keyboards would solely
be due to chance. The differences between typing speeds (Old minus
New) will vary from person to person but will generally be around zero.


Let's translate the three scientific hypotheses into statistical
hypotheses. Starting with the skeptical hypothesis (no effect of
the IV, that is, these new keyboards don't affect typing speed
one way or the other), we generate the Null Hypothesis, H0.
H0 says that we "expect" the average difference
in typing speeds (Old minus New) to be zero.
The salesperson, of course, expects that the mean difference between
typing speeds is going to be less than zero (negative) because we
are subtracting a larger number (new typing speed) from a smaller
number (old typing speed). So the salesperson's H1
is that we expect the average difference in typing speeds to be
less than zero.
The accountant on the other hand has a non-directional scientific
hypothesis. So the accountant's H1 is stated in
a way that expects the mean difference score will not be equal to
zero. The difference may be positive or negative but it is
not going to be equal to zero.
Since we are developing a one-tailed example, we will choose the
salesperson's scientific and statistical hypotheses to evaluate
in this example.
Student Question: Why did you subtract the New typing scores
from the Old? Wouldn't it be less complicated to subtract the
other way so we don't get negative numbers?
She's right. There's no compelling reason why I subtracted
one way versus the other. Normally the convention (as she proposed)
is to choose to subtract in a direction that makes your scientific
hypothesis generate positive numbers. Following that convention,
the salesperson would subtract Old typing scores (lower numbers
under the hypothesis) from New typing scores so as to get positive
differences. Notice that if that is what we did, then the salesperson's
alternative statistical hypothesis (H1) would change
from what is on the screen. Then it would have to be H1:
E(MD) > 0. The "less than" sign would change
to a "greater than" sign.
You are free to choose the subtraction in whichever direction
you want. But once you choose, then you have to make sure that your
statement of H1 logically follows your choice. H1
can vary depending on how you choose to subtract.
I could have subtracted the other way: it's just the choice
I have. The reason that I chose to subtract New from Old (and so
generate negative differences if the salesperson's hypothesis
is correct) is precisely to bring up this issue and to do an example
with negative numbers so you would know what to do in that case.
The main thing to remember is that you have a free choice of which
direction you subtract one measure of the DV from another measure
of the DV. And... once you make that choice you then have to keep
track of the logic of stating alternative hypotheses and setting
up rejection regions (which we'll do in a minute) in ways that
are consistent with how you subtract.

This next screen in a way gets ahead a bit because we haven't
yet looked at the formula for t. We'll see the full t formula
in a bit. Suffice it to say for now that the t formula will involve
the average difference score (MD) divided by a complicated
expression under a square root. To make the point that we are making
right now, we don't need to complicate things with what is
in the square root.
The point is that H0 expects the mean difference score
to be 0 because H0 expects there to be no difference between the
typing scores. If MD DOES happen to come out to be exactly
zero, then t would be zero. This is because the top of the t formula
would be zero (if MD = 0) and anything (whatever is in
the square root) divided into zero is zero. The important conceptual
point is that H0 expects t to be equal to 0. We can express
this as
E(t given H0) = 0.
The fact that H0 expects t to be zero will be important
to understanding the logic of statistical conclusion validity later.


Next we are going to complete this example using the salesperson's
scientific hypothesis. We have a directional scientific hypothesis
leading to a one-tailed alternative. "Directional" and
"One-Tailed" are essentially synonymous, only one is in
the realm of science and the other is in statistics.
This screen summarizes the research design and shows the layout
of the data. In the first column is the subject #. Next to each
subject # we see the typing speed on the Old keyboard and then on
the New Keyboard. As you press "Continue" the New scores
are subtracted from the Old to give you a difference score (in blue).
So for the first subject, typing speed on the Old keyboard is 55
words per minute, and typing speed for the New keyboard is 61 words
per minute. When we subtract those we get a minus 6 as a difference
score. (Note: We would get a plus 6 if I subtracted the other way,
and that's just that free choice.) Subject #2 has a minus 9
(47 minus 56) as a difference score. As you keep pressing "Continue"
the difference score column will fill out.
Keep pressing "Continue" and the final column (d squared)
will fill itself out. We will need the difference scores squared
to do our calculations.
For the calculation of t we won't need the raw scores. The
t formula will only use difference scores. The raw scores are just
used in the beginning to create the difference scores. (That is
why they are dimmed in the final screens.)
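
As a small sketch of how the difference columns are built, here are the first two subjects from the screen (55 and 61, then 47 and 56) worked in Python. Only those two rows are quoted in this text, so the lists are deliberately short. The variable names are illustrative and not part of the lecture.

```python
# Typing speeds (words per minute) for the first two subjects on the screen.
old = [55, 47]
new = [61, 56]

# Old minus New, the direction of subtraction chosen in this example.
d = [o - n for o, n in zip(old, new)]
# The squared difference scores, needed later for the standard deviation.
d_squared = [x ** 2 for x in d]
# Subtracting the other way (New minus Old) only flips every sign.
d_flipped = [n - o for o, n in zip(old, new)]

print(d, d_squared, d_flipped)  # [-6, -9] [36, 81] [6, 9]
```

Note that the squared column is the same whichever way you subtract. Only the sign of the difference scores, and therefore the direction stated in H1, depends on that choice.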
Next we will start calculating t, starting with the Mean and Standard
Deviation of the difference scores.


We can see on the next screen that the sum of the difference scores
is minus 38. And that means the mean difference score is minus 38
over 10, and so equals -3.8. Now if you'll remember the hypotheses,
the skeptic (corresponding to H0) said that we expect
the mean difference score to be zero. The salesperson said that
we expect the mean difference score to be below zero. MD
is below zero so the data pattern fits the scientific hypothesis.
That's always an important first step. But the skeptic is going
to say, "Well -3.8, that's not very many words per minute.
I think that's just chance." Therefore, now we're
going to have to use a t test to evaluate the PCH of chance.

To calculate t we're going to have to calculate a standard
deviation, so let's get on with doing that. First, the sum
of the squared difference scores is 216. As you continue to press
the "Continue" button, the formula for the standard deviation
of difference scores will appear, followed by the calculations of
the standard deviation. This formula is really identical to the
formula you've always used for standard deviation, except that
we've substituted a symbol (d) for all the X's. I've
changed the formula a little bit so that we don't have X's
in it, but it's the same old standard deviation formula. In
words the formula is: take the sum of d squared over the number of
difference scores, subtract the average difference score squared,
and then take the square root of the result.
Go ahead and substitute into the formula before you press continue.
Doing that will give you feedback as to whether you understand the
formula. In the substitution on the screen, the standard deviation
of difference scores was the square root of the quantity 216 over 10
minus (-3.8) squared.
The next screens go through the arithmetic. The arithmetic boils
down to the square root of 7.16, which equals 2.6758.
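
The arithmetic for the mean and standard deviation of the difference scores can be checked with a short Python sketch. It starts from the three totals given in the text (10 difference scores, a sum of -38, and a sum of squares of 216). The variable names are illustrative.

```python
import math

n = 10               # number of difference scores
sum_d = -38          # sum of the difference scores
sum_d_squared = 216  # sum of the squared difference scores

mean_d = sum_d / n                              # -38 / 10 = -3.8
variance_d = sum_d_squared / n - mean_d ** 2    # 21.6 - 14.44 = 7.16
sd_d = math.sqrt(variance_d)                    # square root of 7.16

print(round(mean_d, 2), round(variance_d, 2), round(sd_d, 4))  # -3.8 7.16 2.6758
```

Squaring the mean removes its minus sign, which is why 14.44 is subtracted from 21.6 even though the mean itself is negative.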

The degrees of freedom (df) for this t test are the number of difference
scores minus one. Don't count raw scores because that'll
get you twice as many. Count the number of subjects (or the number
of difference scores). Counting the raw scores (in this example
20) is a mistake students sometimes make on exams.
In this case the number of difference scores minus one is 10 - 1
= 9, which is the degrees of freedom.

Now let's get the formula for the t. The reason that I'm
presenting the formula for the t test after we've seen and
worked with the data instead of before is because to understand
the formula you have to understand thoroughly what we mean by the
mean difference scores and the standard deviation of difference
scores.
The formula: t equals the mean difference score (MD)
divided by a denominator, and that denominator is the standard
deviation of difference scores divided by the square root of the
number of difference scores minus one.
Go ahead and substitute into the formula to make sure that you know
how to.
The correct substitution is t equals minus 3.8 divided by a denominator
of 2.6758 over the square root of 10 minus 1. That should be
what you wrote down when you substituted into the formula. The next
screens do the arithmetic. You can write down in your notes as much
or as little of the arithmetic as you like.
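
For readers who want to check the arithmetic, the substitution can be sketched in Python using the values already worked out in the text (a mean difference of -3.8, a standard deviation of 2.6758, and 10 difference scores). The name standard_error for the denominator is my label, not a term used on the screens.

```python
import math

mean_d = -3.8   # mean difference score (MD)
sd_d = 2.6758   # standard deviation of the difference scores
n = 10          # number of difference scores, not raw scores

df = n - 1                                 # degrees of freedom
standard_error = sd_d / math.sqrt(n - 1)   # 2.6758 / 3
t = mean_d / standard_error

print(df, round(standard_error, 4), round(t, 2))  # 9 0.8919 -4.26
```

The denominator is positive, so t takes its sign from the mean difference score.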

The final result is that t equals -4.26. That's the calculated
t. As you know, what we have to do next is to get the critical
t, set up rejection region(s) and determine which region the
calculated t falls into so we can make a decision about the null
hypothesis (H0).


Next we will find the critical value of t. First choose a significance
(alpha) level. This is a free choice on your part so long as alpha
is not greater than .05. I'll choose alpha equal to .05.
Second, you must determine if this is a one- or two-tailed test.
Since we chose the salesperson's scientific hypothesis (which
is directional) to work with, this is a one-tailed test.
Student question: "Would the accountant's scientific
hypothesis give us a two-tailed test?" Yes, the accountant's
hypothesis would lead to a two-tailed test. That's correct,
and we'll return to creating a test based on that later.
Third, and finally, we need to know the df if we are to use the
tables to get the critical t. In this case df = 9. Using our tables
(which aren't shown on the screen) our critical value is equal
to 1.833.
But there's a question on the screen which asks, "Should
this be a plus or a minus critical value?" Before pressing
"Continue," answer that for yourself.

If you got the answer right and understand why, then you can skip
past this material. If you got the answer wrong or if you don't
fully understand why you chose the right answer then follow along
with this discussion. Let's examine the logic behind choosing
plus or minus. Recall that H0 expects t = 0. But what
does H1 expect t to be? To review a bit, if you take
into account both the salesperson's scientific hypothesis and
the direction we subtracted, you expect the mean difference score
to be below zero (because the Old minus New would be negative).
Now think about the t formula. t is the mean difference score (MD)
divided by a standard deviation and a square root term. Remember,
both the standard deviation and a square root must be positive.
So if a negative MD is divided by positive numbers (standard
deviation and a square root), then the result (t) must be negative.
Therefore H1 expects t to be negative.
E(t given H1) < 0
Now let's get back to the question whether the critical value
of t should be plus or minus. Since the null hypothesis (H0)
expects t = 0 and the alternative hypothesis (H1) expects
t to be negative, it makes sense to place our rejection region on
the negative side of 0. So our t critical should be negative. In
short, the critical value should be minus because we are predicting
a negative average difference score, and therefore a negative t.
The critical value is, therefore, -1.833.
If that discussion is a little bit shaky for you, that's ok.
Next we will go over the whole process of setting up rejection regions,
so you can go over the whole process again with lots of visual support.
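
Since the tables are not shown, here is one way the table value could be reproduced numerically. This is a sketch under assumptions, not how the printed tables were built: it integrates the t distribution with Simpson's rule and searches by bisection, and the function names t_pdf, t_cdf and critical_t are my own. The last step attaches the minus sign by hand, following the logic above.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    const = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return const * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), by Simpson's rule on [0, |x|] plus symmetry around zero."""
    width = abs(x)
    h = width / steps
    total = t_pdf(0.0, df) + t_pdf(width, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def critical_t(alpha, df, tails=1):
    """Positive table value: the t that leaves alpha / tails in the upper tail."""
    target = 1 - alpha / tails
    low, high = 0.0, 50.0  # assumed search range, wide enough for this example
    for _ in range(60):    # bisection: halve the interval until it is tiny
        mid = (low + high) / 2
        if t_cdf(mid, df) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2

table_value = critical_t(alpha=0.05, df=9, tails=1)
# H1 predicts a negative t (Old minus New), so the rejection region is below zero.
critical_value = -table_value
print(round(table_value, 3), round(critical_value, 3))  # 1.833 -1.833
```

The tables (and this sketch) only ever give the positive number. Whether it becomes plus, minus, or both is decided by H1 and the direction of subtraction.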


Let's set up our number line. The full range of t goes from
negative infinity to positive infinity, and in the center of that
gigantic range is zero. The null hypothesis expects t to be 0. So
we would not want to reject H0 if our calculated t was
close to zero. But what defines what is close to zero versus what
is far from zero? The critical value does. As we said before the
critical value of t divides the range of t into regions that are
close to 0 (the expectation of H0 ) versus far from zero.
The table tells us that our critical value is 1.833; and because
the alternative hypothesis (H1) tells us t should be
negative, we place our rejection region below zero. So the critical
value is -1.833.

Let's put our critical t (-1.833) on the number line and then
draw the "Reject H0" and the "Do not Reject H0"
regions. Of course the "Reject H0" region is the one farthest
from 0, and the "Do not Reject H0" region is the one that
includes 0.
Next, let's put our calculated t (-4.26) on the number line.

We see that calculated t falls in the rejection region. Consequently
we reject the null hypothesis.
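
The number-line comparison for this one-tailed test amounts to a single inequality. A minimal Python sketch, using the two values from the text (the variable names are illustrative):

```python
t_calculated = -4.26
t_critical = -1.833  # negative because H1 predicts a negative t

# The rejection region is everything below the critical value.
reject_h0 = t_calculated < t_critical
print("Reject H0" if reject_h0 else "Do not reject H0")  # Reject H0
```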

What we're saying when we reject H0 is that if
H0 is true, then the chance of getting a calculated
value of t out here in this rejection region is no more than .05,
or 1 in 20. This is pretty improbable. Improbable enough
to be considered implausible. If the data are solely determined
by chance, then the probability of getting a calculated t value in
the rejection region is no more than .05. This probability is so small
(1 in 20) that, by general agreement, scientists are willing to
say the once plausible competing hypothesis that chance alone is
determining the data is no longer plausible.
In contrast notice that H1 (which comes from the scientific
hypothesis) predicts a value of t below 0. So the fact that we got
a calculated t = -4.26 is consistent with H1. Therefore
H1, and with it the salesperson's scientific hypothesis,
remains plausible.
All we've accomplished with all these statistical machinations
is simply to eliminate "chance" as a plausible explanation
of the data pattern, while leaving the scientific hypothesis plausible.
This is what we mean by having a valid statistical conclusion (or
Statistical Conclusion Validity).
We've not proven or even supported the scientific hypothesis
in any deep sense. Whether people come to believe the scientific
hypothesis or not depends on many important issues in the research
design. These issues, often grouped as Internal Validity, External
Validity, and Theoretical Construct Validity are addressed in research
methods.
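
The "1 in 20" idea can be illustrated with a small simulation. This sketch is not part of the lecture and rests on assumptions of my own: chance alone is modeled as difference scores drawn from a normal distribution centered on zero, with an arbitrary spread of 3 words per minute, and the function name correlated_t is illustrative. The t formula is the one given in words earlier.

```python
import math
import random

def correlated_t(diffs):
    """t for correlated means, using the formula stated in words above."""
    n = len(diffs)
    mean_d = sum(diffs) / n
    sd_d = math.sqrt(sum(d * d for d in diffs) / n - mean_d ** 2)
    return mean_d / (sd_d / math.sqrt(n - 1))

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 20000
rejections = 0
for _ in range(trials):
    # H0 is true by construction: the differences average zero in the long run.
    diffs = [rng.gauss(0, 3) for _ in range(10)]
    if correlated_t(diffs) < -1.833:  # one-tailed rejection region, df = 9
        rejections += 1

rate = rejections / trials
print(rate)  # expect a proportion close to .05
```

When chance alone is at work, only about one simulated study in twenty lands in the rejection region, which is why a calculated t of -4.26 makes the PCH of chance implausible.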
Student question: "If it were a two-tailed test, would
there be two rejection regions?" Yes, if it were two-tailed,
there would be a critical value below zero with a negative sign
and the exact same critical value above zero with a positive sign.
BUT neither of them would be 1.833, because you would use a different
column in the tables for a two-tailed test. That's a good question.
In fact you can go up to the "CORRELATED t MENU" and redo this example
from the accountant's point of view so that you have a non-directional
scientific hypothesis which generates a two-tailed rejection region.

Now on the Navigation Panel press (twice) the "up to
menu" arrow button and go back up to the "CORRELATED t MENU."
Press the "EXAMPLE 2-TAILED" button 
Use the "Next," "Back," and "Continue"
Buttons to navigate through the material presented on the screens.
We will use the same Jer-Key salesperson example. But this time
we will use the accountant's scientific hypothesis. The first
set of screens indicates that the research design is the same, the
data is the same, and the results are the same: t = -4.26, with
df = 9. In other words, irrespective of what the scientific hypothesis
might be, the data are the data and the results are the results.

Let's look at the scientific and statistical hypotheses. The
accountant is saying, "Yeah, I admit that technology has an
impact. I'm just not sure if it's good or bad. The New
keyboard might produce faster or slower typing speeds." So
there's a non-directional scientific hypothesis.


The accountant's non-directional scientific hypothesis generates
a two-tailed H1. Since the new keyboard might make typing
speeds faster or slower, the MD might be either above
zero or below. In other words the MD will not be equal
to zero.
The alternative hypothesis (H1) expects MD
to NOT equal zero. This is a two-tailed alternative.

We next use the tables (not shown) to find the t critical, which
depends on three things: alpha, df, and whether or not this is a
one- or two-tailed test.
Let's leave alpha at .05; degrees of freedom remain at 9, and
we have a two-tailed test. Given those three things, the tables
tell us that the critical value of t is 2.262.
But there is a question on the screen. Should 2.262 have a plus
or minus in front of it? Before you press "Continue" answer
that question for yourself.

The answer is both. Because H1 expects MD
to be either above or below zero we want critical values that allow
us to reject H0 either above or below zero. So we have
two critical values: +2.262 and -2.262.

Let's set up our number line. The full range of t goes from
negative infinity to positive infinity, with zero in the center.
The null hypothesis expects t to be 0. So we would not want to reject
H0 if our calculated t was close to zero, where "close"
is defined by the critical values. As we said before, the critical
value of t divides the range of t into regions that are close to
0 (the expectation of H0 ) versus far from zero.
We place the critical values of t on the number line to define
the rejection regions. In a two-tailed test we have two rejection
regions, one on each side of zero.
Next, let's put our calculated t (-4.26) on the number line.
We see that calculated t falls in the rejection region. Consequently
we reject the null hypothesis.
Statistical Conclusion Validity. OPTIONAL: You can continue
with the next series of screens which complete the ideas about statistical
conclusion validity. To get exact commentary to go with the screen
go back to the text for the 1-tailed example; the screens and
the text are identical.
What I'll do here is rephrase all that in different words
without reference to screens in case you want another discussion
of the issues in different words.
Since H0 expects calculated t to be zero, if we get
any t close to zero, then there's no reason to reject H0.
The question is, what does "close to zero" mean? "Close"
is always defined by the critical values, in this case -2.262
and +2.262. We will reject above +2.262 and below -2.262. That's
why this is called two-tailed: we reject in either tail, the tail
that goes to negative infinity or the tail that goes to positive
infinity. We do not reject H0 in the middle, near zero,
because H0 is predicting a t of zero.
"Close to zero" is defined as "between the critical
values" and "far from zero" is defined as "outside
the critical values."
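
In code, "between the critical values" is a single chained comparison. A minimal Python sketch of the two-tailed decision, using the values from this example (the variable names are illustrative):

```python
t_calculated = -4.26
critical = 2.262  # two-tailed table value for alpha = .05, df = 9

# "Close to zero" means between -2.262 and +2.262.
close_to_zero = -critical <= t_calculated <= critical
print("Do not reject H0" if close_to_zero else "Reject H0")  # Reject H0
```

The same calculated t is rejected here as in the one-tailed example, although the critical values sit farther from zero.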
This completes the material on the t test for correlated means.
You can go up to the "Correlated t Menu" and select "Formula"
any time that you want to quickly review the formula.