Copyright 2000, Tom Malloy

Test of Association
There are two different Chi
Square tools. We discussed the Goodness of Fit Chi Square in the
previous lecture. Now we will discuss the Chi Square Test of Association.
The Chi Square Test of
Association was derived mathematically by Karl Pearson early
in the twentieth century, and is often known as Pearson's Chi Square Test
of Association. Pearson showed that the formulas for the Goodness
of Fit statistic we learned last lecture and the Test of Association
statistic we will learn in this lecture both have Chi Square as
their sampling distribution. In this class we don't have sufficient
mathematical prerequisites to follow these proofs, but suffice it
to say they were great insights at their time, insights which have
allowed substantial scientific sophistication to be brought to bear
on frequency-categorical data.
Quite often we have two
categorical variables such as gender (male, female) and job status
(managers, clerks). We wonder whether there is an association (or
correlation) between the two categorical variables; that is, is
there a relationship between a person's gender and their job status?
As another example, we might be interested in the potential association
of Political Party (Republican, Democratic, Independent, Other)
and Environmental Attitudes (Preservation, Development, Other).
The Chi Square Test of Association allows us to evaluate associations
(i.e., correlations) between categorical variables such as these.
We will begin our discussion
of the Chi Square Test of Association with the three criteria that
are necessary for its appropriate use.

THREE CRITERIA
1) Recall that in the Chi Square Goodness
of Fit we needed a partition (a system of mutually exclusive and
exhaustive categories). Now, in the Chi Square Test of Association,
we need TWO partitions. 2) We also need some number of independent
observations. 3) Finally, we need frequency data.
CRITERION #1: 2 Partitions. In probability
theory a partition is a set of categories that is both mutually exclusive
and exhaustive. As you know, categories are pigeon holes, or places
where we can put things conceptually. To be a partition, a set of
categories has to be both mutually exclusive and exhaustive. Mutually
exclusive means that each observation can go into one and only one category.
No idea (nor thing, nor observation) may go in more than one category.
Exhaustive
means that the set of categories cover every possible case. It means
that every object we observe can be put into one of the categories.
More detail on these terms is available
in the Chi Square Goodness of Fit lecture.
Often when you fill out a survey you'll
find that there is an extra category called "Other." "Other"
is a great category because it ensures that whatever set of choices
is offered must be exhaustive, because if you don't fit in any of
the existing categories then you're in "Other".
Examples
of Partitions
As we discussed in the Goodness of Fit lecture,
gender is a partition. It is a set of mutually exclusive and exhaustive
categories. There are only two categories in gender--male and female.
You cannot be both at the same time so male and female are mutually
exclusive. Moreover, for human beings,
gender is exhaustive because everybody goes in one of the two categories.
Male and female exhaust the possibilities.
Let's look at another example of a partition.
Let's say at a particular corporation we classify people according
to their job status. At this particular corporation the job status
is either managerial or clerical and there are no other categories.
(This is unrealistic, but it keeps the calculations in this example
simple.) These categories are mutually exclusive. Any employee is
either going to be a manager or they're going to be a clerk; they
can't be both. And, since we've said that those are the only job
types, those two job types exhaust all the possibilities.
Example using
2 Partitions
Now let's create a running example for this
lecture. We will classify every employee of a business both by gender
and by job status. That is, we will apply two partitions in categorizing
people. As you can see from the graphic, that results in four possibilities:
Male clerk, Female clerk, Male manager, and Female manager.
Typically, in this
kind of chi square you will make some kind of table. For this small
example we have a two by two (2 x 2) table; but the table could
be a seven by four table (7 x 4) or any other size, depending on how many
pigeon holes are in each partition.
CATEGORICAL VARIABLES. Another way to speak
about this example is that we have two categorical variables: Gender
and Job Status. When you classify by two categorical variables,
thus creating all the possible combinations of the two (clerical
male, clerical female, managerial male, managerial female) it is
called "crossing" the variables.
To do a Chi Square Test of Association
you must classify each of your observations not with just one but
with two partitions.
In this simple example, male and female
is one dimension and job status is the other dimension. You'll notice
that our requirement is that every person we observe in our study
can be classified by each of the partitions. So everybody will go
into one of these "cells" created by crossing these two
partitions with each other.
CRITERION
#2. N Independent Observations
The
second criterion for the Chi Square Test of Association is that
we have some number of observations, each independent of the others.
In our running example, say that we classify 200 people who work
at a corporation by gender and job status. We will assume that these
people are independent of each other and that they are just picked
out at random.
In
the Goodness of Fit lecture we discussed the case of a rat in a
T-maze. In a T-maze a rat has to turn right or left so you have a partition
for each trial. However, if you observe the same rat over many trials,
these observations would not be independent because the particular
rat may have a turning bias.
The
important point is that you need to think about the observations
that you are making and decide whether they're independent or not.
For our example we are assuming that for each person who works at
this particular company, their gender is independent of the next
person's. The fact that one person was born a man doesn't have anything
to do with someone else who works in a different part of the corporation
being a woman. Their births are independent of each other and have
no relationship to each other. The
same must be true for job status.
CRITERION
#3: Frequency Data. The third
criterion is frequency data. Frequency data means that we don't
measure anything when we observe; all we do is put the observations
in categories and count the number of observations that fall into
each category.
Each time someone falls into a category
you can make a little hash mark in one of the cells of the table.
We are just counting, not measuring.
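A minimal Python sketch of this kind of tallying. The six records below are hypothetical and are not the data of the running example; the names `observations` and `tally` are illustrative only.

```python
from collections import Counter

# Hypothetical observations (not the lecture's data): each employee is
# classified by both partitions, gender and job status.
observations = [
    ("male", "manager"),
    ("female", "clerk"),
    ("female", "clerk"),
    ("male", "clerk"),
    ("female", "manager"),
    ("male", "manager"),
]

# Counting is all we do: one "hash mark" per observation in its cell.
tally = Counter(observations)

for cell, count in sorted(tally.items()):
    print(cell, count)
```

Nothing about any person is measured here; the only numbers produced are the cell counts.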
Think about how simply counting is different
from the measuring we did for the t tests. Our dependent variable
in t-tests has been things like someone's height or someone's weight,
or some number of puzzles solved correctly. In those examples when
we observe a person our dependent variable generated a measurement
number - how many inches tall they are; how many pounds they weigh;
how many puzzles they solve; and we actually assign that number
to that person. That kind of data is called measurement data; and
it requires statistics like t-tests, correlation coefficients, and
so on.
For Chi Square, we do not measure the participants;
we just count their frequency in various categories.



Scientific
Hypotheses. Let's say that the scientific hypothesis
is that there is gender bias in hiring and promotion in this corporation.
That means job status will depend upon gender. More specifically
it is more likely that the men will be managers and the women will
be clerks, which would be a classic kind of gender bias.
In contrast, the skeptic (or maybe the corporation's
lawyer or public relations representative) will say that hiring
and promotion are completely fair. They will note that there will
be different proportions of men and women in different categories
but this is simply due to chance and there is nothing systematic
about how different genders fall into different job status categories.
So we have our scientific hypothesis of gender
bias, and our skeptical hypothesis which is saying hiring practices
are fair and if the data shows differently then it is only due to
chance.


What is the
Question? The essential question being asked by the
Chi Square Test of Association is: "Is one way of categorizing
things related to the other way of categorizing things, or are they
independent?" Another way of putting it is, "Are the two
partitions (categorical variables) correlated (associated) with
each other or are they unrelated (independent)?"
In our example we are trying to determine whether gender is related
to job status. Is there a correlation (association) between a person's
gender and her/his job status?
To answer this question we collect some data. We go to the business
and collect the relevant information on 200 employees. We put the
information into a table like the one we've been showing which crosses
job status with gender.
ASSOCIATION MEANS PREDICTABILITY. So let's repeat our essential
question in yet another way. Are the two classifications correlated
or associated or are they independent? That is, can you predict
one from the other? More specifically, can you predict someone's
job status from their gender? If there is an association between
gender and status, it will mean that you can make a good guess about
their job status by knowing their gender.
Let's examine how
the frequency data in the table would look in several cases from
complete dependence to complete independence. We examine 200 people
from the business. I've made the example so that of these 200, 110
are women and 90 are men.
Complete
Dependence
We'll start with the
extreme case--complete dependence
between gender and job status. If your study showed
the observed frequencies in the table on the graphic (not a single
male clerk and not a single female manager), that data would have
to come from watching old sitcoms from the 1950's. In any event,
if your data looked like this, where all 90 of the men are managers
and all 110 of the women are clerks, the data would indicate complete
dependence or complete predictability. In this example, the data
would demonstrate a case of extreme gender bias. The association
between gender and job status is as high as it can be.
If I told you that
someone is a clerk and then asked you to guess what their gender
is, you would be able to predict their gender perfectly. If a person
is a clerk, then she must be female. If I said someone is a man,
you could guess his job status perfectly; he must be a manager.
This is complete dependence of gender and status.
Strong
Dependence
Here is an example of observed frequency
data indicating partial but strong dependence. Although this example isn't
quite so clear cut, there is still a great deal of predictability
between the two categorical variables.
In the current graphic, out of 200 people,
there were 20 male clerks, 70 male managers; there were 100 clerical
females, and 10 female managers. In other words, 22% of the men
are clerks whereas 91% of the women are clerks.
Now you can't make a perfect prediction
from gender to job status, but you can make a pretty good guess.
If I say that somebody is a manager and asked you to guess whether
they're a male or a female, you could make a pretty good guess.
You'd guess a manager would be male. Now you wouldn't be right all
of the time, but you'd be right most of the time.
In this example there's strong predictability
from gender to job status and vice versa. In other words there is
a strong association between gender and job status.
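The percentages quoted for the strong dependence example can be checked with a short Python sketch. The dictionary layout and the helper name `percent_clerks` are my own choices; the four counts come from the example.

```python
# Observed frequencies from the strong-dependence example.
# Outer keys are job status, inner keys are gender.
observed = {
    "clerk":   {"male": 20, "female": 100},
    "manager": {"male": 70, "female": 10},
}

def percent_clerks(gender):
    """Percentage of the given gender whose job status is clerk."""
    total = sum(observed[status][gender] for status in observed)
    return 100 * observed["clerk"][gender] / total

print(round(percent_clerks("male")))    # 22
print(round(percent_clerks("female")))  # 91

# How often would the guess "a manager is male" be right?
managers = observed["manager"]
pct_managers_male = 100 * managers["male"] / sum(managers.values())
print(pct_managers_male)  # 87.5
```

Guessing "male" for every manager would be right 87.5% of the time: most of the time, but not always.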
Independence
What would the data look like if the partitions
were independent or had no relationship whatsoever? In such a case,
you would not be able to predict job status from gender any better
than chance.
I've changed the example such that half
the employees are managers and half are clerks. In line with this,
you'll notice that half of the women are clerical and half are managers.
The same goes for the men, half of them are clerical and half of
them are managers. In other words, 50% of all employees are managers,
AND 50% of the men are managers and 50% of the women are managers.
I've made the example such that men and
women are not equal in number. Out of every 100 people there are
45 men and 55 women. So 45% of the employees are men and 55% of
the employees are women. If we look at the 100 clerks, we find that
45% of them are men and 55% of them are women. That is the same
as the percentage of men and women in the company. There appears
to be no bias whatsoever, not even chance variations.
NO ASSOCIATION. In this case then, we have
independence of the two partitions or the two category systems.
The category system called gender is independent of the category
system called job status. Or in the language used in the Chi Square,
there is no association between gender and job status.
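As a rough check on the independence example, the sketch below compares the percentage of men within each job status to the percentage of men overall. It assumes the table implied by the text: 100 clerks and 100 managers, each group made up of 45 men and 55 women.

```python
# Independence example, assuming 200 employees: 100 clerks and 100 managers,
# each group 45 men and 55 women.
observed = {
    "clerk":   {"male": 45, "female": 55},
    "manager": {"male": 45, "female": 55},
}

n = sum(sum(row.values()) for row in observed.values())
overall_pct_men = 100 * sum(row["male"] for row in observed.values()) / n

# Percentage of men within each job status.
pct_men_by_status = {
    status: 100 * row["male"] / sum(row.values())
    for status, row in observed.items()
}

print(overall_pct_men)    # 45.0
print(pct_men_by_status)  # {'clerk': 45.0, 'manager': 45.0}
```

Knowing someone's job status tells you nothing extra about their gender, because the percentage of men is the same in every row as it is overall.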
So that's the question for this kind of test statistic. It's
a different kind of question than the one you would have for a t
test which uses measurement data.
The Research Data
For our running example, let's suppose that our research yields
the data shown in the current graphic.
N = 200 total people. Males = 90 out of 200 or 45%. Females = 110
out of 200 or 55%. Clerks = 120 out of 200 or 60%. Managers = 80 out
of 200 or 40%.



DATA
PATTERN
In this particular case,
the data pattern fits the scientific hypothesis. You can argue it
any number of ways. For instance, 120 out of 200, or 60% of all
employees are clerical, but 90 out of 110, or 82% of females are
clerical, and 30 out of 90, or 33%, of the males are clerical. These
are the kinds of arguments that someone who thought there was gender
bias would make. They'd say "Look, 60% of all employees are
clerical, but 82% of the females and only 33% of the males are clerical.
There appears to be gender bias."
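The totals and percentages in this argument can be reproduced with a small Python sketch. The manager counts (60 men, 20 women) are not stated in the text; they are inferred here from the stated totals (90 men, 110 women, 30 male clerks, 90 female clerks). Variable names are illustrative.

```python
# Observed frequencies for the running example.
# Rows: clerks, managers. Columns: male, female.
# The manager row is inferred from the stated totals.
observed = [
    [30, 90],   # clerks
    [60, 20],   # managers
]

row_totals = [sum(row) for row in observed]        # clerks, managers
col_totals = [sum(col) for col in zip(*observed)]  # males, females
n = sum(row_totals)

pct_clerical_all = 100 * row_totals[0] / n
pct_clerical_male = 100 * observed[0][0] / col_totals[0]
pct_clerical_female = 100 * observed[0][1] / col_totals[1]

print(row_totals, col_totals, n)   # [120, 80] [90, 110] 200
print(round(pct_clerical_all))     # 60
print(round(pct_clerical_male))    # 33
print(round(pct_clerical_female))  # 82
```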
CHANCE
The skeptic would say, "Well
I think that data pattern is just happening by chance." The
PCH of chance says that these data could have come about by chance
variations in hiring and promotion.
EVALUATE
THE PCH OF CHANCE
To deal with the plausible
competing hypothesis (PCH) of chance, we're going to do a Chi Square
Test of Association.

Let's calculate the Expected
Frequencies
Review
The
slide to the left and the two slides below summarize the example.
Review these three slides and then we'll start calculating the expected
frequencies.


FORMULA
FOR EXPECTED FREQUENCIES
This formula will make
most sense after you have worked through the example.
Calculating
the Expected Frequencies
We already have the observed frequencies
for each cell in our table. Now we will calculate an expected frequency
for each cell. The symbol for expected frequency is fe. We will
learn how to calculate the expected frequencies by going through
all the cells in the example.
NOTATION. We
want to be able to note and communicate which of the four cells
we are talking about at any given moment. The table is 2-dimensional,
so we need two indices to describe a location in it. By convention,
we will call these two indices "j" and "k."
As the blue arrow in the graphic shows, the
index j runs down rows; it tells us which row we are
in. And the index k runs across columns;
it tells us which column we are in. [It's arbitrary which dimension
we call j and which we call k. We just have to agree on which is
which.] The general symbol for the expected frequency in a particular,
unspecified cell is fe(jk). This can be read as the expected frequency
for the cell where row j intersects with column k. [Generally, "jk"
is a subscript, but it is currently difficult to write subscripts
in web html text. So I'll use a parenthesis around jk when I need
to be clear. In obvious cases I'll just write the indices without
the parentheses.]
Our notation is such that we put row (j)
first and column (k) last when indexing a cell in a table. So fe(11)
is the expected frequency for the first row in the first column.
fe(12) indicates the expected frequency in row 1, column 2. fe(21)
indicates the second row in the first column; and fe(22) is the
cell that is in the second row of the second column.
CALCULATING THE EXPECTED FREQUENCY FOR CELL
1-1. To repeat, the symbol, fe11,
is the designation we give for the cell where j =1 and k =1. In
this example cell 1-1 is the upper left hand cell (which is where
we placed clerical male employees). The expected frequency for that
cell is determined by the total number in that row (which is 120)
times the total number in that column (which is 90) divided by the
total number of observations in the whole table (which is 200).
This is somewhat confusing to describe in
words. But it's really simple if you look at the examples in the
graphics. All you need to do is find the row total, multiply it
by the column total, and divide by the grand total (the total number of observations).
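Written as an equation, the rule might look like the sketch below. The symbols $R_j$ (total of row $j$), $C_k$ (total of column $k$) and $N$ (total number of observations) are assumed here for illustration; only the indices j and k and the symbol fe come from the text.

```latex
f_{e(jk)} = \frac{R_j \, C_k}{N},
\qquad\text{for example}\qquad
f_{e(11)} = \frac{120 \times 90}{200} = 54
```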
EXPECTED FREQUENCY FOR CELL 1,1. So you
calculate the expected frequency for the first cell as 120 times
90 over 200. fe11
is equal to 54.
Attempt to find the expected frequencies
for the other cells on your own before you look at the results below.
Expected Frequencies for the Other
Cells
 
Go ahead and find the expected frequency for cell 1-2 (row 1, column
2) or the female clerks cell.
One convenient way to summarize the information you will need when
we eventually get to the formula is to write the expected frequency
in the cell with the observed frequency.
fe(12).
The expected frequency for row 1, column 2 is 120 times 110, over
200, which equals 66.

fe(21). For row 2, column
1 the expected frequency works out to be 80 times 90 over 200, or
36.
fe(22). Then the expected
frequency for the cell in row two, column two is 80 times 110 over
200, or 44.
In the final graphic of the series, the table has observed frequencies,
which are the black colored numbers, and expected frequencies which
are shown in blue.
The observed frequencies are the data. The expected frequencies
are what the data should have come out to be if Gender were NOT associated
with Job Status.
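The same four expected frequencies can be computed in one pass. This is a minimal Python sketch of the "row total times column total, divided by N" rule; the function name `expected_frequencies` is my own, and the manager counts in `observed` are inferred from the stated totals.

```python
# Observed frequencies. Rows: clerks, managers. Columns: male, female.
observed = [
    [30, 90],
    [60, 20],
]

def expected_frequencies(table):
    """Expected frequency for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
print(expected)  # [[54.0, 66.0], [36.0, 44.0]]
```

Because the function works from the row and column totals alone, it applies unchanged to larger tables such as a 7 x 4.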

The Formula

Let's look at the formula. It tells you to sum up the squared differences
between the expected and the observed frequencies and divide each
squared difference by the expected frequency. This is the same as
the Goodness of Fit, except that it is for a 2-dimensional case,
so the formula uses a double summation notation.
DEGREES OF FREEDOM. The degrees of freedom are the number of rows
minus one times the number of columns minus one. In terms of notation,
capital J represents the number of rows, and capital K represents
the number of columns.
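Written out from the description above, the formula might be typeset as in the following sketch, where fo(jk) is the observed frequency and fe(jk) the expected frequency in the cell at row j, column k.

```latex
\chi^2 = \sum_{j=1}^{J} \sum_{k=1}^{K}
         \frac{\left(f_{o(jk)} - f_{e(jk)}\right)^2}{f_{e(jk)}},
\qquad
df = (J - 1)(K - 1)
```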

Calculations

Here is what our 2 x 2 table would look like given our data
and the calculated expected frequencies.
Next we determine the deviation between observed frequency and
the expected frequency for each cell. We square the deviation for
each cell. Finally we divide the squared deviation by the expected
frequency for that cell. The four graphics below show all the calculations.

IN WORDS. Get a value for each cell by determining (fo - fe) squared
over fe. In the example the values for the four cells are 10.67,
8.73, 16, and 13.09.
Then sum up the values for all the cells to get the final chi square
value. In the example chi square = 10.67 + 8.73 + 16 + 13.09 = 48.49.
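The cell-by-cell arithmetic can be sketched in Python as follows. The observed and expected tables are entered directly (rows: clerks, managers; columns: male, female), and the variable names are illustrative.

```python
observed = [[30, 90], [60, 20]]
expected = [[54.0, 66.0], [36.0, 44.0]]

# (fo - fe) squared over fe, computed for every cell.
cell_values = [
    [(fo - fe) ** 2 / fe for fo, fe in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Chi square is the sum over all cells.
chi_square = sum(sum(row) for row in cell_values)

for row in cell_values:
    print([round(value, 2) for value in row])
print(round(chi_square, 2))
```

Summing the unrounded cell values gives approximately 48.48. Summing cell values that were first rounded to two decimals can differ from this in the last digit.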
Degrees of Freedom

The formula for degrees of freedom is (the number of rows minus
one) times (the number of columns minus one). It can be symbolized
by (J - 1)(K - 1). In this case, the degrees
of freedom equal (2 - 1)(2 - 1), or just 1.
[Note: Capital J is used to indicate the number of rows. As we've
already said, little j is the index for some particular row. The
same is true of K and k.]
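A short Python sketch of the same rule, assuming the table is stored as a list of rows (the function name `degrees_of_freedom` is illustrative):

```python
def degrees_of_freedom(table):
    """(J - 1)(K - 1), where J is the number of rows and K the number of columns."""
    n_rows = len(table)      # J
    n_cols = len(table[0])   # K
    return (n_rows - 1) * (n_cols - 1)

# The 2 x 2 table of the running example.
print(degrees_of_freedom([[30, 90], [60, 20]]))  # 1

# A 7 x 4 table (the cell contents do not matter for the degrees of freedom).
seven_by_four = [[0] * 4 for _ in range(7)]
print(degrees_of_freedom(seven_by_four))  # 18
```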
Now we can go on to the topic of statistical conclusion validity.
STATISTICAL
HYPOTHESES
The observed frequencies
(fo's) are the data we collect. The expected frequencies are what
the data should be if the two categorical variables are independent
(not associated).
NULL HYPOTHESIS. The
skeptic thinks that there is no association between categorical
variables (in this case, between gender and status) so the observed
frequencies should equal the expected frequencies other than chance
differences. So the corresponding null hypothesis is that we expect
the difference between observed and expected frequency in each cell
to equal zero.
ALTERNATIVE HYPOTHESIS.
The scientist thinks that there IS an association between the two
categorical variables. So the scientist thinks the data will differ
from the expected frequencies. So the corresponding alternative hypothesis
is that the difference between the observed and expected frequencies
will NOT be zero.
Statistical Conclusion
Validity

Here we show the sampling distribution of chi square again. The
number line along the bottom of the graph goes from zero to positive
infinity for chi square. Chi square is a squared entity - everything
in it is squared. Even if you get negative numbers from your cell
calculations, they are going to be squared and made into positive
numbers. You cannot get a chi square below zero. If you calculate
a chi square below zero, you made a mistake.
So the range of the Chi Square test statistic goes from zero to
positive infinity.
What H0 predicts

If you picture the null hypothesis
in your mind, you'll remember that we expect the difference between
observed and expected frequencies to be zero. If the data matched H0 exactly,
then every term (for every cell) in the Chi Square formula would
be zero. That is, there would be no differences between the observed
and the expected frequencies in any cell. Therefore, each of those
differences would be zero, zero squared is zero, and the whole Chi
Square would be equal to zero. So H0 is predicting
values of Chi Square near zero. So high values of Chi Square are
not what H0 is predicting.
The Rejection Region

To find the critical value you need to know
the degrees of freedom and your selected alpha level. Then you just
look the critical value up in your table. The critical value of
chi square, with one degree of freedom and alpha of .05, is 3.84.
Chi Square tables are available on the course
web site.
You draw your "Reject H0" and
"Do not reject H0" regions based on this critical value
of 3.84. We found that our calculated chi square of 48.49 falls
in the rejection region; therefore, we would reject H0.
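The decision rule can be sketched in Python as a simple comparison. The small lookup table below holds standard critical values at alpha = .05 (3.84 for 1 degree of freedom is the value used in the text; 5.99 for 2 degrees of freedom is added for illustration). The names `CRITICAL_05` and `decide` are my own, and 48.48 is the unrounded chi square for the running example.

```python
# Critical values of chi square at alpha = .05, keyed by degrees of freedom.
CRITICAL_05 = {1: 3.84, 2: 5.99}

def decide(chi_square, df, critical_values=CRITICAL_05):
    """Compare a calculated chi square with the critical value for its df."""
    critical = critical_values[df]
    if chi_square > critical:
        return "Reject H0"
    return "Do not reject H0"

print(decide(48.48, df=1))  # Reject H0
print(decide(2.10, df=1))   # Do not reject H0
```

The second call uses a made-up chi square of 2.10 only to show the other branch of the rule.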
Why we Reject H0

H0 is predicting that there will be no difference
between expected and observed frequencies and so you should get
a chi square in the neighborhood of zero if H0 is true. By chance
alone, you may get a Chi Square value bigger than zero. However,
if H0 is true, the probability of getting a value of Chi Square beyond 3.84 by chance
alone is only .05.
Therefore using the logic we have used before
with other statistical tests, we'll reject H0 because it's very
improbable that you would get a Chi Square of this magnitude by
chance alone.
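For the special case of 1 degree of freedom, the "how improbable" part can be made concrete. With 1 degree of freedom, chi square is distributed as a squared standard normal score, so its upper tail probability can be computed from Python's standard library. This is a sketch that holds only for df = 1; the function name `chi_square_tail_df1` is my own.

```python
import math

def chi_square_tail_df1(x):
    """P(chi square > x) when there is 1 degree of freedom.

    With 1 df, chi square is a squared standard normal score, so this
    equals the two-tailed normal probability beyond sqrt(x).
    """
    return math.erfc(math.sqrt(x / 2))

# Probability of exceeding the critical value if H0 is true.
print(round(chi_square_tail_df1(3.84), 3))  # 0.05

# Probability of a value as large as the example's chi square if H0 is true.
print(chi_square_tail_df1(48.48))
```

The second probability is many orders of magnitude smaller than .05, which is why a chi square this large leads us to reject H0.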

Sampling Distribution
of Chi Square

Let's review the sampling distribution of chi-square. This slide
shows the overall 4 step process and is the same slide you saw for
the Goodness of Fit Chi Square.
The sampling distribution of the test statistic you just calculated
is called the Chi Square probability distribution. This distribution
starts at zero and goes to positive infinity. Notice also that it
is not symmetrical. Unlike a bell curve, it has a big
lump down by zero, where most of the probability is, and it has only
one tail going off toward positive infinity.
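One way to get a feel for this shape is to simulate many studies in which H0 is true. The Python sketch below draws 200 employees per study, with gender and job status assigned independently, and calculates chi square each time. It assumes population proportions equal to the example's margins (60% clerks, 45% men); the function names, the number of simulated studies, and the random seed are arbitrary choices.

```python
import random

def chi_square_2x2(table):
    """Chi square for a table of observed frequencies (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    total = 0.0
    for j, row in enumerate(table):
        for k, fo in enumerate(row):
            fe = row_totals[j] * col_totals[k] / n
            if fe > 0:  # guard against an empty row or column
                total += (fo - fe) ** 2 / fe
    return total

def simulate_null(n_people=200, p_clerk=0.60, p_male=0.45,
                  n_studies=2000, seed=1):
    """Chi square values from simulated studies in which H0 is true."""
    rng = random.Random(seed)
    values = []
    for _study in range(n_studies):
        table = [[0, 0], [0, 0]]
        for _person in range(n_people):
            j = 0 if rng.random() < p_clerk else 1  # job status
            k = 0 if rng.random() < p_male else 1   # gender, drawn independently
            table[j][k] += 1
        values.append(chi_square_2x2(table))
    return values

values = simulate_null()
share_below_1 = sum(v < 1 for v in values) / len(values)
share_beyond_critical = sum(v > 3.84 for v in values) / len(values)

print(min(values) >= 0)       # chi square is never negative
print(share_below_1)          # most values pile up near zero
print(share_beyond_critical)  # only a small share lands beyond 3.84
```

In a run like this, the share of values beyond 3.84 should come out close to .05, with most of the simulated values bunched near zero and a single long tail to the right.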
©Copyright 1997, 2000 Tom
Malloy