[A, SfS] Chapter 7: Chi-square tests and ANOVA: 7.1: Chi-square Goodness of Fit Test
Chi-square Goodness of Fit Test
Goodness-of-Fit Test
In this section, we will discuss how, given a categorical variable, we can check whether the proportions of the population falling into each of the categories fit a hypothesized frequency distribution.
Recall that a categorical variable has non-numerical values, i.e., categories. If the variable is ordinal then there is some logical order to the categories; otherwise it is nominal.
We typically display a summary of data collected for such a variable using a frequency table, which shows the number of subjects in a sample which fall into each category. A researcher could be interested in whether or not these frequencies (also called counts) match some hypothesized distribution. This could be investigated using a goodness-of-fit test.
Hypotheses
Suppose a categorical variable has #k# distinct categories. Let #p_i# denote the proportion of the population that belongs in category #i#, for #i = 1,...,k#.
In this context, a goodness-of-fit test can be used to determine whether the observed frequencies are consistent with some hypothesized distribution.
The null hypothesis of a goodness-of-fit test is defined as:
\[H_0 : p_1 = p_{1,0},p_2 = p_{2,0},...,p_k=p_{k,0}\]
where #p_{i,0}# is some specified value for the proportion of the population belonging to category #i#. Note that we must have #p_1 + p_2 + \cdots + p_k = 1#.
The research hypothesis of a goodness-of-fit test is defined as:
\[H_1 : p_i \neq p_{i,0}\]
for at least one #i#, meaning that at least one proportion does not equal its hypothesized value (i.e., the frequency distribution of the sample does not fit the null distribution).
Note that the test of #H_0 : p = p_0# against #H_1 : p \neq p_0#, which we discussed in the previous chapter, is a simplified version of this test with #k=2#.
Test Statistic and Null Distribution
In a random sample of #n# subjects we would expect \[E_i = n \cdot p_{i,0}\] subjects to fall into category #i# if #H_0# is true, for #i = 1,...,k#.
Let #X_i# denote the actual number of subjects observed in category #i# in the sample, for #i = 1,...,k#.
The test statistic of a goodness-of-fit test is:
\[X^2 = \sum_{i=1}^k \cfrac{\Big(X_i - E_i \Big)^2}{E_i}\]
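In particular, for the #k=2# case mentioned above, this statistic is the square of the usual one-proportion #z# statistic #\cfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}#. As a quick numerical check in #\mathrm{R}# (with hypothetical counts: #18# subjects in the first category out of #n = 50#, and #p_{1,0} = 0.25#):
> x = 18; n = 50; p0 = 0.25
> phat = x/n
> ((phat - p0)/sqrt(p0*(1 - p0)/n))^2
> O = c(x, n - x); E = n*c(p0, 1 - p0)
> sum((O - E)^2/E)
The squared #z# statistic (third line) and the chi-square statistic (last line) return the same value, about #3.227#.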
Provided that #E_i \geq 5# for #i = 1,...,k#, it can be shown that #X^2# has an approximate #\chi^2_{k-1}# distribution (the chi-square distribution with #k-1# degrees of freedom) when #H_0# is true.
If #E_i < 5# for any category, then that category can be combined with one or more other categories until the condition #E_i \geq 5# is met for every category (this can sometimes be done by creating a category called 'other').
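For example, with a hypothetical sample of #n = 40# and hypothesized proportions #0.50#, #0.30#, #0.15# and #0.05#, the expected counts can be checked in #\mathrm{R}#:
> n = 40
> p0 = c(0.50, 0.30, 0.15, 0.05)
> n*p0
This gives #20#, #12#, #6# and #2#; the last expected count is below #5#, so the last two categories would be combined, giving proportions #0.50#, #0.30# and #0.20# and expected counts #20#, #12# and #8#, which all satisfy the condition:
> p0.merged = c(0.50, 0.30, 0.20)
> n*p0.merged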
Calculating the P-value in R
Given some observed value #x^2# for the test statistic, the P-value is #P(X^2 \geq x^2)# based on the #\chi^2_{k-1}# distribution.
This probability is computed in #\mathtt{R}# using:
> pchisq(x2, k-1, lower.tail=FALSE)
where #\mathtt{x2}# is the observed value of the test statistic.
If a significance level #\alpha# has been chosen, then #H_0# is rejected if the computed P-value is #\leq \alpha#, and #H_0# is otherwise not rejected.

Suppose we want to know if students are distributed evenly among three majors, i.e., we test #H_0 : p_1 = p_2 = p_3 = \cfrac{1}{3}# against #H_1 : p_i \neq \cfrac{1}{3}# for at least one #i#.
In a random sample of size #30# we expect
#E_1 = E_2 = E_3 = (30)\Big(\cfrac{1}{3}\Big) = 10#
students per major. Note that #E_i > 5# for each #i#.
If in the sample we observe #X_1 = 16#, #X_2 = 8# and #X_3 = 6#, then \[x^2 = \cfrac{(16 - 10)^2}{10} + \cfrac{(8 - 10)^2}{10} + \cfrac{(6 - 10)^2}{10} = 5.6\]
with #3 - 1 = 2# degrees of freedom.
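As a quick check, the same value can be computed directly in #\mathrm{R}#:
> X = c(16, 8, 6)
> E = c(10, 10, 10)
> sum((X - E)^2/E)
which returns #5.6#.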
Then the P-value is computed in #\mathrm{R}# using
> pchisq(5.6, 2, lower.tail=FALSE)
to be #0.0608#.
If we use # \alpha = 0.05# as our significance level, then we would not reject #H_0#. But we can see that the evidence against #H_0# is moderately strong.
Note that we typically do not compute any confidence intervals for the parameters #p_1,...,p_k# in this setting, although this could be done if desired.
Using R
The goodness-of-fit test can be carried out in #\mathrm{R}# using the #\mathtt{chisq.test()}# function.
You must input two vectors, one containing the observed frequencies #X_1,\ldots, X_k# for the categories, and another containing the null frequency distribution #p_{1,0},\ldots,p_{k,0}#.
The order of the categories must be the same for both vectors. You can make the vectors separately, or within the function.
For example, suppose we have #5# categories, with null frequency distribution
> P = c(0.15,0.20,0.25,0.30,0.10)
and observed frequencies
> X = c(67,78,81,92,60)
Then we can carry out the goodness-of-fit test using
> chisq.test(X, p=P)
This will produce the values of the test statistic, its degrees of freedom, and the P-value.
Note: You must include the "#\mathtt{p=}#" in front of the vector of the null frequency distribution. Otherwise, the function will treat that vector as a second sample of observed data and carry out a different test.
You could also have combined all steps into one:
> chisq.test(c(67,78,81,92,60), p=c(0.15,0.20,0.25,0.30,0.10))
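For instance, applying the function to the earlier example about students' majors reproduces the results we computed by hand:
> chisq.test(c(16, 8, 6), p=c(1/3, 1/3, 1/3))
This reports a test statistic of #5.6# with #2# degrees of freedom and a P-value of about #0.0608#.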
You will get an error message if the sum of the elements of the null frequency vector is not equal to one.
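If the null distribution is given in a form that does not sum to one (for example, as percentages), the #\mathtt{rescale.p=TRUE}# argument of #\mathtt{chisq.test()}# rescales it to proportions, e.g.,
> chisq.test(X, p=c(15, 20, 25, 30, 10), rescale.p=TRUE)
which gives the same result as #\mathtt{chisq.test(X, p=P)}# above.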
If you need to know the expected frequencies under #H_0#, add #\mathtt{$expected}# to the end of the function, e.g.,
> chisq.test(X, p=P)$expected
> chisq.test(c(67,78,81,92,60), p=c(0.15,0.20,0.25,0.30,0.10))$expected
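More generally, the output of #\mathtt{chisq.test()}# can be stored and its components extracted, for example:
> result = chisq.test(X, p=P)
> result$statistic
> result$p.value
This returns the value of the test statistic and the P-value, respectively (the variable name #\mathtt{result}# is of course arbitrary).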