Chi-square Goodness of Fit test

Consider the following problem:
A research firm claims that the distribution of the days of the week that people are most likely to order food for delivery is different from the distribution seen in the past. You randomly select 494 people and record which day of the week each is most likely to order food for delivery. The table below shows the historical distribution along with the results of your count. At α = 0.05, test the research firm’s claim.

This sounds like a test of Goodness of Fit between the historical pattern and the observed pattern.

The claim is that the actual pattern and the historical pattern are different. That means the claim is written with the ≠ (not equal) operator, which, in turn, means the claim is the alternative hypothesis.

Stating our two hypotheses:
H0: the distribution of people ordering food for delivery is 7% Sunday, 4% Monday, 5% Tuesday, …
Ha: the distribution of people ordering food for delivery differs from the expected distribution.

Putting this in math equation form using “P” to mean the proportion for each day:
H0: Distribution = {P_Sunday = 0.07; P_Monday = 0.04; P_Tuesday = 0.05; P_Wednesday = 0.12; P_Thursday = 0.11; P_Friday = 0.37; P_Saturday = 0.24}
Ha: Distribution ≠ {P_Sunday = 0.07; P_Monday = 0.04; P_Tuesday = 0.05; P_Wednesday = 0.12; P_Thursday = 0.11; P_Friday = 0.37; P_Saturday = 0.24}

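Although the ≠ math operator normally indicates a two-tailed test, Chi-square Goodness of Fit tests are always right-tailed tests. The test statistic is built from squared differences between observed and expected counts, so it can only get larger as the mismatch grows.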
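A minimal Python sketch makes this concrete. It assumes the usual Goodness of Fit statistic, the sum of (O − E)² / E over all categories. The function name and the two "observed" lists are hypothetical and made up purely for illustration; they are not the counts from the problem.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Historical proportions from H0, Sunday through Saturday.
proportions = [0.07, 0.04, 0.05, 0.12, 0.11, 0.37, 0.24]
n = 494
expected = [n * p for p in proportions]

# Hypothetical case 1: a sample that matches the historical pattern exactly.
perfect_match = list(expected)

# Hypothetical case 2: 20 orders moved from Friday to Monday
# (made up for illustration, NOT the data from the problem).
shifted = list(expected)
shifted[5] -= 20
shifted[1] += 20

print(chi_square_statistic(perfect_match, expected))  # 0.0
print(chi_square_statistic(shifted, expected) > 0)    # True
```

A perfect match gives 0, and any departure in either direction pushes the statistic up, so only the right tail counts as evidence against H0.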

First, let’s use StatCrunch; then we will use Excel.

Remember, if you are in MyStatLab, look for the small blue rectangles near the upper right of a table. Click on them to automatically load the data into StatCrunch (and into Excel).

This is how StatCrunch looks with the data entered. I labeled the history % “Expected.”

Hint: you do not have to convert the expected % to counts; StatCrunch will do that automatically. And it is smart enough to know if the expected data is already counts (frequency) – not sure how it does this. [Note: I think I figured it out. If the total of the “expected” values = 100, StatCrunch assumes the values are percentages. If they do not equal 100, StatCrunch assumes they are counts.]

Use the command sequence Stat > Goodness of Fit > Chi-square test. In the Observed: box, select the “Frequency_f” column and in the Expected: box, select “Expected.” In the Display: box, select “Expected” so the actual counts will be shown. Click Compute!.

The results box appears. The χ² test statistic is 21.107 and the p-value is 0.0018, which is less than our alpha of 0.05.

Remember to check that each of the expected frequencies is at least 5, which they are in this problem. If any of the expected frequencies is less than 5, the Chi-square test is not valid.
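If you want to double-check the expected frequencies outside StatCrunch, here is a short Python sketch. It only uses the sample size (494) and the historical proportions from H0; the variable names are my own.

```python
days = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]
proportions = [0.07, 0.04, 0.05, 0.12, 0.11, 0.37, 0.24]
n = 494  # sample size from the problem

# Expected count for each day = sample size * historical proportion.
expected_counts = {day: n * p for day, p in zip(days, proportions)}

for day, count in expected_counts.items():
    print(f"{day:<10}{count:8.2f}")

# Check the expected-frequency condition for the Chi-square test.
all_at_least_5 = all(count >= 5 for count in expected_counts.values())
print("All expected counts at least 5:", all_at_least_5)
```

The smallest expected count is Monday's, 494 × 0.04 = 19.76, so the condition is comfortably met.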

If your problem requires you to find the critical value and rejection region, use the StatCrunch Chi-square calculator: Stat > Calculators > Chi-square. Enter the degrees of freedom, DF, which are the number of levels of the variable minus 1, i.e. 7 – 1 = 6 for this problem. Always select the ≥ option in the P(x) box. Enter alpha, 0.05 for this problem, and click Compute.

The critical value, χ²₀, is 12.59 and the rejection area is any value of χ² greater than 12.59. The χ² test statistic of 21.107 is greater than 12.59 and thus falls within the rejection area.
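The critical value and the p-value can also be reproduced with nothing but the Python standard library. This is a sketch under one stated assumption: it uses a closed-form right-tail area that only works when the degrees of freedom are even (DF = 6 here). The function names are hypothetical; for general use, `scipy.stats.chi2.ppf` and `scipy.stats.chi2.sf` are the more usual choice.

```python
import math


def chi2_right_tail(x, df):
    """P(X >= x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(df // 2))


def chi2_critical_value(alpha, df, lo=0.0, hi=200.0, tol=1e-9):
    """Bisect for the x whose right-tail area equals alpha."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_right_tail(mid, df) > alpha:
            lo = mid  # tail area still too large, move right
        else:
            hi = mid
    return (lo + hi) / 2.0


df = 7 - 1          # number of levels minus 1
alpha = 0.05
statistic = 21.107  # test statistic reported by StatCrunch

critical = chi2_critical_value(alpha, df)
p_value = chi2_right_tail(statistic, df)

print(round(critical, 2))  # 12.59
print(round(p_value, 4))   # 0.0018
print(statistic > critical, p_value < alpha)  # True True
```

Both checks agree with the StatCrunch output: the statistic lies beyond the critical value, and the p-value is below alpha.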

Using either the p-value approach or the critical value approach, we reject the null hypothesis.

Because our claim was the alternative, we conclude there is sufficient evidence to support the claim that the observed distribution of people ordering food for delivery is different from the expected pattern.

Now, let’s do the Excel solution. This takes a bit more time, but if you save your worksheet, you can reuse it on similar problems by editing the data ranges in the formulas. Note: you can click on an image to see it full size.

We get the same results as we did when using StatCrunch.

Hope this helps!
