Tail of the Test: Interpreting Excel Data Analysis t-test output

Excel’s Data Analysis ToolPak has three tools for running tests of hypotheses using the t-distribution – t-tests. The output from the tools can be a bit confusing because, unlike other statistical software, these do not allow you to specify the “tail of the test” before you run the analysis. Here is how Microsoft explains how to interpret the output here:

“Under the assumption of equal underlying population means, if t < 0, “P(T <= t) one-tail” gives the probability that a value of the t-Statistic would be observed that is more negative than t. If t >=0, “P(T <= t) one-tail” gives the probability that a value of the t-Statistic would be observed that is more positive than t. “t Critical one-tail” gives the cutoff value, so that the probability of observing a value of the t-Statistic greater than or equal to “t Critical one-tail” is Alpha.
“P(T <= t) two-tail” gives the probability that a value of the t-Statistic would be observed that is larger in absolute value than t. “P Critical two-tail” gives the cutoff value, so that the probability of an observed t-Statistic larger in absolute value than “P Critical two-tail” is Alpha.

Understanding what all that means can be a bit daunting. Here is my attempt as a simpler explanation. For convenience, I am just using the output from the t-test: Two-Sample Assuming Unequal Variances, but the concepts apply to all three t-test tools.

We will start with the most common test, the two-tail test.

Two-tail Test

A company surveyed a random sample of its employees on how satisfied they were with their job. The manager does not care if one group has a higher or lower rating, and only wants to know if there is a difference in how men and women rate their job satisfaction.

State the Null and Alternative hypotheses:

Null Hypothesis  Ho: Mean Rating Men = Mean Rating Women

Alternative Hypothesis Ha: Mean Rating Men ≠ Mean Rating Women

Note: The tail of the test is indicated by the math operator in the Alternative. Here, not equal (≠)  does not “point” to ether side, so this is a two-tail test.

Important: Put the data ranges for the two groups in the tool dialog box in the same relationship as stated in the Null. The Men group (green highlight) is on the left side of the Null equation and should be placed in the Variable 1 Range field. The Women group (red highlight) is on the right side of the Null equation and must be in the Variable 2 Range field. [your data can be anywhere in your worksheet, but it may be better to arrange it in the right relationship there as well.]

Here is the output from the tool using a significance level of alpha, α, = 0.05. Note that the Men group is on the left and the Women group is on the right in the output.

We can see that the mean for the men is smaller than that of the women. But is the apparent difference “real”? Both groups have a lot of variance relative to the size of the means, 3.22 for the men and 2.83 for the women. So, the apparent difference might just be due to the “noise” of the variances.

To make our decision on rejecting (or not rejecting) the null, we can look at the three output values I have highlighted in yellow: the t statistic, the two-tail p-value, and the two-tail critical value of t.

Why two-tail? Consider this graphic:

The first rule for deciding whether to reject the null tells us to compare our test statistic to the critical value.

When we have a two-tail test, we must put half of our significance level of 5% in each tail to account for the possibility of our test statistic being either positive or negative, i.e. one sample mean being larger or smaller than the other. Putting 2.5% in each tail we can calculate a critical value of -2.042 on the left side and +2.042 on the right side.

If our test statistic, the t Stat, falls in either rejection area, less than – 2.042 or larger than + 2.042, we must reject the Null. But here our t Stat of – 1.886 does not fall in either rejection area, so we must decide to not reject the Null.

Another, and perhaps easier way to decide, is to compare the two-tail p-value against our significance level. Thankfully, for a two-tail test, we can just use the p-value the Excel tool gives us. It is 0.069 which is larger than our alpha of 0.05. Thus, this rule also tells us to not reject the Null that there is no difference in the ratings.

Note that the two rules always agree, unless your technology tool is faulty, which is very rare.

For this two-tail test, we do not reject the Null and we conclude that there is no statistically significant difference in the job satisfaction rating for men and women.

Left-tail Test

If the manager believes the men have a lower mean rating then the women, we should run a left-tail test. Why not just run the two-tail? As you will see, a one-tail test gives us more “power” to detect a real effect that is there in the direction we believe it to have. The downside of a one-tail test, if you guess wrong and the effect is in the other direction, the test has no power to detect it.

Null Hypothesis  Ho: Mean Rating Men >= Mean Rating Women

Alternative Hypothesis Ha: Mean Rating Men < Mean Rating Women

Here is our output again with the one-tail values we need, highlighted in yellow.

The tail of the test is always determined by the math operator in the Alternative hypothesis, which in this example is the less than symbol. Remember the less than symbol < points to the left, so this is a left-tail test.

Here is our left-tail graphic:

Excel reports the absolute value of the critical values. Or you can consider it to be the Right-tail critical value because it is +.

For a left-tail test, we need the negative t critical which is -1.697. You should note that the one-tail critical value is “smaller” than the two-tail value of -2.042 because we put all of the alpha in that one tail which “pushes” the critical value toward the mean.

Now, the t Stat does fall in the rejection area, so the rule says we must reject the Null hypothesis.

To use the second rule, we need to determine the p-value.

Here, Excel’s output can be confusing. If the t Stat is positive, the reported p-value is the right tail – the probability of getting a value for t-stat that is more positive. If the t Stat is negative, the p-value is for the left tail – the probability of getting a value for t-stat that is more negative.

Here, the t-stat is negative, so the p-value is for the left tail test.  It is 0.034 which is less than our alpha of 0.05.

You should notice that for the same sample data, the left tail test had to power to reject the Null while the two-tail did not.

So, this rule also tells us to reject the Null and we conclude that the mean rating of men is significantly lower than that of women.

Example 2: But what about a left-tail test with a positive t Stat?

We will use an example comparing the start time of a hospital procedure with the time the last required technician arrives in the surgical suite. In this example, the doctors are claiming that they are waiting for the technicians to arrive.

The null and alternative are:

Null Hypothesis Ho: Mean Procedure Start Time >= Mean Technician Ready Time

Alternative Hypothesis Ha: Mean Procedure Start Time < Mean Technician Ready Time

In a bit more clear language, the Null says the technicians are ready before the doctors need them. The Alternative says the technicians are not ready when the doctors need them. In the data, time is reported as hours past midnight. For example, the mean Procedure Start time is 7.196 hours or approximately 07:12 AM.

Because the t Stat is positive, the reported one-tail p-value is for the right tail test. We need to find the complement to use it for the left tail test here. So, 1 – 0.04310 = 0.9569. That is much larger than 0.05, so this method tells us to not reject the Null.

Remember the critical value for the left tail is -1.6956, We change the sign because Excel always reports the absolute value of the critical value, which is equivalent to the right tail critical value. Our t-stat of +1.7722 does not fall in the rejection area to the left of -1.6956, so this method also, as you should expect, tells us not to reject the Null.

We conclude that the mean procedure start time is not less (i.e. earlier) than the mean technician ready time. The doctor’s claim is not supported by the data. The doctors do not have to wait because of a late-arriving technician.

Right-tail test

Example 1: Now our manager believes the men have a higher rating than the women.

Null Hypothesis  Ho: Mean Rating Men <= Mean Rating Women

Alternative Hypothesis Ha: Mean Rating Men > Mean Rating Women

Remember the tail of the test is indicated by the math operator in the Alternative. Here, the Alternative math operator is greater than > which points to the right, so this is a right-tail test.

We can use the same highlighted one-tail values.

Here is our graphic:

For a right-tail test, we are interested in what is happening on the right side of the curve. We use the positive one-tail critical value of +1.697 and we find our t Stat of -1.866 is very far away from the right tail rejection area. So the first rule tells us to not reject the null.

To find the right-side p-value, we must recall that the area under the curve is equal to 1. Excel always gives us the left-tail p-value for one-tail tests, so we must subtract that value from 1 to get the right-tail p-value. 1 – 0.0344 = 0.966, which is much larger than 0.05, so this rule tells us to not reject the Null of no difference in the ratings of the two groups.

Our conclusion is that the mean job satisfaction of the men is not greater than that of the women.

Let’s look at Example 2 where we have a positive t-stat in a right -tail test.

Recall this example is comparing the start of a hospital procedure time with the arrival time of the last required technician. The doctors are claiming they have to wait for the technicians to arrive.

The null and alternative are:

Null Hypothesis Ho: Mean Procedure Start Time <= Mean Technician Ready Time 

Alternative Hypothesis Ha: Mean Procedure Start Time > Mean Technician Ready Time

In this example, the doctor’s claim is the Null, that their mean procedure start time is not greater than the mean technician ready time.

Using the p-value method, we see the t-stat is positive. That means the Excel p-value is for the right-tail test and we can use it directly to decide to reject the Null, p-value of 0.0431 < 0.05.

And our t-stat of 1.772 is greater than the right tail critical value of 1.696, so that too tells us to reject the Null and conclude that, on average, the technicians are in place and ready before the mean procedure start time.

Summary

It is important to note, in the first example, that while using the left-tail test gave us the power to detect the significant “less than” difference between the ratings, using the right-tail test does not. That is why you need to be careful if you decide to use a one-tail test and be pretty sure of the direction of the difference. Using a two-tail test is a bit more conservative in that it will pick up a larger difference either way but misses the smaller significant “less than” difference on the left side.

Using the proper tail of the test makes all the difference.

 

 

Support.Office. (n.d.). Use the Analysis ToolPak to perform complex data analysis. Retrieved from Support.Office: http://bit.ly/2XXgg6T

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top