Excel’s Data Analysis ToolPak has three tools for running tests of hypotheses using the t-distribution – t-tests. The output from the tools can be a bit confusing because, unlike other statistical software, these do not allow you to specify the “tail of the test” before you run the analysis. Here is how Microsoft explains how to interpret the output here:

Understanding what all that means can be a bit daunting. Here is my attempt as a simpler explanation. For convenience, I am just using the output from the t-test: Two-Sample Assuming Unequal Variances, but the concepts apply to all three t-test tools.

Two-tail Test

A company surveyed a random sample of their employees on how satisfied they were with their job. The manager does not care if one group has a higher or lower rating, and only wants to know if there is a difference in how men and women rate their job satisfaction.

State the Null and Alternative hypotheses:

Null Hypothesis  Ho: Mean Rating Men = Mean Rating Women

Alternative Hypothesis Ha: Mean Rating Men ≠ Mean Rating Women

Important: Put the data ranges for the two groups in the tool dialog box in the same relationship as stated in the Null. The Men group (green highlight) is on the left side of the Null equation and should be placed in the Variable 1 Range field. The Women group (red highlight) is on the right side of the Null equation and must be in the Variable 2 Range field.

Here is the output from the tool using a significance level of alpha, α, = 0.05. Note that the Men group is on the left and the Women group is on the right in the output.

We can see that the mean for the men is smaller than that of the women. But is the apparent difference “real”? Both groups have a lot of variance relative to the size of the means, 3.22 for the men and 2.83 for the women. So, the apparent difference might just be due to the “noise” of the variances.

To make our decision on rejecting (or not rejecting) the null, we can look at the three output values I have highlighted in yellow: the t statistic, the two-tail p-value, and the two-tail critical value of t.

Why two-tail? Consider this graphic:

The first rule for deciding whether to reject the null tells us to compare our test statistic to the critical value.

When we have a two-tail test, we must put half of our significance level of 5% in each tail to account for the possibility of our test statistic being either positive or negative, i.e. one sample mean being larger or smaller than the other. Putting 2.5% in each tail we can calculate a critical value of -2.042 on the left side and +2.042 on the right side.

If our test statistic, the t Stat, falls in either rejection area, less than – 2.042 or larger than + 2.042, we must reject the Null. But here our t Stat of – 1.886 does not fall in either rejection area, so we must decide to not reject the Null.

Another, and perhaps easier way to decide, is to compare the two-tail p-value against our significance level. Thankfully, for a two-tail test, we can just use the p-value the Excel tool gives us. It is 0.069 which is larger than our alpha of 0.05. Thus, this rule also tells us to not reject the Null that there is no difference in the ratings.

Note that the two rules always agree, unless your technology tool is faulty, which is very rare.

For this two-tail test, we do not reject the Null and we conclude that there is no statistically significant difference in the job satisfaction rating for men and women.

Left-tail Test

If the manager believes the men have a lower mean rating then the women, we should run a left-tail test. Why not just run the two-tail? As you will see, a one-tail test gives us more “power” to detect a real effect that is there in the direction we believe it to have. The downside of a one-tail test, if you guess wrong and the effect is in the other direction, the test has no power to detect it.

Null Hypothesis  Ho: Mean Rating Men >= Mean Rating Women

Alternative Hypothesis Ha: Mean Rating Men < Mean Rating Women

Here is our output again with the one-tail values we need highlighted in yellow.

The tail of the test is always determined by the math operator in the Alternative hypothesis, which in this example is the less than symbol. Remember the less than symbol < points to the left, so this is a left-tail test.

Here is our left-tail graphic:

Excel reports the absolute value of the critical values. For a left-tail test, we need the negative t critical which is -1.697. You should note that the one-tail critical value is “smaller” than the two-tail value of -2.042 because we put all of the alpha in that one tail which “pushes” the critical value toward the mean.

Now, the t Stat does fall in the rejection area, so the rule says we must reject the Null hypothesis.

To use the second rule, we can use the one-tail p-value directly from the output. The one-tail p-value Excel always gives us is the left-tail p-value, the area under the curve from the left end to our t Stat. It is 0.034 which is less than our alpha of 0.05.

You should notice that for the same sample data, the left tail test had to power to reject the Null while the two-tail did not.

So, this rule also tells us to reject the Null and we conclude that the mean rating of men is significantly lower than that of women.

Right-tail test

Now our manager believes the men have a higher rating than the women.

Null Hypothesis  Ho: Mean Rating Men <= Mean Rating Women

Alternative Hypothesis Ha: Mean Rating Men > Mean Rating Women

The alternative math operator is greater than > which points to the right, so this is a right-tail test.

We can use the same highlighted one-tail values.

Here is our graphic:

For a right-tail test, we are interested in what is happening on the right side of the curve. We use the positive one-tail critical value of +1.697 and we find our t Stat of -1.866 is very far away from the right tail rejection area. So the first rule tells us to not reject the null.

To find the right-side p-value, we must recall that the area under the curve is equal to 1. Excel always gives us the left-tail p-value for one-tail tests, so we must subtract that value from 1 to get the right-tail p-value. 1 – 0.0344 = 0.966, which is much larger than 0.05, so this rule tells us to not reject the Null of no difference in the ratings of the two groups.

Our conclusion is that the mean job satisfaction of the men is not greater than that of the women.


It is important to note that while using the left-tail test gave us the power to detect the significant “less than” difference between the ratings, using the right-tail test does not. That is why you need to be careful if you decide to use a one-tail test and be pretty sure of the direction of the difference. Using a two-tail test is a bit more conservative in that it will pick up a larger difference either way but misses the smaller significant “less than” difference on the left side.

Using the proper tail of the test makes all the difference.



Support.Office. (n.d.). Use the Analysis ToolPak to perform complex data analysis. Retrieved from Support.Office: http://bit.ly/2XXgg6T