Problem doing t-tests in Excel

Peter Dunn · May 4, 2010

Hi all

I am wanting to do a t-test in Excel (for a class I am teaching). So
as a practice, I did this (in columns A and B):

34 13
35 23
23 29
13 200000
14 200000
56 12
43

These are my two groups, with means:

31.14285714 66679.5

Certainly the means seem different!

However, the ttest procedure gave me this p-value:
0.174751284

The formula was this:
=TTEST(A1:A7,B1:B6,2,3)

What on earth is going on? Can anyone explain? Can anyone suggest the
correct approach?

P.

Carl Witthoft · May 4, 2010

Do you know anything about statistical analysis? What sort of class
are you teaching?

My point is: you should know how to calculate a t-test "manually' and
know what sort of p- values are significant. (and what the p-value
actually indicates).

After that, at least you'll know whether your data is pathological, or
whether Excel has a bug, or the answer is correct.

Mike Middleton · May 4, 2010

Peter Dunn -

What on earth is going on? Can anyone explain? <

I'm not sure, but I speculate that your values are so disparate that Excel's
algorithms are encountering numerical errors and returning spurious results
(instead of returning an error code).

Can anyone suggest the correct approach? <

I suggest using data sets that have sample means that are within several
standard deviations of each other (instead of obviously very different
sample means where it's clear that "Certainly the means seem different!").

And, pedagogically, I suggest charting the data before crunching the
numbers. For example, you could insert a column of 1's to the left of the
first data set and insert a column of 2's to the left of the second data
set. Create an XY (Scatter) chart type for the first data pairs and then add
the second set of data pairs to the chart. The quick result should be a kind
of dot plot that reinforces when and why a t-test for equal population means
might be appropriate. I can send you an excerpt from my Data Analysis Using
MS Excel book if you need more details.

- Mike
http://www.MikeMiddleton.com

Peter Dunn · May 5, 2010

Do you know anything about statistical analysis? What sort of class

are you teaching?

Yes I do know stats; quite a lot. (Which is why I never usually use
excel for my stats :->)

My point is: you should know how to calculate a t-test "manually' and
know what sort of p- values are significant. (and what the p-value
actually indicates).

Well, of course I know this (I've been teaching this stuff at
university for over 20 years). If you don't know the answer, just say
so, or don't post at all.

After that, at least you'll know whether your data is pathological, or
whether Excel has a bug, or the answer is correct.

That's right; I agree. I suspect there is a bug, but maybe it is my
ignorance in using Excel for this. I was wondering if I was doing
something wrong: I use R frequently, SPSS occasionally, and have never
used Excel for stats before. (Perhaps I now know why...)

Further: Even when I copy the data from the Excel Help example, I get
the incorrect P-value. The Help says P=0.196; pasting the same data
and code into Mac Excel I get 0.248. Using R on the same example data
and doing the equivalent paired t-test (R being my usual choice of
software), I get 0.196, agreeing with the help and disagreeing with my
Excel output.

Something is wrong with Mac Excel it would seem. But I can find no
other similar reports, which makes me think perhaps I am at fault.

P.

Mike Middleton · May 5, 2010

Peter Dunn -

Further: Even when I copy the data from the Excel Help example, I get the
incorrect P-value. The Help says P=0.196; pasting the same data and code
into Mac Excel I get 0.248. <

For me, using the Excel Help data set, the TTEST function returns 0.196 both
with Mac Excel 2004 (version 11.5.8) and Mac Excel 2008 (version 12.2.4).

- Mike
http://www.MikeMiddleton.com

Bob Greenblatt · May 5, 2010

Peter Dunn -

I'm not sure, but I speculate that your values are so disparate that
Excel's algorithms are encountering numerical errors and returning
spurious results (instead of returning an error code).

I suggest using data sets that have sample means that are within several
standard deviations of each other (instead of obviously very different
sample means where it's clear that "Certainly the means seem different!").

And, pedagogically, I suggest charting the data before crunching the
numbers. For example, you could insert a column of 1's to the left of
the first data set and insert a column of 2's to the left of the second
data set. Create an XY (Scatter) chart type for the first data pairs and
then add the second set of data pairs to the chart. The quick result
should be a kind of dot plot that reinforces when and why a t-test for
equal population means might be appropriate. I can send you an excerpt
from my Data Analysis Using MS Excel book if you need more details.

- Mike
http://www.MikeMiddleton.com

One of the things I would suspect is how the data got into Excel. Are
the numbers really numbers? Might one or more of the be text? Multiply
all the cells by 1 to force the text to numeric values and try again.
Are the results any different?

Carl Witthoft · May 6, 2010

Thanks -- I appreciate the detailed info. I guess I'm just getting
jaded in this group, as most people who post "it doesn't work" questions
don't know the difference between sqrt(2) and 1.414 :-( .

Anyway, I get the same p as you did in Excel v.X

I then ran t.test() in R on the same data:

Rgames> tx = c(34,35,23,13,14,56,43)
Rgames> ty=c(13,23,29,2e5,2e5,12)
Rgames> t.test(tx,ty)

Welch Two Sample t-test

data: tx and ty
t = -1.5809, df = 5, p-value = 0.1748
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-175023.17 41726.46
sample estimates:
mean of x mean of y
31.14286 66679.50000

So it sure looks to me like Excel is at least consistent with R. Do
you have any other tools (or, god forbid, pencil and paper

) which
give a different result for p? If not, then my suspicion is that this
is simply a pathological set. You know how p-values don't always mean
what people want them to mean.

The Help example in v.X is
=TTEST({3,4,5,8,9,1,2,4,5},{6,19,3,2,14,4,5,17,1},2,1)
And I get exactly the same answer. You may want to double-check the
Tails and Type values you're using.

Personally, I think two values out in left field with all other values
in a very reasonable range should lead to a p-value that is neither
vanishingly small nor optimistically large.

Carl

Help!! find two number recurring numbers in row??	1	Apr 3, 2013
Exporting to text file in multiple lines	3	May 10, 2010
get a list of sequence number that meet 3 criteria	0	Dec 7, 2009
need help finding most common number and so on	2	May 28, 2007
How to get the array like this?	6	Aug 15, 2009
time plots/charts in Excel	3	Oct 23, 2008
Find then highlight in yellow	6	Jan 10, 2012
Record Click on Error Indicator	2	Dec 21, 2010

Problem doing t-tests in Excel

Peter Dunn

Carl Witthoft

Mike Middleton

Peter Dunn

Mike Middleton

Bob Greenblatt

Carl Witthoft

Ask a Question

Similar Threads