Regresion

P

Pilar

Regressions in Excel 2000 only allow for 16 columns. I
have more data, what can I do?
 
J

Jerry W. Lewis

The algorithm that LINEST uses (in pre-2003 versions) to calculate
regression coefficients is
=MMULT(MINVERSE(MMULT(TRANSPOSE(x),x)),MMULT(TRANSPOSE(x),y))
where x is the x matrix. Note that the above formula would return
coefficients in the same order as the columns of the x matrix (reverse
of the order of coefficients returned by LINEST). Also note that the x
matrix needs a column of ones to fit an intercept.

The above formula is mathematically exact, but numerically unstable.
With 16+ columns, you may frequently get nonsense because of numerical
problems. Algorithms 75 and 274 from http://lib.stat.cmu.edu/apstat/
are numerically superior to to the above formula.

Another question to consider is whether you would get anything
meaningful from a regression on 16+ variables even if you had unlimited
numerical accuracy. In typical data situations that I have encountered,
this would grossly overfit the data. You might get more meaningful
information from a stepwise regression or a principal components
analysis to identify primary predictors.

Jerry
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top