On 6/15/12 4:22 PM, in article
[email protected], "joeu2004"
There has been some debate in COLA as to the correct way
to create a linear trendline in Excel. Someone in the forum
kindly made a video to show what he thought was the correct
method
http://tmp.gallopinginsanity.com/LinearTrendLineCreation.mov
[....]
Does anyone see any missing steps for the creation of a
linear trend line?
I cannot find the original debate in comp.os.linux.advocacy (COLA);just the
thread titled "Visualizing where to draw the standard deviation line", which
refers to yet-another "a debate in COLA".
No matter; I'm not interested. I think we can answer your Excel questions
without getting into anything so esoteric.
Really the video is all you need to see (I am the one who created it and the
other related images with trend lines)... it shows the steps and theymatch
what MS suggests exactly (with the exception that I use the right-click
method to get to some tools where the MS page suggests using the Ribbon).
I see no missing steps in creating the linear trendline per se using Excel.
Thank you. I think anyone who knows Excel well... or even not so well... or
even poorly but can read the steps provided by MS... can see this. But
someone in COLA kept saying I was missing steps:
cc:
------
I will gladly educate your ignorant ass, and give you step by
step instructions on how to correctly do it so you can see
all your missing steps, but first you have to stop lying.
------
So you are missing some key steps before doing the trendline.
I have said that many times now.
------
He later said he was in reference to doing more than just creating the trend
line, which is all I said I was doing - he claimed I was missing steps to do
part of a process I never claimed to do. Yes: his claims were idiotic.
This is the nature of the "advocacy" groups.
Bottom line: my steps for creating a linear trend line were flawless,which
is not to say creating such a line was the only form of analysis thatcould
be done on such data or that the creation of a linear trend line on data
that is clearly not linear is the best way to do an analysis.
Using the same steps, though, I was able to show a trend line with a very
good fit for the second half of 2011:
<
http://tmp.gallopinginsanity.com/LinuxTrend2011-2ndhalf.png>
And also able to show how the trend changed for the first half of 2012:
<
http://tmp.gallopinginsanity.com/LinuxTrendLine2012.jpg>
The fact is, a linear trend line is not going to show a very good fitwith
non-linear data... or data where the trend changes.
I would also select the trendline options to display the trendline equation
and the R-squared of the trendline.
You can see where I did that in the above links... and in others thatI
made... a whole bunch of them combined:
<
http://tmp.gallopinginsanity.com/LinuxMultLinearTrendLines.png>
Even then my labeling is not done well... but my only claim was that I made
the linear trend line correctly, not that the graph had good labeling, etc.
I openly acknowledge it did not.
And after displaying the trendline equation, I might select the option to
Format Trendline Label in order to alter the display precision of the
coefficients. I would select Scientific with 14 decimal places if I want to
see the trendline coefficients "exactly"; that is, the most precision that
Excel will display.
Seems a bit excessive for the needs here - but I did know about those
options.
But none of those "steps" affects how the linear trendline is created.
Correct. Nor do they remove the fact there are other forms of analysis that
can be done.
FYI, we can also use the Excel function LINEST to determine the regression
line coefficients.
If the dates are in A1:A24 and the percentages are in B1:B24, select two
horizontal cells and array-enter the following formula (press
ctrl+shift+Enter instead just Enter): =LINEST(B1:B24,A1:A24).
I have used this function a couple of times... but never for any realwork.
To compare with the "exact" trendline coefficients, I would also format the
LINEST results as Scientific with 14 decimal places.
You might notice infinitesimal differences between the trendline
coefficients and the LINEST results. But in this case, they are same up 10
or 11 decimal places in this case; "close enough for government work". As a
guess, the difference might be attributed to physically different internal
algorithms and/or to differences in internal floating-point arithmetic
anomalies.
For some insight into how the Excel LINEST and linear trendline coefficients
might be derived, seehttp://en.wikipedia.org/wiki/Simple_linear_regression.
Note that =COVAR(A1:A24,B1:B24)/VARP(A1:A24) in E3 and
=AVERAGE(B1:B24)-E3*AVERAGE(A1:A24) are about the same as the LINEST
results. The difference is in the 3 least-significant binary bits of the
floating-point representations in this case.
You might also take note of the wiki page section titled "Normality
assumption". That may or may not be relevant to the COLA debates..
Not really... though the idea of a normal distribution was also debated. A
bit of info you might or might not know: you can easily visualize where the
standard deviation (sigma) lines should be drawn on a normal distribution...
the lines are *always* at the distance from the mean to the inflection point
on the curve (where the concavity changes). I was repeatedly told by the
same person the distance was irrelevant:
cc:
-----
There'se nothing wrong with the image, other than some weird
axis labeling.
-----
Snit'sso fucking stupid he thinks the sigma lines are drawn
based on distance from the mean, not area under the curve.
-----
| The sigma lines are drawn based on the area of the curve -
| which is easy to see when the images screw it up, esp. when
| they do so really badly, like in some of the ones I showed
| you.
They are not wrong.
------
LOL!!!! All of those links are fine. The first sigma lines
cover 68% of the area UNDER THE CURVE.
-----
If you would like to prove, on any single one of the links
you call incorrect, that the first sigma lines do not bound
an area that is 68.2% of the area UNDER THE CURVE, then I
would like to see it.
-----
Hahahaha your "approximate inflection points" are hilarious.
Please, post more on this subject.
------
I explained to him why he was wrong here: <
>
There are many examples of people who should know better getting thiswrong.
Here are some
<
http://www.udel.edu/htr/Statistics/Images/Class12/normal2.gif> From:
<
http://www.udel.edu/htr/Statistics/Notes/class12.html>
Which is the example I used for showing how him can make a decent
approximation visually: <
http://tmp.gallopinginsanity.com/sd.png>.
I also pointed to some other examples which at least appear incorrectto me
(though they are not as far off as the above example):
<
http://www.footballguys.com/shickstandard_1_files/image009.gif> From:
<
http://www.footballguys.com/shickstandard_1.htm>
Sigma lines clearly not at a far enough distance from the mean, esp. on the
graph to the right.
<
http://www.gsseser.com/images/StandardDeviation2s.gif> From:
<
http://www.gsseser.com/Deviation.htm>
Sigma lines clearly not at a far enough distance from the mean.
You would think that such sites would be made by people who knew better. I
openly admit I am not a math wiz but it is rather silly when sites claiming
to be teaching such things get their depictions wrong (of course, oneof the
sites above is from "Footballguys"... and you might not expect them, by
stereotype, to be the most knowledgeable in such areas anyway).
Anyway, sorry to have the idiotic debate spread to other forums... though it
is sorta fun to see people who clearly know what they are talking about rip
apart those who were calling me names and insisting they "knew" I waswrong.
--
The indisputable facts about that absurd debate: <
http://goo.gl/2337P>
cc being proved wrong about his stats BS: <
http://goo.gl/1aYrP>
7 simple questions cc will *never* answer: <
http://goo.gl/cNBzu>
cc again pretends to be knowledgeable about things he is clueless about.