Interesting Solver problem (Solver encounters an error)


MrShorty

Here's an interesting problem; I wonder if anyone has any thoughts on
this. Recognize that my real problem is very complex (several
intermediate calculations including some iterative steps), but the
problem I'm having seems similar (conceptually anyway) to this simple
problem.

Given a data set:

x,y
10,3.9
8,3.2
7,2.8
6,2.2
5,1.4
4.5,0.8
4,0.01
3.8,-0.4
3.6,-1
3.5,-1.4
3.4,-1.8
3.3,-2.4
3.2,-3.2
3.1,-4.6
3.05,-6

One could look at the data and say, "that looks like the curve y=ln(x),
but with an asymptote other than the y-axis and possibly a scaling
factor." So we choose a function of the form y=b*ln(x-a) to correlate
the data. We add a third column =r1c5*ln(rc1-r1c6), where r1c5 and r1c6
will hold our parameters b and a, then put
=sumxmy2(r2c3:r16c3,r2c2:r16c2) at the bottom of column 3. Then set up
Solver to minimize r18c3 by changing r1c5:r1c6.
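
For reference, here is a minimal sketch of the same setup in Python (an
illustration only, assuming NumPy is available; it is not part of the
spreadsheet model itself): the "model" array plays the role of column 3,
and the sum of squared differences is the same quantity the SUMXMY2
cell computes.

import numpy as np

x = np.array([10, 8, 7, 6, 5, 4.5, 4, 3.8, 3.6,
              3.5, 3.4, 3.3, 3.2, 3.1, 3.05])
y = np.array([3.9, 3.2, 2.8, 2.2, 1.4, 0.8, 0.01, -0.4, -1,
              -1.4, -1.8, -2.4, -3.2, -4.6, -6])

def ssr(b, a):
    model = b * np.log(x - a)        # "column 3": the fitted y values
    return np.sum((model - y) ** 2)  # same as SUMXMY2(model, y-data)

print(ssr(b=1.9, a=2.9))  # SSR at one of the initial guesses tried below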

Now we pull initial guesses for b and a out of a hat, and Solver runs
into an error, because on the 2nd or 3rd iteration Solver is going to
try a value for a > 3.05 and the LN function will return an error. We
try to improve the initial guesses, but in this case we would need to
be pretty close. I could get b=1.9, a=2.9 to converge, but b=1.8, a=2.8
wouldn't.
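
As a quick illustration (a Python analogue rather than anything from
the actual workbook), the failure mode is simply the logarithm of a
negative number:

import numpy as np
print(np.log(3.05 - 3.08))  # nan plus a RuntimeWarning; Excel's LN
                            # returns #NUM! for the same argument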

We iterate on each parameter individually, back and forth between b and
a, but this becomes tedious, especially if it takes several tries to
manually locate an initial a that will not generate an error.

For this simple model, one can add a constraint that a<=3.049999 and
thus avoid the error. However, in my real problem, the value for a
that generates an error isn't obvious. Also, it appears that the
optimum a value is essentially the largest value that won't generate an
error. So I end up manually bisecting the interval between the lowest
value that generates an error and the highest value that doesn't, until
I obtain the desired accuracy in a. Not the most efficient way to do
it, especially when I want to optimize b at the same time.
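
One way to automate that kind of interval narrowing (a hedged sketch,
assuming the SSR is unimodal in a on the valid interval; the
golden-section narrowing and the helper name are illustrative, not part
of the actual workbook): for any trial a held strictly below min(x),
the best b has a closed form, so only a needs to be searched.

import numpy as np

x = np.array([10, 8, 7, 6, 5, 4.5, 4, 3.8, 3.6,
              3.5, 3.4, 3.3, 3.2, 3.1, 3.05])
y = np.array([3.9, 3.2, 2.8, 2.2, 1.4, 0.8, 0.01, -0.4, -1,
              -1.4, -1.8, -2.4, -3.2, -4.6, -6])

def ssr_for_a(a):
    L = np.log(x - a)                  # valid only while a < min(x)
    b = np.sum(y * L) / np.sum(L * L)  # optimal b for this a
    return np.sum((b * L - y) ** 2), b

# Narrow a on (0, min(x)) without ever stepping into the error region
lo, hi = 0.0, x.min() - 1e-9
for _ in range(60):
    m1, m2 = lo + 0.382 * (hi - lo), lo + 0.618 * (hi - lo)
    if ssr_for_a(m1)[0] < ssr_for_a(m2)[0]:
        hi = m2
    else:
        lo = m1
a_best = 0.5 * (lo + hi)
ssr_best, b_best = ssr_for_a(a_best)
print(a_best, b_best, ssr_best)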

I don't know how much you'll be able to help, but it seems like an
interesting problem. I don't readily see an option that will tell
Solver to use those error values as part of the optimization algorithm,
even though the error values do contain useable information in this
case. All this exercise might do is show the importance of choosing
appropriate initial guesses for Solver, or that Solver isn't suitable
for solving all of the world's problems.

Any thoughts?
 

B. R. Ramachandran

Hi,

Yes, when I analyzed your data starting with initial guess values of
'a'=2.8 and 'b'=1.8, parameter 'a' became 3.08 in the second iteration
step (which is slightly greater than the smallest value in the x-range,
i.e., 3.05), and Solver stopped and returned an error (due to the
attempted calculation of the logarithm of a negative value).

However, I could get around this problem with a slight modification, as
follows:

I placed your x- and y-data in A2:A16 and B2:B16 respectively. I created
a dummy parameter (let's call it 'aprime') in D2, and the parameters 'a'
and 'b' in E2 and F2 respectively.
In E2 (corresponding to 'a') I entered the formula
=MIN(A2:A16)*0.999999-D2. I placed initial guess values of 1 for
'aprime' (i.e., in D2) and 1 for 'b' (in F2).
In C2, I entered the formula =$F$2*LN(A2-$E$2) and autofilled the
formula down to C16; I placed the SSR in G2 with the formula
=SUMXMY2(B2:B16,C2:C16). The SSR was about 94.2 at this point.

I invoked the Solver, to minimize the SSR (G2), by changing 'aprime' and 'b'
(i.e., $D$2, $F$2), and under "Options" checked "Assume Non-Negative".
Solver didn't have any problem and returned the optimized values,
'aprime'=0.050096784 (corresponding to 'a'=2.999900166) and b=1.99943891, and
the minimized SSR was 0.005916092.
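
A hedged sketch of the same idea outside Excel (assuming SciPy's
least_squares; the variable names are illustrative, and the
non-negative bounds on both parameters mirror the "Assume Non-Negative"
option used above):

import numpy as np
from scipy.optimize import least_squares

x = np.array([10, 8, 7, 6, 5, 4.5, 4, 3.8, 3.6,
              3.5, 3.4, 3.3, 3.2, 3.1, 3.05])
y = np.array([3.9, 3.2, 2.8, 2.2, 1.4, 0.8, 0.01, -0.4, -1,
              -1.4, -1.8, -2.4, -3.2, -4.6, -6])

def residuals(params):
    aprime, b = params
    a = x.min() * 0.999999 - aprime  # same mapping as the E2 formula
    return b * np.log(x - a) - y

# aprime >= 0 guarantees a < min(x), so LN never sees a negative argument
fit = least_squares(residuals, x0=[1.0, 1.0],
                    bounds=([0.0, 0.0], [np.inf, np.inf]))
aprime, b = fit.x
print(x.min() * 0.999999 - aprime, b, np.sum(fit.fun ** 2))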

Now comes the interesting part: I tried the guess values 'aprime' = -100
and 'b' = 100. This corresponds to 'a' slightly less than 103.05, and it
returns a #NUM! error for the entire range C2:C16, and hence for the SSR.
Surprisingly, Solver handled even this situation and returned the same
optimized values as before (it needed a little more than the default 100
iterations; of course I had turned off "Show Iteration Results" for this)!

I think the bottom line is that it is safer to use a dummy parameter to
incorporate a constraint on the actual parameter, and to optimize the
dummy parameter (with the added constraint disallowing negative values
for 'aprime', so that 'a' can never become greater than any x-value).

Regards,
B. R. Ramachandran
 

MrShorty

Thanks,

That's an interesting idea, to build the conditions into the
spreadsheet model rather than into the solver model. One thing I
didn't like about your approach for this problem is the "assume
non-negative" option, because it applies to both parameters. A
different data set may require b to be less than 0, but your particular
solver model wouldn't find it. You would have to either add a
bprime=-b or alter the formulas in column 3; neither is a bad
solution, but I would probably prefer the single constraint aprime>=0
to the dual constraint aprime>=0 and b>=0. Of course, at that point,
there's not a lot of difference between a single constraint
a<=3.049999 and aprime>=0. On the other hand, we have to remember that
we are building a Solver model to solve the problem at hand; we can
solve future problems when we come to them.
 

B. R. Ramachandran

Hi,

Thanks for your feedback. You are absolutely right. I too didn't like
the "Assume non-negative" option, since it constrains the other
parameter as well. But then, for the particular problem in question,
that constraint was necessary. As you correctly put it, I think we have
to build a Solver model for the problem at hand, instead of trying to
build a model that would be global.

Another idea occurred to me for analyzing your data. Even though I
generally prefer to analyze data as-is and not use linear (or other)
transformations as far as possible, here I think a linear transformation
works well.

In column C, I calculated y values using the formula =$F$2*LN(A2-$E$2),
where E2 and F2 contain 'a' and 'b' respectively. Here, I didn't use the
dummy parameter 'aprime'. For calculating the SSR, however, I used the
following formula [corresponding to the linear transform of y = b*ln(x-a),
i.e., (x-a) = exp(y/b)]:

=SUMXMY2(A22:A36-$E$19,EXP(B22:B36/$F$19))

I started with the guess values a=1 and b=1 (SSR = 2108). When Solver
was invoked (with no constraints), the following result was obtained:

a=2.996056603, b=2.001589552, SSR=0.00621783
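
A hedged sketch of this transformed fit outside Excel (assuming SciPy;
the objective is sum((x - a - exp(y/b))^2), which stays finite for any
a and any nonzero b, so no constraints are needed):

import numpy as np
from scipy.optimize import least_squares

x = np.array([10, 8, 7, 6, 5, 4.5, 4, 3.8, 3.6,
              3.5, 3.4, 3.3, 3.2, 3.1, 3.05])
y = np.array([3.9, 3.2, 2.8, 2.2, 1.4, 0.8, 0.01, -0.4, -1,
              -1.4, -1.8, -2.4, -3.2, -4.6, -6])

def transformed_residuals(params):
    a, b = params
    return (x - a) - np.exp(y / b)   # compare x-a against exp(y/b)

fit = least_squares(transformed_residuals, x0=[1.0, 1.0])
print(fit.x, np.sum(fit.fun ** 2))   # parameters and transformed SSR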

Of course, one has to expect slight differences in the values of the
parameters obtained from nonlinear and linear analyses of real-life
data (due to the different error distributions in the raw and
transformed data, which these optimizations do not take into account).

Regards,
B. R. Ramachandran
 
