I'm having a problem with the optimization of numerical code using codegen[optimize]:
Maple's codegen[optimize] plays around with different ways of writing the same mathematical expression in order to reduce computation time.
It strives to minimize the number of multiplications, additions, etc.
The problem is that codegen[optimize] does not know that a multiplication of two constants is in fact *no multiplication*, because it occurs only during initialization. So codegen[optimize] might prefer a form with 196 multiplications of which 5 are constant expressions (=> really 191 mults) over a form with 250 multiplications of which 150 are constant (=> really 100 mults).
The final formula is therefore not optimal for computation.
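To make the issue concrete, here is a minimal sketch (the procedure f and its expression are invented for illustration; codegen[optimize] and codegen[cost] are actual commands of the package):

> with(codegen):
> f:= proc(v, Pressure)
> local t1, t2;
>    t1:= Mass*Gamma/Radius^2*v^2/2;           # Mass*Gamma/Radius^2 is constant
>    t2:= Height/Volume*Temperature*Pressure;  # so is Height/Volume*Temperature
>    t1 + t2
> end:
> cost(optimize(f));

cost counts every multiplication equally, whether or not its operands are fixed at initialization time.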
Is there a way to get a result like

(1) CONSTANT EXPRESSIONS: T1 = Mass*Gamma/Radius^2; T2 = Height/Volume*Temperature; ...
(2) 'VARIABLE' EXPRESSIONS: t1 = T1*v^2/2; t2 = T1/T2*Pressure/Roh;

Expressions (1) could be computed once, before the integration. Only expressions (2) are relevant for the computation time.
Is there a way to tell optimize to find an optimally separated solution?
Or is there a package like Maple's codegen[optimize] that can handle this?
I find that codegen[split] usually does a better job than codegen[optimize]. I suppose that it is problem dependent. That is not the main point of your article anyway.
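For reference, a minimal illustration of codegen[split] (the toy procedure g is mine; split is the actual command):

> with(codegen):
> g:= proc(x) sin(x^2) + x^2*cos(x^2) end:
> split(g);

split rewrites the body as a sequence of simple assignments to temporary locals, one intermediate result per statement.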
"The problem is that codegen[optimize] does not know ..."
Programming philosophy: You are using Maple essentially as a compiler. You express your formulas in high-level symbolic code, and then you get Maple to generate efficient code to evaluate them.
It does not matter what the generated code looks like as long as it is efficient.
You should strive to never modify the generated code. Make all modifications at the higher level.
So you can have your symbolic expressions at the higher level, evaluate them numerically, and then substitute the numbers into the generated code.
Here’s a simple example:
> P:= proc(x)
> local T1;
>    T1:= _T1;
>    # other code
> end:
> Mass:= 20:  Gamma:= 10:  Radius:= 5:
> NumericP:= subs(_T1= Mass*Gamma/Radius^2, eval(P));

          NumericP := proc(x) local T1; T1 := 8 end proc
Here’s a deeper example where I optimize the code and make it so that it can be efficiently processed by evalhf:
This is a famous example from James H. Wilkinson that shows that using the Horner form to optimize numeric polynomial computations is often worthless. The large coefficients drown out the smaller ones.
The polynomial is simply (x-1)*(x-2)*...*(x-20).
We will need its derivative to apply Newton’s Method.
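The definitions of W20, its derivative DW20, and the tolerance are not shown here; a plausible setup, consistent with how they are used below, is:

> W20:= unapply(mul(x-k, k= 1..20), x):
> DW20:= unapply(diff(W20(x), x), x):   # 20-term sum of 19-factor products
> tolerance:= 1e-13:                    # assumed value; the actual one is not shown

Leaving the derivative unexpanded is deliberate: diff of the product form gives a large sum of products, which is exactly what the splitting and local-packing below are for.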
This is the part of my example that is most relevant to your question. Notice that I express my iteration function symbolically in 5 lines.
The generated code will be much longer. Note that I am able to express the derivative and the tolerance symbolically; they won't appear that way in the generated code.
This procedure generates a procedure that *robustly* and efficiently evaluates the Newton’s Method iteration function in the evalhf context.
By *robustly*, I do not mean that it always converges; rather, I mean that it makes sure that the derivative is not close to zero before it divides by it.
> OptimizeIterator:= proc(Df, tolerance)
> local P,x,A;
>    P:= codegen[split]
>        (parse
>            (cat("proc(x,y)"
>                ,"local d;"
>                ,"d:= ", convert(Df(x),string), ";"
>                ,"if abs(d) < ", convert(tolerance,string), " then undefined else x - y/d fi;"
>                ,"end;"
>                )
>            )
>        );
>    # "split" produces many local variables for the intermediate results.
>    # Evalhf can't handle more than 50 locals.  Need to put the locals in an
>    # array.  op(2, eval(P)) extracts the locals.
>    codegen[packlocals](P, [op(2, eval(P))], A)
> end:
> IterW20:= OptimizeIterator(DW20, tolerance):
> Hornerize:= F -> unapply(convert(expand(F(x)), horner), x):
> H20:= Hornerize(W20):
> IterH20:= OptimizeIterator(Hornerize(DW20), tolerance):
> RootFind:= proc(f, Iterator, maxsteps, tolerance, guess)
> local x,y,k;
>    x:= guess;
>    y:= f(x);
>    for k to maxsteps while abs(y) >= tolerance do
>       x:= Iterator(x,y);
>       if x = undefined then break fi;
>       y:= f(x)
>    od;
>    if k = maxsteps then print(`Maxsteps reached`) fi;
>    x
> end:
> evalhf(RootFind(W20, IterW20, 30, tolerance, 16.5));

                                16.

> evalhf(RootFind(H20, IterH20, 30, tolerance, 16.5));

                        11.6564624486589921
The Horner-form iteration converged to a non-root.
With judicious use of evalhf, I can use this method to find the root associated with 4000 randomly chosen initial guesses in the interval [1,19], and plot the basins of attraction in full color, and plot the number of iterations that it took to reach the root for each guess, also in full color, all in under 10 seconds. ...And some people say that Maple is too slow.
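A sketch of the kind of driver loop implied here (the random-guess generation is my assumption; the plotting is omitted):

> r:= rand(0..10^6):
> Guesses:= [seq(1. + 18.*r()/10.^6, k= 1..4000)]:   # 4000 guesses in [1,19]
> Roots:= [seq(evalhf(RootFind(W20, IterW20, 30, tolerance, g)), g= Guesses)]: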
If I add (x-16.) to W20, then 16 is still a root (both W20(x) and x-16. vanish there).
> WW20:= unapply(W20(x) + (x-16.), x):
> fsolve(WW20(x), x);

    .999999999999890, 2.00000000001765, 3.00000000024772, 3.99999991836044,
    5.00000338109935, 5.99993707274161, 7.00066877649414, 7.99548717272868,
    9.02200423473733, 9.93245033577469, 11.2365729267462, 11.6523666711907,
    17.1148875209843, 17.9589394650730, 19.0071483259016, 19.9993717380041
Fsolve did not find the root. If you look at the code for `fsolve/polynom`, you'll see that it converts the polynomial to Horner form. The author has ignored Wilkinson's advice (perhaps with good reason).
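If you want to check this yourself, the library source can be displayed with a standard interface setting:

> interface(verboseproc= 2):
> print(`fsolve/polynom`);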
The same problem occurs if you expand the polynomial: the large coefficients drown out the effect of the small ones. But if we apply the above Newton's method to the polynomial in its "raw" form, we get the root exactly:
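The computation that followed is not shown; a reconstruction from the pieces already defined (the derivative of WW20 is DW20(x) + 1):

> IterWW20:= OptimizeIterator(unapply(DW20(x) + 1, x), tolerance):
> evalhf(RootFind(WW20, IterWW20, 30, tolerance, 16.5));

By the argument above, this converges to the root 16 exactly.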