I'm having a problem with the optimization of numerical code using codegen[optimize]:
Maple's codegen[optimize] plays around with different ways of writing the same mathematical expression in order to reduce computation time.
It strives to minimize the number of multiplications, additions, etc.
The problem is that codegen[optimize] does not know that a multiplication of two constants is in fact *no multiplication*, because it occurs only during initialization. So codegen[optimize] might prefer a form with 196 multiplications of which 5 are constant expressions (=> really 191 mults) over a form with 250 multiplications of which 150 are constant (=> really 100 mults).
The final formula is therefore not optimal for computation.
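To make the issue concrete, here is a minimal sketch (the procedure f and its expression are invented for illustration; codegen[optimize] and codegen[cost] are actual commands of the package):

> with(codegen):
> f:= proc(v, Pressure)
> local t1, t2;
>    t1:= Mass*Gamma/Radius^2*v^2/2;           # Mass*Gamma/Radius^2 is constant
>    t2:= Height/Volume*Temperature*Pressure;  # so is Height/Volume*Temperature
>    t1 + t2
> end:
> cost(optimize(f));

cost counts every multiplication equally, whether or not its operands are fixed at initialization time.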
Is there a way to get a result like

(1) CONSTANT EXPRESSIONS: T1 = Mass*Gamma/Radius^2; T2 = Height/Volume*Temperature; ...
(2) 'VARIABLE' EXPRESSIONS: t1 = T1*v^2/2; t2 = T1/T2*Pressure/Roh;

Expressions (1) could be computed once, before the integration. Only expressions (2) are relevant for the computation time.
Is there a way to tell optimize to find an optimally separated solution?
Or is there a package like Maple's codegen[optimize] that can handle this?
I find that codegen[split] usually does a better job than codegen[optimize]. I suppose that it is problem dependent. That is not the main point of your article anyway.
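For reference, a minimal illustration of codegen[split] (the toy procedure g is mine; split is the actual command):

> with(codegen):
> g:= proc(x) sin(x^2) + x^2*cos(x^2) end:
> split(g);

split rewrites the body as a sequence of simple assignments to temporary locals, one intermediate result per statement.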
"The problem is that codegen[optimize] does not know ..."
Programming philosophy: You are using Maple essentially as a compiler. You express your formulas in high-level symbolic code, and then you get Maple to generate efficient code to evaluate them.
It does not matter what the generated code looks like as long as it is efficient.
You should strive to never modify the generated code. Make all modifications at the higher level.
So you can have your symbolic expressions at the higher level, evaluate them numerically, and then substitute the numbers into the generated code.
Here’s a simple example:
> P:= proc(x)
> local T1;
>    T1:= _T1;
>    # other code
> end:
> Mass:= 20:  Gamma:= 10:  Radius:= 5:
> NumericP:= subs(_T1= Mass*Gamma/Radius^2, eval(P));

          NumericP := proc(x) local T1; T1 := 8 end proc
Here’s a deeper example where I optimize the code and make it so that it can be efficiently processed by evalhf:
This is a famous example from James H. Wilkinson that shows that using the Horner form to optimize numeric polynomial computations is often worthless. The large coefficients drown out the smaller ones.
The polynomial is simply (x-1)*(x-2)*...*(x-20).
We will need its derivative to apply Newton’s Method.
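The definitions of W20, its derivative DW20, and the tolerance are not shown here; a plausible setup, consistent with how they are used below, is:

> W20:= unapply(mul(x-k, k= 1..20), x):
> DW20:= unapply(diff(W20(x), x), x):   # 20-term sum of 19-factor products
> tolerance:= 1e-13:                    # assumed value; the actual one is not shown

Leaving the derivative unexpanded is deliberate: diff of the product form gives a large sum of products, which is exactly what the splitting and local-packing below are for.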
This is the part of my example that is most relevant to your question. Notice that I express my iteration function symbolically in 5 lines.
The generated code will be much longer. Note that I am able to express the derivative and the tolerance symbolically; they won't appear that way in the generated code.
This procedure generates a procedure that *robustly* and efficiently evaluates the Newton’s Method iteration function in the evalhf context.
By *robustly*, I do not mean that it always converges; rather, I mean that it makes sure that the derivative is not close to zero before it divides by it.
> OptimizeIterator:= proc(Df, tolerance)
> local P,x,A;
>    P:= codegen[split]
>        (parse
>            (cat("proc(x,y)"
>                ,"local d;"
>                ,"d:= ", convert(Df(x),string), ";"
>                ,"if abs(d) < ", convert(tolerance,string), " then undefined else x - y/d fi;"
>                ,"end;"
>                )
>            )
>        );
>    # "split" produces many local variables for the intermediate results.
>    # Evalhf can't handle more than 50 locals.  Need to put the locals in an
>    # array.  op(2, eval(P)) extracts the locals.
>    codegen[packlocals](P, [op(2, eval(P))], A)
> end:
> IterW20:= OptimizeIterator(DW20, tolerance):
> Hornerize:= F -> unapply(convert(expand(F(x)), horner), x):
> H20:= Hornerize(W20):
> IterH20:= OptimizeIterator(Hornerize(DW20), tolerance):
> RootFind:= proc(f, Iterator, maxsteps, tolerance, guess)
> local x,y,k;
>    x:= guess;
>    y:= f(x);
>    for k to maxsteps while abs(y) >= tolerance do
>       x:= Iterator(x,y);
>       if x = undefined then break fi;
>       y:= f(x)
>    od;
>    if k = maxsteps then print(`Maxsteps reached`) fi;
>    x
> end:
> evalhf(RootFind(W20, IterW20, 30, tolerance, 16.5));

                                16.

> evalhf(RootFind(H20, IterH20, 30, tolerance, 16.5));

                        11.6564624486589921
The Horner-form iteration converged to a non-root.
With judicious use of evalhf, I can use this method to find the root associated with 4000 randomly chosen initial guesses in the interval [1,19], and plot the basins of attraction in full color, and plot the number of iterations that it took to reach the root for each guess, also in full color, all in under 10 seconds. ...And some people say that Maple is too slow.
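A sketch of the kind of driver loop implied here (the random-guess generation is my assumption; the plotting is omitted):

> r:= rand(0..10^6):
> Guesses:= [seq(1. + 18.*r()/10.^6, k= 1..4000)]:   # 4000 guesses in [1,19]
> Roots:= [seq(evalhf(RootFind(W20, IterW20, 30, tolerance, g)), g= Guesses)]: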
If I add (x-16.) to W20, then 16 is still a root (both W20(x) and x-16. vanish there).
> WW20:= unapply(W20(x) + (x-16.), x):
> fsolve(WW20(x), x);

    .999999999999890, 2.00000000001765, 3.00000000024772, 3.99999991836044,
    5.00000338109935, 5.99993707274161, 7.00066877649414, 7.99548717272868,
    9.02200423473733, 9.93245033577469, 11.2365729267462, 11.6523666711907,
    17.1148875209843, 17.9589394650730, 19.0071483259016, 19.9993717380041
Fsolve did not find the root. If you look at the code for `fsolve/polynom`, you'll see that it converts the polynomial to Horner form. The author has ignored Wilkinson's advice (perhaps with good reason).
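If you want to check this yourself, the library source can be displayed with a standard interface setting:

> interface(verboseproc= 2):
> print(`fsolve/polynom`);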
The same problem occurs if you expand the polynomial: the large coefficients drown out the effect of the small ones. But if we apply the above Newton's method to the polynomial in its "raw" form, we get the root exactly:
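The computation that followed is not shown; a reconstruction from the pieces already defined (the derivative of WW20 is DW20(x) + 1):

> IterWW20:= OptimizeIterator(unapply(DW20(x) + 1, x), tolerance):
> evalhf(RootFind(WW20, IterWW20, 30, tolerance, 16.5));

By the argument above, this converges to the root 16 exactly.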