Minimization
Minimization is a subtopic of the broader subject of mathematical optimization: Mathematical_optimization.
Minuit minimization
Any function that extends jhplot.FNon can be minimized. How to define an arbitrarily complicated function in multiple dimensions was described before.
1D case
Let us give an example of how to find the minimum of a one-dimensional function defined as [math]\displaystyle{ a(x-2)^2 * \sqrt{x}+b*x^2 }[/math]. To minimize this function we will use JMinuitOptimizer. Here is a small script that finds and prints the value of the variable [math]\displaystyle{ x }[/math] which minimizes the function using the Migrad method (the script also plots the function and marks the minimum with a red line):
from jhplot.shapes import Line
from java.lang import *
from java.awt import *
from jhplot import *

class Func(FNon):
   def value(self, x):
       a=self.p[0]; b=self.p[1]
       return a*Math.sqrt(x[0])*(x[0]-2)**2+b*x[0]*x[0]

c1=HPlot()
c1.visible(); c1.setAutoRange()
pl=Func("a(x-2)^{2}#sqrt{x}+bx^2",1,2)   # 1 var, 2 pars
pl.setParameter("par0", 5)               # define two parameters
pl.setParameter("par1",-6)
print pl.parameters()

from hep.aida.ref.optimizer.jminuit import *
op=JMinuitOptimizerFactory().create()
op.setFunction(pl)
op.variableSettings("x0").setValue(10)   # set initial value
op.optimize()                            # optimization
res=op.result()

ff=F1D(pl,0,10)                          # show the function and min value
xmin=res.parameters()[0]
ymin=ff.eval(xmin)
p1=Line(xmin,ymin-10,xmin,ymin+10,BasicStroke(4),Color.red)
c1.add(p1)
c1.draw(ff)
print res.parameters()," status=",res.optimizationStatus()
Configuring the minimization
One can define the minimization method, strategy, precision, tolerance and a maximum number of iterations. This is especially needed if the minimization fails. The above example uses the so-called Migrad method, which requires knowledge of the first derivatives. Thus, it can fail if the derivative estimates are inaccurate.
You can learn about the input configuration of JMinuit using the following lines, which should be appended to the above example:
co=op.configuration()
print co.method()
print co.precision()
print co.strategy()
print co.tolerance()
where "co" is an instance of the hep.aida.ref.optimizer.JMinuitConfiguration class.
The output will report the method used and the initial precision. We will not go into detail about the other optimization methods that can be defined and used during initialization; advanced users are welcome to look at the Java API of the JMinuitOptimizerFactory class.
If the minimization fails, consider using other minimization methods. In the above example, you can set a new minimization technique right after the "create()" call, but before the "optimize()" call:
co=op.configuration()
co.setMethod(method)
op.setConfiguration(co)
where "method" is a string that can take one of the following values:
- MIG" - Migrad method. It is the default method which is the most commonly used. However, it depends on the knowledge of the first derivative and may fail if such knowledge is inaccurate;
- "SIMP" - ``simplex`` method. This is a multidimensional minimization that does not use first derivatives, so it should not be insensitive to the precision with which first derivative is calculated. But it is rather slow method.
- "MINI" - minimize method is equivalent to Migrad. If Migrad fails, it reverts to "SIMP" and then calls Migrad again.
- "IMP" - improved Migrad method that has improvements for searches of a global minimum;
- "SEE" - the seek method uses a Monte Carlo method with random jumps to find a minimum.
Similarly, one can define the minimization strategy, precision, tolerance and a maximum number of iterations. See the hep.aida.ref.optimizer.JMinuitConfiguration class for details.
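The sketch below illustrates the idea. It assumes that the configuration class provides setter counterparts (setStrategy, setTolerance, setPrecision, setMaxIterations) to the getters shown above; please check the Java API of the JMinuitConfiguration class to confirm these names:

co=op.configuration()
co.setStrategy(2)              # assumed setter: a higher strategy is more careful but slower
co.setTolerance(1e-6)          # assumed setter: convergence tolerance
co.setPrecision(1e-10)         # assumed setter: numerical precision of the function
co.setMaxIterations(10000)     # assumed setter: limit on the number of iterations
op.setConfiguration(co)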
2D case
Minimization can be done for a function of any dimension, with any number of parameters. Let us consider a 2D case, [math]\displaystyle{ a*(x-2)^2 + (y+1)^2 }[/math]. The example script that performs minimization and plotting in 2D is shown below:
from java.lang import Math
from jhplot import *

class MyFunc(FNon):   # define a function a*(x-2)^2 + (y+1)^2
   def value(self, x):
       return (x[0]-2)*(x[0]-2)*self.p[0]+(x[1]+1)*(x[1]+1)

c1 = HPlot3D()        # plot data and the fit function
c1.visible()
c1.setAutoRange()

pl = MyFunc("2D function",2,1)   # 2 variables, 1 parameter
print pl.numberOfParameters()
print pl.dimension()
print pl.parameterNames().tolist()
pl.setParameter("par0",2)
print pl.parameters()
c1.draw( F2D(pl,-10,10,-10,10) ) # create plottable function and show it

from hep.aida.ref.optimizer.jminuit import *
op=JMinuitOptimizerFactory().create()
op.setFunction(pl)                       # set function
op.variableSettings("x0").setValue(10)   # set initial value
op.variableSettings("x1").setValue(-10)  # set initial value
op.optimize()                            # perform optimization
res=op.result()
print "Minimization results=",res.parameters(), " status=",res.optimizationStatus()
The output is (2,-1). The script also generates an image showing the 2D function surface.
Migrad function minimization
This is an alternative approach based on the org.freehep.math.minuit package. First we will consider the minimization of a 1D function using the MnMigrad class.
1D case
Let us consider a trivial example of minimization for a function [math]\displaystyle{ 10+x^2 }[/math].
from org.freehep.math.minuit import *

class func(FCNBase):     # define the user function 10+x^2
   def valueOf(self, par):
       return 10+par[0]*par[0]

Par = MnUserParameters()
Par.add("x", 1., 0.1)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
state=vmin.userState()
print "Min value=:",vmin.fval(), "function calls=",vmin.nfcn()
print "Parameters=",state.params()
print "Print more information=:",vmin.toString()
The output of this code is:
Min value=: 10.0 function calls= 13
Parameters= array('d', [-2.1424395590941003e-10])
Print more information=: Minuit did successfully converge.
# of function calls: 13
minimum function value: 10.0000
minimum edm: 4.57506e-20
minimum internal state vector: LAVector parameters: -2.14244e-10
minimum internal covariance matrix: LASymMatrix parameters: 1.00000

# ext. || name || type ||    value     || error +/-
   0   ||   x  || free || -2.14244e-10 ||  1.00000

MnUserCovariance: 1.00000
MnUserCovariance parameter correlations: 1.00000
MnGlobalCorrelationCoeff: 0.00000
Note that for a complicated function the minimization may fail. It is important to check that the minimization was successful; if not, one can try an alternative Minuit strategy. Modify the above code by adding the isValid() check:
......
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
if vmin.isValid()==False:            # try again with a higher strategy
   migrad = MnMigrad(func(), Par, 2)
   vmin = migrad.minimize()
......
The same example can be written in Java as shown below:
import org.freehep.math.minuit.*;

public class Main {
   public static void main(String[] args) {
      FCNBase myFunction = new FCNBase() {
         public double valueOf(double[] par) {
            return 10 + par[0]*par[0];
         }
      };
      MnUserParameters myParameters = new MnUserParameters();
      myParameters.add("x", 1., 0.1);
      MnMigrad migrad = new MnMigrad(myFunction, myParameters);
      FunctionMinimum min = migrad.minimize();
      System.out.printf("Minimum value is %g found using %d function calls",
                        min.fval(), min.nfcn());
   }
}
The above code needs to be modified in order to produce graphical output. Let us consider a more complex example in which we will minimize the function [math]\displaystyle{ 10+x^2+\exp(x) }[/math]. In order to plot the function and minimize it at the same time, we will add an additional method, "getPlot()", to the function definition. This method returns the (X,Y) array for function plotting. The example below shows how to minimize and plot this function; we also indicate the result of the minimization with a red dot:
from org.freehep.math.minuit import FCNBase,MnMigrad,MnUserParameters
from java.lang.Math import *

class func(FCNBase):                  # define the user function 10+x^2+exp(x)
   def valueOf(self, par):
       return 10+par[0]*par[0]+exp(par[0])
   def getPlot(self,min,max):         # return a P1D with (X,Y) points for plotting
       f=P1D("function")
       step=(max-min)/100.0           # define step
       for i in range(100):
          x=min+step*i
          y=self.valueOf([x])
          f.add(x,y)
       return f

Par = MnUserParameters()
Par.add("x", 1., 0.1)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
state=vmin.userState()
print "Min value=:",vmin.fval(), "function calls=",vmin.nfcn()
par=state.params()

from jhplot import *                  # plotting part
from jhplot.shapes import *
from java.awt import Color
c1 = HPlot()
c1.visible()
min=-10; max=10                       # min and max values for plotting
c1.setRangeX(min,max)
c1.setRangeY(0,200)
f=func()
p=f.getPlot(min,max)
p.setStyle('l'); p.setPenWidth(3); p.setColor(Color.blue)
c1.draw(p)
center=P1D("min value")               # show the minimum as a red dot
center.add(par[0],vmin.fval())
center.setColor(Color.red)
c1.draw(center)
2D case
Now let us consider a more complicated case of minimizing a function in 2D, which will also give you an idea of how to do this for an arbitrary function in any dimension. We will consider the function [math]\displaystyle{ 100*(y-x^2)^2+(1-x)^2 }[/math] (the well-known Rosenbrock function). This function has a known minimum at (1,1), where its value is exactly 0.
from org.freehep.math.minuit import FCNBase,MnUserParameters,MnMigrad
from java.lang.Math import *

class func(FCNBase):     # define the user function 100*(y-x^2)^2+(1-x)^2
   def valueOf(self, par):
       return 100*(par[1]-par[0]*par[0])*(par[1]-par[0]*par[0]) + (1-par[0])*(1-par[0])

Par = MnUserParameters()
Par.add("x", 1., 0.1)    # initial parameters
Par.add("y", 3., 0.1)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
state=vmin.userState()
print "Min value=:",vmin.fval(), "function calls=",vmin.nfcn()
print state.params()
The output is:
Min value=: 1.86879138626e-22 function calls= 35
array('d', [1.0, 1.0])
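One can also inspect the parameter uncertainties estimated by Migrad. The sketch below assumes that the MnUserParameterState class provides an errors() method alongside params(), mirroring the C++ Minuit2 API; it can be appended to the example above:

print "Errors=",state.errors()          # assumed API: parabolic errors of the parameters
print "Valid minimum?",vmin.isValid()   # always check convergence (see the 1D example)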
Constrained optimization
In this section we discuss constrained optimization by linear approximation for an arbitrary function with any number of variables. The function can be constrained by certain conditions. This algorithm is publicly available in the Jcobyla project, but it was adapted so that it can be used with scripting languages.
Let us consider the minimization of a function with two variables. The function can be defined in Java as:
10.0 * Math.pow(x[0] + 1.0, 2.0) + Math.pow(x[1], 2.0)
Let us minimize this function and determine the point (x,y) where its value is smallest. We minimize the objective function above with respect to a set of inequality constraints "CON". The function and CON may be non-linear, and should preferably be smooth.
from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=2           # number of variables
nconst=0         # number of constraints
rhobeg=0.5       # Initial size of the simplex
rhoend=1.0e-10   # Final value of the simplex
iprint=0         # no output to Std.Out
maxfun=5000      # Maximum number of function evaluations before terminating

class calFun(Calcfc):
   def Compute(self, n, m, x, con):
       return 10.0 * Math.pow(x[0] + 1.0, 2.0) + Math.pow(x[1], 2.0)

x = [1.0, 1.0]   # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun)
print cc.getOutput().tolist()
The output of the above script is shown below, as expected, since the unconstrained quadratic 10*(x+1)^2 + y^2 has its minimum at x=-1, y=0:
[-1.0, 0.0]
Note that the corresponding Java code looks like this:
public void test01FindMinimum() {   // Java example
   double rhobeg = 0.5;
   double rhoend = 1.0e-6;
   int iprint = 1;
   int maxfun = 3500;
   System.out.format("%nOutput from test problem 1 (Simple quadratic)%n");
   Calcfc calcfc = new Calcfc() {
      @Override
      public double Compute(int n, int m, double[] x, double[] con) {
         return 10.0 * Math.pow(x[0] + 1.0, 2.0) + Math.pow(x[1], 2.0);
      }
   };
   double[] x = {1.0, 1.0};
   CobylaExitStatus result = Cobyla.FindMinimum(calcfc, 2, 0, x,
                                                rhobeg, rhoend, iprint, maxfun);
}
The code illustrates that we override the Compute() method of Calcfc with custom code.
Let us consider a more complicated example: the minimization of a function of two variables inside the unit circle. This adds a constraint to the minimization. The example is shown below:
from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=2           # number of variables
nconst=1         # number of constraints
rhobeg=0.5       # Initial size of the simplex
rhoend=1.0e-10   # Final value of the simplex
iprint=0         # no output to Std.Out
maxfun=5000      # Maximum number of function evaluations before terminating

class calFun(Calcfc):
   def Compute(self, n, m, x, con):
       con[0] = 1.0 - x[0]*x[0] - x[1]*x[1]   # stay inside the unit circle
       return x[0]*x[1]

x = [1.0, 1.0]   # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun)
print cc.getOutput().tolist()
The next example shows a minimization in three variables. We will minimize the product of the variables inside an ellipsoid (a single constraint):
from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=3           # number of variables
nconst=1         # number of constraints
rhobeg=0.5       # Initial size of the simplex
rhoend=1.0e-10   # Final value of the simplex
iprint=0         # no output to Std.Out
maxfun=5000      # Maximum number of function evaluations before terminating

class calFun(Calcfc):
   def Compute(self, n, m, x, con):
       con[0] = 1.0 - x[0]*x[0] - 2.0*x[1]*x[1] - 3.0*x[2]*x[2]   # ellipsoid constraint
       return x[0]*x[1]*x[2]

x = [1.0, 1.0, 1.0]   # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun)
print cc.getOutput().tolist()
Finally, the next example shows the minimization of a function with 9 variables and 14 constraints:
from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=9           # number of variables
nconst=14        # number of constraints
rhobeg=0.5       # Initial size of the simplex
rhoend=1.0e-10   # Final value of the simplex
iprint=0         # no output to Std.Out
maxfun=5000      # Maximum number of function evaluations before terminating

class calFun(Calcfc):
   def Compute(self, n, m, x, con):
       con[0]  = 1.0 - x[2]*x[2] - x[3]*x[3]
       con[1]  = 1.0 - x[8]*x[8]
       con[2]  = 1.0 - x[4]*x[4] - x[5]*x[5]
       con[3]  = 1.0 - x[0]*x[0] - Math.pow(x[1] - x[8], 2.0)
       con[4]  = 1.0 - Math.pow(x[0] - x[4], 2.0) - Math.pow(x[1] - x[5], 2.0)
       con[5]  = 1.0 - Math.pow(x[0] - x[6], 2.0) - Math.pow(x[1] - x[7], 2.0)
       con[6]  = 1.0 - Math.pow(x[2] - x[4], 2.0) - Math.pow(x[3] - x[5], 2.0)
       con[7]  = 1.0 - Math.pow(x[2] - x[6], 2.0) - Math.pow(x[3] - x[7], 2.0)
       con[8]  = 1.0 - x[6]*x[6] - Math.pow(x[7] - x[8], 2.0)
       con[9]  = x[0]*x[3] - x[1]*x[2]
       con[10] = x[2]*x[8]
       con[11] = -x[4]*x[8]
       con[12] = x[4]*x[7] - x[5]*x[6]
       con[13] = x[8]
       return -0.5 * (x[0]*x[3] - x[1]*x[2] + x[2]*x[8] - x[4]*x[8] + x[4]*x[7] - x[5]*x[6])

x = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]   # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun)
print cc.getOutput().tolist()
The approach minimizes an objective function F(X) subject to M inequality constraints on X, where X is a vector of variables with N components. The algorithm employs linear approximations to the objective and constraint functions, the approximations being formed by linear interpolation at N+1 points in the space of the variables; we regard these interpolation points as vertices of a simplex. The parameter RHO controls the size of the simplex and is reduced automatically from RHOBEG to RHOEND. For each RHO the subroutine tries to achieve a good vector of variables for the current size, and then RHO is reduced until the value RHOEND is reached. Therefore RHOBEG and RHOEND should be set to a reasonable initial change to the variables and to the required accuracy in the variables, respectively, but this accuracy should be viewed as a subject for experimentation because it is not guaranteed.
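As an illustration of this experimentation, the sketch below re-runs the simple two-variable quadratic from above with progressively tighter values of rhoend; the looser the value, the less accurate the returned minimum (the specific values used here are arbitrary choices):

from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

class calFun(Calcfc):
   def Compute(self, n, m, x, con):
       return 10.0*Math.pow(x[0]+1.0,2.0)+Math.pow(x[1],2.0)

for rhoend in [1.0e-2, 1.0e-6, 1.0e-10]:   # required accuracy in the variables
   x=[1.0, 1.0]                            # restart from the same initial point
   cc=Cobyla()
   cc.FindMinimum(calFun(), 2, 0, x, 0.5, rhoend, 0, 5000)
   print rhoend, cc.getOutput().tolist()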
The subroutine has an advantage over many of its competitors: it treats each constraint individually when calculating a change to the variables, instead of lumping the constraints together into a single penalty function. The name of the subroutine is derived from the phrase Constrained Optimization BY Linear Approximations.
The user must set the values of N, M, RHOBEG and RHOEND, and must provide an initial vector of variables in X. Further, the value of IPRINT should be set to 0, 1, 2 or 3; it controls the amount of printing during the calculation. Specifically, there is no output if IPRINT=0, and there is output only at the end of the calculation if IPRINT=1. Otherwise each new value of RHO and SIGMA is printed. Further, the vector of variables and some function information are given either when RHO is reduced (IPRINT=2) or whenever a new value of F(X) is computed (IPRINT=3). Here SIGMA is a penalty parameter, it being assumed that a change to X is an improvement if it reduces the merit function
F(X)+SIGMA*MAX(0.0, - C1(X), - C2(X),..., - CM(X)),
where C1,C2,...,CM denote the constraint functions that should become non-negative eventually, at least to the precision of RHOEND. In the printed output the displayed term that is multiplied by SIGMA is called MAXCV, which stands for 'MAXimum Constraint Violation'. The argument ITERS is an integer variable that must be set by the user to a limit on the number of calls of CALCFC (in our examples, CALCFC corresponds to the Compute() method defined above). The value of ITERS will be altered to the number of calls of CALCFC that are made.
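To make the merit function concrete, here is a small illustrative sketch (plain Jython, with an arbitrary SIGMA chosen only for demonstration) that evaluates it for the unit-circle example above:

def merit(x, sigma=10.0):
   f = x[0]*x[1]                        # the objective function
   c = 1.0 - x[0]*x[0] - x[1]*x[1]      # the constraint; should become non-negative
   return f + sigma*max(0.0, -c)        # F(X)+SIGMA*MAX(0.0, -C1(X))

print merit([0.5, -0.5])   # feasible point: no penalty term
print merit([2.0, 2.0])    # infeasible point: penalized by SIGMA*7.0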