DMelt:Numeric/6 Minimization


Minimization

Minimization is a subtopic of the broader "mathematical optimization" topic: Mathematical_optimization.

Minuit minimization

Any function that extends jhplot.FNon can be minimized. How to define an arbitrarily complicated function in multiple dimensions was described before.

1D case

Let us give an example of how to find a minimum of a function shown in this figure:

DMelt example: Finding minimum in a function using JMinuit

which is defined as [math]\displaystyle{ a(x-2)^2 \sqrt{x}+b x^2 }[/math]. To minimize this function we will use the JMinuitOptimizer class. Here is a small script that finds and prints the value of the variable x which minimizes the function using the Migrad method:

from jhplot.shapes import Line
from java.lang import *
from java.awt  import *
from jhplot import *

class Func(FNon):
  def value(self, x):
   a=self.p[0]; b=self.p[1];
   return a*Math.sqrt(x[0])*(x[0]-2)**2+b*x[0]*x[0]

c1=HPlot()
c1.visible();  c1.setAutoRange()
pl=Func("a(x-2)^{2}#sqrt{x}+bx^2",1,2) # 1 var, 2 pars 
pl.setParameter("par0", 5) # define two parameters 
pl.setParameter("par1",-6)
print pl.parameters()

from hep.aida.ref.optimizer.jminuit import *
op=JMinuitOptimizerFactory().create()
op.setFunction(pl)
op.variableSettings("x0").setValue(10) # set initial value
op.optimize()                          # optimization 
res=op.result()

ff=F1D(pl,0,10) # show the function and min value 
xmin=res.parameters()[0]
ymin=ff.eval(xmin)
p1=Line(xmin,ymin-10,xmin,ymin+10,BasicStroke(4),Color.red)
c1.add(p1)
c1.draw(ff)
print res.parameters()," status=",res.optimizationStatus()

Configuring the minimization

One can define the minimization method, strategy, precision, tolerance and the maximum number of iterations. This is especially needed if the minimization fails. The above example uses the so-called Migrad method, which requires knowledge of the first derivatives; thus, it can fail if the derivative estimates are inaccurate.

You can learn about the input configuration of JMinuit using the following lines, which should be appended to the above example:

co=op.configuration()
print co.method()
print co.precision()
print co.strategy()
print co.tolerance()

where "co" is an instance of the hep.aida.ref.optimizer.JMinuitConfiguration class.

The output will tell you which method is used and the initial precision. We will not go into detail about the other optimization methods that can be defined and used during initialization; an advanced user is welcome to look at the Java API of the JMinuitOptimizerFactory class.

If the minimization fails, consider using other minimization methods. In the above example, you can set a new minimization technique right after the "create()" call, but before the "optimize()" call:

co=op.configuration()
co.setMethod(method)
op.setConfiguration(co)

where "method" is a string that can take one of the following values:

  • MIG" - Migrad method. It is the default method which is the most commonly used. However, it depends on the knowledge of the first derivative and may fail if such knowledge is inaccurate;
  • "SIMP" - ``simplex`` method. This is a multidimensional minimization that does not use first derivatives, so it should not be insensitive to the precision with which first derivative is calculated. But it is rather slow method.
  • "MINI" - minimize method is equivalent to Migrad. If Migrad fails, it reverts to "SIMP" and then calls Migrad again.
  • "IMP" - improved Migrad method that has improvements for searches of a global minimum;
  • "SEE" - the seek method uses a Monte Carlo method with random jumps to find a minimum.

Similarly, one can define the minimization strategies, precision, tolerance and the maximum number of iterations. See the hep.aida.ref.optimizer.JMinuitConfiguration class for details. A short example of switching the method is shown below.
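
For example, this minimal sketch reuses the optimizer "op" and the function from the script above and switches to the simplex method before rerunning the optimization:

co=op.configuration()
co.setMethod("SIMP")     # switch from the default Migrad to the simplex method
op.setConfiguration(co)
op.optimize()            # rerun the optimization with the new method
res=op.result()
print res.parameters()," status=",res.optimizationStatus()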

2D case

Minimization can be done for a function of any dimension, with any number of parameters. Let us consider the 2D case, [math]\displaystyle{ a(x-2)^2 + (y+1)^2 }[/math]. The example script that performs the minimization and plotting in 2D is shown below:

from java.lang import Math
from jhplot import *

class MyFunc(FNon):  # define a function a*(x-2)^2 + (y+1)^2
  def value(self, x):
    return (x[0]-2)*(x[0]-2)*self.p[0]+(x[1]+1)*(x[1]+1)

c1 = HPlot3D()                   # plot data and the fit function
c1.visible()
c1.setAutoRange()
pl = MyFunc("2D function",2,1) # 2 variables, 1 parameter     
print pl.numberOfParameters()
print pl.dimension()
print pl.parameterNames().tolist()
pl.setParameter("par0",2)
print pl.parameters()
c1.draw( F2D(pl,-10,10,-10,10) ) # create plottable function and show it 

from hep.aida.ref.optimizer.jminuit import *
op=JMinuitOptimizerFactory().create()
op.setFunction(pl)                      # set function
op.variableSettings("x0").setValue(10)  # set initial value
op.variableSettings("x1").setValue(-10) # set initial value
op.optimize()                           # perform optimization
res=op.result()
print "Minimization results=",res.parameters(), " status=",res.optimizationStatus()

The output is (2,-1) and the generated image file is:

DMelt example: Minimization of a function in 3D


Migrad function minimization

This is an alternative approach. First we will consider a minimization of a function in 1D using the MnMigrad class.

1D case

Let us consider a trivial example of minimization for a function [math]\displaystyle{ 10+x^2 }[/math].

from org.freehep.math.minuit import *

class func(FCNBase): # define user functions  
  def valueOf(self, par):
     return 10+par[0]*par[0]  # 10+x^2 function

Par = MnUserParameters()
Par.add("x", 1., 0.1)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
state=vmin.userState()
print "Min value=:",vmin.fval(), "function calls=",vmin.nfcn()
print "Parameters=",state.params()
print "Print more information=:",vmin.toString()

The output of this code is:

Min value=: 10.0 function calls= 13
Parameters= array('d', [-2.1424395590941003e-10])
Print more information=: 
Minuit did successfully converge.
# of function calls: 13
minimum function value: 10.0000
minimum edm: 4.57506e-20
minimum internal state vector: LAVector parameters: -2.14244e-10 
minimum internal covariance matrix: LASymMatrix parameters: 1.00000 
# ext. ||   name    ||   type  ||   value   ||  error +/- 
     0 ||         x ||   free   || -2.14244e-10 ||    1.00000
MnUserCovariance:   1.00000 
MnUserCovariance parameter correlations:  1.00000 
MnGlobalCorrelationCoeff:  0.00000
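
The fitted value and its uncertainty shown in this table can also be retrieved programmatically. The short sketch below assumes that MnUserParameterState provides value(name) and error(name) accessors, mirroring the C++ Minuit2 interface:

state=vmin.userState()
print "Fitted value of x =",state.value("x")  # assumed accessor, mirrors Minuit2
print "Error on x =",state.error("x")         # assumed accessor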


Note that for a complicated function the minimization may fail. It is important to check that the minimization was successful; if not, one can try an alternative Minuit strategy. Modify the above code to include the isValid() method:

......
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
if vmin.isValid()==False:  # retry with a higher Minuit strategy
            migrad = MnMigrad(func(), Par, 2)
            vmin = migrad.minimize()
.....


The above example can also be written in Java as shown below:

import org.freehep.math.minuit.*;
public class Main {
   public static void main(String[] args) {
      FCNBase myFunction = new FCNBase() {
         public double valueOf(double[] par) {
            return 10 + par[0]*par[0]; } };
            
      MnUserParameters myParameters = new MnUserParameters();
      myParameters.add("x", 1., 0.1);
      MnMigrad migrad = new MnMigrad(myFunction, myParameters);
      FunctionMinimum min = migrad.minimize();
      System.out.printf("Minimum value is %g found using %d function calls",
         min.fval(),min.nfcn());
   }  }

The above code needs to be modified in order to produce graphics output. Let us consider a more complex example in which we will minimize the function [math]\displaystyle{ 10+x^2+\exp(x) }[/math]. In order to plot this function and minimize it at the same time, we will add an additional method, "getPlot()", to the function definition. This method returns the (X,Y) array for function plotting. The example below shows how to minimize and plot this function. We also indicate the result of the minimization as a red dot:


from org.freehep.math.minuit import FCNBase,MnMigrad,MnUserParameters
from java.lang.Math import *

class func(FCNBase): # define user functions  
  def valueOf(self, par):
     return 10+par[0]*par[0]+exp(par[0])
  def getPlot(self,min,max):  
        f=P1D("function")
        step=(max-min)/100.0 # define step
        for i in range(100):
            x=min+step*i
            y=self.valueOf([x])
            f.add(x,y)
        return f 

Par = MnUserParameters()
Par.add("x", 1., 0.1)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
state=vmin.userState()
print "Min value=:",vmin.fval(), "function calls=",vmin.nfcn()
par=state.params()


from jhplot import *   # plotting part
from jhplot.shapes  import *
from java.awt import Color
c1 = HPlot()   
c1.visible()
min=-10;  max=10      # min and max values for plotting
c1.setRangeX(min,max)
c1.setRangeY(0,200)
f=func()
p=f.getPlot(min,max)
p.setStyle('l'); p.setPenWidth(3); p.setColor(Color.blue)
c1.draw(p)

# now show min and max values
center=P1D("min value")
center.add(par[0],vmin.fval())
center.setColor(Color.red)
c1.draw(center)

2D case

Now let us consider a more complicated case of minimizing a function in 2D, which will also give you an idea of how to do this for an arbitrary function in any dimension. We will consider the function [math]\displaystyle{ 100(y-x^2)^2+(1-x)^2 }[/math], the well-known Rosenbrock function. It has its minimum at (1,1), where the function value is 0.

from org.freehep.math.minuit import FCNBase,MnUserParameters,MnMigrad 
from java.lang.Math import *

class func(FCNBase): # define user functions  
  def valueOf(self, par):
     return 100*(par[1]-par[0]*par[0])*(par[1]-par[0]*par[0]) + (1- par[0])*(1-par[0])
     
Par = MnUserParameters()
Par.add("x", 1., 0.1)  # initial parameters
Par.add("y", 3., 0.1)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
state=vmin.userState()
print "Min value=:",vmin.fval(), "function calls=",vmin.nfcn()
print state.params()

The output is

Min value=: 1.86879138626e-22 function calls= 35
array('d', [1.0, 1.0])
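
If the allowed range of a parameter is known in advance, limits can be supplied when the parameters are defined. The sketch below assumes the bounded add(name, value, error, lower, upper) overload of MnUserParameters, which mirrors the C++ Minuit2 interface:

from org.freehep.math.minuit import FCNBase,MnUserParameters,MnMigrad

class func(FCNBase):
  def valueOf(self, par):
     return 100*(par[1]-par[0]*par[0])**2+(1-par[0])**2

Par = MnUserParameters()
Par.add("x", 1., 0.1, -5., 5.)  # assumed overload with lower/upper limits
Par.add("y", 3., 0.1, -5., 5.)
migrad = MnMigrad(func(), Par)
vmin = migrad.minimize()
print "Min value=:",vmin.fval(), "parameters=",vmin.userState().params()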

Constrained optimization

In this section we will discuss constrained optimization by linear approximation (COBYLA) for an arbitrary function with any number of variables. The function can be constrained by certain conditions. This algorithm is publicly available in the Jcobyla project, but it was adapted so that it can be used with scripting languages.


Let us consider a minimization of a function with two variables. The function can be defined in Java as:

10.0 * Math.pow(x[0] + 1.0, 2.0) + Math.pow(x[1], 2.0)

Let us minimize this function and determine the point (x,y) where its value is smallest. We minimize the objective function above with respect to a set of inequality constraints "CON". The function and CON may be non-linear, and should preferably be smooth.


from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math
nvar=2        # number of variables
nconst=0      # number of constraints
rhobeg=0.5    # Initial size of the simplex 
rhoend=1.0e-10 # Final value of the simplex 
iprint = 0     # no output to Std.Out
maxfun = 5000  # Maximum number of function evaluations before terminating 
     
class calFun(Calcfc):           
  def Compute(self, n, m, x, con):
      return 10.0 * Math.pow(x[0] + 1.0, 2.0) + Math.pow(x[1], 2.0); 

x = [1.0, 1.0]  # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun); 
print  cc.getOutput().tolist()

The output of the above script is:

[-1.0, 0.0]

This agrees with the analytic minimum of 10(x+1)^2 + y^2, which is located at (-1, 0). Note that the corresponding code in Java looks like this:

public void test01FindMinimum() {  // Java example
        double rhobeg = 0.5;
        double rhoend = 1.0e-6;
        int iprint = 1;
        int maxfun = 3500;
          
        System.out.format("%nOutput from test problem 1 (Simple quadratic)%n");
        Calcfc calcfc = new Calcfc() {
            @Override
            public double Compute(int n, int m, double[] x, double[] con) {
                return 10.0 * Math.pow(x[0] + 1.0, 2.0) + Math.pow(x[1], 2.0);
            }
        };
        double[] x = {1.0, 1.0 };
        CobylaExitStatus result = Cobyla.FindMinimum(calcfc, 2, 0, x, rhobeg, rhoend, iprint, maxfun);
}

The code illustrates that we override the Compute() method of Calcfc with custom code.

Let us make a more complicated example: a minimization of a function of two variables inside the unit circle. This adds a constraint to the minimization; the analytic minimum of x*y on the unit disk is -1/2, reached at x = -y = ±1/√2. The example is shown below:

from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=2         # number of variables
nconst=1       # number of constraints
rhobeg=0.5     # Initial size of the simplex 
rhoend=1.0e-10 # Final value of the simplex 
iprint = 0     # no output to Std.Out
maxfun = 5000  # Maximum number of function evaluations before terminating 
     
class calFun(Calcfc):           
  def Compute(self, n, m, x, con):
      con[0] = 1.0 - x[0] * x[0] - x[1] * x[1];
      return x[0] * x[1];

x = [1.0, 1.0]  # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun); 
print  cc.getOutput().tolist()
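
The FindMinimum() call also returns an exit status (a CobylaExitStatus in the Java example further above); assuming the instance method returns the same status object, printing it is a quick way to check that the run terminated normally:

print result  # exit status of the COBYLA run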

The next example shows minimization in three variables: we will minimize the product of the variables on an ellipsoid, defined by the single constraint x^2 + 2y^2 + 3z^2 <= 1:

from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=3         # number of variables
nconst=1       # number of constraints
rhobeg=0.5     # Initial size of the simplex 
rhoend=1.0e-10 # Final value of the simplex 
iprint = 0     # no output to Std.Out
maxfun = 5000  # Maximum number of function evaluations before terminating 
     
class calFun(Calcfc):           
  def Compute(self, n, m, x, con):
      con[0] = 1.0 - x[0] * x[0] - 2.0 * x[1] * x[1] - 3.0 * x[2] * x[2];
      return x[0] * x[1] * x[2];

x = [1.0, 1.0, 1.0]  # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun); 
print  cc.getOutput().tolist()

Finally, the next example shows a minimization of a function with 9 variables and 14 constraints:

from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

nvar=9         # number of variables
nconst=14      # number of constraints
rhobeg=0.5     # Initial size of the simplex 
rhoend=1.0e-10 # Final value of the simplex 
iprint = 0     # no output to Std.Out
maxfun = 5000  # Maximum number of function evaluations before terminating 
     
class calFun(Calcfc):           
  def Compute(self, n, m, x, con):
                con[0] = 1.0 - x[2] * x[2] - x[3] * x[3];
                con[1] = 1.0 - x[8] * x[8];
                con[2] = 1.0 - x[4] * x[4] - x[5] * x[5];
                con[3] = 1.0 - x[0] * x[0] - Math.pow(x[1] - x[8], 2.0);
                con[4] = 1.0 - Math.pow(x[0] - x[4], 2.0) - Math.pow(x[1] - x[5], 2.0);
                con[5] = 1.0 - Math.pow(x[0] - x[6], 2.0) - Math.pow(x[1] - x[7], 2.0);
                con[6] = 1.0 - Math.pow(x[2] - x[4], 2.0) - Math.pow(x[3] - x[5], 2.0);
                con[7] = 1.0 - Math.pow(x[2] - x[6], 2.0) - Math.pow(x[3] - x[7], 2.0);
                con[8] = 1.0 - x[6] * x[6] - Math.pow(x[7] - x[8], 2.0);
                con[9] = x[0] * x[3] - x[1] * x[2];
                con[10] = x[2] * x[8];
                con[11] = -x[4] * x[8];
                con[12] = x[4] * x[7] - x[5] * x[6];
                con[13] = x[8];
                return -0.5 * (x[0] * x[3] - x[1] * x[2] + x[2] * x[8] - x[4] * x[8] + x[4] * x[7] - x[5] * x[6]);

x = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]  # initial values of the variables
cc=Cobyla()
result=cc.FindMinimum(calFun(), nvar, nconst, x, rhobeg, rhoend, iprint, maxfun); 
print  cc.getOutput().tolist()

The approach minimizes an objective function F(X) subject to M inequality constraints on X, where X is a vector of variables with N components. The algorithm employs linear approximations to the objective and constraint functions, the approximations being formed by linear interpolation at N+1 points in the space of the variables. We regard these interpolation points as vertices of a simplex. The parameter RHO controls the size of the simplex and is reduced automatically from RHOBEG to RHOEND. For each RHO the subroutine tries to achieve a good vector of variables for the current size, and then RHO is reduced until the value RHOEND is reached. Therefore RHOBEG and RHOEND should be set to a reasonable initial change to the variables and to the required accuracy in the variables, respectively, but this accuracy should be viewed as a subject for experimentation because it is not guaranteed.
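
To watch RHO shrink during the iterations, one can rerun the first quadratic example with printing enabled; this sketch only changes the iprint and rhoend values used above:

from com.cureos.numerics import Cobyla,Calcfc
from java.lang import Math

class calFun(Calcfc):
  def Compute(self, n, m, x, con):
      return 10.0*Math.pow(x[0]+1.0, 2.0)+Math.pow(x[1], 2.0)

x = [1.0, 1.0]   # initial values of the variables
cc=Cobyla()
# iprint=2 prints each new value of RHO; a looser rhoend=1e-6 stops earlier
cc.FindMinimum(calFun(), 2, 0, x, 0.5, 1.0e-6, 2, 5000)
print cc.getOutput().tolist()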

The subroutine has an advantage over many of its competitors, however, which is that it treats each constraint individually when calculating a change to the variables, instead of lumping the constraints together into a single penalty function. The name of the subroutine is derived from the phrase Constrained Optimization BY Linear Approximations.

The user must set the values of N, M, RHOBEG and RHOEND, and must provide an initial vector of variables in X. Further, the value of IPRINT should be set to 0, 1, 2 or 3, which controls the amount of printing during the calculation. Specifically, there is no output if IPRINT=0 and there is output only at the end of the calculation if IPRINT=1. Otherwise each new value of RHO and SIGMA is printed. Further, the vector of variables and some function information are printed either whenever RHO is reduced (IPRINT=2) or whenever a new value of F(X) is computed (IPRINT=3). Here SIGMA is a penalty parameter, it being assumed that a change to X is an improvement if it reduces the merit function

F(X)+SIGMA*MAX(0.0, - C1(X), - C2(X),..., - CM(X)),

where C1,C2,...,CM denote the constraint functions that should become non-negative eventually, at least to the precision of RHOEND. In the printed output the displayed term that is multiplied by SIGMA is called MAXCV, which stands for 'MAXimum Constraint Violation'. The argument ITERS (MAXFUN in the scripts above) is an integer variable that must be set by the user as a limit on the number of calls of CALCFC; its value will be altered to the number of calls of CALCFC that are actually made.
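
As an illustration, the merit function for the unit-circle example above (objective x*y with the single constraint C1(x,y) = 1 - x^2 - y^2) can be spelled out as the following sketch; the merit() helper is hypothetical and only makes the formula explicit:

from java.lang import Math

def merit(x, sigma):
    # hypothetical helper: objective plus the SIGMA penalty for constraint violation
    f = x[0]*x[1]                      # objective of the unit-circle example
    c1 = 1.0 - x[0]*x[0] - x[1]*x[1]   # constraint, should become non-negative
    return f + sigma*Math.max(0.0, -c1)

print merit([1.0, 1.0], 10.0)  # a point outside the unit circle is penalized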