
multimin: an interface for GSL optimization routines

Introduction

multimin is an interface to the GNU Scientific Library minimization routines. It is a simple C function: when it is invoked, all the information necessary to perform the minimization (the objective function, possibly its derivatives, the initial conditions and the minimization options) is passed as formal parameters.

The main advantage is that multimin automatically handles different kinds of boundary conditions (open, closed, semi-closed). It does so by transforming the problem through a change of coordinates. For instance, assume one is interested in the minimum \(x_0\) of a function \(f(x)\) on the closed interval \([a,b]\). Considering the change of variable

\[ x=h(y)=\frac{a+b}{2} + \frac{b-a}{2} \, \sin(y) \]

one can restate the problem as finding the minimum \(y_0\) of the function \(f(h(y))\) on the open real line. Once the minimum (or better, one minimum) has been found, the solution of the original problem is recovered as \(x_0=h(y_0)\). The list of the implemented transformations can be found below.
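To make the mechanism concrete, here is a minimal hand-rolled sketch of this idea (the names f, h and g and the quadratic objective are illustrative, not part of multimin):

#include <math.h>

/* interval boundaries and the original objective, to be minimized on [a,b] */
static const double a = -1.0, b = 2.0;

static double f(double x) { return (x - 1.0) * (x - 1.0); }

/* h maps the whole real line onto [a,b] */
static double h(double y)
{
  return (a + b) / 2.0 + (b - a) / 2.0 * sin(y);
}

/* transformed objective: minimizing g over all of R is equivalent
   to minimizing f on [a,b] */
static double g(double y)
{
  return f(h(y));
}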

Using multimin one can safely forget about the inner workings of the GSL minimization routines. The pointer to the objective function to be minimized and, when required, the pointers to its derivatives are passed as parameters, so the only programming effort required of the user is to code these functions properly. The C function returning the value of the objective function at the point \(x\) should be coded as

void (*f) (const size_t n, const double *x,void *fparams,double *fval)

where n is the dimension of the problem (the number of variables), x is an array of type double of dimension n containing the coordinates of the point at which the objective function has to be computed, fparams is a pointer to a list of parameters on which the function depends (distinct from the variables themselves) and fval is where the computed value is stored. The function returning the derivatives of the objective is coded similarly (see below).
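For instance (a purely illustrative sketch, not part of the library), the two-dimensional Rosenbrock function, which depends on no external parameters, could be coded as

#include <stddef.h>

void rosenbrock(const size_t n, const double *x, void *fparams, double *fval)
{
  (void) n;        /* the dimension is fixed to 2 in this example */
  (void) fparams;  /* no external parameters: NULL can be passed  */

  *fval = 100.0 * (x[1] - x[0]*x[0]) * (x[1] - x[0]*x[0])
        + (1.0 - x[0]) * (1.0 - x[0]);
}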

Download and install

Download the files multimin.c and multimin.h and copy them into the directory of your choice. In order to work, the GNU Scientific Library must be properly installed on your system. To compile a program using multimin, remember to include the appropriate header file multimin.h. To test the function, download the example file multimin_test.c and compile it using

gcc -Wall multimin_test.c multimin.c -lgsl -lgslcblas -o multimin_test

The example is discussed in the section below. For a list of recent additions to multimin you can check the ChangeLog file distributed with the software.

Calling convention

Let us now analyze the calling convention in detail:

multimin(size_t n, double *x, double *fmin, unsigned *type,
         double *xmin, double *xmax,
         void (*f) (const size_t, const double *, void *, double *),
         void (*df) (const size_t, const double *, void *, double *),
         void (*fdf) (const size_t, const double *, void *, double *, double *),
         void *fparams, const struct multimin_params oparams)

The list of parameters is as follows (INPUT stands for the input value of a parameter, OUTPUT for its value when the function returns):

n
INPUT: dimension of the problem, number of independent variables of the function.
x
INPUT: pointer to an array of n values x[0],...,x[n-1] containing the initial estimate of the minimum point. OUTPUT: contains the final estimate of the minimum position.
fmin
OUTPUT: returns the value of the function at the minimum.
type

INPUT: pointer to an array of n integers type[0],...,type[n-1] describing the boundary conditions for the different variables. The problem is solved as an unconstrained one on suitably transformed variables y. Possible values are:

Table 1: Available transformations

| value | interval           | function                        |
|-------|--------------------|---------------------------------|
| 0     | unconstrained      | \(x=y\)                         |
| 1     | \([x_m,+\infty)\)  | \(x=x_m+y^2\)                   |
| 2     | \((-\infty,x_M]\)  | \(x=x_M-y^2\)                   |
| 3     | \([x_m,x_M]\)      | \(x=SS+SD\,\sin(y)\)            |
| 4     | \((x_m,+\infty)\)  | \(x=x_m+\exp(y)\)               |
| 5     | \((-\infty,x_M)\)  | \(x=x_M-\exp(y)\)               |
| 6     | \((x_m,x_M)\)      | \(x=SS+SD\,\tanh(y)\)           |
| 7*    | \((x_m,x_M)\)      | \(x=SS+SD\,y/\sqrt{1+y^2}\)     |
| 8*    | \((x_m,+\infty)\)  | \(x=x_m+0.5\,(y+\sqrt{1+y^2})\) |
| 9*    | \((-\infty,x_M)\)  | \(x=x_M+0.5\,(y-\sqrt{1+y^2})\) |

where \(SS=(x_m+x_M)/2\) and \(SD=(x_M-x_m)/2\).

(*) These are UNSUPPORTED transformations: they have been used for testing purposes but are not recommended.
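As a quick check, consider for instance type 3: as \(y\) ranges over the real line, \(\sin(y)\) ranges over \([-1,1]\), so

\[ x = SS + SD\,\sin(y) \in [SS-SD,\,SS+SD] = [x_m,x_M] \;, \]

which is exactly the requested closed interval.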

xmin xmax
INPUT: pointers to arrays of n doubles containing, respectively, the lower and upper boundaries of the different variables. For a given variable, only the values implied by its constraint type, as defined in type, are actually inspected.
f

function that calculates the objective function at a specified point x. Its specification is

void (*f) (const size_t n, const double *x,void *fparams,double *fval)

where

n
INPUT: the number of variables
x
INPUT: the point at which the function is required.
fparams
INPUT: pointer to a structure containing parameters required by the function. If no external parameters are required it can be set to NULL.
fval
OUTPUT: the value of the objective function at the current point x.
df

function that calculates the gradient of the objective function at a specified point x. Its specification is

void (*df) (const size_t n, const double *x,void *fparams,double *grad)

where

n
INPUT: the number of variables
x
INPUT: the point at which the gradient is required.
fparams
INPUT: pointer to a structure containing parameters required by the function. If no external parameters are required it can be set to NULL.
grad
OUTPUT: the values of the gradient of the objective function at the current point x are stored in grad[0],...,grad[n-1].
fdf

fdf calculates the value and the gradient of the objective function at a specified point x. Its specification is

void (*fdf) (const size_t n, const double *x,void *fparams,double *fval,double *grad)

where

n
INPUT: the number of variables
x
INPUT: the point at which the function and its gradient are required.
fparams
INPUT: pointer to a structure containing parameters required by the function. If no external parameters are required it can be set to NULL.
fval
OUTPUT: the value of the objective function at the current point x.
grad
OUTPUT: the values of the gradient of the objective function at the current point x are stored in grad[0],...,grad[n-1].
fparams
pointer to a structure containing the parameters required by the objective function and its derivatives; it is passed unchanged to f, df and fdf. If no external parameters are required it can be set to NULL.
oparams
structure of type struct multimin_params containing the optimization parameters. The members are:
double step_size
size of the first trial step
double tol
accuracy of the line minimization
unsigned maxiter
maximum number of iterations
double epsabs
accuracy of the minimization
double maxsize
final size of the simplex
unsigned method
method to use. Possible values are:
  0. Fletcher-Reeves conjugate gradient
  1. Polak-Ribiere conjugate gradient
  2. Vector Broyden-Fletcher-Goldfarb-Shanno method
  3. Steepest descent algorithm
  4. Nelder-Mead simplex
  5. Vector Broyden-Fletcher-Goldfarb-Shanno method ver. 2
  6. Simplex algorithm of Nelder and Mead ver. 2
  7. Simplex algorithm of Nelder and Mead: random initialization
unsigned verbosity
if greater than 0, information on the intermediate steps is printed
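As an illustration (assuming only the member names listed above), the options structure can be filled with C99 designated initializers, which make the meaning of each value explicit:

struct multimin_params op = {
  .step_size = 0.1,   /* size of the first trial step      */
  .tol       = 1e-2,  /* accuracy of the line minimization */
  .maxiter   = 100,   /* maximum number of iterations      */
  .epsabs    = 1e-3,  /* accuracy of the minimization      */
  .maxsize   = 1e-5,  /* final size of the simplex         */
  .method    = 2,     /* method to use (see list above)    */
  .verbosity = 0      /* no output on intermediate steps   */
};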

Example

Consider the same example described in the GSL manual: a two-dimensional paraboloid centered at \((a,b)\), with scale factors \((A,B)\) and minimum value \(C\):

\[ f(x,y) = A (x-a)^2 + B (y-b)^2 + C \;. \]

Assume the parameters \((a,b,A,B,C)\) are stored in a five-element array passed to the routines through the fparams pointer. The function returning the value of the objective can be coded as follows

void f(const size_t n, const double *x, void *fparams, double *fval)
{
  double *p = (double *)fparams;   /* p = {a, b, A, B, C} */

  *fval = p[2]*(x[0]-p[0])*(x[0]-p[0]) + p[3]*(x[1]-p[1])*(x[1]-p[1]) + p[4];
}

while the function returning the derivatives reads

void df(const size_t n, const double *x, void *fparams, double *grad)
{
  double *p = (double *)fparams;

  grad[0] = 2*p[2]*(x[0]-p[0]);
  grad[1] = 2*p[3]*(x[1]-p[1]);
}

and, in addition, one needs a function returning both the value of the objective and its gradient at a given point, which allows the routines to avoid computing the same quantities twice

void fdf(const size_t n, const double *x, void *fparams, double *fval, double *grad)
{
  double *p = (double *)fparams;

  *fval = p[2]*(x[0]-p[0])*(x[0]-p[0]) + p[3]*(x[1]-p[1])*(x[1]-p[1]) + p[4];

  grad[0] = 2*p[2]*(x[0]-p[0]);
  grad[1] = 2*p[3]*(x[1]-p[1]);
}

Once these functions are defined, the program to minimize the objective can simply be

#include <stdio.h>
#include "multimin.h"

int main(){

  double par[5] = { 1.0, 2.0, 10.0, 20.0, 30.0 };

  double x[2]={0,0};
  double minimum;

  double xmin[2], xmax[2];
  unsigned type[2];

  /* step_size, tol, maxiter, epsabs, maxsize, method, verbosity */
  struct multimin_params optim_par = {.1,1e-2,100,1e-3,1e-5,2,0};

  /* unconstrained minimization */
  multimin(2,x,&minimum,NULL,NULL,NULL,&f,&df,&fdf,(void *) par,optim_par);

  printf("unconstrained minimum %e at x=%e y=%e\n",minimum,x[0],x[1]);

  /* minimum constrained in [-1,2]x(-inf,5] */
  type[0]=3;   /* x[0] in the closed interval [xmin[0],xmax[0]] */
  xmin[0]=-1;
  xmax[0]=2;
  x[0]=0;

  type[1]=2;   /* x[1] in (-inf,xmax[1]]; xmin[1] is not inspected */
  xmin[1]=-5;
  xmax[1]=5;
  x[1]=0;

  /* increase verbosity */
  optim_par.verbosity=1;

  multimin(2,x,&minimum,type,xmin,xmax,&f,&df,&fdf,par,optim_par);

  printf("constrained minimum %e at x=%e y=%e\n",minimum,x[0],x[1]);


  return 0;
}

Notice that in the unconstrained minimization NULL has been passed for the type, xmin and xmax arrays, while to search for a constrained minimum these arrays have been set accordingly. Of course, in this case the two minima coincide: both searches converge to \((x,y)=(1,2)\), where the function attains its minimum value \(C=30\).

Created: 2024-03-06 Wed 09:49