Commit 1ee784ab authored by Lorenzo Moneta

Update math libraries and fitting user guide

parent 87fed364
......@@ -9,33 +9,25 @@ needs to be drawn in a pad before the Fit Panel is invoked. The method
## The Fit Method
The Fit method is implemented in ROOT for the histogram classes `TH1`,
the sparse histogram classes `THnSparse`, and the graph classes `TGraph`,
`TGraph2D` and `TMultiGraph` (the last one for fitting a collection of graphs with the same function).
### The TH1::Fit Method
To fit a histogram programmatically, you can use the `TH1::Fit`
method. Here is the signature of `TH1::Fit` and an explanation of the
method. Here are the signatures of `TH1::Fit` and an explanation of the
parameters:
``` {.cpp}
void Fit(const char *fname, Option_t *option, Option_t *goption,
TFitResultPtr Fit(TF1 *function, Option_t *option, Option_t *goption,
Axis_t xxmin, Axis_t xxmax)
```
- `function` a pointer to the fitted function (the fit model) object.
One can also use the function name. This name may be one of ROOT pre-defined
function names or a user-defined function. See the next paragraph for the list of pre-defined functions.
- `*fname: `The name of the fitted function (the model) is passed as
the first parameter. This name may be one of ROOT pre-defined
function names or a user-defined function. The functions below are
predefined, and can be used with the `TH1::Fit` method:
- "`gaus`" Gaussian function with 3 parameters:
`f(x) = p0*exp(-0.5*((x-p1)/p2)^2)`
- "`expo`"An Exponential with 2 parameters: `f(x) = exp(p0+p1*x)`
- "`pol`*`N`*" A polynomial of degree *N*:
`f(x) = p0 + p1*x + p2*x2 +...`
- "`landau`" Landau function with mean and sigma. This function has
been adaptedfrom the `CERNLIB` routine `G110 denlan`.
- `*option:`The second parameter is the fitting option. Here is the
- `*option:` The second parameter is the fitting option. Here is the
list of fitting options:
- "`W`" Set all weights to 1 for non empty bins; ignore error bars
......@@ -46,17 +38,21 @@ parameters:
- "`I`" Use integral of function in bin instead of value at bin
center
- "`L`" Use log likelihood method (default is chi-square method)
- "`L`" Use log likelihood method (default is chi-square method). To be used when
the histogram represents counts
- "`U`" Use a user specified fitting algorithm
- "`WL`" Weighted log likelihood method. To be used when the histogram has been filled with
weights different than 1.
- "`Q`" Quiet mode (minimum printing)
- "`V`" Verbose mode (default is between Q and V)
- "`S`" The result of the fit is returned in the `TFitResultPtr`.
- "`E`" Perform better errors estimation using the Minos technique
- "`M`" Improve fit results
- "`M`" Improve fit results, by using the *IMPROVE* algorithm of TMinuit.
- "`R`" Use the range specified in the function range
......@@ -70,17 +66,15 @@ parameters:
one is kept)
- "`B`"Use this option when you want to fix one or more parameters
and the fitting function is like `polN`, `expo`, `landau`, `gaus`.
and the fitting function is a predefined one, like `polN`, `expo`, `landau`, `gaus`.
Note that in case of pre-defined functions some default initial values and limits are set.
- "`LL`"An improved Log Likelihood fit in case of very low
statistics and when bincontentsare not integers. Do not use this
option if bin contents are large (greater than 100).
- "`C`"In case of linear fitting, don't calculate the chisquare
(saves time).
- "`F`"If fitting a `polN`, switch to `Minuit` fitter (by default,
`polN` functions are fitted by the linear fitter).
- "`F`"If fitting a linear function (e.g. `polN`), switch to use the default minimizer (e.g. `Minuit`). By default,
`polN` functions are fitted by the linear fitter.
- `*goption: `The third parameter is the graphics option that is the
same as in the **`TH1`**`::Draw` (see the chapter Draw Options).
......@@ -88,10 +82,29 @@ parameters:
- `xxmin`, `xxmax:` The fourth and fifth parameters specify the
range over which to apply the fit.
By default, the fitting function object is added to the histogram and
By default, the fitted function object is added to the histogram and
is drawn in the current pad.
## Fit with a Predefined Function
### The TGraph::Fit Method
The signature for fitting a `TGraph` is exactly the same as for `TH1`. The options `L`, `WL` and `I`
apply only when fitting histograms.
The following options apply only to `TGraph::Fit`:
* `TGraph` specific *options*
- "`EX0`" When fitting a `TGraphErrors` or a `TGraphAsymmErrors`, the errors on the coordinates are not used in the fit.
- "`ROB`" In case of linear fitting, use robust fitting: compute the LTS regression coefficients (robust (resistant) regression),
using the default fraction of good points.
- "`ROB=0.x`" As above, but compute the LTS regression coefficients using 0.x as the fraction of good points.
## The `TF1` function class
Here we will show how to create the `TF1` objects that are used for fitting histograms and graphs.
### Fit with a Predefined Function
To fit a histogram with a predefined function, simply pass the name of
......@@ -102,10 +115,30 @@ this line fits histogram object `hist` with a Gaussian.
root[] hist.Fit("gaus");
```
The initial parameter values for pre-defined functions are set
automatically.
The initial parameter values (and limits, where applicable) for pre-defined functions are set
automatically. To override the default limits, use the fit option `B`.
The list of pre-defined functions that can be used with the `Fit` method is the following:
## Fit with a User-Defined Function
- "`gaus`" Gaussian function with 3 parameters:
`f(x) = p0*exp(-0.5*((x-p1)/p2)^2)`
- "`expo`" An exponential with 2 parameters: `f(x) = exp(p0+p1*x)`
- "`pol`*`N`*" A polynomial of degree *N*, where N is a number between 0 and 9:
`f(x) = p0 + p1*x + p2*x^2 +...`
- "`chebyshev`*`N`*" A Chebyshev polynomial of degree *N*, where N is a number between 0 and 9:
`f(x) = p0 + p1*x + p2*(2*x^2-1) +...`
- "`landau`" Landau function with mean and sigma. This function has
been adapted from the `CERNLIB` routine `G110 denlan` (see `TMath::Landau`).
- "`gausn`" Normalized form of the Gaussian function with 3 parameters:
`f(x) = p0*exp(-0.5*((x-p1)/p2)^2)/(p2*sqrt(2*PI))`
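To make the formulas in this list concrete, here is a minimal sketch in plain C++ (no ROOT required) that evaluates them by hand. The function names and the parameter-passing style are chosen for illustration only and are not the ROOT implementation. The Chebyshev sum follows the formula quoted above and uses the standard three-term recurrence.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Illustrative plain C++ versions of the pre-defined formulas (not ROOT code).

// "gaus": p[0] = constant, p[1] = mean, p[2] = sigma
double gaus(double x, const double *p) {
    assert(p[2] != 0.0);
    const double arg = (x - p[1]) / p[2];
    return p[0] * std::exp(-0.5 * arg * arg);
}

// "gausn": same shape, divided by p[2]*sqrt(2*pi)
double gausn(double x, const double *p) {
    const double pi = std::acos(-1.0);
    return gaus(x, p) / (p[2] * std::sqrt(2.0 * pi));
}

// "expo": exp(p0 + p1*x)
double expo(double x, const double *p) { return std::exp(p[0] + p[1] * x); }

// "polN": p0 + p1*x + ... + pN*x^N, evaluated with Horner's scheme
double pol(double x, const std::vector<double> &p) {
    double result = 0.0;
    for (auto it = p.rbegin(); it != p.rend(); ++it) result = result * x + *it;
    return result;
}

// "chebyshevN": sum of p[k]*T_k(x) with T_0 = 1, T_1 = x,
// T_{k+1} = 2*x*T_k - T_{k-1}
double chebyshev(double x, const std::vector<double> &p) {
    double result = 0.0;
    double tPrev = 1.0;  // T_0
    double tCur = x;     // T_1
    for (std::size_t k = 0; k < p.size(); ++k) {
        if (k == 0) {
            result += p[0] * tPrev;
        } else if (k == 1) {
            result += p[1] * tCur;
        } else {
            const double tNext = 2.0 * x * tCur - tPrev;
            tPrev = tCur;
            tCur = tNext;
            result += p[k] * tCur;
        }
    }
    return result;
}
```

With these definitions, `gaus` evaluated at `x = p1` returns `p0`, while `gausn` returns `p0/(p2*sqrt(2*PI))`, so for `gausn` the parameter `p0` plays the role of the integral of the function rather than its height.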
### Creating User-Defined Functions (TF1)
You can create a **`TF1`** object and use it in the call the
......@@ -117,9 +150,10 @@ the **`TF1`** object. There are three ways to create a **`TF1`**.
- Same as first one, with parameters
- Using a function that you have defined
- Using a function that you have defined. This can be a free function or
a functor object or a particular member function of a class.
### Creating a TF1 with a Formula
#### Creating a TF1 with a Formula
Let's look at the first case. Here we call the **`TF1`** constructor
......@@ -136,7 +170,7 @@ You can also use a **`TF1`** object in the constructor of another
root[] TF1 *f2 = new TF1("f2","f1*2",0,10)
```
### Creating a TF1 with Parameters
#### Creating a TF1 with Parameters
The second way to construct a **`TF1`** is to add parameters to the
......@@ -169,11 +203,11 @@ This sets parameter 0 to 10 and parameter 1 to 5. We can now draw the
root[] f1->Draw()
```
### Creating a TF1 with a User Function
#### Creating a TF1 with a User Function
The third way to build a **`TF1`** is to define a function yourself
and then give its name to the constructor. A function for a **`TF1`**
and then pass the function pointer to the constructor. A function for a **`TF1`**
constructor needs to have this exact signature:
``` {.cpp}
......@@ -232,8 +266,48 @@ Now we use the function:
}
```
## Fixing and Setting Parameters' Bounds
You can also create a `TF1` from a C++ function object (functor) with parameters.
A `TF1` can be created from any C++ class implementing this member function:
```{.cpp}
double operator()(double *x, double *p)
```
The advantage of the function object is that it can have a state and can therefore reference whatever other objects
the user needs, without using globals. This is an example that first defines the function object
``` {.cpp}
class MyFunctionObject {
public:
// use constructor to customize your function object
MyFunctionObject(......) { ......}
double operator() (double *x, double *p) {
// function implementation using class data members
}
};
```
and then use it to create the `TF1`:
```{.cpp}
MyFunctionObject fobj(....); // create the function object
TF1 * f = new TF1("f",fobj,xmin,xmax,npar); // create TF1 class with n-parameters and range [xmin,xmax]
```
If using C++11, one can create a `TF1` also from a C++ `lambda` function:
```{.cpp}
// create TF1 class with 2 parameters and range [xmin,xmax] using a lambda
TF1 * f = new TF1("f",[](double*x,double*p){return p[0] + p[1]*x[0];},xmin,xmax,2);
```
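The following is a minimal sketch, in plain C++ without ROOT, of what "a functor with state" means in practice. The class `OffsetLine` and the helper `evaluateAt` are hypothetical names introduced for this illustration. The only thing they share with `TF1` is the call signature `double(double *x, double *p)` described above.

```cpp
#include <cassert>
#include <functional>

// Hypothetical functor: a straight line plus an offset kept as object state,
// so no global variable is needed to pass the offset in.
class OffsetLine {
public:
    explicit OffsetLine(double offset) : fOffset(offset) {}

    // Same signature as the one required for a TF1 functor:
    // x = coordinates, p = parameters
    double operator()(double *x, double *p) {
        return fOffset + p[0] + p[1] * x[0];
    }

private:
    double fOffset;
};

// Evaluate any callable with that signature at a one-dimensional point.
// This helper stands in for what a function class does internally.
double evaluateAt(const std::function<double(double *, double *)> &f,
                  double x0, double *params) {
    assert(params != nullptr);
    double x[1] = {x0};
    return f(x, params);
}
```

Both an `OffsetLine` object and a lambda such as `[](double *x, double *p) { return p[0] + p[1] * x[0]; }` can be passed to `evaluateAt`, which is the same flexibility the functor and lambda constructors give.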
## Configuring the Fit
We will show here some configuration actions that can or must be done
when fitting a histogram or graph using the `Fit` method.
### Fixing and Setting Parameters' Bounds
Parameters must be initialized before invoking the `Fit` method. The
setting of the parameter initial values is automatic for the
......@@ -278,8 +352,7 @@ With this setup, parameters 0`->`2 can vary freely, parameter 3 has
boundaries [-10, 4] with initial value -1.5, and parameter 4 is fixed
to 0.
## Fitting Sub Ranges
### Fitting Sub Ranges
By default, `TH1::Fit` will fit the function on the defined histogram
range. You can specify the option "`R`" in the second parameter of
......@@ -301,140 +374,8 @@ root[] hist->Fit("f1","","",-2,2)
See macros `$ROOTSYS/tutorials/fit/myfit.C` and `multifit.C` for more
complete examples.
## The Fit Panel
![The Fit Panel](pictures/03000061.png)
To display the Fit Panel right click on a histogram to pop up the
context menu, and then select the menu entry Fit Panel.
The new Fit Panel GUI is available in ROOT v5.14. Its goal is to
replace the old Fit Panel and to provide more user friendly way for
performing, exploring and comparing fits.
By design, this user interface is planned to contain two tabs:
"General" and "Minimization". Currently, the "General" tab provides
user interface elements for setting the fit function, fit method and
different fit, draw, print options.
The new fit panel is a modeless dialog, i.e. when opened, it does not
prevent users from interacting with other windows. Its first prototype
is a singleton application. When the Fit Panel is activated, users can
select an object for fitting in the usual way, i.e. by left-mouse
click on it. If the selected object is suitable for fitting, the fit
panel is connected with this object and users can perform fits by
setting different parameters and options.
### Function Choice and Settings
*‘Predefined' combo box* - contains a list of predefined functions in
ROOT. You have a choice of several polynomials, a Gaussian, a Landau,
and an Exponential function. The default one is Gaussian.
*‘Operation' radio button group* defines the selected operational mode
between functions:
*Nop* - no operation (default);
*Add* - addition;
*Conv* - convolution (will be implemented in the future).
Users can enter the function expression into the text entry field
below the ‘Predefined' combo box. The entered string is checked after
the Enter key was pressed and an error message shows up, if the
function string is not accepted.
*Set Parameters*' button opens a dialog for parameters settings,
which will be explaned later.
### Fitter Settings
*‘Method' combo box* currently provides only two fit model choices:
Chi-square and Binned Likelihood. The default one is Chi-square. The
Binned Likelihood is recomended for bins with low statistics.
*‘Linear Fit' check button* sets the use of Linear fitter when is
selected. Otherwise the minimization is done by Minuit, i.e. fit
option "`F`" is applied. The Linear fitter can be selected only for
functions linears in parameters (for example - `polN)`.
*‘Robust' number entry* sets the robust value when fitting graphs.
*‘No Chi-square' check button* switch On/Off the fit option "`C`" -
do not calculate Chi-square (for Linear fitter).
*‘Integral' check button* switch On/Off the option "`I`" - use
integral of function instead of value in bin center.
*‘Best Errors'* sets On/Off the option "`E`" - better errors
estimation by using Minos technique.
*‘All weights = 1'* sets On/Off the option "`W`"- all weights set to 1
excluding empty bins; error bars ignored.
*‘Empty bins, weights=1'* sets On/Off the option "`WW`" - all weights
equal to 1 including empty bins; error bars ignored.
*‘Use range'* sets On/Off the option "`R`" - fit only data within the
specified function range. Sliders settings are used if this option is
set to On. Users can change the function range values by pressing the
left mouse button near to the left/right slider edges. It is possible
to change both values simultaneously by pressing the left mouse button
near to the slider center and moving it to a new position.
*‘Improve fit results'* sets On/Off the option "`M`"- after minimum is
found, search for a new one.
*‘Add to list'* sets On/Off the option "`+`"- add function to the list
without deleting the previous one. When fitting a histogram, the
function is attached to the histogram's list of functions. By default,
the previously fitted function is deleted and replaced with the most
recent one, so the list only contains one function. Setting this
option to On will add the newly fitted function to the existing list
of functions for the histogram. Note that the fitted functions are
saved with the histogram when it is written to a ROOT file. By
default, the function is drawn on the pad displaying the histogram.
### Draw Options
*‘SAME'* sets On/Off function drawing on the same pad. When a fit is
executed, the image of the function is drawn on the current pad.
*‘No drawing'* sets On/Off the option "`0`"- do not draw the fit
results.
*‘Do not store/draw'* sets On/Off option "`N`"- do not store the
function and do not draw it.
### Print Options
This set of options specifies the amount of feedback printed on the
root command line after performed fits.
*‘Verbose'* - prints fit results after each iteration.
*‘Quiet'* - no fit information is printed.
*‘Default'* - between Verbose and Quiet.
### Command Buttons
*Fit button* - performs a fit taking different option settings via the
Fit Panel interface.
*Reset* - sets the GUI elements and related fit settings to the
default ones.
*Close* - closes the Fit panel window.
## Fitting Multiple Sub Ranges
### Fitting Multiple Sub Ranges
The script for this example is `$ROOTSYS/tutorials/fit/multifit.C`. It
......@@ -493,7 +434,7 @@ the "+" sign is explained below:
h->Fit(total,"R+");
```
## Adding Functions to the List
### Adding Functions to the List
The example `$ROOTSYS/tutorials/fit/multifit.C` also illustrates how
......@@ -510,8 +451,8 @@ root[] hist->Fit("f1","+","",-2,2)
Note that the fitted function(s) are saved with the histogram when it
is written to a ROOT file.
## Combining Functions
## Example of fit: Combining Functions
You can combine functions to fit a histogram with their sum as it is
illustrated in the macro `FitDemo.C`
......@@ -592,7 +533,12 @@ function:
![The output of the FittingDemo() example](pictures/03000070.png)
## Associated Function
## Result of the fit
Here we will show how to obtain the result of the fit (fitted function, parameter values, errors
and eventually the covariance and correlation matrix).
### Associated Function
One or more objects (typically a **`TF1`**\*) can be added to the list
......@@ -604,7 +550,7 @@ of functions (`fFunctions`) associated to each histogram. A call to
TF1 *myfunc = h->GetFunction("myfunc");
```
## Access to the Fit Parameters and Results
### Access to the Fit Parameters and Results
If the histogram (or graph) is made persistent, the list of associated
......@@ -621,7 +567,10 @@ root[] Double_t p1 = fit->GetParameter(0);
root[] Double_t e1 = fit->GetParError(0);
```
## Associated Errors
Using the fit option `S` one can access the full result of the fit including the covariance and correlation matrix.
See later the paragraph `TFitResult`.
### Associated Errors
By default, for each bin, the sum of weights is computed at fill time.
......@@ -636,10 +585,13 @@ weights)`; otherwise, the error is set equal to the
```
Empty bins are excluded in the fit when using the Chi-square fit method.
When fitting the histogram with the low statistics, it is recommended to
use the Log-Likelihood method (option ‘`L`' or "`LL`").
When fitting a histogram representing counts (i.e. with Poisson statistics) it is recommended to
use the Log-Likelihood method (option "`L`" or "`WL`"), particularly in case of low statistics.
When the histogram has been filled with weights different from one, a weighted likelihood method can be used
and the errors retrieved from the fit are corrected following a procedure described in paragraph 8.5.2 of the book
*F. James, Statistical Methods in Experimental Physics, 2nd Edition*.
## Fit Statistics
### Fit Statistics
You can change the statistics box to display the fit parameters with the
......@@ -658,7 +610,595 @@ errors, use:
gStyle->SetOptFit(1011);
```
## The Minimization Package
## The Fit Panel
![The Fit Panel](pictures/03000061.png)
To display the Fit Panel right click on a histogram to pop up the
context menu, and then select the menu entry Fit Panel.
The new Fit Panel GUI is available in ROOT v5.14. Its goal is to
replace the old Fit Panel and to provide more user friendly way for
performing, exploring and comparing fits.
By design, this user interface is planned to contain two tabs:
"General" and "Minimization". Currently, the "General" tab provides
user interface elements for setting the fit function, fit method and
different fit, draw, print options.
The "Minimization tab" provides the option to set the Minimizer to use in the fit and
its specific options.
The new fit panel is a modeless dialog, i.e. when opened, it does not
prevent users from interacting with other windows. Its first prototype
is a singleton application. When the Fit Panel is activated, users can
select an object for fitting in the usual way, i.e. by left-mouse
click on it. If the selected object is suitable for fitting, the fit
panel is connected with this object and users can perform fits by
setting different parameters and options.
### Function Choice and Settings
*‘Predefined' combo box* - contains a list of predefined functions in
ROOT. You have a choice of several polynomials, a Gaussian, a Landau,
and an Exponential function. The default one is Gaussian.
*‘Operation' radio button group* defines the selected operational mode
between functions:
*Nop* - no operation (default);
*Add* - addition;
*Conv* - convolution (will be implemented in the future).
Users can enter the function expression into the text entry field
below the ‘Predefined' combo box. The entered string is checked after
the Enter key was pressed and an error message shows up, if the
function string is not accepted.
‘*Set Parameters*' button opens a dialog for parameters settings,
which will be explained later.
### Fitter Settings
*‘Method' combo box* currently provides only two fit model choices:
Chi-square and Binned Likelihood. The default one is Chi-square. The
Binned Likelihood is recommended for bins with low statistics.
*‘Linear Fit' check button* sets the use of the Linear fitter when it is
selected. Otherwise the minimization is done by Minuit, i.e. fit
option "`F`" is applied. The Linear fitter can be selected only for
functions linear in parameters (for example - `polN`).
*‘Robust' number entry* sets the robust value when fitting graphs.
*‘No Chi-square' check button* switch On/Off the fit option "`C`" -
do not calculate Chi-square (for Linear fitter).
*‘Integral' check button* switch On/Off the option "`I`" - use
integral of function instead of value in bin center.
*‘Best Errors'* sets On/Off the option "`E`" - better errors
estimation by using Minos technique.
*‘All weights = 1'* sets On/Off the option "`W`"- all weights set to 1
excluding empty bins; error bars ignored.
*‘Empty bins, weights=1'* sets On/Off the option "`WW`" - all weights
equal to 1 including empty bins; error bars ignored.
*‘Use range'* sets On/Off the option "`R`" - fit only data within the
specified function range. Sliders settings are used if this option is
set to On. Users can change the function range values by pressing the
left mouse button near to the left/right slider edges. It is possible
to change both values simultaneously by pressing the left mouse button
near to the slider center and moving it to a new position.
*‘Improve fit results'* sets On/Off the option "`M`"- after minimum is
found, search for a new one.
*‘Add to list'* sets On/Off the option "`+`"- add function to the list
without deleting the previous one. When fitting a histogram, the
function is attached to the histogram's list of functions. By default,
the previously fitted function is deleted and replaced with the most
recent one, so the list only contains one function. Setting this
option to On will add the newly fitted function to the existing list
of functions for the histogram. Note that the fitted functions are
saved with the histogram when it is written to a ROOT file. By
default, the function is drawn on the pad displaying the histogram.
### Draw Options
*‘SAME'* sets On/Off function drawing on the same pad. When a fit is
executed, the image of the function is drawn on the current pad.
*‘No drawing'* sets On/Off the option "`0`"- do not draw the fit
results.
*‘Do not store/draw'* sets On/Off option "`N`"- do not store the
function and do not draw it.
### Advanced Options
The "Advanced" button is enabled only after a fit has been performed and provides
additional drawing options that can be used once the fit is done. These drawing tools,
which can be selected in the "Advanced Drawing Tool" panel that pops up when clicking the "Advanced" button, are:
* *Contour*: to plot the confidence contour of two chosen parameters. One can select the number of points used to draw the contour
(more points might require more time to compute it), the parameters and the desired confidence level.
* *Scan*: to plot a scan of the minimization function (likelihood or chi-squared) around the minimum as a function of the chosen parameter.
* *Conf Interval*: to plot the confidence interval of the fitted function as a filled coloured band around its central value.
One can select the desired confidence level for the band to be plotted.
### Print Options
This set of options specifies the amount of feedback printed on the
root command line after performed fits.
*‘Verbose'* - prints fit results after each iteration.
*‘Quiet'* - no fit information is printed.
*‘Default'* - between Verbose and Quiet.
### Command Buttons
*Fit button* - performs a fit taking different option settings via the
Fit Panel interface.
*Reset* - sets the GUI elements and related fit settings to the
default ones.
*Close* - closes the Fit panel window.
### Minimization Options
With this tab one can select specific options for minimization. These include
* The minimizer library ( *Minuit*, *Minuit2*, *Fumili*, *GSL*, *Genetics* )
* The method (algorithm) for minimization. For example for Minuit one can choose between (*Migrad*, *Simplex* or *Scan*)
* Error definition
* Minimization tolerance
* Number of iterations/function calls
* Print Level: (*Default*, *Verbose* or *Quiet*).
## New ROOT::Fit classes
The fitting of the data objects in ROOT (histograms, graphs and trees) is performed via some common classes,
which are defined in the `ROOT::Fit` namespace.
These classes can be classified in the following groups:
* User classes driving the fit: `ROOT::Fit::Fitter` for executing the fit, `ROOT::Fit::FitConfig` for configuring the fit,
`ROOT::Fit::ParameterSettings` to define the properties of the fit parameters (initial
values, bounds, etc.), `ROOT::Fit::FitResult` for storing the result of the fit.
* Data classes containing the data sets used in the fitting. These classes are `ROOT::Fit::BinData` for describing binned data sets,
i.e. data points containing both coordinates and a corresponding value/weight,
optionally with an error on the value or the coordinate, and `ROOT::Fit::UnBinData` for un-binned data sets,
which consist only of a vector of coordinate values. The coordinate values can be
one-dimensional (i.e. one entry per event) or multi-dimensional (N entries per event).
* Function classes defining the type of fit (the objective function used for fitting):
- `ROOT::Fit::Chi2FCN` for chi2 (least-square fits),
- `ROOT::Fit::PoissonLikelihoodFCN` for binned likelihood fits of histograms,
- `ROOT::Fit::LogLikelihoodFCN` for generic un-binned likelihood fits.
These classes are templated on the type of function interface they implement (see later). Convenient typedefs are also provided for the user.
They derive from the common generic interface for multi-dimensional function evaluation, `ROOT::Math::IBaseFunctionMultiDim`.
In addition, the fitter classes make use of the generic interface for parametric function evaluation, `ROOT::Math::IParametricFunctionMultiDim`,
to define the fit model function, and use the `ROOT::Math::Minimizer` interface to perform the minimization of the objective function.
More information about the function interface and the multi-dimensional minimization in ROOT is given in the Mathematical Library chapter.
Here we present a detailed description of the `ROOT::Fit` classes and how to use them.
Using these classes instead of the interface provided directly in the ROOT data objects, like `TH1::Fit`, allows finer control
over the configuration and customisation of the fits. For example, using these classes a combined fit of several histograms can be performed.
To understand how these classes work, let's go through a simple example, such as fitting a histogram.
When fitting a histogram, instead of using `TH1::Fit` we will show in the following how to use the `ROOT::Fit` classes.
We will show how to perform the following different type of fits with the histogram data:
* a least square fit using the observed errors (Neyman chi-squared);
* a least square fit using the expected errors from the function (Pearson chi-squared);
* a binned likelihood fit;
* an extended unbinned likelihood fit, if the histogram has been set to store in the buffer the original data used to fill it.
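The difference between the first three fit types in this list is only the objective function that gets minimized. As a rough illustration, the sketch below computes the three quantities in plain C++ (no ROOT) for a set of observed bin contents and the corresponding model predictions. The function names are hypothetical, and the binned likelihood is written in the common form 2·Σ(f − n + n·ln(n/f)), which is an assumption about the normalization, not a statement about the exact expression used inside ROOT.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Neyman chi-squared: uses the observed errors, sigma_i^2 = n_i.
// Empty bins carry no error estimate and are skipped.
double neymanChi2(const std::vector<double> &observed,
                  const std::vector<double> &expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        if (observed[i] <= 0.0) continue;
        const double d = observed[i] - expected[i];
        chi2 += d * d / observed[i];
    }
    return chi2;
}

// Pearson chi-squared: uses the expected errors from the model, sigma_i^2 = f_i.
double pearsonChi2(const std::vector<double> &observed,
                   const std::vector<double> &expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        if (expected[i] <= 0.0) continue;
        const double d = observed[i] - expected[i];
        chi2 += d * d / expected[i];
    }
    return chi2;
}

// Binned Poisson likelihood, written as 2 * sum(f - n + n*ln(n/f)),
// so that it is zero when the model matches the data exactly.
double poissonLikelihoodChi2(const std::vector<double> &observed,
                             const std::vector<double> &expected) {
    assert(observed.size() == expected.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double n = observed[i];
        const double f = expected[i];
        if (f <= 0.0) continue;
        double term = f - n;
        if (n > 0.0) term += n * std::log(n / f);
        sum += term;
    }
    return 2.0 * sum;
}
```

For bins with large contents the three values are close to each other. They differ most at low statistics, where the likelihood form is the one that stays well defined for empty bins.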
Let's go through all the steps required for performing these fits using the `ROOT::Fit::Fitter` class.
These steps are:
1. Create the input fit data object.
2. Create the input model function.
3. Configure the fit.
4. Perform the data fitting.
5. Examine the result.
### Creating the input fit data
We have two types of input data, binned data (class `ROOT::Fit::BinData`) used for least square (chi-square) fits of histograms or `TGraph` objects
or un-binned data (class `ROOT::Fit::UnBinData`) used for
fitting vectors of data points (e.g. from a `TTree`).
#### Using Binned data
Let's suppose we have a histogram, represented as a `TH1` type object (it can be one or multi-dimensional). The following shows how to create and
fill a `ROOT::Fit::BinData` object.
``` {.cpp}
ROOT::Fit::DataOptions opt;
opt.fIntegral = true;
ROOT::Fit::BinData data(opt);
// fill the bin data using the histogram
// we can do this using the following helper function from the Hist library
TH1 * h1 = (TH1*) gDirectory->Get("myHistogram");
ROOT::Fit::FillData(data, h1);
```
In this code example, we have used the utility function of the *Hist* library, `ROOT::Fit::FillData`, to fill the `BinData` object.
`ROOT::Fit::FillData` is defined in the header file `HFitInterface.h` and it has a signature for all the different ROOT objects,
like `TH1`, `THnBase`, `TGraph`, `TGraph2D` and `TMultiGraph`.
When creating the `BinData` object, it is possible to specify the data range we want to use and some fitting options we want to apply
when filling the object and later when fitting.
The fit data options are controlled by the ``ROOT::Fit::DataOptions`` class, the range by the ``ROOT::Fit::DataRange`` class.
Here is an example of how to specify the input option to use the integral of the function in the bin, instead of the function value
evaluated at the bin center, when doing the fit, and to use a
range between the 'xmin' and 'xmax' values.
``` {.cpp}
ROOT::Fit::DataOptions opt;
opt.fIntegral = true;
ROOT::Fit::DataRange range(xmin,xmax);
ROOT::Fit::BinData data(opt,range);
// fill the bin data using the histogram
// we can do this using the following helper function from the Hist library
TH1 * h1 = (TH1*) gDirectory->Get("myHistogram");
ROOT::Fit::FillData(data, h1);
```
The list of possible fit options available is the following:
``` {.cpp}
ROOT::Fit::DataOptions opt;
opt.fIntegral = true; // use integral of bin content instead of bin center (default is false).
opt.fBinVolume = true; // normalize data by the bin volume (default is false).
// This is for fitting density functions in histograms with variable bin sizes.
opt.fUseRange =true; // use the function range when creating the fit data (default is false).
opt.fExpErrors = true; // use the expected errors estimated from the function values
// assuming Poisson statistics and not the observed errors (default is false).
opt.fUseEmpty = true; // use empty bins when fitting (default is false). If fExpErrors
// is not set an arbitrary error = 1 is assigned to those bins.
opt.fErrors1 = true; // Set all measured errors to 1 (default is false).
opt.fCoordErrors = false; // When available coordinate errors are not used in the fit
// (default is true: the errors are used when they are available,
// e.g. fitting a TGraphErrors).
opt.fAsymErrors = false; // When available asymmetric errors are considered in the fit
// (default is true, the asymmetric errors are used when they are available,
// e.g. fitting a TGraphAsymmErrors).
```
The `ROOT::Fit::DataRange` class supports defining multiple rectangular ranges in each dimension, for an arbitrary number of dimensions.
The function `DataRange::AddRange(icoord,xmin,xmax)` adds a range in
the coordinate `icoord` with lower value `xmin` and upper value `xmax`:
``` {.cpp}
ROOT::Fit::DataRange range;
range.AddRange(icoordinate, xmin, xmax);
```
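To illustrate what "multiple ranges per coordinate" means for data selection, here is a small stand-alone sketch in plain C++. The class `SimpleRange` is a hypothetical stand-in written for this example, not the ROOT class. It assumes that a point is selected when, for every coordinate that has at least one range, the value falls inside one of them, and that a coordinate without ranges is unrestricted.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a multi-range container (not ROOT::Fit::DataRange).
class SimpleRange {
public:
    explicit SimpleRange(std::size_t ndim) : fRanges(ndim) {}

    // Add the interval [xmin, xmax] for coordinate icoord
    void AddRange(std::size_t icoord, double xmin, double xmax) {
        assert(icoord < fRanges.size() && xmin <= xmax);
        fRanges[icoord].emplace_back(xmin, xmax);
    }

    // A coordinate with no range is unrestricted; otherwise the value
    // must lie inside at least one of the ranges of that coordinate.
    bool IsInside(const std::vector<double> &point) const {
        assert(point.size() == fRanges.size());
        for (std::size_t i = 0; i < fRanges.size(); ++i) {
            if (fRanges[i].empty()) continue;
            bool inside = false;
            for (const auto &r : fRanges[i]) {
                if (point[i] >= r.first && point[i] <= r.second) {
                    inside = true;
                    break;
                }
            }
            if (!inside) return false;
        }
        return true;
    }

private:
    std::vector<std::vector<std::pair<double, double>>> fRanges;
};
```

For example, adding the ranges [-2,-1] and [1,2] on coordinate 0 selects the two side bands and excludes the region between them, which is the typical use case for fitting a background around a signal peak.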
#### Using Un-Binned data
The unbinned data sets are represented with the `ROOT::Fit::UnBinData` class.
For creating un-binned data sets, a `ROOT::Fit::UnBinData` object, one has two possibilities:
1. Copy the data inside `ROOT::Fit::UnBinData`. One can create an empty `UnBinData` object, iterate on the data and add the data point one by one, or directly create the `UnBinData`
object from a data iterator. In this case an input `ROOT::Fit::DataRange` object is passed in order to copy the data according to the given range.
2. Use `ROOT::Fit::UnBinData` as a wrapper to an external data storage. In this case the `UnBinData` object is created from an iterator or pointers to the data and the data are not copied
inside. In this case the data cannot be selected according to a specified range. All the data points will be included in the fit.
The `ROOT::Fit::UnBinData` class also supports weighted data. In addition to the data points (coordinates), which can be of arbitrary dimension `k`, the class can be constructed from a vector of
weights. This is an example of taking data from the buffer of a `TH1` object:
``` {.cpp}
double * buffer = histogram->GetBuffer();
// the number of entries is the first element of the buffer
int n = buffer[0];
// when creating the data object it is important to create it with the size of the data
ROOT::Fit::UnBinData data(n);
for (int i = 0; i < n; ++i)
data.Add(buffer[2*i+1]); // the buffer of 1D histogram contains nevt,x1,w1,x2,w2,......
```
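The index arithmetic `2*i+1` in the loop above follows from the buffer layout `nevt,x1,w1,x2,w2,...` quoted in the comment. As a sketch of that layout only, the plain C++ helper below (hypothetical names, no ROOT) splits such an array into coordinates and weights.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Coordinates and weights decoded from a 1D histogram-style buffer.
struct BufferedEntries {
    std::vector<double> x;
    std::vector<double> w;
};

// Decode a buffer laid out as {nevt, x1, w1, x2, w2, ...}.
// A non-positive first element is treated as "no entries".
BufferedEntries decodeBuffer(const double *buffer) {
    assert(buffer != nullptr);
    const std::size_t n =
        buffer[0] > 0.0 ? static_cast<std::size_t>(buffer[0]) : 0;
    BufferedEntries out;
    out.x.reserve(n);
    out.w.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.x.push_back(buffer[2 * i + 1]);  // coordinate of entry i
        out.w.push_back(buffer[2 * i + 2]);  // weight of entry i
    }
    return out;
}
```

The coordinates alone are what an unweighted un-binned fit needs. The weights sit at the even indices and would be used for a weighted data set.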
In this example, instead, we create a 2-dimensional `UnBinData` object with the contents of a ROOT `TTree`:
``` {.cpp}
TFile * file = TFile::Open("hsimple.root");
TTree *ntuple = 0; file->GetObject("ntuple",ntuple);
// select from the tree the data we want to use for fitting
// we use TTree::Draw for this
int nevt = ntuple->Draw("px:py","","goff");
double * x = ntuple->GetV1();
double * y = ntuple->GetV2();
ROOT::Fit::UnBinData data(nevt, x, y );
```
### Creating the Fit model
In order to fit a data set we need a model to describe our data, e.g. a probability density function describing our observed data or
a hypothetical function describing the relation between the independent variables **`X`** and the single dependent variable `Y`.
We can have an arbitrary number `k` of independent variables. For example, when fitting a `k`-dimensional histogram,
the independent variables **`X`** are the bin center coordinates and `Y` is the bin weight.
The model function needs to be expressed as a function of some unknown parameters. The fit finds the parameter values that best describe
the observed data.
We can use the ROOT `TF1` class, the parametric function class, to describe the model function. However the `ROOT::Fit::Fitter` class, to be independent of the ROOT *`Hist`* library,
takes as input a more general parametric function object, the interface (abstract) class `ROOT::Math::IParametricFunctionMultiDim`, which describes a generic one- or multi-dimensional function
with parameters. This interface extends the abstract class `ROOT::Math::IBaseFunctionMultiDim` with methods to set/retrieve parameter values and to evaluate the function given the
vector of independent values **`X`** and the vector of parameters `P`.
More information about the different `ROOT::Math` function interfaces is available in the Mathematical Library chapter.
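To make the idea of "a function of **`X`** with parameters `P`" concrete, here is a minimal stand-alone sketch in plain C++ (no ROOT dependency). It evaluates a Gaussian model using an `(x, p)` calling convention. The name `gausModel` is hypothetical and chosen only for this illustration; it is not part of ROOT.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model: f(x; p) = p[0] * exp(-0.5 * ((x[0] - p[1]) / p[2])^2)
// x : vector of independent variables (one dimension here)
// p : vector of parameters (constant, mean, sigma)
double gausModel(const double *x, const double *p)
{
   assert(p[2] != 0.0); // sigma must be non-zero
   const double z = (x[0] - p[1]) / p[2];
   return p[0] * std::exp(-0.5 * z * z);
}
```

The fit varies the values in `p` while the values in `x` are taken from the data.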
An end-user can convert a `TF1` object into a `ROOT::Math::IParametricFunctionMultiDim`, using the wrapper class `ROOT::Math::WrappedMultiTF1`:
``` {.cpp}
TF1 * f1 = new TF1("f1","gaus");
ROOT::Math::WrappedMultiTF1 fitFunction(*f1, f1->GetNdim() );  // the wrapper takes the TF1 by reference
ROOT::Fit::Fitter fitter;
fitter.SetFunction( fitFunction, false);
```
When creating the wrapper, the parameter values stored in `TF1` are copied into the `ROOT::Math::WrappedMultiTF1` object.
The function object representing the model function is given to the `ROOT::Fit::Fitter` class using the `Fitter::SetFunction` method.
The user can also provide a function object that implements the derivatives of the function with respect
to the parameters.
This information might be useful for some types of fits. In this case the function object must be provided as a class deriving from the
`ROOT::Math::IParametricGradFunctionMultiDim` interface.
Note that the wrapper class `ROOT::Math::WrappedMultiTF1` also implements the gradient interface, using internally `TF1::GradientPar`,
which is based on numerical differentiation, except in the case of linear functions (i.e. when `TF1::IsLinear()` is `true`).
The parameter derivatives of the model function can be useful to some minimization algorithms, such as Fumili.
However, in general it is better to let the minimization algorithm (e.g. Minuit) compute the needed derivatives using its own customised
numerical differentiation algorithm.
To avoid providing the parameter derivatives to the fitter, we explicitly passed a `false` value to `Fitter::SetFunction`.
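The difference between analytical and numerical parameter derivatives can be sketched in plain C++ (no ROOT dependency). The code below computes the derivatives of a Gaussian model with respect to its three parameters both analytically and with a central difference. All function names and the step size `h` are assumptions made for this illustration; this is not the algorithm used by `TF1::GradientPar` or by Minuit.

```cpp
#include <cassert>
#include <cmath>

// Gaussian model f(x; p) = p[0] * exp(-0.5 * ((x - p[1]) / p[2])^2)
double gaus(double x, const double *p)
{
   assert(p[2] != 0.0); // sigma must be non-zero
   const double z = (x - p[1]) / p[2];
   return p[0] * std::exp(-0.5 * z * z);
}

// Analytical derivatives of the model with respect to its 3 parameters
void gausParGradient(double x, const double *p, double *grad)
{
   const double z = (x - p[1]) / p[2];
   const double e = std::exp(-0.5 * z * z);
   grad[0] = e;                       // df/dp0
   grad[1] = p[0] * e * z / p[2];     // df/dp1
   grad[2] = p[0] * e * z * z / p[2]; // df/dp2
}

// Central-difference estimate of df/dp[i]; the step h is an arbitrary choice
double numericalParDerivative(double x, const double *p, int i, double h = 1e-6)
{
   double pPlus[3]  = {p[0], p[1], p[2]};
   double pMinus[3] = {p[0], p[1], p[2]};
   pPlus[i]  += h;
   pMinus[i] -= h;
   return (gaus(x, pPlus) - gaus(x, pMinus)) / (2.0 * h);
}
```

A class deriving from the gradient interface would supply something like `gausParGradient`, while a numerical scheme needs only the function value.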
### Fit Configuration
The configuration of the fit is done via the `ROOT::Fit::FitConfig` class and the `ROOT::Fit::ParameterSettings` class it contains.
The following can be configured:
- the initial values of the parameters;
- the parameter step sizes;
- the parameter bounds, if any;
- the minimizer library and the particular algorithm to use;
- the minimization options (print level, tolerance, maximum number of iterations, etc.);
- the type of parameter errors to compute (parabolic errors, Minos errors, errors re-normalized using the fitted chi2 value).
The initial parameter values can be set directly in the input model function object.
However, for setting parameter bounds and step sizes to values different from the automatically computed ones, one needs to use the `ROOT::Fit::ParameterSettings` class.
This example code sets lower and upper bounds for the first parameter (index 0) and a lower bound for the third parameter (index 2):
``` {.cpp}
fitter.SetFunction( fitFunction, false);
fitter.Config().ParSettings(0).SetLimits(0,1.E6);
fitter.Config().ParSettings(2).SetLowerLimit(0);
```
Note that a `ROOT::Fit::ParameterSettings` object exists for each fit parameter. It is created by the `ROOT::Fit::FitConfig` class after the model function has been set in the Fitter.
Only when the function is set is the number of parameters known, and
the `FitConfig` then automatically creates the corresponding `ParameterSettings` objects.
When fitting, different minimizers can be used. They can be implemented in different libraries and loaded at run time by the plug-in manager system of ROOT.
Each minimizer (e.g. *Minuit, Minuit2, Fumili,* etc.) consists of a different implementation of the `ROOT::Math::Minimizer` interface.
Within the same minimizer, thus within the same class implementing the `Minimizer` interface, different algorithms can exist.
For example in the case of Minuit, we have *Migrad, Simplex* or *Minimize*. The minimizer and its corresponding algorithm, when available,
can be set by using the function `FitConfig::SetMinimizer("minimizerName")` or by using directly the `ROOT::Math::MinimizerOptions` class.
If the requested minimizer is not available in ROOT, the default one is used. The default minimizer type and algorithm can be specified by using the
static function `ROOT::Math::MinimizerOptions::SetDefaultMinimizer("minimizerName")`.
### Minimizer Libraries and Algorithms
The minimizer libraries currently available in ROOT, with their corresponding algorithms, are listed below.
Some minimizers (e.g. *Minuit*) contain several algorithms that the user can
choose. Others are based on a single algorithm (e.g. *Fumili*).
- **`Minuit`** (library *libMinuit*). Old version of Minuit, based on the `TMinuit` class. The list of possible algorithms is:
    - *`Migrad`* (default one)
    - *`Simplex`*
    - *`Minimize`* (it is a combination of Migrad and Simplex)
    - *`MigradImproved`*
    - *`Scan`*
    - *`Seek`*
- **`Minuit2`** (library *libMinuit2*). New C++ version of Minuit. The list of possible algorithms is:
    - *`Migrad`* (default)
    - *`Simplex`*
    - *`Minimize`*
    - *`Scan`*
    - *`Fumili`*. This is the same algorithm as `TFumili`, but implemented in the Minuit2 library.
- **`Fumili`**. Implements a dedicated minimization algorithm for least-square and likelihood fits. It has requirements on the type of method function to be used.
    No specific algorithm exists.
- **`GSLMultiMin`** (library *libMathMore*). Minimizer based on the Multidimensional Minimization routines of the GNU Scientific Library (GSL). The list of available algorithms is:
    - *`BFGS2`* (default): second version of the vector Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm;
    - *`BFGS`*: old version of the vector Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm;
    - *`ConjugateFR`*: Fletcher-Reeves conjugate gradient algorithm;
    - *`ConjugatePR`*: Polak-Ribiere conjugate gradient algorithm;
    - *`SteepestDescent`*: steepest descent algorithm.
- **`GSLMultiFit`** (library *libMathMore*). Minimizer based on the Non-Linear Least-Square routines of GSL. This minimizer can be used only for least-square fits.
- **`GSLSimAn`** (library *libMathMore*). Minimizer based on simulated annealing.
- **`Genetic`** (library *libGenetic*). Genetic minimizer based on an algorithm implemented in the *TMVA* package.
Each minimizer can be configured using the `ROOT::Math::MinimizerOptions` class. The options that can be set are:
* Minimizer type (`MinimizerOptions::SetMinimizerType(const char *)`).
* Minimizer algorithm (`MinimizerOptions::SetMinimizerAlgorithm(const char *)`).
* Print Level (`MinimizerOptions::SetPrintLevel(int )`) to set the verbose printing level (default is 0).
* Tolerance (`MinimizerOptions::SetTolerance(double )`), the tolerance used to control the iterations.
* Maximum number of function calls (`MinimizerOptions::SetMaxFunctionCalls(int )`).
* Maximum number of iterations (`MinimizerOptions::SetMaxIterations(int )`). Note that this is not used by *Minuit*.
* FCN Upper value for Error Definition (`MinimizerOptions::SetErrorDef(double )`). Value of the minimization function used to compute the parameter errors.
The default, which gives the uncertainties at the 68% CL, is a value of 1 for a chi-squared function minimization and 0.5 for a log-likelihood function.
* Strategy (`MinimizerOptions::SetStrategy(int )`), the minimization strategy used. For each minimization strategy *Minuit* uses different configuration parameters
(e.g. different requirements in computing derivatives, computing the full Hessian (strategy = 2) or an approximate version). The default is a value of 1. In this case the full Hessian matrix
is computed only after the minimization.
* Precision (`MinimizerOptions::SetPrecision(double )`). Precision value in the evaluation of the minimization function. Default is numerical double precision.
Note that not all the options are implemented by all the minimizers.
For example in *Minuit* it is possible to set the maximum number of function calls, but not the maximum number of iterations. The Strategy and the Precision options instead apply only to *Minuit* (and
*Minuit2*).
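The meaning of the error definition value can be written compactly. In the sketch below, $F$ is the minimized function, $\hat{p}$ is the parameter vector at the minimum and $\mathrm{UP}$ is the error definition value; these symbols are introduced here for illustration only. The uncertainty $\sigma_i$ on the parameter $p_i$ is the shift of $p_i$ for which the function, re-minimized with respect to all the other parameters, increases by $\mathrm{UP}$:

$$
\min_{p_j,\; j \neq i} F(\hat{p}_i \pm \sigma_i,\, p_j) \;=\; F(\hat{p}) + \mathrm{UP},
\qquad
\mathrm{UP} =
\begin{cases}
1 & \text{for } F = \chi^2 \\
0.5 & \text{for } F = -\ln L
\end{cases}
$$

The two values of $\mathrm{UP}$ are consistent with each other because, for Gaussian-distributed data, $-\ln L = \chi^2/2$ up to a constant.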
The class also supports setting different default values for the options, by using the static functions `MinimizerOptions::SetDefault...` (e.g. `MinimizerOptions::SetDefaultPrintLevel(int )`).
The list of the current option values can be inspected by using `MinimizerOptions::Print`.
```{.cpp}
ROOT::Math::MinimizerOptions opt;
// print the default minimizer option values
opt.Print();
```
In addition it is possible to provide extra options which might apply to a particular minimizer, using `MinimizerOptions::SetExtraOptions(const IOptions & )`.
See the documentation of the particular minimizer for the list of additional options available.
### Performing the Fit
We now have all the required ingredients for the fit: the data and the function to fit.
Depending on the type of input data, there are several ways to perform the fit, using the corresponding methods of the
`ROOT::Fit::Fitter` class.
#### Available fit methods
* **Least-square fit**: `Fitter::LeastSquareFit(const BinData & )` or `Fitter::Fit(const BinData &)`. It requires the user to pass a `BinData` object. It should be used when the data values follow a
Gaussian distribution. This fit method is implemented using the class `ROOT::Fit::Chi2FCN`.
* **Binned Likelihood fit**: `Fitter::LikelihoodFit(const BinData & )`. The user needs to pass a `BinData` object. It should be used when the data values follow a Poisson or a multinomial
distribution. The Poisson case (extended fit) is the default and in this case the function normalization is also fitted to the data. The multinomial case can be selected by passing the optional
*extended* boolean flag as *false*. This method is implemented by the class `ROOT::Fit::PoissonLikelihoodFCN`.
* **Un-Binned likelihood fit**: `Fitter::LikelihoodFit(const UnBinData &)`. The user needs to pass an `UnBinData` object. By default the fit is not extended (i.e. the normalization is not fitted to the
data). As above the user can select an extended likelihood fit by passing the optional
*extended* boolean flag as *true*. This method is implemented using the class `ROOT::Fit::LogLikelihoodFCN`.
* **Linear Fit**: If the model function is linear in the parameters, a linear fit can be selected. No iterative minimization is needed in this case; the solution is obtained with linear algebra algorithms from the *Matrix* library.
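The quantities minimized by the first three methods can be sketched in plain C++ (no ROOT dependency). The functions below are the textbook forms of the chi-square sum, of the extended Poisson binned negative log-likelihood (without the constant terms) and of the un-binned negative log-likelihood. The function names are hypothetical, and the ROOT classes named above may differ in detail (for example in the treatment of empty bins or in the constant terms).

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Least-square (chi2) sum: sum_i ((y_i - f_i) / sigma_i)^2
double chi2Sum(const std::vector<double> &y, const std::vector<double> &sigma,
               const std::vector<double> &f)
{
   assert(y.size() == f.size() && y.size() == sigma.size());
   double sum = 0.0;
   for (std::size_t i = 0; i < y.size(); ++i) {
      const double r = (y[i] - f[i]) / sigma[i];
      sum += r * r;
   }
   return sum;
}

// Extended (Poisson) binned negative log-likelihood without the constant
// ln(n_i!) terms: sum_i (f_i - n_i * ln f_i), with f_i the expected and
// n_i the observed bin content
double poissonNLL(const std::vector<double> &n, const std::vector<double> &f)
{
   assert(n.size() == f.size());
   double sum = 0.0;
   for (std::size_t i = 0; i < n.size(); ++i) {
      assert(f[i] > 0.0); // the expected content must be positive
      sum += f[i] - n[i] * std::log(f[i]);
   }
   return sum;
}

// Un-binned negative log-likelihood: -sum_i ln(pdf(x_i)),
// given the pdf value already evaluated at each event
double unbinnedNLL(const std::vector<double> &pdfValues)
{
   double sum = 0.0;
   for (double v : pdfValues) {
      assert(v > 0.0); // the pdf must be positive at every event
      sum -= std::log(v);
   }
   return sum;
}
```

In each case the model values `f` (or the pdf values) depend on the fit parameters, and the minimizer varies the parameters to minimize the returned sum.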
#### Customised Fit methods
Above we described the pre-defined methods used for fitting. Users can also implement their own fitting methods, i.e. their own version of the chi-square or likelihood function to be minimized.
In this case, the user does not need to build as input a `ROOT::Fit` data set and model function as described before, but can implement the method function using any
data set objects and functions.
In this case `ROOT::Fit::Fitter::SetFCN` is used to set the method function and `ROOT::Fit::Fitter::FitFCN` is used for fitting. The method function can also be passed directly to `ROOT::Fit::Fitter::FitFCN`, but in this
case a previously defined fitting configuration is used.
The possible types of method functions that can be passed to `ROOT::Fit::Fitter::SetFCN` are:
* A generic functor object implementing `operator()(const double * p)` where **`p`** is the parameter vector. In this case one needs to pass the number of parameters,
the function object and optionally a vector of initial parameter values. Other optional parameters include the size of the data set and a flag specifying if it is a chi2 (least-square) fit.
If the last two parameters are given, the `chi2/ndf` can be computed after fitting the data.
``` {.cpp}
template <class Function>
bool Fitter::SetFCN(unsigned int npar, Function & f,
const double * initialParameters = 0,
unsigned int dataSize=0, bool isChi2Fit = false)
```
* A function object implementing the `ROOT::Math::IBaseFunctionMultiDim` interface:
``` {.cpp}
bool Fitter::SetFCN(const ROOT::Math::IBaseFunctionMultiDim & f,
const double * initialParameters = 0,
unsigned int dataSize=0, bool isChi2Fit = false)
```
* A function object implementing the `ROOT::Math::FitMethodFunction` interface. This is an interface class extending
the `ROOT::Math::IBaseFunctionMultiDim` with some extra functionality which can be used when fitting.
This extra functionality is required by dedicated fitting algorithms like *Fumili* or *GSLMultiFit*.
``` {.cpp}
bool Fitter::SetFCN(const ROOT::Math::FitMethodFunction & f,
const double * initialParameters = 0, unsigned int dataSize=0)
```
* An old-Minuit-like FCN interface (i.e. a free function with the signature `fcn(int &npar, double *gin, double &f, double *u, int flag)`).
``` {.cpp}
typedef void(* MinuitFCN)(int &npar, double *gin, double &f, double *u, int flag)
bool Fitter::SetFCN(MinuitFCN fcn, int npar,
const double * initialParameters = 0,
unsigned int dataSize=0, bool isChi2Fit = false)
```
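As an illustration of the first case in the list above, here is a minimal sketch of a generic functor in plain C++ (no ROOT dependency). It computes a chi-square for a straight line on data held in the user's own containers. The class name `LineChi2` and its members are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical user-defined chi2 for a straight line y = p[0] + p[1]*x,
// computed on the user's own data containers
class LineChi2 {
public:
   LineChi2(const std::vector<double> &x, const std::vector<double> &y,
            const std::vector<double> &err)
      : fX(x), fY(y), fErr(err)
   {
      assert(fX.size() == fY.size() && fX.size() == fErr.size());
   }

   // Generic functor signature: parameter vector in, function value out
   double operator()(const double *p) const
   {
      double chi2 = 0.0;
      for (std::size_t i = 0; i < fX.size(); ++i) {
         const double r = (fY[i] - (p[0] + p[1] * fX[i])) / fErr[i];
         chi2 += r * r;
      }
      return chi2;
   }

   // Number of data points, usable as the dataSize argument
   std::size_t NPoints() const { return fX.size(); }

private:
   std::vector<double> fX, fY, fErr;
};
```

With ROOT available, such an object would presumably be passed through the first signature listed above, along the lines of `fitter.SetFCN(2, chi2, initialParameters, chi2.NPoints(), true)`, where `fitter`, `chi2` and `initialParameters` are user-defined variables.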
### Fit Result
The result of the fit is contained in the `ROOT::Fit::FitResult` object. A reference to the result object is obtained with the function
`Fitter::Result()`.
The `ROOT::Fit::FitResult` class provides an API for retrieving parameter values, errors, the covariance and correlation matrices from the fit,
the minimum chi2/likelihood values, etc.
A `FitResult::Print` method is also available to print the result of the fit.
The API of the class is self-explanatory; see its reference documentation for the information available after the fit.
One extra functionality offered by `ROOT::Fit::FitResult` is the possibility to compute the confidence intervals of the function after the fit.
The function `ROOT::Fit::FitResult::GetConfidenceIntervals`, given an input data set (e.g. a `BinData` object) and a confidence level value (e.g. 68%),
computes the lower/upper band values of the model function at the given data points.
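One common way to obtain such a band is linear error propagation of the parameter covariance matrix through the model function. The stand-alone sketch below (plain C++, no ROOT dependency) assumes this approach and is not taken from the ROOT implementation, which may apply additional corrections. The function name `bandHalfWidth` is hypothetical, and the factor `scale` that maps one standard deviation to the requested confidence level is left as an input.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Half-width of a linear error-propagation band at one data point:
// scale * sqrt(g^T C g), where g holds the derivatives of the model with
// respect to the parameters at that point and C is the covariance matrix
double bandHalfWidth(const std::vector<double> &grad,
                     const std::vector<std::vector<double>> &cov,
                     double scale = 1.0)
{
   const std::size_t n = grad.size();
   assert(cov.size() == n);
   double var = 0.0;
   for (std::size_t i = 0; i < n; ++i) {
      assert(cov[i].size() == n);
      for (std::size_t j = 0; j < n; ++j)
         var += grad[i] * cov[i][j] * grad[j];
   }
   return scale * std::sqrt(var);
}
```

The band at a point is then the fitted function value plus and minus this half-width.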
### TFitResult
`TFitResult` is a class deriving from `ROOT::Fit::FitResult` and providing in addition some convenient methods to return a
covariance or correlation matrix as a `TMatrixDSym` object. In addition `TFitResult` derives from `TNamed` and can be conveniently
stored in a file.
When fitting a histogram (a `TH1` object) or a graph (a `TGraph` object) it is possible to return a `TFitResult` via the `TFitResultPtr` object,
which behaves as a smart pointer to a `TFitResult`.
`TFitResultPtr` is the object returned by `TH1::Fit` or `TGraph::Fit`.
By default the `TFitResultPtr` contains only the status of the fit, which can be obtained by an automatic conversion of the `TFitResultPtr` to an integer.
If the fit option *`S`* is used instead, `TFitResultPtr` contains the `TFitResult` and behaves as a smart
pointer to it. This is an example:
``` {.cpp}
int fitStatus = hist->Fit(myFunction); // TFitResultPtr contains only the fit status
TFitResultPtr r = hist->Fit(myFunction,"S"); // TFitResultPtr contains the TFitResult
TMatrixDSym cov = r->GetCovarianceMatrix(); // to access the covariance matrix
Double_t chi2 = r->Chi2(); // to retrieve the fit chi2
Double_t par0 = r->Parameter(0); // retrieve the value for the parameter 0
Double_t err0 = r->ParError(0); // retrieve the error for the parameter 0
r->Print("V"); // print full information of fit including covariance matrix
r->Write(); // store the result in a file
```
## The Minimization packages
As explained before, various minimization packages can be used when fitting in ROOT.
We have seen before how to configure the `Fitter` class to use different minimization packages
and different minimization options.
When using the `Fit` method, the minimization package (and its options) can be selected using the
static methods of the `ROOT::Math::MinimizerOptions` class.
For example, to select `Minuit2` instead of `Minuit` for fitting a histogram do:
``` {.cpp}
ROOT::Math::MinimizerOptions::SetDefaultMinimizer("Minuit2");
// fit the histogram histo with the gaussian pre-defined function
histo->Fit("gaus");
```
In the following we give a brief description of the minimization packages.
The packages all implement the `ROOT::Math::Minimizer` interface, which can be used for
finding the minimum of a multi-dimensional function.
The interface is documented in the Mathematical Library Chapter.
In addition packages like Minuit or Minuit2 provide their own interfaces.
## MINUIT (Old TMinuit Version)
This package was originally written in FORTRAN by Fred James and part
......@@ -691,8 +1231,6 @@ conversion of the original FORTRAN version. The main changes are:
- The ROOT static function `Printf` is provided to replace all
format statements and to print on currently defined output file
- The derived class **`TMinuitOld`** contains obsolete routines from
the FORTRAN based version
- The functions `SetObjectFit/GetObjectFit` can be used inside the
`FCN` function to set/get a referenced object instead of using
......@@ -939,6 +1477,52 @@ following:
- Starting too far from the solution - the function may have
unphysical local minima, especially at infinity in some variables.
## Minuit2 Package
`Minuit2` is a new object-oriented implementation, written in C++, of
the popular `MINUIT` minimization package. Compared with the
**`TMinuit`** class, which is a direct conversion from FORTRAN to C++,
`Minuit2` is a complete redesign and re-implementation of the package.
This new version provides all the functionality present in the old
FORTRAN version, with almost equivalent numerical accuracy and
computational performance.
Furthermore, it contains some fixes and small improvements, as well as this new functionality:
* the possibility to set single-sided parameter limits;
* the FUMILI algorithm (see the next paragraph "FUMILI Minimization Package"),
which is an optimized method for least square and log
likelihood minimizations.
Minuit2 was originally developed by M.
Winkler and F. James in the SEAL project. More information can be found
on the [MINUIT Web Site](MINUIT Web Site) and in particular at the
following documentation page at
<http://www.cern.ch/minuit/doc/doc.html>.
A detailed User Guide for Minuit2 exists, describing the API of the internal classes.
ROOT uses `Minuit2` for fitting via the `Minuit2Minimizer` class which implements
the `ROOT::Math::Minimizer` interface.
`Minuit2` is also distributed as a package independent of ROOT and can be built
without any dependency on the ROOT libraries.
Examples of how to use the `Minuit2` and `Fumili2` plug-ins are provided
in the tutorials' directory `$ROOTSYS/tutorials/fit`:
`minuit2FitBench.C`, `minuit2FitBench2D.C` and `minuit2GausFit.C`.
More information on the classes and functions present in `Minuit2` is
available in the
[online reference documentation](online reference documentation).
Useful information on MINUIT and minimization in general is provided in the
following documents:
F. James, *Minuit Tutorial on Function Minimization* (
<http://seal.cern.ch/documents/minuit/mntutorial.pdf>); F. James, *The
Interpretation of Errors in Minuit* (
<http://seal.cern.ch/documents/minuit/mnerror.pdf>);
## FUMILI Minimization Package
......@@ -1031,6 +1615,8 @@ similar step formulae are used in FUMILI for negative logarithm of
the likelihood function with the same idea - linearization of function
argument.
## Neural Networks
......