======================================================================================== T M V A --- Toolkit for Multivariate Data Analysis with ROOT ======================================================================================== TMVA Users Guide : http://tmva.sourceforge.net/docu/TMVAUsersGuide.pdf TMVA home page : http://tmva.sourceforge.net/ TMVA developer page : http://sourceforge.net/projects/tmva TMVA mailing list : http://sourceforge.net/mailarchive/forum.php?forum_name=tmva-users TMVA license (BSD) : http://tmva.sourceforge.net/LICENSE ======================================================================================== System requirements: -------------------- TMVA has been tested to run on Linux, MAC/OSX and Windows platforms. Running TMVA requires the availability of ROOT shared libraries with ROOT_VERSION >= 5.08 (type "root-config --version" to see the version installed) ======================================================================================== Getting Started: ---------------- How to compile the code: ------------------------ /home> cd TMVA /home/TMVA> source setup.[c]sh // includes TMVA/lib in your lib path /home/TMVA> cd src /home/TMVA/src> make // compile and build the library ../libTMVA.1.so How to run the code as ROOT macro: // training/testing of an academic example ---------------------------------- /home/TMVA> cd macros --- For classification: /home/TMVA/macros> root -l TMVAClassification.C // run all standard classifiers /home/TMVA/macros> root -l TMVAClassification.C\(\"LD,Likelihood\"\) // run LD and Likelihood classifiers --- For regression: /home/TMVA/macros> root -l TMVARegression.C // run all regression algorithms /home/TMVA/macros> root -l TMVARegression.C\(\"LD,KNN\"\) // run LD and k-NN regression algorithms --> at the end of the jobs, a GUI will pop up: try to click through all the buttons; some of the lower buttons are method-specific, and will only work when the corresponding classifiers/regression algorithms have been trained/tested before (unless they are greyed out) How to run the code as an executable: // training/testing of an academic example ------------------------------------- /home/TMVA> cd execs /home/TMVA/execs> make /home/TMVA/execs> ./TMVAClassification // run all standard classifiers /home/TMVA/execs> ./TMVAClassification LD Likelihood // run LD and Likelihood classifiers ... and similar for regression /home/TMVA/examples> root -l ../macros/TMVAGui.C // start the GUI How to apply the TMVA methods: ------------------------------------- /home/TMVA> cd macros --- For classification: /home/TMVA/macros> root -l TMVAClassificationApplication.C /home/TMVA/macros> root -l TMVAClassificationApplication.C\(\"LD,Likelihood\"\) ... and similar for regression The directory structure: ------------------------ src/ : the TMVA source code lib/ : here you'll find the TMVA library (libTMVA.1.so) after compilation (copy it to you preferred library directory or include this directory in your LD_LIBRARY_PATH as it is done by: source setup.[c]sh macros/ : example code of how to use the TMVA library with a ROOT macro uses input data from a Toy Monte Carlo; also: handy root macros which read and display the results produced by TMVAClassification and TMVARegression execs/ : same example code as in 'macros', but for using the TMVA library in an executable execs/data : the Toy Monte Carlo data python/ : example code of how to use the TMVA library with a python script; requires availability of PyROOT development/ : for use by developers only ======================================================================================== Executive summary: ------------------ The Toolkit for Multivariate Analysis (TMVA) provides a machine learning environment for the processing and parallel evaluation of multivariate classification and regression algorithms. TMVA is integrated into the data analysis framework ROOT. It is specifically designed to the needs of high-energy physics (HEP) applications, but should not be restricted to these. The package includes: * Rectangular cut optimisation * Projective likelihood estimation (PDE approach) * Multi-dimensional likelihood estimation (PDE - range-search, PDE-foam, and k-NN) * Linear discriminant analysers (H-Matrix, Fisher Discriminant, and Linear Discr. (same as Fisher)) * Function discriminant analysis (FDA) * Artificial neural networks (three different Multilayer Perceptron implementations) * Support Vector Machine (SVM) * Boosted/Bagged decision trees * Predictive learning via rule ensembles (RuleFit) TMVA consists of object-oriented implementations in C++ for each of these discrimination techniques and provides training, testing and performance evaluation algorithms and visualization scripts. The classifier/regression training and testing is performed with the use of user-supplied data sets in form of ROOT trees or text files, where each event can have an individual weight. The true event classification/target value in these data sets must be known. Preselection requirements and transformations can be applied on this data. TMVA supports the use of variable combinations and formulas. TMVA works in transparent factory mode to guarantee an unbiased performance comparison between the algorithms: all algorithms see the same training and test data, and are evaluated following the same prescriptions within the same execution job. A Factory class organises the interaction between the user and the TMVA analysis steps. It performs preanalysis and preprocessing of the training data to assess basic properties of the discriminating variables used as input to the algorithms. The linear correlation coefficients of the input variables are calculated and displayed, and a preliminary ranking is derived (which is later superseded by method-specific variable rankings). The variables can be linearly transformed (individually for each algorithm) into a non-correlated variable space or projected upon their principle components. To compare the signal-efficiency and background-rejection performance of the algorithms, the analysis job prints tabulated results for some benchmark values, besides other criteria such as a measure of the separation and the maximum signal significance. Smooth efficiency versus background rejection curves are stored in a ROOT output file, together with other graphical evaluation information. These results can be displayed using ROOT macros, which are conveniently executed via a graphical user interface that comes with the TMVA distribution. The TMVA training job runs alternatively as a ROOT script, as a standalone executable, where libTMVA.so is linked as a shared library, or as a python script via the PyROOT interface. Each MVA method trained in one of these applications writes its configuration and training results in result (``weight'') files, which consists of one text and one ROOT file. A light-weight Reader class is provided, which reads and interprets the weight files (interfaced by the corresponding MVA method), and which can be included in any C++ executable, ROOT macro or python analysis job. For standalone use of the trained MVA methods, TMVA also generates lightweight C++ response classes (not available for all MVA methods), which contain the encoded information from the weight files so that these are not required anymore. These classes do not depend on TMVA or ROOT, neither on any other external library. We have put emphasis on the clarity and functionality of the Factory and Reader interfaces to the user applications. All MVA methods run with reasonable default configurations, so that for standard applications that do not require particular tuning, the user script for a full TMVA analysis will hardly exceed a few lines of code. For individual optimisation the user can (and should) customize the classifiers via configuration strings. Please report any problems and/or suggestions for improvements to the authors. ======================================================================================== Copyright © (2005-2009): ------------------------ Andreas Hoecker, Peter Speckmayer, Jörg Stelzer (all: CERN, Switzerland), Jan Therhaag, Eckhard von Toerne (both: U. Bonn, Germany), Helge Voss (MPI-KP Heidelberg, Germany), Contributed to TMVA have, please see: http://tmva.sourceforge.net/#authors Redistribution and use of TMVA in source and binary forms, with or without modification, are permitted according to the terms listed in the BSD license: http://tmva.sourceforge.net/LICENSE ----------------------------------------------------------------------------- @(#)root/tmva $Id: README,v 1.14 2008-03-12 18:10:35 andreas.hoecker Exp $
Rene Brun
authored
git-svn-id: http://root.cern.ch/svn/root/trunk@31885 27541ba8-7e3a-0410-8455-c3a389f83636