Analysis checkout scripts

Checkout Recipe

Check out the recommended version of CMSSW and all packages needed for compiling and executing the analysis. The script also compiles everything properly.

It is recommended to first create a suitable directory and change into it before checking out CMSSW there. This should be a place where you can collect further versions of the analysis later on.
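
For example (the directory path here is only a placeholder, pick whatever location suits you):

mkdir -p /path/to/your/analysis_checkouts
cd /path/to/your/analysis_checkouts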

You want to work on top of CMSSW8

wget https://raw.githubusercontent.com/cms-analysis/HiggsAnalysis-KITHiggsToTauTau/master/scripts/checkout_packages.sh
source checkout_packages.sh

This recipe exits and logs out of the current shell if an error occurs. If this behaviour is not wanted, remove the corresponding line from the downloaded checkout_packages.sh.

This will give you CMSSW 8 plus all packages checked out on their corresponding CMSSW 8 branches.
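
If the script ran through without errors, the shell should already have the new CMSSW environment set (the script is sourced, so the environment persists); a quick sanity check could be:

echo $CMSSW_VERSION
echo $CMSSW_BASE

The first should print a CMSSW_8_* release name, the second the path to the newly created working area.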

You want to work on top of CMSSW7

If you want CMSSW 7, do not just check out the other branches! Use the checkout script from the corresponding branch instead, as in the example below:

wget https://raw.githubusercontent.com/cms-analysis/HiggsAnalysis-KITHiggsToTauTau/CMSSW_747/scripts/checkout_packages.sh
source checkout_packages.sh -b CMSSW_747

You can try to omit the option -b, but this is strongly discouraged: you would then have to check yourself that Artus and the Htt package are on the same (74X) branch, and that the correct TauSpinner version is used in HiggsAnalysis/KITHiggsToTauTau/data/tauspinner.xml. An example of a correct such file is shown after the branch check sketch below.
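
A quick way to check the branches could look like this (a sketch only; it assumes the packages were checked out to the standard locations under $CMSSW_BASE/src):

cd $CMSSW_BASE/src/Artus && git rev-parse --abbrev-ref HEAD
cd $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau && git rev-parse --abbrev-ref HEAD

Both commands should report the same (74X) branch.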

<tool name="tauspinner" version="1.1.5">
  <lib name="TauolaCxxInterface"/>
  <lib name="TauolaFortran"/>
  <lib name="TauolaTauSpinner"/>
  <client>
    <environment name="TAUSPINNER_BASE" default="$CMSSW_RELEASE_BASE/../../../external/tauolapp/1.1.5-cms2"/>
    <environment name="LIBDIR" default="$TAUSPINNER_BASE/lib"/>
    <environment name="INCLUDE" default="$TAUSPINNER_BASE/include"/>
  </client>
  <use name="hepmc"/>
  <use name="f77compiler"/>
  <use name="pythia8"/>
  <use name="lhapdf"/>
</tool>

If yours is different, copy this one instead and then rebuild:

source $CMSSW_BASE/src/HiggsAnalysis/KITHiggsToTauTau/scripts/ini_KITHiggsToTauTauAnalysis.sh
scram clean
scram b

Don't forget that the scripts do not handle name clashes: before trying everything again from scratch, rename or remove the existing checkout_packages.sh! (TODO: automate this, e.g. by adding a time stamp to the file name.)
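
One way to do this by hand, along the lines of the TODO above (a sketch only):

mv checkout_packages.sh checkout_packages_$(date +%Y%m%d_%H%M%S).sh

This keeps the old script under a time-stamped name before a new one is downloaded.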

Regular Analysis Setups

In every new shell, CMSSW needs to be set up and some settings for the analysis need to be declared.

export VO_CMS_SW_DIR=/cvmfs/cms.cern.ch
source $VO_CMS_SW_DIR/cmsset_default.sh

cd <path/to/CMSSW_X_Y_Z>/src/
cmsenv

source ${CMSSW_BASE}/src/HiggsAnalysis/KITHiggsToTauTau/scripts/ini_KITHiggsToTauTauAnalysis.sh

It is recommended to put this into a dedicated function/alias in the ~/.bashrc or ~/.bash_profile file.
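
Such a function could look like this (a sketch only; the function name is arbitrary and the CMSSW path still needs to be filled in):

kit_htt_env() {
    export VO_CMS_SW_DIR=/cvmfs/cms.cern.ch
    source $VO_CMS_SW_DIR/cmsset_default.sh
    cd <path/to/CMSSW_X_Y_Z>/src/
    cmsenv
    source ${CMSSW_BASE}/src/HiggsAnalysis/KITHiggsToTauTau/scripts/ini_KITHiggsToTauTauAnalysis.sh
}

Put this into ~/.bashrc or ~/.bash_profile and call kit_htt_env in every new shell.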

Running Artus

Have a look at the "Running Artus" tutorial here.

SVFit

Have a look at the "SVFit tools" tutorial here.

TMVA training

  • Classifications: tmvaClassification.py -h
  • tmvaWrapper (recommended): tmvaWrapper.py -h (a usage sketch follows after this list)
    • -i and -o are mandatory and you will be reminded if one of them is missing
    • -S and -n are not mandatory, but one of them must be specified for a proper training
    • -S is the SplitValue: if this is specified, you will do a regular training with a splitting of your sample defined by this variable
    • -n is the number of N-folds: with this you will do an N-fold training and the splitting is done accordingly
    • --modify and --modification: these parameters give you access to predefined training routines, which are defined below the argparse part
    • use these parameters to run already predefined sequences or define your own in the code
    • You should write all trainings that belong together into one folder!
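
A usage sketch of the wrapper (all paths and values are placeholders; check tmvaWrapper.py -h for the full set of options):

tmvaWrapper.py -i <path/to/input> -o <path/to/training/folder> -n 5
tmvaWrapper.py -i <path/to/input> -o <path/to/training/folder> -S <SplitValue>

The first line would run a 5-fold training, the second a regular training with the splitting defined by the given SplitValue.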

TMVA post training duties

  • Overtraining: plot_overtraining.py -h
  • ConfigWriter: mvaConfigWriter.py -h
    • store all trainings that belong together in the same folder
    • use the Producer MVATestMethodsProducer to run all of your trainings with Artus, the config for this is contained in the output of ConfigWriter
  • Correlations
    • Producer: correlation_SampleProducer.py -h
    • Collector: correlation_SampleCollector.py -h
    • use the Producer to calculate correlations for an Artus run; there must be a merged folder within the directory of the run
    • use the Collector to add the correlations of different channels and plot them
    • Producer: when run for the first time, you have to use -P to produce the set of root files that form the basis of the correlation calculation; as of now the options for BDT bin correlations and multiprocessing exist but are not implemented (a usage sketch follows after this list)
    • Collector: by default the collector never adds MC sample correlations and data correlations, but you can add all MC samples and compare them to data
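
A rough sketch of the sequence (only -P is documented above; the remaining arguments are placeholders, check -h for the actual options):

correlation_SampleProducer.py -h       # list the available options
correlation_SampleProducer.py -P ...   # first run: also produces the root files used for the correlation calculation
correlation_SampleCollector.py ...     # afterwards: combine the correlations of different channels and plot them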

Plotlevel Filtering

  • ArtusRunFilter: MinimalPlotlevelFilter.h - general instructions for the setup are in the source code
  • PostRunFilter: reduce_mergedFiles.py -h - this filter reduces the merged output from an Artus run to a smaller set that is sufficient for plotting etc.