Combined Template-based Top Quark Measurement in the Lepton+Jets and Dilepton channels using 1.9fb-1

J. Adelman, E. Brubaker, W. Fedorko, H.S. Lee, Y.K. Kim, M. Shochet (U. Chicago)

S. Carron, T. Farooque, P. Sinervo (U. Toronto)

G. Velev (Fermilab)

Contact the authors

Combined fit result:
Mtop = 171.9 +/- 1.7 (stat.+JES) +/- 1.0 (syst) GeV/c2 = 171.9 +/- 2.0 GeV/c2

Lepton+Jets only result:
Mtop = 171.8 +/- 1.9 (stat.+JES) +/- 1.0 (syst) GeV/c2 = 171.8 +/- 2.1 GeV/c2

Dilepton only result:
Mtop = 171.6 +3.4-3.2 (stat.) +/- 3.8 (syst) GeV/c2 = 171.6 +5.1-5.0 GeV/c2
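
The total uncertainties quoted above are consistent with adding the two listed components in quadrature. The short check below is only an illustration of that arithmetic, assuming the components are treated as independent; the function name is hypothetical.

```python
import math


def combine_in_quadrature(*components):
    """Quadrature sum of uncertainty components assumed to be independent."""
    return math.sqrt(sum(c * c for c in components))


# (stat.+JES, syst) for the combined and Lepton+Jets fits, in GeV/c2
combined_total = combine_in_quadrature(1.7, 1.0)
ljets_total = combine_in_quadrature(1.9, 1.0)

# Asymmetric statistical uncertainty for the Dilepton-only fit
dil_total_up = combine_in_quadrature(3.4, 3.8)
dil_total_down = combine_in_quadrature(3.2, 3.8)

print(round(combined_total, 1), round(ljets_total, 1),
      round(dil_total_up, 1), round(dil_total_down, 1))
# prints: 2.0 2.1 5.1 5.0
```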

Conference Note


We present a measurement of the top quark mass performed simultaneously in the Lepton+Jets and Dilepton decay channels. We use a data sample with an integrated luminosity of 1.9 fb-1 collected by the CDF II detector. The data sample consists of 344 Lepton+Jets event candidates and 144 Dilepton event candidates.
In the Lepton+Jets channel a chi-squared function is minimized to obtain a reconstructed top mass mtreco for every event. The invariant mass mjj of the jets coming from the hadronically decaying W boson is used to reduce the dominant systematic effect, which arises from the jet energy scale (JES).
The Neutrino Weighting Algorithm (NWA) is applied to all Dilepton channel events, where we integrate over the unknown quantities in the kinematically underconstrained system to obtain the reconstructed top mass MtNWA. We also use HT, the scalar sum of the transverse momenta of the jets and leptons and the missing transverse energy, to improve the sensitivity.
Kernel density estimation (KDE) is used to produce probability density functions that are two-dimensional in the observables. The two-dimensional distributions (mtreco; mjj) and (MtNWA; HT) from data are compared to Monte Carlo to measure the top quark mass and the jet energy scale. The jet energy calibration from the Lepton+Jets channel is naturally applied to the Dilepton channel. We measure Mtop=171.9 +/- 1.7 (stat.+JES) +/- 1.0 (syst.) GeV/c2. We also perform separate fits in the Lepton+Jets channel, yielding Mtop=171.8 +/- 1.9 (stat.+JES) +/- 1.0 (syst.) GeV/c2, and in the Dilepton channel, yielding Mtop=171.6 +3.4 -3.2 (stat.) +/- 3.8 (syst.) GeV/c2. Note that the Dilepton-only fit has no in-situ JES calibration.

Event selection in the Lepton+Jets channel:

To select events in the Lepton+Jets channel, where one W boson from the top quark pair decays hadronically and the other decays to a charged lepton (electron or muon) plus a neutrino, we require a well-identified electron or muon, large missing transverse energy and 4 jets, at least one of which is identified as arising from a b quark. We take advantage of different signal-to-background ratios (S:B) and event shapes by splitting our sample into two non-overlapping subsamples, based on the number of jets with a b-tag (using CDF's secondary vertex tagger, SECVTX). Events with exactly one tag are required to have exactly 4 jets. In events with two or more tags, which have a higher S:B and more statistical power, we loosen the cut on the 4th jet and allow more than 4 jets. The event selection is summarized in the following table:

Number of b-tags
>= 2
Jets 1-3 Et threshold (GeV)
4th jet Et threshold (GeV)
Extra jets (GeV)
< 20
> 20
> 20
In addition, we require that the chi-squared returned by the kinematic fit is smaller than 9 for both Lepton+Jets subsamples, to further reduce the background fraction and to ensure that only well-reconstructed ttbar events enter the analysis. Furthermore, to avoid possible bias in the probability density functions, we apply a boundary cut requiring that all events have 110 < mtreco < 350 GeV/c2, with 50 < mjj < 115 GeV/c2 for the single-tag subsample and 50 < mjj < 125 GeV/c2 for the double-tag subsample.

Event selection in Dilepton channel:

We design the selection to accept ttbar events where both W bosons decay into an electron or muon and a neutrino. We use the W+jets dataset, which is triggered on a central electron or central muon. The selection criteria are summarized as follows:
  • Two leptons (e or mu) with pT > 20 GeV. One lepton has to be isolated
  • Two jets with transverse energy > 15 GeV; jets are corrected for differences in response between calorimeter regions and for calorimeter nonlinearities
  • Missing transverse energy > 25 GeV
  • Z-veto incorporating missing ET significance cut
  • Missing ET > 50 GeV if a lepton is closer than 20° in azimuth to the missing ET vector
  • HT > 200 GeV

We divide the dilepton sample into two subsamples based on the presence of a b-tag to enhance the statistical power of the method. As in the Lepton+Jets channel, we apply a boundary cut, requiring that 100 < MtNWA < 350 GeV/c2 and 200 < HT < 800 GeV.

Top mass reconstruction and dijet mass reconstruction in the Lepton+Jets channel:

A chi2 minimization is performed to reconstruct a top quark mass for each event. The fitter is based on the hypothesis that the event is ttbar: it contains W mass constraints on the hadronic and leptonic sides and requires the two top masses in the event to be equal. Only the leading 4 jets are assigned to the four quark daughters from the top quark decays. The jet-parton assignment that yields the lowest chi2 after minimization is kept for further analysis, and the corresponding top mass (mtreco) is used in our templates. The distributions of mtreco for the two Lepton+Jets subsamples are shown below.


To measure the JES, templates of the dijet mass mjj of the hadronically decaying W boson are constructed in addition to the top mass templates. The chi2 fitter is not used to obtain mjj, although events failing the chi2 cut are not used to measure the JES either. In 2-tag events, there is only one dijet mass among the leading 4 jets consistent with the b-tagging (i.e. formed from jets not tagged as b). In 1-tag events, there are 3 dijet masses consistent with the b-tagging. We take the dijet mass closest to the well-known W mass as the single value of mjj per event. The mjj distributions for the 2 subsamples for 3 different values of the JES are shown below.
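
The choice of mjj described above can be sketched as follows. This is a minimal illustration, not the analysis code: the four-vector layout (E, px, py, pz), the function names and the numerical W mass value are assumptions made for the example.

```python
import itertools
import math

W_MASS = 80.4  # GeV/c2, approximate W boson mass (assumed value for this sketch)


def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz)."""
    e = p1[0] + p2[0]
    px = p1[1] + p2[1]
    py = p1[2] + p2[2]
    pz = p1[3] + p2[3]
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))


def select_mjj(jets, tagged):
    """Pick the dijet mass closest to the W mass among untagged leading jets.

    jets   : four-vectors ordered by decreasing Et
    tagged : b-tag flag for each jet
    Only the leading 4 jets are considered. With 2 tags there is one
    candidate pair, with 1 tag there are three.
    """
    leading = list(zip(jets, tagged))[:4]
    untagged = [p for p, is_b in leading if not is_b]
    candidates = [invariant_mass(a, b)
                  for a, b in itertools.combinations(untagged, 2)]
    if not candidates:
        return None
    return min(candidates, key=lambda m: abs(m - W_MASS))


# Toy 1-tag event: three untagged jets give three candidate dijet masses.
toy_jets = [(50.0, 0.0, 0.0, 50.0), (40.0, 40.0, 0.0, 0.0),
            (40.0, -40.0, 0.0, 0.0), (30.0, 0.0, 30.0, 0.0)]
toy_tags = [True, False, False, False]
toy_mjj = select_mjj(toy_jets, toy_tags)
```

In the toy event the pair of back-to-back 40 GeV jets has a mass of 80 GeV/c2 and is the one selected.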


Top mass reconstruction in the Dilepton channel:

We use the Neutrino Weighting Algorithm (NWA) to reconstruct events in the Dilepton channel. Here there are not enough measured quantities to fully constrain the event, because of the presence of two neutrinos in the final state. We integrate over the neutrino pseudorapidities, taking their distribution from the Monte Carlo simulation. The algorithm proceeds as follows:
  • Assume the value of the top mass.
  • Choose a particular jet to b-quark assignment (there are two possibilities)
  • Assume neutrino pseudorapidities.
  • Using the world average masses of the W boson, b quark and leptons, we can now solve for the Px and Py of each of the neutrinos. Solutions might not exist for the assumed values of the top quark mass and neutrino pseudorapidities. When a solution exists we will have two solutions for each neutrino.
  • We form four weights comparing each combination of solutions to the measured missing transverse energy with a Gaussian weight. Since the correct combination is not known we sum the four weights.
  • We integrate over eta1 and eta2 (the two neutrino pseudorapidities), obtaining the weight for the assumed top mass. The integration distribution for the neutrino pseudorapidities is taken from the ttbar Monte Carlo and is a Gaussian with width approximately 1. The integration is performed by summing over a grid of values with 0.2 spacing.
  • We obtain the weight corresponding to the other jet to b-quark assignment
  • We sum the two weights. The sum is a measure of the probability that the true top mass equals the assumed top mass.
  • We scan the top mass in steps of 3 GeV.
  • The maximum weight is found, as well as the maximum weights of the two jet to b-quark assignments separately.
  • The scan is repeated successively around the maxima until a step size of 0.03 GeV is reached.
  • The assumed top mass which yields the highest weight is taken as the reconstructed top mass MtNWA.
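
Two of the steps listed above lend themselves to a short sketch: the sum of four Gaussian weights over neutrino solution combinations, and the coarse-to-fine mass scan. The code below is a simplified illustration under stated assumptions. The missing-ET resolutions, the factor of 10 between successive step sizes, the mass range and the toy weight curve are all hypothetical, and only the global maximum is refined, whereas the text also tracks the maxima of the two jet assignments separately.

```python
import math


def met_weight(nu1_solutions, nu2_solutions, met_x, met_y, sigma_x, sigma_y):
    """Sum of Gaussian weights over all neutrino solution combinations.

    Each solution is a (px, py) pair. With two solutions per neutrino
    there are four combinations, which are summed.
    """
    total = 0.0
    for px1, py1 in nu1_solutions:
        for px2, py2 in nu2_solutions:
            dx = met_x - (px1 + px2)
            dy = met_y - (py1 + py2)
            total += (math.exp(-0.5 * (dx / sigma_x) ** 2)
                      * math.exp(-0.5 * (dy / sigma_y) ** 2))
    return total


def refine_scan(weight, lo, hi, first_step=3.0, final_step=0.03, shrink=10.0):
    """Scan a mass range on a grid, then rescan around the maximum.

    The step starts at first_step and is divided by shrink (an assumed
    factor) until final_step is reached.
    """
    step = first_step
    while True:
        n_steps = int(round((hi - lo) / step))
        grid = [lo + i * step for i in range(n_steps + 1)]
        best = max(grid, key=weight)
        if step <= final_step * (1.0 + 1e-9):
            return best
        lo, hi = best - step, best + step
        step /= shrink


def toy_weight(mass):
    """Toy stand-in for the full NWA weight as a function of assumed mass."""
    return math.exp(-0.5 * ((mass - 172.4) / 12.0) ** 2)


m_reco = refine_scan(toy_weight, 100.0, 350.0)

# One combination matches the measured missing ET, the other three do not.
toy_total = met_weight([(10.0, 0.0), (1000.0, 0.0)],
                       [(5.0, 5.0), (-3000.0, 5.0)],
                       met_x=15.0, met_y=5.0, sigma_x=10.0, sigma_y=10.0)
```

With the toy weight curve the scan passes through step sizes of 3, 0.3 and 0.03 GeV and returns a mass within one final step of the peak.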

Distributions of MtNWA for three true top masses are shown below for the two Dilepton subsamples.


    To improve the sensitivity of the measurement in the Dilepton channel we also use HT, the scalar sum of the transverse momenta of the jets and leptons and the missing transverse energy. The distributions of HT are shown below.
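
As a small illustration of the HT definition and of the Dilepton boundary cut on HT (200 < HT < 800 GeV), a sketch with hypothetical function names and toy input values could look like this:

```python
def compute_ht(jet_ets, lepton_pts, missing_et):
    """Scalar sum of jet Et, lepton pT and missing Et, all in GeV."""
    return sum(jet_ets) + sum(lepton_pts) + missing_et


def passes_ht_boundary(ht, low=200.0, high=800.0):
    """Boundary cut on HT applied to the Dilepton sample."""
    return low < ht < high


# Toy event: two jets, two leptons and missing Et (values are illustrative)
toy_ht = compute_ht([60.0, 45.0], [35.0, 28.0], 52.0)
toy_pass = passes_ht_boundary(toy_ht)
```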


    Backgrounds for the Lepton+Jets channel

    The background sources and their expected fraction of the total background are given in the table below. The backgrounds are dominated by real W boson production in association with high-pt jets. The absolute normalization of W+jets is determined from the data, but the relative normalization between the different flavor samples is taken from MC. The expected number of events for single-top and diboson background are taken from theoretical cross-sections and MC predictions. The table below shows the expected numbers of events for signal and each background source.

    Backgrounds for the Dilepton channel

    The major backgrounds for the Dilepton channel are the Drell-Yan process, diboson production and fakes, where a jet mimics a lepton.
    The Drell-Yan background is notoriously hard to model given the fact that the signal selection uses a Z-veto. We use more than 50 'matched' Alpgen+Pythia samples which cover on-peak and off-peak regions as well as associated light flavour and heavy flavour jet production. We remove events with heavy flavour jets generated by Pythia showering from light flavor samples and some heavy flavour samples.
    We model the fakes background using data. We select events from the W+jets dataset requiring one isolated lepton. We apply a dilepton veto to eliminate ttbar events. We require that a lepton object likely to be a fake be present. All other selection criteria are applied. The remaining events are reconstructed with NWA and form the fake background shape.
    Expected numbers of events for signal and background are shown in the table below.

    Kernel Density Estimation:

    We use a non-parametric approach based on Kernel Density Estimation to form probability density functions from fully simulated Pythia MC. The probability for an event with an observable x is given by the linear sum of contributions from all entries in the MC:
    KDE 2 image

    Here, f(x) is the probability to observe x given some MC sample with known mass and JES (or the background). The kernel function K is a normalized function that adds varying probability to a measurement at x depending on its distance from xi. The smoothing parameter h is a number that determines the width of the kernel. Larger values of h smooth out the density estimate, and smaller values of h keep most of the probability weight near xi. We use an adaptive method in which the value of h depends on the density at each MC entry, h = h(f(x_i)). In the peak of the distribution, we use less smoothing. In the tails of the distribution, where statistics are poor and we are sensitive to statistical fluctuations, we use a larger amount of smoothing. KDE can be extended to two dimensions by multiplying together two kernels:
    KDE 4 image
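
The adaptive estimate described above can be sketched in a few lines. This is an illustration under assumptions rather than the analysis implementation: the kernel is taken to be Gaussian, the per-entry bandwidth follows a square-root law in a fixed-bandwidth pilot estimate, and the function names, the toy sample and the base bandwidth are hypothetical.

```python
import math


def gaussian_kernel(u):
    """Normalized Gaussian kernel (an assumed choice of K)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def fixed_kde(x, sample, h):
    """Fixed-bandwidth estimate f(x) = 1/(n h) * sum_i K((x - x_i) / h)."""
    return sum(gaussian_kernel((x - xi) / h) for xi in sample) / (len(sample) * h)


def adaptive_bandwidths(sample, h0):
    """Per-entry bandwidths h_i: smaller where the pilot density is high.

    Assumes h_i = h0 * sqrt(g / f_pilot(x_i)), with g the geometric mean
    of the pilot density over the sample.
    """
    pilot = [fixed_kde(xi, sample, h0) for xi in sample]
    g = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    return [h0 * math.sqrt(g / p) for p in pilot]


def adaptive_kde(x, sample, bandwidths):
    """One-dimensional adaptive estimate."""
    return sum(gaussian_kernel((x - xi) / hi) / hi
               for xi, hi in zip(sample, bandwidths)) / len(sample)


def adaptive_kde_2d(x, y, sample_xy, hx, hy):
    """Two-dimensional estimate built from a product of two kernels."""
    return sum(gaussian_kernel((x - xi) / hxi) / hxi
               * gaussian_kernel((y - yi) / hyi) / hyi
               for (xi, yi), hxi, hyi in zip(sample_xy, hx, hy)) / len(sample_xy)


# Toy sample of reconstructed masses (illustrative values only)
mc_masses = [165.0, 168.0, 170.0, 171.0, 172.0, 173.0, 175.0, 180.0, 190.0]
bandwidths = adaptive_bandwidths(mc_masses, h0=3.0)
density_at_172 = adaptive_kde(172.0, mc_masses, bandwidths)
```

In the toy sample the isolated entry at 190 receives a larger bandwidth than the entries near the peak, which is the behaviour described in the text.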

    The two-dimensional density estimates for an input signal mass of 170 GeV/c2 and JES=0.0 for the Lepton+Jets subsamples are shown here:

    In the Dilepton channel the MtNWA and HT variables are correlated. This is captured by the KDE technique as shown below:

    We apply this technique to obtain a pdf for each (Mtop, JES) Monte Carlo sample that was generated, as well as for the background samples generated at a range of JES values. Note that the KDE technique is applied separately to the background subsamples, taking into account MC event weights; the results are then added using the relative subsample normalizations to form the background pdfs. The Lepton+Jets background pdf for JES=0.0 is shown below:

    The plots below show the Dilepton subsample backgrounds:

    Likelihood and Local Polynomial Smoothing

    We maximize the extended likelihood with respect to the top mass, the JES and the signal and background expectations to obtain the measurement as well as its statistical uncertainty. The form of the likelihood for subsample k is shown below.
    where ns and nb are signal and background expectations and N is the number of events in the subsample, Psig is the signal probability density function and Pbg is the background probability density function. mi and yi denote mtreco and mjj or MtNWA and HT depending on the sample. nb0 is the a-priori background estimate and sigma_n_b^0 is the uncertainty on that estimate.
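
For orientation, a commonly used form of an extended likelihood that is consistent with the symbols defined above is written out here. It is an assumed reconstruction from those definitions, not a transcription of the original expression, which may differ in detail (for the Dilepton-only fit there is no JES dependence).

```latex
\mathcal{L}_k =
  \exp\!\left(-\frac{(n_b - n_b^0)^2}{2\,\sigma_{n_b^0}^2}\right)
  \times \frac{(n_s + n_b)^N \, e^{-(n_s + n_b)}}{N!}
  \times \prod_{i=1}^{N}
  \frac{n_s\, P_{sig}(m_i, y_i;\, M_{top}, \mathrm{JES})
      + n_b\, P_{bg}(m_i, y_i;\, \mathrm{JES})}{n_s + n_b}
```

The first factor constrains the background expectation to its a-priori estimate, the second is the Poisson term for observing N events, and the product runs over the events in subsample k.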
    Kernel density estimation only allows the probability density function to be calculated at the values of the top mass and jet energy scale where Monte Carlo samples are available. To evaluate the pdf at arbitrary Mtop and JES for each event we use local polynomial smoothing. A fit to a quadratic polynomial is performed using the values of the pdf calculated with the KDE method. Points near the required value have a higher weight than points far from it. Deweighting is performed using a 'tricubic' function with a width of 10 GeV for the Lepton+Jets samples and 15 GeV for the Dilepton samples in the Mtop direction, and 0.8 sigma_c in the JES direction. The value of the quadratic fit at the required (Mtop, JES) point is used as the value of the pdf.
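
A one-dimensional version of this smoothing, in the Mtop direction only, can be sketched as below. It is a simplified illustration: the analysis smooths in both Mtop and JES, and the tricube form (1 - |u|^3)^3 with zero weight beyond the quoted width, the mass grid and the toy pdf values are assumptions made for the example.

```python
def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 inside |u| < 1, zero outside."""
    a = abs(u)
    return (1.0 - a ** 3) ** 3 if a < 1.0 else 0.0


def solve_linear(a, b):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        rest = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - rest) / m[r][r]
    return x


def local_quadratic_value(m0, masses, values, width):
    """Weighted quadratic fit around m0; returns the fitted value at m0.

    Needs at least three grid points with non-zero weight.
    """
    s = [[0.0] * 3 for _ in range(3)]
    t = [0.0] * 3
    for mass, value in zip(masses, values):
        d = mass - m0                      # centre the fit on m0
        w = tricube(d / width)
        basis = (1.0, d, d * d)
        for i in range(3):
            t[i] += w * basis[i] * value
            for j in range(3):
                s[i][j] += w * basis[i] * basis[j]
    coefficients = solve_linear(s, t)
    return coefficients[0]                 # value of the polynomial at d = 0


def toy_pdf(mass):
    """Toy pdf value versus generated mass (exactly quadratic)."""
    return 0.01 + 0.002 * (mass - 170.0) - 0.0001 * (mass - 170.0) ** 2


grid_masses = [160.0 + 2.5 * i for i in range(9)]   # hypothetical mass grid
grid_values = [toy_pdf(m) for m in grid_masses]
smoothed = local_quadratic_value(171.3, grid_masses, grid_values, width=10.0)
```

Because the toy input is itself quadratic, the smoothed value reproduces it at the requested mass, which makes the sketch easy to check.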

    Method validation

    To ensure that the method is unbiased and that the estimate of the statistical uncertainty is valid, we perform ensemble tests. We repeatedly draw events from the signal and background models, mimicking possible variations of signal and background numbers that may occur in data. A mass measurement is performed on each of these pseudo-datasets. Knowing the Mtop and JES of the dataset from which the signal events were drawn, we can form residuals (M_top_fitted-M_top_MC) and pulls ((M_top_fitted-M_top_MC)/returned uncertainty), as well as similar quantities for the JES calibration. Ideal performance would yield residual and pull distributions centered at 0, with a pull width of 1. We can estimate the expected statistical+JES uncertainty (or just the statistical uncertainty in the case of the DIL-only measurement) from the RMS of the fitted mass distribution or by considering the median returned error. The results of the ensemble tests are shown below:
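
The residual and pull bookkeeping can be illustrated with a toy ensemble. In this sketch the full template fit is replaced by a Gaussian draw around the true mass, so the numbers only demonstrate the definitions. The resolution, the number of pseudo-experiments, the seed and the function names are hypothetical.

```python
import random
import statistics


def run_toy_ensemble(true_mass, n_pseudo, resolution, seed):
    """Generate toy pseudo-experiments and return residuals, pulls, errors.

    The fitted mass is drawn from a Gaussian as a stand-in for the real
    fit, and the returned uncertainty is set equal to the resolution.
    """
    rng = random.Random(seed)
    residuals, pulls, errors = [], [], []
    for _ in range(n_pseudo):
        fitted_mass = rng.gauss(true_mass, resolution)
        returned_uncertainty = resolution
        residuals.append(fitted_mass - true_mass)
        pulls.append((fitted_mass - true_mass) / returned_uncertainty)
        errors.append(returned_uncertainty)
    return residuals, pulls, errors


residuals, pulls, errors = run_toy_ensemble(
    true_mass=172.0, n_pseudo=5000, resolution=1.7, seed=1)

mean_residual = statistics.mean(residuals)
pull_mean = statistics.mean(pulls)
pull_width = statistics.pstdev(pulls)
expected_uncertainty = statistics.pstdev(residuals)   # RMS of fitted masses
median_error = statistics.median(errors)
```

For an unbiased method with a valid uncertainty estimate, the pull mean is compatible with 0 and the pull width with 1. A pull width above 1 would motivate scaling up the returned uncertainty.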

    The bias check was performed for 14 mass points at JES=0.0 sigma_c and for three mass points at non-zero JES. The color code for the JES values is provided in the legend below:
    The pull width for the Combined and Lepton+Jets fits departs from 1, thus we scale up the uncertainty returned from the fit by 3%. We only use the bias check performed at JES=0.0 to determine bias and pull width scaling.

    Systematic uncertainties

    The contributions to the systematic uncertainty for the Combined, Lepton+Jets and Dilepton fits are shown below. The dominant effect on the Combined and Lepton+Jets fit is the b-jet energy scale which captures the differences in modelling of the b-jets as opposed to modelling of the light-flavor jets. We model the jet energy scale as a single parameter, which is an over-simplification resulting in the Residual JES uncertainty - the second largest source of uncertainty. Since the Dilepton-only measurement has no in-situ calibration the JES uncertainty becomes dominant for this channel.

    Fit and results

    We perform a fit to the data using both the Lepton+Jets and Dilepton channels and measure Mtop = 171.9 +/- 1.7 (stat.+JES) +/- 1.0 (syst) GeV/c2 = 171.9 +/- 2.0 GeV/c2. The likelihood contours for the combined fit are shown below:
    In the Lepton+Jets-only fit we measure Mtop = 171.8 +/- 1.9 (stat.+JES) +/- 1.0 (syst) GeV/c2 = 171.8 +/- 2.1 GeV/c2. The likelihood contour plot for the Lepton+Jets-only fit is shown below:
    In the Dilepton-only fit we obtain Mtop = 171.6 +3.4-3.2 (stat.) +/- 3.8 (syst) GeV/c2 = 171.6 +5.1-5.0 GeV/c2. The likelihood profile is shown below:
    We perform pseudoexperiments using the observed number of events to evaluate the probability of obtaining the uncertainty found in data. Results are shown below.

    The reconstructed top mass distribution from the data, with the fitted signal and background templates overlaid, is shown below.

    Wojciech Fedorko for the TMT group

    Last modified Feb 18, 2008