MTM3: A Top Mass Measurement in the Lepton + Jets Channel with 4.8 fb-1

Lina Galtieri, Paul Lujan, Jeremy Lys (LBNL), John Freeman (FNAL), Jason Nielsen (UC Santa Cruz), Igor Volobouev (Texas Tech)

Measured value:
mt = 172.8 ± 0.7 (stat.) ± 0.6 (JES) ± 0.8 (syst.) GeV/c2 = 172.8 ± 1.3 (total) GeV/c2




We report an updated measurement of the top quark mass obtained from proton-antiproton collisions at a center-of-mass energy of 1.96 TeV at the Fermilab Tevatron using the CDF II detector. We calculate a signal likelihood using a matrix element integration method with a Quasi-Monte Carlo integration to take into account finite detector resolution and quark mass effects. We use a neural network discriminant to distinguish signal from backgrounds. Our overall signal probability is a 2-D function of mt and ΔJES, where ΔJES is a shift applied to all jet energies in units of the jet-dependent systematic error. We apply a cut to the peak value of individual event likelihoods in order to reduce the effect of badly reconstructed events. This measurement updates our previous measurements to use a dataset corresponding to 4.8 fb-1 of integrated luminosity, requiring events with a lepton, large missing ET, and exactly four high-energy jets in the pseudorapidity range |η| ≤ 2. In addition, for this analysis, we add a new class of events containing loose muons to increase the total data sample. We require that at least one of the jets is tagged as coming from a b quark, and observe 1070 total events before and 918 events after applying our likelihood cut. We find mt = 172.8 ± 0.7 (stat.) ± 0.6 (JES) ± 0.8 (syst.) GeV/c2, or mt = 172.8 ± 1.3 (total) GeV/c2.

Event selection

In our analysis, we look for events in which a ttbar pair is produced, each top quark decays into a W boson and a b quark, and then one W decays into a lepton (meaning, in this paper, an electron or muon) and a neutrino while the other W decays into a quark-antiquark pair; this is called the "lepton + jets" channel.

We identify top mass candidates in this channel by requiring four high energy jets from the four quarks and a W decay into a lepton and a neutrino. Specifically, for the lepton we require either an identified electron with ET > 20 GeV, an identified muon with pT > 20 GeV/c in the central region of the detector, or a loose muon with pT > 20 GeV/c, where a loose muon is a muon obtained not by the standard central muon trigger, but rather using a missing ET trigger; this allows us to accept muons in regions of the detector not covered by the main muon systems. The neutrino is identified by requiring a missing ET > 20 GeV in the event. For the jets, we require exactly 4 jets with ET > 20 GeV and |η| ≤ 2, where the jet energies have been corrected for non-uniform detector response, calorimeter stability, and nonlinear response to particle momenta. The missing ET is also corrected for muons and jet response. In addition, at least one of the jets must be tagged as a b-jet using a secondary vertex tagging algorithm. With these selection criteria, we observe a total of 1070 events in the data.

The background to this signal consists of three main sources: events where a W is produced in conjunction with heavy flavor quarks, events where a W is produced with light flavor quarks which are mistagged, and QCD events where a jet is misidentified as an electron. There are also smaller contributions from single top production, diboson (WW, WZ, or ZZ) production, and events with a Z decaying into a charged lepton pair in association with jets. To save time, we do not use separate samples for the Z+jets background, but rather increase the contribution from W+light to include them.

We use a variety of Monte Carlo samples to test and calibrate our method and evaluate the backgrounds to ttbar production. For signal events, we use ttbar events generated at a variety of top masses from 160 GeV/c2 to 184 GeV/c2 by the PYTHIA generator. The non-W QCD background is derived from data with non-isolated leptons, while the other backgrounds are generated using the ALPGEN generator with parton showering by PYTHIA, except for the single top samples which are generated using MadEvent with parton showering by PYTHIA at a top mass of mt = 172.5 GeV/c2, and the diboson samples, which are generated entirely with PYTHIA. Overlaps in the W+jets samples for different parton multiplicities are removed using the ALPGEN jet-parton matching along with a jet-based heavy flavor overlap removal algorithm. We also use events generated with the HERWIG generator as a cross-check.

The background estimate is shown in the table below:

CDF Run II Preliminary, 4.8 fb-1
Event type                           1 tag             ≥ 2 tags
non-W QCD                            44.5 ± 38.6       3.8 ± 4.0
W+light mistag                       40.7 ± 10.1       0.8 ± 0.3
diboson (WW, WZ, ZZ)                 10.6 ± 1.1        1.0 ± 0.1
Z → ℓℓ + jets                        8.5 ± 1.2         0.7 ± 0.1
W+bb                                 54.6 ± 20.7       10.5 ± 3.5
W+cc                                 33.5 ± 11.5       1.5 ± 0.5
W+c                                  16.5 ± 5.7        0.7 ± 0.3
Single top                           8.7 ± 0.7         2.6 ± 0.2
Total background                     217.6 ± 56.9      21.6 ± 7.8
Predicted top signal (σ = 7.4 pb)    644.2 ± 107.5     238.7 ± 36.8
Events observed                      859               211
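
As a quick arithmetic cross-check of the table, the short sketch below sums the central values of the individual background sources and compares the expected yields with the observed counts. It is only an illustration: the per-source uncertainties are not combined here, since the table does not state how they are correlated, and the variable names are ours.

```python
# Central values copied from the background table: (1 tag, >= 2 tags).
background = {
    "non-W QCD": (44.5, 3.8),
    "W+light mistag": (40.7, 0.8),
    "diboson": (10.6, 1.0),
    "Z+jets": (8.5, 0.7),
    "W+bb": (54.6, 10.5),
    "W+cc": (33.5, 1.5),
    "W+c": (16.5, 0.7),
    "single top": (8.7, 2.6),
}
signal = (644.2, 238.7)   # predicted top signal
observed = (859, 211)     # events observed in data

# Sum the background sources per tag category.
total_bkg = [sum(v[i] for v in background.values()) for i in range(2)]
# Expected yield = background + predicted signal.
expected = [total_bkg[i] + signal[i] for i in range(2)]
# Expected background fraction in each tag category.
bkg_fraction = [total_bkg[i] / expected[i] for i in range(2)]

for i, label in enumerate(("1 tag", ">= 2 tags")):
    print(f"{label}: background {total_bkg[i]:.1f}, expected {expected[i]:.1f}, "
          f"observed {observed[i]}, background fraction {bkg_fraction[i]:.3f}")
```

The summed backgrounds reproduce the "Total background" row, and the expected background fraction is noticeably smaller in the ≥ 2 tag sample than in the 1-tag sample.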

Signal likelihood calculation

Our signal likelihood calculation is performed by integrating over the matrix element using the following formula:

[Equation: likelihood integration formula]


This likelihood gives us the probability that we observe in our detector an event with kinematic variables y as a function of the true top mass mt and the jet energy scale shift parameter ΔJES by integrating over the unknown parton-level quantities x. Specifically:

We assume that the lepton momentum is well-measured, leaving us 19 dimensions in our phase space Φ. In the past, we made further assumptions to reduce the dimensionality of this phase space. However, in this analysis, we use Quasi-Monte Carlo integration, which allows us to integrate over all 19 variables. Quasi-Monte Carlo integration differs from standard Monte Carlo integration in that it uses quasi-random sequences to generate points. Formally, a quasi-random sequence is one with a low discrepancy, where the discrepancy is a measure of the nonuniformity of the sequence. This allows an improvement in the convergence rate of the integral over the 1/√N convergence of standard Monte Carlo integration.
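
To illustrate the difference between the two integration strategies, the following minimal sketch integrates a toy 5-dimensional function (whose exact integral is 1) with a Halton low-discrepancy sequence and with ordinary pseudo-random points. This is not the integration code used in the analysis: the choice of the Halton sequence, the toy integrand, the number of dimensions, and all function names are assumptions made for the example.

```python
import math
import random

PRIMES = (2, 3, 5, 7, 11)  # one prime base per toy integration dimension


def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-b digits of index about the radix point."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_point(index):
    """Point number `index` (starting at 1) of the Halton low-discrepancy sequence."""
    return [radical_inverse(index, base) for base in PRIMES]


def integrand(point):
    """Toy integrand on the unit cube; each factor integrates to 1, so the exact integral is 1."""
    return math.prod(0.5 * math.pi * math.sin(math.pi * x) for x in point)


def estimate(points):
    """Plain average of the integrand over the supplied points."""
    values = [integrand(p) for p in points]
    return sum(values) / len(values)


n_points = 4096
rng = random.Random(12345)  # fixed seed so the pseudo-random estimate is reproducible
qmc_estimate = estimate([halton_point(i) for i in range(1, n_points + 1)])
mc_estimate = estimate([[rng.random() for _ in PRIMES] for _ in range(n_points)])

print(f"quasi-Monte Carlo estimate: {qmc_estimate:.5f}")
print(f"standard Monte Carlo estimate: {mc_estimate:.5f}")
```

For smooth integrands such as this one, the low-discrepancy estimate is typically closer to the exact value than the pseudo-random estimate at the same number of points, which is the behavior that motivates the method.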

The transfer functions relate the observed jets to the parton-level quantities. We construct our transfer functions from Monte Carlo events by matching the simulated jets to their parent partons (or "proto-jets"). In our analysis, we factorize the transfer functions into separate momentum and angular parts. The momentum transfer functions are built as probability distributions of the ratio of the pT of the jet to the pT of the parton, while the angular transfer functions are built as probability distributions of Δη and Δφ, the differences between the η and φ of the jet and the parton. Both the momentum and angular transfer functions are built with dependence on the proto-jet pT and mass, and there are separate transfer functions built for each of 4 separate bins of jet η as well as for b and light quarks. A sample momentum (left) and angular (right) transfer function are shown below.

[Figures: sample momentum transfer function (left) and sample angular transfer function (right)]

Background handling

In order to distinguish between signal and background events, we employ a neural network discriminant. Our discriminant uses ten variables: the pT of the 4 leading jets, the missing ET, the lepton ET, HT (the scalar sum of the jet transverse energies, missing energy, and lepton energy), aplanarity, DR, and HTZ. The neural network is trained on Monte Carlo events with a signal mass of 170 GeV/c2 against a W+bbbar background and then checked to see that the output does not change significantly with different signal masses and background types. The result is shown below.

[Figure: neural network discriminant output]
Neural network discriminant. The solid lines show the output for different signal masses, while the dashed lines show the output for different background types.

Overall our discriminant shows good stability with respect to signal mass and background types. For a given event, we calculate the background fraction for that event fbg(q) = B(q)/(S(q)+B(q)), where q is the neural network output for that event. Note that the distributions for B(q) and S(q) are normalized to the overall expected background and signal fractions.
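
The per-event background fraction can be sketched as a simple template lookup. The example below assumes, purely for illustration, that S(q) and B(q) are stored as binned templates over a discriminant range of 0 to 1, already normalized to the expected signal and background yields; the binning, the range, the toy numbers, and the function name are all hypothetical.

```python
def background_fraction(q, signal_template, background_template, q_min=0.0, q_max=1.0):
    """Return f_bg(q) = B(q) / (S(q) + B(q)) from binned templates.

    Both templates must share the same binning and be normalized to the
    expected signal and background yields (assumed layout, not the
    analysis code).
    """
    n_bins = len(signal_template)
    width = (q_max - q_min) / n_bins
    # Clamp the bin index so that q == q_max falls into the last bin.
    index = min(max(int((q - q_min) / width), 0), n_bins - 1)
    s = signal_template[index]
    b = background_template[index]
    # Convention chosen for this sketch: an empty bin is treated as pure signal.
    return b / (s + b) if (s + b) > 0.0 else 0.0


# Toy three-bin templates (hypothetical numbers): background peaks at low q.
toy_signal = [10.0, 30.0, 60.0]
toy_background = [50.0, 30.0, 20.0]

for q in (0.1, 0.5, 0.9):
    print(q, round(background_fraction(q, toy_signal, toy_background), 3))
```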

Our background handling is relatively simple. Our method does not include an explicit background likelihood, as we do not integrate over the background matrix elements. Instead, we treat all events under the assumption that they are signal. Thus, we expect that when we add all of the likelihoods for the observed events, the events will contain signal and background in their expected fractions. Thus, to recover the likelihood for the signal events, we subtract off the expected contribution from the background events:

log Lsig(mt, JES) = Σi[log Li(mt, JES)] - nbg log Lavg(mt, JES | background)

where the Li are the individual likelihoods for each event and Lavg is the average likelihood for background events, as obtained from Monte Carlo. We can rewrite this slightly using the individual background fraction for each event:

log Lsig(mt, JES) = Σi[log Li(mt, JES) - fbg(qi) log Lavg(mt, JES | background)]

These two expressions are equivalent if the events follow their expected distributions. However, the advantage to using fbg per event rather than the total of nbg is that if there are more or fewer background events in our sample than expected, fbg should be able to capture some of this change.
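
The per-event form of the subtraction can be written out as a short sketch. Here each likelihood is assumed to be stored as a small 2-D grid (rows indexed by mt, columns by JES) of log-likelihood values; this storage layout, the function name, and the toy numbers are assumptions for illustration only.

```python
def subtract_background(event_log_likelihoods, event_fbg, avg_background_log_likelihood):
    """Sum log L_i - f_bg(q_i) * log L_avg(background) over events, point by point on the grid."""
    n_mass = len(avg_background_log_likelihood)
    n_jes = len(avg_background_log_likelihood[0])
    total = [[0.0] * n_jes for _ in range(n_mass)]
    for log_l, f_bg in zip(event_log_likelihoods, event_fbg):
        for a in range(n_mass):
            for b in range(n_jes):
                total[a][b] += log_l[a][b] - f_bg * avg_background_log_likelihood[a][b]
    return total


# Toy input: two events on a 1 x 2 grid (hypothetical numbers).
toy_events = [[[1.0, 2.0]], [[3.0, 4.0]]]
toy_fbg = [0.5, 0.25]
toy_avg_background = [[2.0, 4.0]]

log_l_signal = subtract_background(toy_events, toy_fbg, toy_avg_background)
print(log_l_signal)  # [[2.5, 3.0]]
```

In this toy case the per-event fractions sum to 0.75, so the result is identical to subtracting nbg = 0.75 times the average background log-likelihood from the summed event log-likelihoods, which illustrates the equivalence of the two expressions.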

In addition to background events, there is another class of events not handled well by our signal integration: events which contain a true ttbar pair, but where the four observed tight jets and/or the lepton are not produced directly from the ttbar decay. We call these "bad signal" events. They can arise in a variety of ways (extra jets from radiated gluons, misidentified dilepton or all-hadronic events, W → τ decay, etc.) and overall comprise roughly 35% of our total signal. To deal with these events, we apply a cut at 10 on the log of the peak value of each event's likelihood curve. We find that this cut removes roughly 21% of bad signal events and 27% of background events while retaining about 96% of good signal events. The table below shows the efficiency of the cut for "good signal", "bad signal", and background events. For a signal mass of 172.5 GeV/c2, 63.2% of 1-tag and 67.6% of >1-tag events are "good signal".

Type of event    Total            1-tag            >1-tag
Good signal      96.3% ± 0.2%     96.1% ± 0.2%     96.8% ± 0.3%
Bad signal       79.2% ± 0.4%     78.7% ± 0.5%     80.7% ± 0.9%
Background       72.7% ± 0.3%     72.9% ± 0.4%     70.9% ± 1.0%

Method validation and calibration

To test and calibrate our method, we perform our integration on Monte Carlo samples at a variety of signal top masses with background events included in the expected fraction. For a given top mass, 2000 pseudo-experiments (PEs) are performed, where each pseudo-experiment contains on average 924.5 events (the expected number of observed events after applying the likelihood cut) randomly drawn from the signal and background pools according to their expected fractions; the number of events drawn from each pool is Poisson-fluctuated around its mean.
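
The construction of a single pseudo-experiment can be sketched as follows. This is a minimal illustration rather than the analysis code: drawing with replacement is an assumption, the split of the 924.5 expected events into signal and background means is a placeholder chosen for the example, and the pool contents and function names are hypothetical.

```python
import math
import random


def poisson(mean, rng):
    """Poisson variate via Knuth's multiplication method.

    The mean is processed in chunks of at most 30 so that exp(-chunk) stays
    well away from underflow; a sum of independent Poisson variates is
    again Poisson with the summed mean.
    """
    count, remaining = 0, mean
    while remaining > 0.0:
        chunk = min(remaining, 30.0)
        threshold = math.exp(-chunk)
        product = rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        remaining -= chunk
    return count


def draw_pseudo_experiment(signal_pool, background_pool, mean_signal, mean_background, rng):
    """Draw Poisson-fluctuated numbers of events from each pool (with replacement, by assumption)."""
    n_signal = poisson(mean_signal, rng)
    n_background = poisson(mean_background, rng)
    return rng.choices(signal_pool, k=n_signal) + rng.choices(background_pool, k=n_background)


rng = random.Random(2024)
signal_pool = [("signal", i) for i in range(5000)]          # stand-ins for simulated events
background_pool = [("background", i) for i in range(5000)]

# Placeholder split of the 924.5 expected events; the true fractions come from the analysis.
mean_background = 0.2 * 924.5
mean_signal = 924.5 - mean_background

pseudo_experiment = draw_pseudo_experiment(
    signal_pool, background_pool, mean_signal, mean_background, rng
)
print(len(pseudo_experiment))
```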

For a given pseudo-experiment, we combine the individual event likelihoods, subtract off the expected background contribution as described above, and then extract the overall top mass using the "profile likelihood" method; that is, for each value along the mt axis, we select the value along the JES axis where the likelihood is maximized:

Lprof(mt) = maxj ∈ JES L(mt, j)

We then extract our result and statistical uncertainty from the resulting 1-D likelihood curve. For an ensemble of 2000 PEs, we then compute the measured mass (determined by the mean of the ensemble), bias, expected statistical uncertainty, and pull.
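
The profiling step and the extraction of a central value and uncertainty can be sketched on a toy grid. The text above does not specify how the peak and uncertainty are read off the 1-D curve, so the three-point parabola and the Δ(log L) = 0.5 convention used below are assumptions, as are the grid spacing, the toy likelihood, and the function names.

```python
import math


def profile_over_jes(log_likelihood):
    """For each mt row of a 2-D log-likelihood grid, keep the maximum over the JES columns."""
    return [max(row) for row in log_likelihood]


def peak_and_sigma(masses, profile):
    """Fit a parabola through the highest grid point and its two neighbours.

    Returns the parabola's maximum and the half-width at which log L drops
    by 0.5 (assumed convention). Requires a uniformly spaced mass grid.
    """
    k = max(range(len(profile)), key=profile.__getitem__)
    if k == 0 or k == len(profile) - 1:
        raise ValueError("maximum lies on the edge of the mass grid")
    step = masses[1] - masses[0]
    slope = (profile[k + 1] - profile[k - 1]) / (2.0 * step)
    curvature = (profile[k + 1] - 2.0 * profile[k] + profile[k - 1]) / step ** 2
    return masses[k] - slope / curvature, math.sqrt(-1.0 / curvature)


# Toy Gaussian log-likelihood peaked at mt = 172.8 with width 0.9 (hypothetical values).
masses = [170.0 + 0.5 * i for i in range(13)]
jes_values = [-2.0 + 0.25 * j for j in range(17)]
toy_grid = [
    [-0.5 * ((m - 172.8) / 0.9) ** 2 - 0.5 * ((j - 0.25) / 0.2) ** 2 for j in jes_values]
    for m in masses
]

profile = profile_over_jes(toy_grid)
fitted_mass, fitted_sigma = peak_and_sigma(masses, profile)
print(f"mt = {fitted_mass:.2f} +/- {fitted_sigma:.2f}")
```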

The plots below show the results of this test. The upper-left plot shows the measured mass as a function of input mass, while the upper-right plot shows the measured bias as a function of input mass. The lower-left plot shows the pull widths as a function of input mass, while the lower-right plot shows the expected uncertainty as a function of input mass.

[Figures: output mass (upper left), bias (upper right), pull widths (lower left), expected uncertainty (lower right)]

We also examine samples in which ΔJES has been shifted from its nominal value of 0. The plots below show the results of these tests. The top two show the output JES and JES pull width for mt = 172 GeV/c2, which we use to calibrate our ΔJES measurement. The bottom left plot shows the output mass vs. ΔJES for different input top masses. There is a small dependence of output mass on input JES, as the lower right plot shows; we account for this dependence in our final calibration.

[Figures: output JES (upper left), JES pulls (upper right), output mass vs. input JES (lower left), mt shift vs. input JES (lower right)]

From the results of the above tests, we calibrate our final measurement for the top mass and ΔJES. First we apply the bias and slope for the mass and ΔJES measurement, and then we use the measured slope of the output mt as a function of ΔJES as a final correction. This gives us the following formulas, where Δm = mt - 172, since our fits are centered around 172:

Δmcalib = (Δmmeas + 0.504)/0.969 - 0.34 ⋅ (ΔJES)calib
(ΔJES)calib = ((ΔJES)meas + 0.288)/0.884

We also correct the measured uncertainties using the slope and pull widths obtained:

(σm)calib = (σm)meas × 1.160/0.969
(σΔJES)calib = (σΔJES)meas × 1.057/0.884
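
The calibration formulas can be collected into a single function, as sketched below. The constants are those quoted above; the function name and the input values in the usage example are hypothetical and do not correspond to the uncalibrated data results.

```python
def calibrate(mt_meas, djes_meas, sigma_m_meas, sigma_djes_meas):
    """Apply the calibration formulas quoted in the text.

    Returns (mt, delta_JES, sigma_mt, sigma_delta_JES) after calibration.
    The mass shift is defined relative to 172 GeV/c2, as in the text.
    """
    # JES calibration first, since the mass correction uses the calibrated JES.
    djes_calib = (djes_meas + 0.288) / 0.884
    dm_meas = mt_meas - 172.0
    dm_calib = (dm_meas + 0.504) / 0.969 - 0.34 * djes_calib
    # Uncertainties are scaled by pull width over slope.
    sigma_m_calib = sigma_m_meas * 1.160 / 0.969
    sigma_djes_calib = sigma_djes_meas * 1.057 / 0.884
    return 172.0 + dm_calib, djes_calib, sigma_m_calib, sigma_djes_calib


# Hypothetical uncalibrated inputs, for illustration only.
mt, djes, sigma_m, sigma_djes = calibrate(172.0, -0.288, 1.0, 1.0)
print(f"mt = {mt:.3f}, dJES = {djes:.3f}, sigma_m = {sigma_m:.3f}, sigma_dJES = {sigma_djes:.3f}")
```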

Data results

In the 4.8 fb-1 data sample, we find a total of 1070 events passing all of our selection cuts before the likelihood cut, 859 single-tag and 211 multiple-tag events. After applying the likelihood cut, we have 720 single-tag events and 198 multiple-tag events for a total of 918 events in our final likelihood. Applying the background subtraction, taking the profile likelihood, and applying the above calibration factors, we obtain a measurement of:

mt = 172.8 ± 0.9 (stat. + JES) GeV/c2

We can separate this into a statistical uncertainty and an uncertainty due to JES by comparing it with the 1-dimensional result, which yields:

mt = 172.8 ± 0.7 (stat.) ± 0.6 (JES) GeV/c2
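
One common way to perform such a separation, assumed here for illustration, is to take the uncertainty of the 1-dimensional fit as the purely statistical component and subtract it in quadrature from the combined (stat. + JES) uncertainty. Using the rounded values quoted above:

```python
import math

combined = 0.9    # stat. + JES uncertainty from the 2-D profile fit (GeV/c2)
stat_only = 0.7   # uncertainty from the 1-dimensional result (GeV/c2)

# Quadrature subtraction (assumed procedure): sigma_JES^2 = combined^2 - stat^2
jes_component = math.sqrt(combined ** 2 - stat_only ** 2)
print(f"JES component = {jes_component:.2f} GeV/c2")
```

The result, about 0.57 GeV/c2, rounds to the quoted 0.6 GeV/c2; since the inputs are themselves rounded, the agreement is only approximate.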

We can also extract a measured value for ΔJES, which is:

ΔJES = 0.14 ± 0.20 σ

We can also perform separate measurements on the 1-tag and >1-tag samples, which yield mt = 171.9 ± 1.1 GeV/c2 and mt = 174.2 ± 1.7 GeV/c2, respectively.

The plots below show the overall likelihood in data events. The plot on the left shows the likelihood over most of the range used in our integration. The right plot shows the contours corresponding to a 1-sigma, 2-sigma, and 3-sigma uncertainty around the peak. The full 2-D calibration has been applied to both axes.

[Figures: data likelihood (left) and data contours (right)]

We can also compare the observed uncertainty with the expected uncertainty from pseudo-experiments. The plot below shows this comparison for PEs at a signal mass of 172.5 GeV/c2. 64% of pseudo-experiments had a smaller uncertainty than the uncertainty measured in data.

[Figure: expected uncertainty from PEs]

Another comparison of interest is to compare the likelihoods observed in data with the likelihoods observed in Monte Carlo, to check the validity of our likelihood cut as applied to data. The plot below shows the value of the log-likelihood at the peak of the curve for all events; the cut at 10 is shown as the dashed line on the plot. The Monte Carlo is normalized to the number of data events. Performing a Kolmogorov-Smirnov (K-S) test to check the consistency of the data and Monte Carlo indicates a confidence level of 0.88, showing good agreement.
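
For readers unfamiliar with the K-S test, the sketch below computes the two-sample K-S distance between two unbinned samples and converts it to a probability with the standard asymptotic Kolmogorov series. It is an assumption that an unbinned two-sample test of this form is comparable to the test performed on the plotted distributions; the small-sample correction factor is the commonly used textbook approximation, and the function names and toy samples are hypothetical.

```python
import bisect
import math


def ks_statistic(sample_a, sample_b):
    """Largest absolute difference between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect.bisect_right(a, x) / len(a) - bisect.bisect_right(b, x) / len(b))
        for x in a + b
    )


def ks_probability(distance, n_a, n_b):
    """Asymptotic probability of observing a K-S distance at least this large."""
    n_eff = n_a * n_b / (n_a + n_b)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * distance
    if lam < 0.2:
        # The series converges slowly here, and the probability is 1 to high precision.
        return 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2) for k in range(1, 101))
    return min(max(2.0 * series, 0.0), 1.0)


# Toy samples: identical samples agree perfectly, disjoint ones do not.
same = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
print(same, ks_probability(same, 3, 3))
print(disjoint, round(ks_probability(disjoint, 3, 3), 3))
```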

[Figure: log-likelihood value at peak]


Systematic uncertainties

Our systematic uncertainties are summarized in the table below.

CDF Run II Preliminary, 4.8 fb-1
Systematic source               Systematic uncertainty (GeV/c2)
MC generator                    0.25
ISR and FSR                     0.15
Residual JES                    0.49
Lepton PT                       0.14
Multiple hadron interactions    0.10
Background modeling             0.33
Gluon fraction                  0.03
Color reconnection              0.37
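
As an arithmetic illustration, the listed sources can be combined in quadrature, under the assumption that they are treated as uncorrelated. The sketch below uses only the values in the table; the variable names are ours.

```python
import math

# Systematic uncertainties from the table above, in GeV/c2.
systematics = {
    "MC generator": 0.25,
    "ISR and FSR": 0.15,
    "Residual JES": 0.49,
    "Lepton PT": 0.14,
    "Multiple hadron interactions": 0.10,
    "Background modeling": 0.33,
    "Gluon fraction": 0.03,
    "Color reconnection": 0.37,
}

# Quadrature sum, assuming the sources are uncorrelated.
total_systematic = math.sqrt(sum(value ** 2 for value in systematics.values()))
print(f"total systematic = {total_systematic:.2f} GeV/c2")
```

The quadrature sum of the listed sources is about 0.78 GeV/c2, consistent with the quoted systematic uncertainty of 0.8 GeV/c2.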

Here is a brief summary of the systematic uncertainties:


Conclusions

We have measured the mass of the top quark using a data sample corresponding to 4.8 fb-1 of integrated luminosity. We find a total of 918 events passing all of our cuts, from which we extract a measurement of:

mt = 172.8 ± 0.7 (stat.) ± 0.6 (JES) ± 0.8 (syst.) GeV/c2 = 172.8 ± 1.3 (total) GeV/c2