Measurement of the Top Quark Mass with In Situ Determination of the Jet Energy Scale (April 2005)


Jean-Francois Arguin, Pekka Sinervo, Si Xie (Toronto)
Erik Brubaker, Adam Gibson (UC Berkeley)
Jahred Adelman, Young-Kee Kim, Mel Shochet, Un-Ki Yang (Chicago)
Guram Chlachidze, Julian Budagov (JINR)
Giorgio Belletini (Pisa), George Velev (Fermilab)
Shinhong Kim, Taka Maruyama, Koji Sato, Tomonobu Tomura (Tsukuba)
Luc Demortier (Rockefeller)

1) Summary

We present a measurement of the top quark mass using 318 pb^-1 of data from ppbar collisions at a center-of-mass energy of 1.96 TeV collected by the CDF detector. We select ttbar events where one W boson decays leptonically and the other hadronically. The observed invariant mass of the hadronic W boson decay is used to reduce the largest systematic uncertainty, which arises from the jet energy scale. The top quark mass and hadronic W boson mass distributions reconstructed in data are compared to Monte Carlo expectations to determine simultaneously the top quark mass and the jet energy scale. We measure M_top = 173.5 +3.7/-3.6 (stat.+JES) +/- 1.3 (syst.) GeV/c^2, where the first uncertainty includes both the statistical and the jet energy scale (JES) uncertainty. This constitutes the best measurement of the top quark mass to date.

M_top = 173.5 +3.7/-3.6 (stat.+JES) +/- 1.3 (syst.) GeV/c^2 = 173.5 +3.9/-3.8 GeV/c^2.

It is possible to disentangle the statistical and JES part of the uncertainty:

M_top = 173.5 +2.7/-2.6 (stat.) +/- 2.5 (JES) +/- 1.3 (syst.) GeV/c^2
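The quoted uncertainties can be cross-checked with a few lines of arithmetic. The sketch below assumes that the components are independent and combine in quadrature; the text does not state the combination rule explicitly, and the function name is ours.

```python
import math

def combine_in_quadrature(*components):
    """Add independent uncertainty components in quadrature."""
    return math.sqrt(sum(c * c for c in components))

# Values quoted in the text (GeV/c^2).
stat_up, stat_down = 2.7, 2.6
jes = 2.5
syst = 1.3

# Statistical and JES parts recombined into the fitted (stat.+JES) uncertainty.
stat_jes_up = combine_in_quadrature(stat_up, jes)
stat_jes_down = combine_in_quadrature(stat_down, jes)

# Fitted (stat.+JES) uncertainty combined with the remaining systematics.
total_up = combine_in_quadrature(3.7, syst)
total_down = combine_in_quadrature(3.6, syst)

print(f"stat+JES: +{stat_jes_up:.1f}/-{stat_jes_down:.1f}")
print(f"total:    +{total_up:.1f}/-{total_down:.1f}")
```

Under this assumption the recombined values round to +3.7/-3.6 and +3.9/-3.8 GeV/c^2, consistent with the numbers quoted above.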

The m_t distributions for each subsample are shown below with the MC templates corresponding to the best fit overlaid. Note to the speakers: the area of each subsample on this plot is proportional to its weight in the combined fit to the data. The total area is normalized to one. The signal and background template fit functions are scaled accordingly. For an alternative plot where the histogram areas correspond to the actual number of events, see the eps and jpg version here.

Below is the plot for the Higgs mass constraint from M_W and M_top (thanks to Martin Grunewald and the LEP electroweak working group for providing these plots). The constraint on M_top comes from the EPS 2005 combination (172.7+/-2.9 GeV/c^2). The corresponding Higgs mass fit yields:

M_Higgs = 91 +45/-32 GeV/c^2

Electroweak observables (like M_top) can also be used to constrain physics beyond the standard model, like supersymmetry, as shown in the plot below. The new CDF measurement (blue curve) favors a lower SUSY mass scale than the Run I D0 measurement (black curve), which yields a higher top mass. Note that greater precision is needed for the top and W mass measurements (as well as other sensitive electroweak observables) to decisively discriminate between the Standard Model (red area) and Minimal Supersymmetric Model (green area).


2) Strategy of the Analysis

This analysis constitutes an extension of the traditional template analysis that was used, for example, for the official CDF Run I measurement. In addition to templates of the reconstructed top mass, templates of the mass of the W boson decaying hadronically (W->jj) are also considered. The goal is to reduce the uncertainty arising from the jet energy scale (JES), which is the dominant systematic uncertainty in the top quark mass measurement. The information on the jet energy scale from the W->jj decays is combined with the traditional CDF jet energy calibration to guarantee the reduction of this systematic uncertainty. A 2-D likelihood fit of the true top mass and JES is employed to determine these two parameters simultaneously.

Before performing the top mass measurement, we verify that the determination of the jet energy scale from the W->jj decays is in agreement with the CDF jet energy calibration. This is done by performing a JES cross-check where the likelihood mentioned above is used to determine solely the JES (the top mass is fixed to the spring 2005 world average of 178 GeV/c^2 when performing the JES cross-check).

3) Selection of ttbar events

The ttbar events are selected in the lepton+jets channel, i.e. with a well-identified electron or muon, at least 4 jets and large missing transverse energy. We use a sample of well-validated CDF data with an integrated luminosity totaling 318 pb^-1. The lepton+jets sample is separated into four non-overlapping subsamples with different sensitivity to the top quark mass, which are fitted separately for better statistical performance. The separation is primarily based on the number of jets with a b-tag (based on the secondary vertex tagger SECVTX): 0-tag, 1-tag and 2-tag. The 1-tag sample is further separated based on the 4th jet Et threshold: 8 < Et < 15 GeV for 1-tag(L) and Et > 15 GeV for 1-tag(T) (L and T stand for loose and tight, respectively).
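The subsample assignment described above can be sketched as a small function. This is only an illustration: the function name and arguments are ours, events with two or more tags are assumed to enter the 2-tag subsample, and any 4th-jet requirement for the 0-tag and 2-tag subsamples is not given in the text and is therefore not modelled.

```python
def classify_subsample(n_btags, fourth_jet_et):
    """Return the subsample label for a lepton+jets event.

    n_btags: number of SECVTX b-tagged jets in the event.
    fourth_jet_et: transverse energy of the 4th jet, in GeV.
    Only the 1-tag split quoted in the text is implemented.
    """
    if n_btags >= 2:          # assumption: two or more tags -> 2-tag
        return "2-tag"
    if n_btags == 0:
        return "0-tag"
    # Exactly one b-tag: split on the 4th-jet Et.
    if fourth_jet_et > 15.0:
        return "1-tag(T)"
    if 8.0 < fourth_jet_et <= 15.0:   # Et == 15 assigned to (L) by assumption
        return "1-tag(L)"
    return None               # fails the 4th-jet requirement for 1-tag events


print(classify_subsample(1, 22.0))   # 1-tag(T)
print(classify_subsample(1, 10.0))   # 1-tag(L)
```

Because each event receives exactly one label, the four subsamples are non-overlapping by construction.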

The event selection for measuring the W mass and the top mass is slightly different: no chi2 cut is applied to the W mass sample. This gives the W mass better sensitivity to the jet energy scale (the chi2 fitter for the top mass reconstruction is explained later). The efficiency of the chi2 cut is approximately 85% for signal and 70% for background. The number of events in the top mass and W mass samples for each subsample is shown below:

4) Top Mass Reconstruction

A chi2 fitter is used to reconstruct the top quark mass in each event. The fitter is based on the hypothesis that the event is ttbar: it contains W mass constraints on the hadronic and leptonic side and requires the two top masses in the event to be equal. The jet-parton assignment that yields the lowest value of the chi2 after minimization is kept for further analysis and plotted in the histograms below (for M_top = 178 GeV/c^2):
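The choice of the jet-parton assignment can be illustrated with a minimal sketch. It assumes that the four leading jets are assigned to the leptonic-side b, the hadronic-side b and the two W daughters, and that swapping the two W daughters gives the same hypothesis. The chi2 itself is not modelled: `chi2_of` is a hypothetical callable standing in for the minimized chi2 of the kinematic fit, and the toy chi2 used below has no physical meaning.

```python
from itertools import permutations

def jet_parton_assignments(n_jets=4):
    """Yield distinct (b_lep, b_had, (q1, q2)) jet-index assignments.

    Swapping the two W-daughter jets gives the same hypothesis, so only
    orderings with q1 < q2 are kept.
    """
    for b_lep, b_had, q1, q2 in permutations(range(n_jets), 4):
        if q1 < q2:
            yield b_lep, b_had, (q1, q2)

def best_assignment(chi2_of):
    """Return (chi2, assignment) for the assignment with the lowest chi2."""
    return min((chi2_of(a), a) for a in jet_parton_assignments())

# Toy chi2 (illustrative only): prefers jet 2 as b_lep and jet 0 as b_had.
toy_chi2 = lambda a: abs(a[0] - 2) + abs(a[1] - 0)

n_assignments = len(list(jet_parton_assignments()))
best_chi2, best = best_assignment(toy_chi2)
print(n_assignments, best)
```

With four jets this enumeration gives 12 distinct assignments; any additional multiplicity in the real fitter (for example from the neutrino reconstruction) is outside the scope of this sketch.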

Template of the reconstructed top mass (m_t)

The reconstructed top mass templates are built as a function of the true top quark mass (M_top) and the jet energy scale as shown below (for the 0-tag subsample). Analytical probability density functions of m_t as a function of M_top and JES are extracted by fitting the multidimensional templates. The result of the fit is overlaid in green in the histograms below, and is illustrated in the two-dimensional histograms below (for the 1-tag(T) subsample).

The unit of JES chosen is 1 sigma as defined by the CDF Jet Energy and Resolution (JER) Group (see public page here). This unit is chosen to facilitate the combination of the JES from W->jj and the JER group. This uncertainty varies as a function of the jet eta and pt. Integrated over the ttbar jet pt and eta spectrum, it corresponds to an uncertainty of approximately 3%.

5) W mass reconstruction

To measure the jet energy scale in situ in ttbar events, mass templates of the W boson decaying hadronically (m_jj) are constructed in addition to the top mass templates. The mass reconstruction method is different: the chi2 fitter for top mass reconstruction is not employed. Instead, the invariant mass of every pair of jets among the 4 highest-Et jets in which neither jet is b-tagged enters the templates. Therefore, the 2-, 1- and 0-tag templates contain 1, 3 and 6 m_jj values per event, respectively. The W->jj mass distributions are shown below for M_top = 178 GeV/c^2 and nominal JES.
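The counting of m_jj entries per event follows directly from the number of untagged jet pairs. A minimal sketch, assuming jets are given as (E, px, py, pz) four-vectors already ordered by decreasing Et (the function names and the toy jets are ours):

```python
import math
from itertools import combinations

def invariant_mass(p1, p2):
    """Invariant mass of two four-vectors given as (E, px, py, pz)."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))

def dijet_masses(jets, btagged):
    """m_jj for every pair of untagged jets among the 4 leading-Et jets.

    jets: four-vectors (E, px, py, pz), ordered by decreasing Et.
    btagged: matching list of booleans.
    """
    untagged = [j for j, tag in zip(jets[:4], btagged[:4]) if not tag]
    return [invariant_mass(a, b) for a, b in combinations(untagged, 2)]

# Toy massless jets (illustrative values only).
jets = [(50.0, 50.0, 0.0, 0.0), (40.0, -40.0, 0.0, 0.0),
        (30.0, 0.0, 30.0, 0.0), (20.0, 0.0, -20.0, 0.0)]
for tags in ([False] * 4, [True, False, False, False],
             [True, True, False, False]):
    print(sum(tags), "tag(s):", len(dijet_masses(jets, tags)), "m_jj entries")
```

With 4, 3 and 2 untagged jets there are 6, 3 and 1 pairs, which reproduces the multiplicities quoted for the 0-, 1- and 2-tag templates.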

Template of the reconstructed W mass (m_jj)

Similarly to the top mass templates, the W mass templates are constructed as a function of M_top and JES. As can be seen in the two-dimensional histograms below, the m_jj are relatively insensitive to M_top but are sensitive to JES, which allows a determination of the jet energy scale that depends only weakly on M_top.

7) Background Contamination

The background sources and their expected number of events are given in the table below (for the m_jj sample). It is dominated by real W boson production in association with high-pt jets. This a priori information on the expected number of background events is used as a constraint in the likelihood fit. No background estimate currently exists for the 0-tag subsample, so the number of background events is left unconstrained for that subsample.

The background top and W templates, made of the sources shown in the table above in the expected proportions, are shown below. They are taken to be independent of M_top and JES in the fit. Their dependence on JES is small and is treated as a systematic uncertainty.

8) Test of the JES Cross-Check Fit

As mentioned before, the JES information from the traditional CDF JES calibration and the W->jj decays is combined in the M_top measurement for optimal performance. Before doing so, we cross-check that the JES in data is in agreement with MC expectations based solely on the m_jj templates. This is done using an unbinned likelihood fit that fits the data m_jj distributions using the MC templates. The top quark mass is fixed to the world average 178 GeV/c^2 in this fit.

Before performing the fit in the data, we use a pseudo-experiment procedure to check that the mean and uncertainties of the fitted JES obtained from the JES cross-check are unbiased. This is illustrated in the pull mean and width distributions as a function of the input JES shown below. The uncertainties obtained in the data are scaled by a factor of 1.027, which corresponds to the average size of the pull width.
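The pull check can be sketched as follows. The pull of a pseudo-experiment is taken here as (fitted value - input value) / fitted uncertainty, which is the usual definition and is assumed rather than quoted from the text. The toy pseudo-experiments below are generated so that the true spread is 1.027 times the reported uncertainty; they are illustrative only and are not the analysis pseudo-experiments.

```python
import random
import statistics

def pull_summary(true_value, fits):
    """Mean and width of the pulls (fitted - true) / fitted uncertainty.

    fits: iterable of (fitted_value, fitted_uncertainty) pairs, one per
    pseudo-experiment.
    """
    pulls = [(value - true_value) / sigma for value, sigma in fits]
    return statistics.mean(pulls), statistics.stdev(pulls)

# Toy pseudo-experiments: the "fit" reports an uncertainty of 1.0 while
# the fitted values actually fluctuate with a spread of 1.027.
rng = random.Random(1)
true_jes, reported_sigma = 0.0, 1.0
toys = [(rng.gauss(true_jes, 1.027 * reported_sigma), reported_sigma)
        for _ in range(20000)]

pull_mean, pull_width = pull_summary(true_jes, toys)
scaled_sigma = reported_sigma * pull_width  # uncertainty after scaling
print(f"pull mean = {pull_mean:.3f}, pull width = {pull_width:.3f}")
```

A pull mean compatible with 0 indicates an unbiased central value, and a pull width above 1 indicates that the reported uncertainty is slightly too small, which is what the scale factor corrects.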

9) Test of the M_top Measurement Fit

After having performed the JES cross-check, the full 2-dimensional top mass measurement is applied. The 2-D fit determines simultaneously the top mass and the JES and takes into account the correlations between these two observables. For optimal performance, the JES information from the traditional CDF estimate is combined with the one obtained from the W->jj decays. This is done by including a Gaussian constraint in the likelihood fit that constrains the JES parameter to 0 with a width of 1 (as described above, the unit of JES is one sigma, the uncertainty defined by the CDF jet energy and resolution group).
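In terms of the -log likelihood, a Gaussian constraint with mean 0 and width 1 adds a term JES^2 / 2. A minimal sketch, in which the template part of the likelihood is represented by a plain number (`nll_data`, a hypothetical input) and constant terms are dropped:

```python
def constrained_nll(nll_data, jes, prior_mean=0.0, prior_width=1.0):
    """-log L including the Gaussian constraint on JES.

    nll_data: -log likelihood of the m_t and m_jj templates at this
    (M_top, JES) point; the template part is not modelled here.
    jes: JES parameter in units of sigma.
    Constant terms of the Gaussian are dropped.
    """
    return nll_data + 0.5 * ((jes - prior_mean) / prior_width) ** 2


# Moving JES one sigma away from the a priori value costs 0.5 units.
print(constrained_nll(10.0, 0.0), constrained_nll(10.0, 1.0))
```

Dropping the constraint term corresponds to a fit in which the JES is determined from the W->jj decays alone.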

As for the JES cross-check fit, we use pseudo-experiments to check that the fitted M_top and JES returned by the 2-D fit have unbiased central values and uncertainties. This is illustrated below by the figures of the pull mean and width of M_top and JES as a function of the input M_top and JES. No obvious bias is observed. The uncertainties obtained in the data are scaled by factors of 1.027 and 1.013 for the fitted M_top and JES, respectively. These factors correspond to the average size of the pull widths shown below.

10) Systematic Uncertainties

The summary of the systematic uncertainties is shown below (apart from the JES, which is a result of the final fit) for the M_top and JES from the 2-D fit and the JES from the JES cross-check.

The JES estimates from the JER group and from W->jj do not give direct information on the b-jet energy scale. B-jets differ from other jets: they contain more semileptonic decays, have a harder fragmentation and experience a different colour flow than jets from the W bosons. The uncertainty arising from each of these differences has been estimated:

The +/-1 sigma variations from the 20 eigenvectors of the CTEQ6M set of PDFs have been considered. Small variations of the fitted top mass have been observed, as shown in the plot below (40 points to the right of the straight line). The PDF uncertainties sum to 0.3 GeV/c^2.
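The text does not say how the 40 variations are combined. One common convention for Hessian PDF sets such as CTEQ6M is the symmetric formula Delta = 1/2 sqrt( sum_i (X_i^+ - X_i^-)^2 ), summed over the eigenvectors. The sketch below assumes that convention; the function name is ours and the input values are hypothetical, not the shifts measured in this analysis.

```python
import math

def hessian_pdf_uncertainty(values_up, values_down):
    """Symmetric PDF uncertainty from paired eigenvector variations.

    values_up[i], values_down[i]: fitted M_top (or its shift relative to
    the central fit) with eigenvector i moved by +1 and -1 sigma.
    """
    if len(values_up) != len(values_down):
        raise ValueError("need one +/- pair per eigenvector")
    return 0.5 * math.sqrt(sum((up - down) ** 2
                               for up, down in zip(values_up, values_down)))


# Hypothetical shifts (GeV/c^2) for two eigenvectors, for illustration only.
example = hessian_pdf_uncertainty([0.3, 0.4], [-0.3, -0.4])
print(f"{example:.2f}")
```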

The non-W (QCD) background is assumed to be modelled by W+light jets MC in the likelihood fit. This assumption is checked by considering the background templates made of fake electrons in data (shown below). By performing pseudo-experiments from these background templates, we estimate the uncertainty from the shape of the non-W background to be 0.1 GeV/c^2.

11) Results on data: JES cross-check

The application of the JES cross-check fit to the data (with M_top fixed to 178 GeV/c^2) yields:

JES = -0.76 +/- 1.00 (stat.) sigma.

Considering the systematic uncertainties:

JES = -0.76 +/- 1.27 sigma.
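The two quoted uncertainties allow a quick consistency check. The sketch below assumes that the statistical and systematic parts combine in quadrature, which is not stated explicitly in the text, and computes how far the fitted JES lies from the nominal value of 0.

```python
import math

jes_fit = -0.76      # fitted JES, in units of sigma
stat = 1.00          # statistical uncertainty
total = 1.27         # statistical and systematic uncertainties combined

# Systematic component implied by the two quoted uncertainties,
# assuming they combine in quadrature.
syst = math.sqrt(total ** 2 - stat ** 2)

# Distance of the fitted JES from the nominal value (JES = 0),
# in units of its total uncertainty.
deviation = abs(jes_fit) / total

print(f"implied syst = {syst:.2f} sigma, deviation = {deviation:.2f}")
```

Under this assumption the implied systematic uncertainty is about 0.78 sigma, and the fitted JES lies about 0.6 total standard deviations from the nominal value.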

Thus we conclude that the JES in the data is in good agreement with our MC expectations within uncertainties. The m_jj distributions and the shape of the -log likelihood are shown below.

The expected uncertainties are shown below. The expected median uncertainty is approximately 1.0 sigma. The uncertainties obtained in the data (shown by the arrows) are in good agreement with expectations: 69% and 40% of pseudo-experiments have lower negative and positive uncertainties, respectively.

The numbers of events fitted in the data are shown below. As expected, they are in good agreement with the background calculation shown above.

The results in each subsample (with the shape of the likelihood shown in the inset) are shown below. The table below summarizes the results. No significant differences are observed among the subsamples (the apparent discrepancy for 0-tag is in reality less than 2 sigma, and thus not very unlikely).

12) Results on data: M_top Measurement

The reconstructed top and W mass with the overlaid best fit for the signal and background Monte Carlo expectations are shown below:

The application of the full 2-D fit to the data yields:

M_top = 173.5 +3.7/-3.6 (stat.+JES) GeV/c^2

JES = -0.10 +0.78/-0.80 sigma

Considering the systematic uncertainties:

M_top = 173.5 +3.7/-3.6 (stat.+JES) +/- 1.3 (syst.) GeV/c^2

JES = -0.10 +0.88/-0.89 sigma

For the top mass, one can also disentangle the stat. and JES parts of the fitted uncertainty by re-performing the fit with the JES fixed to its fitted value. The result can then be re-expressed as:

M_top = 173.5 +2.7/-2.6 (stat.) +/- 2.5 (JES) +/- 1.3 (syst.) GeV/c^2

The 2-D shape of the -log likelihood is shown as a function of M_top and JES below. The anti-correlation between these two variables is well illustrated in this plot. The plot on the right is a different version of the same plot.

Below is shown the 2-D likelihood for each subsample. Note that the JES variable is well constrained even for subsamples with limited statistical power, like 0-tag and 1-tag(L), because of the a priori constraint on JES.

Below are shown the expected M_top stat.+JES uncertainty (left) and the combined JES uncertainty (right). The uncertainties obtained in the data, shown by the arrows, are in good agreement with expectations. For instance, 9% of pseudo-experiments have an M_top uncertainty smaller than the one obtained in the data. The median expected top mass (stat.+JES) uncertainty is 4.2 GeV/c^2 at 178 GeV/c^2 using the expected number of events based on the theoretical cross-section.

The number of fitted events is shown below. It is in good agreement with the one obtained from the JES cross-check.

The 2-D shape of the -log likelihood is shown as a function of M_top and JES below, but this time with the JES constrained only by the W->jj decays (no a priori information from the JES group is assumed). The same color code as above is used. Very good agreement is obtained with respect to the full fit:

M_top = 174.0 +4.5/-4.5 (stat.+JES) GeV/c^2

JES = -0.25 +1.22/-1.22 sigma
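As a rough cross-check, the W->jj-only JES can be combined by hand with the a priori constraint (JES = 0 with a width of 1). The sketch below treats both as independent Gaussian estimates and uses an inverse-variance weighted average. This is a simplification of the 2-D fit, which it does not replace; the function name is ours.

```python
import math

def combine_gaussian(m1, s1, m2, s2):
    """Inverse-variance weighted combination of two Gaussian estimates."""
    w1, w2 = 1.0 / s1 ** 2, 1.0 / s2 ** 2
    mean = (w1 * m1 + w2 * m2) / (w1 + w2)
    sigma = 1.0 / math.sqrt(w1 + w2)
    return mean, sigma

# W->jj-only result quoted above, and the a priori constraint.
mean, sigma = combine_gaussian(-0.25, 1.22, 0.0, 1.0)
print(f"JES = {mean:.2f} +/- {sigma:.2f} sigma")
```

In this approximation the combination gives JES of about -0.10 +/- 0.77 sigma, close to the value returned by the full 2-D fit before systematic uncertainties.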

Shown below is the result of the fit in each subsample. On the left (right) are shown the m_t (m_jj) mass distributions. The fitted M_top and JES are summarized in the tables below. Good agreement is obtained among the subsamples.

Future of the Analysis

The big advantage of this analysis is that both the statistical and jet energy scale systematic uncertainties will improve with integrated luminosity. Shown below is the expected JES uncertainty only from W->jj as a function of integrated luminosity. We expect to reach JES uncertainties of less than 1 GeV in the future of CDF.

Shown below is the expected total uncertainty obtained with this analysis as a function of integrated luminosity. As can be observed, this technique makes it possible to reach an uncertainty of 2 GeV/c^2 or better by the end of Run II (4-8 fb^-1). This estimate is conservative in the sense that we have assumed that the systematic uncertainties are constant at 1.3 GeV/c^2. We note also that the projection obtained at 2 fb^-1 is better than the one made in the TDR (dating from 1996), as illustrated in the plot.
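The order of magnitude of this projection can be reproduced with a naive scaling. The sketch below assumes that the stat.+JES uncertainty scales as 1/sqrt(luminosity) starting from the 318 pb^-1 result, and that the other systematics stay at 1.3 GeV/c^2. This is our simplification and not the procedure used to produce the plots; in particular it ignores the a priori JES constraint, which does not improve with luminosity.

```python
import math

def projected_total(lumi_pb, ref_lumi_pb=318.0, ref_stat_jes=3.7, syst=1.3):
    """Naive projection of the total M_top uncertainty (GeV/c^2).

    Assumes the stat.+JES part scales as 1/sqrt(luminosity) from the
    318 pb^-1 result and that the other systematics stay constant.
    """
    stat_jes = ref_stat_jes * math.sqrt(ref_lumi_pb / lumi_pb)
    return math.sqrt(stat_jes ** 2 + syst ** 2)

for lumi_fb in (2, 4, 8):
    print(f"{lumi_fb} fb^-1: {projected_total(1000.0 * lumi_fb):.1f} GeV/c^2")
```

Under these assumptions the total uncertainty falls below 2 GeV/c^2 in the 4-8 fb^-1 range, in line with the statement above.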

Jean-Francois Arguin

Last modified: Tue Oct 4 19:43:37 CDT 2005