A Measurement of Top Quark Width using Template Method in Lepton+Jets Channel with 4.3 fb-1

Jian Tang1, HyunSu Lee1, and Young-Kee Kim1,2
1 The University of Chicago, and 2 Fermilab

95% Confidence Level:    Γtop < 7.6 GeV
68% Confidence Level:    0.3 GeV < Γtop < 4.4 GeV

This analysis: Public Note
Analysis with 1 fb-1: Public Note, Public Webpage



We present a measurement of the top quark width in the Lepton+Jets channel. We use a data sample with integrated luminosity 4.3 fb-1 collected by the CDF II detector. In the Lepton+Jets channel a chi-squared function is minimized to obtain reconstructed top mass mtreco for every event. The invariant mass of the jets coming from the hadronically decaying W boson mjj is used to reduce the dominant systematic effect arising from the jet energy scale.
Kernel density estimation (KDE) is used to produce probability density functions that are two-dimensional in the observables. The two-dimensional distributions (mtreco, mjj) from data are compared to Monte Carlo (MC) to extract the top quark width using a maximum likelihood fit. We then perform pseudo-experiments (PEs) for each MC sample, which enables us to apply the Feldman-Cousins (FC) construction to build a 95% confidence interval for the top quark width. In the Feldman-Cousins construction, we take information from the likelihood function and build an ordering principle, Δχ2, for each PE. Each MC sample, which has a set of 3000 PEs, then has a critical value of Δχ2, called Δχ2c, that is calculated from the Δχ2 distribution of these PEs. Finally, we use the Δχ2c of each sample together with the data fit to find the limit(s) on the top quark width. To incorporate systematic effects into the top quark width limits, we first convolve the likelihood function with a Gaussian function whose σ is set by the systematic effects, and then shift it by an amount drawn from a Gaussian distribution with the same σ.
We set an upper limit on the top quark width of Γtop < 7.6 GeV at 95% CL, which corresponds to a lower limit on the top quark lifetime of τtop > 8.7 × 10^-26 s at 95% CL. We also set central limits on the top width at 68% CL: 0.3 GeV < Γtop < 4.4 GeV.
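The conversion from a width limit to a lifetime limit follows from τ = ħ/Γ. A minimal sketch of that arithmetic is given here; the function name and the value used for ħ (the standard reduced Planck constant in GeV s) are choices made for this illustration, not taken from the analysis code.

```python
# Convert a decay width (GeV) into a lifetime (s) via tau = hbar / Gamma.
HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV*s

def lifetime_from_width(width_gev: float) -> float:
    """Return the lifetime in seconds for a decay width given in GeV."""
    if width_gev <= 0:
        raise ValueError("width must be positive")
    return HBAR_GEV_S / width_gev

# An upper limit on the width becomes a lower limit on the lifetime.
tau_lower = lifetime_from_width(7.6)
print(f"tau_top > {tau_lower:.1e} s")
```

Because the lifetime is inversely proportional to the width, the 95% CL upper limit on Γtop maps directly onto a 95% CL lower limit on τtop.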

Event selection

To select events in the Lepton+Jets channel, where one W boson from the top quark pair decays hadronically and the other W decays to a charged lepton (electron or muon) plus a neutrino, we require a well-identified electron or muon, large missing transverse energy and 4 jets, at least one of which is identified as arising from a b quark. We take advantage of different signal-to-background (S:B) ratios and event shapes by splitting our sample into two non-overlapping subsamples, based on the number of jets with a b-tag (using CDF's secondary vertex tagger, SECVTX). Events with exactly one tag are required to have exactly 4 jets. In events with two or more tags, which have a higher S:B and more statistical power, we loosen the cut on the 4th jet and allow more than 4 jets. The event selection is summarized in the following table:

Number of b-tags
>= 2
Jets 1-3 Et threshold (GeV)
4th jet Et threshold (GeV)
> 20
> 20
In addition we require that the chi-squared returned by the kinematic fit is smaller than 9 for both Lepton+Jets subsamples to further reduce the background fraction and to ensure that only well reconstructed ttbar events enter the analysis. Furthermore to avoid possible bias in the probability density functions we apply a boundary cut requiring that all events have 100 < mtreco < 350 GeV/c2, and 50 < mjj < 115 GeV/c2 for the single-tag subsample and 50 < mjj < 125 GeV/c2 for the double-tag subsample.
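The chi-squared and boundary cuts quoted above can be written as a single predicate. This is only a sketch: the `Event` container and its field names are hypothetical, and the lepton, missing energy and jet Et requirements are assumed to have been applied upstream.

```python
from dataclasses import dataclass

@dataclass
class Event:
    n_btags: int    # number of b-tagged jets (hypothetical field name)
    chi2: float     # chi-squared returned by the kinematic fit
    mt_reco: float  # reconstructed top mass, GeV/c^2
    mjj: float      # dijet mass of the hadronic W candidate, GeV/c^2

def passes_template_cuts(ev: Event) -> bool:
    """Apply the chi2 < 9 cut and the boundary cuts on (mtreco, mjj)."""
    if ev.n_btags < 1:
        return False
    if not ev.chi2 < 9.0:
        return False
    if not 100.0 < ev.mt_reco < 350.0:
        return False
    # The upper mjj boundary depends on the subsample.
    mjj_max = 115.0 if ev.n_btags == 1 else 125.0
    return 50.0 < ev.mjj < mjj_max
```

For example, an event with mjj = 120 GeV/c2 fails in the single-tag subsample but passes in the double-tag subsample.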

Top mass reconstruction and dijet mass reconstruction

A chi-squared minimization is performed to reconstruct a top quark mass for each event. The fitter is based on the hypothesis that the event is ttbar: it contains W mass constraints on the hadronic and leptonic sides and requires the two top masses in the event to be equal. Only the leading 4 jets are assigned to the four quark daughters from the top quark decays. The jet-parton assignment that yields the lowest chi-squared after minimization is kept for further analysis, and the corresponding top mass (mtreco) is used in our templates. The distributions of mtreco for the two Lepton+Jets subsamples (1-btag on the left and 2-btag on the right) are shown below.

[Figures: mtreco distributions for the 1-btag (left) and 2-btag (right) subsamples]

To measure the jet energy scale (JES), templates of the hadronically decaying W boson mass, mjj, are constructed in addition to the top mass templates. The chi-squared fitter is not used to obtain mjj, although events failing the chi-squared cut are not used to measure the JES. In 2-tag events, there is only one dijet mass among the leading 4 jets that is consistent with the b-tagging (i.e. formed from jets not tagged as b). In 1-tag events, there are 3 such dijet masses. We take the dijet mass closest to the well-known W mass as the single value of mjj per event. The mjj distributions for the 2 subsamples (1-btag on the left and 2-btag on the right) for 3 different values of the JES are shown below.
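The choice of mjj described above can be sketched as follows. The jet representation as (E, px, py, pz) tuples, the function names and the reference W mass value of 80.4 GeV/c2 are assumptions made for this illustration.

```python
from itertools import combinations
import math

W_MASS = 80.4  # GeV/c^2, approximate W boson mass (assumed reference value)

def invariant_mass(j1, j2):
    """Invariant mass of two jets given as (E, px, py, pz) tuples in GeV."""
    e = j1[0] + j2[0]
    px, py, pz = (j1[i] + j2[i] for i in (1, 2, 3))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))

def select_mjj(jets, btagged):
    """Pick the dijet mass closest to W_MASS among pairs of untagged jets.

    jets: four-vectors ordered by Et; btagged: matching list of booleans.
    Only the leading four jets are considered.
    """
    untagged = [j for j, tag in zip(jets[:4], btagged[:4]) if not tag]
    masses = [invariant_mass(a, b) for a, b in combinations(untagged, 2)]
    if not masses:
        raise ValueError("need at least two untagged jets")
    return min(masses, key=lambda m: abs(m - W_MASS))
```

With two tags among the leading four jets there is a single untagged pair, and with one tag there are three pairs, matching the counting in the text.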
[Figures: mjj distributions for the 1-btag (left) and 2-btag (right) subsamples at three JES values]


The background sources and their expected fraction of the total background are given in the table below, together with the expected signal and the observed data. The backgrounds are dominated by real W boson production in association with high-pt jets. The absolute normalization of W+jets is determined from the data, but the relative normalization between the different flavor samples is taken from MC. The expected numbers of events for the single-top and diboson backgrounds are taken from theoretical cross sections and MC predictions. We assume a ttbar cross section of 7.4 pb with a top mass of 172.5 GeV/c2. The numbers in the table have been rounded according to the PDG rules.

Kernel Density Estimation

We use a non-parametric Kernel Density Estimate-based approach to form probability density functions from fully simulated PYTHIA MC. The probability for an event with an observable x is given by the linear sum of contributions from all entries in the MC:
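$$f(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x - x_i}{h}\right)$$
(the sum runs over the n entries x_i of the MC sample)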

Here, f(x) is the probability density to observe x given some MC sample with known mass and JES (or the background). The kernel function K is a normalized function that adds varying probability to a measurement at x depending on its distance from xi. The smoothing parameter h determines the width of the kernel. Larger values of h smooth out the density estimate, and smaller values of h keep most of the probability weight near xi. We use an adaptive method in which h depends on the estimated density at each entry, h = h(f(xi)). In the peak of the distribution, where statistics are high, we use less smoothing. In the tails of the distribution, where statistics are poor and we are sensitive to statistical fluctuations, we use more smoothing. KDE can be extended to two dimensions by multiplying together two kernels:
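$$f(x, y) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_{x,i}\, h_{y,i}} K\!\left(\frac{x - x_i}{h_{x,i}}\right) K\!\left(\frac{y - y_i}{h_{y,i}}\right)$$
(h_{x,i} and h_{y,i} are the smoothing parameters of entry i in each dimension, and n is the number of MC entries)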
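A small sketch of an adaptive, product-kernel density estimate is given here for orientation. Several ingredients are assumptions, since the text does not specify them: a Gaussian kernel K, a fixed-bandwidth pilot estimate, and the square-root rule h_i = h0 sqrt(g / f_pilot(x_i)) with g the geometric mean of the pilot density.

```python
import numpy as np

def gauss(u):
    """Normalized Gaussian kernel (an assumed choice of K)."""
    return np.exp(-0.5 * u * u) / np.sqrt(2.0 * np.pi)

def fixed_kde(x, sample, h):
    """f(x) = 1/(n h) * sum_i K((x - x_i)/h), evaluated at each point of x."""
    u = (np.asarray(x, dtype=float)[:, None] - sample[None, :]) / h
    return gauss(u).sum(axis=1) / (sample.size * h)

def adaptive_bandwidths(sample, h0):
    """h_i = h0 * sqrt(g / f_pilot(x_i)): small in the peak, large in the tails."""
    pilot = fixed_kde(sample, sample, h0)
    g = np.exp(np.mean(np.log(pilot)))  # geometric mean of the pilot density
    return h0 * np.sqrt(g / pilot)

def adaptive_kde_2d(x, y, xs, ys, hx, hy):
    """Product-kernel estimate at (x, y) with one bandwidth pair per MC entry."""
    kx = gauss((x - xs) / hx) / hx
    ky = gauss((y - ys) / hy) / hy
    return float(np.mean(kx * ky))

# Toy usage with made-up Gaussian "MC" entries for (mtreco, mjj).
rng = np.random.default_rng(0)
mt = rng.normal(172.0, 12.0, 400)
mjj = rng.normal(80.0, 8.0, 400)
density = adaptive_kde_2d(172.0, 80.0, mt, mjj,
                          adaptive_bandwidths(mt, 4.0),
                          adaptive_bandwidths(mjj, 3.0))
```

The toy sample and the starting bandwidths 4.0 and 3.0 are arbitrary; in practice they would be tuned to the MC statistics.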

We apply this technique to obtain a pdf for each (Mtop, JES) Monte Carlo sample that was generated, as well as for the background samples generated at different JES values.

Likelihood Fit

We minimize the negative logarithm of the extended likelihood with respect to the top width, JES, and signal and background expectations to obtain the measurement as well as its statistical uncertainty. The form of the likelihood for subsample k is shown below.
where ns and nb are the signal and background expectations, N is the number of events in the subsample, Ps is the signal probability density function, and Pb is the background probability density function. Mtreco and Wjj are the reconstructed top quark mass and the dijet mass of the W boson. Γtop and ΔJES are the top width and jet energy scale to be determined by this likelihood fit. nb0 is the a-priori background estimate and σnb0 is the uncertainty on that estimate.
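To make the role of each symbol concrete, the following sketch evaluates a negative log-likelihood built from the ingredients named above: a Poisson term for N events with expectation ns + nb, a Gaussian constraint on nb around nb0, and a per-event mixture of Ps and Pb. This is a common form of an extended likelihood and is an assumption here; the exact expression used in the analysis is not reproduced on this page. The callables `ps` and `pb` are hypothetical stand-ins for the KDE-based densities.

```python
import math

def neg_log_likelihood(events, ns, nb, ps, pb, width, djes, nb0, sigma_nb0):
    """-ln L for one subsample.

    events: list of (mt_reco, mjj) pairs; ps(mt, mjj, width, djes) and
    pb(mt, mjj, djes) return signal and background probability densities.
    """
    n_obs = len(events)
    total = ns + nb
    # Poisson term for observing n_obs events with expectation ns + nb
    nll = total - n_obs * math.log(total) + math.lgamma(n_obs + 1)
    # Gaussian constraint of nb to the a-priori estimate nb0
    nll += 0.5 * ((nb - nb0) / sigma_nb0) ** 2
    # Per-event mixture of signal and background densities
    for mt, mjj in events:
        dens = (ns * ps(mt, mjj, width, djes) + nb * pb(mt, mjj, djes)) / total
        nll -= math.log(dens)
    return nll
```

A minimizer would vary width, djes, ns and nb, and the two subsamples would be combined by adding their negative log-likelihoods.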
Kernel density estimation only allows the probability density function to be calculated at the values of the top mass and jet energy scale where Monte Carlo samples are available. To evaluate the pdf at arbitrary Mtop for each event we use local polynomial smoothing. A quadratic polynomial is fit to the values of the pdf calculated using the KDE method. Points near the required value have a higher weight than points farther away; the deweighting is performed using a 'tricubic' function. The value of the quadratic fit at the required (Mtop, JES) point is used as the value of the pdf.
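The local polynomial smoothing can be illustrated in one parameter dimension. The sketch assumes that the 'tricubic' function is the usual tricube weight (1 - |u|^3)^3 of local regression, and the `span` parameter that sets the weighting range is a hypothetical name; the analysis itself interpolates in two parameters.

```python
import numpy as np

def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, and 0 otherwise."""
    a = np.clip(np.abs(u), 0.0, 1.0)
    return (1.0 - a ** 3) ** 3

def local_quadratic(p0, grid, values, span):
    """Evaluate a tricube-weighted quadratic fit of values(grid) at p0."""
    d = np.asarray(grid, dtype=float) - p0
    w = tricube(d / span)
    # Weighted least squares: scale rows by sqrt(w); columns are 1, d, d^2
    design = np.vander(d, 3, increasing=True) * np.sqrt(w)[:, None]
    target = np.asarray(values, dtype=float) * np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef[0]  # the fitted value at d = 0, i.e. at p0
```

Here `grid` holds the parameter values at which MC samples exist and `values` holds the KDE pdf of one event at those values. Centering the polynomial on p0 means the constant term is the interpolated pdf.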

Feldman-Cousins Construction

The key feature in constructing confidence intervals using the Feldman-Cousins scheme is the definition of the ordering principle:
$$\Delta\chi^2 = \chi^2(\Gamma_{\rm input}) - \chi^2(\Gamma_{\rm BestFit})$$
where Γinput is the input top width of the MC sample and ΓBestFit is the measured top width of a pseudo-experiment; χ2(Γinput) is the χ2 value at the input top width, while χ2(ΓBestFit) is the χ2 value at the best-fit top width of this single pseudo-experiment. We minimize the negative log-likelihood to get the best-fit value of the measured top width. If the likelihood function is in the Gaussian regime, -2 log(likelihood) should follow a χ2 distribution. Therefore, in the above equation we simply choose
$$\chi^2 = -2 \ln L$$
where L is the likelihood function described in Section "Likelihood Fit". There is a Δχ2 value for each pseudo-experiment. For each MC sample we run thousands of pseudo-experiments, so there is a distribution of Δχ2 for this sample, from which we can find a critical value Δχ2c such that the interval [0, Δχ2c] contains 95% of the pseudo-experiments. If the distribution were truly a χ2 distribution with one degree of freedom, one would naively expect Δχ2c to be 3.84. In reality, however, Δχ2c can deviate from 3.84 for several reasons: a) physical boundary effects; b) deviation of the likelihood function from the Gaussian regime.
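The extraction of Δχ2c from a set of pseudo-experiments amounts to taking an empirical quantile. A minimal sketch, with hypothetical function names and assuming each pseudo-experiment supplies its negative log-likelihood at the input width and at the best fit:

```python
import math

def delta_chi2(nll_at_input, nll_at_best_fit):
    """Ordering variable chi2(input) - chi2(best fit), with chi2 = -2 ln L."""
    return 2.0 * (nll_at_input - nll_at_best_fit)

def critical_value(delta_chi2_values, cl=0.95):
    """Smallest value c such that [0, c] contains a fraction cl of the PEs."""
    ordered = sorted(delta_chi2_values)
    k = math.ceil(cl * len(ordered)) - 1  # index of the cl-quantile entry
    return ordered[max(k, 0)]
```

For pseudo-experiments whose Δχ2 values follow an ideal one-parameter χ2 distribution, `critical_value` approaches 3.84 as the number of pseudo-experiments grows.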
After we find the Δχ2c for each MC sample, we test the coverage by running another set of PEs on this sample. Note that we have two parameters when generating MC samples, Mtop and ΔJES, so in principle a two-dimensional Feldman-Cousins construction should be performed. In our analysis, however, we fix ΔJES=0 and only Mtop is used. Nevertheless, we can use the Δχ2c of the ΔJES=0 samples to test the coverage of samples with ΔJES != 0. If the coverage is fine (fluctuating around 95%), then we do not need a two-dimensional Feldman-Cousins construction. The following figure shows the coverage for both zero and non-zero ΔJES samples. It does not show any obvious undercoverage, so we consider it sufficient to use only Mtop to extract the top width limits.
[Figure: coverage for samples with zero and non-zero ΔJES]

Systematic Uncertainties

The top width shifts due to the individual systematic effects are shown below. The dominant effect comes from the jet resolution. The total top width shift due to all systematic effects is 1.61 GeV.

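In order to incorporate these systematic effects into the top width limit(s), we first use a convolution method for folding systematic uncertainties into the likelihood function. That is, we convolve the original likelihood function in Section "Likelihood Fit" with a Gaussian function related to the systematic effects to obtain a new likelihood function:
$$L'(x \mid \Gamma_{\rm top}) = \int L(x \mid \Gamma') \, \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(\Gamma_{\rm top} - \Gamma')^2}{2\sigma^2}\right) d\Gamma'$$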
where x represents the data and σ is equal to the total top width shift (1.61 GeV) due to systematic effects. Second, we shift this new likelihood function horizontally by a random number drawn from a Gaussian distribution with σ = 1.61 GeV. With no boundary effect considered, the first step changes the shape of the likelihood function while the second step changes its best-fit top width. Then we repeat what was done in Section "Feldman-Cousins Construction": 1) get the Δχ2 distribution for each MC sample and find the critical value Δχ2c; 2) plot Δχ2c vs input top width; 3) overlay this plot with the data fit and find the limit(s) on the top quark width. Note that in the data fit the likelihood function should also be convolved with the same Gaussian function.
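The two steps, convolution and horizontal shift, can be sketched numerically on a grid of width values. The function name, the equally spaced grid and the zero padding outside the grid are assumptions of this illustration, and the physical boundary at zero width is not treated.

```python
import numpy as np

def smear_likelihood(widths, likelihood, sigma, shift=0.0):
    """Convolve L(width) with a Gaussian of width sigma, then shift it.

    widths must be an equally spaced grid; the result is on the same grid.
    """
    widths = np.asarray(widths, dtype=float)
    likelihood = np.asarray(likelihood, dtype=float)
    step = widths[1] - widths[0]
    # Gaussian weights G(w_j - w_i; sigma) for every pair of grid points
    diff = widths[:, None] - widths[None, :]
    gauss = np.exp(-0.5 * (diff / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))
    smeared = gauss @ likelihood * step
    # Horizontal shift: L_shifted(w) = L_smeared(w - shift)
    return np.interp(widths - shift, widths, smeared, left=0.0, right=0.0)

# For one pseudo-experiment the shift would be drawn as, for example:
# shift = np.random.default_rng().normal(0.0, 1.61)
```

For a Gaussian-shaped likelihood of standard deviation s, the convolution widens it to sqrt(s^2 + σ^2) without moving its peak, and the shift then moves the peak, which mirrors the roles of the two steps described in the text.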

Fit Results

After performing the log-likelihood fit to the data, the best fit gives Γtop = 1.9 +1.9/-1.5 GeV and ΔJES = 0.07 ± 0.2, as shown in the following figure.
[Figure: data fit result for Γtop and ΔJES]
We project the 2D likelihood onto a 1D likelihood function of Γtop and convert this function according to the second equation in Section "Feldman-Cousins Construction". We then overlay the data fit curve on the plot of Δχ2c vs input top width described in Section "Feldman-Cousins Construction". From the intersection of the two curves, as seen in the following figure, we find an upper limit on the top quark width of Γtop < 7.6 GeV at 95% confidence level.
[Figure: data Δχ2 curve overlaid on Δχ2c vs input top width]
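Reading the limit off the overlaid curves amounts to finding where the data Δχ2 curve rises above the Δχ2c curve. A sketch under the assumption that both curves are tabulated on the same ascending grid of widths; the function name and the linear interpolation between grid points are choices made for this illustration.

```python
def upper_limit(widths, data_delta_chi2, critical):
    """Largest width at which the data curve crosses above the critical curve.

    All three sequences share the same ascending width grid. Linear
    interpolation locates the crossing between neighbouring grid points.
    """
    diff = [d - c for d, c in zip(data_delta_chi2, critical)]
    # Scan from the high-width end to find the last upward crossing.
    for i in range(len(widths) - 1, 0, -1):
        if diff[i] > 0.0 >= diff[i - 1]:
            frac = -diff[i - 1] / (diff[i] - diff[i - 1])
            return widths[i - 1] + frac * (widths[i] - widths[i - 1])
    return None  # no crossing on this grid
```

A lower limit, when one exists, could be found in the same way by scanning for the downward crossing at small widths.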

We also measure the top width at 68% confidence level and obtain central limits on the top quark width of 0.3 GeV < Γtop < 4.4 GeV, with systematic effects incorporated.

[1] Satomi Shiraishi, Jahred Adelman, Erik Brubaker, Young-Kee Kim, Phys. Rev. Lett. 102, 042001 (2009)
[2] G. Feldman and R. Cousins, Phys. Rev. D 57, 3873 (1998)
[3] D. W. Scott, Multivariate Density Estimation: Theory, Practice and Visualization (Wiley-Interscience, 1992)
[4] C. Loader, Local Regression and Likelihood (Springer, 1999)
[6] A. Bhatti et al., Nucl. Instrum. Methods Phys. Res. A 566, 375 (2006)
[7] T. Affolder et al., Phys. Rev. D 64, 032002 (2001)
[8] F. Abe et al., Phys. Rev. D 45, 1448 (1992)
[9] I. I. Y. Bigi et al., Phys. Lett. B 181, 157 (1986)
[10] M. Jezabek and J. H. Kuhn, Nucl. Phys. B 314, 1 (1989)

TMT Group

Last modified Jan 15, 2010