Measurement of the Top Quark and Anti-top Quark Mass Difference
in the Lepton+Jets Channel Using 5.6 fb-1 of CDF Data

Hyunsu Lee1 and Young-Kee Kim1,2
1 The University of Chicago, and 2 Fermilab

ΔMtop = -3.3 +/- 1.4(stat.) +/- 1.0 (syst.) GeV/c2 = -3.3 +/- 1.7 GeV/c2


Phys. Rev. Lett. 106, 152001 (2011)

Public Note






Introduction:

We present a measurement of the mass difference between the top quark and the anti-top quark (ΔMtop) in the Lepton+Jets decay channel. We use a data sample with an integrated luminosity of 5.6 fb-1 collected by the CDF II detector. The data sample consists of 2294 Lepton+Jets event candidates, including both zero-tag and b-tagged events.
In the Lepton+Jets channel, a chi-squared function is minimized to obtain a reconstructed mass difference Δmtreco for every event. The fit is a modification of the kinematic fitter used in the top quark mass measurement: it allows the top quark and anti-top quark masses to differ while assuming an average top quark mass of 172.5 GeV/c2.
In addition, we use the reconstructed mass difference from a different jet-parton assignment, chosen as the combination with the 2nd smallest chi2 (Δmtreco(2)). Kernel density estimation (KDE) is used to produce probability density functions that are two-dimensional in the observables. The two-dimensional distributions (Δmtreco, Δmtreco(2)) from data are compared to Monte Carlo to measure ΔMtop. We measure ΔMtop = -3.3 +/- 1.4 (stat.) +/- 1.0 (syst.) GeV/c2.

Event selection:

To select events in the Lepton+Jets channel, where one of the W bosons from the top quark pair decays hadronically and the other decays to a charged lepton (electron or muon) plus a neutrino, we require a well-identified electron or muon, large missing transverse energy (MET) and 4 jets. We also require large HT and apply a QCD veto to reduce background, especially QCD multijet events. The QCD veto rejects events in which the azimuthal angle (phi) between the leading jet and the MET is less than 0.5 when MET < 30 GeV. We take advantage of different signal-to-background ratios (S:B) and event shapes by splitting our sample into non-overlapping subsamples, based on the number of jets with a b-tag (using CDF's secondary vertex tagger, SECVTX). Events with zero tags or exactly one tag are required to have exactly 4 jets. In events with two or more tags, which have a higher S:B and more statistical power, we loosen the cut on the 4th jet and allow more than 4 jets. The event selection is summarized in the following table:


                               2-tag    1-tag    0-tag
Number of b-tags               >= 2     1        0
Jets 1-3 Et threshold (GeV)    > 20     > 20     > 20
4th jet Et threshold (GeV)     > 12     > 20     > 20
Extra jets Et (GeV)            Any      < 20     < 20
MET (GeV)                      > 20     > 20     > 20
HT (GeV)                       > 250    > 250    > 250
chi2                           < 9      < 9      < 3

In addition, we require that the chi-squared returned by the kinematic fit be smaller than 3 for the zero b-tag sample and smaller than 9 for the b-tagged (1-tag and 2-tag) samples, to further reduce the background fraction and to ensure that only well-reconstructed ttbar events enter the analysis. Furthermore, to avoid possible bias in the probability density functions, we apply a boundary cut requiring that all events have -200 < Δmtreco < 200 GeV/c2 and -200 < Δmtreco(2) < 200 GeV/c2.
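
The selection described in this section can be sketched as a small classification function. This is only an illustration and not the analysis code: the `Event` fields and the `classify` function are hypothetical names, lepton identification and the QCD veto are assumed to have been applied upstream, and the thresholds are read as strict inequalities.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    """Hypothetical event record; all energies in GeV, masses in GeV/c2."""
    n_btags: int
    jet_et: List[float]   # jet Et values, sorted in descending order
    met: float            # missing transverse energy
    ht: float
    chi2: float           # chi2 of the best kinematic fit
    dm_reco: float        # reconstructed mass difference (best fit)
    dm_reco2: float       # reconstructed mass difference (2nd best fit)


def classify(ev: Event) -> Optional[str]:
    """Return '2-tag', '1-tag' or '0-tag', or None if the event fails the cuts."""
    if len(ev.jet_et) < 4:
        return None
    if any(et <= 20.0 for et in ev.jet_et[:3]):      # jets 1-3: Et > 20
        return None
    if ev.met <= 20.0 or ev.ht <= 250.0:             # MET > 20, HT > 250
        return None
    # Boundary cut on both observables.
    if not (-200.0 < ev.dm_reco < 200.0 and -200.0 < ev.dm_reco2 < 200.0):
        return None

    # Per-category thresholds: 4th jet Et, chi2 limit, limit on extra jets.
    if ev.n_btags >= 2:
        name, fourth_min, chi2_max, extra_max = "2-tag", 12.0, 9.0, None
    elif ev.n_btags == 1:
        name, fourth_min, chi2_max, extra_max = "1-tag", 20.0, 9.0, 20.0
    else:
        name, fourth_min, chi2_max, extra_max = "0-tag", 20.0, 3.0, 20.0

    if ev.jet_et[3] <= fourth_min:
        return None
    if extra_max is not None and any(et >= extra_max for et in ev.jet_et[4:]):
        return None
    if ev.chi2 >= chi2_max:
        return None
    return name
```

Keeping the per-category thresholds in one place mirrors the table and makes the looser 4th-jet requirement of the 2-tag sample explicit.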


Reconstruction of the mass difference between top and anti-top quarks:

A chi2 minimization is performed to reconstruct the mass difference between the top quark and the anti-top quark for each event. The fitter is based on the hypothesis that the event is ttbar: it contains W mass constraints on the hadronic and leptonic sides. We then allow the top quark and anti-top quark masses to deviate by dM/2, in opposite directions, from the average mass of 172.5 GeV/c2.
[Equation image: chi2 definition]
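
The fit function itself appears only as an image on the original page. As a hedged sketch of what a constrained fit of this kind typically looks like, one plausible form is given below. Every symbol in it is an assumption made for illustration (fitted and measured transverse momenta, unclustered energy U, invariant masses M_jj, M_lν, M_bjj, M_blν, and widths Γ_W, Γ_t), and in particular the assignment of the -dM/2 and +dM/2 shifts to the hadronic and leptonic sides is a convention chosen here, not taken from the text.

```latex
\chi^2 = \sum_{i=\ell,\,4\,\mathrm{jets}}
         \frac{\left(p_T^{i,\mathrm{fit}} - p_T^{i,\mathrm{meas}}\right)^2}{\sigma_i^2}
       + \sum_{j=x,y}
         \frac{\left(U_j^{\mathrm{fit}} - U_j^{\mathrm{meas}}\right)^2}{\sigma_j^2}
       + \frac{\left(M_{jj} - M_W\right)^2}{\Gamma_W^2}
       + \frac{\left(M_{\ell\nu} - M_W\right)^2}{\Gamma_W^2}
       + \frac{\left[M_{bjj} - \left(172.5 - dM/2\right)\right]^2}{\Gamma_t^2}
       + \frac{\left[M_{b\ell\nu} - \left(172.5 + dM/2\right)\right]^2}{\Gamma_t^2}
```

The last two terms carry the idea described in the text: the two top-quark masses are tied to a common average of 172.5 GeV/c2 and may only move apart by dM.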

Only the leading 4 jets are assigned to the four quark daughters from the top quark decays. The jet-parton assignment that yields the lowest chi2 after minimization is kept for further analysis. The corresponding fitted dM is multiplied by minus the lepton charge to obtain the reconstructed mass difference,

Δmtreco = -Q_lepton × dM,

which is used in our templates. Because the responses of the fitter to the leptonic and hadronic top decays are very different, we divide our sample by lepton charge to improve the statistical power of the measurement. The distributions of Δmtreco for the six subsamples are shown below.
[Figures: Δmtreco distributions for the 0-tag, 1-tag and 2-tag subsamples, each for negative and positive lepton charge]

To add more information on the mass difference, we also use the 2nd-best reconstructed mass difference, Δmtreco(2), taken from the jet-parton assignment with the 2nd smallest chi2. The distributions of Δmtreco(2) for the subsamples are shown below.
[Figures: Δmtreco(2) distributions for the 0-tag, 1-tag and 2-tag subsamples, each for negative and positive lepton charge]
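
The choice of the two observables can be illustrated with a short sketch. It assumes that the kinematic fit has already been run for every jet-parton assignment and has returned a (chi2, dM) pair for each; the function name and the input layout are hypothetical.

```python
from typing import List, Tuple


def reco_mass_differences(
    fits: List[Tuple[float, float]], lepton_charge: int
) -> Tuple[float, float]:
    """Return (dm_reco, dm_reco2) from per-assignment fit results.

    fits: one (chi2, dM) pair per jet-parton assignment (assumed input).
    lepton_charge: +1 or -1.
    """
    if len(fits) < 2:
        raise ValueError("need at least two jet-parton assignments")
    if lepton_charge not in (-1, 1):
        raise ValueError("lepton charge must be +1 or -1")

    # Rank the assignments by their minimized chi2.
    ranked = sorted(fits, key=lambda fit: fit[0])
    dm_best = ranked[0][1]
    dm_second = ranked[1][1]

    # Multiply by minus the lepton charge, as described in the text.
    return -lepton_charge * dm_best, -lepton_charge * dm_second


# Three hypothetical assignments for one event with a positive lepton.
example = [(4.2, 6.0), (1.1, -3.0), (2.5, 10.0)]
dm_reco, dm_reco2 = reco_mass_differences(example, lepton_charge=+1)
```

Multiplying by minus the lepton charge puts events with positive and negative leptons on a common sign convention.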

Backgrounds

The background sources and their expected fraction of the total background are given in the table below, together with the expected signal and the observed data. The backgrounds are dominated by real W boson production in association with high-pT jets. The absolute normalization of W+jets is determined from the data, but the relative normalization between the different flavor samples is taken from MC. The expected numbers of events for the single-top and diboson backgrounds are taken from theoretical cross sections and MC predictions. We assume a ttbar cross section of 7.4 pb at a top mass of 172.5 GeV/c2.


Kernel Density Estimation:

We use a non-parametric Kernel Density Estimate-based approach to forming probability density functions from fully simulated Pythia MC. The probability for an event with an observable x is given by the linear sum of contributions from all entries in the MC:
[Equation image: one-dimensional kernel density estimate f(x)]

Here, f(x) is the probability to observe x given some MC sample with known mass and JES (or the background). The kernel function K is a normalized function that adds varying probability to a measurement at x depending on its distance from x_i. The smoothing parameter h determines the width of the kernel: larger values of h smooth out the density estimate, and smaller values of h keep most of the probability weight near x_i. We use an adaptive method in which h depends on the estimated density at each entry, h = h(f(x_i)). At the peak of the distribution, where statistics are high, we use less smoothing. In the tails of the distribution, where statistics are poor and we are sensitive to statistical fluctuations, we use a larger amount of smoothing. KDE can be extended to two dimensions by multiplying together two kernels:
[Equation image: two-dimensional kernel density estimate f(x, y)]
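
The two estimators are shown only as images on the original page. In the standard textbook form, which is assumed here to match the description above, they read as follows, with n the number of MC entries and h_i the per-entry adaptive smoothing parameter. The separate widths h_{x,i} and h_{y,i} for the two observables are an assumption.

```latex
\hat{f}(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_i}\,
             K\!\left(\frac{x - x_i}{h_i}\right),
\qquad
\hat{f}(x, y) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_{x,i}\, h_{y,i}}\,
                K\!\left(\frac{x - x_i}{h_{x,i}}\right)
                K\!\left(\frac{y - y_i}{h_{y,i}}\right)
```

In this analysis x and y would correspond to Δmtreco and Δmtreco(2).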

We apply this technique to obtain the pdf for each ΔMtop Monte Carlo sample that was generated, as well as for the background samples.
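
A minimal sketch of a two-dimensional adaptive KDE is given below. It is not the analysis implementation: the Gaussian kernel, the pilot-estimate scheme with exponent `alpha = 0.5`, and all function names are assumptions chosen for illustration, since the text only states that h depends on f(x_i).

```python
import numpy as np


def gauss(u):
    """Normalized one-dimensional Gaussian kernel (an assumed kernel choice)."""
    return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)


def kde_2d(points, sample, h):
    """Product-kernel density at `points` (m, 2) from `sample` (n, 2).

    `h` is either one (hx, hy) pair or an (n, 2) array of per-entry widths.
    """
    points = np.asarray(points, dtype=float)
    sample = np.asarray(sample, dtype=float)
    h = np.broadcast_to(np.asarray(h, dtype=float), sample.shape)
    u = (points[:, None, :] - sample[None, :, :]) / h[None, :, :]
    k = gauss(u[..., 0]) * gauss(u[..., 1]) / (h[:, 0] * h[:, 1])[None, :]
    return k.mean(axis=1)  # average of the contributions from all entries


def adaptive_kde_2d(points, sample, h0, alpha=0.5):
    """Adaptive KDE: narrower kernels at the peak, wider kernels in the tails."""
    sample = np.asarray(sample, dtype=float)
    h0 = np.asarray(h0, dtype=float)
    pilot = kde_2d(sample, sample, h0)          # fixed-width pilot estimate
    g = np.exp(np.mean(np.log(pilot)))          # geometric mean of the pilot
    scale = (pilot / g) ** (-alpha)             # small where density is high
    return kde_2d(points, sample, h0[None, :] * scale[:, None])


# Toy usage with a hypothetical two-observable MC sample.
rng = np.random.default_rng(0)
toy_sample = rng.normal(size=(200, 2)) * np.array([10.0, 20.0])
density_at_origin = adaptive_kde_2d([[0.0, 0.0]], toy_sample, h0=(4.0, 8.0))[0]
```

Scaling the widths by a pilot density is one common way to realize h = h(f(x_i)); other functional forms would fit the description equally well.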

Likelihood and Local Polynomial Smoothing


We maximize the extended likelihood (equivalently, minimize its negative logarithm) with respect to the mass difference and the signal and background expectations to obtain the measurement as well as its statistical uncertainty. The form of the likelihood for subsample k is shown below.
[Equation image: likelihood for subsample k]
where ns and nb are the signal and background expectations, N is the number of events in the subsample, Psig is the signal probability density function and Pbg is the background probability density function. nb0 is the a-priori background estimate and sigma_nb0 is the uncertainty on that estimate.
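
The likelihood is shown only as an image on the original page. A standard extended likelihood with a Gaussian background constraint, written with the symbols defined above, would take the following form. This is a sketch under that assumption, with x_i standing for the pair of observables (Δmtreco, Δmtreco(2)) of event i.

```latex
\mathcal{L}_k =
  \exp\!\left[-\frac{\left(n_b - n_{b0}\right)^2}{2\,\sigma_{n_{b0}}^2}\right]
  \times \frac{\left(n_s + n_b\right)^{N} e^{-\left(n_s + n_b\right)}}{N!}
  \times \prod_{i=1}^{N}
  \frac{n_s\, P_{sig}\!\left(x_i;\, \Delta M_{top}\right) + n_b\, P_{bg}\!\left(x_i\right)}
       {n_s + n_b}
```

The first factor constrains the background expectation to its a-priori estimate, the second is the Poisson term for the observed number of events, and the product runs over the events of the subsample.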
Kernel density estimation allows calculation of the probability density function only at the values of ΔMtop where Monte Carlo samples are available. To evaluate the pdf at an arbitrary ΔMtop for each event, we use local polynomial smoothing. A quadratic polynomial is fitted to the values of the pdf calculated with the KDE method at the available ΔMtop points. Points near the required value have a higher weight than points far from it; the deweighting is performed using a 'tricubic' function. The value of the quadratic fit at the required ΔMtop point is used as the value of the pdf.
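
The smoothing step can be sketched as a weighted quadratic least-squares fit. The example below is illustrative only: the function names, the `bandwidth` parameter that sets the range of the tricube weights, and the grid of mass-difference points are hypothetical.

```python
import numpy as np


def tricube(u):
    """Tricube weight: (1 - |u|^3)^3 for |u| < 1, zero otherwise."""
    u = np.abs(u)
    return np.where(u < 1.0, (1.0 - u**3) ** 3, 0.0)


def local_quadratic(x0, xs, ys, bandwidth):
    """Evaluate a tricube-weighted quadratic fit of ys(xs) at x0.

    xs: parameter values where the pdf is known (e.g. the generated MC points).
    ys: pdf values for one event at those parameter values.
    """
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    w = tricube((xs - x0) / bandwidth)
    if np.count_nonzero(w) < 3:
        raise ValueError("need at least 3 points inside the bandwidth")

    # Fit y = a + b (x - x0) + c (x - x0)^2 by weighted least squares.
    d = xs - x0
    design = np.stack([np.ones_like(d), d, d**2], axis=1)
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(design * sw[:, None], ys * sw, rcond=None)
    return coef[0]  # the fitted value at x0 is the constant term


# Toy usage: pdf values on a hypothetical grid of mass-difference points.
grid = np.arange(-20.0, 21.0, 4.0)
values = 2.0 + 0.5 * grid - 0.1 * grid**2
smoothed = local_quadratic(1.3, grid, values, bandwidth=10.0)
```

Centering the polynomial on the evaluation point means the interpolated pdf is simply the constant coefficient of the fit.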

Method validation

To calibrate the method and to check its bias and its estimate of the statistical uncertainty, we perform ensemble tests. We repeatedly draw events from the signal and background models, mimicking possible variations of the signal and background numbers that may occur in data. A ΔMtop measurement is performed on each of these pseudo-datasets. Knowing the ΔMtop of the dataset from which the signal events were drawn, we can form residuals (ΔMtop(fitted) - ΔMtop(MC)) and pulls ((ΔMtop(fitted) - ΔMtop(MC))/returned uncertainty). Ideal performance would yield residuals of 0 and pull distributions centered at 0 with width 1. The results of the ensemble tests are shown here:
[Figures: results of the ensemble tests]
The residuals agree very well with 0, while the pull width is slightly above 1. Based on the pull-width measurement, we inflate our measured statistical uncertainty by 4.0%.
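
The residual and pull definitions above translate directly into code. The sketch below uses toy pseudoexperiments rather than the real signal and background models, and it assumes that the correction is a simple multiplicative scale, in which case a pull width of about 1.04 would correspond to the 4.0% inflation. All names and toy numbers are hypothetical.

```python
import numpy as np


def pull_summary(fitted, errors, true_value):
    """Summarize an ensemble test from fitted values and returned uncertainties."""
    fitted = np.asarray(fitted, dtype=float)
    errors = np.asarray(errors, dtype=float)
    residuals = fitted - true_value
    pulls = residuals / errors
    return {
        "mean_residual": residuals.mean(),
        "pull_mean": pulls.mean(),
        "pull_width": pulls.std(ddof=1),
    }


def inflate_uncertainty(stat_error, pull_width):
    """Scale a returned statistical uncertainty by the measured pull width."""
    return stat_error * pull_width


# Toy ensemble: the fit reports an uncertainty of 1.0 but truly scatters by 1.04.
rng = np.random.default_rng(1)
true_dm = -4.0
toy_fitted = true_dm + 1.04 * rng.normal(size=20000)
toy_errors = np.full(20000, 1.0)
summary = pull_summary(toy_fitted, toy_errors, true_dm)
corrected = inflate_uncertainty(1.0, summary["pull_width"])
```

A pull width above 1 means that the returned uncertainty underestimates the true spread of the fitted values, which is why the correction enlarges it.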

Systematic uncertainties

The contributions to the systematic uncertainty are shown below. The dominant effect is the signal modeling. We compared pseudoexperiments generated with Madgraph and Pythia (a comparison with Herwig was also made, but we took the larger deviation, which came from Pythia). We also compared different parton showering models, Pythia and Herwig, using events generated with Alpgen. These two parts together make up the signal modeling systematic. Because we are measuring the mass difference between the top and anti-top quark, our measurement is sensitive to any difference in the detector response to b quarks and anti-b quarks. We measured the pT balance of b and anti-b quark jets using a dijet sample with both jets b-tagged by SECVTX. We identify the b flavor using the soft muon from the semileptonic decay of the b quark. The measured deviation between data and MC is propagated to the b/bbar asymmetry systematic. We also investigate the effect of a 1% lepton charge misidentification rate, which is included as part of the lepton pT systematic.

Fit and results

We perform a fit to the data from 5.6 fb-1 of ppbar collisions and measure ΔMtop = -3.3 +/- 1.4 (stat.) +/- 1.0 (syst.) GeV/c2 = -3.3 +/- 1.7 GeV/c2. The likelihood profile is shown below:
We perform pseudoexperiments using the observed number of events to evaluate the probability of obtaining the statistical uncertainty found in data. Results are shown below.
[Figure: p-value of the observed statistical uncertainty from pseudoexperiments]
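
The total uncertainty quoted with the fit result is consistent with adding the statistical and systematic uncertainties in quadrature, which assumes that the two are uncorrelated:

```latex
\sigma_{\mathrm{total}}
  = \sqrt{\sigma_{\mathrm{stat}}^2 + \sigma_{\mathrm{syst}}^2}
  = \sqrt{(1.4)^2 + (1.0)^2}\ \mathrm{GeV}/c^2
  = \sqrt{2.96}\ \mathrm{GeV}/c^2
  \approx 1.7\ \mathrm{GeV}/c^2
```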


The Δmtreco distribution from the data, overlaid with the fitted background and signal (-4 GeV/c2) templates, is shown below.
[Figures: Δmtreco in data with fitted templates, six panels]
The distribution of the 2nd observable, Δmtreco(2), from the data, overlaid with the fitted background and signal (-4 GeV/c2) templates, is shown below.
[Figures: Δmtreco(2) in data with fitted templates, six panels]

We then combine all tagged events and compare them with the 0 GeV/c2 and -4 GeV/c2 signal samples below. We also make the same plot for 0-tag events.
[Figures: combined tagged events and 0-tag events compared with the 0 GeV/c2 and -4 GeV/c2 signal samples]

Comparisons between data and prediction for a number of kinematic variables, separated by b-tagging category, are available at the links below.

0 tag - ΔMtop = 0.0 GeV/c2
tagged - ΔMtop = 0.0 GeV/c2

0 tag - ΔMtop = -4.0 GeV/c2
tagged - ΔMtop = -4.0 GeV/c2



The distribution of the first observable (Δmtreco) is shown below, compared with the prediction from the 0.0 GeV/c2 signal (left) and the -4.0 GeV/c2 signal (right).
[Figures: Δmtreco in data compared with the 0.0 GeV/c2 and -4.0 GeV/c2 signal predictions]
Hyunsu Lee for the Authors group

Last modified Jun 7, 2010