Search for Electroweak Single Top-Quark Production using Neural Networks with 2.2 fb⁻¹ of CDF II data

Dominic Hirschbühl, Jan Lück, Thomas Müller, Adonis Papaikonomou, Thomas Peiffer, Manuel Renz, Svenja Richter, Irja Schall, Jeannine Wagner-Kuhr, Wolfgang Wagner

Universität Karlsruhe

 


Abstract
Results
Event Selection
Neural Network Input Variables
Neural Network Output Crosscheck with 0tag sample
Templates for Combined Search
Systematic Uncertainties
Expected Significance for Combined Search
Binned Likelihood Fit to Data for Combined Search
Observed Significance for Combined Search
Variables in the high-output region
Public Conference Note (pdf)
 


 

Abstract

We report on a search for electroweak single top-quark production with CDF II data corresponding to 2.2 fb⁻¹ of integrated luminosity. We apply neural networks to construct discriminants that distinguish between single top-quark and background events. In our analysis we assume a top-quark mass of 175 GeV/c². We combine t- and s-channel events into one single top-quark signal under the assumption that the ratio of the two processes is given by the standard model (SM). The expected significance under the assumption of a SM cross section is determined to be 4.4 σ (p-value of 0.00000529). A binned likelihood fit to the data measures a single top-quark production cross section of $2.0^{+0.9}_{-0.8}$ pb. The observed p-value is 0.00060790, which corresponds to a significance of 3.2 σ.

 

Results
The sum of the NN Outputs of the four different neural networks. Background and signal templates are normalized to the fitted values.







Summary of the results for the four different neural networks and the final result of the simultaneous fit in all channels.



For the combined search the observed single top-quark cross section is $2.0^{+0.9}_{-0.8}$ pb.

 

 

 

Event Selection

The CDF event selection exploits the kinematic features of the signal final state, which contains a top quark, a bottom quark, and possibly additional light-quark jets. To reduce multijet backgrounds, the W boson originating from the top quark is required to decay leptonically. One therefore demands a single high-energy electron or muon (ET(e) > 20 GeV or PT(μ) > 20 GeV/c) and large missing transverse energy from the undetected neutrino (MET > 25 GeV).

The backgrounds belong to the following categories: Wbb, Wcc, Wc, mistags (light-quark jets misidentified as heavy-flavor jets), top-quark pair production (tt̄ events in which one lepton or two jets are lost due to detector acceptance), non-W (QCD multijet events where a jet is erroneously identified as a lepton), Z→ll, and diboson production (WW, WZ, and ZZ). We remove a large fraction of the backgrounds by demanding that exactly two jets with ET > 20 GeV and |η| < 2.8 be present in the event. At least one of these two jets has to be tagged as a b-quark jet using displaced-vertex information from the silicon vertex detector (SVX). The non-W content of the selected electron dataset is further reduced by several requirements on MET, MET significance, transverse W-boson mass, and several angles between the MET vector, lepton vectors, and jet vectors. The numbers of expected and observed events are listed in the tables below.
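The kinematic requirements above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Jet` and `Event` classes and all field names are hypothetical, the event is assumed to carry exactly one identified lepton, and the additional non-W requirements (MET significance, transverse W-boson mass, angular cuts) are not modeled because their thresholds are not given here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Jet:
    et: float        # transverse energy in GeV
    eta: float       # pseudorapidity
    b_tagged: bool   # True if the secondary-vertex tagger identified the jet


@dataclass
class Event:
    lepton_flavor: str   # "e" or "mu" (hypothetical encoding)
    lepton_pt: float     # E_T for electrons, p_T for muons
    met: float           # missing transverse energy in GeV
    jets: List[Jet]


def passes_selection(event: Event, n_jets: int = 2) -> bool:
    """Apply the lepton, MET, jet-multiplicity, and b-tag requirements."""
    if event.lepton_flavor not in ("e", "mu"):
        return False
    if event.lepton_pt <= 20.0:      # E_T(e) > 20 GeV or p_T(mu) > 20 GeV/c
        return False
    if event.met <= 25.0:            # MET > 25 GeV
        return False
    # Count only jets inside the stated acceptance.
    good_jets = [j for j in event.jets if j.et > 20.0 and abs(j.eta) < 2.8]
    if len(good_jets) != n_jets:     # exactly n_jets jets
        return False
    # At least one of the selected jets must carry a b tag.
    return any(j.b_tagged for j in good_jets)
```

The `n_jets` parameter is an illustrative convenience: the text quotes the two-jet requirement, while the networks described later also cover three-jet categories.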

 

Neural Network Input Variables
Neural networks combine kinematic and event-shape variables into a powerful discriminant. In total we use four different networks in our analysis: one for 2jet1tag events, one for 2jet2tag events, one for 3jet1tag events, and one for 3jet2tag events. One of the variables is the output of the KIT flavor separator. The KIT flavor separator gives an additional handle to reduce the large background components that contain no real b quarks, namely mistags and charm backgrounds. Together these amount to about 50% of the W+2 jets data sample even after imposing the requirement that one jet is identified by the secondary vertex tagger of CDF. The following plots show the 14 variables for the 2jet1tag neural net. The plots in the third column show the variables in the "zero-tag" sample (for cross-check).

Please find the plots for the variables of the other neural networks here: 2jet2tag, 3jet1tag, 3jet2tag.



Plot columns: MC distributions | data-MC comparison | data-MC comparison in the zero-tag sample. One row of plots per variable:
the mass of the reconstructed top quark
the neural network output of the KIT flavor separator for the b-tagged jet (no zero-tag plot)
the invariant mass of the two jets
the product of the lepton charge and the pseudorapidity of the light-quark jet
the transverse mass of the reconstructed top quark
the cosine of the polar angle between the tight lepton and the light-quark jet in the top-quark rest frame
the transverse energy of the light-quark jet
the cosine of the polar angle between the charged lepton in the W-boson rest frame and the direction of the W boson
the pseudorapidity of the reconstructed W boson
the transverse mass of the reconstructed W boson
the sum of the pseudorapidities of the two jets
the transverse momentum of the charged lepton
the scalar sum of transverse energies
the cosine of the angle between the charged lepton in the W-boson rest frame and the W-boson momentum in the top-quark rest frame

 

Neural Network Output Crosscheck with 0tag sample
Plot columns: NN Output | NN Output with logarithmic scale | ratio of the difference between data and prediction to the prediction. One row of plots per network applied to the zero-tag sample:
2jet1tag neural net
2jet2tag neural net
3jet1tag neural net
3jet2tag neural net

 

Templates for Combined Search
We use four different neural networks, one for the 2jet1tag sample, one for the 2jet2tag sample, one for the 3jet1tag sample, and one for the 3jet2tag sample. Since this is a combined search, we have one fit template for single top-quark events, which is the combination of the template for s-channel and the template for t-channel single top-quark production according to the ratio of the cross-sections predicted by the SM.
Fit templates of the 2jet1tag neural net. Fit templates of the 2jet2tag neural net.
Fit templates of the 3jet1tag neural net. Fit templates of the 3jet2tag neural net.
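As a rough illustration of how one combined signal template can be built from the two channels, the sketch below merges unit-normalized t- and s-channel histograms, weighting each by its expected event yield. The function name and the yields in the usage note are hypothetical; the assumption is that the expected yields already encode the SM cross-section ratio together with the acceptance of each channel.

```python
from typing import List, Sequence


def combine_signal_templates(
    t_channel: Sequence[float],
    s_channel: Sequence[float],
    n_t: float,
    n_s: float,
) -> List[float]:
    """Merge two unit-normalized templates into one unit-normalized template.

    n_t and n_s are the expected numbers of t- and s-channel events
    (hypothetical inputs, assumed to follow from the SM prediction).
    """
    if len(t_channel) != len(s_channel):
        raise ValueError("templates must have the same binning")
    total = n_t + n_s
    if total <= 0.0:
        raise ValueError("expected signal yield must be positive")
    # Each bin is the yield-weighted average of the two channels.
    return [(n_t * t + n_s * s) / total for t, s in zip(t_channel, s_channel)]
```

For example, `combine_signal_templates([0.1, 0.3, 0.6], [0.2, 0.4, 0.4], n_t=30.0, n_s=20.0)` returns a three-bin template that still sums to one, so a single fit parameter can scale the whole single top-quark signal.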

 

Systematic Uncertainties
Systematic uncertainties can cause a shift in the event detection efficiency for events of different physics processes, but can also cause a change in the shape of the template distributions. The rate uncertainties for the four different neural networks are summarized in the tables. Below are three examples of systematic shape uncertainties in the 2jet1tag sample: jet energy scale (JES) for the single top-quark template, factorization and renormalization scale (Q²) for Wbb events, and modeling uncertainty on the KIT flavor separator output (KIT opt.).

Systematic rate uncertainties for the 2jet1tag neural net. Systematic rate uncertainties for the 2jet2tag neural net.
Systematic rate uncertainties for the 3jet1tag neural net. Systematic rate uncertainties for the 3jet2tag neural net.



The JES systematic uncertainty for the four different neural networks.


Systematic shape uncertainties in the 2jet1tag sample: jet energy scale (JES) for the single top-quark template.
Systematic shape uncertainties in the 2jet1tag sample: factorization and renormalization scale (Q²) for Wbb events.
Systematic shape uncertainties in the 2jet1tag sample: modeling uncertainty on the KIT flavor separator output (KIT opt.).
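The distinction between rate and shape effects can be made concrete with a small sketch. It uses linear bin-by-bin interpolation between a nominal template and its +1σ and −1σ variations, which is one common way to parameterize shape systematics; whether this analysis uses exactly this scheme is not stated here, so treat the function and its parameters as hypothetical.

```python
from typing import List, Sequence, Tuple


def shifted_template(
    nominal: Sequence[float],
    up: Sequence[float],
    down: Sequence[float],
    delta: float,
    rate_up: float,
    rate_down: float,
) -> Tuple[List[float], float]:
    """Return (unit-normalized shape, rate scale factor) for a nuisance value.

    delta is the nuisance parameter in units of sigma. rate_up and rate_down
    are the relative normalization changes at +1 and -1 sigma (e.g. 0.05, -0.04).
    """
    shape = []
    for n, u, d in zip(nominal, up, down):
        # Interpolate towards the up variation for delta > 0, down otherwise.
        step = (u - n) if delta >= 0.0 else (n - d)
        shape.append(max(n + delta * step, 0.0))  # no negative bin contents
    norm = sum(shape)
    shape = [x / norm for x in shape]  # shape effect only, normalization separate
    # The rate effect scales the expected yield independently of the shape.
    rate = 1.0 + abs(delta) * (rate_up if delta >= 0.0 else rate_down)
    return shape, rate
```

Keeping the shape normalized and returning the rate factor separately mirrors the split in the text: the rate factor changes how many events are expected, while the shape changes only how they are distributed over the NN output.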

 

Expected Significance for Combined Search
To compute the significance of a potentially observed signal, we perform a hypothesis test. Two hypotheses are considered. The null hypothesis, H0, assumes that the single-top cross section is zero (β1 = 0). The second hypothesis, H1, assumes that the single-top production cross section is the one predicted by the standard model (β1 = 1). The objective of our analysis is to observe single-top production, that is, to reject the null hypothesis. The hypothesis test is based on the Q-value, $Q = -2\,[\ln L_{\mathrm{red}}(\beta_1 = 1) - \ln L_{\mathrm{red}}(\beta_1 = 0)]$, where $L_{\mathrm{red}}(\beta_1 = 1)$ is the value of the reduced likelihood function at the standard model prediction and $L_{\mathrm{red}}(\beta_1 = 0)$ is the value of the reduced likelihood function for a single-top cross section of zero. Using two ensemble tests, the distribution of Q-values is determined for the case with single-top included at the standard model rate, q1, and for the case of zero single-top cross section, q0. The two Q-value distributions are shown below. To quantify how compatible an outcome is with the null hypothesis we define the p-value, often also named 1-CLb: the probability, under H0, of obtaining a Q-value at least as signal-like as the one considered. To quantify the sensitivity of our analysis we define the expected p-value pexp = p(Q1med), where Q1med is the median of the Q-value distribution q1 for the hypothesis H1. The meaning of pexp is the following: under the assumption that H1 is correct, one expects to observe a p-value of pexp or smaller with a probability of 50%. We find pexp = 0.00000529, including all systematic uncertainties. In other words, assuming the predicted single-top cross section, we expect, with a probability of 50%, to see an excess over the background at least as large as a 4.4σ background fluctuation.


Distributions of Q-values for two ensemble tests, one with single-top events present at the expected standard model rate, one without any single-top events. The expected significance under the assumption of a SM cross-section is determined to be 4.4 σ.
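The ensemble-test procedure can be illustrated with a toy example. The sketch below is a simplification under explicit assumptions: the bin contents are invented, the likelihood is a plain product of Poisson terms with no nuisance parameters (so it stands in for the reduced likelihood only loosely), and the p-value is taken as the fraction of background-only pseudo-experiments whose Q-value is at least as signal-like (at most as large) as the reference value.

```python
import math
import random
import statistics
from typing import List, Sequence


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson variate (Knuth's method, fine for small means)."""
    limit = math.exp(-mean)
    k = 0
    prod = rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def q_value(counts: Sequence[int], bkg: Sequence[float], sig: Sequence[float]) -> float:
    """Q = -2 [ln L(beta=1) - ln L(beta=0)] for Poisson-distributed bins."""
    ln_l1 = sum(n * math.log(b + s) - (b + s) for n, b, s in zip(counts, bkg, sig))
    ln_l0 = sum(n * math.log(b) - b for n, b in zip(counts, bkg))
    return -2.0 * (ln_l1 - ln_l0)


def ensemble(rng: random.Random, bkg: Sequence[float], sig: Sequence[float],
             beta: float, n_trials: int) -> List[float]:
    """Q-values of pseudo-experiments generated with signal strength beta."""
    return [
        q_value([poisson(rng, b + beta * s) for b, s in zip(bkg, sig)], bkg, sig)
        for _ in range(n_trials)
    ]


def p_value(q0_values: Sequence[float], q_ref: float) -> float:
    """Fraction of background-only trials at least as signal-like as q_ref."""
    return sum(q <= q_ref for q in q0_values) / len(q0_values)


rng = random.Random(1)
background = [40.0, 20.0, 5.0]   # invented expected background per bin
signal = [1.0, 3.0, 4.0]         # invented expected SM signal per bin
q0 = ensemble(rng, background, signal, beta=0.0, n_trials=5000)
q1 = ensemble(rng, background, signal, beta=1.0, n_trials=5000)
p_exp = p_value(q0, statistics.median(q1))
```

Resolving a p-value as small as the one quoted above by direct counting would need far more background-only pseudo-experiments than this toy generates, which is why the example uses a much less sensitive setup.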

 

Binned Likelihood Fit to Data for Combined Search
Finally, the templates for all four networks are fitted simultaneously to the observed distributions using a binned likelihood function. The fit yields a single top-quark cross section of $2.0^{+0.9}_{-0.8}$ pb. Below you find the distributions of observed data and MC normalized to the SM prediction (left-hand side) and MC normalized to the simultaneously fitted values (right-hand side) for all four networks and for the sum.
Plot columns: background and signal templates normalized to the SM prediction | background and signal templates normalized to the simultaneously fitted values. One row of plots per distribution:
NN Output for the 2jet1tag neural net
NN Output for the 2jet2tag neural net
NN Output for the 3jet1tag neural net
NN Output for the 3jet2tag neural net
The sum of the NN Outputs of the four different neural networks
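A binned likelihood fit of this kind can be sketched with a single free parameter. In the example below, β scales the expected signal in every bin relative to the SM prediction, the background is held fixed, and the interval is read off where the negative log-likelihood rises by 0.5 above its minimum. These are simplifying assumptions for illustration: the function names are hypothetical, and a full fit would also let background normalizations and systematic effects vary.

```python
import math
from typing import Sequence, Tuple


def neg_log_likelihood(beta: float, counts: Sequence[float],
                       bkg: Sequence[float], sig: Sequence[float]) -> float:
    """Poisson negative log-likelihood, dropping terms independent of beta."""
    total = 0.0
    for n, b, s in zip(counts, bkg, sig):
        mu = b + beta * s            # expected events in this bin
        total += mu - n * math.log(mu)
    return total


def fit_beta(counts: Sequence[float], bkg: Sequence[float], sig: Sequence[float],
             beta_max: float = 5.0, steps: int = 5000) -> Tuple[float, float, float]:
    """Scan beta on a grid; return (best fit, lower bound, upper bound)."""
    grid = [beta_max * i / steps for i in range(steps + 1)]
    nll = [neg_log_likelihood(beta, counts, bkg, sig) for beta in grid]
    i_best = min(range(len(grid)), key=nll.__getitem__)
    # Interval where the NLL is within 0.5 of the minimum.
    inside = [beta for beta, v in zip(grid, nll) if v - nll[i_best] <= 0.5]
    return grid[i_best], min(inside), max(inside)
```

Multiplying the fitted β and its bounds by the SM cross section would turn the result into a cross-section measurement with asymmetric uncertainties, in the same form as the value quoted above. A grid scan is used here only because it is easy to verify; a numerical minimizer would be the usual choice.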

 

Observed Significance for Combined Search


The observed Q-value (indicated by the arrow) yields a p-value of 0.00060790, which corresponds to an observed significance of 3.2 σ.
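The correspondence between a p-value and a significance in units of σ can be checked with the Python standard library. The sketch assumes the one-sided Gaussian convention, in which the p-value equals the upper-tail probability of a standard normal distribution beyond the quoted number of standard deviations; the function name is illustrative.

```python
from statistics import NormalDist


def significance(p_value: float) -> float:
    """Convert a one-sided p-value into Gaussian standard deviations."""
    if not 0.0 < p_value < 1.0:
        raise ValueError("p-value must lie strictly between 0 and 1")
    # The lower-tail quantile of p is the negative of the upper-tail one.
    return -NormalDist().inv_cdf(p_value)


observed = significance(0.00060790)   # observed p-value quoted in the text
expected = significance(0.00000529)   # expected p-value quoted in the text
```

Under this convention the two p-values round to the 3.2 σ and 4.4 σ quoted in the text.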

 

Variables in the high-output region
The invariant mass of the reconstructed top-quark in the high-output region (NN Output > 0.4). The invariant mass of the reconstructed top-quark in the high-output region (NN Output > 0.8).
The output of the KIT flavor separator in the high-output region (NN Output > 0.4). The output of the KIT flavor separator in the high-output region (NN Output > 0.8).
The product of the lepton-charge and the pseudo-rapidity of the light-quark jet in the high-output region (NN Output > 0.4). The product of the lepton-charge and the pseudo-rapidity of the light-quark jet in the high-output region (NN Output > 0.8).

 

Our single top-quark results were approved (blessed) by CDF on Tuesday 2/26/2008 and on Thursday 3/6/2008.