Expected p-value = 2.0 × 10⁻⁷ (5.1σ); observed p-value = 9.4 × 10⁻⁵ (3.7σ)
σSingle Top = 2.2 ± 0.7 pb
|Vtb| = 0.88 ± 0.14 (exp.) ± 0.07 (theory)
|Vtb| > 0.66 at 95% confidence level
There are currently three separate CDF searches for single top
production: an analysis using neural
networks, an analysis using a
multivariate likelihood function technique, and an analysis using
matrix
element discriminants. The combination uses neural networks
taking the discriminant outputs from these three techniques as inputs
to form a single, more powerful discriminant. The weights for the
combination neural networks are chosen using genetic algorithms to
provide optimal sensitivity. This analysis uses
2.2 fb⁻¹ of CDF Run II data collected between February
2002 and August 2007 at the Tevatron in proton-antiproton collisions
at a center-of-mass energy of 1.96 TeV. We measure a combined
single top s- and t-channel cross section of 2.2 ± 0.7 pb.
The observed signal has a significance of 3.7σ, while the median
significance in pseudo-experiments is 5.1σ. These sensitivities
represent an improvement of approximately 9% for the observed
significance and 13% for the expected significance over the best
single analysis. From the cross section measurement we extract a
value for |Vtb| of 0.88 ± 0.14 (exp.) ± 0.07
(theory). As a cross check, we combine the results of the individual
analyses using a modified version of the Best Linear Unbiased
Estimator (BLUE) technique to obtain a result of 2.1
+0.7/−0.6 pb. BLUE also found that the observed
signal has a significance of 3.7σ (its expected significance is
4.7σ).
Evolved Neural Network Combination
Below we describe the analysis level combination technique using
neural networks optimized using genetic algorithms. We call such
networks evolved neural networks. This technique provides the primary
combination result.
Technique
We use the same data sample, event selection, background estimate and systematic uncertainties used by each of the three individual CDF single top analyses (see the neural network, multivariate likelihood function, or matrix element analysis pages for more details). The outputs of the three individual analyses are combined into a single discriminant using a neural network. The neural network's weights and topology are optimized using a technique known as neuro-evolution of augmenting topologies (NEAT). In addition, through a careful choice of the initial neural network configuration, we enable NEAT to optimize the binning as well. A separate neural network discriminant is optimized for each of the following channels in the lepton-triggered samples: 2-jet 1-tag, 2-jet 2-tag, 3-jet 1-tag, and 3-jet 2-tag.
We fit the evolved neural network output in data to signal and
background templates from Monte Carlo using a binned likelihood
technique to extract a cross section and limit on |Vtb|.
In addition, we compare the data to two hypotheses: H0
assumes that there is no single top production while H1
supposes the Standard Model rate of single top. The likelihood ratio
Q = -2ln(p(H1)/p(H0)) is used to perform this
comparison. We calculate a p-value for the observed data assuming the
null hypothesis (H0) and compare it to our expected
p-value, evaluated from ensembles of pseudo-experiments constructed
assuming H1 (SM amount of single top).
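The hypothesis test described above can be sketched in a few lines. This is a minimal illustration, not the analysis code: it assumes independent Poisson-distributed bins, ignores systematic uncertainties, and uses hypothetical function names and toy template numbers. With the sign convention Q = -2ln(p(H1)/p(H0)), more signal-like outcomes give smaller Q, so the p-value is taken here as the fraction of H0 pseudo-experiments with Q at or below the observed value. Using the median of the H1 ensemble for the expected p-value is likewise an assumption of this sketch.

```python
import math
import statistics


def likelihood_ratio_q(observed, signal, background):
    """Q = -2 ln(p(H1)/p(H0)) for independent Poisson-distributed bins.

    H0 predicts `background` events per bin, H1 predicts `signal + background`.
    The factorial terms cancel in the ratio, leaving
    Q = 2 * sum(s - n * ln(1 + s / b)).
    """
    return 2.0 * sum(
        s - n * math.log(1.0 + s / b)
        for n, s, b in zip(observed, signal, background)
    )


def p_value(q_observed, q_null_ensemble):
    """Fraction of H0 pseudo-experiments at least as signal-like as the data."""
    as_signal_like = sum(1 for q in q_null_ensemble if q <= q_observed)
    return as_signal_like / len(q_null_ensemble)


def expected_p_value(q_signal_ensemble, q_null_ensemble):
    """p-value of the median H1 pseudo-experiment (assumed definition)."""
    return p_value(statistics.median(q_signal_ensemble), q_null_ensemble)


# Toy templates: hypothetical numbers, not CDF yields.
signal = [1.0, 2.0, 4.0]
background = [50.0, 20.0, 5.0]

q_background_like = likelihood_ratio_q([50, 20, 5], signal, background)
q_signal_like = likelihood_ratio_q([51, 22, 9], signal, background)
print(q_background_like, q_signal_like)
```

In this toy case the excess in the most signal-rich bin drives Q negative, while data matching the background prediction gives a positive Q.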
Results
The plot below collects the neural network output for events from all channels (2 and 3 jets, single and double tag, trigger lepton and extended muons) into a single plot. For this plot, we do not use the optimized binning chosen during neuro-evolution, but simply choose a convenient binning for display purposes.
Figure: All channels, shown on a linear scale.
Figure: All channels, shown on a log scale.
Below, we show the output of the evolved neural network applied to data, compared to the prediction for the SM amount of single top plus backgrounds, separately for each channel. Recall that the neuro-evolution technique employed here optimizes not only the shape, but also the binning. The plots shown below use the optimized binning chosen during the evolution. Note: The extended muon channels are not shown here because the discriminant used in those channels is simply the matrix element EPD.
Figure: 2-jet, 1-tag channel, shown on a linear scale.
Figure: 2-jet, 1-tag channel, shown on a log scale.
Figure: 2-jet, 2-tag channel, shown on a linear scale.
Figure: 2-jet, 2-tag channel, shown on a log scale.
Figure: 3-jet, 1-tag channel, shown on a linear scale.
Figure: 3-jet, 1-tag channel, shown on a log scale.
Figure: 3-jet, 2-tag channel, shown on a linear scale.
Figure: 3-jet, 2-tag channel, shown on a log scale.
Figure: Cross section. σSingle Top = 2.2 ± 0.7 pb; |Vtb| = 0.88 ± 0.14 (exp.) ± 0.07 (theory).
Figure: Limit on |Vtb|, assuming a flat prior on |Vtb|².
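As a consistency check on the quoted numbers, the sketch below relates the measured cross section to |Vtb|. It assumes that the single top cross section scales as |Vtb|² and that the SM expectation of 2.864 pb quoted on this page corresponds to |Vtb| = 1; both are assumptions of this illustration, and the function name is hypothetical. Only the experimental uncertainty is propagated (to first order); the theory uncertainty is not reproduced.

```python
import math

SIGMA_SM_PB = 2.864  # SM expectation quoted on this page, assumed to be for |Vtb| = 1


def vtb_from_cross_section(sigma_pb, sigma_err_pb, sigma_sm_pb=SIGMA_SM_PB):
    """Return (|Vtb|, experimental error), assuming sigma = |Vtb|^2 * sigma_SM."""
    vtb = math.sqrt(sigma_pb / sigma_sm_pb)
    # First-order error propagation: d|Vtb| / |Vtb| = 0.5 * d(sigma) / sigma.
    vtb_err = 0.5 * vtb * sigma_err_pb / sigma_pb
    return vtb, vtb_err


vtb, vtb_err = vtb_from_cross_section(2.2, 0.7)
print(f"|Vtb| = {vtb:.2f} +/- {vtb_err:.2f} (exp.)")  # |Vtb| = 0.88 +/- 0.14 (exp.)
```

Under these assumptions the simple scaling gives values that round to the quoted central value and experimental uncertainty.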
The expected and observed p-values are given below. The expected significance for the combination represents a 13% improvement over the best single analysis. The observed significance is 9% better than the best single analysis.
Figure: Expected and observed p-value. Expected p-value = 2.0 × 10⁻⁷ (5.1σ); observed p-value = 9.4 × 10⁻⁵ (3.7σ).
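The correspondence between the quoted p-values and significances can be checked with a one-sided Gaussian tail conversion. The sketch below assumes the one-sided convention, which is consistent with the numbers on this page; the function name is hypothetical.

```python
from statistics import NormalDist


def significance(p_value):
    """One-sided Gaussian significance, in units of sigma, for a p-value."""
    # inv_cdf(p) is negative for small p, so flip the sign.
    return -NormalDist().inv_cdf(p_value)


print(f"expected: {significance(2.0e-7):.1f} sigma")  # expected: 5.1 sigma
print(f"observed: {significance(9.4e-5):.1f} sigma")  # observed: 3.7 sigma
```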
Figure: Linearity scan. Each point corresponds to 3000 pseudo-experiments generated for a given true value of β = σ/σSM, as shown on the x-axis. The line is not a fit, but simply the line defined by β(Measured) = β(True). The median pseudo-experiment result tracks this line very closely.
Figure: Results of fitting for σSingle Top in each of the eight channels separately, together with the result of the global fit to all channels.
Most of us are used to using a χ² to combine N uncorrelated numbers.

We can express this using matrices:

If we have multiple types of uncertainties (and different measurements have correlations in these uncertainties):

The beauty of BLUE is that you do not need a Minuit-like minimization routine.

We then calculate a weight wᵢ for each measurement:

Note that weights can be negative, but:

And BLUE now tells us that:

As shown, BLUE is basically a weighted sum technique that takes correlations among the measurements into account. Specifically, analyses that have smaller uncertainties will have greater weight in the combination.
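For reference, the steps above can be written in the standard textbook form of BLUE for a single observable. The notation is an assumption of this sketch: xᵢ are the N measurements, x̂ is the combined value, V is the total covariance matrix, α labels the sources of uncertainty, and ρ are the correlation coefficients between measurements for a given source.

```latex
% N uncorrelated measurements x_i with uncertainties \sigma_i
\chi^2 = \sum_{i=1}^{N} \frac{(x_i - \hat{x})^2}{\sigma_i^2}

% Matrix form; V is diagonal when the measurements are uncorrelated
\chi^2 = (\mathbf{x} - \hat{x}\,\mathbf{u})^{T} \, V^{-1} \, (\mathbf{x} - \hat{x}\,\mathbf{u}),
\qquad \mathbf{u} = (1, \dots, 1)^{T}

% Several sources of uncertainty \alpha, each with its own correlations
V_{ij} = \sum_{\alpha} \rho^{(\alpha)}_{ij} \, \sigma^{(\alpha)}_{i} \, \sigma^{(\alpha)}_{j}

% Weights from the inverse covariance matrix (no numerical minimization needed)
w_i = \frac{\sum_{j} (V^{-1})_{ij}}{\sum_{j,k} (V^{-1})_{jk}},
\qquad \sum_{i} w_i = 1

% Combined value and its variance
\hat{x} = \sum_{i} w_i \, x_i,
\qquad \sigma_{\hat{x}}^{2} = \sum_{i,j} w_i \, V_{ij} \, w_j
```

Minimizing the χ² in matrix form with respect to x̂ gives exactly these weights, which is why only a matrix inversion is required.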
The uncertainties in most analyses, however, depend on the value of the quantity of interest (lower central values usually have smaller errors than higher values). These three single top analyses are no exception: their uncertainties depend on the measured single top cross section.
This dependence of the uncertainties on the measured value can cause a bias if not taken into account. For example, take the case where we are combining two numbers whose uncertainties are always 10% of the measured values. If we measure 4 and 8, we'll be combining 4 ± 0.4 and 8 ± 0.8. It now looks as if the first measurement (4) has twice the precision of the second (8), and the combination will be biased low.
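The size of this bias is easy to verify with a plain inverse-variance weighted mean, which is what BLUE reduces to for uncorrelated measurements. The sketch below is illustrative only and the function name is hypothetical.

```python
def inverse_variance_mean(values, errors):
    """Weighted mean of uncorrelated measurements with weights 1 / error^2."""
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, total ** -0.5


values = [4.0, 8.0]
errors = [0.1 * v for v in values]  # each error is 10% of its own measured value

mean, error = inverse_variance_mean(values, errors)
print(f"{mean:.2f} +/- {error:.2f}")  # 4.80 +/- 0.36
```

Half the error means four times the weight, so the result lands at 4.8, well below the unweighted average of 6.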
To get around this problem, we study all uncertainties and parameterize them as a function of the single top cross section. We then pick a starting value of the single top cross section, estimate all uncertainties at this value, and then calculate a BLUE combination. We iterate this until the input cross section and output cross section are very close.
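The iteration can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the code used for the result: the function names are hypothetical, and the uncertainty model (a 10% error evaluated at the current cross section estimate, with optional correlation `rho`) is the toy case from the example above rather than the parameterizations used in the analysis.

```python
def solve(matrix, rhs):
    """Solve matrix @ y = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    y = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (aug[r][n] - tail) / aug[r][r]
    return y


def blue(values, covariance):
    """Return BLUE weights, combined value and variance for one observable."""
    y = solve(covariance, [1.0] * len(values))  # y = V^-1 u
    norm = sum(y)                                # u^T V^-1 u
    weights = [yi / norm for yi in y]
    combined = sum(w * x for w, x in zip(weights, values))
    return weights, combined, 1.0 / norm


def iterative_blue(values, covariance_at, start, tol=1e-9, max_iter=100):
    """Re-evaluate the covariance at the combined value until it stops moving."""
    current = start
    for _ in range(max_iter):
        _, combined, variance = blue(values, covariance_at(current))
        if abs(combined - current) < tol:
            return combined, variance ** 0.5
        current = combined
    raise RuntimeError("iterative BLUE did not converge")


def ten_percent_covariance(x, rho=0.0):
    """Toy model: both measurements have a 10% error evaluated at value x."""
    err = 0.1 * x
    return [[err * err, rho * err * err], [rho * err * err, err * err]]


combined, error = iterative_blue([4.0, 8.0], ten_percent_covariance, start=4.8)
print(f"{combined:.2f} +/- {error:.2f}")  # 6.00 +/- 0.42
```

Because both uncertainties are now evaluated at the same assumed cross section, the two measurements receive equal weight and the combination returns 6 instead of the biased 4.8.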
Below are shown two examples of uncertainties as a function of the single top cross section. In both cases, they are in terms of β which is the measured cross section normalized to the SM expectation (2.864 pb).
Statistical Uncertainty for the Three Analyses As A Function Of Single Top Cross Section
β = 1 ⇒ SM expectation σsingle top = 2.864 pb

Jet Energy Scale Uncertainty for the Three Analyses As A Function Of Single Top Cross Section
β = 1 ⇒ SM expectation σsingle top = 2.864 pb

As with many analyses, the single top analyses have many asymmetric uncertainties. BLUE is based on Gaussian χ² distributions and is not directly able to treat asymmetric distributions. To get around this limitation, we created Asymmetric Iterative BLUE (AIB).
AIB is a set of three BLUE combinations. The center BLUE combination is given the average values of all uncertainties. The upper and lower BLUE combinations are given the upper and lower uncertainties, respectively.
The center BLUE combination gives the central value as well as the total size of the uncertainty. The ratio of the uncertainties from the upper and lower BLUE combinations is used to determine the ratio of the upper and lower uncertainties of the final answer:

By definition, the central value of AIB is always the same as the central value of BLUE, and the average of AIB's upper and lower uncertainties is equal to the BLUE uncertainty.
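The two conditions stated here (the upper-to-lower ratio is taken from the upper and lower BLUE combinations, and the average equals the center BLUE uncertainty) fix the split uniquely. A minimal sketch, with a hypothetical function name and toy inputs that are not the values from this analysis:

```python
def split_asymmetric(center_err, upper_err, lower_err):
    """Split the center BLUE error into upper and lower parts.

    The parts average to center_err and keep the ratio upper_err / lower_err.
    """
    total = upper_err + lower_err
    plus = 2.0 * center_err * upper_err / total
    minus = 2.0 * center_err * lower_err / total
    return plus, minus


# Toy inputs: errors from the center, upper and lower BLUE combinations.
plus, minus = split_asymmetric(center_err=0.5, upper_err=0.66, lower_err=0.44)
print(f"+{plus:.2f} / -{minus:.2f}")
```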

Input Cross Sections and their Statistical Correlations
As described above, we need the measured cross sections, the statistical correlations between the analyses, and all uncertainties parameterized as a function of the single top cross section.
AIB calculates a single top cross section of 2.1 +0.7/−0.6 pb. Using pseudo-experiments with no signal present, AIB finds an observed significance of 3.7σ (median expected significance of 4.7σ).
In addition, AIB can be used to calculate the consistency of the three analyses. Not surprisingly, the combination shows that the three analyses are very compatible: the data combination had a χ² less than or equal to that of 87% of pseudo-experiments. Finally, 14.8% (1.1σ) of pseudo-experiments with the SM expected single top cross section had a measured cross section of 2.1 pb or smaller.
Below we show extrapolations from our existing results to larger datasets. These extrapolations assume single top production at the Standard Model rate.
Figure: Extrapolation of expected signal significance.
Figure: Extrapolation of expected error on |Vtb|.
Figure: Extrapolation from observed signal significance. This plot shows an extrapolation of possible results as we accumulate data, starting from our current 2.2 fb⁻¹ result.
Last modified: Wed Mar 26 09:38:23 CDT 2008