Search for Electroweak Single Top Quark Production
using the Matrix Element Method with L = 955 pb^-1

 
Florencia Canelli (FNAL), Peter Dong (UCLA), Bernd Stelzer (UCLA), Rainer Wallny (UCLA)


σ(single top) = 2.7 +1.5/-1.3 pb (M_top = 175 GeV/c^2)
 

  • Abstract
  • Event selection
  • Method
  • Validation of the method
  • Systematic uncertainties
  • Results
  • Examination of other variables
  • References

  • Summer 2006 Conference Note




     
     
     Abstract

    We present a search for electroweak single top quark production using 955 pb^-1 of CDF II data collected between February 2002 and March 2006 at the Tevatron in proton-antiproton collisions with center-of-mass energy 1.96 TeV. The analysis employs a matrix-element technique which calculates event probability densities for signal and background hypotheses. We combine the probabilities to form a discriminant variable which is evaluated for signal and background Monte Carlo events. The resulting template distributions are fit to the data using a binned likelihood approach. We search for a combined single top s- and t-channel signal and measure a cross section of 2.7 +1.5/-1.3 pb, assuming a top quark mass of 175 GeV/c^2. We use the CLs/CLb method to calculate the signal significance. The observed p-value of this analysis is 1.0% and the expected (median) p-value in pseudo-experiments is 0.6%.

     
     
     Event Selection

    This analysis uses events from leptonic decay of the W boson. We require a single, well isolated high-transverse-energy lepton, large missing transverse energy (from the neutrino), and exactly two high-transverse-energy jets. Of these jets, we require at least one to be identified as originating from a b-quark by secondary vertex tagging. The secondary vertex tag identifies tracks associated with the jet originating from a vertex displaced from the primary vertex. We further require the missing transverse energy and the jets not to be collinear for low values of missing transverse energy. This requirement removes a large fraction of the non-W background while retaining most of the signal.
    Our major backgrounds are: W + heavy-flavor jets (Wbb-bar, Wcc-bar, Wc + jets); mistags, which are W + light-quark events in which a light jet is mistakenly tagged as a b jet; non-W, which are multijet events in which one jet is misidentified as a lepton and another jet is mismeasured, producing a false missing transverse energy signature; and top pair production events in which one lepton or two jets are lost due to detector acceptance.

    Predicted event yield with 955 pb^-1

    Process              Events
    s-channel            15.44 ± 2.23
    t-channel            22.36 ± 3.64
    Single top           37.80 ± 5.87
    tt-bar               58.35 ± 13.46
    Diboson              13.72 ± 1.85
    Z + jets             11.92 ± 4.42
    Wbb-bar              170.9 ± 50.7
    Wcc-bar              63.5 ± 19.9
    Wc + jets            68.6 ± 19.0
    Non-W                26.2 ± 15.9
    Mistags              136.1 ± 19.7
    Total background     549.3 ± 95.2
    Total prediction     587.1 ± 96.6
    Observed             644
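
    As a quick consistency check on the table, the following Python sketch re-adds the central values listed above and derives the expected single-top fraction of the predicted sample. The dictionary and variable names are illustrative only. The uncertainties are not recombined here, because the table does not state how they are correlated.

```python
# Central values copied from the predicted-yield table (955 pb^-1).
signal = {"s-channel": 15.44, "t-channel": 22.36}
background = {
    "tt-bar": 58.35,
    "Diboson": 13.72,
    "Z + jets": 11.92,
    "Wbb-bar": 170.9,
    "Wcc-bar": 63.5,
    "Wc + jets": 68.6,
    "Non-W": 26.2,
    "Mistags": 136.1,
}
observed = 644

total_signal = sum(signal.values())
total_background = sum(background.values())
total_prediction = total_signal + total_background

# Fraction of the predicted sample that is single top.
signal_fraction = total_signal / total_prediction

print(f"single top:       {total_signal:.2f}")
print(f"total background: {total_background:.1f}")
print(f"total prediction: {total_prediction:.1f}")
print(f"signal fraction:  {signal_fraction:.1%}")
print(f"observed - predicted: {observed - total_prediction:.1f}")
```

    The sums agree with the quoted totals to rounding, and the expected signal makes up only about 6% of the selected sample, which is why a dedicated discriminant is needed.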



    Jet multiplicity distribution for signal and background processes. We compare the predicted number of events in each W+jet bin to the number of events observed in data. Uncertainties on the data are statistical; the hatch marks represent systematic uncertainties in the background estimate.
     
     
     Method

    This analysis is based on a Matrix-Element method in order to maximize the use of information in the events (see references). We calculate event probability densities under the signal and background hypotheses as follows. Given a set of measured variables of each event (the 4-vectors of the lepton and the two jets), we calculate the probability densities that these variables could result from a given underlying interaction (signal and background). The probability is constructed by integrating over the parton-level differential cross-section, which includes the matrix element for the process, the parton distribution functions, and the detector resolutions. This analysis calculates probabilities for four different underlying processes: s-channel, t-channel, Wbb-bar, and Wc + jet.

    Transfer functions are used to include detector effects. Lepton quantities and jet angles are considered to be well measured. However, jet energies are not, and their resolution is parameterized from Monte Carlo simulation to create a jet resolution transfer function. We integrate over the quark energies and over the z-momentum of the neutrino to create a final probability density.

    We use the probabilities to construct a discriminant variable for each event. The two single-top channels are combined to form a single signal probability. We also introduce extra non-kinematic information by using the output (b) of a neural network b-tagger, which assigns to each b-tagged jet a probability (0 < b < 1) of originating from a b quark. The discriminant variable is then constructed as:
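
    To illustrate how the quantities named above can be combined, here is a minimal Python sketch of a likelihood-ratio discriminant. The functional form is an assumption made for this sketch and is not taken from the analysis: the signal probability is the unweighted sum of the s- and t-channel probabilities, the b-tagger output b weights the hypotheses containing a real b jet (single top and Wbb-bar), and (1 - b) weights the Wc + jet hypothesis. The function name and argument names are hypothetical.

```python
def event_probability_discriminant(p_s, p_t, p_wbb, p_wcj, b):
    """Illustrative likelihood-ratio discriminant (assumed form, see lead-in).

    p_s, p_t : event probability densities for s- and t-channel single top
    p_wbb    : event probability density for the Wbb-bar hypothesis
    p_wcj    : event probability density for the Wc + jet hypothesis
    b        : neural-network b-tagger output, 0 < b < 1
    """
    if not 0.0 < b < 1.0:
        raise ValueError("b-tagger output must satisfy 0 < b < 1")

    # Assumption: the two signal channels are combined with equal weight.
    p_signal = p_s + p_t

    # Assumption: b multiplies hypotheses with a true b jet,
    # (1 - b) multiplies the hypothesis without one.
    signal_term = b * p_signal
    background_term = b * p_wbb + (1.0 - b) * p_wcj

    return signal_term / (signal_term + background_term)


# Toy numbers only: a signal-like event with a confidently tagged jet.
print(event_probability_discriminant(p_s=0.4, p_t=0.6, p_wbb=0.2, p_wcj=0.5, b=0.9))
```

    With this form the discriminant lies between 0 and 1 and moves toward 1 as the signal probabilities dominate.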





    To quantify the single top content in the data, we perform a binned maximum likelihood fit. We fit a linear combination of signal and background shapes of the event probability discriminant to the data. The background normalizations are Gaussian-constrained in the fit. The fit determines the most probable value of the single-top cross section. All sources of systematic uncertainty are included as nuisance parameters in the likelihood function. A source of systematic uncertainty can affect both the normalization and the shape for a given process; correlations between the two are taken into account through a common nuisance parameter (δ_i).


    Here β_j is the template fit parameter for each process, indexed by j; δ_i are the nuisance parameters for each systematic effect, indexed by i, with (relative) normalization uncertainty ε_ji and (relative) shape uncertainty κ_jik; k indexes the bins of the event probability discriminant. H(δ_i) denotes the Heaviside function, which is used to treat asymmetric uncertainties properly.
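
    The roles of β_j, δ_i, ε_ji, κ_jik and H(δ_i) can be illustrated with a small Python sketch of a binned negative log-likelihood. The exact functional form is an assumption of this sketch (a Poisson term per bin, a unit Gaussian constraint on each δ_i, and a Gaussian constraint around 1 on each constrained β_j), as are all function and argument names. Separate "up" and "down" arrays stand in for the asymmetric ε and κ values selected by the Heaviside function.

```python
import math


def heaviside(x):
    """H(x): 1 for x > 0, otherwise 0. Selects the up or down variation."""
    return 1.0 if x > 0 else 0.0


def expected_events(k, beta, delta, templates, eps_up, eps_dn, kappa_up, kappa_dn):
    """Expected content of bin k, summed over processes j."""
    mu = 0.0
    for j, beta_j in enumerate(beta):
        norm, shape = 1.0, 1.0
        for i, d in enumerate(delta):
            up, dn = heaviside(d), heaviside(-d)
            # Rate effect of systematic i on process j.
            norm *= 1.0 + abs(d) * (eps_up[j][i] * up + eps_dn[j][i] * dn)
            # Shape effect of systematic i on process j in bin k.
            shape *= 1.0 + abs(d) * (kappa_up[j][i][k] * up + kappa_dn[j][i][k] * dn)
        mu += beta_j * norm * templates[j][k] * shape
    return mu


def negative_log_likelihood(data, beta, delta, templates,
                            eps_up, eps_dn, kappa_up, kappa_dn, beta_sigma):
    """-ln L. beta_sigma[j] is None for an unconstrained (signal) process."""
    nll = 0.0
    for k, n_k in enumerate(data):
        mu_k = expected_events(k, beta, delta, templates,
                               eps_up, eps_dn, kappa_up, kappa_dn)
        # Poisson term, including the constant ln(n_k!).
        nll += mu_k - n_k * math.log(mu_k) + math.lgamma(n_k + 1)
    for beta_j, sigma_j in zip(beta, beta_sigma):
        if sigma_j is not None:
            nll += 0.5 * ((beta_j - 1.0) / sigma_j) ** 2
    # Unit Gaussian constraint on every nuisance parameter.
    nll += sum(0.5 * d * d for d in delta)
    return nll


# Toy inputs: one signal and one background template, two bins, one systematic.
templates = [[2.0, 1.0], [10.0, 5.0]]
eps_up, eps_dn = [[0.10], [0.0]], [[-0.05], [0.0]]
kappa_up = [[[0.0, 0.0]], [[0.0, 0.0]]]
kappa_dn = [[[0.0, 0.0]], [[0.0, 0.0]]]
print(negative_log_likelihood([12, 6], [1.0, 1.0], [0.0], templates,
                              eps_up, eps_dn, kappa_up, kappa_dn, [None, 0.2]))
```

    In a real fit this function would be handed to a minimizer that varies β_j and δ_i together, so that a single δ_i shifts rate and shape coherently.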

     
     
     Validation of the Method
     
    Several tests have been performed for this analysis. We compare the distributions of many kinematic variables predicted by the Monte Carlo samples for signal and background to the data. In particular, we compare the distributions of the input variables to ensure that the data match the Monte Carlo prediction. We evaluate the event probability discriminant in the untagged W+2jets sample, a high-statistics control sample with very little single-top content (<0.5%). We also evaluate the event probability discriminant in the tagged dilepton + 2 jets sample (using only the most energetic lepton) and in the tagged lepton + 4 jets sample (using only the two most energetic jets as input to the discriminant), both of which should agree well with tt-bar Monte Carlo. In all data control samples, the data agree well with the Monte Carlo prediction.

    The input variables and resulting discriminant evaluated in the untagged W+2jets sample.


    The discriminants evaluated in the tagged dilepton + 2jets sample and the tagged lepton + 4jets sample, both of which are mostly composed of tt-bar events.


    The input variables to the signal and background probabilities calculation. The Monte Carlo has been scaled to match the data (factor 1.096).


     
     
     Systematic Uncertainties

    Each systematic can include a normalization uncertainty and a shape uncertainty. The normalization uncertainty includes changes to the acceptance from the systematic effect, and the shape uncertainty includes changes to the template histograms. Both these effects are included in the likelihood function as shown above.

    Listed below are systematic uncertainties estimated from various Monte Carlo samples.
    • The jet energy scale systematic is found by changing the jet energy scale by 1 standard deviation and recalculating acceptance and the discriminant. This affects both normalization and shape.
    • We increase or decrease the amount of initial state radiation in the Monte Carlo to assign a systematic from this effect.
    • We increase or decrease the amount of final state radiation in the Monte Carlo to assign a systematic from this effect.
    • We vary the eigenvectors in the CTEQ parton distribution function tables to determine the uncertainty from this effect. We also include the effect of using different versions of CTEQ and of using MRST with different values of Λ_QCD.
    • We include a systematic uncertainty to account for the modeling of the single top sample (MadEvent).
    • We include an uncertainty on event detection efficiency due to the scale factors that we apply to our Monte Carlo samples (mainly b-tagging and lepton ID scale factors).
    • We include a 6% uncertainty on our measured luminosity.
    • We include a systematic uncertainty which accounts for variations of the neural network b tagger output.
    • We use an alternative mistag model and take the difference from the default model as a systematic uncertainty.
    • We use an alternative model for our non-W background. We also assign a systematic uncertainty to the flavor composition of this background, which must be specified in order to apply the neural-net b tagger.
    • We vary the factorization and renormalization scale (Q^2) in the Monte Carlo samples that have been created with the ALPGEN Monte Carlo program.

    Systematic uncertainty           Rate              Shape
    Jet energy scale                 +1.6% / -2.0%     X
    Initial state radiation          +2.0% / +0.3%     X
    Final state radiation            +2.6% / +1.9%     X
    Parton distribution functions    +1.4% / -0.4%     X
    Monte Carlo generator            ±1.6%
    Event detection efficiency       ±7.4%
    Luminosity                       ±6%
    Neural-net b tagger              N/A               X
    Mistag model                     N/A               X
    Non-W model                      N/A               X
    Q^2 scale in Alpgen MC           N/A               X
    Total rate uncertainty           ±10.5%
    Systematic uncertainties. The numbers here are given for the combined single-top channel. Jet energy scale and neural network b tagger systematics are applied to all processes (not shown here).
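
    The quoted total rate uncertainty can be cross-checked with a short Python sketch. It assumes that the total is the quadrature sum of the individual rate uncertainties, taking the larger magnitude of each asymmetric pair; this prescription is an assumption of the sketch, not a statement from the table. The names are illustrative.

```python
import math

# Rate uncertainties in percent, as (first, second) variation from the table.
# Symmetric entries are written as (+x, -x).
rate_systematics = {
    "Jet energy scale": (+1.6, -2.0),
    "Initial state radiation": (+2.0, +0.3),
    "Final state radiation": (+2.6, +1.9),
    "Parton distribution functions": (+1.4, -0.4),
    "Monte Carlo generator": (+1.6, -1.6),
    "Event detection efficiency": (+7.4, -7.4),
    "Luminosity": (+6.0, -6.0),
}


def total_rate_uncertainty(sources):
    """Quadrature sum using the larger magnitude of each pair (assumed)."""
    return math.sqrt(sum(max(abs(a), abs(b)) ** 2 for a, b in sources.values()))


total = total_rate_uncertainty(rate_systematics)
print(f"total rate uncertainty: ±{total:.1f}%")
```

    Under this assumption the result rounds to the ±10.5% given in the table, with event detection efficiency and luminosity as the two dominant contributions.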


     
     
     Results
     
    The results of the binned maximum likelihood fit are shown below. All sources of systematic uncertainties (normalization and shape) are included in this fit.



    Event probability discriminant distribution for signal and background processes. All templates are normalized to the best fit value of the maximum likelihood fit result. The inset shows the most sensitive bins of the analysis (EPD>0.7).
    Results from full dataset (644 candidate events):

    σ(single top) = 2.7 +1.5/-1.3 pb




     

    Hypothesis Test

    We also calculate the signal significance of this result using the CLs/CLb method developed at LEP. In this approach, pseudo-experiments are generated under the null hypothesis (H0), which assumes background only (no single top), and under the test hypothesis (H1), which assumes background plus single top. The p-value is the probability that the background-only hypothesis fluctuates to a result at least as signal-like as the one observed in data. We estimate the expected p-value by taking the median of the test hypothesis distribution as the 'observed' value.
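
    The counting step behind these p-values can be sketched in a few lines of Python. The sketch assumes that each pseudo-experiment has already been reduced to one test-statistic value and that larger values are more signal-like; both the convention and the function names are assumptions made for illustration.

```python
import statistics


def p_value(null_stats, observed_stat):
    """Fraction of background-only (H0) pseudo-experiments that are
    at least as signal-like as the observed test statistic."""
    n_extreme = sum(1 for q in null_stats if q >= observed_stat)
    return n_extreme / len(null_stats)


def expected_p_value(null_stats, test_stats):
    """Expected p-value: use the median of the signal-plus-background (H1)
    pseudo-experiments as the 'observed' value."""
    return p_value(null_stats, statistics.median(test_stats))


# Toy test-statistic values, not analysis output.
h0 = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
h1 = [7, 9, 11]
print(p_value(h0, observed_stat=8))
print(expected_p_value(h0, h1))
```

    Resolving a p-value near 1% requires far more than a hundred background-only pseudo-experiments, so the toy lists here are only meant to show the counting.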

    Expected p-value: 0.6% (2.5σ)
    Observed p-value: 1.0% (2.3σ)
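
    The σ values quoted with these p-values correspond to a one-sided Gaussian tail. A minimal Python sketch of the conversion, using only the standard library (the function name is illustrative):

```python
from statistics import NormalDist


def p_to_sigma(p):
    """Convert a one-sided p-value to a Gaussian significance in sigma."""
    return NormalDist().inv_cdf(1.0 - p)


print(f"expected: {p_to_sigma(0.006):.1f} sigma")  # 2.5
print(f"observed: {p_to_sigma(0.010):.1f} sigma")  # 2.3
```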



     
     
     More cross-checks
     
    We look at variables that are sensitive to single top production. We enrich the sample in signal events by making increasingly tight cuts on the event probability discriminant (EPD) and look for the characteristic changes in these variables and in the behavior of the data. Although the uncertainties are large, there is good agreement between the data and the Monte Carlo simulation including single top.


    Increasing cuts on the EPD for the product of the lepton charge and the pseudorapidity of the untagged jet, a variable known to be sensitive to t-channel.



    Increasing cuts on the EPD for the invariant mass of the lepton, neutrino, and b-jet.



    Increasing cuts on the EPD for the invariant mass of the lepton, neutrino, and leading jet.



    Increasing cuts on the EPD for the invariant mass of the lepton, neutrino, and second jet.



    Increasing cuts on the EPD for the opening angle between the jet and lepton in the reconstructed top rest frame.


     
     
     Conclusions
     
    We performed the first search for single top using a Matrix-Element based analysis. We apply our method to 955 pb^-1 of data taken by the CDF experiment. We include rate and shape systematic uncertainties in our method. We measure a single top cross section of σ(single top) = 2.7 +1.5/-1.3 pb. We use the CLs/CLb method to calculate the signal significance. The observed p-value in 955 pb^-1 of CDF data is 1.0%. The expected (median) p-value in pseudo-experiments is 0.6%.

     
     
     References