Q & A for Blessing

August 5, 2004


1) Binned likelihood - is bin size too small for statistics that you have?  Check stability with 0.5, 1.5, 2, 2.5, 3 x bins. (Evelyn/Jaco)
       
   
A:  We ran 10k pseudo-experiments where background and signal templates had been re-binned by multiples of the original bin size (1.0).  The sensitivity of our measurement with the current bin size is as good as or better than making the measurement with a smaller/larger bin size.  The results of these pseudo-experiments are tabulated below (click on the links to see the different templates used, or the measured F0s during the various pseudo-trials):

Multiple of original bin size   F_0 measured from p'expts   Sigma measured from p'expts   Template shapes   F_0 distribution and pulls
-----------------------------   -------------------------   ---------------------------   ---------------   --------------------------
0.5                             0.709 +/- 0.003             0.310 +/- 0.003               Templates 0.5x    F0 and pull for 0.5
1.0                             0.704 +/- 0.003             0.314 +/- 0.002               Templates 1.0x    F0 and pull for 1.0
1.5                             0.709 +/- 0.003             0.317 +/- 0.003               Templates 1.5x    F0 and pull for 1.5
2.0                             0.703 +/- 0.003             0.316 +/- 0.003               Templates 2.0x    F0 and pull for 2.0
2.5                             0.705 +/- 0.003             0.307 +/- 0.003               Templates 2.5x    F0 and pull for 2.5
3.0                             0.702 +/- 0.003             0.316 +/- 0.002               Templates 3.0x    F0 and pull for 3.0



2) Use S. Miller et al. [CDF Note 6907] background numbers, they are even calculated for you in the 3.5 jet bin.  (K. Bloom)

     As you mention in the to-do list of the preblessing talk, I think you need to rederive the background normalization using your constrained fit chi2<20 cut efficiency. Make sure correlations are treated correctly: all uncertainties are uncorrelated and added in quadrature _except_ that first the uncertainties for the same process in different jet bins are added linearly, and uncertainties for W+heavy flavor are also I think added linearly.  (E. Brubaker)

   
A:  Thank you for the nice suggestions and for letting us know that these numbers are out there.  This has now been done.  The new background table is shown below.  3.5-jet and 4-jet uncertainties for the same processes, and HF uncertainties, have been added linearly.  All other (uncorrelated) uncertainties have been summed in quadrature.  The Wbbbar efficiency with chi2 < 20 is 0.649.
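
The combination rule described in this answer can be sketched as follows.  This is only a minimal illustration: the function name combine_uncertainties and the numbers are hypothetical, and it assumes that each correlated set (one process across the 3.5-jet and 4-jet bins, or the heavy-flavor processes) is passed in as its own group.

```python
import math

def combine_uncertainties(correlated_groups, uncorrelated):
    """Combine background uncertainties.

    Each inner list in `correlated_groups` holds uncertainties treated as
    fully correlated (e.g. one process in the 3.5-jet and 4-jet bins); these
    are added linearly.  The group totals and the remaining `uncorrelated`
    uncertainties are then summed in quadrature.
    """
    group_totals = [sum(group) for group in correlated_groups]
    terms = group_totals + list(uncorrelated)
    return math.sqrt(sum(term ** 2 for term in terms))

# Illustrative numbers only (not from the background table): one correlated
# pair (1.0 + 2.0 = 3.0) combined in quadrature with one uncorrelated term.
total = combine_uncertainties([[1.0, 2.0]], [4.0])
```

Adding the correlated pair linearly first gives 3.0, which then combines with 4.0 in quadrature.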

Table 5 image

3) As I've said before, I'd like to see the QCD background treated more carefully, as it will probably have a different cos(theta*) shape than processes that make higher-pT stuff.  Admittedly, to use a QCD-only shape as a systematic variation would be an overestimate.  Better would be to try to combine the different processes in their appropriate proportions and vary them properly, which might require you to master Method 2 a little. (K. Bloom)

   
The QCD shape [shown during the pre-blessing] looks suspiciously unlike Wbb; the more statistics you can get the better! (What Jahred did is to not _require_ a btag for event selection, but to use the _most likely_ heavy flavor jet, using jet probability, as a btag in the fitter. He could probably give you code for this, or explain how to do it.) Also, what's the efficiency of the chi2 cut on the non-isolated lepton data? (E. Brubaker)

    A:  Below is a plot which compares the QCD background shape with the Wbbbar+2p shape which we use in our analysis to model the entire background.  The QCD shape comes from running over detector data events (ntuplized using leptonIdCuts set 4).  The lack of agreement between the two shapes is obvious.  Because of this, we have now added a QCD shape contribution to our "background shape" systematic uncertainty.  This contribution to the background shape systematic is determined by running a large cast of pseudo-experiments where the QCD contribution to the background is modeled by the shape (data points) below, and the remaining background contributions are modeled by the usual Wbbbar+2p shape (solid green line in the plot below).  Our likelihood fitter for this cast of pseudo-experiments fits the "data" using the default background and signal templates.


wbbbar vs qcd image


4)  Why do you assume that Wbb+2p adequately represents all the background processes? Is there some intuitive argument for this? Apparently there exist templates for other processes--can we see them, and the corresponding chi2 cut efficiencies? If you have the several templates, why not use a composite template in the fit? (E. Brubaker)


    A:  We get reasonable agreement between the shapes of the cos(theta*) distributions for Wbbbar+2p and those of W+jets and Single Top.  A systematic uncertainty has been assigned for making this assumption.  Plots of the Wbbbar+2p shape versus each of these backgrounds can be found below:

wbbbar_vs_wjets image
wbbbar_vs_singletop image


5) Jet energy systematic underestimated since not taking all shifts into account. Plan to use total systematic function from Jean-Francois Arguin. (Evelyn)

    Do you really apply the jet energy systematics to data jets, not to Monte Carlo datasets?  (E. Brubaker)


    Jet energy systematics: need independent systematic for data and MC jets at levels 1 and 3. This would be taken into account by Jean-Francois' totalJetSystematic (?) function.   (E. Brubaker)

    A:  We are now using Jean-Francois' total systematic function.  We run three sets of pseudo-experiments (JES shifted by +1 sigma, shifted by -1 sigma, and unshifted) in which the "data" events come from ttopli (for signal) and the Wbbbar+2p MC sample (for background).  The shifted MC "data" are fit using the standard templates; the signal and background templates themselves are NEVER shifted.  The systematic uncertainty that we measure due to the JES is 0.06.
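
    As a rough sketch of how a shift-based systematic like this can be extracted from the three sets of pseudo-experiments: the snippet below assumes the systematic is taken as the largest shift of the mean fitted F_0 relative to the unshifted set, which is an assumption on our part for illustration.  The function name shift_systematic and all numbers are hypothetical, not the analysis values.

```python
from statistics import mean

def shift_systematic(nominal_f0s, plus_f0s, minus_f0s):
    """Largest shift of the mean fitted F_0 relative to the unshifted set."""
    nominal = mean(nominal_f0s)
    shifts = [mean(plus_f0s) - nominal, mean(minus_f0s) - nominal]
    return max(abs(shift) for shift in shifts)

# Illustrative fitted F_0 values from three tiny sets of pseudo-experiments.
nominal = [0.70, 0.72, 0.68]   # no JES shift
plus = [0.74, 0.76, 0.75]      # JES shifted by +1 sigma
minus = [0.69, 0.67, 0.68]     # JES shifted by -1 sigma
syst = shift_systematic(nominal, plus, minus)
```

    In a real study each list would hold the results of the full set of pseudo-experiments.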


6) Also going to redo systematic uncertainties for ISR/FSR (Evelyn)

   
 As you mention in the to-do list, use the modern ISR/FSR samples. (E. Brubaker)

    A:  This systematic uncertainty has been redone using Un-Ki's more/less samples.  Our total ISR/FSR systematic is now 0.02.



7) Also going to redo systematic uncertainties for PDFs (Evelyn)

     
Also, need uncertainty for PDF eigenvectors, right?  (E. Brubaker)

    A:  We will punt on this for now, as it is currently work in progress.  What we are currently using for the PDF systematic (using ttopei, ttop3e and ttop4e samples) is certainly a conservative estimate.  We will update the PDF systematic uncertainty in time for publication.


8) Some of your systematics are based on comparisons to ttopei.  Do you in fact get the F0 you expect from ttopei? (K. Bloom)

   
A: No, in fact we do not measure the F_0 (should be 0.703) that we would expect in either the ttopei (PYTHIA) or ttopli (HERWIG) datasets.   The values of F_0 that we measure in these datasets are shown in the plot below.  This difference is the reason that we have now added a systematic uncertainty due to Monte Carlo modeling.  The value of this systematic uncertainty is 0.03 and this was taken as the largest deviation from a value of F_0 = 0.703.

mc modeling syst image



9)  What accounts for the differences between your table 1 (event counts in tag/njet bins) and CDF 6845 table 2 (before chi2 cut)? (E. Brubaker)

   
A:  Thank you for noticing this.  There are two things that we were not doing for the overall event selection, but which have now been included:
    1. Veto dileptons where one lepton is a PHOENIX electron
    2. XFT trigger fiduciality requirement (COT exit radius > 140 cm)
Our table of event counts by tag/njet bins (before chi2 cut) now matches that of CDF Note 6845.  This table is shown below:

table 1


10)  Is m_{lb} formed using the generically corrected jet, or the JF-corrected jet, or the jet (and lepton) as output from the fitter? (E. Brubaker)

   
A:  We use the jet and lepton as output from the fitter.  Sorry, we didn't mention this in the first version of the note, but this has now been clarified in the latest version.


11) Why do you use a binned likelihood fit rather than unbinned? (E. Brubaker)

   
A:  Due to some features of our templates, we found the signal and background shapes extremely difficult to fit using simple/reasonable functions (see the images included below).  If anyone has some nice suggestions on how to do this, we would welcome them (Evelyn has recently suggested looking at CDF Note 6214 by Carlos Sanchez et al., which apparently uses a neural network package to fit shapes).  

Difficult to fit? Image



12) Can you show plots of the + and - errors from the pseudoexperiments? (E. Brubaker)

        A:  Okay, that plot is shown below.  This is from 10k pseudo-experiments where the number of events from experiment-to-experiment is allowed to fluctuate (according to a Poisson distribution with a mean of 31).  Sometimes Minos fails, hence the spike at zero.

   asymmetric error image
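
   A minimal sketch of how one such pseudo-experiment can be thrown, with the total event count fluctuating according to a Poisson distribution with mean 31.  The templates, the background fraction, and the function names are hypothetical, and this stand-alone version does not include the likelihood fit or the Minos error evaluation.

```python
import math
import random

def draw_poisson(mean, rng):
    """Draw one Poisson-distributed integer (Knuth's multiplication method)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count

def throw_pseudo_experiment(sig_template, bkg_template, bkg_fraction,
                            mean_events, rng):
    """Return binned pseudo-data with a Poisson-fluctuated total event count."""
    n_events = draw_poisson(mean_events, rng)
    bins = range(len(sig_template))
    counts = [0] * len(sig_template)
    for _ in range(n_events):
        # Pick signal or background, then a bin according to that template.
        template = bkg_template if rng.random() < bkg_fraction else sig_template
        counts[rng.choices(bins, weights=template)[0]] += 1
    return counts

rng = random.Random(2004)
# Hypothetical 4-bin templates and background fraction, for illustration only.
pseudo_data = throw_pseudo_experiment(
    [0.1, 0.3, 0.4, 0.2], [0.4, 0.3, 0.2, 0.1], 0.2, 31, rng)
```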
       

13) What about checking the results of pseudoexperiments at different values of F_0? That would help show the measurement is unbiased, and that the pulls are reasonable, throughout the range of possible measurements. (E. Brubaker)

        A:  This plot is shown below.  The x-axis is the true F_0 (between 0.0 and 1.0, varied in steps of 0.01), and the y-axis shows the mean measured value of F_0 from 10k pseudo-experiments.  The black line, which has a slope of 1.02, is the best fit to the data points.  The red line is drawn for reference and shows a line of slope 1.0 with a y-intercept of 0.0.
  f0 versus mu(f0) image
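
  A linearity check of this kind amounts to a straight-line fit of the mean measured F_0 against the true F_0.  The sketch below uses an unweighted least-squares fit, which is an assumption (the fit in the plot may weight the points by their uncertainties), and the input points are toy values generated from a line with slope 1.02 and a made-up intercept of -0.01.

```python
def fit_line(xs, ys):
    """Unweighted least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Toy inputs: true F_0 from 0.0 to 1.0 in steps of 0.01, and "measured"
# means placed exactly on a line (the intercept here is hypothetical).
true_f0 = [i / 100 for i in range(101)]
measured_f0 = [1.02 * f - 0.01 for f in true_f0]
slope, intercept = fit_line(true_f0, measured_f0)
```

  An unbiased measurement would return a slope close to 1 and an intercept close to 0.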

14) Top mass and jet energy scale correlated?  (Paul Tipton)

         A:  (K. McFarland answers) At this point the CDF measurement of the top mass does not dominate the world average, so the assumption that they are not correlated is reasonable.


15) Use mtop=175GeV to constrain in mass fitter, used to choose correct lepton-b assignment.  Asked to use mtop=178GeV? (Jaco)

   
    A:  This would require generating new signal MC where the top mass is 178.0 GeV/c^2, and running the Top Mass Fitter with the top mass constrained to 178.0 GeV/c^2.  This is on the to-do list, and we are considering doing it in time for publication.


16)  For display purposes, choose a wider binning (data points)  (K. Bloom)

       
A:  Okay, we'll try.  I'm not certain that I know how to do this in ROOT (overlay two histograms with different binning).
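
Independent of the plotting package, the wider binning itself is just a merge of adjacent bins.  A minimal sketch, with a hypothetical function name and made-up counts (in ROOT, TH1::Rebin performs a similar merge, though we have not tried it for this plot):

```python
def merge_bins(counts, factor):
    """Merge each run of `factor` adjacent bins into one wider bin."""
    if factor < 1 or len(counts) % factor != 0:
        raise ValueError("factor must be a positive divisor of the number of bins")
    return [sum(counts[i:i + factor]) for i in range(0, len(counts), factor)]

# Illustrative data counts in 6 fine bins, merged pairwise for display.
coarse = merge_bins([1, 0, 3, 2, 4, 1], 2)
```

To overlay the merged data points on templates that keep the finer binning, the merged counts would need to be divided by the merge factor (or everything drawn as a density) so that both are on the same vertical scale.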


17) Why use 3.5 jets? (Someone)

        A:  We used exactly the same selection as the top mass group.  There are 10 3.5-jet events (out of 37 events total) in our dataset which pass our selection criteria but would have been excluded if we had been strict about using only 4-jet events.


18) Prediction of negative events in one bin? (Pekka)

          A:  The prediction has to be positive by construction for F0 between 0 and 1.  There are no bins from the fit (below) which contain a prediction of negative events.
result image
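
The "by construction" argument can be checked numerically.  The sketch below assumes the simplified form of the prediction quoted in question 29 (signal split between the T_0 and T_L templates, plus background) and ignores the acceptance correction; the templates and yields are hypothetical.  For F0 between 0 and 1 each bin is a weighted sum of non-negative templates with non-negative weights, so it cannot go negative, while for F0 outside that range it can.

```python
def predicted_counts(f0, n_sig, n_bkg, t_0, t_l, t_bkg):
    """Expected events per bin for a longitudinal fraction f0."""
    return [n_sig * (f0 * a + (1.0 - f0) * b) + n_bkg * c
            for a, b, c in zip(t_0, t_l, t_bkg)]

# Hypothetical normalized 3-bin templates, for illustration only.
t_0 = [0.2, 0.5, 0.3]
t_l = [0.6, 0.3, 0.1]
t_bkg = [0.4, 0.4, 0.2]

inside = predicted_counts(0.7, 25.0, 6.0, t_0, t_l, t_bkg)   # physical F0
outside = predicted_counts(3.0, 25.0, 6.0, t_0, t_l, t_bkg)  # unphysical F0
```

With these toy inputs every bin of `inside` is positive, while the first bin of `outside` is negative because the weight (1 - F0) on T_L has become negative.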


19)  Different chi2 cut than mass analysis?  (Andy Beretvas)
       
        A:  Yes, since we run the mass fitter in constrained mode (rather than free-fit mode) and use a chi2 < 20 cut that is optimal for this analysis.


20) Likelihood takes off at 1.5 - is the steep rise coming from a bin with data events having a prediction of zero events?  Check this. If yes, then need to think about meaning of uncertainties with +-0.5 likelihood change. (K. McFarland)
   
        A:  In the latest fit (post pre-blessing, after we dropped two events which should have failed the general selection criteria) we still see a steep rise, this time around F_0 = 1.7 (see plot of F_0 versus NLL below).  There is no bin in our cos(theta*) distribution which has a non-zero number of data events and a prediction of zero events from our likelihood fit.

f0 vs nll image


21) Quoting uncertainty on top mass as a systematic?  Take it out, not quote as a systematic and instead show dependence of result on top mass.  Set a precedent for future results and papers. (K. McFarland)


      Should we be using the new world average for top mass? 178.0 +/- 4.3. I'm
sorry I missed the discussion at the last meeting (referenced in the
minutes):

"Discussion on the treatment of the top mass value. Suggestion is to measure
helicity at given top mass and re-do for other top mass values. Will also
solve correlation of Jet E scale and top mass systematics."

I don't think I understand the suggestion. First of all, I don't
understand how jet energy and top mass systematics are correlated, unless
it's since the range of masses you use in the top mass systematic is
loosely derived from a world average that includes a CDF run I measurement
which has jet energy systematics that are partly correlated with the
current jet energy systematics.

And what does redo for other top mass values mean? Generate new helicity
samples for templates, change mt in the fitter constraint, change mt in
the cos theta* calculation, all of the above? Even redo all the
systematics? That would give you a measurement of F_0 as a function of mt,
which sounds great to me, but it's a lot of work.

What you do now seems perfectly reasonable to me: Report a central value
that assumes a given top mass throughout the measurement (175), then
include a systematic that accounts for possible differences in that
parameter. This is much simpler, since you just have to throw
pseudoexperiments from a different, already existing, MC sample.

Here, my main issue is that now mt = 178.0 +/- 4.3, not 175 +/- 5. One
possibility is to investigate the F_0 from pseudoexperiments for several
mass points between, say, 165 and 190, discover a linear relationship, and
extract your central value and partial systematic from that relationship
using mt = 178.0 +/- 4.3.  (E. Brubaker)

        A:  We're considering doing this in time for the publication to set a paradigm for future W helicity measurements/publications.  This will probably involve a large amount of additional MC production, and we just aren't able to do this on short notice.


22) The constraints in the fitter are claimed to be m_t < 175 GeV and Gamma_t =2.5 GeV.  First, is that really supposed to be "<" on m_t?  Why? (K. Bloom)
       
        A:  No, this was a typo in the very first draft of our note; for the constrained fit, m_t = 175 GeV/c^2.  Sorry about that, this typo has since been fixed!


23) As for Gamma_t, is the right thing to do to set the constraint to the expected intrinsic width of the top (that must be the 2.5 GeV), or should it be to something more like the precision with which we measure the mass, which is larger? (K. Bloom)

        A:  We wanted to be consistent with what was being used for the top mass (template) analysis.  Footnote 12, of CDF Note 6845 states, "We don't know where this number came from.  The commonly referenced theoretical top width is [appx] 1.5 GeV.  We should be consistent [with] what the generators use, if anything."


24)  Your treatment of the acceptance bias seems unnecessarily complicated. You have F_0^obs = F_0^obs(F_0,a_{L0}). So why not let the likelihood simply measure F_0^obs, which is really the measured quantity, then just solve analytically for F_0, including propagating the error on a_{L0}. This simplifies the likelihood, and eliminates annoying pseudoexperiments and "uncertainty on the uncertainty" for the acceptance bias systematic. (E. Brubaker)
 
      
        A:  Well, it might seem complicated, but that's how we did it.  We wanted to be consistent with the lepton pT helicity measurement since we plan to combine our two results (cos(theta*) and lepton pT) in the near future.

 

25)  Future possibility: Choose the leptonic b by considering more than the lowest chi2 combination. For example, take the jet that is most often identified as the leptonic b by the five combinations with lowest chi2, or by all the combinations such that chi2min < chi2 < chi2min + 10. (E. Brubaker)

        A:  Yes, that is a nice suggestion.  


26)  Do you use the JF corrections? Which version? (E. Brubaker)

        A:  Yes, we use the JF corrections with the top mass fitter (we do "topSpecificCorrs set JF" in our talk-to to the TopMassFitModule for both data and MC).  I have recently been told that the mass template analysis used "JFL5", which has been shown to improve the top mass sensitivity over the "JF" topSpecificCorrs.  There is nothing wrong with using "JF", but we are considering using "JFL5" for the publication results.


27)  What chi2 cut is used in the free fit when you're scanning the delta(m_{t}) cut? Did you think about scanning both the chi2 cut and delta(m_t) cut for the free fit, for completeness? (E. Brubaker)

        A:  We used a chi2 cut of 10.0 with the free-fit and the delta(m_t) cut to produce the numbers in the acceptance plot of our note.  We did scan both delta(m_t) and chi2 on the free-fit by themselves, and in the end looked for the "best of the best" (chi2 < 10.0 and scan delta(m_t)).


28) See the comment about acceptance bias -> don't need true F_0 as a parameter in the fit. Just fit for F_0^obs, then solve for F_0. (E. Brubaker)

       
A:  Again, we wanted to be consistent with the lepton pT helicity measurement since we plan to combine them in the near future.
   

29)  I think this is a minor problem: your constraint for the background is on the expected number of background events, not on the background fraction.  Also, I think you should _not_ fix the N_e parameter to the number of events observed so that the data and "expected" histograms have the same integral. The likelihood will take into account possible Poisson fluctuations in the expectations. I would use the following expression for the prediction:
        mu_i = N_B*T_B,i + N_0*T_0,i + N_L*T_L,i
             = N_B*T_B,i + N_S*F_0^obs*T_0,i + N_S*(1-F_0^obs)*T_L,i
then I would change the background constraint back to use N_BG, not f_BG. Then fit for N_B, N_S, F_0^obs. This shouldn't change the result too much, but I think it's more correct.  (E. Brubaker)

        A:  We don't think that this is a problem, and it seems to be the usual way that other W helicity analyses have handled constraining their background.
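
        For reference, the prediction proposed in the question, together with a binned Poisson likelihood built from it, can be sketched as below.  This is not our fitter: the Gaussian constraint on the background is left out, and the templates, yields, and function names are hypothetical.

```python
import math

def expected_counts(n_b, n_s, f0_obs, t_b, t_0, t_l):
    """mu_i = N_B*T_B,i + N_S*F_0^obs*T_0,i + N_S*(1-F_0^obs)*T_L,i"""
    return [n_b * b + n_s * f0_obs * z + n_s * (1.0 - f0_obs) * t
            for b, z, t in zip(t_b, t_0, t_l)]

def binned_nll(observed, expected):
    """Poisson negative log-likelihood, dropping the constant log(n!) terms."""
    nll = 0.0
    for n, mu in zip(observed, expected):
        if mu < 0 or (mu == 0 and n > 0):
            return math.inf  # data in a bin with no (or a negative) prediction
        nll += mu - (n * math.log(mu) if n > 0 else 0.0)
    return nll

# Hypothetical normalized 2-bin templates and yields, for illustration only.
mu = expected_counts(10.0, 20.0, 0.7, [0.5, 0.5], [0.25, 0.75], [0.75, 0.25])
```

        With normalized templates the predicted bins sum to N_B + N_S, so the total yield is free to differ from the observed number of events, which is the point of the suggestion.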
 

30) Out of curiosity, have you ever tried running pseudoexperiments where 100% of the events are taken from the background template? (E. Brubaker)

        A:  No, we have not tried this.


31) [Regarding systematics estimates from pseudo-experiments] We (template mass) define shifts using the median of the collection of p.e. results, on the grounds that it is less sensitive to fluctuations and tails than the mean of a Gaussian fit. What do you think?  (E. Brubaker)

 
       A:  Our shapes (F_0 measured from a large cast of pseudo-experiments) tend to fit well to a Gaussian, so for us using the mean instead of the median is okay.
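
       To illustrate the concern behind the question, here is a toy comparison of the mean and the median when a distribution has a tail.  The numbers are made up, and the plain arithmetic mean is used as a stand-in for the mean of a Gaussian fit, which is typically less affected by isolated outliers than the arithmetic mean.

```python
from statistics import mean, median

# Illustrative fitted F_0 values: a narrow core plus one far-tail outlier.
fitted_f0 = [0.68, 0.69, 0.70, 0.70, 0.70, 0.71, 0.72, 1.70]

core_mean = mean(fitted_f0[:-1])   # mean without the outlier
full_mean = mean(fitted_f0)        # mean pulled upward by the outlier
full_median = median(fitted_f0)    # median stays at the core value
```

       When the distribution is close to Gaussian, as ours is, the two estimators agree and the choice makes little difference.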


32) [Regarding systematics estimates from pseudo-experiments]  In general, it would be good to see all the results in the note, meaning all the shifts when you say you choose the largest.   (E. Brubaker)

        A:  Okay, we've added tables (and some additional plots) for all of the systematics estimates to the latest version of the note.


33) [Regarding systematics estimates from pseudo-experiments]  What are the uncertainties on these uncertainties?  (E. Brubaker)

        A:  From 10k pseudo-experiments the uncertainty on the systematic uncertainties is about +/- 0.003.
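
        A number of this size is what one would expect if the uncertainty is dominated by the statistical error on the mean of the pseudo-experiment distribution, spread/sqrt(N).  The sketch below assumes exactly that and uses the spread of 0.314 from the 1.0x row of the bin-size table in question 1; the function name is hypothetical.

```python
import math

def error_on_mean(spread, n_experiments):
    """Statistical uncertainty on the mean of n_experiments results."""
    return spread / math.sqrt(n_experiments)

from_10k = error_on_mean(0.314, 10_000)  # about 0.003
from_1k = error_on_mean(0.3, 1_000)      # the 0.0095 quoted in question 34
```

        A systematic defined as the difference of two such means would carry a somewhat larger uncertainty if the two sets of pseudo-experiments are statistically independent.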


34)  MC Statistics: this is nice work. However, I would argue that you've shown the uncertainty due to MC statistics is negligible and should be zero.  That's because the width of 0.01 is just the uncertainty on the mean of a 1k normal distribution: RMS/sqrt(N) = 0.3/sqrt(1000) = 0.0095.   (E. Brubaker)

        A:  Thanks for pointing this out... but the systematic is so small compared to the others, we'll keep this for now.


35)  See the acceptance bias comment. Can eliminate the use of pseudoexperiments for this uncertainty by just solving for F_0 directly and propagating the systematic.   (E. Brubaker)


        A:  We wanted to be consistent with what was done in the lepton pT analysis.

36)  Will you use Feldman-Cousins?  (C. Plager)

     
A: Yes.  The plot is shown below.  F-C tells us our measured value is F_0 = 0.89 +0.11 -0.38.  The confidence belts allow us to set the limit F_0 > 0.25 at 95% CL.

        cbs