1) Binned likelihood: is the bin size too small for the statistics
that you have? Check stability with 0.5, 1.5, 2, 2.5, 3 x bins.
(Evelyn/Jaco)
A: We ran 10k
pseudo-experiments where background and signal templates had been
re-binned by multiples of the original bin size (1.0). The
sensitivity of our measurement with the current bin size is as good as
or better than making the measurement with a smaller/larger bin size.
The results of these pseudo-experiments are tabulated below (click
on the links to see the different templates used, or the measured F0s
during the various pseudo-trials):
2) Use S. Miller et al. [CDF Note 6907] background numbers; they are
even calculated for you in the 3.5 jet bin. (K. Bloom)
As you mention in the to-do list of
the preblessing talk, I think you need to rederive the background
normalization using your constrained fit chi2<20 cut efficiency. Make
sure correlations are treated correctly: all uncertainties are
uncorrelated and added in quadrature _except_ that first the
uncertainties for the same process in different jet bins are added
linearly, and uncertainties for W+heavy flavor are also I think added
linearly. (E. Brubaker)
A: Thank you for the nice suggestions and for letting us know that
these numbers are out there. This has now been done. The new
background table is shown below. 3.5-jet and 4-jet uncertainties
for the same processes, and HF uncertainties, have been added linearly.
All other uncertainties (uncorrelated) have been summed in
quadrature. The Wbbbar efficiency with chi2 < 20 is 0.649.
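The combination rule described in this answer (linear sums for correlated
pieces, quadrature for everything else) can be sketched as follows. This
is only an illustration: `combine_uncertainties` is a hypothetical helper,
and the process names and numbers are placeholders, not the values from
the background table.

```python
import math


def combine_uncertainties(per_bin, linear_groups):
    """Combine background uncertainties.

    per_bin: dict mapping process name -> list of uncertainties,
             one entry per jet bin.
    linear_groups: list of lists of process names that are treated as
                   correlated with one another (e.g. W+heavy flavor).
    """
    # Same process in different jet bins: add linearly.
    per_process = {name: sum(values) for name, values in per_bin.items()}

    terms = []
    grouped = set()
    # Processes inside a correlated group: add linearly.
    for group in linear_groups:
        terms.append(sum(per_process[name] for name in group))
        grouped.update(group)

    # Every remaining process is uncorrelated and enters as its own term.
    terms.extend(u for name, u in per_process.items() if name not in grouped)

    # Uncorrelated terms: add in quadrature.
    return math.sqrt(sum(t * t for t in terms))


# Placeholder numbers, chosen only so the result is easy to check by hand.
example = {
    "Wbb": [1.0, 1.0],  # [3.5-jet bin, 4-jet bin]
    "Wcc": [0.5, 0.5],
    "QCD": [2.0, 2.0],
}
total = combine_uncertainties(example, linear_groups=[["Wbb", "Wcc"]])
print(total)  # (2.0 + 1.0) and 4.0 combined in quadrature -> 5.0
```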
3) As I've said before, I'd like to see
the QCD background treated more carefully, as it will probably have a
different cos(theta*) shape than processes that make higher-pT
stuff. Admittedly, to use a QCD-only shape as a systematic
variation would be an overestimate. Better would be to try to
combine the different processes in their appropriate proportions and
vary them properly, which might require you to master Method 2 a little.
(K. Bloom)
The QCD shape [shown during the
pre-blessing] looks suspiciously unlike Wbb; the more statistics you can
get the better! (What Jahred did is to not _require_ a btag for event
selection, but to use the _most likely_ heavy flavor jet, using jet
probability, as a btag in the fitter. He could probably give you code
for this, or explain how to do it.) Also, what's the efficiency of the
chi2 cut on the non-isolated lepton data? (E. Brubaker)
A: Below is a plot which compares the
QCD background shape with the Wbbbar+2p shape which we use in our
analysis to model the entire background. The QCD shape comes from
running over detector data events (ntuplized using leptonIdCuts set 4).
The lack of agreement between the two shapes is obvious.
Because of this, we have now added a QCD shape contribution to our
"background shape" systematic uncertainty. This contribution to
the background shape systematic is determined by running a large cast of
pseudo-experiments where the QCD contribution to the background is
modeled by the shape (data points) below, and the remaining background
contributions are modeled by the usual Wbbbar+2p shape (solid green line
in the plot below). Our likelihood fitter for this cast of
pseudo-experiments fits the "data" using the default background and
signal templates.
4) Why do you assume that
Wbb+2p adequately represents all the background processes? Is there some
intuitive argument for this? Apparently there exist templates for other
processes--can we see them, and the corresponding chi2 cut efficiencies?
If you have the several templates, why not use a composite template in
the fit? (E. Brubaker)
A: We
get reasonable agreement between the shapes of the cos(theta*)
distributions for Wbbbar+2p and those of W+jets and Single Top. A
systematic uncertainty has been assigned for making this assumption.
Plots of the Wbbbar+2p shape versus each of these backgrounds can
be found below:
5) Jet energy systematic underestimated
since not taking all shifts into account. Plan to use total systematic
function from Jean-Francois Arguin. (Evelyn)
Do you really apply the jet energy systematics to
data jets, not to Monte Carlo datasets? (E. Brubaker)
Jet energy
systematics: need independent systematic for data and MC jets at levels
1 and 3. This would be taken into account by Jean-Francois' totalJetSystematic
(?) function. (E. Brubaker)
A: We now use Jean-Francois' total systematic function. We run three
sets of pseudo-experiments (+1 sigma, -1 sigma, and no shift) in which
the "data" events come from ttopli (for signal) and the Wbbbar+2p MC
sample (for background), with the JES shifted accordingly. These are
fit using the standard signal and background templates, which are
NEVER shifted. The systematic uncertainty that we measure due to the
JES is 0.06.
6) Also going to redo systematic
uncertainties for ISR/FSR (Evelyn)
As you mention in the to-do
list, use the modern ISR/FSR samples. (E. Brubaker)
A: This systematic uncertainty has been
re-done using Un-Ki's more/less samples. Our total ISR/FSR
systematic is now: 0.02
7) Also
going to redo systematic uncertainties for PDFs (Evelyn)
Also, need uncertainty for PDF
eigenvectors, right? (E. Brubaker)
A: We will punt on this for now,
as it is currently work in progress. What we are currently using
for the PDF systematic (using ttopei, ttop3e and ttop4e samples) is
certainly a conservative estimate. We will update the PDF
systematic uncertainty in time for publication.
8) Some of your systematics are based on
comparisons to ttopei. Do you in fact get the F0 you expect from
ttopei? (K. Bloom)
A:
No, in fact we do not measure the F_0 (should be 0.703) that we would
expect in either the ttopei (PYTHIA) or ttopli (HERWIG) datasets.
The values of F_0 that we measure in these datasets are shown in the
plot below. This difference is the reason that we have now added a
systematic uncertainty due to Monte Carlo modeling. The value of
this systematic uncertainty is 0.03 and this was taken as the largest
deviation from a value of F_0 = 0.703.
9) What accounts for the differences
between your table 1 (event counts in tag/njet bins) and CDF 6845 table
2 (before chi2 cut)? (E. Brubaker)
A: Thank you for noticing this.
There are two things that we were not doing for the overall event
selection, but which have now been included:
- Veto dileptons where one lepton is a PHOENIX electron
- XFT trigger fiduciality requirement (COT exit radius > 140 cm)
Our table of event counts by tag/njet
bins (before chi2 cut) now matches that of CDF Note 6845. This
table is shown below:
10) Is m_{lb} formed using the
generically corrected jet, or the JF-corrected jet, or the jet (and
lepton) as output from the fitter? (E. Brubaker)
A:
We use the jet and lepton as output from the fitter. Sorry,
we didn't mention this in the first version of the note, but this has
now been clarified in the latest version.
11) Why do you use a binned likelihood
fit rather than unbinned? (E. Brubaker)
A: Due to some features of our
templates, we found the signal and background shapes extremely difficult
to fit using simple/reasonable functions (see the images included
below). If anyone has some nice suggestions on how to do this, we
would welcome them (Evelyn has recently suggested looking at CDF Note
6214 by Carlos Sanchez et al., which apparently uses a neural network
package to fit shapes).
12) Can you show plots of the + and -
errors from the pseudoexperiments? (E. Brubaker)
A: Okay, that plot is shown below. This is from 10k pseudo-experiments
where the number of events is allowed to fluctuate from experiment to
experiment (according to a Poisson distribution with a mean of 31).
Sometimes Minos fails, hence the spike at zero.
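For readers unfamiliar with this kind of pseudo-experiment, the
Poisson-fluctuated event count can be sketched as below. This is a
minimal sketch, not our actual code: `throw_pseudo_experiment` is a
hypothetical helper and `toy_template` is an invented shape standing in
for a real cos(theta*) template.

```python
import numpy as np


def throw_pseudo_experiment(template, mean_events, rng):
    """Return binned counts for one pseudo-experiment drawn from `template`."""
    # The number of events fluctuates from experiment to experiment.
    n_events = rng.poisson(mean_events)
    # Normalize the template so it can serve as per-bin probabilities.
    probabilities = template / template.sum()
    # Assign each event to a bin, then histogram the bin indices.
    bin_indices = rng.choice(len(template), size=n_events, p=probabilities)
    return np.bincount(bin_indices, minlength=len(template))


rng = np.random.default_rng(seed=12345)
toy_template = np.array([1.0, 2.0, 3.0, 4.0, 3.0, 2.0])  # hypothetical shape
totals = np.array(
    [throw_pseudo_experiment(toy_template, 31, rng).sum() for _ in range(10_000)]
)
print(totals.mean())  # close to the Poisson mean of 31
```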
13) What about checking the results of
pseudoexperiments at different values of F_0? That would help show the
measurement is unbiased, and that the pulls are reasonable, throughout
the range of possible measurements. (E. Brubaker)
A: This plot is shown below. The x-axis
is the true F_0 (between 0.0 and 1.0, varied in steps of 0.01), and the
y-axis shows the mean measured value of F_0 from 10k pseudo-experiments.
The black line, which has a slope of 1.02, is the best fit to the
data points. The red line is drawn for reference and shows a line
of slope 1.0 with a y-intercept of 0.0.
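A straight-line fit of this kind can be sketched with NumPy as below.
The input here is synthetic: the slope of 1.02 is taken from the answer
above, but the intercept is a made-up value used only to exercise the
code, and `fit_linearity` is a hypothetical helper.

```python
import numpy as np


def fit_linearity(true_f0, mean_measured_f0):
    """Fit mean measured F_0 versus true F_0 with a straight line."""
    slope, intercept = np.polyfit(true_f0, mean_measured_f0, deg=1)
    return slope, intercept


# True F_0 between 0.0 and 1.0 in steps of 0.01.
true_f0 = np.linspace(0.0, 1.0, 101)
# Synthetic stand-in for the mean measured F_0 (intercept is hypothetical).
synthetic_measured = 1.02 * true_f0 - 0.01

slope, intercept = fit_linearity(true_f0, synthetic_measured)
print(slope, intercept)
```

An unbiased measurement would give a slope near 1.0 and an intercept
near 0.0, which is what the red reference line represents.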
14) Top mass and jet energy scale
correlated? (Paul Tipton)
A: (K. McFarland answers) At this point the CDF measurement of the top
mass does not dominate the world average, so the assumption that they
are not correlated is reasonable.
15) Use mtop = 175 GeV to constrain in mass fitter, used to choose
correct lepton-b assignment. Asked to use mtop = 178 GeV? (Jaco)
A: This would require generating new signal MC where the top mass is
178.0 GeV/c^2, and running the Top Mass Fitter with the top mass
constrained to 178.0 GeV/c^2. This is on the to-do list and we are
considering doing this in time for publication.
16) For display purposes, choose a wider binning (data points)
(K. Bloom)
A: Okay, we'll try. I'm not certain that I know how to do this
in ROOT (overlay two histograms with different binning).
17) Why use 3.5 jets?
(Someone)
A: We used exactly the same selection as the top mass group. There
are 10 (out of 37 events total) 3.5-jet events in our dataset which
pass our selection criteria, but would have been excluded if we were
strict about using only 4-jet events.
18) Prediction of
negative events in one bin? (Pekka)
A: Has to be positive by construction for F0 between 0 and 1. There
are no bins from the fit (below) which contain a prediction of
negative events.
19) Different chi2 cut than mass
analysis? (Andy Beretvas)
A: Yes, since we run the mass fitter in constrained mode (versus
free-fit mode) and use a chi2 < 20 cut that is optimal for this
analysis.
20) Likelihood takes off at 1.5 - is the
steep rise coming from a bin with data events having a prediction of
zero events? Check this. If yes, then need to think about meaning
of uncertainties with +-0.5 likelihood change. (K. McFarland)
A: In the latest fit (post
pre-blessing, after we dropped two events which should have failed the
general selection criteria) we still see a steep rise, this time around
F_0 = 1.7 (see plot of F_0 versus NLL below). There is no bin in
our cos(theta*) distribution which has a non-zero number of data events
and a prediction of zero events from our likelihood fit.
21) Quoting uncertainty on top
mass as a systematic? Take it out, not quote as a systematic and
instead show dependence of result on top mass. Set a precedent for
future results and papers. (K. McFarland)
Should we be using the new world average for top mass? 178.0 +/- 4.3.
I'm sorry I missed the discussion at the last meeting (referenced in
the minutes):
"Discussion on the treatment of the top mass value. Suggestion is to
measure helicity at given top mass and re-do for other top mass
values. Will also solve correlation of Jet E scale and top mass
systematics."
I don't think I understand the suggestion. First of all, I don't
understand how jet energy and top mass systematics are correlated,
unless it's since the range of masses you use in the top mass
systematic is loosely derived from a world average that includes a CDF
Run I measurement which has jet energy systematics that are partly
correlated with the current jet energy systematics.
And what does redo for other top mass values mean? Generate new
helicity samples for templates, change mt in the fitter constraint,
change mt in the cos theta* calculation, all of the above? Even redo
all the systematics? That would give you a measurement of F_0 as a
function of mt, which sounds great to me, but it's a lot of work.
What you do now seems perfectly reasonable to me: Report a central
value that assumes a given top mass throughout the measurement (175),
then include a systematic that accounts for possible differences in
that parameter. This is much simpler, since you just have to throw
pseudoexperiments from a different, already existing, MC sample.
Here, my main issue is that now mt = 178.0 +/- 4.3, not 175 +/- 5. One
possibility is to investigate the F_0 from pseudoexperiments for
several mass points between, say, 165 and 190, discover a linear
relationship, and extract your central value and partial systematic
from that relationship using mt = 178.0 +/- 4.3. (E. Brubaker)
A: We're considering doing this in time for the publication to set a
paradigm for future W helicity measurements/publications. This will
probably involve a large amount of additional MC production, and we
just aren't able to do this on short notice.
22) The constraints in the fitter are
claimed to be m_t < 175 GeV and Gamma_t = 2.5 GeV. First, is
that really supposed to be "<" on m_t? Why? (K. Bloom)
A: No, this was a typo in
the very first draft of our note, for the constrained fit m_t = 175
GeV/c^2. Sorry about that, this typo has since been fixed!
23) As
for Gamma_t, is the right thing to do to set the constraint to the
expected intrinsic width of the top (that must be the 2.5 GeV), or
should it be to something more like the precision with which we measure
the mass, which is larger? (K. Bloom)
A: We wanted to be consistent with what was
being used for the top mass (template) analysis. Footnote 12 of
CDF Note 6845 states, "We don't know where this number came from.
The commonly referenced theoretical top width is [appx] 1.5 GeV.
We should be consistent [with] what the generators use, if
anything."
24) Your treatment of the acceptance bias
seems unnecessarily complicated. You have F_0^obs = F_0^obs(F_0,a_{L0}).
So why not let the likelihood simply measure F_0^obs, which is really
the measured quantity, then just solve analytically for F_0, including
propagating the error on a_{L0}. This simplifies the likelihood, and
eliminates annoying pseudoexperiments and "uncertainty on the
uncertainty" for the acceptance bias systematic. (E. Brubaker)
A: Well, it might seem complicated but that's how we did it. We
wanted to be consistent with the lepton pT helicity measurement since
we plan to combine our two results (cos(theta*) and lepton pT) in the
near future.
25) Future possibility: Choose the
leptonic b by considering more than the lowest chi2 combination. For
example, take the jet that is most often identified as the leptonic b by
the five combinations with lowest chi2, or by all the combinations such
that chi2min < chi2 < chi2min + 10. (E. Brubaker)
A: Yes, that is a nice
suggestion.
26) Do you use the JF corrections? Which
version? (E. Brubaker)
A: Yes, we use the JF
corrections with the top mass fitter (we do "topSpecificCorrs set JF" in
our talk-to to the TopMassFitModule for both data and MC). I have
recently been told that the mass template analysis used "JFL5", which
has been shown to improve the top mass sensitivity over the "JF"
topSpecificCorrs. There is nothing wrong with using "JF", but we
are considering using "JFL5" for the publication results.
27) What chi2 cut is used in the free fit
when you're scanning the delta(m_{t}) cut? Did you think about scanning
both the chi2 cut and delta(m_t) cut for the free fit, for completeness?
(E. Brubaker)
A: We used a chi2 cut of
10.0 with the free-fit and the delta(m_t) cut to produce the numbers in
the acceptance plot of our note. We did scan both delta(m_t) and
chi2 on the free-fit by themselves, and in the end looked for the "best
of the best" (chi2 < 10.0 and scan delta(m_t)).
28) See the comment about acceptance bias
-> don't need true F_0 as a parameter in the fit. Just fit for
F_0^obs, then solve for F_0. (E. Brubaker)
A: Again, we wanted to be consistent with the
lepton pT helicity measurement since we plan to combine them in the near
future.
29) I think this is a minor problem: your
constraint for the background is on the expected number of background
events, not on the background fraction. Also, I think you should
_not_ fix the N_e parameter to the number of events observed so that the
data and "expected" histograms have the same integral. The likelihood
will take into account possible Poisson fluctuations in the
expectations. I would use the following expression for the prediction:
    mu_i = N_B*T_B,i + N_0*T_0,i + N_L*T_L,i
         = N_B*T_B,i + N_S*F_0^obs*T_0,i + N_S*(1-F_0^obs)*T_L,i
then I would
change the background constraint back to use N_BG, not f_BG. Then fit
for N_B, N_S, F_0^obs. This shouldn't change the result too much, but I
think it's more correct. (E. Brubaker)
A: We don't think that this
is a problem, and it seems to be the usual way that other W helicity
analyses have handled constraining their background.
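For reference, the prediction proposed in the question (fit for N_B, N_S
and F_0^obs, with a constraint on the number of background events) could
be coded roughly as follows. This is a sketch of the suggestion, not of
the fit actually used in the analysis. The Gaussian form of the
background constraint, the function names, and the toy templates are all
assumptions made for illustration.

```python
import numpy as np


def predicted_counts(n_b, n_s, f0_obs, t_b, t_0, t_l):
    """mu_i = N_B*T_B,i + N_S*F_0^obs*T_0,i + N_S*(1-F_0^obs)*T_L,i"""
    return n_b * t_b + n_s * f0_obs * t_0 + n_s * (1.0 - f0_obs) * t_l


def negative_log_likelihood(data, n_b, n_s, f0_obs, t_b, t_0, t_l,
                            n_bg_expected, sigma_bg):
    """Binned Poisson NLL plus an assumed Gaussian constraint on N_B."""
    mu = predicted_counts(n_b, n_s, f0_obs, t_b, t_0, t_l)
    # Poisson term; the data-only log(n!) constant is dropped.
    poisson_term = np.sum(mu - data * np.log(mu))
    # Constraint on the number of background events (assumed Gaussian).
    constraint = 0.5 * ((n_b - n_bg_expected) / sigma_bg) ** 2
    return poisson_term + constraint


# Toy unit-area templates (hypothetical shapes, four bins each).
t_b = np.array([0.40, 0.30, 0.20, 0.10])
t_0 = np.array([0.10, 0.30, 0.40, 0.20])
t_l = np.array([0.40, 0.40, 0.15, 0.05])

# "Data" set equal to the prediction at known parameter values.
asimov = predicted_counts(10.0, 21.0, 0.7, t_b, t_0, t_l)
nll_true = negative_log_likelihood(asimov, 10.0, 21.0, 0.7,
                                   t_b, t_0, t_l, 10.0, 2.0)
nll_off = negative_log_likelihood(asimov, 10.0, 21.0, 0.5,
                                  t_b, t_0, t_l, 10.0, 2.0)
print(nll_true < nll_off)  # the NLL is smallest at the generating values
```

Because the templates have unit area, the integral of the prediction is
N_B + N_S, which is free to differ from the observed number of events.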
30) Out of curiosity, have you ever tried
running pseudoexperiments where 100% of the events are taken from the
background template? (E. Brubaker)
A: No, we have not tried this.
31) [Regarding systematics estimates from
pseudo-experiments] We (template mass) define shifts using the median of
the collection of p.e. results, on the grounds that it is less sensitive
to fluctuations and tails than the mean of a Gaussian fit. What do you
think? (E. Brubaker)
A: Our shapes (F_0 measured from a large cast
of pseudo-experiments) tend to fit well to a Gaussian, so for us using
the mean instead of the median is okay.
32) [Regarding systematics estimates from
pseudo-experiments] In general, it would be good to see all the
results in the note, meaning all the shifts when you say you choose the
largest. (E. Brubaker)
A: Okay, we've added tables
(and some additional plots) for all of the systematics estimates to the
latest version of the note.
33) [Regarding systematics estimates from
pseudo-experiments] What are the uncertainties on these
uncertainties? (E. Brubaker)
A: From 10k
pseudo-experiments the uncertainty on the systematic uncertainties is
about +/- 0.003.
34) MC Statistics: this
is nice work. However, I would argue that you've shown the uncertainty
due to MC statistics is negligible and should be zero. That's
because the width of 0.01 is just the uncertainty on the mean of a 1k
normal distribution: RMS/sqrt(N) = 0.3/sqrt(1000) = 0.0095.
(E. Brubaker)
A: Thanks for pointing this
out... but the systematic is so small compared to the others, we'll keep
this for now.
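The arithmetic in the comment above is easy to check. The snippet below
evaluates RMS/sqrt(N) for the quoted RMS of 0.3 and N = 1000; the
function name is a hypothetical helper.

```python
import math


def standard_error_of_mean(rms, n_trials):
    """Uncertainty on the mean of n_trials values with the given RMS."""
    return rms / math.sqrt(n_trials)


error_1k = standard_error_of_mean(0.3, 1000)
print(round(error_1k, 4))  # 0.0095
```

Assuming the same RMS of 0.3, the same formula with N = 10,000 gives
0.003.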
35) See the acceptance bias
comment. Can eliminate the use of pseudoexperiments for this uncertainty
by just solving for F_0 directly and propagating the systematic.
(E. Brubaker)
A: We wanted to be consistent with what was
done in the lepton pT analysis.
36) Will you use
Feldman-Cousins? (C. Plager)
A: Yes. The plot is shown below. F-C tells us our measured value is
F_0 = 0.89 +0.11 -0.38. The confidence belts allow us to set the limit
F_0 > 0.25 at 95% CL.