A: We have compared the distribution of the correlation coefficients in data with that in a mix of 10% ttbar MC events and 90% W+3p, looking in the Nj=3 exclusive mode. The agreement between data and MC is good (CDF Note 6897).
A: We don't think this is any different from fitting one variable. When a given effect contributes to both the acceptance and the shape systematic, the two contributions are 100% correlated and we add them linearly. On the other hand, systematics originating from different sources are considered uncorrelated. Please note that we calculated the systematic errors piece by piece, including all the components of the jet correction systematics. With two exceptions, we find a general downward trend in going from the single-variable Ht fit to the multiple-input NN fit. The two exceptions are the "out of cone" and "splash out" components of the jet energy corrections. These two contribute ~2% each, which is a small fraction of the overall systematic (~21%).
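The combination rule described above can be sketched as follows. This is only an illustration, not the analysis code: it assumes that "uncorrelated" sources are combined in quadrature (the usual treatment, not stated explicitly in this answer), that the acceptance and shape contributions shift the result in the same direction, and the source names and numbers are hypothetical.

```python
import math

def combine_source(acceptance, shape):
    """One source affecting both acceptance and shape: the two
    contributions are treated as 100% correlated and added linearly."""
    return acceptance + shape

def combine_total(per_source):
    """Different sources are treated as uncorrelated.
    Assumption: uncorrelated contributions are added in quadrature."""
    return math.sqrt(sum(s * s for s in per_source))

# Hypothetical fractional uncertainties (acceptance, shape) per source.
# These values are placeholders and are not taken from the note.
sources = {
    "source_a": (0.03, 0.04),
    "source_b": (0.00, 0.02),
}

per_source = [combine_source(acc, shp) for acc, shp in sources.values()]
total = combine_total(per_source)
print(f"total fractional systematic: {total:.3f}")
```

With this rule, a source that enters through both acceptance and shape weighs more than it would if the two pieces were added in quadrature, so the linear sum is the more conservative choice.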
A: A multivariate likelihood fit is a straightforward approach but needs more statistics for good performance. We also tried a linearized version of the NN by using a single hidden node. Average fractional error: 16.8% (7 hidden nodes) versus 18.7% (1 hidden node). Systematic error: 19.1% (7 hidden nodes) versus 28.7% (1 hidden node). We interpret this as an indication that a NN approach is more powerful than a linear discriminant analysis.
A: We have already assigned a systematic for the QCD-fakes background. We were using the
theoretical cross sections in adding the contribution of Wtau3p, dibosons, Z's, single-top to the
overall W-like shape. We will add a systematic for doing this. The expected contribution to the systematic
fraction is 2% and was calculated by doing pseudo-experiments and fitting with/without the smaller
backgrounds and taking half of the average difference. For a plot comparing the W3p NN-output shape to
the NN shape of all other ewk backgrounds mixed appropriately, look
here.
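The "half of the average difference" estimate can be sketched as below. This is a minimal illustration under stated assumptions: the function name and the input lists are hypothetical, each pair of inputs is taken to be the fitted value from the same pseudo-experiment with and without the smaller backgrounds, and the difference is taken as an absolute value (the answer above does not say whether signed or absolute differences were averaged).

```python
from statistics import mean

def half_average_difference(fits_with, fits_without):
    """Half of the mean |difference| between paired fit results.

    fits_with[i] and fits_without[i] are the fitted values from the same
    pseudo-experiment, fitted with and without the smaller backgrounds.
    Assumption: absolute differences are averaged.
    """
    if not fits_with or len(fits_with) != len(fits_without):
        raise ValueError("need two non-empty lists of equal length")
    diffs = [abs(w - wo) for w, wo in zip(fits_with, fits_without)]
    return 0.5 * mean(diffs)

# Hypothetical fitted signal fractions from three pseudo-experiments.
with_small_bkg = [0.10, 0.12, 0.11]
without_small_bkg = [0.14, 0.12, 0.13]
print(half_average_difference(with_small_bkg, without_small_bkg))
```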
A: The cut was placed at 0.5 because, for calculating the numbers in this table, we used a balanced set of events, i.e. equal numbers of signal and background events.
A: We did initially try to construct the NN by looking at the correlations between variables. After an iterative study of a large number of input-variable combinations, as detailed in the note, we found many different combinations providing comparably good performance. As a result we abandoned this track and looked instead at the expected systematic error for each net. This procedure is time consuming, so we studied 42 different NNs with the number of input variables ranging from 1 to 20. The selection of each NN was based on these previous studies but also on guesswork. We see the NN fit improving with respect to the statistical/systematic error as we add more information. Certain combinations of variables perform better, especially with respect to the systematics. We do not see large variations in the fit fractional error for a given number of input variables. Based on Figure 5 in the note alone, we chose a 7-input NN as being reasonably close to the overall best performance we were able to obtain while still not very "complicated". There is certainly room for optimization here, but probably not much, considering the performance numbers we found for the larger NNs.