Search for Bs→μ+μ- and Bd→μ+μ- Decays at CDF II

Primary Authors: Douglas Glenzinski, Matthew Herndon, Walter Hopkins, Teruki Kamon, DaeJung Kong, Vyacheslav Krutelyov, Cheng-Ju Lin, David Sperka, Julia Thom, Satoru Uozumi

Introduction and Motivation

Processes involving Flavor Changing Neutral Currents (FCNC) provide excellent opportunities to search for evidence of new physics, since in the standard model (SM) they are forbidden at tree level and can only occur through higher-order loop diagrams. Two such processes are the decays Bs (Bd) → μ+μ-. The SM predictions for these branching fractions are BR(Bs → μ+μ-) = (3.2 ± 0.2) × 10^-9 and BR(Bd → μ+μ-) = (1.0 ± 0.1) × 10^-10 (1).

These predictions are one order of magnitude smaller than the current experimental sensitivity. Previous bounds from the CDF collaboration, based on 7 fb^-1 of integrated luminosity, are 2.8 × 10^-9 < BR(Bs → μ+μ-) < 4.4 × 10^-8 and BR(Bd → μ+μ-) < 6.0 × 10^-9 at 95% C.L. A description of the previous CDF analysis is given here.

Enhancements to Bs → μ+μ- occur in a variety of different new-physics models. For example, in supersymmetry (SUSY) models, supersymmetric particles can increase BR(Bs → μ+μ-) by several orders of magnitude at large tanβ, the ratio of vacuum expectation values of the Higgs doublets (2). In the minimal supersymmetric standard model (MSSM), the enhancement is proportional to tan^6 β. For large tanβ, this search is one of the most sensitive probes of new physics available at the Tevatron experiments.

Analysis Overview

This measurement uses 10 fb^-1 of integrated luminosity collected by the CDF detector and supersedes our previously published result (preprint and public page), which used 7 fb^-1 of data.

The analysis methods are exactly the same as in the previous iteration but with the full CDF II data set.

The events are collected using a set of dimuon triggers and must satisfy either of two sets of requirements corresponding to different topologies: CC events have both muon candidates detected in the central region (often labeled "CMU" by CDF), while CF events have one central muon and another muon detected in the forward region (aka CMX).

We use the same neural network (NN) as in the previous iteration of the analysis, without any retraining. The NN uses 14 input variables and is described in more detail here. The signal discrimination power of the 14 variables and some additional kinematic variables (pT(B) and pT of the higher-pT muon) is shown here, here, and here. We check the MC modeling of our 14 input variables using a sample of B → J/Ψ K → μ+μ- K events collected on the same triggers and satisfying the same set of baseline requirements. The kaon is required to have pT > 1 GeV/c. The comparisons are shown here, here, and here. For these plots, the vertex variables use only the two muons from the J/Ψ, in order to better mimic the resolutions of the Bs → μ+μ- decay, while the pT(B) and isolation variables use the three-track information.

The baseline selection requires high-quality muon candidates with transverse momentum relative to the beam direction of pT > 2.0 (2.2) GeV/c in the central (forward) region. The muon pairs are required to have an invariant mass in the range 4.669 < Mμμ < 5.969 GeV/c^2 and are constrained to originate from a common, well-measured three-dimensional (3D) vertex. A likelihood method together with a dE/dx-based selection is used to further suppress contributions from hadrons misidentified as muons. Only a fraction of the total number of background and simulated signal events is used to train the NN. The remainder is used to test for NN overtraining and to determine the signal and background efficiencies.
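
As a minimal sketch, the kinematic part of the baseline selection can be written as a simple predicate using the thresholds quoted above. The `Muon` record and the function names `channel` and `passes_baseline` are hypothetical and are not taken from the CDF software. The vertex-quality, likelihood, and dE/dx requirements are omitted because their values are not given here.

```python
from collections import namedtuple

# Thresholds quoted in the text (GeV/c for pT, GeV/c^2 for mass).
PT_MIN = {"central": 2.0, "forward": 2.2}
MASS_MIN, MASS_MAX = 4.669, 5.969

# Hypothetical muon candidate: transverse momentum and detector region,
# where region is "central" (CMU) or "forward" (CMX).
Muon = namedtuple("Muon", ["pt", "region"])


def channel(mu1, mu2):
    """Return "CC", "CF", or None for topologies not used in the analysis."""
    regions = sorted([mu1.region, mu2.region])
    if regions == ["central", "central"]:
        return "CC"
    if regions == ["central", "forward"]:
        return "CF"
    return None


def passes_baseline(mu1, mu2, mass):
    """Kinematic baseline only: topology, muon pT, and dimuon mass window."""
    if channel(mu1, mu2) is None:
        return False
    if any(mu.pt <= PT_MIN[mu.region] for mu in (mu1, mu2)):
        return False
    return MASS_MIN < mass < MASS_MAX
```

For example, `passes_baseline(Muon(3.0, "central"), Muon(2.1, "forward"), 5.37)` fails because the forward muon is below the 2.2 GeV/c threshold.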

Several tests were done in the previous iteration of the analysis to ensure that νNN (the neural network output) is independent of Mμμ. We train the NN separately with the inner and outer parts of our sideband and then compare the outputs of the two trained NNs. The resulting plots for CC and CF show no signs of mass bias. We also check the NN output as a function of dimuon mass and find no correlation between mass and NN output. All selection criteria were finalized before revealing the content of the signal regions. The optimization used the expected upper limit on the branching fraction as a figure of merit.

To exploit the difference in the Mμμ distributions between signal and background and the improved suppression of combinatorial background at large νNN, the data are divided into sub-samples in the (νNN, Mμμ) plane. The CC and CF samples are each divided into 40 sub-samples. There are eight bins in νNN with bin boundaries 0.70, 0.76, 0.85, 0.90, 0.94, 0.97, 0.987, 0.995, and 1. Within each νNN bin we employ five Mμμ bins, each 24 MeV/c^2 wide, centered on the world average Bs (Bd) mass.
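
The division into sub-samples can be illustrated with a short sketch that maps a candidate to its (νNN, Mμμ) bin. The function name `sub_sample` is hypothetical. The treatment of values exactly on a bin edge (lower edges inclusive) is an assumption, and the central mass is passed in by the caller because its value is not quoted in this text.

```python
from bisect import bisect_right

# Bin boundaries quoted in the text: eight nu_NN bins between 0.70 and 1.
NN_EDGES = [0.70, 0.76, 0.85, 0.90, 0.94, 0.97, 0.987, 0.995, 1.0]
N_MASS_BINS = 5
MASS_BIN_WIDTH = 0.024  # GeV/c^2


def sub_sample(nu_nn, mass, mass_center):
    """Return (nn_bin, mass_bin), or None if the candidate is outside the grid.

    mass_center is the B_s or B_d mass in GeV/c^2, supplied by the caller.
    """
    if not NN_EDGES[0] <= nu_nn <= NN_EDGES[-1]:
        return None
    # Assumption: lower edges are inclusive; nu_nn == 1 goes to the last bin.
    nn_bin = min(bisect_right(NN_EDGES, nu_nn) - 1, len(NN_EDGES) - 2)
    low = mass_center - 0.5 * N_MASS_BINS * MASS_BIN_WIDTH
    offset = (mass - low) / MASS_BIN_WIDTH
    if not 0 <= offset < N_MASS_BINS:
        return None
    return nn_bin, int(offset)
```

With eight νNN bins and five mass bins this gives the 40 sub-samples per channel described above.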

We use candidate B→ J/Ψ K events collected on the same triggers as a relative normalization to estimate the BR(Bs → μ+μ-) as:

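\[
\mathrm{BR}(B_s \to \mu^+\mu^-) = \frac{N_{B_s}}{\alpha_{B_s}\,\varepsilon^{\mathrm{reco}}_{B_s}} \cdot \frac{\alpha_{B}\,\varepsilon^{\mathrm{reco}}_{B}}{N_{B}} \cdot \frac{\varepsilon^{\mathrm{trig}}_{B}}{\varepsilon^{\mathrm{trig}}_{B_s}} \cdot \frac{1}{\varepsilon^{\mathrm{NN}}_{B_s}} \cdot \frac{f_u}{f_s} \cdot \mathrm{BR}(B \to J/\psi\, K) \cdot \mathrm{BR}(J/\psi \to \mu^+\mu^-),
\]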

where N_Bs is the number of candidate Bs → μ+μ- events, α_Bs is the geometric and kinematic acceptance of the dimuon trigger for Bs → μ+μ- decays, ε^reco_Bs is the reconstruction efficiency for Bs → μ+μ- events in the acceptance, and ε^trig_Bs is the trigger efficiency for Bs → μ+μ-, with N_B, α_B, ε^trig_B, and ε^reco_B similarly defined for B → J/Ψ K+ decays. The factor ε^NN_Bs is the efficiency of the NN selection for Bs → μ+μ- events. The ratio fu/fs accounts for the different b-quark fragmentation probabilities and is (0.402 ± 0.013)/(0.112 ± 0.013) = 3.589 ± 0.374 (3), where the (anti-)correlation between the uncertainties has been accounted for. The final two terms are the relevant branching ratios, BR(B → J/Ψ K) ⋅ BR(J/Ψ → μ+μ-) = (1.01 ± 0.03) × 10^-3 ⋅ (5.93 ± 0.06) × 10^-2 = (6.01 ± 0.21) × 10^-5 (3). A summary of the values for all the parts of the equation is given in this table.

The analysis described is also sensitive to Bd → μ+μ- decays. BR(Bd → μ+μ-) is estimated from the same equation, substituting Bd for Bs and changing fu/fs to fu/fd = 1. All other aspects are the same as in the Bs → μ+μ- search.
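
The normalization equation above can be sketched as a small function. The function name and argument names are hypothetical. Only the central values of fu/fs and of the product of normalization branching ratios come from the text; the yields, acceptances, and efficiencies live in the linked table and are not reproduced here, so any numbers passed in for them would be placeholders.

```python
# Central values quoted in the text.
F_U_OVER_F_S = 0.402 / 0.112   # fragmentation ratio used for the B_s search
F_U_OVER_F_D = 1.0             # ratio used for the B_d search
BR_NORM = 6.01e-5              # BR(B+ -> J/psi K+) * BR(J/psi -> mu+ mu-)


def branching_fraction(n_sig, n_norm, alpha_ratio, eff_reco_ratio,
                       eff_trig_ratio, eff_nn, f_ratio=F_U_OVER_F_S,
                       br_norm=BR_NORM):
    """Signal branching fraction from the relative normalization.

    alpha_ratio, eff_reco_ratio, and eff_trig_ratio are the ratios of the
    normalization-mode (B+) quantity to the signal-mode quantity for the
    acceptance, reconstruction efficiency, and trigger efficiency.
    eff_nn is the NN selection efficiency for signal.
    """
    return (n_sig / n_norm) * alpha_ratio * eff_reco_ratio * eff_trig_ratio \
        / eff_nn * f_ratio * br_norm
```

Passing `f_ratio=F_U_OVER_F_D` gives the corresponding expression for the Bd search.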

Backgrounds and Cross Checks

The backgrounds in this analysis are categorized as combinatoric and B→ h+h'- peaking background.

The combinatoric background is estimated by fitting a first-order polynomial (pol1) with a fixed slope to the mass sidebands with dimuon mass greater than 5 GeV/c^2. The lower sideband extends from 4.669 GeV/c^2 to 5.169 GeV/c^2, while the upper sideband extends from 5.469 GeV/c^2 to 5.969 GeV/c^2. The slope is obtained from the dimuon mass shape for all NN bins combined (νNN > 0.7). We then allow the normalization of the pol1 to float and fit each NN bin separately. The individual fits to the separate NN bins can be seen in these figures: Lower NN bins, CC; Higher NN bins, CC; Lower NN bins, CF; Higher NN bins, CF. For the three highest NN bins we also assign a shape systematic to account for our limited knowledge of the background shape.
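
A minimal sketch of the fixed-slope extrapolation follows. It assumes an extended maximum-likelihood fit in which only the normalization floats, so the fitted normalization reproduces the observed sideband count; the fit actually used in the analysis may differ in detail. The function names and the reference mass `m0` are hypothetical.

```python
def linear_integral(slope, m0, lo, hi):
    """Integral of the shape f(m) = 1 + slope * (m - m0) from lo to hi."""
    return (hi - lo) + 0.5 * slope * ((hi - m0) ** 2 - (lo - m0) ** 2)


def expected_background(n_sideband, slope, m0, sidebands, window):
    """Extrapolate a sideband count into a mass window for a fixed-slope line.

    Assumption: with the slope fixed, the fitted normalization is such that
    the integral of the line over the sidebands equals n_sideband.
    """
    sb_area = sum(linear_integral(slope, m0, lo, hi) for lo, hi in sidebands)
    return n_sideband * linear_integral(slope, m0, *window) / sb_area


# Sideband ranges used in the fit (GeV/c^2). The lower sideband is restricted
# to masses above 5 GeV/c^2, as stated in the text.
FIT_SIDEBANDS = [(5.0, 5.169), (5.469, 5.969)]
```

For a flat shape (`slope=0`) the prediction reduces to the sideband count scaled by the ratio of the window width to the total sideband width.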

We fit a pol1 with a freely floating slope to the sidebands with dimuon mass > 5 GeV/c^2, and an exponential to the entire sideband region, and compare the results with our standard fixed-slope pol1. We take the largest difference as our systematic uncertainty. The resulting relative uncertainties are shown in this table.
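
The "largest difference" prescription can be sketched as follows, assuming the three background estimates for a given NN bin are already available. The function name is hypothetical, and expressing the result relative to the nominal estimate is an assumption based on the statement that relative uncertainties are tabulated.

```python
def shape_systematic(nominal, alternatives):
    """Largest relative difference between alternative and nominal estimates.

    nominal: estimate from the fixed-slope line.
    alternatives: estimates from the floating-slope line and the exponential.
    """
    if nominal <= 0:
        raise ValueError("nominal background estimate must be positive")
    return max(abs(alt - nominal) / nominal for alt in alternatives)
```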

The final expected numbers of combinatoric background events in all the NN bins, for both the Bs and Bd signal regions, are given in these tables: Bs, Bd.

Our other main source of background is the peaking B → hh background, in which both hadrons pass our muon ID. This background is more significant in the Bd mass window (an order of magnitude larger than in the Bs window). The peaking B → hh background is estimated using Monte Carlo (MC) simulation and D*-tagged D0 → Kπ events. We measure the probability that a pion or kaon will satisfy our muon identification criteria using a data sample of D*-tagged D0 → Kπ events. We use an MC sample of B → hh events to estimate the acceptance, the pT(hadron) distribution, and the shape of the invariant mass distribution (assuming the muon mass for both legs). All the reconstruction efficiencies are taken from the data in the manner described above. The estimated B → hh background in each NN bin is shown here: Bs, Bd.
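
One way to combine the measured misidentification probabilities with the simulated pT(hadron) distribution is to weight each simulated event by the product of the two hadrons' fake rates. The sketch below assumes pT-binned fake-rate tables and a single overall normalization; the function names, table layout, and all numerical inputs are hypothetical and are not the values used in the analysis.

```python
def fake_rate(pt, table):
    """Look up a pT-binned misidentification probability.

    table is a list of (pt_low, pt_high, probability) rows; pT values
    outside every row get probability zero.
    """
    for lo, hi, prob in table:
        if lo <= pt < hi:
            return prob
    return 0.0


def double_fake_yield(mc_events, table_h1, table_h2, norm):
    """Expected number of B -> h h' decays in which both hadrons fake muons.

    mc_events holds (pt_h1, pt_h2) pairs from simulation; norm is the
    expected number of accepted B -> h h' decays before muon ID.
    """
    total = sum(fake_rate(p1, table_h1) * fake_rate(p2, table_h2)
                for p1, p2 in mc_events)
    return norm * total / len(mc_events)
```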

We use 4 control regions to test our background estimates:

  • OS-: Opposite sign muons with negative lifetime
  • SS+: Same sign muons with positive lifetime and looser preselection cuts
  • SS-: Same as SS+ except with negative lifetime
  • FM+: We require one 'muon' to fail our muon ID requirements, positive lifetime

The first three samples are dominated by combinatoric backgrounds with negligible B → hh contributions. Due to the looser muon ID requirements, the FM+ sample has a significant B → hh contribution. For each control sample we compare the number of predicted background events to the number observed in each NN bin for the CC and CF channels separately. For these cross checks we use a "signal" region defined as 5.169 < Mμμ < 5.469 GeV/c^2. These comparisons give us confidence in our background estimates.


The expected limit on the branching ratio under the background-only hypothesis is BR(Bs → μ+μ-) < 1.3 (1.0) × 10^-8 at 95% (90%) C.L. for the Bs search. The expected limit for the Bd search under the background-only hypothesis is BR(Bd → μ+μ-) < 4.2 (3.4) × 10^-9 at 95% (90%) C.L.

The number of observed events is compared to the number expected in all 80 sub-samples for the Bd search region in this table and is summarized in this plot. The data are consistent with the background expectations and yield an observed limit of BR(Bd → μ+μ-) < 4.6 × 10^-9 (3.8 × 10^-9) at 95% (90%) C.L. An ensemble of background-only pseudo-experiments is employed to estimate the significance as a p-value. The effects of systematic uncertainties are included in the pseudo-experiments by allowing them to float within Gaussian constraints. The p-value is obtained by comparing the log-likelihood ratio, -2ln(Q), observed in the data with the distribution from an ensemble of MC pseudo-experiments. Here Q = L(s+b|data)/L(b|data), where L(h|x) is the product of Poisson probabilities over all NN and mass bins. The systematic uncertainties are included as nuisance parameters, and the negative log-likelihood is minimized with respect to them. The resulting p-value for the background-only hypothesis is 41%.
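
The pseudo-experiment procedure can be sketched in simplified form. For a product of Poisson terms, -2ln(Q) reduces to 2 Σ [s_i - n_i ln(1 + s_i/b_i)] over bins i. The sketch below ignores systematic uncertainties and nuisance parameters entirely, which the real analysis includes, and all function names are hypothetical.

```python
import math
import random


def neg2_ln_q(counts, sig, bkg):
    """-2 ln Q for Q = L(s+b | data) / L(b | data), a product of Poisson terms."""
    return 2.0 * sum(s - n * math.log(1.0 + s / b)
                     for n, s, b in zip(counts, sig, bkg))


def poisson_sample(mean, rng):
    """Draw one Poisson variate (Knuth's method; adequate for small means)."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def background_only_p_value(observed, sig, bkg, n_toys=20000, seed=1):
    """Fraction of background-only toys at least as signal-like as the data.

    Smaller -2 ln Q means more signal-like, so a toy counts when its value
    is less than or equal to the observed one. No systematics are included.
    """
    rng = random.Random(seed)
    q_obs = neg2_ln_q(observed, sig, bkg)
    hits = 0
    for _ in range(n_toys):
        toy = [poisson_sample(b, rng) for b in bkg]
        if neg2_ln_q(toy, sig, bkg) <= q_obs:
            hits += 1
    return hits / n_toys
```

The per-bin signal and background expectations `sig` and `bkg` would come from the analysis tables; the statistical precision of the p-value is limited by `n_toys`.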

The results for the Bs region are shown in this table and are summarized in this plot. There is an excess of events concentrated in bins with νNN > 0.97. The p-value for the background-only hypothesis is 0.94%. The excess in the 0.97 < νNN < 0.987 bin appears to be a statistical fluctuation of the background, since a Bs → μ+μ- signal consistent with the observation in the two highest NN bins would give no significant expectation in this bin. If we consider only the two highest NN bins, the p-value becomes 2.0%. If we include Bs → μ+μ- events in the pseudo-experiments at the SM level (BR = 3.2 × 10^-9), we obtain a p-value of 6.8% (21.6%) using all (only the two highest) NN bins.

We use the log-likelihood fit described above to determine the BR(Bs → μ+μ-) most consistent with the data in the Bs search region. From the resulting Δχ^2 distribution, the central value of BR(Bs → μ+μ-) is taken at the minimum, and the uncertainty is taken as the change in BR corresponding to a one-unit change in Δχ^2, giving BR(Bs → μ+μ-) = (1.3 +0.9/-0.7) × 10^-8. Additionally, we set bounds at 95% (90%) C.L. on the branching fraction of Bs → μ+μ- of 0.8 × 10^-9 < BR(Bs → μ+μ-) < 3.4 × 10^-8 (2.2 × 10^-9 < BR(Bs → μ+μ-) < 3.0 × 10^-8).
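
The extraction of a central value and a one-unit Δχ^2 interval can be sketched with a grid scan of a binned Poisson likelihood. This is a simplified illustration without nuisance parameters; the function names and the parametrization of the signal expectation as `br * s` per bin are assumptions.

```python
import math


def neg2_ln_l(br, counts, sig_per_unit_br, bkg):
    """-2 ln L (up to a constant) for Poisson bins with mean b + br * s."""
    total = 0.0
    for n, s, b in zip(counts, sig_per_unit_br, bkg):
        mean = b + br * s
        total += 2.0 * (mean - n * math.log(mean))
    return total


def scan_interval(counts, sig_per_unit_br, bkg, br_grid):
    """Return (best_br, low, high) from a scan over an ascending br_grid.

    best_br minimizes -2 ln L on the grid; low and high are the smallest and
    largest grid points whose value lies within one unit of the minimum.
    """
    values = [neg2_ln_l(br, counts, sig_per_unit_br, bkg) for br in br_grid]
    best = min(values)
    inside = [br for br, v in zip(br_grid, values) if v - best <= 1.0]
    return br_grid[values.index(best)], inside[0], inside[-1]
```

The resolution of the interval is set by the grid spacing, so a fine grid (or interpolation between grid points) is needed for quoted uncertainties.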

We also derive an upper limit at 95% (90%) C.L. of BR(Bs → μ+μ-) < 3.1 × 10^-8 (2.7 × 10^-8) with the CLs methodology.
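
As a minimal illustration of the CLs idea, the sketch below treats a single counting experiment and uses the event count itself as the test statistic, so that CLs = P(N ≤ n_obs | s+b) / P(N ≤ n_obs | b). The analysis uses the multi-bin likelihood ratio with systematic uncertainties instead; the function names and the grid search are hypothetical simplifications.

```python
import math


def poisson_cdf(n_obs, mean):
    """P(N <= n_obs) for a Poisson distribution with the given mean."""
    return sum(math.exp(-mean) * mean ** k / math.factorial(k)
               for k in range(n_obs + 1))


def cls(n_obs, sig, bkg):
    """CLs = CL_{s+b} / CL_b for a single counting experiment."""
    return poisson_cdf(n_obs, sig + bkg) / poisson_cdf(n_obs, bkg)


def cls_upper_limit(n_obs, bkg, cl=0.95, step=0.001):
    """Smallest signal yield on a grid for which CLs <= 1 - cl."""
    sig = 0.0
    while cls(n_obs, sig, bkg) > 1.0 - cl:
        sig += step
    return sig
```

With zero observed events, CLs equals exp(-s) regardless of the background, which gives the familiar 95% C.L. limit of about three signal events.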

Comparison of New and Old Data

We checked the compatibility between the last 3 fb^-1 of data and the previous 7 fb^-1 by performing the same analysis independently on the two data sets. As an additional stringent check, we used the combinatorial background observed in the first subset of data to predict the average combinatorial background expected in the last subset of data. This prediction is then compared with the observed background, and the compatibility is evaluated.

The final results of the analysis using only the last 3 fb^-1 are shown here. The data in the bin 0.97 < νNN < 0.987 show no evidence of peaking structures and are consistent with background expectations. No new events were observed in the bin νNN > 0.995, where 0.25 events are expected assuming the SM. The expected CLs limits are BR(Bs → μ+μ-) < 3.4 × 10^-8 and BR(Bd → μ+μ-) < 10 × 10^-9 at 95% C.L. The observed limits are BR(Bs → μ+μ-) < 2.9 × 10^-8 and BR(Bd → μ+μ-) < 9.0 × 10^-9 at 95% C.L. The p-values for the background-only and SM+background hypotheses for the Bs are 82% and 48%, respectively.

In addition, the number of sideband events in the new data is predicted for each NN bin using the old data. Because the sample sizes and running conditions of the two subsets are different, the old data sideband yield is scaled down. The scaling factor is extracted by comparing the sideband yields observed in old and new data selected with a loose NN cut of νNN > 0.7. Because the background may depend on instantaneous luminosity, whose profile differs between old and new data, the scaling is done independently in four luminosity bins. The sideband yield is estimated for each NN bin and reweighted according to the new data luminosity distribution for each luminosity bin. All luminosity bins are then combined to form a total expected sideband yield for a specific NN bin. The resulting predicted and observed sideband yields are shown here and agree well with each other. The efficacy of the NN in rejecting combinatorial background is the same in the old and new data. Because the NN was trained using (a fraction of) the old sideband data only, this test on new data confirms that the background predictions are free from any possible bias due to using the same data in the NN optimization and in the background estimation.
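
The luminosity-binned scaling described above can be sketched as follows, reading the procedure as: one scale factor per luminosity bin from the loose-selection sideband yields, applied to the old-data yield of a given NN bin, then summed over luminosity bins. The function name and input layout are hypothetical, and this reading of the procedure is an assumption.

```python
def predict_new_sideband(old_yields, old_loose, new_loose):
    """Predict the new-data sideband yield in one NN bin.

    old_yields[i]: old-data sideband yield in this NN bin, luminosity bin i.
    old_loose[i], new_loose[i]: old- and new-data sideband yields for the
    loose selection (nu_NN > 0.7) in luminosity bin i.
    """
    if not len(old_yields) == len(old_loose) == len(new_loose):
        raise ValueError("all inputs need one entry per luminosity bin")
    # One scale factor per luminosity bin, from the loose selection.
    scales = [new / old for new, old in zip(new_loose, old_loose)]
    # Scale each luminosity bin separately, then combine.
    return sum(y * s for y, s in zip(old_yields, scales))
```

The prediction for each NN bin can then be compared with the sideband yield observed in the new data.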


(1) A. Buras, B. Duling, T. Feldmann, T. Heidsieck, C. Promberger, S. Recksiegel, arXiv:1002.2126; JHEP 09, 106 (2010).

(2) S. R. Choudhury and N. Gaur, Phys. Lett. B 451, 86 (1999); K.S. Babu and C. Kolda, Phys. Rev. Lett. 84, 228 (2000).

(3) W. M. Yao et al. [Particle Data Group], J. Phys. G 33, 1 (2010).

Figures and Tables

The list of Figures and Tables is here.

Date: 2012-10-04 11:44:35 CDT