Machine learning: Trends

Applications of machine learning tools to problems of physical interest are often criticized for producing sensitivity at the expense of transparency. To address this concern, we explore a data planing procedure for identifying combinations of variables, aided by physical intuition, that can discriminate signal from background. Weights are introduced to smooth away the features in a given variable; new networks are then trained on this modified data. Observed decreases in sensitivity diagnose the variable's discriminating power. Planing also allows investigation of the linear versus nonlinear nature of the boundaries between signal and background. We demonstrate the efficacy of this approach using a toy example, followed by an application to an idealized heavy-resonance scenario at the Large Hadron Collider. By unpacking the information utilized by these algorithms, this method puts in context what it means for a machine to learn.

A common argument against using machine learning for physical applications is that such tools function as a black box: send in some data and out comes a number. While this kind of nonparametric estimation can be extremely useful, a physicist often wants to understand what aspect of the input data yields the discriminating power, in order to learn or confirm the underlying physics, or to account for systematics. A physical example studied below is the Lorentz-invariant combination of final-state four-vectors, which exhibits a Breit-Wigner peak in the presence of a new heavy resonance. This exposes the subtlety inherent in extracting what the machine has "learned."
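The weighting step described above can be sketched in a few lines. This is a minimal illustration of the planing idea, not the authors' code; the function name, binning, and toy resonance variable are all assumptions for the example.

```python
import numpy as np

def planing_weights(x, bins=20):
    """Per-event weights that flatten the distribution of a variable x.

    Illustrative sketch: each event is weighted by the inverse of its
    bin's population, so a network retrained on the weighted sample can
    no longer exploit the shape of x.
    """
    counts, edges = np.histogram(x, bins=bins)
    # Map each event to its bin; clip so the rightmost value stays in range.
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    w = np.where(counts[idx] > 0, 1.0 / counts[idx], 0.0)
    return w / w.mean()  # normalize to an average weight of 1

rng = np.random.default_rng(0)
m = rng.normal(91.0, 2.5, size=10_000)  # a resonance-like peaked variable
w = planing_weights(m)
flat, _ = np.histogram(m, bins=20, weights=w)  # now approximately flat
```

After weighting, every populated bin carries the same total weight, so a classifier retrained on the weighted sample gains nothing from the shape of this variable; any drop in its sensitivity then measures how much discriminating power the variable carried.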
The left panel shows red and blue data, designed to be separated by a circular border. The right panel shows the boundary between signal and background regions that the machine (a neural network with one hidden layer composed of 10 nodes) has inferred. Under certain assumptions, a deep neural network can approximate any function of the inputs, and thus produces a fit to the training data. While any good classifier would find a "circular" boundary, simply due to the distribution of the training data, one (without additional architecture) has no mechanism for discovering that it is a circle. In light of this, our goal is to unpack the numerical discriminator into a set of human-friendly variables that best characterize the data.
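The point can be made numerically with a toy version of the circular data. The setup below is hypothetical (radii, sample sizes, and the simple threshold scan are assumptions, not the network study from the text): no single cut on a raw coordinate separates the classes well, while a cut on the human-friendly variable x² + y² separates them perfectly.

```python
import numpy as np

# Toy "circular" data: signal inside a disk, background in a surrounding annulus.
rng = np.random.default_rng(1)
n = 2000
theta = rng.uniform(0.0, 2.0 * np.pi, n)
radius = np.concatenate([rng.uniform(0.0, 0.8, n // 2),   # signal
                         rng.uniform(1.2, 2.0, n // 2)])  # background
x, y = radius * np.cos(theta), radius * np.sin(theta)
labels = np.concatenate([np.ones(n // 2), np.zeros(n // 2)])

def best_cut_accuracy(feature, labels):
    """Accuracy of the best single threshold placed on a 1D feature."""
    order = np.argsort(feature)
    lab = labels[order]
    n = len(lab)
    cum_sig = np.cumsum(lab)                  # signal in the lowest i+1 events
    cum_bkg = np.arange(1, n + 1) - cum_sig   # background in the lowest i+1
    total_bkg = n - lab.sum()
    acc_below = (cum_sig + total_bkg - cum_bkg) / n  # "signal = below cut"
    # The other orientation ("signal = above cut") is the complement.
    return float(max(acc_below.max(), (1.0 - acc_below).max()))

acc_x = best_cut_accuracy(x, labels)          # raw input: poor separation
acc_r2 = best_cut_accuracy(x**2 + y**2, labels)  # engineered variable: perfect
```

A network fit to (x, y) implicitly builds the circular boundary, but only a comparison like this, on variables a physicist proposes, reveals that the radius is the information actually being used.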
Submit manuscripts online to the American Journal of Computer Science and Information Technology.
Regards,
Annie Foster,
Managing Editor