Signal process identification using data from high energy collisions and demonstration of the interpolatable property of parameterised Neural Networks.

pratyushranjan2/Signal-Process-Identification

Signal Process Identification

The paper from UC Irvine on Parameterised Machine Learning for High Energy Physics describes how neural networks in high energy physics are typically used to solve isolated problems, even when those problems are closely related. One example is signal-background classification for a particle with a range of possible masses. Previously, a separate neural network was trained for each mass value, and such a set of isolated networks cannot smoothly interpolate between masses. If we instead parameterise the network, so that its input includes one or more physics parameters in addition to the event-level features, the model can interpolate smoothly to parameter values on which it was not trained. Here that parameter can be the mass of the new particle produced.

In this project we demonstrate that a parameterised neural network, trained on data in which a particular set of parameter values appears as an input feature, makes accurate predictions even for a parameter value on which it was not trained. The parameter considered is the mass of the particle.

In the following demonstration we use high energy collision data that is publicly available at the UCI Machine Learning Repository.
The implementation uses PyTorch.

We start by loading collision training data whose mass values come from the set {500, 750, 1250, 1500}. We also load one collision test set with masses from the same set {500, 750, 1250, 1500} and a second collision test set for mass = 1000.
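The hold-out idea can be sketched in plain Python. This is only an illustration: the row layout `(features, mass, label)`, the helper name `split_by_mass`, and the toy values are all assumptions, and it assumes a single combined table rather than the separate files described above.

```python
# Hypothetical row layout: (features, mass, label). The real dataset's
# column order may differ; this only illustrates the hold-out idea.
TRAIN_MASSES = {500, 750, 1250, 1500}
HELD_OUT_MASS = 1000


def split_by_mass(rows, train_masses=TRAIN_MASSES, held_out_mass=HELD_OUT_MASS):
    """Separate rows whose mass was seen in training from the held-out mass."""
    seen = [r for r in rows if r[1] in train_masses]
    unseen = [r for r in rows if r[1] == held_out_mass]
    return seen, unseen


# Toy rows standing in for collision events (values are made up).
rows = [
    ([0.1, 0.2], 500, 1),
    ([0.3, 0.4], 1000, 0),
    ([0.5, 0.6], 1250, 1),
    ([0.7, 0.8], 1000, 1),
]
seen, unseen = split_by_mass(rows)
print(len(seen), len(unseen))
```

The point of the split is that no event with mass = 1000 ever reaches the training loop, so any skill at that mass must come from interpolation.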

We then build a neural network classifier whose first layer has as many input neurons as there are features, which is 28. The last layer has a single output neuron, since this is a binary classification task. Hidden layers with activation functions sit in between. Because this is a classification task, the loss function is Binary Cross Entropy Loss, and the optimiser is Stochastic Gradient Descent. The data is fed to the network in batches, with a forward pass and a backward pass for each batch.
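A minimal PyTorch sketch of such a classifier and its batched training loop is shown below. The input width of 28, the single output, the BCE loss, and SGD come from the description above. The hidden layer sizes, the ReLU activations, the learning rate, the batch size, and the synthetic data are assumptions made for illustration, not the notebook's actual settings.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

N_FEATURES = 28  # input width stated in the text

# Hidden sizes and ReLU are assumptions; the text only mentions hidden layers.
model = nn.Sequential(
    nn.Linear(N_FEATURES, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),  # maps the single output to a probability for BCELoss
)

loss_fn = nn.BCELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # lr is an assumption

# Synthetic stand-in data: the label depends on the first feature only.
X = torch.randn(512, N_FEATURES)
y = (X[:, 0] > 0).float().unsqueeze(1)
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

losses = []  # mean training loss per epoch
for epoch in range(20):
    epoch_loss = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()          # clear gradients from the previous batch
        loss = loss_fn(model(xb), yb)  # forward pass
        loss.backward()                # backward pass
        optimizer.step()               # SGD update
        epoch_loss += loss.item() * xb.size(0)
    losses.append(epoch_loss / len(X))
```

A common alternative is to drop the final `nn.Sigmoid()` and use `nn.BCEWithLogitsLoss`, which is numerically more stable; the sketch keeps the explicit sigmoid so that the output reads directly as a signal probability.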

After training the neural network on the data with masses in {500, 750, 1250, 1500}, we make predictions on the collision test data with masses in {500, 750, 1250, 1500}. The area under the Receiver Operating Characteristic curve (ROC AUC) obtained was 0.938.
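For reference, the area under the ROC curve can be computed from labels and predicted scores with the rank-sum (Mann-Whitney U) identity. The function below is a self-contained sketch with a hypothetical name `roc_auc`; the notebook may well use a library routine instead.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the rank-sum (Mann-Whitney U) identity.

    Equals the probability that a randomly chosen positive scores higher
    than a randomly chosen negative, with ties counted as one half.
    """
    pairs = sorted(zip(scores, labels))
    n = len(pairs)
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("ROC AUC needs at least one positive and one negative")

    rank_sum_pos = 0.0
    i = 0
    while i < n:
        # Find the block of tied scores pairs[i:j].
        j = i
        while j < n and pairs[j][0] == pairs[i][0]:
            j += 1
        avg_rank = (i + 1 + j) / 2.0  # mean of the 1-based ranks i+1 .. j
        positives_in_block = sum(1 for k in range(i, j) if pairs[k][1] == 1)
        rank_sum_pos += avg_rank * positives_in_block
        i = j

    return (rank_sum_pos - n_pos * (n_pos + 1) / 2.0) / (n_pos * n_neg)


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to random guessing and 1.0 to perfect separation of signal from background.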

Next we use the same model to make predictions on the collision test data for mass = 1000. The area under the Receiver Operating Characteristic curve obtained was 0.969.

Note that the model was trained exclusively on collision data with masses in {500, 750, 1250, 1500}. Yet on the collision test data for mass = 1000, the ROC AUC (0.969) was at least as good as the ROC AUC on the test data with masses in {500, 750, 1250, 1500} (0.938). In both cases the ROC AUC was greater than 0.9.

This demonstrates that the neural network successfully interpolated over the mass feature to make accurate predictions for an unseen mass.

A detailed description of the code can be found in the Jupyter notebook.
