RFI-classifier

A PyTorch implementation of a supervised machine learning model that classifies different morphologies of radio frequency interference (RFI) signals in radio telescope data.

Project report: RFI_classifier.pdf

Contributors:

Akshay Suresh (mentor, project lead)
Ryan J. Hill (Cornell University undergraduate research intern, Fall 2019)
Ethan S. Bair (Cornell University undergraduate research intern, Fall 2019)

Project Goal

Interference signals from human technologies frequently confound searches for exotic astrophysical phenomena. With modern radio telescopes generating data at rates exceeding 100 GB/hr, automated methods are necessary to identify and flag data segments rife with interference. Unflagged data chunks can then be processed by downstream pipelines tuned to specific science cases.

Here, we experiment with multiple toy convolutional neural network (CNN) models to distinguish between various morphologies of interference signals in radio telescope data.

Methodology

As a first pass, we defined the following 5 classes for our signal classification task.

llnb: Long-lived narrowband interference + background noise
slnb: Short-lived narrowband interference + background noise
llbb: Long-lived broadband interference + background noise
slbb: Short-lived broadband interference + background noise
noise: Background noise only
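
As a rough illustration of the five classes above, the following NumPy sketch builds one frequency-time snippet per class label. It is not the project's own simulation code. The function name `simulate_snippet`, the 64 x 64 snippet size, the rectangular signal shape, and the bandwidth and duration fractions are all assumptions chosen for illustration.

```python
import numpy as np

# Class labels as defined in the list above.
CLASSES = ("llnb", "slnb", "llbb", "slbb", "noise")


def simulate_snippet(label, n_chan=64, n_time=64, snr=10.0, rng=None):
    """Return an (n_chan, n_time) frequency-time array for one class label.

    Hypothetical helper: sizes, fractions, and the boxcar signal shape
    are illustrative assumptions, not values from the project.
    """
    if label not in CLASSES:
        raise ValueError(f"unknown class: {label}")
    rng = np.random.default_rng() if rng is None else rng

    # Background noise: unit-variance Gaussian in every pixel.
    data = rng.normal(0.0, 1.0, size=(n_chan, n_time))
    if label == "noise":
        return data

    # Narrowband ("nb") occupies a few channels; broadband ("bb") most of the band.
    n_f = max(1, n_chan // 16) if label.endswith("nb") else (3 * n_chan) // 4
    # Long-lived ("ll") spans most of the snippet; short-lived ("sl") a few samples.
    n_t = (3 * n_time) // 4 if label.startswith("ll") else max(1, n_time // 16)

    # Place the rectangular interference patch at a random position.
    f0 = rng.integers(0, n_chan - n_f + 1)
    t0 = rng.integers(0, n_time - n_t + 1)
    data[f0:f0 + n_f, t0:t0 + n_t] += snr
    return data


# One example per class, e.g. for plotting with matplotlib's imshow.
examples = {label: simulate_snippet(label) for label in CLASSES}
```

Drawing the same number of snippets for every label yields a class-balanced data set by construction.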

Simulated frequency-time diagrams of the first 4 signal classes are presented below. Slide credit: Ryan J. Hill

NOTE: In our study, we generated simulated data to ensure that our training and validation data are balanced across all classes. This choice allows us to evaluate model performance using the accuracy metric. Refer to the Appendix of our project report for the full confusion matrices obtained with different CNN models.
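
To make the link between balanced classes and the accuracy metric concrete, here is a small sketch that computes a confusion matrix, the overall accuracy, and the per-class recall. The class-index order (0 = llnb through 4 = noise) and the toy labels are assumptions for illustration, not results from the project report.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes):
    """Rows index the true class, columns the predicted class."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm


def accuracy(cm):
    """Fraction of all examples on the diagonal."""
    return np.trace(cm) / cm.sum()


def per_class_recall(cm):
    """Fraction of each true class that was predicted correctly."""
    return np.diag(cm) / cm.sum(axis=1)


# Balanced toy labels: 4 examples for each of the 5 classes.
y_true = np.repeat(np.arange(5), 4)
y_pred = y_true.copy()
y_pred[0] = 1    # one class-0 snippet misclassified as class 1
y_pred[19] = 3   # one class-4 snippet misclassified as class 3

cm = confusion_matrix(y_true, y_pred, n_classes=5)
print(cm)
print("accuracy:", accuracy(cm))
print("mean per-class recall:", per_class_recall(cm).mean())
```

When every class has the same number of examples, the overall accuracy equals the mean per-class recall, so a single accuracy figure is not dominated by any one class.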

Takeaways

Figure credit: Ethan S. Bair

Trialing CNNs of different depths, we observe that network accuracy across all signal classes grows with increasing model depth. However, the incremental gain in accuracy diminishes with every added layer. Setting a 95% accuracy threshold, the above plot suggests that an 8- or 9-layer CNN model would be adequate for our classification problem.
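
A depth sweep of this kind can be sketched in PyTorch with a model whose number of convolutional layers is a constructor argument. This is a minimal sketch under stated assumptions, not the architecture from the project report: the class name `ToyCNN`, the channel width, the kernel size, the pooling choice, and the way layers are counted are all hypothetical.

```python
import torch
from torch import nn


class ToyCNN(nn.Module):
    """Hypothetical CNN with a configurable number of conv layers."""

    def __init__(self, n_conv_layers=3, n_classes=5, width=16):
        super().__init__()
        layers = []
        in_ch = 1  # single-channel frequency-time image
        for _ in range(n_conv_layers):
            layers += [nn.Conv2d(in_ch, width, kernel_size=3, padding=1), nn.ReLU()]
            in_ch = width
        self.features = nn.Sequential(*layers)
        # Global average pooling keeps the classifier independent of input size.
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(width, n_classes)

    def forward(self, x):
        x = self.features(x)
        x = self.pool(x).flatten(1)
        return self.classifier(x)  # raw logits, one per class


# Build models of several depths and check the output shape on random input.
for depth in (2, 4, 8):
    model = ToyCNN(n_conv_layers=depth)
    logits = model(torch.randn(4, 1, 64, 64))
    n_params = sum(p.numel() for p in model.parameters())
    print(depth, tuple(logits.shape), n_params)
```

Training each depth on the same balanced data and recording the validation accuracy would give an accuracy-versus-depth curve like the one described above.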

Areas for Improvement

- Our definition of interference signal classes is overly simplistic and needs refinement based on inputs from real-world radio telescope data.
- Models do not account for scenarios where multiple signal classes are present in a single frequency-time snippet. For instance, what if an astrophysical signal of interest overlaps in time with two bright interference signals of different bandwidths?
  - Perhaps multilabel classification is worth exploring.
  - Alternatively, the task could be framed as an image segmentation problem.
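
For the multilabel option, one possible formulation gives each interference type its own independent yes/no output and trains with a binary cross-entropy loss. The sketch below uses PyTorch's `nn.BCEWithLogitsLoss`. The logits and targets are made-up values for illustration, and treating "noise" as the all-zeros target is an assumption rather than a choice made in this project.

```python
import torch
from torch import nn

# One independent output per interference type; noise-only = all zeros (assumption).
SIGNAL_TYPES = ("llnb", "slnb", "llbb", "slbb")

# Hypothetical logits from any network with 4 outputs, for a batch of 3 snippets.
logits = torch.tensor([[ 2.0, -3.0,  1.5, -2.0],
                       [-2.5, -1.0, -3.0, -2.0],
                       [-1.0,  3.0, -2.0,  0.5]])

# Multi-hot targets: several entries in a row may be 1 at the same time.
targets = torch.tensor([[1.0, 0.0, 1.0, 0.0],
                        [0.0, 0.0, 0.0, 0.0],
                        [0.0, 1.0, 0.0, 1.0]])

# BCEWithLogitsLoss applies a sigmoid to each output independently,
# unlike the softmax used for single-label classification.
loss = nn.BCEWithLogitsLoss()(logits, targets)

# At inference time, threshold each sigmoid probability separately.
predicted = torch.sigmoid(logits) > 0.5
labels = [
    [name for name, on in zip(SIGNAL_TYPES, row) if on]
    for row in predicted.tolist()
]
print(loss.item(), labels)
```

A snippet can then carry zero, one, or several labels, which covers the overlapping-signal scenario without enumerating every combination as its own class.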

Troubleshooting and Feedback

Please submit an issue to voice any problems or requests. Constructive criticism is always welcome.
