Finding galaxy sources using ConvNets

Vesna Lukic
Mar 26, 2021

In the previous two posts, we explored using deep neural networks to classify radio sources.

In the first of these two posts, we discussed classifying sources from the Radio Galaxy Zoo project according to the number of components. The figure on the left shows compact sources in the top row, and 1-, 2- and ≥3-component extended sources in the subsequent rows.

In the second, we compared the performance of Capsule Networks against traditional convolutional neural networks in classifying images from LOFAR, according to the most common radio galaxy classes.

In this third and final post of the series, we explore whether we can train simple convolutional neural network architectures to find radio galaxy sources, using simulated images from the SKA.

At the end of 2018, the SKA released a science data challenge to detect, classify and characterise different kinds of radio sources (star-forming galaxies, and steep- and flat-spectrum radio sources). The sources were simulated to mimic the kinds of sources expected to be detected when the SKA begins to collect data.

The data consists of images at 3 different frequencies and exposure times. One can see that the images are less noisy at longer exposure times, and they appear noisier at higher frequencies.

Before embarking on the source-finding task, we must decide what makes up a radio source. A common approach is to first calculate the mean pixel value across the entire image and use it as a threshold: pixels above this value are considered part of a source, and pixels below it are considered noise.

One problem, however, is that noise in the radio regime tends to be correlated. A small cluster of noise pixels that sits close to the threshold value can therefore look like a source.

Source-finding is also more difficult at lower signal-to-noise ratios (SNRs), because the signal pixels are closer in value to the noise pixels and are therefore harder to isolate.
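The mean-threshold rule described above can be sketched in a few lines of NumPy. This is only an illustration on a toy array: the function name `threshold_at_mean` and the pixel values are made up for this example, and real pipelines usually use more robust noise estimates.

```python
import numpy as np

def threshold_at_mean(image):
    """Return (mask, threshold): mask is True where a pixel exceeds the image mean."""
    image = np.asarray(image, dtype=float)
    threshold = image.mean()          # one global threshold for the whole image
    return image > threshold, threshold

# Toy 4x4 "image": a bright 2x2 source on a faint background (made-up values)
image = np.array([
    [0.0, 0.1, 0.0, 0.1],
    [0.1, 5.0, 4.0, 0.0],
    [0.0, 4.0, 5.0, 0.1],
    [0.1, 0.0, 0.1, 0.0],
])

mask, threshold = threshold_at_mean(image)
print("threshold:", threshold)
print("pixels flagged as source:", int(mask.sum()))
```

Only the four bright pixels sit above the mean here, so they are the ones labelled as source; everything else is treated as noise.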

A common way to do source-finding is to fit Gaussians to sources. This is the technique used by the Python Blob Detector and Source Finder (PyBDSF).

The method we developed is called ConvoSource, and it uses a simple convolutional neural network consisting of 3 convolutional layers.

It is trained to map the real, segmented maps onto the corresponding 'solutions' maps. A solutions map is generated from the X,Y coordinates of the sources. We generated additional images by applying the image augmentation techniques of horizontal and vertical flipping, and rotation by 90-degree increments.
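As a rough sketch of these two steps, the snippet below builds a solutions map from a list of X,Y positions and then generates flipped and rotated copies of an image together with its solutions map. The helper names `make_solutions_map` and `augment` are hypothetical, and two details are assumptions rather than the ConvoSource implementation: X is taken to index columns and Y rows, and each source is marked by a single pixel.

```python
import numpy as np

def make_solutions_map(shape, coords):
    """Binary map with 1 at each (x, y) source position and 0 elsewhere.

    Assumption: x indexes columns and y indexes rows.
    """
    solution = np.zeros(shape, dtype=np.float32)
    for x, y in coords:
        col, row = int(round(x)), int(round(y))
        if 0 <= row < shape[0] and 0 <= col < shape[1]:
            solution[row, col] = 1.0
    return solution

def augment(image, solution):
    """Return (image, solution) pairs for the 4 rotations, each with and without a flip."""
    pairs = []
    for k in range(4):                        # rotate by 0, 90, 180, 270 degrees
        rot_img, rot_sol = np.rot90(image, k), np.rot90(solution, k)
        pairs.append((rot_img, rot_sol))
        # left-right flip; combined with the 180-degree rotation this gives the up-down flip
        pairs.append((np.fliplr(rot_img), np.fliplr(rot_sol)))
    return pairs

image = np.arange(16).reshape(4, 4)           # toy image with distinct pixel values
solution = make_solutions_map(image.shape, [(1, 2)])
pairs = augment(image, solution)
print(len(pairs), "augmented image/solution pairs")
```

The important point is that every transformation is applied to the image and its solutions map together, so the marked source positions stay aligned with the pixels they describe.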

Left: Real, segmented maps at a particular exposure time and frequency. Right: The corresponding solutions map.

We segmented the maps into 50x50 pixel blocks (both the original maps and corresponding solution maps), and these blocks were used as inputs into the convolutional neural network.
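One possible way to cut a map into 50x50 blocks with NumPy is shown below. The function name `split_into_blocks` is hypothetical, and dropping any leftover rows or columns that do not fill a whole block is an assumption made for this sketch. The same function would be applied to both the original map and its solutions map so that the blocks stay paired.

```python
import numpy as np

def split_into_blocks(image, block=50):
    """Split a 2-D array into non-overlapping block x block tiles.

    Assumption: rows/columns that do not fill a whole tile are dropped.
    Tiles are returned in row-major order with shape (n_tiles, block, block).
    """
    rows, cols = image.shape
    n_r, n_c = rows // block, cols // block
    trimmed = image[:n_r * block, :n_c * block]
    tiles = trimmed.reshape(n_r, block, n_c, block).swapaxes(1, 2)
    return tiles.reshape(-1, block, block)

image = np.arange(120 * 170).reshape(120, 170)   # toy map, not SKA data
tiles = split_into_blocks(image)
print(tiles.shape)
```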

The image to the right shows some of the feature maps produced by the first and third convolutional layers, as well as the output predictions for the source locations.

If we look closely, we can see that ConvoSource tends to output the source locations spread out over several pixels, and the pixel values range from 0 to 1. This makes it a more flexible source-finder: instead of outputting only binary values of 0 (no source found) and 1 (source found), the output can be interpreted as a probability of finding a source in a particular location.
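To illustrate what this flexibility means in practice, the sketch below applies two different cutoffs to a small made-up output map. The array values, the cutoffs 0.25 and 0.75, and the helper `detections_at` are all illustrative assumptions, not values from ConvoSource.

```python
import numpy as np

def detections_at(prob_map, cutoff):
    """Return (row, col) positions whose predicted value is >= cutoff."""
    rows, cols = np.nonzero(np.asarray(prob_map) >= cutoff)
    return list(zip(rows.tolist(), cols.tolist()))

# Made-up 3x3 output: one confident pixel surrounded by weaker responses
prob_map = np.array([
    [0.05, 0.10, 0.02],
    [0.30, 0.90, 0.40],
    [0.03, 0.35, 0.01],
])

loose = detections_at(prob_map, 0.25)    # keeps the weaker neighbouring pixels too
strict = detections_at(prob_map, 0.75)   # keeps only the most confident pixel
print(len(loose), len(strict))
```

A lower cutoff keeps more candidate pixels and a higher one keeps fewer, a choice that a purely binary output would not offer.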

We compared the source-finding results of ConvoSource to those of PyBDSF using the F1 score, which is the harmonic mean of precision and recall. We found that at lower SNRs, ConvoSource is better at recovering the star-forming galaxy sources, whereas PyBDSF is better at recovering the steep- and flat-spectrum AGN sources. The opposite effect is seen at higher SNRs.
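For reference, the F1 score can be computed from counts of true positives, false positives and false negatives as follows. The counts in the example are invented purely to show the arithmetic; they are not results from the comparison.

```python
def f1_score(true_pos, false_pos, false_neg):
    """F1 = 2 * precision * recall / (precision + recall); returns 0.0 if undefined."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)   # fraction of detections that are real
    recall = true_pos / (true_pos + false_neg)      # fraction of real sources that were found
    return 2 * precision * recall / (precision + recall)

# Invented counts: 8 sources recovered, 2 spurious detections, 4 sources missed
score = f1_score(true_pos=8, false_pos=2, false_neg=4)
print(round(score, 3))
```

Because it combines both quantities, a source-finder cannot score well by simply reporting many detections (high recall, low precision) or only the very brightest ones (high precision, low recall).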

The above image shows the performance of the two source-finders on two different examples. The top row shows 3 sources, all of which ConvoSource recovers, whereas PyBDSF misses the bottom source. The bottom row shows a single source, which ConvoSource identifies; however, it also outputs a spurious source in the bottom right, because it was unable to distinguish correlated noise from a real source. PyBDSF correctly recovers the single source in this block.

In conclusion, we see that ConvoSource outputs pixel values in the range 0 to 1, which can be interpreted as the probability of finding a source at a particular location. The source locations tend to be spread out over several pixels, which has an advantage as well as a disadvantage: more true positives are detected, at the expense of more false positives.

ConvoSource provides a novel approach to the task of source-finding, using a simple convolutional neural network architecture.
