Through a series of blog posts (Extracting the antenna pattern from a beacon signal, pt. I: Initial investigations using autocorrelation; pt. II: Measurements of LA2SHF; pt. III: Frequency analysis and code), we have outlined an approach for measuring the antenna pattern when the signal source is a deterministic, machine-generated morse signal. So far this has been done using autocorrelation as the base technique, which is appropriate when the signal is periodic. Some weaknesses of this approach were uncovered when measuring LA2SHF at a high sample rate, as inconsistent timing offsets led to suboptimal signal templating, which had to be dealt with using some ad hoc thresholding approaches.

Today, we’ll outline an alternative, probabilistic approach, which doesn’t assume anything about the timing of the signal, but rather tries to classify each individual sample as belonging to either a pause or a tone segment using probability distributions. As mentioned in the first blog post (Initial investigations using autocorrelation), this was the original idea until LB6RH Jens suggested using correlation. Even though that method worked quite well, it is still fun to see whether the first idea had any merit.

The basic idea is to characterize the probability distribution of the noise floor, and use that to determine whether a given sample belongs to the noise floor or is actually part of the beacon signal. This can be formulated as a hypothesis test, where we compare a null hypothesis against an alternative hypothesis:

H0: The sample belongs to the noise floor
H1: The sample does not belong to the noise floor (i.e. it is a part of the signal)

Formulating the problem as a hypothesis test specifies it more clearly, and puts it in the same format as other hypothesis tests known from statistics. The trick with hypothesis tests is to specify the test in such a way that H0 involves a known probability distribution, while we don’t have to assume anything special about the probability distribution associated with H1. This makes for a very robust method.

We need sufficient reason to reject H0 for a given sample, x0. The “sufficient reason” here is whether the noise floor probability distribution could plausibly have produced the observed sample or not. We quantify this by calculating the probability that the noise floor distribution produces a sample more extreme than the observed one, i.e. we calculate the probability P(x >= x0) given that H0 is true. If this probability is high, i.e. it is very likely that the sample could have been produced under H0, we don’t reject H0. On the other hand, if the probability is sufficiently low (typically below 0.05 or 0.01), we say that we have sufficient reason to reject H0 in favor of the alternative hypothesis.
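As a toy illustration of the decision rule, here is a minimal sketch assuming (for now) a Gaussian noise floor with a known standard deviation; the distribution we actually end up using later in the post is a folded normal, and the numbers below are placeholders:

```python
from scipy.stats import norm

sigma = 0.0124   # assumed noise floor standard deviation (an estimate appears later)
alpha = 0.01     # critical level

def rejects_h0(x0):
    """Reject H0 if the noise floor is unlikely to produce a sample >= x0."""
    p_value = norm.sf(x0, loc=0.0, scale=sigma)   # P(x >= x0) under H0
    return p_value < alpha
```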

The trick here is “given that the H0 hypothesis was true”. We don’t have to calculate anything related to the H1 probability distribution. We probably wouldn’t be able to do that anyway, since the signal amplitude varies so much with azimuth angle, but in theory, the noise floor distribution should stay approximately the same in all directions, unless the signal sources that make up the noise floor change dramatically with direction.

Our main obstacle is therefore to characterize the probability distribution of the noise floor within the pauses of the signal. The noise here should be fairly random, uncorrelated and independent, and might even be Gaussian-like for all we know, which would make our calculations even easier. The noise should vary around zero, sometimes negative and sometimes positive.

The FFT magnitude, however, is always non-negative, as it is the absolute value of a complex number, and can’t really be expected to vary around anything. The noise we see here is actually the magnitude of another noise distribution.

Each FFT component corresponds to a cosine, and the phase and magnitude of the FFT bin describe the phase and magnitude of that cosine.

The noise on top of this signal is something that varies around 0, and should be easier to characterize. We will focus on the real part of the signal from here on, as the real and imaginary parts of the FFT express something similar and are both distributed around zero.

Extracting the background level from the pauses requires that we already know where the pauses are, which defeats the purpose of this analysis. But we actually have access to more than just the beacon signal:


We also have access to the surrounding bins, which contain no beacon signal and would thus behave like one very long pause. How well this holds depends on spectral leakage, the distance from the beacon frequency, and the presence of other signals, but for our measurement the surrounding bins should have approximately the same noise floor probability distribution as the pauses in the beacon signal. We can also see this from visual inspection of the waterfall diagram, though some spectral leakage is evident at some samples, made clearer when plotting one of the bins as a function of time:

The leakage is more evident at azimuth angles where the signal is strong (see the spikes). Some correlated trending is evident, for example around sample number 5000. Otherwise, this looks like nice, uncorrelated noise.
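For reference, a minimal sketch of how such a per-bin time series could be obtained from raw IQ samples; the variable names (iq, fft_size, bin_index) and the plain non-overlapping FFT are assumptions for illustration, not the exact processing used for these measurements:

```python
import numpy as np

def bin_time_series(iq, fft_size, bin_index):
    """Complex value of a single FFT bin as a function of time (one value per FFT frame)."""
    num_frames = len(iq) // fft_size
    frames = iq[:num_frames * fft_size].reshape(num_frames, fft_size)
    spectra = np.fft.fft(frames, axis=1)
    return spectra[:, bin_index]

# beacon = bin_time_series(iq, 1024, beacon_bin)          # bin containing the morse signal
# background = bin_time_series(iq, 1024, background_bin)  # neighbouring bin, noise floor only
```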

Taking the histogram of the above time series shows that the distribution looks visually very similar to a Gaussian distribution. We’ll test this with a normality test, which tests whether the samples could be described by a Gaussian distribution. Since we already know that some of the samples can’t, and we don’t know exactly where, we do this in a sliding-window fashion: a window of 5000 samples (the number of samples arbitrarily chosen) is slid along the time direction of the signal, and the normality test is applied to each subset. The normality test applied is scipy.stats.normaltest.
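A minimal sketch of this sliding-window test; the step size and function name are my own choices for illustration:

```python
import numpy as np
from scipy.stats import normaltest

def sliding_normaltest(samples, window=5000, step=500):
    """p-value of scipy.stats.normaltest for each position of a sliding window."""
    starts = range(0, len(samples) - window + 1, step)
    return np.array([normaltest(samples[s:s + window]).pvalue for s in starts])

# p_values = sliding_normaltest(np.real(background))
```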

The implicitly tested hypothesis here is H0: the subset comes from a normal probability density, against H1: the subset comes from a different probability density, and the p-value expresses the probability of producing a subset at least this extreme, given that H0 is true. We reject H0 for sufficiently low probabilities. For a large bulk of the signal, we don’t seem to have sufficient grounds to reject the H0 hypothesis; it seems that the probability density function for these parts actually is the normal distribution, and that the signal is dominated by normally distributed noise here. The other parts of the signal are dominated by spectral leakage from the correlated behavior of the beacon. Since we chose a subset consisting of 5000 samples, we are not checking individual pauses in the signal (these are longer than 5000 samples), and these don’t show up. But it doesn’t really matter: we have with this been able to identify a larger bulk of signal in a peripheral bin which we can use for general noise/background characterization. (Is this misuse of statistics? I don’t know, I’m a physicist. 8))

Note: The noise on top of the within-signal samples most likely also has a normal distribution, but the distribution changes from sample to sample due to the variation with azimuth direction. A larger bulk of that signal would then not behave like a single normal distribution, but like a mixture of multiple normal distributions.

In this case, we have the estimated mean µ = -0.0002 and standard deviation σ = 0.0124, estimated using 9259 samples. The standard deviation characterizes the amount of noise, and the hope is that this does not change much with azimuth angle.

The noise is normally distributed when we look at the real or imaginary parts of the signal. These vary sinusoidally, between -magnitude and +magnitude. To decide whether to reject H0, we need to calculate the probability/p-value of an observation more “extreme” than the current sample. “More extreme” here would be the presence of a cosine instead of the normally distributed noise, which is challenging to formulate in a way suitable for probability calculation. If we work with magnitudes, however, the presence of a cosine just means a non-zero magnitude. A more extreme sample is then simply a value x higher than the current sample x0, and we calculate the probability P(x >= x0). This is simple to formulate and calculate. If a stochastic variable X is normally distributed, the magnitude |X| has a folded normal distribution, which luckily is available in Python as scipy.stats.foldnorm, with methods for calculating P(x >= x0).
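A sketch of the p-value calculation with scipy.stats.foldnorm; the mapping of our estimated µ and σ onto scipy’s shape/scale parametrization is spelled out in the comments, and the variable names are illustrative assumptions:

```python
import numpy as np
from scipy.stats import foldnorm

def noise_floor_p_values(magnitudes, mu, sigma):
    """P(|X| >= x0) when X ~ N(mu, sigma), i.e. when |X| is folded normal.

    In scipy's parametrization the folded normal has shape c = |mu|/sigma,
    loc = 0 and scale = sigma.
    """
    c = abs(mu) / sigma
    return foldnorm.sf(magnitudes, c, loc=0.0, scale=sigma)

# mu, sigma estimated from the real part of the background bin,
# e.g. mu = -0.0002, sigma = 0.0124
# p_values = noise_floor_p_values(np.abs(np.real(beacon)), mu, sigma)
```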

Calculating, for each sample, the probability that the noise floor distribution could produce a more extreme value than the observed one gives the following:

Unfortunately, it seems that the noise floor in the pauses is not necessarily exactly the same as the noise floor picked out at an outlying bin, as some p-values in the pauses are pretty low. Still, for the high-magnitude samples, the p-values are more or less exactly 0.0, meaning that we can set the critical level very low.

This approach actually works very well, and it is mainly the rising samples at the segment edges that mar the signal level, as these are also (correctly) classified as high. This was something we also had a problem with in the autocorrelation approach.

Plotting the real and imaginary parts of the signal might indicate that these samples also follow the same trend as the within-signal samples, and should have had the same signal magnitude:


It also looks worse than it is, since each signal segment is slightly out of phase with the previous segment due to the arbitrary pause between them.

Unwrapping the phase yields a nice, descending curve:

With the unwrapped phase, it becomes more apparent that there is transient behavior at the edges, as the edges have a different slope than the nice, linear behavior within each segment (see for example after sample 175).
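In code, the unwrapping is simply numpy’s np.unwrap applied to the phase of the complex bin values (the variable name beacon is an assumption carried over from the earlier sketches):

```python
import numpy as np

# phase of the complex FFT bin, unwrapped so that the 2*pi jumps are removed
unwrapped_phase = np.unwrap(np.angle(beacon))
```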

In any case: the behavior seems to extend at most one or two samples outside the signal segment, and could be due to the FFT including contributions from both the pause and the tone. We could have tried to extend the correct behavior of the tone segment by extrapolating the phase and somehow adjusting the magnitude, but we’d rather just remove it. We previously dealt with this using a median filter, but it is probably more appropriate to use an erosion operation. This strips away samples at the edges of each segment according to a structuring element; erosion is frequently used in two-dimensional image processing, but is here applied to our one-dimensional signal.
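A sketch of how the erosion could be applied to the boolean tone/pause classification, using scipy.ndimage; the length of the structuring element (how many edge samples get stripped) is a tunable assumption:

```python
import numpy as np
from scipy.ndimage import binary_erosion

alpha = 1e-3                                      # critical level, chosen very low
signal_mask = p_values < alpha                    # True where H0 is rejected (tone)

# strip one sample from each edge of every tone segment
eroded_mask = binary_erosion(signal_mask, structure=np.ones(3, dtype=bool))
```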

This gives us a pretty nice result:

Bæm! The only problem now is that there is weird behavior in the LOS direction. Investigating this more closely before the erosion operation: in the LOS direction, the noise floor seems to differ a bit from the noise floor of the surrounding bins, even in the pauses. The probability of these samples coming from the same distribution is therefore calculated to be quite low.

Zooming in on the pauses at LOS shows that the noise floor actually also has trends, and isn’t centered around an expectation of 0 anymore. This breaks a bit with our assumptions about the behavior of the noise floor probability distribution.

Interestingly, this feature is present only for the beacon bin of interest:

This, then, is the weakness of this approach: if the noise changes its probability distribution, we can’t characterize it correctly anymore. Since the noise floor doesn’t show the same behavior in the outlying bin (only approximately 1 kHz away from the beacon bin), we suspect that this behavior is present only in our receiver due to the high signal level, or is again just due to the FFT combining contributions from signal and pause into the observed sample, giving a time-transient effect. The method would also break down if the noise floor changed its standard deviation, or if the signal level were so low that it is mistaken for noise.

For LA2VHF, this approach works well for all directions, and the only problem is where the signal level is so low that it gets classified as background (samples 1000-6000). In this case, the uncorrelated properties of the noise floor could be used to better characterize the signal, as the high-signal samples have trends and behavior that should be very improbable under the noise floor assumption. For example, over time, the frequency of specific sample values is supposed to adhere to the probability density at hand. A single occurrence of a sample with magnitude above 0.2 may have a high probability when considered in isolation – let’s say, probability 0.3 – but over time, only 30% of the samples should then have magnitudes above 0.2. If this changes, it is no longer the same distribution.
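One way to exploit this, sketched below, would be to compare the fraction of samples exceeding some level within a window against the fraction expected under the noise floor distribution; the level, window length and function name are assumptions for illustration:

```python
import numpy as np
from scipy.stats import foldnorm

def exceedance_mismatch(magnitudes, mu, sigma, level=0.2, window=200):
    """Observed minus expected fraction of samples above `level` in each window,
    where the expectation comes from the folded normal noise floor model."""
    expected = foldnorm.sf(level, abs(mu) / sigma, loc=0.0, scale=sigma)
    starts = range(0, len(magnitudes) - window + 1, window)
    observed = np.array([np.mean(magnitudes[s:s + window] > level) for s in starts])
    return observed - expected
```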

The simple approach presented here actually falls under the larger umbrella of methods categorized as anomaly detection, where the goal is to detect anomalies from some kind of normal state. The “normal state” in our case is the noise floor, while the “anomalies” are the signals. In the end, the presented approach is nothing more than a glorified threshold operation, but with a threshold that has a better probabilistic foundation and behaves in a more non-linear fashion. The formulation also opens the way for more advanced probability calculations and interpretations.
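Put together, the whole classification boils down to a few lines. The sketch below combines the pieces from earlier under the same assumptions; it is a simplified illustration, not the exact code published on GitHub:

```python
import numpy as np
from scipy.ndimage import binary_erosion
from scipy.stats import foldnorm

def classify_samples(beacon, background, alpha=1e-3):
    """Classify each sample of the beacon bin as tone (True) or pause (False)."""
    real_noise = np.real(background)                       # noise floor from a peripheral bin
    mu, sigma = np.mean(real_noise), np.std(real_noise)
    c = abs(mu) / sigma                                    # scipy's foldnorm shape parameter
    p_values = foldnorm.sf(np.abs(np.real(beacon)), c, loc=0.0, scale=sigma)
    tone_mask = p_values < alpha                           # reject H0 -> part of the signal
    return binary_erosion(tone_mask, structure=np.ones(3, dtype=bool))
```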

We’ve put up the source code for the presented approach on GitHub, along with the previous approaches. This is the last blog post of this series on antenna pattern extraction.

Previous parts of the series: