Recently I was given some radio data captured on the 40m band. Using a piece of software called "Universal Radio Hacker", I attempted to decode it. At the time I thought that this might be Morse code, since then I've been told by someone who has been using Morse longer than I've been alive, that it isn't.
I shared the data on my VK6FLAB GitHub repository where you can download it and see what you learn, and perhaps repeat what I did, or better still, improve on it.
Over the years I've talked a little about how Software Defined Radio or SDR works, essentially it's a glorified Analogue to Digital converter, much like the sound card in your computer, which does the same, albeit at a much lower frequency. As it happens, you can represent the signal that comes into your radio antenna as a series of values. Essentially, the stronger the signal, the bigger the number, the weaker the signal, the lower the number.
Let's talk about the characteristics of this signal. It consists of two parallel signals, in opposition to each other. The first signal jumps intermittently between 7 kHz and 40 kHz, where the second jumps between -7 kHz and -40 kHz. The recording is marked 7.06 MHz, so if we think of that as the central frequency, the whole signal sits between 7.02 and 7.1 MHz. This 80 kHz wide signal is not something you'd typically be able to hear using a standard amateur radio receiver which tops out at about 3 kHz bandwidth. It's so wide that you couldn't even hear more than one of the four tones at the same time.
Randall VK6WR, who supplied the recording, spotted it on a waterfall display showing a chunk of radio spectrum, in fact, a $25 RTL-SDR dongle could receive this signal.
Aside from the fact that this is a really wide signal, well at least in traditional amateur radio terms, it was interesting in that it was heard on the 40m band. As it happens, just after I shared my initial exploration, I was told by several other amateurs that they had heard the signal. I even saw it on a WebSDR in India and attempted to record it, but failed.
As it happens, a few weeks ago, I was playing with something called "CAN Bus", or Controller Area Network, a technology that was designed in 1983 and is used all over cars for things like sensors for speed, engine temperature, oxygen level, detonation timing and anything else that's happening inside a car. You might know the end-user view of this called OBD2 or On Board Diagnostics, second generation. I was looking into it because my car has been acting up and I've been trying to track down the root cause.
Anyway, I learned that CAN Bus is implemented using something neat, "differential signalling", where two wires each carry the same, but opposite signal, so they can be combined to ensure that in an electrically noisy environment like a car, the information still gets where it needs to go.
Seeing the radio signal Randall shared, reminded me of this.
Noise immunity is a useful attribute in digital HF communication, so I can understand why it was done like this, but it also means that either signal was sufficient to start to decode the information. We can use Universal Radio Hacker to show us only half the signal using a band pass filter.
I then decided that the 40 kHz frequency was "on" and represented by a "one" and the 7 kHz frequency was "off", represented by a "zero". Of course that's entirely arbitrary, there's no reason that it cannot be the other way around, but for our purposes it doesn't matter at this time.
That said, we don't yet have enough to decode the actual signal. We need to figure out how long each switch, or bit, lasts, because two zero's side-by-side or two ones side-by-side would look like a long "off" or a long "on". Using that logic, you could also say that the shortest possible duration for a 40 kHz or a 7 kHz tone would represent a single "one" or a single "zero".
Of course, this is a simplified view of the world. For example, the data file contains more than thirteen and a half million bytes. Half of those are for the I in I/Q, the other for the Q. I'm purposefully glossing over a bunch of stuff here, specifically the notion of so-called I/Q signals, that's for another time.
In computing a single byte can represent 256 different values. It means that if the signal is represented by a single byte, a voltage from the antenna at maximum amplitude can be represented as 255 and the minimum amplitude as 0. As it happens, voltages go up and down around zero, so, now we're only using half a byte, 127 for maximum, -128 for minimum. If we use two bytes, we get significantly more resolution, -32,768 as the minimum and 32,767 as the max.
A little trial and error using another tool, "inspectrum", told me that the data was organised as two bytes per sample. Which brings the next point. How many samples per signal?
Said differently, we're measuring the antenna voltage several times per second, let's say twice per second. If a tone of 7 kHz lasts a second, then we get two samples showing 7 kHz. If it lasts half a second, we only get one. As it happens, we're measuring over 22,000 times per second and using the cursor feature on Universal Radio Hacker, we can determine that each signal lasts 2,500 samples. It's roughly a rate of 100 bits per second. The "inspectrum" tool puts it at 91.81 Baud. It's not a standard Baud rate, sitting between 75 and 110 Baud.
Using Universal Radio Hacker, I was able to decode 1,416 bits. You'll find them on my GitHub page next to the signal.
Now for the fun. What does it mean?
I started with looking for structure, by looking for zeroes. In short order I discovered several sequences of zero, then I noticed that there appeared to be a repeating pattern. After some trial and error, using the "grep" and "fold" commands on my Linux terminal, I discovered that the pattern repeats, more or less, every 255 bits. I say more or less, because there are a few bits that are not the same. I suspect that this is a decoding error which could potentially have been eliminated by using the noise immunity features associated with the differential signalling, but I don't yet know how to do that.
Here's what I think I'm looking at.
It appears to be a signal that's a unique identifier, specifically so that it can be used to synchronise two things together. In this case, I suspect that it's an over the horizon radar and the sequence is used to synchronise the transmitter and the receiver. I think that the signal strength variations are what allows reflections to be measured and I suspect that the actual transmitter and receiver are using more than two bytes to represent each sample, but I'm speculating.
If you have an alternative explanation, I'm all ears.
I'm Onno VK6FLAB