2010/01/30

AGC control in Python modem, version 3 finished

This afternoon I put the last block into Python QAM "modem" version 3: Automatic Gain Control (AGC). The sources can be found at http://epx.com.br/wav, the m3*.py files, as well as gray.py as a dependency. The feature can be tested by e.g. changing volume of qam.wav file using Audacity.

AGC allows the modem to adapt to different power levels. Up to now, V3 modem was not adapting to the "volume" in WAV file; it assumed that TX always used 90% of dynamic range. V2 already had a crude amplitude measurer which only paid attention to the training header sequence. But a true AGC must adapt to changes mid-flight.

This is easier said than done. The main problem is that AGC must figure out the full dynamic range, that is, the maximum amplitude a symbol may have. But there is no guarantee that the "strongest" symbols will actually appear in QAM signal. And, if they appear, how does AGC "know" that they are the ones?

A crude solution would be to send the training sequence at maximum amplitude every few seconds, which is easily detectable as a no-data sound, but this wastes bandwidth and is not 1337 enough :)

The actual solution that real-world modems employ is to work with power level averages. It is trivial to calculate the average amplitude of our constellation. Then, we maintain a moving average of QAM signal reception, and compensate for the difference.

V3 uses a one-second moving average, which is short enough to profit from initial training sequence, and adapts fast enough while actual data is being received. TX was changed to send a training sequence at this average power level instead of the maximum possible power level, so AGC does not have to distinguish between training and data signal parts.

Then we have another problem: will actual signal power average match the theoretical constellation average? In theory, this will happen only if all symbols are equally probable. Real-world modems employ a randomizer on bit stream to ensure that.

I was optimistic that AGC would get a good average without a randomizer, but I was wrong. AGC will not work at all without a very random bit stream! It is incredible how ordinary data, like a text file, is so visibly non-random at such a low level.

So, out of necessity, I have implemented a very simple bit randomizer for V3, with a "4th order polynominal". This is a very pedantic way to say that randomizer takes into account the last 4 bits to figure out the next one, which is calculated with simple XOR operations. To be even more pedantic, I even employed a good programming practice this time: a self-test.


# Randomizer: 1 + x**-3 + x **-4

def randomize(b0):
global b1, b2, b3, b4
bs = b0 ^ (b3 ^ b4)
b1, b2, b3, b4 = bs, b1, b2, b3
return bs

def derandomize(b0):
global b1, b2, b3, b4
bt = b0 ^ (b3 ^ b4)
b1, b2, b3, b4 = b0, b1, b2, b3
return bt

rseq = [ int(random.random() * 2) for x in range(0, 25) ]
b1, b2, b3, b4 = (0, 0, 0, 0)
rseqS = [ randomize(b) for b in rseq ]
b1, b2, b3, b4 = (0, 0, 0, 0)
rseqT = [ derandomize(b) for b in rseqS ]
b1, b2, b3, b4 = (0, 0, 0, 0)

if rseqT != rseq:
print "Randomizer is broken"
print rseq
print rseqS
print rseqT
sys.exit(1)


This randomization scheme was copied directly from Fabio Montoro's "Modem e transmissão de dados" book.

It was funny to realize in practice how important the randomizer is, when you simply don't know the power level at RX side and must figure it out based on data stream itself.

Determining power level using averages, even with the help of randomizer, does not have the same precision as knowing it beforehand. This reduces the number of bits per symbol that modem can handle. V3 had reached 16 bits/symbol, now the highest reliable level is 12. (Turning off AGC in RX side makes it go back to 16.)

AGC was the last thing to implement in V3. The next major milestone, V4, will add some form of convolutional coding to increase noise resistance.

2010/01/29

Reaching 48000 bps



The graph above is the waveform for the 38400 bps QAM signal. While I believe in math, my common sense refuses to believe that any information can be recovered from such a chaotic wave.

The "version 3" QAM modem written in Python is a real winner. Today, some of the FIXMEs in code were implemented, and the thing reached 38400 bps at 2400 baud, and 48000 bps at 3000 baud -- both using 16 bits per symbol. Note that 3000 baud is almost twice the rate of 1800Hz carrier!

It "wanted" to work at 18 bits/symbol, but showed too many errors. I am not sure whether 16-bit WAV quantization is the hard limit here, but it seems to be the case. I was quite surprised to see that transmitted message was still recognizable, albeit corrupted.

The changes in v3 which allowed this performance jump are:

1) added code to detect adjacent symbols which are equal in amplitude and phase. While all V3 constellation symbols cause a phase change, sometimes it is very small, especially if we are sending many bits per symbol.

The algorithm is simple: if a symbol change is not detected within the expected interval, the complex signal is sampled anyway. Idiot as it seems, it allows RX to handle up to 16 bits per symbol, while it handled only 7 before.

2) Added a carrier Hz compensation, PLL-like. Any difference between TX and RX carriers cause a "phase drift" in decoded QAM signal -- that is, phase shows a small but continuous rotation in absolutely every sample. This can be distinguished from sudden, time-limited symbol phase rotations, and compensated for.

I discovered that phase drift is more stable (and hence easier to handle) if RX carrier frequency is smaller than TX. If TX carrier is above TX, phase drift oscilates in a difficult way. So, now we make sure that RX carrier is always "low". For symbols with 16 bits each, RX can tolerate up to 15 Hz carrier deviation, which seems very good to me.

3) Constellation uses Gray code now, so adjacent points are guaranteed to have only one different bit. This "softens" errors (even if RX resolves to the wrong symbol, chances are just one bit will be corrupted), and helps whatever error-correction algorithms in use (currently we use none, but I plan to implement convolutional codes in V4).

Towards v4

Actually, the v3 performance numbers are getting meaningless; of course they are possible just because the medium is a noiseless 16-bit WAV file, whose Shannon capacity is almost 700 kbps. (The Shannon limit of a phone line is around 24kbps.)

The next challenge is to make RX work well on noisy and band-limited signals. Rates like 14400bps will be possible only by implementation of convolutional code, I guess.

2010/01/26

QAM modem: 16800 bps!

As I said in the last post, I was thinking about direct (time-domain) analysis of QAM signal at RX part, in order to allow baud rates comparable to carrier frequency. I did not plan to put that into use... but I could not resist doing that.

The thing was working sooner than expected, and performance went beyond my expectations. It can handle 2400 baud with a 1800Hz carrier, like faster modems used to do. It can currently handle 7 bits per symbol, which translates to 16800bps. Going beyond this will demand improvements in symbol detection, which is currently the same as V2 (which is suboptimal).

V3 modem is basically the same as V2, except by decodification of QAM into complex samples, which V2 did in a "classical" fashion (demodulation + FIR filtering), and V3 does by direct analysis of QAM signal. So, I will list only the relevant V3 RX part:


# Proof that we don't need exact carrier frequency
CARRIER += random.random() * 20 - 10
# Proof that we don't need exact carrier phase
ra = random.random() * 2 * math.pi

symbols = []
f = CARRIER * 2 * math.pi / SAMPLE_RATE
carrier_90 = SAMPLE_RATE / CARRIER / 4.0
carrier_90 = int(carrier_90)
phase_error = 2 * math.pi * CARRIER / SAMPLE_RATE / 2.0
recovered = []

for t in range(0, len(qam) - carrier_90 - 2):
# Take derivatives of now and 90 degrees in future
d = (qam[t+1] - qam[t]) / f
d90 = (qam[t + carrier_90 + 1] - qam[t + carrier_90]) / f

st = math.sin(f * t + ra)
ct = math.cos(f * t + ra)

a = d90 * ct + d * st
b = d90 * st - d * ct

try:
tphase = -b / a
phase = math.atan(tphase)
except ZeroDivisionError:
phase = math.pi / 2
if (-b * a) < 0:
phase = -phase

if (abs(d) > abs(d90)):
amplitude = -d / (st * math.cos(phase) + ct * math.sin(phase))
else:
amplitude = -d90 / (-st * math.sin(phase) + ct * math.cos(phase))

if amplitude < 0:
amplitude = -amplitude
phase += math.pi

phase += math.pi * 2
phase %= math.pi * 2

cpx = crect(amplitude, phase)
# print t, int(phase * 180 / math.pi) % 360, cpx

recovered.append(cpx)


This technique is based on the fact that QAM signal is generated by a very simple formula:

s(t) = m . cos(2πft + p)

where "m" and "p" are the polar coordinates of constellation symbol being currently transmitted. The derivative of that is also a simple formula:

s'(t) = 2πf.m.sin(2πft + p)

The QAM decoding problem is: given s(t), find the unknown variables "m" and "p". We need two samples of s'(t) which are near enough to belong to the same symbol, to make an equation system and solve for "m" and "p".

That's exactly what the piece of code listed above does. The samples are taken 90 degrees apart because the equations would simplify very nicely.

I chose the s'(t) equation instead of s(t) because the QAM signal may not be centered at zero; that is, it may have some offset, or DC component. But the difference between each sample and the next is unaffected by the offset. "Difference" in a discrete signal is equivalent to "derivative" in a continous signal.

It is important to measure samples EXACTLY 90 degrees apart, which means that the "distance" between samples must be an integer number. This implies that WAV must be a multiple of 1800 / 4 = 450 Hz. Using a WAV of 43200 Hz works beautifully, while a WAV of 45000 Hz, which is better in theory and matches carrier, will not bear more than 4 bits/symbol. 44100 Hz won't work perfectly even for 3 bits/symbol.

Maybe I have "cheated" by choosing a "perfect" WAV sampling rate, but I suspect that a real-world modem does just that. Anyway, this technique of mine is not intended to be the best in town, it is just an exercise to try to go beyond "classical" QAM. Certainly there are ways to compensate for non-perfect alignment of carrier and sampling rate,

2010/01/25

New version of QAM modem, and ideas for the future

As I promised, I rewrote parts of Python QAM modem to use complex numbers, which leaves code much shorter and clearer (provided that you understand complex numbers). Also, I put the common code in a separate file, to avoid redundancy in TX and RX parts. The FIR lowpass filter was improved too.

The files are m2co.py, m2tx.py and m2rx.py. All of them can be found at ttp://epx.com.br/wav folder.

Other ideas

I think I went far enough with this. I was planning to write a "version 3", but I am not sure I will. Versions 1 and 2 already make good classroom examples. FWIW I will tell what's in my mind for v3, maybe someone else picks it up.

First, the "classical" demodulation approach showed its limits in version 2; baud rate can not go above 1000 or 1100 Hz for a carrier of 1800 Hz. The FIR filter has a trade-off between frequency precision and time-domain precision. Increasing baud rate demands increasing frequency precision, but then symbols begin to smear together after filtering. In order to reach V.29 or V.32 baud rate (which is almost the same as carrier frequency), some other detection approach must be employed.

I played with some formulas, remembered some trigonometry identities at Wikipedia, and came up with one idea. Since the derivative of QAM signal is well-defined by a simple formula, and there are two unknown variables that shape the QAM signal -- amplitude and phase -- it is possible to "invert" the formula and find amplitude and phase given two derivatives taken from time-domain signal.

The distance between measured derivatives could be e.g. 90 degrees to make formula "inversion" even easier. BTW derivative of time-domain QAM signal is just the difference between two consecutive samples. If one wants to make it prettier, could add a moving average to this direct measure.

This approach worked on spreadsheet; it would probably work in a Python implementation. It seems possible to determine symbol in half carrier wavelength at the most, normally a quarter wavelength is enough, which means that decoding is possible when baud rate >= carrier.

Another advantage of this technique, or any other that analyzes the QAM signal directly, is more resilience to small carrier deviations. They would appear just as a very small but persistent phase delay in every sample, which can be ignored or compensated for.

I have no idea if this is the technique that advanced modems actually use.

2010/01/22

QAM modem - RX part

Following the discussion of a "QAM modem" written in Python, let's take a look into the reception (RX) code.

Decoding the QAM signal (reliably, at least) is much more complicated than generating it, and RX has much more code. Because of that, I will explain each section separately.

First, some boilerplate code:


#!/usr/bin/env python

import wave, struct, math, numpy, random

SAMPLE_RATE = 44100.0 # Hz
CARRIER = 1800.0 # Hz
BAUD_RATE = 360.0 # Hz
PHASES = 8
AMPLITUDES = 4
BITS_PER_SYMBOL = 5

# FIR lowpass filter. This time we calculated the coefficients
# directly, without resorting to FFT.
lowpass = []
CUTOFF = BAUD_RATE * 4
w = int(SAMPLE_RATE / CUTOFF)
R = (SAMPLE_RATE / 2.0 / CUTOFF)
for x in range(-w, w):
if x == 0:
y = 1
else:
y = math.sin(x * math.pi / R) / (x * math.pi / R)
lowpass.append(y / R)

# Modem constellation (the same as TX side)
constellation = {}
invconstellation = {}
for p in range(0, PHASES):
invconstellation[PHASES - p - 1] = {}
for a in range(0, AMPLITUDES):
s = p * AMPLITUDES + a
constellation[s] = (PHASES - p - 1, a + 1)
invconstellation[PHASES - p - 1][a + 1] = s

qam = wave.open("qam.wav", "r")
n = qam.getnframes()
qam = struct.unpack('%dh' % n, qam.readframes(n))
qam = [ sample / 32768.0 for sample in qam ]


Part of the code is borrowed from TX: modulation parameters, constellation generator etc. Those things would fit better in a separate module, but I was lazy to do it :) Constellation loop generates a reverse-map dictionary, since we are doing the inverse of TX (given amplitude and phase, gimme the bits).

A new thing is the FIR filter generator. This time I did not use FFT to generate the coefficients; a closed-form expression was used instead. I am aware that this FIR filter is not the best possible (it ends abruptly at both sides), but served well enough for this toy.


# Demodulate in "real" and "imaginary" parts. The "real" part
# is the correlation with carrier. The "imaginary" part is the
# correlation with carrier offset 90 degrees. Having both values
# allows us to find QAM signal's phase.

real = []
imag = []
amplitude = []

# This proofs that we don't need to replicate the original carrier
# (at least, not in the same phase). Sometimes this random carrier
# phase makes the receiver to interpret training sequence as a
# character, but nothing serious.

t = int(random.random() * CARRIER)
# t = -1

# In the other hand, frequency must match carrier exactly; if we
# need to allow for frequency deviations, we must build a digital
# equivalent to PLL / VCO oscilator.

for sample in qam:
t += 1
angle = CARRIER * t * 2 * math.pi / SAMPLE_RATE
real.append(sample * math.cos(angle))
imag.append(sample * -math.sin(angle))
amplitude.append(abs(sample))

del qam


This was the "analog" part of the decoder. It is the most important, and the easiest to do, because trigonometry does so much for us.

Since we did a kind of AM modulation at TX, multiplying a digital signal by a sinusoid, we do the same at RX: multiply again by the sinusoid to get the original data. We don't know exactly the phase of QAM signal, and we don't need to. We just multiply by two local carriers, which are identical except by being 90 degress offset -- math.cos() and math.sin().

Note that I set "t" with a random initial number. This is a way to guarantee that local carrier will NOT be in phase with TX, as it happens in real world. The rest of RX will have to work harder because of this, naturally.

Incredible as it may seem, output of this demodulation is the X,Y position of symbol in modem constellation. The computed positions will be rotated, because local carrier is not in phase with original TX carrier, but since we have integrated the phase changes at TX, we just need to look for phase DIFFERENCES.

Finally, we "demodulated" amplitude in a separate channel, using abs(), just out of laziness, since we could obtain amplitude from real and imaginary parts.

It looks easy because we don't need to cope with frequency distortions. In practice, the local carrier would have to "tune" to the input signal, because AM demodulation depends on local carrier being EXACTLY the same frequency as modulated signal. And all transmission media (except our WAV file, of course) distort frequency in some degree.


# Pass all signals through lowpass filter
real = numpy.convolve(real, lowpass)
imag = numpy.convolve(imag, lowpass)
amplitude = numpy.convolve(amplitude, lowpass)

# After lowpass filter, all three lists show something like
# a square wave, pretty much like the original bitstream.
# If you want to see the result of early detection phase,
# uncomment this:

# f = wave.open("teste.wav", "w")
# f.setnchannels(1)
# f.setsampwidth(2)
# f.setframerate(44100)
# bla = [ int(sample * 32767) for sample in imag ]
# f.writeframes(struct.pack('%dh' % len(bla), *bla))


Actually, AM demodulation leaves a high-frequency residue in original data, so we pass our FIR filter in all signals (real, imaginary and amplitude). After this filtration, the signals will show themselves as quasi-square waves, like this:



We have three square waves like this: real, imag, and amplitude. Now we have the mission of extracting the symbols out of them. First, we need to detect phase of symbols:


# Detect phase based on real and imaginary values

phase = []
wrap = 2.0 * math.pi * (PHASES - 0.5) / PHASES
for t in range(0, len(real)):
# This converts (real, imag) to an angle
angle = math.atan2(imag[t], real[t])
if angle < 0:
angle += 2 * math.pi
# Move angle near to 2 pi to 0
if angle > wrap:
angle = 0.0
phase.append(angle)


Since our constellation does not use complex numbers, the X,Y coordinates found by demodulator are not directly useful to search for the symbol, so we need to compute actual angles, which we do using atan2().


# Find maximum amplitude based on training whistle (1s), and
# measure how much our local carrier is out-of-phase in
# relationship to QAM signal

max_amplitude = 0.0
carrier_real = 0.0
carrier_imag = 0.0
for t in range(int(0.1 * SAMPLE_RATE), int(0.9 * SAMPLE_RATE)):
max_amplitude += amplitude[t]
carrier_real += real[t]
carrier_imag += imag[t]

max_amplitude /= int(0.8 * SAMPLE_RATE)
carrier_real /= int(0.8 * SAMPLE_RATE)
carrier_imag /= int(0.8 * SAMPLE_RATE)

skew = math.atan2(carrier_imag, carrier_real)

print "Carrier phase difference: %d degrees" % (skew * 180 / math.pi)

del imag
del real


Here we analyze the "training" whistle sent by TX, in order to determine the amplitude range, as well as the phase difference between our carrier and TX carrier. We do this in a very lazy way here, and it would not be ok for a real-world modem.

Any real-world medium, be it wireless or wire or optics, does distort amplitude, phase and frequency, so it wouldn't work to estimate both at the beginning of session and leave it that way. Real-world modems use a PLL to generate local carrier, which adjusts itself all the time accordingly to input conditions.

Too good our WAV file is a perfect medium :)


# Normalize/quantify amplitude and phase to constellation steps
qsymbols = []
for t in range(0, len(phase)):
p = phase[t]
a = amplitude[t]
a /= max_amplitude
a = int(a * AMPLITUDES + 0.49)

# Compensate for local carrier phase difference
# This ensures the best phase quantization (avoiding that
# quantization edge is too near a constellation point)

p += 2 * math.pi - skew
p /= 2 * math.pi
p = int(p * PHASES + 0.49) % PHASES

qsymbols.append((p, a))

del phase
del amplitude


Here we take the "analog" phases and amplitudes, and quantize them into constellation. At this point we "unskew" the angles, so the quantified values map directly to our constellation.

The problem now is: we have thousands of samples which contain just a handful of symbols. We need to detect how many symbols are there -- and where.


# Try to detect edges between symbols

edge = []
settle_time = int(SAMPLE_RATE / BAUD_RATE / 8)
settle_timer = -1
in_symbol = 0
first_symbol = 0
last_a = last_p = 0
for t in range(0, len(qsymbols)):
s = 0
p, a = qsymbols[t]
if in_symbol > 0:
# we presume we are flying over a valid symbol
in_symbol -= 1
elif a != last_a or p != last_p:
# change detected, look for stabilization in future
settle_timer = settle_time
elif settle_timer > 0:
# one sample closer a good symbol
settle_timer -= 1
elif settle_timer == 0:
# signal settled for (settle_time) samples;
# take the current signal as a valid symbol
settle_timer = -1
in_symbol = int(0.85 * SAMPLE_RATE / BAUD_RATE)
if t > 0.5 * SAMPLE_RATE:
# make sure we are not at the beginning
s = 1
if not first_symbol:
first_symbol = t
edge.append(s)
last_p = p
last_a = a


At this section, we count on the fact that most symbols show a distinctive "edge" between them. We annotate the points where this edge is present, for further reference.

Note that this step did not find all symbols, because the message may have repeated symbols with phase 0, which don't s how any discontinuity. This is a direct result of poor constellation choice; it would have been better to avoid phase-0 points, so we would find all symbols by looking for edges.


# Find symbols based on edges and known baud rate

symbols = []
expected_next_symbol = SAMPLE_RATE / BAUD_RATE
lost = 0

for t in range(first_symbol - int(SAMPLE_RATE * 0.1), len(qsymbols)):
p, a = qsymbols[t]
s = edge[t]

if s:
# amplitude/phase edge, strong hint for symbol
lost = 0
symbols.append([p, a])
else:
lost += 1

if lost > expected_next_symbol * 1.5:
# no edge yet, take this as a non-transition signal
lost -= expected_next_symbol
symbols.append([p, a])

del qsymbols
del edge


Here we finally detect the handful of symbols we are interested in, and we can throw out the enormous time-domain lists once and for all.

The strategy of symbol detector is simple. Every detected edge is a new symbol for sure. But, if no edge was found after some time (baud rate + 50% tolerance), we blindly assume that we have a repeated symbol there.

Even if TX and RX baud rates were a bit different, this strategy would be robust, because most symbols do have an edge between them, so the algorithm will not have to work blind for more than 2 or 3 symbols at a time.


print "%d symbols found" % len(symbols)

# Differentiate phase (because we had integrated it at transmitter)

last_phase = 0
for t in range(0, len(symbols)):
p = symbols[t][0]
symbols[t][0] = (p - last_phase + PHASES) % PHASES
last_phase = p


In TX we had integrated (summed up) the phase offsets, so now we have to do the opposite operation.


# Translate constellation points into bit sequences

bitgroups = []
for t in range(0, len(symbols)):
p, a = symbols[t]
try:
bitgroups.append(invconstellation[p][a])
except KeyError:
print "Bad symbol:", p, a, ", ignored"

# Translate bit groups into individual bits

bits = []
for bitgroup in bitgroups:
for bit in range(0, BITS_PER_SYMBOL):
bits.append((bitgroup >> (BITS_PER_SYMBOL - bit - 1)) % 2)


Symbol characteristics are traded for bit sequences, and finally we can grab the actual bits we were looking for. The modem demodulation part is done, but now we have to recover bytes from bits, like a RS-232 interface would do.


# Find bytes based on start and stop bits. This will make you feel
# like a poor RS-232 port :)

t = 0
state = 0
byte = []
bytes = []
while t < len(bits):
bit = bits[t]
if state == 0:
# did not find start bit
if not bit:
# just found start bit
state = 1
elif state == 1:
# found start bit, accumulating a byte
if len(byte) < 8:
byte.append(bit)
else:
# byte complete, test stop bit
if bit:
# stop bit is there
bytes.append(byte)
else:
# oops, stop bit not there, we were cheated!
# backtrack to the point fake start bit was 'found'
t -= 8
byte = []
state = 0
t += 1

# Make a string from the bytes

msg = ""
print "Bytes:",
for byte in bytes:
value = 0
for bit in range(0, 8):
value += byte[bit] << (8 - bit - 1)
msg += chr(value)
print value,

print
print msg


The serial-to-bytes loop looks for start bits (zeros) and stop bits (ones). Once it finds a bit sequence that satisfies these conditions, it cuts a byte. If either the start or stop bit is missing, it keeps reading bits until it gets synchronized again.

Amazingly, this thing works. The improvements I can imagine for this Python modem are:

1) Take a closer look at the FIR filter, choosing the best cutoff frequency and avoid abrupt ends;
2) Choose a better constellation, without 0-phase points, and use complex numbers;
3) Generate RX local carrier using a PLL (phase-locked loop), so it can "follow" frequency and phase distortions.
4) Measure amplitude range continually using AGC (automatic gain control)
5) Add a bit randomizer