Comb filters

I’m currently making some sensitive analog electronics. It’s going to be used in relatively uncontrolled environments and so bits of mains hum sometimes creeps in. The mains is only approximately sinusoidal and so as well as the fundamental at 50Hz, there’s also significant energy in the harmonics at 100Hz, 150Hz and etc.

Digital filters provide a nice way to remove this with a comb filter, so called since it has a comb of equally spaced notches in frequency, so one filter will remove all the harmonics. The basic filter looks like this:

y[t] = x[y] - x[t-k]

Intuitively if you have a frequency component which is exactly k samples long, then the first and second part of the equation will be perfectly in phase since they’re one complete cycle apart and so by subtracting them, that component is removed. The same reasoning applies if there are two, three or more complete cycles in the k steps, so all those components get removed too.

That also applies if there are exactly 0 cycles too, so this filter removes DC.

If you don’t want to remove DC, you can replace the – with a +. The same logic applies, but to integer + half number of cycles instead of full cycles, so that they’re perfectly out of phase.

The derivation of the frequency response is fairly straightforward. Taking Z transforms gives.

H(z) = \frac{Y(z)}{X(z)} = 1 + z^{-k}

Substituting z = e^{j\Omega} and computing the magnitude |H(z)| gives:

\begin{matrix} |H(z)| & = & \sqrt{(1 + e^{j\Omega k})(1 + e^{-j\Omega k})} \\ & = & \sqrt{2 - 2 \cos \Omega k}\hfill \end{matrix}

This gives nulls at a normalised frequency of \Omega k = 0, 2\pi, 4\pi, .... The actual frequency is given by \Omega = 2 \pi f T where T is the sample period, and that gives a fundamental frequency of \frac{1}{kT}. Just to keep it concrete, I’m going to use a sample frequency of 900Hz, and a lag of k=18 giving a filter at 50Hz. Change it to k=15 for a 60Hz filter.

The result is here:


Note, just for fun and verification, as well as the theoretical filter response, I’ve plotted the power spectrum of the time domain filter applied to a signal. To make it look nice, the signal needs to be somewhat spectrally flat, so I made a flat spectrum (with a little added noise in the magnitude) and uniform random phase and took the inverse FFT to get the signal.

And back to the filter. While it works, it’s not really very good. The notch is narrow and it affects almost all of the signal, whether near to the notch or not. The problem is that it’s a first order filter. Want to bet that chucking in some higher order terms makes a better filter? How about:

y[t] = x[y] - \frac{1}{2}x[t-k] - \frac{1}{2}x[t-2k]

And the results are:


Well that’s promising :). We’ve got sharper notches and a flatter response in the passband (juuust about). So, what’s really going on? If we take a filter like:

H(z) = a z + bz^{-1} + cz^{-2} + dz^{-3}

and crunch through the magnitude of the response, we get:

|H| = \sqrt{\alpha + \beta \cos \Omega + \gamma \cos 2\Omega + \delta \cos 3\Omega}
\begin{matrix} \alpha & = a^2 &+ & b^2 &+& c^2 &+ &d^2 \\ \beta & = ab &+ & bc &+ & cd \\ \gamma & = ac &+ & bd \\ \delta & = ad \end{matrix}

(the general case follows the obvious pattern above). The result is the square root of a Fourier series (in frequency space). In order to get a filter to do just what we want, all we have to do is figure out what we want, square it, compute it’s Fourier series and, er… figure out how to invert the system of polynomial equations above… do that and by the end we ought to have a custom filter.

Trouble is, that’s rather hard :(.

Fortunately this is yet another case where Downhill Simplex saves a huge amount of work.

Aside: I love this algorithm: it has saved me a huge amount of work over the years! It’s a derivative free, unconstrained optimizer. It won’t win any prizes for speed, it’s sensitive to parameter scales and it can quite easily get stuck off a minimum, but in a lot of cases, you can just chuck in a function and let it rip, without messing around with derivatives. It’s fminunc in Octave (or MATLAB), and DownhillSimplex in TooN.

In order to turn it into an optimization problem, we first design the filter in frequency space, then make an error function where the error is the deviation of some sort between the response with a given set of weights and the designed filter. The optimizer then adjusts the weight to minimize the error.

The obvious formulation is simply least squares. However, we can get more flexible very easily. We can rate some areas as more important than others if we like (for example if performance within the notch is more important than in the passband), and we can use the absolute deviation to some power. As the power gets large, the algorithm will tend to minimize the worst case deviation. Note however that Downhill Simplex starts to peter out in its ability to optimize once powers get too large (around 8 or so). The code is pretty straightforward (only tested in Octave):

%The filter is designed by providing a target function on [0, pi).
%Since the filter has only cosines, this is reflected around 0, then
%repeated ad-nauseum

%Let the target function start at zero, then step up to 1, but with the
%step being a fragment of some function to make things smoother. Sharper transistions
%lead to larger Gibbs overshoots
%Note notches with a slow start (e.g. raised cosine, start > 0) require high order terms
%as features the width of the start region are required, and that's narrow. Ones that ramp
%linearly are very easy as it's very easy to force the functio to be 0 at 0.
width = 0.3;
start = 0.1;
target = ones(size(omega));
ind = find( (omega < width + start) & (omega > start));
target(ind) = .5 - .5 * cos((omega(ind)-start) / width * pi);    %Raised cosine section
%target(ind) = (omega(ind)-start) / width;                         %Straight line

% Set the relative importance for different regions. This allows you to trade
% accurace in one region off against another. In this case heavily weight the zero
% part of the the function (i.e. the notch)
importance = ones(size(omega));

%Set the number of terms (this is 1 + the order)
n = 40;

% These are the absolute value of the transfer function for
% a filter with the Z transform of:
%       ___
%       \        -i k
% H(z) = >   w  z
%       /__   i
%        i
% with a lag of k.

func = @(x, weights)  (abs(exp(j*x'*[0:length(weights)-1]) * weights'))';
phase = @(x, weights)  (arg(exp(j*x'*[0:length(weights)-1]) * weights'))';

%Starting parameters.
%Note, I'm interested in designing filters which have the coefficient of
%z positive and all powers have negative coefficients. This is required for
%filters which have f(0) = 0, and importantly, doubles the frequency of a given
%lag, meaning you double the lag size for the same frequency. This gives more
%flexibility with a low sample rate at the penalty of filtering out DC.
%So, choose some starting weights that (emperically) do very roughkly the
%right thing, and scale them so the initial function is about the same scale
%as the target function.
weights = [1 -1./ 2.^[1:n-1] ];   %Decay of weights
weights(2:end) = -weights(2:end) / sum(weights(2:end)); %Notch at zero
weights = weights / max(func(omega, weights)') * max(target); % Scale

%Choose an error function which minimizes the sum of some function of the
%absolute deviation, but weighted by the relative importance.
err = @(w) sum((abs((func(omega, w) - target)) .^6).*importance );

%Do the optimization.
wnew = fminunc(err, weights);

We can now get fancy and design custom high order filters. For example we can widen the notch so it has a notch region, but still has steep sides, and require that the filter has good rejection within the notch region and try to reduce the overall worst case deviation. This would help if the inteference is amplitude modulated (i.e. gets better and worse as the person moves), which would cause spreading of the frequencies.

For example here’s a 39th order, 50 Hz comb filter with a widened notch, just because we can. The first graph shows how well the optimized filter matches the design I’ve given it:


And here’s what it looks like over the spectrum:


Note the 50dB rejection in the entire notch region, the steep transition and the flat passband.

Just for completeness, the weights are (starting at a lag of 0):


They’re still quite large, showing that 39th order isn’t quite enough for the shape we’ve specified, depending on your level of obsessiveness over flatness and available computing power, but the graphs show it’s not bad.

Of course just because we’re filtering in the digital domain and can have filters with vast orders doesn’t mean that’s not overkill. And being essentially a Fourier domain process, all the usual caveats apply, such as sharp edges giving Gibbs overshoots and (with few exceptions—such as sharp notches), small features require very high order filters.

Nonetheless it works, and I it’s useful tool!

As usual: