louder, Louder, LOUDER! (or: more dead bugging)

Sometimes someone makes a chip to do just what you want.

I’ve recently been needing to generate beeps from a BLE113 module (it’s a CC2541) which runs off a CR2032 coin cell at a nominal 3V, but more like 2 to 2.5V in practice. The speaker of choice is a surface mount piezo sounder; these are small (9mm square) and, unlike the discs, don’t require mounting on a sounding board to get sound out. I’ve no idea if those Murata ones are the best, but it’s a respectable brand and those are the first I found that seemed to meet the spec.

They’re not especially loud, only 65dB at 1.5V pk-pk. The microcontroller I’m using has 4 useful channels on timer 1 for this application, and of course the outputs are totem pole outputs. So, driving it with two PWM channels in opposition is driving it with an H bridge which gives the full 2-3V pk-pk swing (depending on the battery voltage).

That makes it a little louder, but not an awful lot. The datasheet says that the sounders can be driven at up to 12V pk-pk without damage. However, it merely notes that it is “probable” that increasing the voltage will increase the volume, which is a bit unhelpful, though it has a graph for one (not the one I want) showing an increase with voltage exactly as you’d expect.

The question then is how to generate a higher voltage for the buzzer. I had lots of ideas:

    Boost / switched capacitor converter and another H bridge (impractical: too many components)
    A miniature transformer (none quite small enough or with the right turns ratio)
    A miniature autotransformer (closer, but still the same problem)
    Something cunning with an inductor: some sort of ad-hoc boost thing which generates spikes rather than a square wave. Idea not really fully formed.

None of them are really any good. They either require an impractically large number of components, need components that don’t exist (or that I can’t find), or are vague and ill formed. I don’t have the parts to test that last idea, and anyway I’d probably end up busting up the chip with voltage spikes.

Fortunately it appears that someone thought of this already. It turns out the PAM8904 already does exactly this. It’s a switched capacitor converter with an H bridge, that takes a digital signal in, precisely for the application of driving piezo sounders from low power microcontrollers. Which is nice.

Except I’m not very trusting, and I’ve no idea if it’s worth the effort. I don’t want to order a circuit board and then fiddle around hand soldering QFNs (I’ve seen it done, I’d rather use a stencil) for a one off test. Like so many chips, it’s QFN only now. So the obvious thing to do is to buy one and deadbug it.

I figured I’d try the nice fine hookup wire I’ve got. The colours make it a bit easier to follow which wire is which. Next time, I’d try the same soldering job with enamelled wire. It’s harder to strip and tin, but the insulation doesn’t get in the way. The key to getting the soldering to work in the end was to tape down the wires with masking tape (3M blue tape) as I went along. Even with that it’s two steps forward, one back as you accidentally desolder wires when trying to attach new ones. Here it is!

IMG_20160728_180707IMG_20160728_185724.

(OK, not as good as this, or this, or this—hey that socket is a really nice idea!)

Spot the schoolboy error? I remembered to check continuity between neighbouring pins, but I forgot to pot it or otherwise protect the wires and so some of them fell off when I tried to change the boost voltage selection. And then another 4 wires fell off when I was taking it out. The connection area is tiny and the solder work is frankly not that good, so the joints are amazingly fragile. Potting is what I should have done first time, doubly so because the bits of stiff wire for the breadboard really get in the way.

IMG_20160729_134402IMG_20160729_135338

Well, it seems to operate correctly, but I think I’d do it differently next time. A chip socket or veroboard with .1″ header soldered in is a much better choice than flying wires. Potting makes it robust, but you have to pot it before you know it works.

It’s always a bit hard to tell volume because ears have a logarithmic response and at 4kHz the sound is quite directional. Nonetheless it’s noticeably louder. Yay🙂
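To put a rough number on "louder", here is a minimal sketch of the arithmetic. It assumes sound pressure scales linearly with drive voltage, which the datasheet only calls "probable", so treat the results as estimates. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Change in sound pressure level (dB) when the drive voltage goes from
// v_ref to v_new. ASSUMPTION: pressure is proportional to voltage.
double spl_gain_db(double v_ref, double v_new)
{
    assert(v_ref > 0 && v_new > 0);
    return 20.0 * std::log10(v_new / v_ref);
}

// Estimated level at v_new, given the rated level at v_ref.
double estimated_spl_db(double rated_db, double v_ref, double v_new)
{
    return rated_db + spl_gain_db(v_ref, v_new);
}
```

Under that assumption, doubling the voltage is worth about 6dB, and going from the rated 1.5V pk-pk to the 12V pk-pk maximum would be worth about 18dB, i.e. `estimated_spl_db(65, 1.5, 12)` comes out at roughly 83dB.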

 

 

Good job, TI, I half mean that

I was a little surprised today when trying to debug a board when one of the output voltages was 4.8 V. The main reason for the surprise is that the power supply is specced at 3.3V. And I didn’t make the supply, Texas Instruments did. In fact it’s one of these.

IMG_20160711_182949

Checking the manual reveals that TI do indeed claim that it’s supposed to output 3.3V. Time to find the regulator! There’s no continuity between USB power and the output so it’s not a short. Poking around on likely looking chips quickly reveals that the regulator is OUCH THAT SUCKER IS BOILING HOT! this one:

IMG_20160711_180224.jpg

And has the markings in very small “PHUI”. I got as far as googling “PHUI v” before it autocompleted to “PHUI voltage regulator”, so I guessed I was on the right track🙂. Not much further down the track the TI TPS730 datasheet crops up showing it’s a TPS73033, which is a 3.3V LDO regulator rated to 200mA. And did I mention it’s baking? It seems to be fried in a rather unfortunate mode (and just for good measure, the NR pin which ought to be at 1.22V is at 0.14V). Also, it turns out that the entire circuit diagram was in the manual, so there was no need for that bit of minor sleuthing.

So why good job TI? Well, the other, much more important chips, such as the micro controller I’m programming are rated to 3.9V absolute maximum and it didn’t die with 4.8V across it. I’m pretty pleased about that, because I can imagine going round a cycle frying many chips before finding out the power supply was defective. :shudder:😦.

Well anyway, it’s fried and I can’t use it. I mean technically I’ve successfully programmed the chip and not fried anything as far as I know, but there’s another as yet untested chip on the board rated to only 4.8V absolute maximum. I can’t trust it, so I can’t use it and so as far as I care it’s broken.

And that means I’m free! This guy has a great philosophy which is that if something is broken then there’s no risk of breaking it so you may as well try to fix it. Fortunately, I have some old boards which I’m not currently using which have ST Microelectronics L78L33 SO-8 voltage regulators on them. They’re not LDO so getting 3.3V from 5V is a bit dubious and actually is not quite within spec, but whatever, both my chip and the one in the programmer (a CC2511) work all the way down to 2V, so I reckon that it won’t matter. Also, it’s only rated for 100mA, not 200mA like the original, but both the chips are low power wireless ones so I doubt the current will go too high even when it’s programming. And besides that won’t be for long.

Time to dead bug it! And pot it in hot melt!

IMG_20160711_182349IMG_20160711_182651

And it works!😎

The LED, perhaps to the surprise of no one, is dimmer than before and the output voltage is 3.2V, which is in fact well within spec for the 78L33.

 

The best of both worlds

I currently run Ubuntu LTS (14.04 at the moment) on my work laptop and Arch on my home laptop. I like the stability of Ubuntu, and very much like that I can run security updates without breaking anything. However, the old versions of programs are annoying in some cases. I love running Arch because I always have the latest versions. The old versions bit me today since the version of gerbv I was running on Ubuntu was too old to support the high resolution drill placements in Eagle 7 and so my gerbers looked bad.

Give me Archbuntu!

First, get arch-bootstrap.sh and run it in (say) /other_OS to make /other_OS/ARCH_x64. You now have a bootstrapped arch installation. You can chroot into it and it looks arch-like but won’t work properly.

The main reason is that none of the special files are there. Essentially you want everything to mirror your main OS, except for the OS itself. This can be done with bind mounts. Add the following to /etc/fstab and run mount -a

/proc          /other_OS/ARCH_x64/proc          none defaults,bind 0 0
/sys           /other_OS/ARCH_x64/sys           none defaults,bind 0 0
/dev           /other_OS/ARCH_x64/dev           none defaults,bind 0 0
/tmp           /other_OS/ARCH_x64/tmp           none defaults,bind 0 0
/home          /other_OS/ARCH_x64/home          none defaults,bind 0 0
/etc/passwd    /other_OS/ARCH_x64/etc/passwd    none defaults,bind 0 0
/etc/shadow    /other_OS/ARCH_x64/etc/shadow    none defaults,bind 0 0
/etc/group     /other_OS/ARCH_x64/etc/group     none defaults,bind 0 0
/etc/sudoers   /other_OS/ARCH_x64/etc/sudoers   none defaults,bind 0 0
/etc/sudoers.d /other_OS/ARCH_x64/etc/sudoers.d none defaults,bind 0 0

Now you can chroot into /other_OS/ARCH_x64 and run pacman. Note that /tmp is shared partly to avoid duplication, but also because that’s where the X11 socket is. Without it, you’ll need to use xhost to allow chroot’d programs to connect. And the password related files are shared too, so sudo still works.

While you can sudo, chroot, then su - to yourself then run programs, that’s a little inconvenient. chroot is root only for security reasons, but since the chroot image is legitimate, we can chroot into that safely as anyone. The following code essentially chroots into /other_OS/ARCH_x64, drops root and then executes its arguments.

#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>

int main(int argc, char** argv)
{
	//Nothing to run
	if(argc == 1)
		return 0;

	//chroot() leaves the CWD alone, so relative paths keep working
	if(chroot("/other_OS/ARCH_x64/") == -1)
	{
		fprintf(stderr, "Error chrooting: %s\n", strerror(errno));
		return 1;
	}

	//Drop the group first: once the user ID has been dropped we may
	//no longer have permission to change the group.
	if(setgid(getgid()) == -1)
	{
		fprintf(stderr, "Error dropping group: %s\n", strerror(errno));
		return 1;
	}

	if(setuid(getuid()) == -1)
	{
		fprintf(stderr, "Error dropping root: %s\n", strerror(errno));
		return 1;
	}

	execvp(argv[1], argv+1);

	//execvp only returns on failure
	fprintf(stderr, "Error: %s\n", strerror(errno));
	return 1;
}

Compiling this as ae (think arch exec), installing it setuid root (otherwise the chroot call fails for ordinary users) and putting it in the path allows you to do “ae gerbv” instead of “gerbv” to get the Arch version, for example. Note that unlike the commandline version, the chroot function doesn’t alter the CWD, so it all works just fine.

Now I have the best of both worlds!

Compile time lookup tables

There are many good reasons to have lookup tables computed at compile time. In C++14 this is now very easy with generalized constexpr. It took me a while thinking about it before I realised that it really was that easy! Here’s an example of one:

#include <iostream>
#include <cmath>
#include <cstddef>

using std::size_t;

//Trivial reimplementation of std::array with the operator[]
//methods set to constexpr
template<class T, int N>
struct array
{
	T elems[N];

	constexpr T& operator[](size_t i)
	{
		return elems[i];
	}

	constexpr const T& operator[](size_t i) const
	{
		return elems[i];
	}
};

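//Function to build the lookup table
//Note: exp() is not constexpr in standard C++14; this relies on
//GCC evaluating the builtin at compile time.
constexpr array<float, 10> foo()
{
	array<float, 10> a = {};

	for(int i=0; i < 10; i++)
		a[i] = exp(i);

	return a;
}

//foo is not a template, so it is called without template arguments
constexpr static array<float, 10> lut = foo();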

//This simply serves as a class to be instantiated.
template<int I> struct Silly
{
	Silly()
	{
		std::cout << "Hello " << I << "\n";
	}
};

int main()
{
	//Prove they're compile time constants
	Silly<(int)lut[0]> s0;
	Silly<(int)lut[1]> s1;
	Silly<(int)lut[2]> s2;
	Silly<(int)lut[3]> s3;
	Silly<(int)lut[4]> s4;
	Silly<(int)lut[5]> s5;
	Silly<(int)lut[6]> s6;
	Silly<(int)lut[7]> s7;
	Silly<(int)lut[8]> s8;
	Silly<(int)lut[9]> s9;
}

The only minor wrinkle is that std::array does not have its members set to constexpr to a sufficient degree, so it doesn’t work in this context. However, since few features are needed, a minimal reimplementation is pretty straightforward.
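For compilers that won't evaluate exp() at compile time, one workaround is to write the maths as a constexpr function too. The following is a minimal sketch rather than a tuned implementation: `Table`, `exp_series`, `make_exp_table` and `lookup_exp` are names invented for this example, and the 60-term Taylor series is simply an assumption that is comfortably enough for arguments up to 9.

```cpp
#include <cassert>
#include <cstddef>

// Minimal constexpr-friendly array, same idea as the one above
template<class T, std::size_t N>
struct Table
{
    T elems[N];
    constexpr T& operator[](std::size_t i) { return elems[i]; }
    constexpr const T& operator[](std::size_t i) const { return elems[i]; }
};

// e^x by its Taylor series; plain arithmetic, so valid in a constant expression
constexpr double exp_series(double x)
{
    double term = 1.0, sum = 1.0;
    for(int n = 1; n < 60; n++)
    {
        term *= x / n;
        sum += term;
    }
    return sum;
}

// Build the table entirely at compile time
constexpr Table<double, 10> make_exp_table()
{
    Table<double, 10> t = {};
    for(std::size_t i = 0; i < 10; i++)
        t[i] = exp_series(static_cast<double>(i));
    return t;
}

constexpr Table<double, 10> exp_lut = make_exp_table();

// These only compile if the table really is a compile time constant
static_assert(exp_lut[0] == 1.0, "e^0 should be exactly 1");
static_assert(exp_lut[1] > 2.71828 && exp_lut[1] < 2.71829, "e^1 out of range");

// Bounds-checked runtime access
double lookup_exp(std::size_t i)
{
    assert(i < 10);
    return exp_lut[i];
}
```

The static_asserts play the same role as the Silly template: they can only be checked if the values exist at compile time.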

Comb filters

I’m currently making some sensitive analog electronics. It’s going to be used in relatively uncontrolled environments and so bits of mains hum sometimes creep in. The mains is only approximately sinusoidal and so as well as the fundamental at 50Hz, there’s also significant energy in the harmonics at 100Hz, 150Hz and so on.

Digital filters provide a nice way to remove this with a comb filter, so called since it has a comb of equally spaced notches in frequency, so one filter will remove all the harmonics. The basic filter looks like this:

y[t] = x[t] - x[t-k]

Intuitively if you have a frequency component which is exactly k samples long, then the first and second part of the equation will be perfectly in phase since they’re one complete cycle apart and so by subtracting them, that component is removed. The same reasoning applies if there are two, three or more complete cycles in the k steps, so all those components get removed too.

That also applies if there are exactly 0 cycles too, so this filter removes DC.

If you don’t want to remove DC, you can replace the – with a +. The same logic applies, but to integer + half number of cycles instead of full cycles, so that they’re perfectly out of phase.
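In code, both variants are a few lines. This is a minimal sketch: `comb_filter` and `make_sine` are names made up for the example, and treating samples before the start of the signal as zero is my assumption, so the first k outputs are just start-up transient.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// y[t] = x[t] + sign * x[t-k]
// sign = -1 is the comb that also removes DC, sign = +1 is the variant that keeps it.
// ASSUMPTION: samples before t = 0 are zero.
std::vector<double> comb_filter(const std::vector<double>& x, std::size_t k, double sign = -1.0)
{
    assert(k > 0);
    std::vector<double> y(x.size());
    for(std::size_t t = 0; t < x.size(); t++)
    {
        double delayed = (t >= k) ? x[t - k] : 0.0;
        y[t] = x[t] + sign * delayed;
    }
    return y;
}

// Test signal: n samples of a unit sine at freq_hz, sampled at fs_hz
std::vector<double> make_sine(double freq_hz, double fs_hz, std::size_t n)
{
    const double pi = std::acos(-1.0);
    std::vector<double> x(n);
    for(std::size_t t = 0; t < n; t++)
        x[t] = std::sin(2.0 * pi * freq_hz * static_cast<double>(t) / fs_hz);
    return x;
}
```

For example, with 900Hz sampling and k = 18 (the figures used later in this post), `comb_filter(make_sine(50, 900, 180), 18)` is zero to within rounding from sample 18 onwards, and the + variant does the same to a 25Hz sine, which fits exactly half a cycle into 18 samples.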

The derivation of the frequency response is fairly straightforward. Taking Z transforms of the subtracting filter gives:

H(z) = \frac{Y(z)}{X(z)} = 1 - z^{-k}

Substituting z = e^{j\Omega} and computing the magnitude |H(z)| gives:

\begin{matrix} |H(z)| & = & \sqrt{(1 - e^{j\Omega k})(1 - e^{-j\Omega k})} \\ & = & \sqrt{2 - 2 \cos \Omega k}\hfill \end{matrix}

This gives nulls at a normalised frequency of \Omega k = 0, 2\pi, 4\pi, .... The actual frequency is given by \Omega = 2 \pi f T where T is the sample period, and that gives a fundamental frequency of \frac{1}{kT}. Just to keep it concrete, I’m going to use a sample frequency of 900Hz, and a lag of k=18 giving a filter at 50Hz. Change it to k=15 for a 60Hz filter.

The result is here:

comb-1

Note, just for fun and verification, as well as the theoretical filter response, I’ve plotted the power spectrum of the time domain filter applied to a signal. To make it look nice, the signal needs to be somewhat spectrally flat, so I made a flat spectrum (with a little added noise in the magnitude) and uniform random phase and took the inverse FFT to get the signal.
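The theoretical curve is easy to evaluate directly. Here is a small sketch for the subtracting filter, with the closed form alongside a brute-force complex evaluation as a cross-check. The function names are invented for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>

// Closed form: |H| = sqrt(2 - 2 cos(Omega k)), with Omega = 2 pi f T = 2 pi f / fs
double comb_magnitude(double f_hz, double fs_hz, int k)
{
    assert(fs_hz > 0 && k > 0);
    const double pi = std::acos(-1.0);
    const double omega = 2.0 * pi * f_hz / fs_hz;
    const double m2 = 2.0 - 2.0 * std::cos(omega * k);
    return std::sqrt(m2 > 0.0 ? m2 : 0.0);   // guard against tiny negative rounding
}

// Cross-check: evaluate |1 - z^{-k}| at z = e^{j Omega} with complex arithmetic
double comb_magnitude_direct(double f_hz, double fs_hz, int k)
{
    const double pi = std::acos(-1.0);
    const double omega = 2.0 * pi * f_hz / fs_hz;
    return std::abs(1.0 - std::exp(std::complex<double>(0.0, -omega * k)));
}
```

With fs = 900Hz and k = 18 the response is zero at 0, 50, 100Hz and so on, and peaks at 2 halfway between the nulls (25Hz, 75Hz, ...).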

And back to the filter. While it works, it’s not really very good. The notch is narrow and it affects almost all of the signal, whether near to the notch or not. The problem is that it’s a first order filter. Want to bet that chucking in some higher order terms makes a better filter? How about:

y[t] = x[t] - \frac{1}{2}x[t-k] - \frac{1}{2}x[t-2k]

And the results are:

comb-2

Well that’s promising🙂. We’ve got sharper notches and a flatter response in the passband (juuust about). So, what’s really going on? If we take a filter like:

H(z) = a + bz^{-1} + cz^{-2} + dz^{-3}

and crunch through the magnitude of the response, we get:

|H| = \sqrt{\alpha + \beta \cos \Omega + \gamma \cos 2\Omega + \delta \cos 3\Omega}
where:
\begin{matrix} \alpha & = a^2 &+ & b^2 &+& c^2 &+ &d^2 \\ \beta & = 2(ab &+ & bc &+ & cd) \\ \gamma & = 2(ac &+ & bd) \\ \delta & = 2ad \end{matrix}

(the general case follows the obvious pattern above). The result is the square root of a Fourier series (in frequency space). In order to get a filter to do just what we want, all we have to do is figure out what we want, square it, compute its Fourier series and, er… figure out how to invert the system of polynomial equations above… do that and by the end we ought to have a custom filter.
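The forward direction, at least, is easy to check numerically. This sketch compares a direct evaluation of |H| with the cosine series, where each cosine coefficient is twice the sum of products of weights that many taps apart (each cross term appears once with each sign of the exponent). The function names are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// |H| by direct evaluation of H(e^{j Omega}) = sum_i w[i] e^{-j Omega i}
double magnitude_direct(const std::vector<double>& w, double omega)
{
    std::complex<double> h(0.0, 0.0);
    for(std::size_t i = 0; i < w.size(); i++)
        h += w[i] * std::exp(std::complex<double>(0.0, -omega * static_cast<double>(i)));
    return std::abs(h);
}

// |H| from the cosine series:
// |H|^2 = sum_i w_i^2 + 2 * sum_{m>=1} cos(m Omega) * sum_i w_i w_{i+m}
double magnitude_series(const std::vector<double>& w, double omega)
{
    assert(!w.empty());
    double h2 = 0.0;
    for(std::size_t m = 0; m < w.size(); m++)
    {
        double c = 0.0;
        for(std::size_t i = 0; i + m < w.size(); i++)
            c += w[i] * w[i + m];
        h2 += (m == 0 ? 1.0 : 2.0) * c * std::cos(static_cast<double>(m) * omega);
    }
    return std::sqrt(h2 > 0.0 ? h2 : 0.0);
}
```

The two agree to rounding error for any weights. Going the other way, from a wanted |H| back to the weights, means solving those polynomial equations.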

Trouble is, that’s rather hard😦.

Fortunately this is yet another case where Downhill Simplex saves a huge amount of work.

Aside: I love this algorithm: it has saved me a huge amount of work over the years! It’s a derivative free, unconstrained optimizer. It won’t win any prizes for speed, it’s sensitive to parameter scales and it can quite easily get stuck off a minimum, but in a lot of cases, you can just chuck in a function and let it rip, without messing around with derivatives. It’s fminsearch in Octave (or MATLAB), and DownhillSimplex in TooN. The code below calls the related fminunc, which is a quasi-Newton method rather than a simplex one, but it estimates gradients numerically if you don’t supply them so it is just as easy to use.

In order to turn it into an optimization problem, we first design the filter in frequency space, then make an error function where the error is the deviation of some sort between the response with a given set of weights and the designed filter. The optimizer then adjusts the weight to minimize the error.

The obvious formulation is simply least squares. However, we can get more flexible very easily. We can rate some areas as more important than others if we like (for example if performance within the notch is more important than in the passband), and we can use the absolute deviation to some power. As the power gets large, the algorithm will tend to minimize the worst case deviation. Note however that Downhill Simplex starts to peter out in its ability to optimize once powers get too large (around 8 or so). The code is pretty straightforward (only tested in Octave):


%The filter is designed by providing a target function on [0, pi).
%Since the filter has only cosines, this is reflected around 0, then
%repeated ad-nauseum
omega=[0:0.01:pi];

%Let the target function start at zero, then step up to 1, but with the
%step being a fragment of some function to make things smoother. Sharper transitions
%lead to larger Gibbs overshoots
%
%Note notches with a slow start (e.g. raised cosine, start > 0) require high order terms
%as features the width of the start region are required, and that's narrow. Ones that ramp
%linearly are very easy as it's very easy to force the function to be 0 at 0.
width = 0.3;
start = 0.1;
target = ones(size(omega));
ind = find( (omega < width + start) & (omega > start));
target(ind) = .5 - .5 * cos((omega(ind)-start) / width * pi);    %Raised cosine section
%target(ind) = (omega(ind)-start) / width;                         %Straight line
target(find(omega<=start))=0;

% Set the relative importance for different regions. This allows you to trade
% accuracy in one region off against another. In this case heavily weight the zero
% part of the function (i.e. the notch)
importance = ones(size(omega));
importance(find(omega<=start))=10000000;

%Set the number of terms (this is 1 + the order)
n = 40;

% These are the absolute value of the transfer function for
% a filter with the Z transform of:
%       ___
%       \        -i k
% H(z) = >   w  z
%       /__   i
%        i
%
% with a lag of k.

func = @(x, weights)  (abs(exp(j*x'*[0:length(weights)-1]) * weights'))';
phase = @(x, weights)  (arg(exp(j*x'*[0:length(weights)-1]) * weights'))';

%Starting parameters.
%
%Note, I'm interested in designing filters which have the coefficient of
%z^0 positive and all other powers have negative coefficients. This is required for
%filters which have f(0) = 0, and importantly, doubles the frequency of a given
%lag, meaning you double the lag size for the same frequency. This gives more
%flexibility with a low sample rate at the penalty of filtering out DC.
%
%So, choose some starting weights that (empirically) do very roughly the
%right thing, and scale them so the initial function is about the same scale
%as the target function.
weights = [1 -1./ 2.^[1:n-1] ];   %Decay of weights
weights(2:end) = -weights(2:end) / sum(weights(2:end)); %Notch at zero
weights = weights / max(func(omega, weights)') * max(target); % Scale

%Choose an error function which minimizes the sum of some function of the
%absolute deviation, but weighted by the relative importance.
err = @(w) sum((abs((func(omega, w) - target)) .^6).*importance );

%Do the optimization.
wnew = fminunc(err, weights);
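For anyone without Octave to hand, the error function translates fairly directly. This is a rough C++ sketch of the same idea, not a port of the script: `fir_magnitude` and `design_error` are hypothetical names, omega is the phase advance per tap, and the optimizer itself is left out.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// |H| for weights w, where tap i contributes w[i] * e^{-j omega i}
double fir_magnitude(const std::vector<double>& w, double omega)
{
    std::complex<double> h(0.0, 0.0);
    for(std::size_t i = 0; i < w.size(); i++)
        h += w[i] * std::exp(std::complex<double>(0.0, -omega * static_cast<double>(i)));
    return std::abs(h);
}

// Sum over the frequency grid of importance * |response - target|^power.
// power = 2 is weighted least squares; larger powers lean towards
// minimizing the worst case deviation.
double design_error(const std::vector<double>& w,
                    const std::vector<double>& omega,
                    const std::vector<double>& target,
                    const std::vector<double>& importance,
                    double power)
{
    assert(omega.size() == target.size() && omega.size() == importance.size());
    double err = 0.0;
    for(std::size_t i = 0; i < omega.size(); i++)
    {
        double deviation = std::fabs(fir_magnitude(w, omega[i]) - target[i]);
        err += importance[i] * std::pow(deviation, power);
    }
    return err;
}
```

Any general purpose minimizer can then be pointed at `design_error` as a function of `w`, with the grid, target and importance held fixed.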

We can now get fancy and design custom high order filters. For example we can widen the notch so it has a notch region, but still has steep sides, and require that the filter has good rejection within the notch region and try to reduce the overall worst case deviation. This would help if the interference is amplitude modulated (i.e. gets better and worse as the person moves), which would cause spreading of the frequencies.

For example here’s a 39th order, 50 Hz comb filter with a widened notch, just because we can. The first graph shows how well the optimized filter matches the design I’ve given it:

comb-3

And here’s what it looks like over the spectrum:

comb-4

Note the 50dB rejection in the entire notch region, the steep transition and the flat passband.

Just for completeness, the weights are (starting at a lag of 0):


  6.9255e-01
 -5.0611e-01
 -3.1259e-01
 -1.6371e-01
 -5.5549e-02
  1.7559e-02
  6.3697e-02
  8.7947e-02
  9.3523e-02
  8.6793e-02
  7.2796e-02
  5.3985e-02
  3.2915e-02
  1.2862e-02
 -4.8328e-03
 -1.9165e-02
 -2.9259e-02
 -3.4763e-02
 -3.6499e-02
 -3.4953e-02
 -3.0909e-02
 -2.5026e-02
 -1.8267e-02
 -1.1082e-02
 -4.1618e-03
  1.9620e-03
  6.5590e-03
  9.8330e-03
  1.1730e-02
  1.2231e-02
  1.1802e-02
  1.1500e-02
  1.0308e-02
  7.8142e-03
  5.8288e-03
  4.6363e-03
  1.8351e-03
 -8.7649e-04
 -3.1003e-03
 -1.8157e-02

The trailing weights are still quite large, showing that 39th order isn’t quite enough for the shape we’ve specified, depending on your level of obsessiveness over flatness and available computing power, but the graphs show it’s not bad.
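Applying a set of weights like this is just a sum over lagged samples. A minimal sketch, assuming weight i multiplies the sample i*k steps back (as in the Z transform comment in the Octave code) and that samples before the start of the signal are zero. `apply_lagged_fir` is a name invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y[t] = sum_i w[i] * x[t - i*k]
// ASSUMPTION: x is zero before t = 0, so early outputs are start-up transient.
std::vector<double> apply_lagged_fir(const std::vector<double>& x,
                                     const std::vector<double>& w,
                                     std::size_t k)
{
    assert(k > 0);
    std::vector<double> y(x.size(), 0.0);
    for(std::size_t t = 0; t < x.size(); t++)
        for(std::size_t i = 0; i < w.size() && i * k <= t; i++)
            y[t] += w[i] * x[t - i * k];
    return y;
}
```

With weights {1, -1} this reduces to the basic comb, and {1, -0.5, -0.5} gives the second filter from earlier. For the 40 weights above with k = 18 at 900Hz, each output needs 40 multiply-adds and 702 samples of history.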

Of course just because we’re filtering in the digital domain and can have filters with vast orders doesn’t mean that’s not overkill. And being essentially a Fourier domain process, all the usual caveats apply, such as sharp edges giving Gibbs overshoots and (with few exceptions—such as sharp notches), small features require very high order filters.

Nonetheless it works, and it’s a useful tool!

As usual: https://github.com/edrosten/filter_design

Apparently this is legal.

EDIT: I posted this on /r/cpp and there’s lots of good discussion there. I’m now less sure it’s legal, but still not sure either way.

Apparently, it’s legal to take pointers to elements in a C++ structure and do arithmetic on them to get pointers to other elements (caveats apply).


struct foo
{
    float a, b, c;
};

...

foo f;
float* b_ptr = &f.a + 1;

I’m about 99% sure it’s legal, provided that the class is “standard layout” (a superset of POD—there are no restrictions on constructors or destructors). The standard doesn’t seem to contradict this view. Section 9.2.13 of N3690 (the C++14 final complete draft) specifies that

Nonstatic data members of a (non-union) class with the same access control (Clause 11) are allocated so that later members have higher addresses within a class object.

so a, b, c must all be at increasing addresses. The standard does allow arbitrary padding (9.2.19), but the presence of padding can be detected easily enough (for example sizeof(foo) == 3*sizeof(float)). Having a static assert for that would in theory limit portability, but I’ve not encountered a platform where structs of a single scalar type aren’t packed.
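The padding check can be spelled out as static asserts, so a platform that does pad simply fails to compile. This sketch only checks the layout; it doesn't settle whether the pointer arithmetic itself is allowed. `members_are_contiguous` is a made-up helper for a runtime cross-check.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

struct foo
{
    float a, b, c;
};

// Compile time checks: standard layout, no padding, members at the expected offsets
static_assert(std::is_standard_layout<foo>::value, "foo must be standard layout");
static_assert(sizeof(foo) == 3 * sizeof(float), "unexpected padding in foo");
static_assert(offsetof(foo, a) == 0, "a is not at the start");
static_assert(offsetof(foo, b) == sizeof(float), "padding between a and b");
static_assert(offsetof(foo, c) == 2 * sizeof(float), "padding between b and c");

// Runtime cross-check using byte addresses within the object
bool members_are_contiguous(const foo& f)
{
    const char* base = reinterpret_cast<const char*>(&f);
    const char* pb = reinterpret_cast<const char*>(&f.b);
    const char* pc = reinterpret_cast<const char*>(&f.c);
    return pb - base == static_cast<std::ptrdiff_t>(sizeof(float))
        && pc - base == static_cast<std::ptrdiff_t>(2 * sizeof(float));
}
```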

It also doesn’t break type punning/strict aliasing rules since a, b, c are of the same type.

It’s possible that 5.7.5 says it’s illegal. 5.7.4 says that a pointer to a non array shall be treated as a pointer to an array of length 1, and 5.7.5 says that going too far over the end of an array (more than 1 element over) is undefined.

However, offsetof is well defined and depends on allowing you to move around within a standard layout class using pointer arithmetic on the pointer to the class.

TL;DR: it’s safe.

OK, why?

Well, one of my TooN2 users asked if it was possible to have a Vector with named elements (such as .x, .y, .z) for a 3-vector, but there are many other possible variants too. If it’s legal, then you could replace float my_data[3]; with float a, b, c;

Here’s an excessively simplified version of TooN to demonstrate the principle:


//Array is one data storage class
#include <array>

//This is another data storage class
struct RGB
{
	float r, g, b;

	//Pointer to the first element; the others follow it in memory
	float* data()
	{
		return &r;
	}
};

//This is an excessively simplified Vector class. It takes 
//the data storage class and provides operator[]. TooN itself has more layers
//and the size as part of the type, but the principle is the same
template<class Base>
struct Vec: public Base
{
	//Return a reference so elements can be read and written
	float& operator[](int i)
	{
		return Base::data()[i];
	}
};

//Define Vec's of various lengths
Vec<std::array<float,2> > v2;
Vec<std::array<float,3> > v3;
Vec<std::array<float,4> > v4;

//Define a Vec of length 3 with named elements
Vec<RGB> vRGB;

Here’s the actual code:
https://github.com/edrosten/TooN/commit/32cb582e8e8f526980e2c7355793d23a3a629c9a

You can now do:

#include <iostream>
#include <TooN/TooN.h>
#include <TooN/named_elements.h>


//Now actually make some statically allocated vectors with name members.
TOON_MAKE_NAMED_ELEMENT_VECTOR(CMYK, c, m, y, k);
TOON_MAKE_NAMED_ELEMENT_VECTOR(RGB, r, g, b);
TOON_MAKE_NAMED_ELEMENT_VECTOR(XYZ, x, y, z);




int main()
{
        CMYK<> v = TooN::makeVector(1, 2, 3, 4);
        RGB<> r = TooN::makeVector(1, 2, 3);

        std::cout << v << std::endl;
        std::cout << " c = " << v.c 
             << " m = " << v.m 
             << " y = " << v.y 
             << " k = " << v.k << std::endl;

        std::cout << v * TooN::makeVector(1, 2, 3, 4) << std::endl;

}

The code relies on a variadic macro (C++11 inherited the C99 preprocessor which has variadic macros) to generate a vector Base class with the correct named elements in it. Note that CMYK, RGB and so on are proper TooN vectors, but (much like slices) they don’t use the same base as a straightforward Vector declaration.

EDIT:

The conclusion from the various discussions is that my technique might not be allowed, though I think it is not 100% clear. A modification is to change the underlying storage by adding a union so that an array aliases the members:


//Array is one data storage class
#include <array>

//This is another data storage class
struct RGB
{
	union
	{
		struct
		{
			float r, g, b;
		};

		float my_data[3];
	};

	float* data()
	{
		return my_data;
	}
};

It appears from the standard that this is definitively not forbidden, though it’s also not explicitly allowed either.