New paper: Y-Autoencoders: Disentangling latent representations via sequential encoding

Massimiliano Patacchiola, Patrick Fox-Roberts and I published a new paper in Pattern Recognition Letters (PDF here).

This work presents a new way of training autoencoders that allows separation of style and content, giving GAN-like performance with the ease of training of autoencoders.

Abstract

In the last few years there have been important advancements in generative models with the two dominant approaches being Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). However, standard Autoencoders (AEs) and closely related structures have remained popular because they are easy to train and adapt to different tasks. An interesting question is if we can achieve state-of-the-art performance with AEs while retaining their good properties. We propose an answer to this question by introducing a new model called Y-Autoencoder (Y-AE). The structure and training procedure of a Y-AE enclose a representation into an implicit and an explicit part. The implicit part is similar to the output of an autoencoder and the explicit part is strongly correlated with labels in the training set. The two parts are separated in the latent space by splitting the output of the encoder into two paths (forming a Y shape) before decoding and re-encoding. We then impose a number of losses, such as reconstruction loss, and a loss on dependence between the implicit and explicit parts. Additionally, the projection in the explicit manifold is monitored by a predictor, that is embedded in the encoder and trained end-to-end with no adversarial losses. We provide significant experimental results on various domains, such as separation of style and content, image-to-image translation, and inverse graphics.

Undefined behaviour and nasal demons, or, do not meddle in the affairs of optimizers

Discussions of undefined behaviour often degenerate into vague mumblings about “nasal demons”. This comes from a 1992 Usenet post on comp.std.c, where the poster “quotes” the C89 standard (this is oddly hard to find now) as:

“1.6 Definitions of Terms
In this standard, … “shall not” is to be interpreted as a prohibition
* Undefined behavior — behavior, upon use of a nonportable or erroneous program construct, … for which the standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to having demons fly out of your nose.”

John F. Woods

Essentially what he means is that the compiler is allowed to do anything, including things which seem completely nonsensical. Usually you feel you have a handle on what sort of things the compiler might do, but sometimes it can do some very unexpected things. Of course they make sense in the compiler’s internal logic and are not incorrect, but it can take a long time to figure out what line of reasoning led the compiler to its decision.

I have a nice example, investigated and distilled down to a minimal test case by my friend David McCabe, who kindly allowed me to post it here (I’ve changed it slightly). The example involves accidental memory scribbling in the context of derived classes (so the vtable pointers get scribbled over), where the compiler can inline enough to “see” the scribbling and then reason about it.

Here is the code snippet (and Godbolt link to peruse):

#include <cstring>

void e();
void f();
void g();

class A {
public:
    virtual ~A() = default;
};

class B : public A {
public:
    virtual void x() {f();};
};


void r(A* a) {
    B c;
    std::memcpy(static_cast<void*>(a), static_cast<const void*>(&c), sizeof(A));
}

int main() {
    A a;
    r(&a);
    e();
    ((B*)&a)->x();
    g();
}

So, what’s going on, or rather, what’s supposed to be going on? By “supposed”, I mean what one would expect a compiler with no optimizations to do, given what I know about how C++ code translates to unoptimized assembly.

So, we have a base class with a vtable, A, and in main we create an instance a. Note that A and the derived class B happen to be the same size (the size of the vtable pointer). The function r is the memory scribbler: it creates an instance of B and memcpy()s it over a. In an intuitive sense, the instance a has been turned into an instance of B, because it’s had its vtable pointer changed to a B vtable pointer. At a low level this is really the only thing that makes these types different, so if you were to cast a pointer to a into a B pointer, it should now work as a B and you can make the virtual function call x on it.

And that’s what main does. The only purpose of e, f, and g is to place markers in the generated code so we can see what ran.

Now, scribbling over vtables is wildly, ridiculously undefined, and nasal demons may fly (spoiler alert: they do). You also need to add stubs for e, f and g in a separate source file (to stop the compiler optimizing through them), like this:

#include <iostream>
void e(){ std::cerr << "e\n"; }
void f(){ std::cerr << "f\n"; }
void g(){ std::cerr << "g\n"; }

then compile and run it (first without optimization), and it prints:

e
f
g

which is what you might expect. But what about with optimizations? On my machine (gcc 7.5.0), if I compile the main file with -O3 and the stubs with no optimization, it prints out:

e
e
⋮
<repeats for a total of 37401 e's>
⋮
e
Segmentation fault (core dumped)

These are the nasal demons. How on earth can the optimizer have turned that code into a loop and a segfault? It turns out it makes sense, but only after a lot of investigation. But first, I’m going to consider some other cases before getting on to GCC with -O3. The Godbolt link does nice colourisation, so you can see which program lines match which assembly lines.

First, here’s what unoptimized GCC 7.5.0 does (note I’ve snipped a lot of the setup and teardown and so on):

        lea     rax, [rbp-24]
        mov     rdi, rax
        call    r(A*)
        call    e()
        lea     rax, [rbp-24]
        mov     rax, QWORD PTR [rax]
        add     rax, 16
        mov     rax, QWORD PTR [rax]
        lea     rdx, [rbp-24]
        mov     rdi, rdx
        call    rax
        call    g()
        lea     rax, [rbp-24]
        mov     rdi, rax
        call    A::~A() [complete object destructor]

The most relevant lines are, in order: a call to r, a call to e, then an indirect call via the vtable, then g. It then calls the destructor for A, not B. The destructor call is non-virtual because a is an instance, not a pointer, so it ignores the scribbled vtable and calls the “wrong” destructor.

I’m going to look at what the other compilers do first, because GCC 5 and newer definitely do the most unexpected thing. First, the last pre-5 version on Godbolt, GCC 4.9.4 with -O3:

main:
        sub     rsp, 40
        mov     QWORD PTR [rsp+16], OFFSET FLAT:vtable for B+16
        mov     QWORD PTR [rsp], OFFSET FLAT:vtable for B+16
        call    e()
        mov     rax, QWORD PTR [rsp]
        mov     rdi, rsp
        call    [QWORD PTR [rax+16]]
        call    g()
        xor     eax, eax
        add     rsp, 40
        ret

I’ve stripped out everything except main(). Note it’s inlined both r() and the call to memcpy. On line 3 it creates the concrete instance c and copies in the B vtable pointer; on the following line it memcpys that over a, having noticed it can take the data straight from the source. Then it calls e, makes a virtual call (which we know will be f) and then calls g. It inlines and removes the trivial ~A(), tears down main() and leaves.

So this program behaves as “expected”: it would print out e, f, g like the unoptimized one. MSVC 19.14 with /Ox does the same:

main    PROC
$LN19:
        sub     rsp, 40                             ; 00000028H
        lea     rax, OFFSET FLAT:const B::`vftable'
        mov     QWORD PTR a$[rsp], rax
        call    void e(void)                         ; e
        mov     rax, QWORD PTR a$[rsp]
        lea     rcx, QWORD PTR a$[rsp]
        call    QWORD PTR [rax+8]
        call    void g(void)                         ; g
        xor     eax, eax
        add     rsp, 40                             ; 00000028H
        ret     0
main    ENDP

Using 19.28 and compiling for x86 rather than x64 adds an extra flourish: it doesn’t make an indirect call; instead it checks whether it has a B vtable and, if so, makes a direct call. But that’s the same in essence, since it’s still deciding based on the vtable:

_main   PROC
        push    ecx
        mov     DWORD PTR _a$[esp+4], OFFSET const B::`vftable'
        call    void e(void)                         ; e
        mov     eax, DWORD PTR _a$[esp+4]
        mov     eax, DWORD PTR [eax+4]
        cmp     eax, OFFSET virtual void B::x(void)      ; B::x
        jne     SHORT $LN6@main
        call    void f(void)                         ; f
        call    void g(void)                         ; g
        xor     eax, eax
        pop     ecx
        ret     0
$LN6@main:
        lea     ecx, DWORD PTR _a$[esp+4]
        call    eax
        call    void g(void)                         ; g
        xor     eax, eax
        pop     ecx
        ret     0

Clang on the other hand will apply its powerful optimizer to this case:

main:                                   # @main
        push    rax
        call    e()
        call    f()
        call    g()
        xor     eax, eax
        pop     rcx
        ret

It’s likely using dataflow; I am guessing it follows the provenance of the pointer, finds it’s been copied from the B vtable and then devirtualises. Note that if you put in destructors it will call ~A() not ~B(), because like GCC it never makes a virtual call in the first place. So far, so sort-of sensible. You can nod approvingly at the power of clang’s optimizer but not feel pessimistic about VS2017 just punting on that and doing the “obvious” thing (except on x86, where it makes the same deduction but is much more conservative with its actions).

But what about GCC? This behaviour is present in all versions from 5 onwards; I’m picking 7.5 (the version on my machine, though 10.2, the latest on Godbolt, does the same). With -O3, it does:

main:
        sub     rsp, 8
        call    e()
WAT

That’s it. The whole of main(). It calls e(), then nothing. No teardown, no return, nothing; it just falls off the end into whatever code happens to be lying there. This is the source of the nasal demons.

Undefined behaviour is not allowed, according to the standard. Therefore, according to GCC, it does not happen. The compiler goes one step further than clang: it notices that the code after e() uses the wrong vtable, and since that is undefined behaviour it deduces that this code is never reached. The only way for that to happen is if e() never returns, so it marks everything after the call to e() as unreachable and deletes it.

When the standard says “undefined”, the standard means it, and the compiler is allowed to reason backwards in time to make optimizations, on the assumption that the program is valid. This is very unintuitive, but it is entirely legal and part of a very powerful optimization pass.

This isn’t the compiler being, as some people feel, perversely pedantic just to mess with the unwary programmer.

It’s really handy: GCC can take some inlined code and figure out that a pointer is non-null based on how it’s used, and can then travel back through the code with that knowledge and remove all the tests for nullness, and the alternative branches based on such tests, making the inlined code both faster and more compact. So you write your code to be generic, and GCC gives you an extra-fast, extra-compact version when it finds a special use case. It’s exactly the sort of thing a person might do, but it’s automatic, and woe betide the person who violates the preconditions.
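
To make that concrete, here’s a minimal sketch (my example, not one from the post) of that null-check elimination:

// Dereferencing a null pointer is undefined behaviour, so once *p has
// executed the compiler may assume p != nullptr.
int read_value(const int* p) {
    int v = *p;           // UB if p is null, so the compiler assumes it isn't
    if (p == nullptr)     // provably false under that assumption...
        return -1;        // ...so the optimizer deletes this branch entirely
    return v;
}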

So the last question is: why this specific behaviour? Why does it print out many lines and then segfault? Well, here’s an objdump of the relevant part of the resulting executable:

int main() {
 8d0:	48 83 ec 08          	sub    $0x8,%rsp
    A a;
    r(&a);
    e();
 8d4:	e8 51 01 00 00       	callq  a2a <e()>
 8d9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

00000000000008e0 <_start>:
 8e0:	31 ed                	xor    %ebp,%ebp
 8e2:	49 89 d1             	mov    %rdx,%r9
 8e5:	5e                   	pop    %rsi
 8e6:	48 89 e2             	mov    %rsp,%rdx
 8e9:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
 8ed:	50                   	push   %rax
 8ee:	54                   	push   %rsp
 8ef:	4c 8d 05 8a 02 00 00 	lea    0x28a(%rip),%r8        # b80 <__libc_csu_fini>
 8f6:	48 8d 0d 13 02 00 00 	lea    0x213(%rip),%rcx        # b10 <__libc_csu_init>
 8fd:	48 8d 3d cc ff ff ff 	lea    -0x34(%rip),%rdi        # 8d0 <main>
 904:	ff 15 d6 16 20 00    	callq  *0x2016d6(%rip)        # 201fe0 <__libc_start_main@GLIBC_2.2.5>
 90a:	f4                   	hlt    
 90b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)

Look what happens to sit right after main(): it’s _start, which is where the operating system starts the program. So after falling off the end of main, execution runs into that and simply restarts the program from scratch. The program then makes a call to main, which never returns because it runs off the end… and you get the idea. Eventually the program exhausts the stack and segfaults. If you change the optimization settings of the file with the stubs in, the behaviour changes, because a different piece of code gets randomly wandered into.

I thought this was fascinating: I know that undefined behaviour can in principle do some very strange things but it’s interesting to see an example where it really does. I would never have predicted infinite recursion as an outcome. So don’t mess with undefined behaviour: one day the compiler might really make demons fly from your nose.

Static binaries

So, in a previous post, I claimed static binaries “last forever”. The oldest binaries I have to hand are from 2006. These were built on either RedHat or Suse, but I don’t remember which; we were transitioning to Suse at around that time.

Anyway, they work. At least the Linux/x86 one does; I don’t have a handy PPC, SGI or Sun machine around to try the more exotic architectures. I mostly keep those binaries around to buff my nerd credentials. Anyway, the point is, the 14-year-old binary works just fine.

So how far back will it go?

I do remember there was a “compatibility break” between libc5 and glibc back in the day, but that was with dynamically linked code. I remember, though, that one could fix it by finding the loader and libc from an old system and simply copying them across.

Time to dig out an old static executable!

Hm, well, no idea where I’d find one. The easiest way is to make one. Sadly, because I’m not a complete tech hoarder, I threw away my RedHat 5.2 CDs, which came in a spiffy box set for something like £50 at the local bookshop, bundled with a bunch of books on CD. Best £50 I ever spent; it’s what got me into Linux.

See that? A complete computing environment IN ONE BOX!!!!!11

Fortunately, good Linux vendors still make their ancient distributions available, for historical interest and presumably for maintenance of ancient systems. RedHat is among them, so for old time’s sake I’ll install RedHat 5.2.

The easiest way is to start with QEMU. The first thing to do is to make a disk image. I think I originally installed it on my brand new, lavishly huge 9G drive; I reckon 1G is OK here, since the distribution came on a CD:

qemu-img create deadrat-5.2.img 1G

You’ll then need the boot image (boot.img) and as I discovered after some trial and error, the supplemental floppy image (supp.img) from here:

ftp://archive.download.redhat.com/pub/redhat/linux/5.2/en/os/i386/images/

FTP for the win! I’m using FTP since it’s necessary later. RedHat no longer offers the ISO images to download, so I can’t grab one and do a CD install. They do however still have the FTP server running (see above), so you can do an FTP install if you have network access. A little trial and error found me this incantation of QEMU to get on the ‘net:

qemu-system-i386 -fda boot.img -hda deadrat-5.2.img  -netdev user,id=mynet0 -device ne2k_pci,netdev=mynet0 

This emulates an NE2000 PCI network card, a popular model with many almost-compatible clones in the mid ’90s. So…

OK now to try some very old school 1997 era C++ code (/tmp/prog.cc):

#include <iostream.h>
int main(){
    cout << "Hello, world.\n";
}

Don’t complain that it’s not standards compliant; there wasn’t a standard in 1997. Compile it, check it and run it:

[root@hax /tmp]# g++ --version
egcs-2.90.29 980515 (egcs-1.0.3 release)
[root@hax /tmp]# g++ -static prog.cc
[root@hax /tmp]# ldd a.out
        not a dynamic executable
[root@hax /tmp]# ./a.out
Hello, world.

OK, it seems to work. Now to get that file onto my machine: first power off the VM, then mount the image. Mounting isn’t completely trivial, since you first need to examine the partition table and pass an offset to the loopback device:

$ fdisk -lu deadrat-5.2.img 
Disk deadrat-5.2.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device           Boot   Start     End Sectors   Size Id Type
deadrat-5.2.img1           63   34271   34209  16.7M 83 Linux
deadrat-5.2.img2        34272 2064383 2030112 991.3M  5 Extended
deadrat-5.2.img5        34335 1997855 1963521 958.8M 83 Linux
deadrat-5.2.img6      1997919 2064383   66465  32.5M 82 Linux swap / Solaris
$ sudo losetup -o $((512 * 34335)) /dev/loop99 deadrat-5.2.img 
$ sudo mount /dev/loop99 /mnt/

And running it…

$ cd /mnt/prog
$ ./a.out
Hello, world.

Woahhhhh. A 24-year-old binary (well, a simulated one) just runs on a current Linux machine. Actually, mostly the whole thing runs: I can chroot to it and just run stuff. It takes a little setup: modern Ubuntu seems to have blocked TCP access to the X server, so I have to bind mount /tmp to /mnt/tmp. But once that’s done, I can just run stuff! For example:

Huh, a filament bulb. That’s pretty retro.

Static executables just work. Dynamic ones are harder: they work, but not without a chroot, because while the kernel mostly maintains compatibility, libc does not.
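
For reference, the chroot setup is roughly this (a sketch; it assumes the image is still mounted on /mnt as above):

# Bind mount /tmp so X clients inside the chroot can reach the X11 UNIX
# socket, then drop into the 1998-era userland.
sudo mount --bind /tmp /mnt/tmp
sudo chroot /mnt /bin/bash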

But anyway, point proven. Static executables just work out of the box 24 years later. I wonder if my Docker images will work in a quarter of a century’s time…

Adventures in Docker: dockerizing an old build process

So I don’t really hold with this new-fangled “container” stuff.

Well, OK, that’s not really true. I don’t really hold with its massive overuse. A lot of the time it seems to be used because people don’t know how to distribute executables on Linux. Actually, the way you make distributable executables on Linux is much the same as on other systems.

However, Linux also has an extra-easy way of compiling locally which isn’t great for distribution, leading to the problem of “well, it works on my machine”, or people claiming that you can’t distribute binaries. So people tend to use a container to ship their entire machine. At least that’s better than shipping an entire VM, which was briefly in vogue.

A frequently better way is to statically link what you can, dynamically link what you can’t, and ship the executable along with any dynamic dependencies. Not only is this lightweight, it also works for people who don’t have large container systems installed (most people), or in places where a container won’t work (e.g. for a plugin).
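
In concrete terms, the recipe is something like this (a sketch, not the actual 3B build flags):

# Statically link the C++ runtime pieces that commonly break, then see
# what dynamic dependencies remain and ship those alongside the binary.
g++ -o prog prog.cc -static-libstdc++ -static-libgcc
ldd prog   # copy any remaining .so files to distribute next to the executable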

And that’s precisely what I do for the 3B system.

THE END

OK, that was a short post. What about building, though? Building is hard, and a different matter entirely. To reliably build a version you know works, you need your source code at a specific version, your dependencies at specific versions, your compiler at a specific version, and ideally a “clean” system with nothing that might disturb the build in some unexpected way.

And you want this on several platforms (e.g. 32 and 64 bit Linux and Windows).

In an ideal world you wouldn’t need these, but libraries have compatibility breaks, and even compilers do. Even if you try hard to write compliant code, bear in mind I started writing the code in 2009, and there have been some formal breaks since then (auto_ptr leaving us), as well as bits I didn’t get 100% right which came back to bite many years later.
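
For instance, here’s an illustrative sketch (my example, not code from 3B) of the auto_ptr break:

#include <memory>

int main(){
    // std::auto_ptr was the idiom in 2009; it was deprecated in C++11 and
    // removed in C++17, so this line is a hard error under -std=c++17.
    std::auto_ptr<int> p(new int(42));
    // The C++11 replacement:
    std::unique_ptr<int> q(new int(42));
    return *p + *q;
}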

Now I could in principle maintain the code to keep it up to date with new versions of everything and so on. Maybe I should, but the project is essentially finished and in maintenance mode, and that takes time and testing.

Oh, and one of the main use cases is on clusters, and these often use very stable versions of RedHat or some equivalent, so they tend to be years old; my maintained version would need to be buildable on ancient RedHat, which I don’t have to hand. Either way, Linux tends to be backwards compatible, not forwards, so your best bet is to build on an old system.

So I solved this problem years ago with this hideous script. Since you haven’t read it: it creates Ubuntu images (in various configurations) using debootstrap, sets them up with all the packages, compiles the code in all the different ways I want and assembles a release.

It took ages to write, but it really took the pain out of making releases, since a release required lots of configurations: 32 and 64 bit Windows and Linux static executables, plus JNI plugins for all of those (the Windows ones compiled with MinGW). It even has a caching mechanism: it builds a base system, then constructs a system with all the dependencies from that, then constructs a clean system for building the code from those.

It’s quite neat: if all you need to do is rebuild, it’s pretty quick, because it only needs to copy the image and build the code. And it still (mostly) works. I ran it for the first time in ages, and apart from some of the URLs going stale (the Ubuntu packages have moved for the now-historic 10.04, as have libpng and libtiff), it worked as well today as it did 10 years ago.

The downside is that it has to run as root, because it needs to run debootstrap and chroot, which at the time required root. This makes it hard to run on restricted systems (clusters), and it builds the entire thing every time, making it hard for people to modify the code. I could update it to use things like fakechroot, but this sort of thing is precisely what Docker does well.

Docker basically makes shipping and managing whole OS images easier, has a built-in and easy-to-use caching mechanism, and so on. The Dockerfile, if you care, looks like this:

FROM ubuntu@sha256:51523b5adbc67853e73d7e5faff234614942f9ff3872f329d2bb59478baf53db
LABEL description="Builder for 3B on an ancient system"

# Since 10.04 (lucid) is long out of support, the packages have moved
RUN echo 'deb http://old-releases.ubuntu.com/ubuntu/ lucid main restricted universe' > /etc/apt/sources.list

#Install all the packages needed
RUN apt-get update
RUN apt-get install -y --force-yes openjdk-6-jre-headless && \
	apt-get install -y --force-yes openjdk-6-jdk wget zip vim && \
	apt-get install -y --force-yes libjpeg-dev libpng-dev libtiff-dev && \
	apt-get install -y --force-yes build-essential g++ 

RUN mkdir -p /tmp/deps /usr/local/lib /usr/local/include

WORKDIR /tmp/deps

#Build lapack
#Note: Docker automatically untars with ADD
ADD clapack.tgz   /tmp/deps
ADD clapack-make-linux.patch /tmp/deps
ADD clapack_mingw.patch /tmp/deps
WORKDIR /tmp/deps/CLAPACK-3.2.1
RUN cp make.inc.example make.inc && patch make.inc < ../clapack-make-linux.patch 
RUN patch -p1 < ../clapack_mingw.patch
RUN make -j8 blaslib && make -j8 f2clib && cd INSTALL && make ilaver.o slamch.o dlamch.o lsame.o && echo > second.c && cd .. && make -j8 lapacklib
RUN cp blas_LINUX.a /usr/local/lib/libblas.a && cp lapack_LINUX.a /usr/local/lib/liblapack.a && cp F2CLIBS/libf2c.a /usr/local/lib/libf2c.a

ADD TooN-2.0.tar.gz   /tmp/deps
WORKDIR /tmp/deps/TooN-2.0
RUN ./configure && make install

ADD gvars-3.0.tar.gz   /tmp/deps
WORKDIR /tmp/deps/gvars-3.0
RUN ./configure --without-head --without-lang && make -j8 && make install 

ADD libcvd-20121025.tar.gz   /tmp/deps
WORKDIR /tmp/deps/libcvd-20121025
RUN ./configure --disable-fast7 --disable-fast8 --disable-fast9 --disable-fast10 --disable-fast11 --disable-fast12 && make -j8 && make install

RUN mkdir -p /home/build
WORKDIR /home/build

It’s not very interesting. It gets an Ubuntu 10.04 base image, updates it to point at the historic archive, and then patches, builds and installs the dependencies. More or less what the shell script did before, minus downloading the dependencies: it’s nearly 2021, not 2011, and I no longer care about 16M of binary blobs checked into an otherwise unchanging git repository.

The result is a Docker environment that’s all set up with everything needed to build the project. Building the Docker environment is easy:

docker build -t edrosten/threeb-static-build-env:latest .

But the really neat bit is executing it. With all that guff out of the way, from a user’s point of view building is a slight modification of the usual ./configure && make: share the current directory with Docker as a mount (that’s what -v does), and run in the container:

docker run -v $PWD:/home/build edrosten/threeb-static-build-env ./configure
docker run -v $PWD:/home/build edrosten/threeb-static-build-env make -j 8

The funny thing about this is that it gives a very modern way of partying like it’s 1999 (or 2009, at any rate). The environment is stock Ubuntu 10.04, with a C++98 compiler. Obsolete? Sure, but it works and will likely keep working for a long time yet. And static binaries last forever: I last touched the FAST binaries in 2006, probably on some Red Hat machine, and they still work fine today.

I’m not likely to update the builder for the ImageJ plugin any time soon. No one except me builds that since anyone who is in as deep as modifying the code likely wants to analyse large datasets for which ImageJ isn’t the right tool.

New paper: Large Scale Photometric Bundle Adjustment

Olly Woodford and I published a new paper, Large Scale Photometric Bundle Adjustment (PDF here), at BMVC 2020.

This work presents a fully photometric formulation for bundle adjustment. Starting from the output of a classical system (such as COLMAP), it performs structure and pose refinement where the cost function is essentially the normalised correlation cost of patches reprojected into the source images.
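
By normalised correlation I mean something like the standard ZNCC score between a reference patch a and the corresponding reprojected patch b (a sketch of the usual definition; see the paper for the exact cost used):

\mathrm{NCC}(a, b) = \frac{\sum_i (a_i - \bar{a})(b_i - \bar{b})}{\sqrt{\sum_i (a_i - \bar{a})^2}\,\sqrt{\sum_i (b_i - \bar{b})^2}}

Subtracting the patch means and normalising makes the score invariant to gain and offset changes in each patch, which is what gives the invariance to local lighting changes.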

Abstract


Direct methods have shown promise on visual odometry and SLAM, leading to greater accuracy and robustness over feature-based methods. However, offline 3-d reconstruction from internet images has not yet benefited from a joint, photometric optimization over dense geometry and camera parameters. Issues such as the lack of brightness constancy, and the sheer volume of data, make this a more challenging task. This work presents a framework for jointly optimizing millions of scene points and hundreds of camera poses and intrinsics, using a photometric cost that is invariant to local lighting changes. The improvement in metric reconstruction accuracy that it confers over feature-based bundle adjustment is demonstrated on the large-scale Tanks & Temples benchmark. We further demonstrate qualitative reconstruction improvements on an internet photo collection, with challenging diversity in lighting and camera intrinsics.

Make hacks: embedded version info, detecting non timestamp change and automatically generated dependencies

For the purposes of traceability, you may wish to embed the version of a program into it. If you’re using full-on CI, you really shouldn’t need this; if your CI system doesn’t record such things, then you need to fix it now.

But if you’re hacking around locally, especially for research, it can be really useful to know where an executable came from, or more specifically where some results came from, because it’s easy to lose track of that. An easy way to do this is to embed the git hash of the repository into the executable, so it can be written alongside the data.

Essentially if git status --porcelain prints nothing then the repository is entirely clean. With that in mind, here’s a short script which generates a C++ source file with the git hash if the repository is clean, and prints a loud, red warning if it is not clean. Here is get_version.sh:

git_hash=`git rev-parse HEAD`

if [[ "$(git status --porcelain)" != "" ]]
then
	echo -e "\033[31mThere are uncommitted changes. This means that the build" 1>&2
	echo -e "will not represent a traceable version.\033[0m" 1>&2
	time=`date +%s`
	version="${git_hash}-${time}"
else
	version="${git_hash}"
fi

cat <<FOO
namespace version{
	const char* version_string = "$version";
}
FOO

It’s easy enough to use from make:

.PHONY: FORCE
versioninfo.cc: FORCE
	bash get_version.sh > versioninfo.cc

and of course make your program depend on versioninfo.o. But it’s not very clean; this will rebuild and then re-link every single time. The key is to make the FORCE dependency depend on whether anything has changed.

This script (get_version_target.sh) reruns get_version.sh and compares the output to the existing versioninfo.cc. If there’s a change, it prints the target (FORCE); otherwise it prints nothing.

if ! diff -q versioninfo.cc <( bash get_version.sh 2> /dev/null ) > /dev/null 2>&1
then
        echo FORCE
fi

You then need to plumb this into make using the shell command:

version_target=$(shell bash get_version_target.sh)
.PHONY: FORCE 
versioninfo.cc: $(version_target)
	bash get_version.sh > versioninfo.cc

This will now only re-generate versioninfo.cc (and hence the .o and the final executables) if the git hash changes.

With the basics in place, you can make the version info as detailed as you like: for example, you could record tags and branch names, and whether it’s an actual point release, etc. The downside of the shell commands is that they run every time make is called, so you will want to make them fast, otherwise incremental rebuilds will become annoyingly slow.
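
For example, something along these lines (hypothetical; not part of the script above) could be dropped into get_version.sh:

# Also record the branch and any tag pointing at HEAD (or "none").
branch=`git rev-parse --abbrev-ref HEAD`
tag=`git describe --tags --exact-match 2>/dev/null || echo none`
version="${git_hash}-${branch}-${tag}"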

Using this mechanism, make can be much more malleable than expected. This is an immensely powerful feature. But remember: with great power comes great responsibility.

Realtime AR world transformations with occlusions

This is what my team, collaborators and I have been working on recently: world-transforming AR, specifically transforming the floor.

You can see occlusions, such as the pillars occluding the floor effect, but we have more sophisticated occlusion handling too:

You can’t tell from this video that the occlusion handling is dynamic, so if the postbox managed to move, the occlusions would stay up to date. And here’s a gallery of nice shots

If you have Snapchat and want to try it for yourself, here are the snapcodes:

Guest blog post: weekend science edition

Inspired by Helen Czerski’s post about bubbles in paper straws, we decided to try some experiments to see if paper straws are the only ones to create bubble rings. We recorded all the videos here using super slow-mo mode on a Galaxy S8 Plus. The frame rate (480fps) and picture quality are good, but the autofocus had an itchy trigger finger and delighted in switching to focus on something a few inches away just as we started filming. Just because I’m filming glass and water under fluorescent light doesn’t mean I need a back-seat video taker. More override controls, please!

Iteration one: Paper straws with fizzy water

This worked pretty well, and we were quickly able to reproduce Helen’s results. Go reproducible science! We played around with different ways of filming and got some nice videos.

Paper straws in an old (May 27th 2011) glass

Iteration two: The peristaltic pump

This is the point where @edrosten suggested it would be ‘really easy’ to do this using a peristaltic pump. And I was like:

When either of us says something will be really easy, it usually means a 3-5 year research program. Anyway, we got the pump down from the attic and worked out that electrical tape was airtight enough to seal a straw to the tubing.

This is legit science

The peristaltic pump was very cute and could pump water fine, but rarely produced bubble rings except very occasionally at the beginning (results not shown). A constant flow of water did not produce bubble rings, just a little stream of bubbles. Maybe we just needed a bigger pump.

Iteration three: Just add everything except water

To see if viscosity might have any effect, we created a sugar syrup to increase the viscosity. To hydrolyse the sugar and ensure that the syrup would be miscible in water, we added citric acid.

Is that an industrial-size bag of citric acid in your kitchen, or are you just pleased to see me?

Concentration was lost for a while as I was trying to use an inverted camera tripod to film the process from under the glass. The syrup ended up caramelising a bit. Mixed with sparkling water, it produced a taste I would definitely quaff while eating a burger and fries.

It also produced very nice bubble rings. Difficult to say if the viscosity had any effect. More experiments are needed.

The next obvious step after sugar was to really crank up the viscosity, so we decided to try adding xanthan gum. This is a standard store-cupboard ingredient for anyone obsessed with changing the texture of food. We had two packs. Mixed with water, it produces something that basically looks like snot.

No sir, it’s really very gloopy.

The bubbles slowed down beautifully in this mix. We didn’t capture any good rings, but we had a lot of fun. This is definitely the part of the experiment I would do more of if I had time.

Iteration four: Paper and plastic, with a cleaner container

It occurred to us that ten year old glasses might possibly not be the smoothest surface. This turned out to very definitely be the case, as we discovered when we got a new jar and tried the experiment in that. Lower background! Easier to detect the signal! Hurrah!

We also decided to purchase some plastic straws. These seem to have almost disappeared (which is a good thing). After some hunting we were able to get some reusable plastic straws.

The first thing we noticed was that the bubble profile was very different on the plastic and paper straws. Like, really different. The paper straw had lots of little bubbles, and the plastic straw had (to my eye) a bigger range of sizes and, on average, larger bubbles. Could this be due to different numbers of nucleation points? Or are the bubbles more motile on the surface of the plastic, and so aggregate? I have no idea. I know nothing about bubbles, except that I have consumed far too many of them during these experiments. Seriously, I had no idea what drinking this amount of gassy water does to a person.

Anyway…

First we tried to get bubble rings with the plastic straws. This was definitely much, much harder than with the paper straws. We only saw a ring once:

And that was out of a lot of tries. More often it looked like this:

As you can see, there is one absolutely honking bubble that comes out at the end like Jabba the Hutt. That happened a lot with the plastic straws, and it’s one of the things that made us think maybe the bubbles were more motile on plastic and could be joining forces.

Iteration five: MOAR NUCLEATIONS

So anyway, then we started thinking about the issue of nucleation points. Could we add nucleation points to plastic straws by roughening the surface? I grabbed some sandpaper.

I’m not sure why I enjoyed this so much. Possibly there is something wrong with me.

As you can see, the bubble profile changed a bit if we compare paper, plastic, and sandpapered plastic. The sandpapered plastic has more bubbles and the paper has more small bubbles, something that was particularly noticeable when the straws were initially inserted into the water.

Yellow: sanded plastic straw. Pink: plastic straw. White: paper straw.

And the sandpapered plastic produced bubble rings pretty reliably!

In conclusion

Plastic straws rarely produce bubble rings, but can be induced to produce them by roughening the surface. I met a friend this afternoon and we were discussing teflon coated tubing that minimises bubbles. Suddenly, bubbles are everywhere!

I’m sure that the differences are fundamentally down to the material properties, and that makes me both happy and nostalgic for the good old days when I used to do materials science.

For me it’s time to go back to microscopy, but I hope you enjoyed our bubble journey. I had a lot of fun, and it was nice to actually do some experiments at home. And if I ever end up in a bubble-off with an Instagram celebrity or Bond supervillain, I now know which straw to choose.

Adafruit mini thermal printer, part 3/3: Long jobs, cancellation and paper out

Writing a printer driver from scratch is quite involved. Who knew?

Code on github: https://github.com/edrosten/adafruit-thermal-printer-driver. Note: I wrote these posts as I went along so there may be bugs in the code snippets which are fixed later. I recommend checking the GitHub source before using a snippet.

This post appears to be about three unrelated things but it isn’t. It’s all about reading back data from the printer.

Cancellation

So, cancellation works, in as much as things stop printing, except none of the end-of-job stuff gets printed (the “cancelled” message and the paper eject). First I thought it was because I was lazy, so I changed the signal handler to:

	{
		struct sigaction int_action;
		memset(&int_action, 0, sizeof(int_action));
		sigemptyset(&int_action.sa_mask);
		int_action.sa_handler = [](int){
			cancel_job = 1;
		};
		sigaction(SIGTERM, &int_action, nullptr);
	}

This is the approved method, since signal() is ill-specified in general, and on Linux entry to the handler resets the handler to the default (terminate); I thought maybe that was happening. Do you think this worked?

The next step was to add LogLevel debug to /etc/cups/cupsd.conf, so it records all my debug messages. It does, along with a bunch of other useful stuff, and it’s all indexed by the print job number. A filtered log looks like this:

D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[25]=\"PRINTER=pl\"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[26]=\"PRINTER_STATE_REASONS=none\"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[27]=\"CUPS_FILETYPE=document\"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[28]=\"FINAL_CONTENT_TYPE=application/vnd.cups-raster\"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[29]=\"AUTH_INFO_REQUIRED=none\"
D [29/Dec/2019:14:21:19 +0000] [Job 184] Start rendering...
D [29/Dec/2019:14:21:19 +0000] [Job 184] Set job-printer-state-message to "Start rendering...", current level=INFO
D [29/Dec/2019:14:21:19 +0000] [Job 184] Processing page 1...
D [29/Dec/2019:14:21:19 +0000] [Job 184] Set job-printer-state-message to "Processing page 1...", current level=INFO
D [29/Dec/2019:14:21:19 +0000] [Job 184] PAGE: DEBUG: Read 2 bytes of print data...
D [29/Dec/2019:14:21:19 +0000] [Job 184] 1 1
D [29/Dec/2019:14:21:19 +0000] [Job 184] bitsperpixel 8
D [29/Dec/2019:14:21:19 +0000] [Job 184] BitsPerColor 8
D [29/Dec/2019:14:21:19 +0000] [Job 184] Width 384
D [29/Dec/2019:14:21:19 +0000] [Job 184] Height799
D [29/Dec/2019:14:21:19 +0000] [Job 184] feed_between_pages_mm 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] mark_page_boundary 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] eject_after_print_mm 10
D [29/Dec/2019:14:21:19 +0000] [Job 184] auto_crop 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] enhance_resolution DEBUG: Wrote 2 bytes of print data...
D [29/Dec/2019:14:21:19 +0000] [Job 184] 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] Feeding 155 lines
D [29/Dec/2019:14:21:19 +0000] [Job 184] Feeding 47 lines

It has the outputs from various filters all mixed together, possibly with some race conditions… (can you spot them?). Anyway, the cancel message is coming through and getting processed correctly. But no output is happening.

Debugging this was tricky because there were several causes. What I eventually did was add a 100ms pause between lines, in order to reduce the amount of paper wasted, and that revealed something interesting. One cause was simply that sometimes the heating level was too low and the text was invisible.

In the other case, I’m just not sure. If the buffer is too full, then the last bits of the job seem to get “lost” somehow, if a cancellation occurs. With a 100ms pause, I always get the cancellation message. If I make the pause shorter then the printer can’t keep up and after a while the buffers all become full. In that case, I get cancellation messages if done early (when the buffers aren’t yet full) but not late.

I don’t yet know if long jobs get truncated. I suspect they would, because there appears to be nothing functionally different between cancellation and normal termination. I don’t know who is responsible for this, but I’d be surprised if it was CUPS. My guess is no one has ever tested printing large amounts of full-page bitmaps on this printer, simply because that’s not the intended use. Speaking of not the intended use…

Abuse of paper sensors

As far as I can tell there isn’t an obvious way to query the buffer status to avoid it getting too full. I don’t even know where the buffer is. I expect the USB system has one, as does the USB chip and the UART on the printer.

But the printer does have a “Transmit Status” command (page 42), for which it warns that there may be a lag, since it’s processed in sequence. Even worse/better, you can’t use this one to detect paper out, because once the paper ends the printer goes offline and won’t execute the command (I expect the paper sensor status command may be more asynchronous). That appears to be untrue, though; I tried it with the following code:

# GS r 1: ask the printer to transmit the paper sensor status
exec 3<> /dev/usb/lp0
echo -ne '\x1dr1' >&3
dd bs=1 count=1 status=none <&3 | od -td1

And I got back 0 with paper in and 12 with the door open.

That apparently useless synchronous mechanism may be just the ticket: I bet if I stuff the command stream with these then I can get an approximation of the number of lines printed. The code looks something like this:

// GS is the group separator character ('\x1d'), defined elsewhere in the driver
void transmit_status(){
	cout << GS << "r1" << flush;
}


void wait_for_lines(const int lines_sent, int& read_back, int max_diff){
	for(;;){
		char buf;
		ssize_t bytes_read = cupsBackChannelRead(&buf, 1, 0.0);

		if(bytes_read > 0)
			read_back++;

		if(lines_sent - read_back <= max_diff)
			break;

		cerr << "DEBUG: buffer too full (" << lines_sent - read_back << "), pausing...\n";
		using namespace std::literals;
		std::this_thread::sleep_for(100ms);
	}
}

// ... and in the main print loop...
			//Stuff requests for paper status into the command stream
			//and count the returns. We allow a gap of 80 lines (1cm of printing)
			transmit_status();
			lines_sent++;
			wait_for_lines(lines_sent, read_back, 80);

Checking the print logs shows this does what is expected. Furthermore, cancellation works properly (it prints the cancelled message and ejects the job) and is pretty quick!

Paper out!

OK, so I’m already reading the paper status. The manual suggests I might not be able to, as I mentioned, but I’m reading it before/after every line, so in that case I think I’m safe. Besides, it’s not entirely clear how you’re meant to differentiate between all the async replies:

When Auto Status Back (ASB) is enabled using GS a, the status
transmitted by GS r and the ASB status must be differentiated using.

(page 42)

Maybe some of the undefined bits are actually set. Who knows?

Anyway, all that remains is to transmit that back to CUPS. It’s broadly covered here.

BAH!

It didn’t work. It turns out the manual is wrong only in very specific circumstances. Fortunately, it seems that bit 5 is always set for the status command, so I could test for that.

So I stuffed the command stream with the proper status reports too and, well, guess what?

I just got a big old stream of zeros back from the printer. I could try the async reporting; that might work, but the printer has only a single sensor and stops running when it’s tripped. What I could do instead is watch for nothing changing for some time, and report that as a paper-out event.

This seems a bit hacky, and it is. I’m not all that surprised, though. This family of printers is mostly RS-232 based, with additional asynchronous status lines for paper, not USB. They’re also not expected to print vast amounts of data; receipts are usually a few pages at most of plain text. I expect these obscure paths haven’t been exercised much.

Oh yes, hacky. So, here’s the code; it’s pretty straightforward overall:

void wait_for_lines(const int lines_sent, int& read_back, int max_diff){
	using namespace std::literals;
	using namespace std::chrono;

	auto time_of_last_change = steady_clock::now();
	bool has_paper=true;

	for(;;){
		char buf;
		ssize_t bytes_read = cupsBackChannelRead(&buf, 1, 0.0);

		if(bytes_read > 0){
			read_back++;
			
			if(!has_paper){
				cerr << "STATE: -media-empty\n";
				cerr << "STATE: -media-needed\n";
				cerr << "STATE: -cover-open\n";
				cerr << "INFO: Printing\n";
			}
				
			has_paper = true;
			time_of_last_change = steady_clock::now();
		}
		else if(auto interval = steady_clock::now() - time_of_last_change; interval > 2500ms){
			cerr << "DEBUG: no change for " << duration_cast<seconds>(interval).count() << " seconds, assuming no paper\n";
			if(has_paper){
				cerr << "STATE: +media-empty\n";
				cerr << "STATE: +media-needed\n";
				cerr << "STATE: +cover-open\n";
				cerr << "INFO: Printer door open or no paper left\n";
			}
			has_paper = false;
		}

		cerr << "DEBUG: Lines sent=" << lines_sent << " lines printed=" << read_back << "\n";

		if(lines_sent - read_back <= max_diff)
			break;

		cerr << "DEBUG: buffer too full (" << lines_sent - read_back << "), pausing...\n";
		std::this_thread::sleep_for(100ms);
	}
}

I’ve gone for an all-inclusive approach with the messages. The printer cannot distinguish between the door being open and a lack of paper, so I’ve reported both.

It works!

The driver is now feature-complete, for a first version at any rate. There are some minor image quality problems in normal mode (caused by fast feeds before bitmaps) and a bit of stripiness caused by poor calibration in enhanced mode. And the plain text filter should probably be a proper filter that does status read-back and buffering. But it isn’t.

Adafruit mini thermal printer, part 2/3: CUPS and other vessels

I bought a printer and have blogged about it because it’s literally the most interesting thing ever.

Code on github: https://github.com/edrosten/adafruit-thermal-printer-driver. Note: I wrote these posts as I went along so there may be bugs in the code snippets which are fixed later. I recommend checking the GitHub source before using a snippet.

This post is about integrating with CUPS so I can print from normal programs.

Integrating with CUPS

So, I have a sort-of-working example of CUPS integration in the existing ZJ-58 driver. It is, I suspect, not very good. Nonetheless, I’ll start there, since it’s vastly easier starting from a working example than from the documentation.

Note from the future: The documentation…

It does exist, but it’s scattered over the various projects (those being PostScript, other Adobe printer guff, CUPS, the GhostScript interpreter, the Printer Working Group and so on). It’s the type of documentation where you can’t find anything, so you do 95% of the work the hard way, get stuck on an obscure API call/keyword/etc., and then that string turns up the documentation you needed at the beginning.

Anyway, here’s the install script from the existing driver:

#!/bin/bash

# Installs zj-58 driver
# Tested as working under Ubuntu 14.04

/etc/init.d/cups stop
cp rastertozj /usr/lib/cups/filter/
mkdir -p /usr/share/cups/model/zjiang
cp ZJ-58.ppd /usr/share/cups/model/zjiang/
cd /usr/lib/cups/filter
chmod 755 rastertozj
chown root:root rastertozj
cd -
/etc/init.d/cups start

That’s pretty simple: basically it dumps some files into the CUPS tree and restarts CUPS.

From what I understand, CUPS essentially has some sort of specification of various filter chains (which can vary based on the input; e.g. a plain text file, a PostScript file and a JPG will have different input filters). A given filter lists its accepted inputs, and CUPS works backwards to figure out how to generate what’s required. For raster things (i.e. not plain text, when the printer can accept plain text) CUPS will rasterise the input, and you then need to get it sent to the converter that turns it into the right control codes (mostly the topic of the previous post).

Much of this is controlled by a PPD file. This stands for “PostScript Printer Description” and tells CUPS all about the printer capabilities. It also has extensions beyond the Adobe PPD spec to allow you to specify rasterisation and filters for non-PostScript printers.

The driver of course comes with a PPD (it has to), but it’s long, complicated, has fragments of PostScript in it and doesn’t even pass the tests run by the cupstestppd command. And there’s a lot of duplicated information about page sizes, which I suspect needs to be consistent. Not great, but it’s a start.

So, while PPD is documented (or some approximation thereof) and the CUPS extensions are likewise, apparently you’re not really meant to write PPDs anyway. The easy/approved way is to write DRV files and then compile them into one or more PPD files using ppdc.
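
For reference, the compile step looks something like this (assuming the DRV file is called mini.drv; by default ppdc writes its output into a ppd/ subdirectory, which is why the install script below refers to ppd/mini.ppd):

ppdc mini.drv              # compiles to ppd/mini.ppd (the PCFileName given in the DRV)
cupstestppd ppd/mini.ppd   # check that the result actually validates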

Either way, the documentation is poor. There are lots of attributes in existing PPD and DRV files, like “Filesystem”, that it’s very hard to find any kind of documentation for, and others, like “PSVersion”, which are weakly documented (what is the acceptable range of values?).

Some information is here. On the subject of PSVersion, typing revision = into a GhostScript interpreter reveals that my machine (Ubuntu 18.04) has revision 926, for whatever that’s worth. Either way, it seems optional. Anyway, I’ve tried to pare my DRV file down to the absolute minimum which covers what I want, and I got this:

#include <font.defs>

DriverType custom  //Required I believe to set downstream filters 
ManualCopies Yes //Set to yes if the driver doesn't know how to print multiples of pages
Attribute "LanguageLevel" "" "3" //Default is 2 (from 1991), latest version is from 1997
Attribute "DefaultColorSpace" "" "Gray" //Self explanatory except does this mean something else can change it?
Attribute "TTRasterizer" "" "Type42" //Default is none, Type42 is the only extant useful one.
Filter application/vnd.cups-raster 0 rastertoadafruitmini //Arguments are datatype to feed to the filter, the expected CPU load, and the name of the filter executable
ColorDevice False

Font * //Include all fonts

// Manufacturer, model name, and version of the driver
Manufacturer "Adafruit"
ModelName "Mini"
Version 1.0
ModelNumber 579 //That's the product number on the website.

//I believe this allows users to specify custom sizes in the
//print dialog, or on the command line.
VariablePaperSize Yes
MinSize 58mm 5mm
MaxSize 58mm 1000mm

//#media creates media definitions which may or may not be used
//The paper is always 58mm wide, and have for now three different
//lengths
#media "58x50mm" 58mm 50mm
#media "58x100mm" 58mm 100mm
#media "58x200mm" 58mm 200mm

//The print area is always 48mm wide, centred
HWMargins 5mm 0 5mm 0

//This actually uses the media definitions above
*MediaSize "58x50mm"
MediaSize "58x100mm"
MediaSize "58x200mm"

// Supported resolutions
// Use as: Resolution colorspace bits-per-color row-count row-feed row-step name
// Apparently mostly the row stuff is 0 in most drivers. The last field
// (name) needs to be formatted correctly
*Resolution k 8 0 0 0 "208dpi/208 DPI"

// Name of the PPD file to be generated
PCFileName "mini.ppd"

OK, strictly speaking this isn’t the absolute minimum, since I’ve specified several virtual page sizes and variable-sized pages, which is how CUPS deals with roll media. Here’s the corresponding install shell script to dump things in the right place:

/etc/init.d/cups stop

mkdir -p /usr/share/cups/model/adafruit
install rastertoadafruitmini  /usr/lib/cups/filter/rastertoadafruitmini
install ppd/mini.ppd /usr/share/cups/model/adafruit/mini.ppd

/etc/init.d/cups start

Now, running that, going to http://localhost:631 and going through the motions shows the printer there with the options I’d expect (i.e. paper size). The printer device appears as “unknown” in CUPS, since it works as a USB parallel port (/dev/usb/lp0) but doesn’t report anything back to CUPS. Even with that, it won’t work yet, because I need, in no particular order:

  • Proper information logging to stderr in a format that CUPS likes
  • Deal with commandline arguments that CUPS hands me
  • Handle SIGTERM (used to cancel jobs) and not leave the printer in a bad state

In addition, you can add arbitrary choices to the driver which get passed on to the filter, so I think I’ll add ones for feeding paper after the job has done (so the end of the last page ends at the tear-off on the printer), auto-cropping pages (removing white space at the top and bottom; useful for roll media), and marking page boundaries. Because why not? I only have to implement them later.

Options are implemented using an option directive followed by a bunch of choice directives, e.g.

Option "TestOption" PickOne DocumentSetup 0
  *Choice "A" ""
  Choice "B" ""
  Choice "C" ""

You can have Boolean, PickOne or PickMany. I don’t really see the point of Boolean: all of them need to have choice directives (for reasons which will soon become clear), so there’s little difference between a Boolean and a PickOne with two options.

The only difference seems to be that it renders a Boolean as a radio group rather than a drop-down list in the web interface:

hmmm. I wonder…

OK, confirmed! You can have as many “boolean” choices as you like, though note that the troolean choices don’t appear in the print dialog boxes, whereas booleans appear as checkboxes. Neither the compiler nor the validator complained, which seems like a mild oversight.

With that silly aside out of the way, the next bit is how those options are passed to the printer driver. It turns out there are two ways, both of which are applied simultaneously.

The first is that the options are passed as a command-line argument to the filter, along with the PPD file (in the PPD environment variable). The CUPS API provides some handy functions for parsing PPD files and option strings and generally dealing with it.

The second is that each choice comes with an arbitrary snippet of PostScript code which is run at the point specified by the option directive (it can be at places like document setup or page setup). Now, PostScript has a setpagedevice command which basically accumulates a dictionary for device-specific use. The CUPS driver will put certain elements of that dictionary into the raster page headers, and you can access them from C in the filter. It doesn’t support arbitrary dictionaries; in fact what it has is:

unsigned cupsInteger[16];
float cupsReal[16];
char cupsString[16][64];

You can fill these up by putting appropriately named things into the dictionary, e.g.:

<</cupsInteger1 10 /cupsReal7 2.2 /cupsString3 (a string)>> setpagedevice

W00t! I just found the documentation (by searching for cupsInteger0 to see if it was 0-based or 1-based; it’s 0-based). Turns out there are loads of parameters you can pass this way. Many have “accepted” meanings but you can abuse them to pass arbitrary data since you control both sides.

The two choices are pretty much equivalent, so I’ll pick… uh. Ummm. OK, wow, I’m suffering from choice indecision here. OK, I’ll go for option 2. The API for option 1 is the usual annoying C faff; plus, apparently it’s been deprecated since 2012, and I don’t have a nice example of the new API to copy from.

Putting all that together, the code added to the DRV file looks like this:

//The last argument is the order in which the options are executed
//(each one comes with a snippet of code to execute at that point).
Option "PageFeed/Feed paper between pages" PickOne DocumentSetup 0
  *Choice "None" "<</cupsInteger0  0>>setpagedevice"
  Choice "1mm"   "<</cupsInteger0  1>>setpagedevice"
  Choice "2mm"   "<</cupsInteger0  2>>setpagedevice"
  Choice "5mm"   "<</cupsInteger0  5>>setpagedevice"
  Choice "10mm" "<</cupsInteger0 10>>setpagedevice"

Option "PageMark/Mark where to cut pages" Boolean DocumentSetup 1
  *Choice "No" "<</cupsInteger1 0>>setpagedevice"
  Choice "Yes" "<</cupsInteger1 1>>setpagedevice"

Option "EjectFeed/Feed paper after printing" PickOne DocumentSetup 2
  Choice "None"  "<</cupsInteger2  0>>setpagedevice"
  *Choice "5mm"  "<</cupsInteger2  5>>setpagedevice"
  Choice "10mm" "<</cupsInteger2 10>>setpagedevice"
  
Option "AutoCrop/Crop page to printed area" Boolean DocumentSetup 3
  *Choice "No" "<</cupsInteger3 0>>setpagedevice"
  Choice "Yes" "<</cupsInteger3 1>>setpagedevice"

The *’s indicate the default choices. And this so far appears to work! The web interface shows this:

And the print dialog in Firefox looks like this:

Sweet!

Writing a valid CUPS filter

This is actually documented reasonably well if you know where to look. I believe I can ignore all arguments (I’m using the other method for options, and I’ve told the driver I don’t know how to make copies myself) except the optional argv[6] which is the file to print if it’s not stdin. Yay.
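So, modulo error handling, the skeleton is something like this (my reading of the convention; consider it a sketch):

#include <cups/raster.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
	// argv[6], if present, is a file to print; otherwise read stdin (fd 0).
	int fd = 0;
	if (argc == 7)
		fd = open(argv[6], O_RDONLY);

	cups_raster_t *ras = cupsRasterOpen(fd, CUPS_RASTER_READ);
	//... read page headers and raster data, write to the printer on stdout ...
	cupsRasterClose(ras);
}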

Cancellation is easy: ignore SIGPIPE and clean up on SIGTERM. Since it’s a simple program, I can use a simple solution where I just poll a global variable:

#include <csignal>

volatile std::sig_atomic_t cancel_job = 0;
//...
	// SIGPIPE just means the other end went away; SIGTERM is how CUPS
	// tells the filter to cancel the job.
	std::signal(SIGPIPE, SIG_IGN);
	std::signal(SIGTERM, [](int){ cancel_job = 1; });
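The main loop then just polls the flag:

	// inside the per-line print loop:
	if (cancel_job)
		break; // stop sending data; CUPS does the rest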

Logging likewise is easy: you write something like TYPE: data to stderr, where TYPE is the message type. The types include ERROR, DEBUG and friends for logging, PAGE for recording the current page number, and STATE for indicating things like paper empty. The format of the data depends on the message type.
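For example (the message formats here are from the filter documentation as I understand it; page_number and feed_mm are hypothetical variables):

	cerr << "PAGE: " << page_number << " 1\n"; // finished a page, 1 copy
	cerr << "STATE: +media-empty-warning\n";   // raise a printer-state reason
	cerr << "STATE: -media-empty-warning\n";   // ...and clear it again
	cerr << "DEBUG: feeding " << feed_mm << "mm\n"; // ends up in the error_log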

Paper empty and so on can be queried from the printer using special control codes and CUPS looks like it has a way to read back anything returned. I’m not so sure how this works yet. I’ll deal with that later.

Dealing with options took me far too long. I started with the following code snippet:

#include <cups/raster.h>

	cups_raster_t *ras;
	cups_page_header2_t header;
	//...
	// Each page begins with a raster header; the values set with
	// setpagedevice in the PPD snippets should show up in cupsInteger[].
	while (cupsRasterReadHeader2(ras, &header))
	{
		feed_between_pages_mm = header.cupsInteger[0];
		mark_page_boundary    = header.cupsInteger[1];
		eject_after_print_mm  = header.cupsInteger[2];
		auto_crop             = header.cupsInteger[3];
		enhance_resolution    = header.cupsInteger[4];

and it didn’t really work. And by “didn’t work”, I mean that I tried adding -dcupsInteger0=1 to the GhostScript invocation (this sets an integer variable, which somehow magically winds up in setpagedevice; I don’t know how) and I could only set 0, 1 and 2. None of the other integers could be set.

If you cast your mind back to the first post in this series, I mentioned that I cargo-culted an invocation of GhostScript and wasn’t sure what everything did. Well, it came to bite me here. It has the innocuous looking argument -sMediaClass=PwgRaster (-s just sets a variable in the interpreter). This is now getting in quite deep. MediaClass is a variable which affects the setpagedevice command (page 21 of the PostScript® Language Reference Manual Supplement, published in 1996 on April 1, and it is deadly serious) in various nonspecific (vendor defined) ways. And one such vendor is the shadowy cabal known as the “Printer Working Group”, or PWG for short (it’s more exciting if they are a shadowy cabal). I sort of unearthed them by forlornly digging through cups/raster.h looking for clues and found this (edited for display):

// The following PWG 5102.4 definitions specify indices into the
// cupsInteger[] array in the raster header.
#  define CUPS_RASTER_PWG_TotalPageCount	0
#  define CUPS_RASTER_PWG_CrossFeedTransform	1
// etc...
#  define CUPS_RASTER_PWG_VendorLength		15

Turns out they have defined their own meanings for the user-defined extensions and brazenly took all of them. What I don’t understand is why I could set 0, 1 and 2, but not 3 onwards. No clues there. It also stopped cupsReal and cupsString from working and set PWG_AlternatePrimary to 2^24-1. ¯\_(ツ)_/¯

What went wrong

So that all sort of worked, and I can print out cats using lpr. Except…

Inverted cats. And junk

The cats come out inverted, like this:

meow!

This is because I had:

*Resolution k 8 0 0 0 "203dpi/203 DPI"

which is the “black” colour model. If I change the “k” to “w”, I get what I expect except with some junk at the top.

What I actually need is:

*ColorModel Gray/Grayscale w chunky 0
*Resolution - 8 0 0 0 "203dpi/203 DPI"

I don’t know why. The colour model specifies the white model (along with chunky, which means packed for colour data, and no compression), then the resolution says not to modify the colour model. OK, sure…

Nope!!

Turns out that wasn’t it; I must have just reset things when making that change. The junk was because… well, I don’t know exactly. It doesn’t appear on the first printout, only on the third, and if I send enough text to the printer then the next image is fine. It therefore appears as if something was getting flushed before the last line was complete: the first few bytes (including the start-bitmap control code) were getting eaten up finishing the previous line, and the rest was being printed out as text.

Turns out the offending bit was this function:

void printerInitialise(){
	cout << ESC << '\x40'; // ESC @, the initialise-printer command
}

(ESC being the escape byte, '\x1b', defined elsewhere.)

calls to which I had sprinkled liberally around, and these were messing things up. Here’s the funny thing though: putting a cout << flush after the first one fixed it. That ought to make sense: the printer receives data asynchronously and starts processing it while the UART fills the receive buffer; it processes the initialise command and loses the next few control codes. Or something.
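In other words:

	printerInitialise();
	cout << flush; // push ESC @ out to the printer before queueing any more data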

Except… the symptoms only manifested after several images, making it look like state being carried over. It’s weird, I don’t get it. Clearly there’s some internal state somewhere, and part of me thinks it might be in CUPS, because I suspect the original driver used to work just fine.

Page Sizes

The print dialog boxes seemed to get deeply confused about the smallest page size (58x50mm). The reason, it turns out, is that it’s really a landscape page, not a portrait one, and pages need to be specified in portrait orientation. Except that would make the width wrong. If I’d paid attention to the warnings from cupstestppd, I would not have had this problem:

ppd/mini.ppd: PASS
        WARN    Size "58x50mm" should be the Adobe standard name "50x58mmRotated".

And it turns out all I have to do is switch the name:

#media "50x58mmRotated" 58mm 50mm
#media "58x100mm" 58mm 100mm
#media "58x200mm" 58mm 200mm

HWMargins 5mm 0 5mm 0

*MediaSize "50x58mmRotated"
MediaSize "58x100mm"
MediaSize "58x200mm"

and things seem to be much more sensible.

Booleans

The print dialog renderers don’t really know which choice of a Boolean is meant to correspond to the box being checked. I tried changing the keywords to “True” and “False” and putting True first in the list, e.g.:

Option "PageMark/Mark where to cut pages" Boolean DocumentSetup 1
  Choice "True/Yes" "<</cupsInteger1 1>>setpagedevice"
  *Choice "False/No" "<</cupsInteger1 0>>setpagedevice"

That seemed to do the job. I believe it’s the ordering that matters, though I’m not sure.

Other stuff

There were a few other miscellaneous bits and bobs to fix too, and I implemented the various features I mentioned above. I also decided to emit runs of blank lines as a single paper feed rather than printing them as blank raster lines, because it’s a fair bit faster. Except I had to suppress that in enhanced resolution mode, because otherwise the first few lines printed after a gap came out too dark.
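As a sketch of that blank-line logic (lineIsBlank, feedPaper, printLine and pending_blank are hypothetical stand-ins for the real routines and state):

	// Accumulate runs of blank lines and emit them as one feed command;
	// in enhanced resolution mode print them normally, since the first
	// lines after a feed come out too dark otherwise.
	if (lineIsBlank(line) && !enhance_resolution)
		pending_blank++;
	else
	{
		if (pending_blank > 0)
			feedPaper(pending_blank); // one feed instead of N blank lines
		pending_blank = 0;
		printLine(line);
	}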

I also want the printer to print plain text as plain text. This isn’t necessary, but it’s always been idiomatic to pass text straight through like that rather than relying on the PostScript rasteriser. I can fix that with one extra line in the DRV file:

Filter text/plain 0 -

That tells CUPS that the driver accepts plain text (text/plain), that this costs nothing (the 0), and that it should use a null filter program (the -).

Cancellation

Oh wow this turned out to be hard. Way harder than expected because it reveals deep problems. It’s going to be a whole other blog post.

Result!

OK so basically it works!

I can print using lp (or lpr), and set options like -o Enhance=True -o PageMark=True and it obeys them.

Recognise this?