A new thermal printer (Part 1, USB exploration)

TL;DR

I missed having a thermal printer. Fortunately they are pretty easy to come by, since they’re commercial devices, so there’s a very healthy second hand market from liquidations, shop refits and that sort of thing. I found an SNBC BTP-R180 II (not identical in appearance to mine, but same model number). It has the following features:

• 80mm paper: the standards are 54 and 80, my previous one was a small one.
• 180dpi: I suspect this is wrong and it’s actually 203.
• Epson ESC/POS control codes: pretty much the standard, same as the Adafruit mini, and I have already written a driver.
• Ethernet, USB and RS-232!
• Automatic cutter!!
• 200mm per second print speed!!!

This post is in some sense an act of extreme motivated laziness. You see, the only available ethernet port nearby is broken (by which I mean I messed up the wiring 3 times in a row and flounced off), and the working ones are all the way upstairs. ALL the way. In hindsight as I’m writing this, I just realised I could have used the ethernet port on my laptop and made an additional network. But I didn’t so I tried to get it working over USB instead.

So I plugged in the printer and what happened was… a whole pile of nothing.

And by nothing, I mean I fired up http://localhost:631 and tried to add a printer, and it didn’t appear. For the Adafruit mini, it worked out of the box. On further inspection, there was nothing in `/dev`; I was hoping for `/dev/usblp0`, or maybe an `ACM0` or even a `ttyUSB1` (not sure what the existing 0 is assigned to). Checking the kernel logs gave this:

```usb 3-1: new full-speed USB device number 3 using xhci_hcd
usb 3-1: New USB device found, idVendor=154f, idProduct=154f
usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
```

And that’s it, no driver attached. The vendor ID indeed matches the vendor, and the product ID matches… the vendor also. My first thought was maybe no one had plumbed in the right IDs, and a bit of searching revealed you can now use a file in `/sys` to force this. No more reloading modules with weird argument strings. Apparently this feature was new in 2011. You can tell the driver to bind to a given vendor and product ID, which I did. First, force-load the module with `modprobe usblp`, then do:

```echo 154f 154f | sudo dd of=/sys/bus/usb/drivers/usblp/new_id
```

and replug the device. And that did… a whole pile of nothing. I also tried `cdc-acm`, which is often a stand-in for USB serial ports, and that at least gave an error in `dmesg`:

```usb 1-1.2: new full-speed USB device number 8 using ehci-pci
usb 1-1.2: New USB device found, idVendor=154f, idProduct=154f
usb 1-1.2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
cdc_acm 1-1.2:1.0: Zero length descriptor references
cdc_acm: probe of 1-1.2:1.0 failed with error -22
```

No luck. At least that shows that the `new_id` method was doing something. But the printer is resolutely not responding. I’m now 2 hours in and I have nothing more than a kernel log.

Time to break out libusb via pyusb and see what’s inside (note from the future: `sudo lsusb -v` can also do some of that). This is remarkably easy, because the documentation (which my code is copied from) is very good:

```import usb.core
dev = usb.core.find(idVendor=0x154f, idProduct=0x154f)
print(dev)
```

This prints out a bunch:

``````DEVICE ID 154f:154f on Bus 001 Address 015 =================
bLength                :   0x12 (18 bytes)
bDescriptorType        :    0x1 Device
bcdUSB                 :  0x110 USB 1.1
bDeviceClass           :   0xff Vendor-specific
bDeviceSubClass        :   0xff
bDeviceProtocol        :   0xff
bMaxPacketSize0        :   0x40 (64 bytes)
idVendor               : 0x154f
idProduct              : 0x154f
bcdDevice              :  0x1f0 Device 1.15
iManufacturer          :    0x0
iProduct               :    0x0
iSerialNumber          :    0x0
bNumConfigurations     :    0x1
CONFIGURATION 1: 2 mA ====================================
bLength              :    0x9 (9 bytes)
bDescriptorType      :    0x2 Configuration
wTotalLength         :   0x27 (39 bytes)
bNumInterfaces       :    0x1
bConfigurationValue  :    0x1
iConfiguration       :    0x0
bmAttributes         :   0xc0 Self Powered
bMaxPower            :    0x1 (2 mA)
INTERFACE 0: Vendor Specific ===========================
bLength            :    0x9 (9 bytes)
bDescriptorType    :    0x4 Interface
bInterfaceNumber   :    0x0
bAlternateSetting  :    0x0
bNumEndpoints      :    0x3
bInterfaceClass    :   0xff Vendor Specific
bInterfaceSubClass :   0xff
bInterfaceProtocol :   0xff
iInterface         :    0x0
ENDPOINT 0x2: Bulk OUT ===============================
bLength          :    0x7 (7 bytes)
bDescriptorType  :    0x5 Endpoint
bmAttributes     :    0x2 Bulk
wMaxPacketSize   :   0x40 (64 bytes)
bInterval        :    0x1
ENDPOINT 0x82: Bulk IN ===============================
bLength          :    0x7 (7 bytes)
bDescriptorType  :    0x5 Endpoint
bmAttributes     :    0x2 Bulk
wMaxPacketSize   :   0x40 (64 bytes)
bInterval        :    0x1
ENDPOINT 0x85: Bulk IN ===============================
bLength          :    0x7 (7 bytes)
bDescriptorType  :    0x5 Endpoint
bmAttributes     :    0x2 Bulk
wMaxPacketSize   :   0x40 (64 bytes)
bInterval        :    0x1
``````
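The endpoint addresses and attributes in that dump are bit fields, so they can be decoded by hand. Here is a minimal sketch, assuming the standard USB layout (bit 7 of the address is the direction, the low four bits are the endpoint number, and the low two bits of `bmAttributes` are the transfer type). The function name `describe_endpoint` is mine, not part of pyusb:

```python
# Decode endpoint fields as they appear in the descriptor dump above.
# describe_endpoint is a hypothetical helper, not a pyusb function.
TRANSFER_TYPES = {0: "Control", 1: "Isochronous", 2: "Bulk", 3: "Interrupt"}

def describe_endpoint(address, attributes):
    """Return (endpoint number, direction, transfer type)."""
    number = address & 0x0F                        # bits 0-3: endpoint number
    direction = "IN" if address & 0x80 else "OUT"  # bit 7 set: device to host
    transfer = TRANSFER_TYPES[attributes & 0x03]   # bits 0-1: transfer type
    return number, direction, transfer

# The three endpoints from the dump all have bmAttributes = 0x2
for addr in (0x02, 0x82, 0x85):
    print(hex(addr), describe_endpoint(addr, 0x02))
```

So `0x02` and `0x82` are the two directions of endpoint 2, and `0x85` is a separate input-only endpoint 5.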

Note the vendor and product IDs, which match, of course. There are also three endpoints, of which only one is an output. I’ll bet the output one is data for printing, one of the inputs is data readback and the other is some sort of status register. So, let’s try writing:

```import usb.core
dev = usb.core.find(idVendor=0x154f, idProduct=0x154f)

dev.set_configuration()
dev.write(0x02, "hello, world" + "\n"*10)
```
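Since the printer claims to speak ESC/POS, the same write can carry control codes as well as text. Below is a sketch of assembling a job as raw bytes, using the standard Epson codes `ESC @` (initialise) and `GS V 0` (full cut). Whether this particular printer honours them over this endpoint is an assumption I have not verified here, and `build_job` and `send` are names made up for illustration:

```python
# Standard ESC/POS control codes (Epson); support on this printer is assumed.
ESC_INIT = b"\x1b\x40"         # ESC @ : initialise the printer
GS_CUT_FULL = b"\x1d\x56\x00"  # GS V 0 : full cut

def build_job(text, feed_lines=10, cut=False):
    """Assemble the raw bytes for one print job (hypothetical helper)."""
    job = ESC_INIT + text.encode("ascii", errors="replace")
    job += b"\n" * feed_lines  # feed the paper past the tear bar / cutter
    if cut:
        job += GS_CUT_FULL
    return job

def send(job, vendor=0x154f, product=0x154f, endpoint=0x02):
    """Write a job to the Bulk OUT endpoint found in the dump above."""
    import usb.core  # imported here so build_job works without pyusb
    dev = usb.core.find(idVendor=vendor, idProduct=product)
    if dev is None:
        raise RuntimeError("printer not found")
    dev.set_configuration()
    dev.write(endpoint, job)

if __name__ == "__main__":
    print(build_job("hello, world", cut=True))
```

Keeping the byte assembly separate from the USB write means the job can be inspected (or sent over Ethernet or RS-232 instead) without the printer plugged in.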

What’s all this new-fangled DBus rubbish?

DBus has been one of the larger changes that swept through Linux in the last 20 years or so. I mostly (with the exception of a small amount of whinging) ignored it. Usually I’ve encountered it when it hasn’t worked or has got in the way of something or other. Naturally when things worked, I was generally unaware of it.

It’s basically:

• Async RPC system with call, response, exception and asynchronous push.
• Authentication (useful for talking to daemons running as root)
• Introspection (the set of methods, arguments etc are exposed via DBus)
• Central point for services to register objects and methods: you open a dbus connection as opposed to a random socket

Various tools exist to send and receive messages, e.g. Python libraries, shell commands via `dbus-send` and so on. I’ve nothing against RPC in general, but it essentially replaces shell scripts triggered by the kernel etc. with programs exchanging messages. You can do more with the latter (though often that isn’t necessary), but it replaces easily discoverable scripts with reading documentation, something the programming community is not well known for, and as such it has a rather higher barrier to entry.

It has also been used to replace simple libraries with relatively complex IPC requiring complex setup. But I won’t blame a table saw for injuries it causes through misuse. Anyway, this post is a nearly linear stream of me learning DBus with the mistakes and confusion removed.

The tools

I will be relying on the following:

• python dbus module `import dbus`
• The basic dbus commandline program: `dbus-send`
• Qt’s `qdbus` since it provides in many cases a nicer interface and introspection
• Gnome’s `gdbus` for more verbose introspection

The basics

I’m interested in manipulating the system, so I will be working with the system bus. There’s usually a system bus (for talking to OS related daemons) and a session bus (for all the programs in a logged in session). You can make more if you like but no one does.

In order to make an RPC call, you need:

1. A program to talk to, e.g. NetworkManager. This is called the bus name and usually looks like `org.freedesktop.NetworkManager`.
2. An object in the program. Each program exports a tree of objects, rooted at `/` and separated with forward slashes. Something like: `/org/freedesktop/NetworkManager`. Freedesktop likes redundant naming because it’s redundant and repeats things redundantly. They could equally well have exported `/`.
3. The interface and method name. Just like any OO system each object can present zero or more interfaces with methods.
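To make those three pieces concrete, here is a small sketch that assembles a `dbus-send` command line from them. The helper name `dbus_send_argv` is made up for illustration. The flags (`--system`, `--session`, `--print-reply`, `--dest=`) are real `dbus-send` options, and the example target is systemd's root object with the standard `org.freedesktop.DBus.Peer` interface:

```python
def dbus_send_argv(bus_name, object_path, interface, method, *args, system=True):
    """Build the argv for a dbus-send method call (hypothetical helper)."""
    argv = [
        "dbus-send",
        "--print-reply",
        "--system" if system else "--session",  # which bus to connect to
        "--dest=" + bus_name,                   # 1. the program (bus name)
        object_path,                            # 2. the object in that program
        interface + "." + method,               # 3. interface and method name
    ]
    argv.extend(args)  # typed arguments, e.g. "string:Ssid"
    return argv

argv = dbus_send_argv("org.freedesktop.systemd1", "/",
                      "org.freedesktop.DBus.Peer", "GetMachineId")
print(" ".join(argv))
```

The resulting list can be handed to `subprocess.run(argv)` on a machine with a system bus.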

We can examine this. First, `qdbus` will show the names present on the system bus. For example, on my system (I don’t know what the numbers prefixed with colons are) I have:

``````qdbus --system  | grep -v :
org.freedesktop.systemd1
org.freedesktop.resolve1
fi.epitest.hostap.WPASupplicant
fi.w1.wpa_supplicant1
org.freedesktop.NetworkManager
org.gnome.DisplayManager
org.freedesktop.ColorManager
org.freedesktop.Avahi
org.bluez
org.freedesktop.UPower
org.freedesktop.Accounts
org.freedesktop.RealtimeKit1
org.freedesktop.UDisks2
org.freedesktop.ModemManager1
org.freedesktop.bolt
org.freedesktop.PackageKit
org.freedesktop.PolicyKit1
org.freedesktop.DBus
``````

There’s a few one might recognise, Udisks for hotplug disks, NetworkManager for wifi control etc, ModemManager1 because this machine actually has a real actual physical 56k modem, and a few other well established ones like systemd. You can go further and query what’s on the bus. For example, I can query systemd using gdbus, which as you may recall is the verbose, detailed one. I’m going to query the root object to see what’s there:

``````$gdbus introspect  --system --dest org.freedesktop.systemd1   --object-path /
node / {
interface org.freedesktop.DBus.Peer {
methods:
Ping();
GetMachineId(out s machine_uuid);
signals:
properties:
};
interface org.freedesktop.DBus.Introspectable {
methods:
Introspect(out s data);
signals:
properties:
};
interface org.freedesktop.DBus.Properties {
methods:
Get(in  s interface,
in  s property,
out v value);
GetAll(in  s interface,
out a{sv} properties);
Set(in  s interface,
in  s property,
in  v value);
signals:
PropertiesChanged(s interface,
a{sv} changed_properties,
as invalidated_properties);
properties:
};
node org {
};
};
``````

If you’ve ever read IDLs of any sort this will look vaguely familiar. Systemd is exporting an object which presents three interfaces:

• org.freedesktop.DBus.Peer
• org.freedesktop.DBus.Introspectable
• org.freedesktop.DBus.Properties

and a subobject, `org`. Since gdbus doesn’t recurse by default, that’s all that’s displayed. The first of those interfaces has two methods, neither of which takes any input arguments, which makes them easy to call. So I shall ask systemd for its machine ID:

``````$dbus-send --print-reply --system --dest=org.freedesktop.systemd1 / org.freedesktop.DBus.Peer.GetMachineId
method return time=1632074720.055870 sender=:1.0 -> destination=:1.3006 serial=54374 reply_serial=2
string "73f867af39124a3583c288e620019332"
``````

and it responds. See how the bus name (`dest`), path (`/`) and interface.method (`org.freedesktop.DBus.Peer.GetMachineId`) are used. I can examine another common one. When given a bus name but no object, qdbus will recurse, showing the object tree:

``````$qdbus --literal --system org.freedesktop.NetworkManager
/
/org
/org/freedesktop
/org/freedesktop/NetworkManager
/org/freedesktop/NetworkManager/DnsManager
/org/freedesktop/NetworkManager/DHCP4Config
/org/freedesktop/NetworkManager/DHCP4Config/70
/org/freedesktop/NetworkManager/ActiveConnection
/org/freedesktop/NetworkManager/ActiveConnection/2
/org/freedesktop/NetworkManager/ActiveConnection/72
/org/freedesktop/NetworkManager/AccessPoint
/org/freedesktop/NetworkManager/AccessPoint/2686
/org/freedesktop/NetworkManager/Devices
/org/freedesktop/NetworkManager/Devices/3
/org/freedesktop/NetworkManager/Devices/2
/org/freedesktop/NetworkManager/Devices/1
/org/freedesktop/NetworkManager/Devices/25
/org/freedesktop/NetworkManager/Devices/4
/org/freedesktop/NetworkManager/AgentManager
/org/freedesktop/NetworkManager/Settings
/org/freedesktop/NetworkManager/Settings/8
/org/freedesktop/NetworkManager/Settings/7
/org/freedesktop/NetworkManager/Settings/6
/org/freedesktop/NetworkManager/Settings/5
/org/freedesktop/NetworkManager/Settings/4
/org/freedesktop/NetworkManager/Settings/3
/org/freedesktop/NetworkManager/Settings/2
/org/freedesktop/NetworkManager/Settings/1
/org/freedesktop/NetworkManager/Settings/24
/org/freedesktop/NetworkManager/Settings/9
/org/freedesktop/NetworkManager/IP6Config
/org/freedesktop/NetworkManager/IP6Config/3
/org/freedesktop/NetworkManager/IP6Config/204
/org/freedesktop/NetworkManager/IP6Config/203
/org/freedesktop/NetworkManager/IP6Config/6
/org/freedesktop/NetworkManager/IP4Config
/org/freedesktop/NetworkManager/IP4Config/3
/org/freedesktop/NetworkManager/IP4Config/204
/org/freedesktop/NetworkManager/IP4Config/203
/org/freedesktop/NetworkManager/IP4Config/6
``````

For reference, gdbus (non recursive; recursive is too verbose for this blog post) gives:

``````$gdbus introspect  --system --dest org.freedesktop.NetworkManager   --object-path /
node / {
node org {
};
};``````

The root node isn’t very interesting, it just has the child `org` and nothing else. qdbus will also give a more compact method view, for example, I can query one of the devices:

``````$qdbus --literal --system org.freedesktop.NetworkManager /org/freedesktop/NetworkManager/Devices/3
method QDBusVariant org.freedesktop.DBus.Properties.Get(QString interface_name, QString property_name)
method QVariantMap org.freedesktop.DBus.Properties.GetAll(QString interface_name)
signal void org.freedesktop.DBus.Properties.PropertiesChanged(QString interface_name, QVariantMap changed_properties, QStringList invalidated_properties)
method void org.freedesktop.DBus.Properties.Set(QString interface_name, QString property_name, QDBusVariant value)
method QString org.freedesktop.DBus.Introspectable.Introspect()
method QString org.freedesktop.DBus.Peer.GetMachineId()
method void org.freedesktop.DBus.Peer.Ping()
method void org.freedesktop.NetworkManager.Device.Delete()
method void org.freedesktop.NetworkManager.Device.Disconnect()
method QDBusRawType::a{sa{sv}} org.freedesktop.NetworkManager.Device.GetAppliedConnection(uint flags, qulonglong& version_id)
method void org.freedesktop.NetworkManager.Device.Reapply(QDBusRawType::a{sa{sv}} connection, qulonglong version_id, uint flags)
signal void org.freedesktop.NetworkManager.Device.StateChanged(uint new_state, uint old_state, uint reason)
signal void org.freedesktop.NetworkManager.Device.Wireless.AccessPointRemoved(QDBusObjectPath access_point)
method QList<QDBusObjectPath> org.freedesktop.NetworkManager.Device.Wireless.GetAccessPoints()
method QList<QDBusObjectPath> org.freedesktop.NetworkManager.Device.Wireless.GetAllAccessPoints()
signal void org.freedesktop.NetworkManager.Device.Wireless.PropertiesChanged(QVariantMap properties)
method void org.freedesktop.NetworkManager.Device.Wireless.RequestScan(QVariantMap options)
signal void org.freedesktop.NetworkManager.Device.Statistics.PropertiesChanged(QVariantMap properties)
``````

Yikes! There’s a lot there. Actually gdbus has the nice thing where it also queries properties for you. I recommend trying that. Anyway, if you read through carefully, you can see the GetAccessPoints method. If I call it, I get a list of access points:

``````$dbus-send --print-reply --system --dest=org.freedesktop.NetworkManager /org/freedesktop/NetworkManager/Devices/3 org.freedesktop.NetworkManager.Device.Wireless.GetAccessPoints
method return time=1632075788.446527 sender=:1.12 -> destination=:1.3041 serial=431030 reply_serial=2
array [
object path "/org/freedesktop/NetworkManager/AccessPoint/2686"
object path "/org/freedesktop/NetworkManager/AccessPoint/2699"
object path "/org/freedesktop/NetworkManager/AccessPoint/2700"
object path "/org/freedesktop/NetworkManager/AccessPoint/2701"
object path "/org/freedesktop/NetworkManager/AccessPoint/2702"
object path "/org/freedesktop/NetworkManager/AccessPoint/2703"
object path "/org/freedesktop/NetworkManager/AccessPoint/2704"
object path "/org/freedesktop/NetworkManager/AccessPoint/2706"
object path "/org/freedesktop/NetworkManager/AccessPoint/2708"
object path "/org/freedesktop/NetworkManager/AccessPoint/2710"
object path "/org/freedesktop/NetworkManager/AccessPoint/2711"
object path "/org/freedesktop/NetworkManager/AccessPoint/2712"
object path "/org/freedesktop/NetworkManager/AccessPoint/2713"
object path "/org/freedesktop/NetworkManager/AccessPoint/2714"
object path "/org/freedesktop/NetworkManager/AccessPoint/2715"
object path "/org/freedesktop/NetworkManager/AccessPoint/2716"
object path "/org/freedesktop/NetworkManager/AccessPoint/2717"
]
``````
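If you do want to consume a reply like that from a script, the object paths can be picked out with a regular expression. This is only a sketch: it assumes the `dbus-send --print-reply` text format shown above, and the sample is a shortened copy of that output rather than a live call:

```python
import re

# Shortened copy of the dbus-send reply shown above
SAMPLE_REPLY = '''method return time=1632075788.446527 sender=:1.12 -> destination=:1.3041 serial=431030 reply_serial=2
array [
object path "/org/freedesktop/NetworkManager/AccessPoint/2686"
object path "/org/freedesktop/NetworkManager/AccessPoint/2699"
object path "/org/freedesktop/NetworkManager/AccessPoint/2700"
]
'''

def object_paths(reply):
    """Return every object path mentioned in a dbus-send text reply."""
    return re.findall(r'object path "([^"]+)"', reply)

paths = object_paths(SAMPLE_REPLY)
for path in paths:
    print(path)
```

Scraping text like this is fragile, which is a good argument for using a proper library binding instead.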

This is not very shell friendly, but I can persist. For example, if I examine one of the APs, I get:

``````$gdbus introspect  --system --dest org.freedesktop.NetworkManager   --object-path /org/freedesktop/NetworkManager/AccessPoint/2714
node /org/freedesktop/NetworkManager/AccessPoint/2714 {
interface org.freedesktop.DBus.Properties {
methods:
Get(in  s interface_name,
in  s property_name,
out v value);
GetAll(in  s interface_name,
out a{sv} properties);
Set(in  s interface_name,
in  s property_name,
in  v value);
signals:
PropertiesChanged(s interface_name,
a{sv} changed_properties,
as invalidated_properties);
properties:
};
interface org.freedesktop.DBus.Introspectable {
methods:
Introspect(out s xml_data);
signals:
properties:
};
interface org.freedesktop.DBus.Peer {
methods:
Ping();
GetMachineId(out s machine_uuid);
signals:
properties:
};
interface org.freedesktop.NetworkManager.AccessPoint {
methods:
signals:
PropertiesChanged(a{sv} properties);
properties:
readonly ay Ssid = [0x48, 0x50, 0x2d, 0x50, 0x72, 0x69, 0x6e, 0x74, 0x2d, 0x33, 0x31, 0x2d, 0x4f, 0x66, 0x66, 0x69, 0x63, 0x65, 0x6a, 0x65, 0x74, 0x20, 0x36, 0x36, 0x30, 0x30];
};
};
``````

That looks interesting. The SSID, which is a byte array, is a property. The way to read properties is with the `Get` method of the `org.freedesktop.DBus.Properties` interface, so you have to call that. Note it takes two arguments, both strings: one is the interface name, the other is the property name. Querying AP number 2714 gives:

``````$dbus-send --print-reply --system --dest=org.freedesktop.NetworkManager /org/freedesktop/NetworkManager/AccessPoint/2714 org.freedesktop.DBus.Properties.Get string:org.freedesktop.NetworkManager.AccessPoint string:Ssid
method return time=1632076552.174223 sender=:1.12 -> destination=:1.3082 serial=431816 reply_serial=2
variant       array of bytes "HP-Print-31-Officejet 6600"
``````

Oh look, one of my neighbours has one of the worst printers ever made. I had one of those printers.
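As a sanity check, the raw byte array gdbus printed for `Ssid` decodes to the same string `dbus-send` displayed. A quick sketch with the bytes copied from the introspection output above (decoding as UTF-8 is my assumption; an SSID is strictly just a bag of bytes):

```python
# Byte values copied from the gdbus output for AccessPoint/2714
ssid_bytes = [0x48, 0x50, 0x2d, 0x50, 0x72, 0x69, 0x6e, 0x74, 0x2d,
              0x33, 0x31, 0x2d, 0x4f, 0x66, 0x66, 0x69, 0x63, 0x65,
              0x6a, 0x65, 0x74, 0x20, 0x36, 0x36, 0x30, 0x30]

# SSIDs are arbitrary bytes, so replace anything that isn't valid UTF-8
ssid = bytes(ssid_bytes).decode("utf-8", errors="replace")
print(ssid)
```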

Python provides a reasonable library for doing such things. Here’s some code which iterates over all available network interfaces and prints the SSID of whichever access points it finds:

```import dbus
bus = dbus.SystemBus()

obj = bus.get_object('org.freedesktop.NetworkManager', '/org/freedesktop/NetworkManager')
network_manager = dbus.Interface(obj, 'org.freedesktop.NetworkManager')

#Iterate over all devices
for device in network_manager.GetDevices():

    #Get the wireless interface for each device
    obj = bus.get_object('org.freedesktop.NetworkManager', device)
    wlan = dbus.Interface(obj, 'org.freedesktop.NetworkManager.Device.Wireless')

    #Note we don't get an error until we attempt to use the interface
    #I suspect there is a better way
    try:
        for ap_path in wlan.GetAccessPoints():
            obj = bus.get_object('org.freedesktop.NetworkManager', ap_path)
            ap_props = dbus.Interface(obj, 'org.freedesktop.DBus.Properties')
            ssid = ap_props.Get('org.freedesktop.NetworkManager.AccessPoint', 'Ssid')
            print(''.join([str(v) for v in ssid]))

    except dbus.exceptions.DBusException as e:
        pass

```

So that’s it for the basics. There’s a whole introspection API as well which I believe returns the structure as XML.

But to call methods, introspection isn’t needed.
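For what it’s worth, the XML that `Introspect` returns is easy to pick apart with the standard library. The sketch below uses a hand-written sample in the D-Bus introspection format, trimmed to resemble the systemd root object from earlier. The sample is my assumption of the shape, not captured output, and `summarise` is a made-up helper name:

```python
import xml.etree.ElementTree as ET

# Hand-written sample in the D-Bus introspection XML format (not real output)
SAMPLE_XML = """
<node>
  <interface name="org.freedesktop.DBus.Peer">
    <method name="Ping"/>
    <method name="GetMachineId">
      <arg name="machine_uuid" type="s" direction="out"/>
    </method>
  </interface>
  <interface name="org.freedesktop.DBus.Introspectable">
    <method name="Introspect">
      <arg name="data" type="s" direction="out"/>
    </method>
  </interface>
  <node name="org"/>
</node>
"""

def summarise(xml_text):
    """Return ({interface: [method names]}, [child node names])."""
    root = ET.fromstring(xml_text.strip())
    interfaces = {
        iface.get("name"): [m.get("name") for m in iface.findall("method")]
        for iface in root.findall("interface")
    }
    children = [node.get("name") for node in root.findall("node")]
    return interfaces, children

interfaces, children = summarise(SAMPLE_XML)
for name, methods in interfaces.items():
    print(name, methods)
print("children:", children)
```

This is roughly what qdbus and gdbus must be doing under the hood when they walk the object tree.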

Writing a service

Writing a service is pretty easy in Python. Here’s an example where dbus doesn’t steal the entire main loop. Note that this runs on the session bus, not the system one, because of security.

```from gi.repository import GLib
import time

import dbus
import dbus.service
import dbus.mainloop.glib

class Service(dbus.service.Object):
    def __init__(self):
        #Register dbus with GLib's main loop
        dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)

        #Bus name
        bus_name = dbus.service.BusName("com.hello.helloworld", dbus.SessionBus())

        #Object to export
        dbus.service.Object.__init__(self, bus_name, "/")

        self._count=0

    #Register two methods to the one interface
    @dbus.service.method("com.hello.helloworld.Message", in_signature='', out_signature='s')
    def get_message(self):
        self._count+=1
        return "Hello, world " + str(self._count)

    @dbus.service.method("com.hello.helloworld.Message", in_signature='i')
    def set_counter(self, i):
        self._count = i

    #A way of polling the main loop so that GLib doesn't
    #steal the program's main loop
    def poll(self):
        loop = GLib.MainLoop()
        def quit():
            loop.quit()
        #Assumed missing step: schedule quit() so that run() returns
        #once the pending events have been handled
        GLib.idle_add(quit)
        loop.run()

#Instantiate the service
service = Service()

#Poll it
while True:
    service.poll()
    time.sleep(.1)
```

And it works:

``````~ $qdbus --session com.hello.helloworld / get_message
Hello, world 1
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 2
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 3
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 4
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 5
~ $qdbus --session com.hello.helloworld / set_counter 0

~ $qdbus --session com.hello.helloworld / get_message
Hello, world 1
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 2
``````

qdbus is often a lot less verbose to use!

Writing a system service

If you try to run with the system bus, it will fail, even if you run as root. This is because of the security policy in place. The policies allow non-root daemons to run as system services, and no one special cased it to make root ones always allowed, which is fine.

The policies are in `/etc/dbus-1/system.d/` and they’re sort of understandable, but only sort of. It is documented: essentially it’s default deny with allow/deny rules applied top to bottom based on a matching scheme. In order to run the service as root, the following file works:

```<!DOCTYPE busconfig PUBLIC
"-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
"http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
<busconfig>

<!-- Only root can own the service -->
<policy user="root">
<allow own="com.hello.helloworld"/>
</policy>

<policy context="default">
<allow send_destination="com.hello.helloworld"/>
</policy>
</busconfig>
```

Essentially this allows only root to own the service, but allows anyone (the default context) to send messages. Once the file is created, you then need to reload DBus, which can be done with SIGHUP or with `systemctl reload dbus` on systemd based systems.
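If the XML is hard to read at a glance, the rules can be listed mechanically. This is a rough sketch that parses the same policy (with the DOCTYPE left out for brevity) and prints who is allowed to do what. It only lists the rules and makes no attempt to evaluate them the way the bus daemon does, and `list_rules` is a made-up helper name:

```python
import xml.etree.ElementTree as ET

# The policy from above, minus the DOCTYPE declaration
POLICY = """<busconfig>
  <policy user="root">
    <allow own="com.hello.helloworld"/>
  </policy>
  <policy context="default">
    <allow send_destination="com.hello.helloworld"/>
  </policy>
</busconfig>"""

def list_rules(xml_text):
    """Return (who, allow/deny, what, value) for every rule, in file order."""
    rules = []
    for policy in ET.fromstring(xml_text).findall("policy"):
        # The policy's attributes say who it applies to, e.g. user=root
        who = ", ".join(f"{k}={v}" for k, v in policy.attrib.items())
        for rule in policy:
            for attr, value in rule.attrib.items():
                rules.append((who, rule.tag, attr, value))
    return rules

rules = list_rules(POLICY)
for rule in rules:
    print(*rule)
```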

Starting a system service with systemd

I could do it manually, but perhaps I should bend to the winds of change.

This is simple, for this rather basic service. First, add `#!/usr/bin/env python3` to the first line of the script and make it executable. Then copy it to `/opt/helloservice/service.py`. Then add a very simple unit file to `/etc/systemd/system/hello.service`:

``````[Unit]
Description=Hello world service

[Service]
ExecStart=/opt/helloservice/service.py

[Install]
WantedBy=multi-user.target``````

Is it me or is it weird that the old Windows 3 style .INI files have become a perverse sort of standard?

Anyway, now a simple:

``````sudo systemctl daemon-reload
sudo systemctl start hello``````

starts the service. And to enable it on boot:

``sudo systemctl enable hello``

which apparently symlinks it in a directory.

And that’s it

A new system service, running as root, controllable by a user. The purpose is to have some NeoPixels on a Raspberry Pi controlled as a user (the pixels can only run as root due to the need to access low level hardware).

Undefined behaviour and nasal demons, or, do not meddle in the affairs of optimizers

Discussions on undefined behaviour often degenerate into vague mumblings about “nasal demons”. This comes from a 1992 usenet post on comp.lang.c, where the poster “quotes” the C89 standard (this is oddly hard to find now) as:

“1.6 Definitions of Terms
In this standard, … “shall not” is to be interpreted as a prohibition
* Undefined behavior — behavior, upon use of a nonportable or erroneous program construct, … for which the standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to having demons fly out of your nose.”

John F. Woods

Essentially what he means is that the compiler is allowed to do anything, including things which seem completely nonsensical. Usually you feel you have a handle on what sort of things the compiler might do, but sometimes it can do some very unexpected things. Of course they make sense in the compiler’s internal logic and are not incorrect, but it can take a long time to figure out what line of reasoning led the compiler to its decision.

I have a nice example which was investigated and distilled down to a minimal example by my friend David McCabe who kindly allowed me to post it here (I’ve changed it slightly). The example involves accidental memory scribbling in the context of derived classes (so the vtable pointers get scribbled over) where the compiler can inline enough to “see” the scribbling and then reason about it.

Here is the code snippet (and Godbolt link to peruse):

```#include <cstring>

void e();
void f();
void g();

class A {
public:
virtual ~A() = default;
};

class B : public A {
public:
virtual void x() {f();};
};

void r(A* a) {
B c;
std::memcpy(static_cast<void*>(a), static_cast<const void*>(&c), sizeof(A));
}

int main() {
A a;
r(&a);
e();
((B*)&a)->x();
g();
}
```

So, what’s going on, or what’s supposed to be going on? And by “supposed”, I mean what would one expect a compiler with no optimizations to most likely do, given what I know about how C++ code translates to unoptimized assembly.

So, we have a base class with a vtable, `A`, and in main create an instance `a`. Note that `A` and the derived class `B` happen to be the same size (the size of the vtable pointer). The function `r` is the memory scribbler: it creates an instance of `B` and `memcpy()`s it over `a`. In an intuitive way, the instance `a` has been turned into an instance of `B`, because it’s had its vtable pointer changed to a `B` vtable pointer. This is really the only thing at a low level which actually causes these types to be different, so if you were to cast a pointer to `a` to a `B` pointer, it should now work as a `B` and you can make the virtual function call `x` on it.

And that’s what main does. The only purpose of `e`, `f`, and `g` is to place markers in the generated code so we can see what’s run.

Now, scribbling over vtables is wildly, ridiculously undefined, and nasal demons may fly (spoiler alert: they do). To run it, also add some stubs for e, f and g in a separate source file (so the optimizer can’t see into them) like this:

```#include <iostream>
void e(){ std::cerr << "e\n"; }
void f(){ std::cerr << "f\n"; }
void g(){ std::cerr << "g\n"; }
```

then compile and run it (first without optimization), and it prints:

```e
f
g
```

which is what you might expect. But what about with optimizations? On my machine (gcc 7.5.0), if I compile the main file with -O3 and the stubs with no optimization, it prints out:

```e
e
⋮
<repeats for a total of 37401 e's>
⋮
e
Segmentation fault (core dumped)
```

These are the nasal demons. How on earth can the optimizer have turned that code into a loop and a segfault? It turns out it makes sense, but only after a lot of investigation. But first I’m going to consider some other cases before getting on to gcc with -O3. The Godbolt link does nice colourisation so you can see which program lines match to which assembly lines.

First, here’s what unoptimized GCC 7.5.0 does (note I’ve snipped a lot of the setup and teardown and so on):

```        lea     rax, [rbp-24]
mov     rdi, rax
call    r(A*)
call    e()
lea     rax, [rbp-24]
mov     rax, QWORD PTR [rax]
mov     rax, QWORD PTR [rax]
lea     rdx, [rbp-24]
mov     rdi, rdx
call    rax
call    g()
lea     rax, [rbp-24]
mov     rdi, rax
call    A::~A() [complete object destructor]
```

The most relevant lines are, in order, a call to `r`, a call to `e`, an indirect call via the vtable, and then a call to `g`. It then calls the destructor for `A`, not `B`. The call is non-virtual because `a` is an instance, not a pointer, so it ignores the scribbled vtable and calls the “wrong” destructor.

I’m now going to look at what other compilers do first because GCC 5 and newer definitely do the most unexpected thing. First, the last pre-5 version on Godbolt, GCC 4.9.4 with -O3:

```main:
sub     rsp, 40
mov     QWORD PTR [rsp+16], OFFSET FLAT:vtable for B+16
mov     QWORD PTR [rsp], OFFSET FLAT:vtable for B+16
call    e()
mov     rax, QWORD PTR [rsp]
mov     rdi, rsp
call    [QWORD PTR [rax+16]]
call    g()
xor     eax, eax
ret
```

I’ve stripped out everything except `main()`. Note it’s inlined both `r()` and the call to `memcpy`. On line 3, it’s creating the concrete `B` and copying in the vtable; on the following line, it’s `memcpy`ing that over `a`, but it’s noticed it can take the data from the source. Then it calls `e`, makes a virtual call (which we know will be `f`) and then calls `g`. It inlines and removes the trivial `~A()`, tears down `main()` and leaves.

So this program behaves as “expected”, it would print out e, f, g like the unoptimized one. MSVC 19.14 with /Ox does the same:

```main    PROC
$LN19:
sub     rsp, 40                             ; 00000028H
lea     rax, OFFSET FLAT:const B::`vftable'
mov     QWORD PTR a$[rsp], rax
call    void e(void)                         ; e
mov     rax, QWORD PTR a$[rsp]
lea     rcx, QWORD PTR a$[rsp]
call    QWORD PTR [rax+8]
call    void g(void)                         ; g
xor     eax, eax
ret     0
main    ENDP
```

Using 19.28 and compiling for x86 not x64 has an extra flourish where it doesn’t make an indirect call, instead it checks if it has a B vtable and makes a direct call. But that’s the same in essence since it’s still deciding based on the vtable:

```_main   PROC
push    ecx
mov     DWORD PTR _a$[esp+4], OFFSET const B::`vftable'
call    void e(void)                         ; e
mov     eax, DWORD PTR _a$[esp+4]
mov     eax, DWORD PTR [eax+4]
cmp     eax, OFFSET virtual void B::x(void)      ; B::x
jne     SHORT $LN6@main
call    void f(void)                         ; f
call    void g(void)                         ; g
xor     eax, eax
pop     ecx
ret     0
$LN6@main:
lea     ecx, DWORD PTR _a$[esp+4]
call    eax
call    void g(void)                         ; g
xor     eax, eax
pop     ecx
ret     0
```

Clang on the other hand will apply its powerful optimizer to this case:

```main:                                   # @main
push    rax
call    e()
call    f()
call    g()
xor     eax, eax
pop     rcx
ret
```

It’s likely using dataflow, so I am guessing it follows the provenance of the pointer, finds it’s been copied from the `B` vtable and then devirtualises. Note that if you put in destructors it will call `~A()` not `~B()` because like GCC, it never makes a virtual call in the first place. So far so sort-of sensible. You can nod approvingly at the power of clang’s optimizer but not feel pessimistic about VS2017 just punting on that and doing the “obvious” thing (except on x86, where it makes the same deduction but is much more conservative with its actions).

But what about GCC? This is present in all versions from 5 onwards, and I’m picking 7.5 (my machine version, though 10.2, the latest on Godbolt, does the same) with -O3 and it does:

```main:
sub     rsp, 8
call    e()
```

That’s it. The whole of `main()`. It calls `e()`, then nothing. No tear down, no return, nothing, it just falls off the end into whatever code happens to be lying there. This is the source of the nasal demons.

Undefined behaviour is not allowed, according to the standard. Therefore, according to GCC, it does not happen. The compiler goes one step further than clang: it notices that the line after `e()` uses the wrong vtable, which is undefined behaviour, so it deduces that the line is never reached. The only way for that to happen is if `e()` never returns, therefore it marks the code after `e()` as unreachable and deletes it.

When the standard says “undefined”, the standard means it, and the compiler is allowed to reason backwards in time to make optimizations with the assumption that the program is valid. This is very unintuitive but entirely legal and part of a very powerful optimization pass.

This isn’t the compiler being, as some people feel, perversely pedantic just to mess with the unwary programmer.

It’s really handy: GCC can take some inlined code and figure out that a pointer is non-null based on how it’s used, and can then, say, travel back through the code with that knowledge and remove all the tests for nullness and alternative branches based on such tests making the inlined code both faster and more compact. So you write your code to be generic, and GCC gives you an extra-fast, extra compact version when it finds a special use case. It’s exactly the sort of thing a person might do, but it’s automatic and woe betide the person who violates the preconditions.
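To make that a bit more concrete, here is a minimal sketch of the pattern. The function names are made up for illustration. The helper is written defensively and tests for null; the caller dereferences the pointer first, so once the helper is inlined an optimizer is entitled to assume the test can never fire and drop it. Whether a particular compiler version actually does so depends on flags and inlining decisions, so treat this as an illustration of the reasoning rather than a guarantee.

```cpp
#include <cstddef>

// Generic helper: written defensively, so it tests for null.
inline std::size_t length_or_zero(const char* s){
    if(s == nullptr)          // redundant in the caller below
        return 0;
    std::size_t n = 0;
    while(s[n] != '\0')
        n++;
    return n;
}

// Special use case: s is dereferenced before the helper is called.
// Dereferencing a null pointer is undefined, so the compiler may assume
// s is non-null from here on and delete the test after inlining.
std::size_t first_char_plus_length(const char* s){
    const unsigned char first = static_cast<unsigned char>(s[0]);
    return first + length_or_zero(s);
}
```

The flip side is the woe-betide part: call `first_char_plus_length(nullptr)` and the check inside the helper will not save you.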

So the last question is why this specific behaviour? Why does it print out many lines and then quit? Well, here’s an objdump of the relevant part of the resulting executable:

```int main() {
8d0:	48 83 ec 08          	sub    $0x8,%rsp
A a;
r(&a);
e();
8d4:	e8 51 01 00 00       	callq  a2a <e()>
8d9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

00000000000008e0 <_start>:
8e0:	31 ed                	xor    %ebp,%ebp
8e2:	49 89 d1             	mov    %rdx,%r9
8e5:	5e                   	pop    %rsi
8e6:	48 89 e2             	mov    %rsp,%rdx
8e9:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
8ed:	50                   	push   %rax
8ee:	54                   	push   %rsp
8ef:	4c 8d 05 8a 02 00 00 	lea    0x28a(%rip),%r8        # b80 <__libc_csu_fini>
8f6:	48 8d 0d 13 02 00 00 	lea    0x213(%rip),%rcx        # b10 <__libc_csu_init>
8fd:	48 8d 3d cc ff ff ff 	lea    -0x34(%rip),%rdi        # 8d0 <main>
904:	ff 15 d6 16 20 00    	callq  *0x2016d6(%rip)        # 201fe0 <__libc_start_main@GLIBC_2.2.5>
90a:	f4                   	hlt
90b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)

```

Look what happens to sit right after `main()`: it’s `_start`, which is the program’s real entry point (the code the operating system jumps to on launch, and which eventually calls `main()`). So after falling off the end of main, it runs into that and so simply restarts the program entirely from scratch. The program then makes a call to main which never returns because it runs off the end… and you get the idea. Eventually the program exhausts the stack and segfaults. If you change the optimization settings of the file with the stubs in, the behaviour changes, because a different piece of code gets randomly wandered into.

I thought this was fascinating: I know that undefined behaviour can in principle do some very strange things but it’s interesting to see an example where it really does. I would never have predicted infinite recursion as an outcome. So don’t mess with undefined behaviour: one day the compiler might really make demons fly from your nose.

Adafruit mini thermal printer, part 3/3: Long jobs, cancellation and paper out

Writing a printer driver from scratch is quite involved. Who knew?

Code on github: https://github.com/edrosten/adafruit-thermal-printer-driver. Note: I wrote these posts as I went along so there may be bugs in the code snippets which are fixed later. I recommend checking the GitHub source before using a snippet.

This post appears to be about three unrelated things but it isn’t. It’s all about reading back data from the printer.

Cancellation

So, cancellation works inasmuch as things stop printing. Except none of the end of job stuff gets printed (the “cancelled” message and the paper eject). First I thought it was because I was lazy, so I changed the signal handler to:

```	{
struct sigaction int_action;
memset(&int_action, 0, sizeof(int_action));
int_action.sa_handler = [](int){
cancel_job = 1;
};
sigaction(SIGTERM, &int_action, nullptr);
}
```

This is the approved method, since `signal()` is ill-specified in general: on some systems (and with the raw Linux system call) entering the handler resets it to the default (terminate). I thought maybe that was happening. Do you think this worked?

The next step was to add `LogLevel debug` to `/etc/cups/cupsd.conf`, so it records all my debug messages. It does, along with a bunch of other useful stuff, and it’s all indexed by the print job number. A filtered log looks like this:

```D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[25]="PRINTER=pl"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[26]="PRINTER_STATE_REASONS=none"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[27]="CUPS_FILETYPE=document"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[28]="FINAL_CONTENT_TYPE=application/vnd.cups-raster"
D [29/Dec/2019:14:21:18 +0000] [Job 184] envp[29]="AUTH_INFO_REQUIRED=none"
D [29/Dec/2019:14:21:19 +0000] [Job 184] Start rendering...
D [29/Dec/2019:14:21:19 +0000] [Job 184] Set job-printer-state-message to "Start rendering...", current level=INFO
D [29/Dec/2019:14:21:19 +0000] [Job 184] Processing page 1...
D [29/Dec/2019:14:21:19 +0000] [Job 184] Set job-printer-state-message to "Processing page 1...", current level=INFO
D [29/Dec/2019:14:21:19 +0000] [Job 184] PAGE: DEBUG: Read 2 bytes of print data...
D [29/Dec/2019:14:21:19 +0000] [Job 184] 1 1
D [29/Dec/2019:14:21:19 +0000] [Job 184] bitsperpixel 8
D [29/Dec/2019:14:21:19 +0000] [Job 184] BitsPerColor 8
D [29/Dec/2019:14:21:19 +0000] [Job 184] Width 384
D [29/Dec/2019:14:21:19 +0000] [Job 184] Height799
D [29/Dec/2019:14:21:19 +0000] [Job 184] feed_between_pages_mm 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] mark_page_boundary 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] eject_after_print_mm 10
D [29/Dec/2019:14:21:19 +0000] [Job 184] auto_crop 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] enhance_resolution DEBUG: Wrote 2 bytes of print data...
D [29/Dec/2019:14:21:19 +0000] [Job 184] 0
D [29/Dec/2019:14:21:19 +0000] [Job 184] Feeding 155 lines
D [29/Dec/2019:14:21:19 +0000] [Job 184] Feeding 47 lines
```

It has the outputs from various filters all mixed together, possibly with some race conditions… (can you spot them?). Anyway, the cancel message is coming through and getting processed correctly. But no output is happening.

Debugging this was tricky because there were several causes. What I eventually did was add a 100ms pause between lines in order to reduce the amount of paper wasted and that revealed something interesting. One case was simply that sometimes the heating level was too low and the text was invisible.

In the other case, I’m just not sure. If the buffer is too full, then the last bits of the job seem to get “lost” somehow, if a cancellation occurs. With a 100ms pause, I always get the cancellation message. If I make the pause shorter then the printer can’t keep up and after a while the buffers all become full. In that case, I get cancellation messages if done early (when the buffers aren’t yet full) but not late.

I don’t yet know if long jobs get truncated. I suspect that the same would happen because there appears to be nothing functionally different between cancellation and normal termination. I don’t know who is responsible for this, but I’d be surprised if it was CUPS. My guess is no one has ever tested printing large amounts of full page bitmaps on this printer simply because that’s not the intended use. Speaking of not the intended use…

Abuse of paper sensors

As far as I can tell there isn’t an obvious way to query the buffer status to avoid it getting too full. I don’t even know where the buffer is. I expect the USB system has one, as does the USB chip and the UART on the printer.

But the printer does have a “Transmit Status” command (Page 42), for which the manual warns that there may be a lag since it’s processed in sequence. Even worse/better, it says you can’t use this one to detect paper out because once the paper ends, the printer goes offline and won’t execute the command (I expect the paper sensor status command may be more asynchronous). That appears to be untrue, though. I tried it with the following code:

```exec 3<> /dev/usb/lp0
echo -ne '\x1dr1' >&3
dd bs=1 count=1 status=none <&3 | od -td1
```

And I got back 0 with paper in and 12 with the door open.
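For what it’s worth, here is how I’d decode that byte in the driver. This is a sketch under an assumption: that the reply follows the usual ESC/POS layout for the paper sensor status, where bits 2 and 3 (which add up to 12) mean “paper end”. That matches the two values I observed, but I haven’t verified the other bits, and the names are just ones I picked.

```cpp
#include <cstdint>

// Assumed layout of the "GS r 1" reply: bits 2 and 3 are set together
// (value 12) when the paper end sensor reports no paper.
constexpr std::uint8_t paper_end_bits = 0x0C;

// True if the status byte says there is no paper. On this printer an
// open door gives the same reading.
constexpr bool paper_missing(std::uint8_t status){
    return (status & paper_end_bits) == paper_end_bits;
}
```

So `paper_missing(0)` is false and `paper_missing(12)` is true, which lines up with the paper in and door open readings.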

That apparently useless synchronous mechanism may be just the ticket: I bet if I stuff the command stream with these then I can get an approximation of the number of lines printed. The code looks something like this:

```void transmit_status(){
cout << GS << "r1" << flush;
}

void wait_for_lines(const int lines_sent, int& read_back, int max_diff){
for(;;){
char buf;

break;

cerr << "DEBUG: buffer too full (" << lines_sent - read_back << "), pausing...\n";
using namespace std::literals;
}
}

// ... and in the main print loop...
//Stuff requests for paper status into the command stream
//and count the returns. We allow a gap of 80 lines (1cm of printing)
transmit_status();
lines_sent++;
```

Checking the print logs shows this does what is expected. Furthermore, cancellation works properly (it prints the cancelled message and ejects the job) and is pretty quick!
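Stripped of the printer specifics, the throttling idea is small enough to sketch on its own. This is not the driver code: the `poll` callback is a hypothetical stand-in for whatever reads status bytes back from the printer, and the 10ms pause is an arbitrary choice. The assumption is that one status byte comes back per line printed.

```cpp
#include <chrono>
#include <functional>
#include <iostream>
#include <thread>

// Hypothetical hook: returns how many status bytes arrived since the last call.
using StatusReader = std::function<int()>;

// Block until at most max_diff lines have been sent but not yet acknowledged.
void wait_for_lines_sketch(int lines_sent, int& read_back, int max_diff,
                           const StatusReader& poll){
    using namespace std::literals;
    for(;;){
        read_back += poll();   // one status byte back == one line printed
        if(lines_sent - read_back <= max_diff)
            return;
        std::cerr << "DEBUG: buffer too full (" << lines_sent - read_back
                  << "), pausing...\n";
        std::this_thread::sleep_for(10ms);
    }
}
```

With 100 lines sent, a limit of 80 and a reader that reports 5 bytes per poll, this returns after four polls with `read_back` at 20.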

Paper out!

OK, so I’m already reading the paper status. As I mentioned, the manual suggests I might not be able to, but I’m reading it before/after every line, so I think I’m safe. Besides, it’s not entirely clear how you’re meant to differentiate between all the async replies:

When Auto Status Back (ASB) is enabled using GS a, the status
transmitted by GS r and the ASB status must be differentiated using.

(page 42)

Maybe some of the undefined bits are actually set. Who knows?

Anyway, all that remains is to transmit that back to CUPS. It’s broadly covered here.

BAH!

It didn’t work. Turns out the manual is only not right in very specific circumstances. Fortunately it seems for the status command bit 5 is always set so I could test for that.

So I stuffed the command stream with the proper status reports too and, well, guess what?

I just got a big old stream of zeros back from the printer. I could try the async reporting. That might work, but the printer has only a single sensor and stops running when it’s tripped. What I could do is see if nothing has changed for some time and report that as a paper out event.

This seems a bit hacky and it is. I’m not all that surprised though. This family of printers is mostly RS-232 based, with additional asynchronous status lines for paper, not USB. They’re also not expected to print out vast amounts of data; receipts are usually a few pages at most of plain text. I expect these obscure paths haven’t been exercised much.

Oh yes, hacky. So, here’s the code, it’s pretty straightforward overall:

```void wait_for_lines(const int lines_sent, int& read_back, int max_diff){
using namespace std::literals;
using namespace std::chrono;

bool has_paper=true;

for(;;){
char buf;

if(!has_paper){
cerr << "STATE: -media-empty\n";
cerr << "STATE: -media-needed\n";
cerr << "STATE: -cover-open\n";
cerr << "INFO: Printing\n";
}

has_paper = true;
}
else if(auto interval = steady_clock::now() - time_of_last_change; interval > 2500ms){
cerr << "DEBUG: no change for " << duration_cast<seconds>(interval).count() << " seconds, assuming no paper\n";
if(has_paper){
cerr << "STATE: +media-empty\n";
cerr << "STATE: +media-needed\n";
cerr << "STATE: +cover-open\n";
cerr << "INFO: Printer door open or no paper left\n";
}
has_paper = false;
}

cerr << "DEBUG: Lines sent=" << lines_sent << " lines printed=" << read_back << "\n";

break;

cerr << "DEBUG: buffer too full (" << lines_sent - read_back << "), pausing...\n";
}
}

```

I’ve gone for an all inclusive approach with the messages. The printer cannot distinguish between the door being open and a lack of paper, so I’ve reported both.
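The timeout logic is easier to see (and to test) when it’s pulled away from the I/O. Here’s a minimal sketch along those lines. The class and its interface are hypothetical and not what the driver does; time is passed in so that it can be exercised without waiting around, and it returns the `STATE:` lines rather than writing them to stderr.

```cpp
#include <chrono>
#include <string>
#include <vector>

// Hypothetical helper: guesses whether there is paper purely from whether
// status bytes keep arriving from the printer.
class PaperWatchdog{
    using clock = std::chrono::steady_clock;
    clock::duration timeout;
    clock::time_point last_change;
    bool paper = true;

public:
    PaperWatchdog(clock::duration timeout_, clock::time_point now)
    :timeout(timeout_), last_change(now){}

    bool has_paper() const { return paper; }

    // Call once per poll. `progress` says whether any status byte came back.
    // Returns the lines a filter would write to stderr (possibly none).
    std::vector<std::string> update(bool progress, clock::time_point now){
        std::vector<std::string> messages;
        if(progress){
            last_change = now;
            if(!paper)
                messages = {"STATE: -media-empty", "STATE: -media-needed", "STATE: -cover-open"};
            paper = true;
        }
        else if(paper && now - last_change > timeout){
            messages = {"STATE: +media-empty", "STATE: +media-needed", "STATE: +cover-open"};
            paper = false;
        }
        return messages;
    }
};
```

Each state change is reported exactly once, so CUPS doesn’t get spammed while the printer sits there with its door open.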

It works!

The driver is now feature complete, for a first version at any rate. There are some minor image quality problems in normal mode (caused by fast feeds before bitmaps) and a bit of stripiness caused by poor calibration in enhanced mode. And the plain text filter probably should be a proper filter that does status read back and buffering. But it isn’t.

Adafruit mini thermal printer, part 2/3: CUPS and other vessels

I bought a printer and have blogged about it because it’s literally the most interesting thing ever.

Code on github: https://github.com/edrosten/adafruit-thermal-printer-driver. Note: I wrote these posts as I went along so there may be bugs in the code snippets which are fixed later. I recommend checking the GitHub source before using a snippet.

This post is about integrating with CUPS so I can print from normal programs.

Integrating with CUPS

So, I have a sort of working example of CUPS integration in the existing ZJ-58 driver. It is, I suspect, not very good. Nonetheless, I’ll start there since it’s vastly easier starting from a working example than the documentation.

Note from the future: The documentation…

It does exist, but it’s scattered over the various projects (those being PostScript, other Adobe printer guff, CUPS, the GhostScript interpreter, the printer working group and so on). It’s the type of documentation where you can’t find anything so you do 95% of the work the hard way, get stuck on an obscure API call/keyword/etc and then that string turns up the documentation you needed at the beginning.

Anyway, here’s the install script from the existing driver:

```#!/bin/bash

# Installs zj-58 driver
# Tested as working under Ubuntu 14.04

/etc/init.d/cups stop
cp rastertozj /usr/lib/cups/filter/
mkdir -p /usr/share/cups/model/zjiang
cp ZJ-58.ppd /usr/share/cups/model/zjiang/
cd /usr/lib/cups/filter
chmod 755 rastertozj
chown root:root rastertozj
cd -
/etc/init.d/cups start

```

That’s pretty simple: basically it dumps some files into the CUPS tree and restarts CUPS.

From what I understand, CUPS essentially has some sort of specification of various filter chains (which can vary based on the input, e.g. a plain text file, a postscript file and a JPG will have different input filters). A given filter lists its accepted inputs and CUPS works backwards to figure out how to generate what’s required. For raster things (i.e. not plain text when the printer can accept plain text) CUPS will rasterise the input and you need to then get it sent to the converter to convert it to the right control codes (mostly the topic of the previous post).

Many of those are controlled by a PPD file. This stands for “PostScript Printer Description” and tells CUPS all about the printer capabilities. It also has extensions beyond the Adobe PPD spec to allow you to specify rasterisation and filters for non PostScript printers.

The driver of course comes with a PPD (it has to), but it’s long, complicated, has fragments of PostScript in it and doesn’t even pass the tests run by the `cupstestppd` command. And there’s a lot of duplicated information about page sizes which I suspect needs to be consistent. Not great but it’s a start.

So, while PPD is documented (or some approximation thereof) and the CUPS extensions are likewise, apparently you’re not really meant to write PPDs anyway. The easy/approved way is to write DRV files and then compile them into one or more PPD files using `ppdc`.

Either way the documentation is poor. There are lots of attributes in existing PPD and DRV files, like “Filesystem”, that it’s very hard to find any kind of documentation for, and others like “PSVersion” which are weakly documented (what is the acceptable range of values?).

Some information is here. On the subject of PSVersion, typing `revision =` into a GhostScript interpreter reveals that my machine (Ubuntu 18.04) has revision 926 for whatever that’s worth. Either way it seems optional. Anyway, I’ve tried to pare down my DRV file to the absolute minimum which covers what I want and I got this:

```#include <font.defs>

DriverType custom  //Required I believe to set downstream filters
ManualCopies Yes //Set to yes if the driver doesn't know how to print multiples of pages
Attribute "LanguageLevel" "" "3" //Default is 2 (from 1991), latest version is from 1997
Attribute "DefaultColorSpace" "" "Gray" //Self explanatory except does this mean something else can change it?
Attribute "TTRasterizer" "" "Type42" //Default is none, Type42 is the only extant useful one.
Filter application/vnd.cups-raster 0 rastertoadafruitmini //Arguments are datatype to feed to the filter, the expected CPU load, and the name of the filter executable
ColorDevice False

Font * //Include all fonts

// Manufacturer, model name, and version of the driver
ModelName "Mini"
Version 1.0
ModelNumber 579 //That's the product number on the website.

//I believe this allows users to specify custom sizes in the
//print dialog, or on the command line.
VariablePaperSize Yes
MinSize 58mm 5mm
MaxSize 58mm 1000mm

//#media creates media definitions which may or may not be used
//The paper is always 58mm wide, and has for now three different
//lengths
#media "58x50mm" 58mm 50mm
#media "58x100mm" 58mm 100mm
#media "58x200mm" 58mm 200mm

//The print area is always 48mm wide, centred
HWMargins 5mm 0 5mm 0

//This actually uses the media definitions above
*MediaSize "58x50mm"
MediaSize "58x100mm"
MediaSize "58x200mm"

// Supported resolutions
// Use as: Resolution colorspace bits-per-color row-count row-feed row-step name
// Apparently mostly the row stuff is 0 in most drivers. The last field
// (name) needs to be formatted correctly
*Resolution k 8 0 0 0 "203dpi/203 DPI"

// Name of the PPD file to be generated
PCFileName "mini.ppd"
```

OK, strictly speaking this isn’t the absolute minimum, since I’ve specified several virtual page sizes and variable sized pages, which is how CUPS deals with roll media. Here’s the corresponding install shell script to dump things in the right place:

```/etc/init.d/cups stop

/etc/init.d/cups start
```

Now, running that and going to http://localhost:631 and going through the motions shows the printer there with the options I’d expect (i.e. paper size). The printer device appears as “unknown” in CUPS since it works as a USB parallel port (`/dev/usb/lp0`), but doesn’t report anything back to CUPS. Even with that, it won’t work yet, because I need, in no particular order:

• Proper information logging to stderr in a format that CUPS likes
• Deal with commandline arguments that CUPS hands me
• Handle SIGTERM (used to cancel jobs) and not leave the printer in a bad state

In addition, you can add arbitrary choices to the driver which get passed on to the filter so I think I’ll add ones for feeding paper after the job has done (so the end of the last page ends at tearoff on the printer), auto cropping pages (removing white space at the top and bottom–useful for roll media), and marking page boundaries. Because why not? I only have to implement them later.

Options are implemented using an `option` directive followed by a bunch of `choice` directives, e.g.

```Option "TestOption" PickOne DocumentSetup 0
*Choice "A" ""
Choice "B" ""
Choice "C" ""

```

You can have `Boolean`, `PickOne` or `PickMany`. I don’t really see the point of `Boolean`: all of them need to have choice directives (for reasons which will soon become clear), so there’s little difference between a `Boolean` and a `PickOne` with two options.

The only difference seems to be that it renders a boolean as a radio group not a drop down list in the web interface:

hmmm. I wonder…

OK Confirmed! You can have as many “boolean” choices as you like, though note that the troolean choices don’t appear in the print dialog boxes, whereas booleans appear as checkboxes. Neither the compiler nor the validator complained which seems like a mild oversight.

With that silly aside out of the way, the next bit is how those options are passed to the printer driver. It turns out there are two ways, both of which are applied simultaneously.

The first is that the options are passed as a command line argument to the filter, along with the PPD file (in the `PPD` environment variable). The CUPS API provides some handy functions for parsing PPD files and option strings and generally dealing with it.

The second is that each choice comes with an arbitrary snippet of PostScript code which is run at the point specified by the `option` directive (it can be at places like document start, page start). Now PostScript has a `setpagedevice` command which basically accumulates a dictionary for device specific use. The CUPS driver will put certain elements in that dictionary into the raster page headers, and you can access them from C in the filter. It doesn’t support arbitrary dictionaries, and in fact what it has is:

```unsigned cupsInteger[16];
float cupsReal[16];
char cupsString[16][64];
```

You can fill these up by putting appropriately named things into the dictionary, e.g.:

```<</cupsInteger1 10 /cupsReal7 2.2 /cupsString3 (a string)>> setpagedevice
```

W00t! I just found the documentation (by searching for cupsInteger0 to see if it was 0-based or 1-based; it’s 0-based). Turns out there are loads of parameters you can pass this way. Many have “accepted” meanings but you can abuse them to pass arbitrary data since you control both sides.

The two choices are pretty much equivalent, so I’ll pick… uh. Ummm OK wow I’m suffering from choice indecision here. OK, I’ll go for option 2. The API for option 1 is the usual annoying C faff, plus apparently it’s been deprecated since 2012 and I don’t have a nice example of the new API to copy from.

Putting all that together, the code added to the DRV file looks like this:

```//The last argument is the order in which the options are executed
//(each one comes with a snippet of PostScript to execute). In this
//case, each snippet sets one of the cupsInteger slots.
Option "PageFeed/Feed paper between pages" PickOne DocumentSetup 0
*Choice "None" "<</cupsInteger0  0>>setpagedevice"
Choice "1mm"   "<</cupsInteger0  1>>setpagedevice"
Choice "2mm"   "<</cupsInteger0  2>>setpagedevice"
Choice "5mm"   "<</cupsInteger0  5>>setpagedevice"
Choice "10mm" "<</cupsInteger0 10>>setpagedevice"

Option "PageMark/Mark where to cut pages" Boolean DocumentSetup 1
*Choice "No" "<</cupsInteger1 0>>setpagedevice"
Choice "Yes" "<</cupsInteger1 1>>setpagedevice"

Option "EjectFeed/Feed paper after printing" PickOne DocumentSetup 2
Choice "None"  "<</cupsInteger2  0>>setpagedevice"
*Choice "5mm"  "<</cupsInteger2  5>>setpagedevice"
Choice "10mm" "<</cupsInteger2 10>>setpagedevice"

Option "AutoCrop/Crop page to printed area" Boolean DocumentSetup 3
*Choice "No" "<</cupsInteger3 0>>setpagedevice"
Choice "Yes" "<</cupsInteger3 1>>setpagedevice"
```

The *’s indicate the default choices. And this so far appears to work! The web interface shows this:

Sweet!
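On the filter side, picking those values back out is just array indexing. The sketch below uses a stand-in struct holding only the three arrays shown earlier (the real header type in `cups/raster.h` has many more fields), and the `JobOptions` struct and `decode_options` function are names I made up. The slot numbers match the DRV snippet: 0 for page feed, 1 for page marks, 2 for eject feed, 3 for auto crop.

```cpp
// Stand-in for the user-defined part of the CUPS raster page header.
struct HeaderFields{
    unsigned cupsInteger[16];
    float cupsReal[16];
    char cupsString[16][64];
};

// Hypothetical options struct, one member per Option in the DRV file.
struct JobOptions{
    unsigned feed_between_pages_mm;
    bool mark_page_boundary;
    unsigned eject_after_print_mm;
    bool auto_crop;
};

// Map the cupsInteger slots set by the PostScript snippets to named options.
JobOptions decode_options(const HeaderFields& h){
    return JobOptions{
        h.cupsInteger[0],
        h.cupsInteger[1] != 0,
        h.cupsInteger[2],
        h.cupsInteger[3] != 0
    };
}
```

Keeping the slot numbers in one function like this means there’s only one place to fix when the DRV file and the filter inevitably drift apart.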

Writing a valid CUPS filter

This is actually documented reasonably well if you know where to look. I believe I can ignore all arguments (I’m using the other method for options, and I’ve told the driver I don’t know how to make copies myself) except the optional `argv[6]` which is the file to print if it’s not `stdin`. Yay.
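As a rough sketch of what that argument handling amounts to: CUPS runs a filter with either six or seven arguments (the program name, job ID, user, title, copies, options, and optionally a file name). The helper name here is my own invention, and it only covers the one argument I care about.

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Returns the file to read the job from, or nothing if the job arrives on
// stdin. Throws if the argument count isn't what CUPS would pass.
std::optional<std::string> input_file(int argc, const char* const* argv){
    if(argc != 6 && argc != 7)
        throw std::invalid_argument(
            "usage: filter job-id user title copies options [file]");
    if(argc == 7)
        return std::string(argv[6]);
    return std::nullopt;
}
```

Bailing out on the wrong argument count also gives a sensible error if someone runs the filter by hand without the right arguments.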

Cancellation is easy: ignore SIGPIPE and clean up on SIGTERM. Since it’s a simple program, I can use a simple solution where I just poll a global variable:

```volatile sig_atomic_t cancel_job = 0;
//...
signal(SIGPIPE, SIG_IGN);
signal(SIGTERM, [](int){ cancel_job = 1;});
```

Logging likewise is easy and involves writing to stderr something like `TYPE: data` where TYPE is the message type. The type has things such as ERROR, DEBUG, etc for logging, PAGE for recording the current page number, STATE for indicating things like paper empty and so on. The format of the data depends on the message type.
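A couple of tiny helpers are enough to keep the format consistent. These are hypothetical names, not from the driver, and they build strings rather than writing to stderr directly so they’re easy to check. I’m assuming the PAGE data is the page number followed by the number of copies.

```cpp
#include <string>

// Build one line in the "TYPE: data" form that CUPS reads from stderr.
std::string cups_message(const std::string& type, const std::string& data){
    return type + ": " + data + "\n";
}

// PAGE messages: page number, then number of copies (assumed format).
std::string page_message(int page, int copies){
    return cups_message("PAGE", std::to_string(page) + " " + std::to_string(copies));
}
```

In the filter you’d then write something like `std::cerr << cups_message("DEBUG", "Feeding 155 lines");`.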

Paper empty and so on can be queried from the printer using special control codes and CUPS looks like it has a way to read back anything returned. I’m not so sure how this works yet. I’ll deal with that later.

Dealing with options took me far too long. I started with the following code snippet:

```	cups_raster_t *ras;
//...
{

```

and it didn’t really work. And by “didn’t work”, I mean that I tried adding `-dcupsInteger0=1` to the GhostScript invocation (this sets an integer variable and somehow these magically wind up in `setpagedevice`, I don’t know how) and I could only set 0, 1 and 2. None of the other integers could be set.

If you cast your mind back to the first post in this series, I mentioned that I cargo-culted an invocation of GhostScript and wasn’t sure what everything did. Well, it came to bite me here. It has the innocuous looking argument `-sMediaClass=PwgRaster` (-s just sets a variable in the interpreter). This is now getting in quite deep. `MediaClass` is a variable which affects the `setpagedevice` command (page 21 of the PostScript® Language Reference Manual Supplement published in 1996 on April 1 and it is deadly serious) in various nonspecific (vendor defined) ways. And one such vendor is the shadowy cabal known as the “Printer Working Group” or PWG for short (it’s more exciting if they are a shadowy cabal). I sort of unearthed them by forlornly digging through `cups/raster.h` looking for clues and found this (edited for display):

```// The following PWG 5102.4 definitions specify indices into the
// cupsInteger[] array in the raster header.
#  define CUPS_RASTER_PWG_TotalPageCount	0
#  define CUPS_RASTER_PWG_CrossFeedTransform	1
// etc...
#  define CUPS_RASTER_PWG_VendorLength		15
```

Turns out they have defined their own meanings for the user-defined extensions and brazenly took all of them. What I don’t understand is why I could set 0, 1 and 2, but not 3 onwards. No clues there. It also stopped cupsReal and cupsString from working and set PWG_AlternatePrimary to 2^24-1. ¯\_(ツ)_/¯

What went wrong

So that all sort of worked, and I can print out cats using lpr. Except…

Inverted cats. And junk

The cats come out inverted. The resolution line in the DRV file is:

```*Resolution k 8 0 0 0 "203dpi/203 DPI"
```

which is the “black” colour model. If I change the “k” to “w”, I get what I expect except with some junk at the top.

What I actually need is:

```*ColorModel Gray/Grayscale w chunky 0
*Resolution - 8 0 0 0 "203dpi/203 DPI"
```

I don’t know why. The colour model specifies the white model (along with `chunky`, which means packed for colour data, and no compression), then the resolution says to not modify the colour model. Ok, sure…

Nope!!

Turns out that wasn’t it. I must have just reset things when making that change. The junk was because… well I don’t know exactly. It doesn’t appear on the first printout, it only appears on the third. And if I send enough text to the printer then the next image is fine. It therefore appears as if something was getting flushed before the last line was complete. Then the first few bytes (including the start bitmap control code) were getting eaten up finishing the previous line and then it was printing data out as text.

```void printerInitialise(){
cout << ESC << '\x40';
}
```

calls to which I sprinkled liberally around, and these are messing things up. Here’s the funny thing though: putting a `cout << flush` after the first one fixed it. That ought to make sense: the printer gets data asynchronously then starts processing it while the UART asynchronously fills the receive buffer. It processes the initialise command and loses the first few control codes. Or something.

Except… the symptoms only manifested after several images, making it look like it was state being carried over. It’s weird, I don’t get it. Clearly there’s some internal state somewhere, and part of me thinks it might be in CUPS because I suspect the original driver used to work just fine.

Page Sizes

The print dialog boxes seemed to get deeply confused about the smallest page size (58x50mm). The reason for this it turns out is that it’s really a landscape page not a portrait one and pages need to be specified in portrait orientation. Except that would make the width wrong. If I’d paid attention to the warnings from `cupstestppd`, then I would not have had this problem.

```ppd/mini.ppd: PASS
WARN    Size "58x50mm" should be the Adobe standard name "50x58mmRotated".
```

And it turns out all I have to do is switch the name:

```#media "50x58mmRotated" 58mm 50mm
#media "58x100mm" 58mm 100mm
#media "58x200mm" 58mm 200mm

HWMargins 5mm 0 5mm 0

*MediaSize "50x58mmRotated"
MediaSize "58x100mm"
MediaSize "58x200mm"
```

and things seem to be much more sensible.

Booleans

The print dialog box renderers don’t really know which option is meant to correspond to a check mark and which isn’t. I tried changing the keyword to “True” and “False” and putting true first in the list, e.g.:

```Option "PageMark/Mark where to cut pages" Boolean DocumentSetup 1
Choice "True/Yes" "<</cupsInteger1 1>>setpagedevice"
*Choice "False/No" "<</cupsInteger1 0>>setpagedevice"
```

That seemed to do the job. I believe it’s the ordering that matters, I’m not sure though.

Other stuff

There were a few other miscellaneous bits and bobs to fix too. In addition I implemented the various features I mentioned above. I decided also to emit blank lines as a feed rather than a blank line because it’s a fair bit faster. Except I had to suppress that in enhanced resolution mode, because otherwise the first few lines printed after a gap were too dark.
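Auto cropping, for instance, boils down to finding the first and last rows with anything on them. Here’s a minimal sketch under a couple of assumptions: the page is an 8 bit greyscale buffer in row-major order, and white is 255. The function name is made up and the real filter works on the raster stream rather than a whole page in memory.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Returns the half-open range [first, last) of rows worth printing.
// An entirely white page gives an empty range (first == last).
std::pair<int, int> crop_rows(const std::vector<std::uint8_t>& grey,
                              int width, int height, std::uint8_t white = 255){
    // A row is blank if every pixel in it is white.
    auto blank = [&](int y){
        for(int x = 0; x < width; x++)
            if(grey[y * width + x] != white)
                return false;
        return true;
    };

    int first = 0;
    while(first < height && blank(first))
        first++;

    int last = height;
    while(last > first && blank(last - 1))
        last--;

    return {first, last};
}
```

The same `blank` test is what decides whether a row in the middle of the page can be sent as a paper feed instead of a line of bitmap.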

I also want the printer to print plain text as plain text. This isn’t necessary but it’s always been idiomatic to pass through like that, rather than relying on the postscript rasteriser. I can fix that with one extra line in the DRV file:

```Filter text/plain 0 -
```

That tells CUPS that the printer accepts text, that doing so costs nothing, and to use a null filter program.

Cancellation

Oh wow this turned out to be hard. Way harder than expected because it reveals deep problems. It’s going to be a whole other blog post.

Result!

OK so basically it works!

I can print using `lp` (or `lpr`), and set options like `-o Enhance=True -o PageMark=True` and it obeys them.

Recognise this?

Adafruit mini thermal printer, part 1/3: getting better pictures

I bought an AdaFruit Mini thermal printer.

Code on github: https://github.com/edrosten/adafruit-thermal-printer-driver. Note: I wrote these posts as I went along so there may be bugs in the code snippets which are fixed later. I recommend checking the GitHub source before using a snippet.

It’s pretty cute, and it’s actually very old school in terms of its function. Firstly in a very old fashioned twist, it comes with a full manual documenting every single control code. Not only that but the printer is surprisingly capable and it’s designed to work with very low end driving systems. It doesn’t just print bitmaps, it has various fonts and modes (double height, width, etc), you can download custom fonts and bitmaps to print on demand. You can print upside down and back to front so the text looks the right way round if you’re facing the printer (super cute!). It has justification modes, bold and underline. It can even print barcodes!

You know this reminds me of when I was 14(?) and got my first computer, a BBC Micro complete with a 5.25″ floppy drive and a printer. The printer came with a manual with full documentation of all the control codes and I devoured them and wrote a basic typesetter like system in which I did my school projects.

So where was I?

Oh yes, well I don’t actually need most of those features. I’m planning on driving it from Linux (on a Pi), which means it’ll be driven by GhostScript via CUPS and will print bitmaps. And not use any of those features.

Turns out Adafruit provide a CUPS driver. Apparently derived from one provided by the printer manufacturer? So, I installed it and this is the result:

Woo! It prints! Except… the output isn’t great. The printer is monochrome and the pictures come out halftoned using a halftone screen. While that’s a fine choice for various kinds of printing, it’s not great for a device with independent pixels. For that, a dithering method such as Floyd-Steinberg would be much better. Also it’s messing up the first line and printing junk, but you know, details, details.

PostScript, being designed for proper printing, has native support for halftoning. It doesn’t for dithering, and it turns out there’s no way to persuade it to emit a monochrome bitmap using dithering instead of halftone screens. If you want dithering, you need to do it in the driver. So, I’m going to need a custom driver.

So I first need to understand printing.

Printing on Linux greatly simplified

Printing on Linux isn’t simple. Partly this is because printing in general is not simple. And partly it’s because printing has changed a lot over the years and there are lots of vestigial bits lying around. For common, modern systems the order of operations is roughly:

1. CUPS accepts jobs (and provides information to the print dialogs).
2. CUPS examines the file type and decides what to do next, e.g. whether to run it through GhostScript.
3. CUPS runs it through ghostscript generating a stream in CUPS raster format. This is a simple bitmap format with a C API.
4. CUPS runs some arbitrary filter program.
5. Filter program transforms CUPS bitmap into printer control codes.
6. CUPS routes the resulting data to the correct device.

GhostScript also has some printer drivers built in, and there are various other filter schemes (GhostScript is one of many) such as foomatic, and of course printers can accept plain text too. I’m not really interested in those so I’ll stick to the sequence above.

Steps 2-4 are controlled by a PPD (PostScript Printer Description) file, and 5 is a program which reads in CUPS bitmap data and emits control codes. The CUPS raster format is well documented but it seems simpler to use the C API, especially as I have a working driver to cadge from.

What I’m going to do first is figure out how to print out what I want (i.e. the right control codes) and then figure out how to work it into CUPS.

Getting CUPS raster data and a simple driver

The first job is to get the input data. After a bunch of cargo-culting, I got this script:

```DPI=203.2
gs -dPARANOIDSAFER -dNOPAUSE -dBATCH -sstdout=%stderr -sOutputFile=%stdout \
-sDEVICE=cups -sMediaClass=PwgRaster -sOutputType=Automatic -r${DPI}x${DPI} \
-dDEVICEWIDTH=384 -dDEVICEHEIGHT=384 -dcupsBitsPerColor=8 -dcupsColorOrder=0 \
-dcupsColorSpace=0 -dcupsBorderlessScalingFactor=0.0000 -dcupsInteger1=1 \
-dcupsInteger2=1 -scupsPageSizeName=na_letter_8.5x11in -I/usr/share/cups/fonts \
"$@"
```

I don’t remember precisely how I found all the various bits. The important things are that it’s colourspace 0 (white), 8 bits per colour, CUPS raster format, 384 pixels wide and 8 pixels per mm. Everything else is just necessary guff (IO redirection, batch and no pause) or irrelevant stuff I never deleted.
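As a quick sanity check on those numbers, here is a small Python sketch (the variable names are mine, not from the script) showing how 203.2 dpi and 384 pixels relate to 8 pixels per mm and the width of the printed strip:

```python
# Values taken from the gs invocation above
DPI = 203.2          # -r option
WIDTH_PX = 384       # -dDEVICEWIDTH
MM_PER_INCH = 25.4

pixels_per_mm = DPI / MM_PER_INCH          # about 8
print_width_mm = WIDTH_PX / pixels_per_mm  # about 48 mm of printable width
bytes_per_row = (WIDTH_PX + 7) // 8        # row size once packed at 1 bit per pixel

print(pixels_per_mm, print_width_mm, bytes_per_row)
```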

I then basically deleted everything except the stream processing from the driver, then deleted that and started writing from scratch. After lots of head scratching and making a lot of mistakes I read the manual more carefully (bitmaps are always a multiple of 8 pixels wide) and got this code up and running:

```#include <cups/raster.h>

#include <iostream>
#include <vector>
#include <array>
#include <utility>
#include <cmath>
#include <cstdint>

using std::clog;
using std::cout;
using std::endl;
using std::vector;
using std::array;

constexpr unsigned char ESC = 0x1b;
constexpr unsigned char GS = 0x1d;

// Write out a std::array of bytes as bytes.  This will form the basis
// of sending data to the printer.
template<size_t N>
std::ostream& operator<<(std::ostream& out, const array<unsigned char, N>& a){
out.write(reinterpret_cast<const char*>(a.data()), a.size());
return out;
}

array<unsigned char, 2> binary(uint16_t n){
return {{static_cast<unsigned char>(n&0xff), static_cast<unsigned char>(n >> 8)}};
}

void printerInitialise(){
cout << ESC << '\x40';
}

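// enter raster mode and set up x and y dimensions
// (the name rasterheader is an assumption: the signature line was lost)
void rasterheader(uint16_t xsize, uint16_t ysize)
{
// Page 33 of the manual
// The x size is the number of bytes per row, so the number of pixels
// is always a multiple of 8
cout << GS << 'v' << '0' << '\0' << binary((xsize+7)/8) << binary(ysize);
}

int main(){
// Read the CUPS raster stream from stdin (file descriptor 0)
cups_raster_t* ras = cupsRasterOpen(0, CUPS_RASTER_READ);
cups_page_header2_t header;
int page = 0;

// One iteration per page in the stream
while(cupsRasterReadHeader2(ras, &header))
{
page ++;
clog << "PAGE: " << page << " " << header.NumCopies << "\n";
clog << "BPP: " << header.cupsBitsPerPixel << endl;
clog << "BitsPerColor: " << header.cupsBitsPerColor << endl;
clog << "Width: " << header.cupsWidth << endl;
clog << "Height: " << header.cupsHeight << endl;

// Input data buffer for one line
vector<unsigned char> buffer(header.cupsBytesPerLine);

clog << "Line bytes: " << buffer.size() << endl;
printerInitialise();

for (unsigned int y = 0; y < header.cupsHeight; y ++)
{
// Stop when no more pixel data can be read
if(cupsRasterReadPixels(ras, buffer.data(), header.cupsBytesPerLine) == 0)
break;

// Send each line as a one-row raster image
rasterheader(header.cupsWidth, 1);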

//Print in MSB format, one line at a time
unsigned char current=0;
int bits=0;

for(const auto& pixel: buffer){
current |= (pixel>128)<<(7-bits);
bits++;
if(bits == 8){
cout << current;
bits = 0;
current = 0;
}
}
if(bits)
cout << current;
}

}
cout << "\n\n\n";
cupsRasterClose(ras);
}
```

To run the program, make an eps file, ideally with a cat in it. Then assuming the above script is called “to_cups.sh” and the compiled executable is called “rastertoadafruitmini”, you can run it with:

```bash to_cups.sh cat.eps | ./rastertoadafruitmini | sudo dd of=/dev/usb/lp0
```

Note that the quantisation is simply “greater than 128”, and the result is:

Note the false start at the top, and the slightly stretched image due to me converting to EPS badly. The underlying image is this:

It works! You’ll note I got the colours inverted, because I had 1 for white, and 0 for black, whereas 1 means print a pixel (i.e. black). The black bar is because of that and the page being white. The funny thing is that black areas feel incredibly wasteful of ink even though that makes no sense on a thermal printer.
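To make the polarity issue concrete, here is a minimal Python sketch of the same MSB-first packing with the sense corrected, so that dark pixels set a bit. The function name and the sample row are made up for illustration; it assumes 8-bit grey input where 0 is black and 255 is white.

```python
def pack_row(pixels, threshold=128):
    """Pack 8-bit grey pixels into bytes, most significant bit first.

    A set bit means "print a dot", so pixels at or below the threshold
    (dark ones) map to 1 and bright ones map to 0.
    """
    out = bytearray()
    current = 0
    bits = 0
    for p in pixels:
        if p <= threshold:
            current |= 1 << (7 - bits)
        bits += 1
        if bits == 8:
            out.append(current)
            current = 0
            bits = 0
    if bits:
        # Flush a partial final byte, padded with zeros (white)
        out.append(current)
    return bytes(out)


row = [0, 0, 255, 255, 0, 255, 0, 255, 0, 255]
print(pack_row(row).hex())  # ca80
```

The C++ above uses `pixel>128` for the bit, which is exactly the opposite test and is why the cat came out as a negative.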

Dithering the output

Clearly a simple threshold is not a very good way of converting greyscale to black and white. In fact it’s somewhat worse than the original halftoned image. The key is to employ some sort of dithering and this is best done by some sort of error diffusion algorithm.

The process works like this. While going in raster scan order:

1. Quantize the current pixel to 0 or 255
2. Work out the error between the quantized output and the pixel
3. Add fractions of the error to nearby pixels which haven’t been processed yet (this is the error diffusion step)

There are quite a few articles on it, such as this excellent one. The most common/well known algorithm for images is the Floyd-Steinberg dithering algorithm. It’s popular because it’s low resource and efficient on simple processors. Since the target machine for this will be lavishly resourced (a Raspberry Pi) I decided to go for the Jarvis, Judice, Ninke algorithm, which is essentially identical to Floyd-Steinberg but has a larger error diffusion window, so it is more expensive and gives slightly better results.
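Before wiring it into the driver, the three steps can be sketched in a few lines of Python. This is only an illustration of error diffusion with the Jarvis, Judice, Ninke weights on a small in-memory image; it assumes pixel values are floats in the range 0 (black) to 1 (white), skips gamma correction, and the function name is mine.

```python
# Jarvis, Judice, Ninke weights. Row 0 is the current line and the
# current pixel sits in the middle column (index 2).
JJN = [
    [0, 0, 0, 7, 5],
    [3, 5, 7, 5, 3],
    [1, 3, 5, 3, 1],
]
JJN_SUM = sum(map(sum, JJN))  # 48


def dither(image):
    """Return rows of 0/1 values for a list of rows of floats in [0, 1]."""
    height, width = len(image), len(image[0])
    work = [list(row) for row in image]      # copy that accumulates errors
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            value = work[y][x]
            quantised = 1 if value > 0.5 else 0   # step 1: quantise
            out[y][x] = quantised
            error = value - quantised             # step 2: error
            # step 3: push the error onto pixels not yet processed
            for dy, weights in enumerate(JJN):
                for k, w in enumerate(weights):
                    xx, yy = x + k - 2, y + dy
                    if w and 0 <= xx < width and yy < height:
                        work[yy][xx] += error * w / JJN_SUM
    return out


flat_grey = [[0.25] * 32 for _ in range(32)]
result = dither(flat_grey)
white_fraction = sum(map(sum, result)) / (32 * 32)
print(white_fraction)  # close to 0.25: the average brightness is preserved
```

Error that would land outside the image is simply dropped here, which is why the fraction is close to, rather than exactly, 0.25.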

Here’s the code (with the new bits highlighted):

```#include <cups/raster.h>

#include <iostream>
#include <vector>
#include <array>
#include <utility>
#include <cmath>
#include <algorithm>
#include <cstdint>

using std::clog;
using std::cout;
using std::endl;
using std::vector;
using std::array;

constexpr unsigned char ESC = 0x1b;
constexpr unsigned char GS = 0x1d;

// Write out a std::array of bytes as bytes.  This will form the basis
// of sending data to the printer.
template<size_t N>
std::ostream& operator<<(std::ostream& out, const array<unsigned char, N>& a){
out.write(reinterpret_cast<const char*>(a.data()), a.size());
return out;
}

array<unsigned char, 2> binary(uint16_t n){
return {{static_cast<unsigned char>(n&0xff), static_cast<unsigned char>(n >> 8)}};
}

void printerInitialise(){
cout << ESC << '\x40';
}

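// enter raster mode and set up x and y dimensions
// (the name rasterheader is an assumption: the signature line was lost)
void rasterheader(uint16_t xsize, uint16_t ysize)
{
// Page 33 of the manual
// The x size is the number of bytes per row, so the number of pixels
// is always a multiple of 8
cout << GS << 'v' << '0' << '\0' << binary((xsize+7)/8) << binary(ysize);
}

// Jarvis, Judice, Ninke weights. The current pixel is the middle of row 0.
constexpr array<array<int, 5>, 3> diffusion_coefficients = {{
{{0, 0, 0, 7, 5}},
{{3, 5, 7, 5, 3}},
{{1, 3, 5, 3, 1}}
}};
// The weights sum to 48, so dividing by 48 diffuses exactly the error
constexpr double diffusion_divisor=48;

int main(){
// Read the CUPS raster stream from stdin (file descriptor 0)
cups_raster_t* ras = cupsRasterOpen(0, CUPS_RASTER_READ);
cups_page_header2_t header;
int page = 0;

// One iteration per page in the stream
while(cupsRasterReadHeader2(ras, &header))
{
page ++;
clog << "PAGE: " << page << " " << header.NumCopies << "\n";
clog << "BPP: " << header.cupsBitsPerPixel << endl;
clog << "BitsPerColor: " << header.cupsBitsPerColor << endl;
clog << "Width: " << header.cupsWidth << endl;
clog << "Height: " << header.cupsHeight << endl;

// Input data buffer for one line
vector<unsigned char> buffer(header.cupsBytesPerLine);

//Error diffusion data
vector<vector<double>> errors(diffusion_coefficients.size(), vector<double>(buffer.size(), 0.0));

clog << "Line bytes: " << buffer.size() << endl;
printerInitialise();

for (unsigned int y = 0; y < header.cupsHeight; y ++)
{
// Stop when no more pixel data can be read
if(cupsRasterReadPixels(ras, buffer.data(), header.cupsBytesPerLine) == 0)
break;

// Send each line as a one-row raster image
rasterheader(header.cupsWidth, 1);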

//Print in MSB format, one line at a time
unsigned char current=0;
int bits=0;

for(int i=0; i < (int)buffer.size(); i++){

//The actual pixel value with gamma correction
double pixel = pow(buffer[i]/255., 1./2.2) + errors[0][i];
double actual = pixel>.5?1:0;
double error = pixel - actual; //This error is then distributed

//Diffuse forward the error
for(int r=0; r < (int)diffusion_coefficients.size(); r++)
for(int cc=0; cc < (int)diffusion_coefficients[0].size(); cc++){
int c = cc - diffusion_coefficients[0].size()/2;
if(c+i >= 0 && c+i < (int)buffer.size() && diffusion_coefficients[r][cc]){
errors[r][i+c] += error * diffusion_coefficients[r][cc] / diffusion_divisor;
}
}

current |= (pixel<0.5)<<(7-bits);
bits++;
if(bits == 8){
cout << current;
bits = 0;
current = 0;
}
}
if(bits)
cout << current;

//Roll the buffer round.
std::rotate(errors.begin(), errors.begin()+1, errors.end());
for(auto& p:errors.back())
p=0;

}

}
cout << "\n\n\n";
cupsRasterClose(ras);
}
```

And here’s the result

KITTY!!!!!!!!!!!!!

But can we do better? I’m not sure, but look at this:

You can draw on the paper using a finger nail. The faster you move at a given pressure the darker the line. I believe this is due to getting more heating. So, the paper is definitely analogue. Turns out the printer is too, kind of in that you can set the heat output per line (though not per pixel). The command is on page 47 and is the general control command. So what I did is print 255 solid lines, each one with a different heat output. The code (in AWK) is:

```BEGIN{
for(i=0; i < 255; i++){
printf("%c7%c%c%c", 27, 64, i, 2)
printf("\x1dv0\0%c\0\x01\0", 40)
for(j=0; j < 40; j++)
printf("\xff")
}
print "\n\n\n\n"
}
```
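For comparison, here is a sketch of the same byte stream built in Python. The byte values are copied from the AWK `printf` calls above; the function name is made up. Because it works on bytes rather than text, no character encoding is involved.

```python
ESC, GS = 0x1B, 0x1D


def heat_test_stream(width_bytes=40, levels=255):
    """Build one solid raster line per heat setting, as the AWK script does."""
    out = bytearray()
    for heat in range(levels):
        # ESC 7 n1 n2 n3: n1=64 and n3=2 as in the AWK script, n2 is the heat time
        out += bytes([ESC, ord("7"), 64, heat, 2])
        # GS v 0 m xL xH yL yH: a raster image width_bytes wide and 1 row high
        out += bytes([GS, ord("v"), ord("0"), 0, width_bytes, 0, 1, 0])
        # One row with every dot set
        out += b"\xff" * width_bytes
    # Feed some paper at the end (print adds one newline to the four in the string)
    out += b"\n" * 5
    return bytes(out)


stream = heat_test_stream()
print(len(stream))  # 13520
```

To send it to the printer you would write `stream` to the device in binary mode, for example with `sys.stdout.buffer.write(stream)` piped into `dd` as before.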

What got me going for ages is that the locale wasn’t C, so characters above 127 were getting mangled. Anyway the result is this:

That’s a yes! It’s a bit speckly, but it can definitely output greyscale. After a bit of messing around, I got the range. It goes from a timing (heat output is essentially controlled by setting the time the heating elements dwell on the paper) range of about 16 (full white) to 112 (full black) and empirically, raising the input to a power of 2 makes it look a little better. Working it into the dithering code is pretty straightforward: find the darkest pixel and set the black level to be able to reproduce that.

```#include <cups/raster.h>

#include <iostream>
#include <vector>
#include <array>
#include <utility>
#include <cmath>
#include <algorithm>
#include <cstdint>

using std::clog;
using std::cout;
using std::endl;
using std::vector;
using std::array;

constexpr unsigned char ESC = 0x1b;
constexpr unsigned char GS = 0x1d;

// Write out a std::array of bytes as bytes.  This will form the basis
// of sending data to the printer.
template<size_t N>
std::ostream& operator<<(std::ostream& out, const array<unsigned char, N>& a){
out.write(reinterpret_cast<const char*>(a.data()), a.size());
return out;
}

array<unsigned char, 2> binary(uint16_t n){
return {{static_cast<unsigned char>(n&0xff), static_cast<unsigned char>(n >> 8)}};
}

void printerInitialise(){
cout << ESC << '\x40';
}

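// enter raster mode and set up x and y dimensions
// (the name rasterheader is an assumption: the signature line was lost)
void rasterheader(uint16_t xsize, uint16_t ysize)
{
// Page 33 of the manual
// The x size is the number of bytes per row, so the number of pixels
// is always a multiple of 8
cout << GS << 'v' << '0' << '\0' << binary((xsize+7)/8) << binary(ysize);
}

void set_heating_time(int time_factor){
// Page 47 of the manual: ESC 7 n1 n2 n3
// Everything is default except the heat time (n2), clamped to 3..255
cout << ESC << '7' << (char)7 << (unsigned char)std::max(3, std::min(255,time_factor)) << '\02';
}

// Jarvis, Judice, Ninke weights. The current pixel is the middle of row 0.
constexpr array<array<int, 5>, 3> diffusion_coefficients = {{
{{0, 0, 0, 7, 5}},
{{3, 5, 7, 5, 3}},
{{1, 3, 5, 3, 1}}
}};
// The weights sum to 48, so dividing by 48 diffuses exactly the error
constexpr double diffusion_divisor=48;

double degamma(int p){
return pow(p/255., 1/2.2);
}

int main(){
// Read the CUPS raster stream from stdin (file descriptor 0)
cups_raster_t* ras = cupsRasterOpen(0, CUPS_RASTER_READ);
cups_page_header2_t header;
int page = 0;

// One iteration per page in the stream
while(cupsRasterReadHeader2(ras, &header))
{
page ++;
clog << "PAGE: " << page << " " << header.NumCopies << "\n";
clog << "BPP: " << header.cupsBitsPerPixel << endl;
clog << "BitsPerColor: " << header.cupsBitsPerColor << endl;
clog << "Width: " << header.cupsWidth << endl;
clog << "Height: " << header.cupsHeight << endl;

// Input data buffer for one line
vector<unsigned char> buffer(header.cupsBytesPerLine);

//Error diffusion data
vector<vector<double>> errors(diffusion_coefficients.size(), vector<double>(buffer.size(), 0.0));

clog << "Line bytes: " << buffer.size() << endl;
printerInitialise();

for (unsigned int y = 0; y < header.cupsHeight; y ++)
{
// Stop when no more pixel data can be read
if(cupsRasterReadPixels(ras, buffer.data(), header.cupsBytesPerLine) == 0)
break;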

//Estimate the lowest value pixel in the row
double low_val=1.0;
for(int i=0; i < (int)buffer.size(); i++)
low_val = std::min(low_val, degamma(buffer[i]) + errors[0][i]);
//Scale down slightly so the chosen black level is dark enough
low_val*=0.99;

//Set the darkness based on the darkest pixel we want

//Empirical formula for the effect of the timing
double full_white=16;
double full_black=16*7;
set_heating_time(pow(1-low_val,2.0)*(full_black-full_white)+full_white);

//Print in MSB format, one line at a time
unsigned char current=0;
int bits=0;

for(int i=0; i < (int)buffer.size(); i++){

//The actual pixel value with gamma correction
double pixel = degamma(buffer[i]) + errors[0][i];
double actual = pixel>(1-low_val)/2 + low_val?1:low_val;
double error = pixel - actual; //This error is then distributed

//Diffuse forward the error
for(int r=0; r < (int)diffusion_coefficients.size(); r++)
for(int cc=0; cc < (int)diffusion_coefficients[0].size(); cc++){
int c = cc - diffusion_coefficients[0].size()/2;
if(c+i >= 0 && c+i < (int)buffer.size() && diffusion_coefficients[r][cc]){
errors[r][i+c] += error * diffusion_coefficients[r][cc] / diffusion_divisor;
}
}

current |= (actual!=1)<<(7-bits);
bits++;
if(bits == 8){
cout << current;
bits = 0;
current = 0;
}
}
if(bits)
cout << current;

//Roll the buffer round.
std::rotate(errors.begin(), errors.begin()+1, errors.end());
for(auto& p:errors.back())
p=0;

}

}
cout << "\n\n\n\n\n\n";
cupsRasterClose(ras);
}
```
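The heat-time mapping is the only really new arithmetic in that listing, so here it is pulled out on its own as a Python sketch. The constants 16 and 112 and the power of 2 are the empirical values from above; the function name is mine, and the truncation to an integer mirrors what the C++ does when it passes a double to an `int` parameter.

```python
FULL_WHITE = 16       # heat time that leaves the paper white
FULL_BLACK = 16 * 7   # heat time that gives full black


def heating_time(darkest, power=2.0):
    """Map the darkest level wanted on a line (0 = black, 1 = white)
    to the heat-time byte sent to the printer."""
    darkness = 1.0 - darkest
    t = darkness ** power * (FULL_BLACK - FULL_WHITE) + FULL_WHITE
    # Same clamp as set_heating_time in the C++
    return max(3, min(255, int(t)))


for level in (1.0, 0.5, 0.0):
    print(level, heating_time(level))  # 16, 40 and 112 respectively
```

A line whose darkest pixel is mid grey therefore gets a heat time of 40 rather than 112, and the dithering only has to make up the difference between white and that grey.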

And it works!!

OK, the results aren’t spectacular, but look if you hear a dog talking, you’re impressed that it can talk at all, not disappointed that it can’t talk well.

The enhanced grey level image definitely has some horizontal streaking. I don’t know if that’s due to the printer or the really ad-hoc calibration of grey levels that I did. I should probably limit the rate at which the temperature changes vertically to mitigate that.

Overall I’m really pleased. The finer details are clearer and there are definitely some whiskers which you can make out in the left image which are washed out in the speckled right one. Bear in mind there are optimistically 64 distinct grey levels this printer can produce which means this technique is adding about 6 bits per line of 384 bits.

This also pushes the printer far, far beyond what it was ever supposed to do. The heating time is really a way to reduce print time and/or save on total energy draw, presumably for battery powered chip and pin machines.

I expect there is more fiddling to do, but the next stage is to integrate it into CUPS so I can print the usual way (i.e. using lp of course).

Light chasing robot part 2 (of 2)

The first version worked, but oscillated a lot in its motion. If you haven’t read it yet, I recommend reading it first otherwise this post won’t make as much sense. And if you have, it might be worth a re-read, since it took me nearly two years to post the followup.

The reason for the oscillation is that it has essentially very high feedback. If it’s very slightly off to one side, then the opposite motor comes on full, because the direction sensor divider goes into a simple comparator. Also, it turns out (I found this about a year later–yes I am a bit lazy about writing blog posts) the response of the LDRs is really slow, measurable over the timescale of a second, so the robot will swing round a significant amount before the resistive divider starts to respond. Either way making the response have a much lower gain will help.

I can reduce the gain by making the motor come on at a reduced speed in proportion to the ratio between the two LDRs.

The circuit is a little more complex than the previous one. It also falls into the category of “should have used a microcontroller” since then the upgrade would just be software and a lot more flexible. Essentially I have used a CMOS 555 in equal duty cycle mode and I’m using the capacitor voltage to get a sawtooth wave. That’s thresholded by the comparator (opamp) to make a PWM signal. I could have also used the other amplifier in the dual opamp chip to do the same job. That would have been neater in hindsight.

Simple PWM circuit
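If the ramp-plus-comparator trick is unfamiliar, this Python sketch shows the idea numerically. It assumes an idealised linear ramp between 1/3 and 2/3 of the supply (the range a 555 timing capacitor swings over; the real waveform is made of exponential segments) and a 5V supply, which is my assumption rather than a measured value.

```python
def pwm_duty(control_v, vcc=5.0, steps=1000):
    """Fraction of a cycle for which a comparator output is high when
    control_v is compared against a ramp from vcc/3 to 2*vcc/3."""
    low, high = vcc / 3, 2 * vcc / 3
    on = 0
    for i in range(steps):
        # Sample the idealised ramp at the middle of each time step
        ramp = low + (high - low) * (i + 0.5) / steps
        if control_v > ramp:
            on += 1
    return on / steps


for v in (1.0, 2.0, 2.5, 3.0, 4.0):
    print(v, pwm_duty(v))
```

Control voltages below the bottom of the ramp give 0% duty, voltages above the top give 100%, and in between the duty cycle rises in proportion, which is what lowers the gain compared with a bare comparator.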

The result is really pretty good! See:

Er… take 2!

That works well, and is a good validation of the directional light sensors (the original point of this project).

Building an automatic plant waterer (4/?): Calibrating the sensor

A short day in the attic today.

• Part 1: resistive sensing
• Part 2: finding resistive sensing is bad and capacitive sensing is hard
• Part 3: another crack at a capacitive sensor
• Part 4: calibrating the sensor

Day VII (weekend 6)

First, to check everything’s OK, I’m going to calibrate the sensor. I have a box of cheap ceramic capacitors in the E3 series and I’m going to go from 10pF to 2200pF, and I’m going to measure them with my old Academy PG015 capacitance meter since it’s likely to be more accurate than the capacitor rating.

Here are the measurements:

| Rating (pF) | Measured capacitance (pF) | Count |
|---|---|---|
| 0 | 0 | 12.99 |
| 10 | 10.5 | 18.84 |
| 22 | 22.6 | 25.80 |
| 47 | 48.3 | 40.48 |
| 100 | 101.7 | 70.90 |
| 220 | 221 | 134.03 |
| 470 | 453 | 259.21 |
| 1000 | 965 | 539.16 |
| 2200 | 2240 | 1227.2 |

I’m not 100% sure how to fit this. The obvious choice is a least squares straight line fit to find the slope and offset. However, the variance increases with the measurement and I didn’t record that. Also, I don’t know what the error on the capacitance meter is like.

So, I think the best choice is a fit in log space. There the line has a fixed slope, which works well with errors on both measurements, and it deals with higher measurements having higher variance, to some extent. The equation to map measurements (M) to capacitances (C) is:
$C = p_1 ( M + p_2)$

So we just take the log of that and do least squares on the result. The code is really simple in Octave:

```% Data
d = [
0 0 12.99
10 10.5 18.84
22 22.6 25.80
47 48.3 40.48
100 101.7 70.90
220 221 134.03
470 453 259.21
1000 965 539.16
2200 2240 1227.2
];

% Initial parameters: scale p(1) and offset p(2)
p=[1 1];

% Least squares in log space
err = @(p) sum((log(d(2:end,2)) - (log(p(1)) + log(d(2:end,3) + p(2)))).^2);

% Find the parameters
p = fminunc(err, p);

count=115;

% Compute the capacitance for a new measurement
p(1) * (count + p(2))
```
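If you don’t have Octave to hand, the same log-space fit can be sketched in Python using only the standard library. This is not a port of `fminunc`: it swaps in a simpler method, using the fact that for a fixed offset $p_2$ the best $\log p_1$ is just the mean of $\log C - \log(M + p_2)$, and then scanning $p_2$ on a grid. The function names and the grid range are my own choices.

```python
import math

# (measured capacitance in pF, count); the 0 pF row is skipped as in the Octave fit
DATA = [(10.5, 18.84), (22.6, 25.80), (48.3, 40.48), (101.7, 70.90),
        (221, 134.03), (453, 259.21), (965, 539.16), (2240, 1227.2)]


def fit_for_offset(p2):
    """Return (sum of squared log errors, best p1) for a fixed offset p2."""
    residuals = [math.log(c) - math.log(m + p2) for c, m in DATA]
    log_p1 = sum(residuals) / len(residuals)
    err = sum((r - log_p1) ** 2 for r in residuals)
    return err, math.exp(log_p1)


def fit(step=0.01, upper=50.0):
    """Scan p2 on a grid, keeping every count + p2 strictly positive."""
    smallest_count = min(m for _, m in DATA)
    best = None
    p2 = -smallest_count + step
    while p2 < upper:
        err, p1 = fit_for_offset(p2)
        if best is None or err < best[0]:
            best = (err, p1, p2)
        p2 += step
    return best[1], best[2]


p1, p2 = fit()
count = 115
print(p1, p2, p1 * (count + p2))
```

The grid scan is crude but there are only two parameters, and it avoids needing an optimiser at all.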

Nice and easy. Now, does it work? Well, it seems to work with a variety of capacitors I tried it with. And to get intermediate values, I tried it with this rather delightful device from a long dead radio (range 16pF to 493pF):

and it works beautifully!

So, then I tried it on the wire wound capacitive sensor. Can you guess if it worked?

Well, it did! Funny thing though is that my capacitance meter didn’t work on that. Naturally I assumed my home built device was wrong. But it seems life wanted to troll me. Here’s what my capacitance meter does when all is good:

Nice and easy. Changing the range switch alters the speed of the downwards decay curve. So far so good. But when I attached my sensor, this happened:

Absolutely no idea why. It is a big coil, so it might have something to do with the inductance, or maybe pickup. I expect it has a higher input impedance than my device.

TL;DR a short one today, but the sensor works well and is in excellent agreement with my dedicated capacitance meter.

Building an automatic plant waterer (2/?): finding resistive sensing is bad and capacitive sensing is hard

This was harder than I expected.

• Part 1: resistive sensing
• Part 2: finding resistive sensing is bad and capacitive sensing is hard
• Part 3: another crack at a capacitive sensor
• Part 4: calibrating the sensor

Day III (weekend 3)

Not really much time this weekend. I pulled out the electrode (it had been sitting there unpowered) and saw this:

Copper salts deposited on the electrodes. It was really hard to get my phone to reproduce the colour.

It looks like corrosion has already started. There’s not much but it’s only been there a few weeks and has probably spent a few hours powered by now. With my current plan (maybe waking up every 30 minutes), the amount of time ‘on’ would be about 16 minutes per day (10 seconds of higher current charge in each direction). That’s only a week or two before it reaches these levels of corrosion. So, that I think precludes 10 second measurements, or at least 10 seconds of direct charging without a resistor in the way.
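The 16 minutes figure is just arithmetic, sketched here in Python so the assumptions are visible. The wake interval and charge time are the ones from the paragraph above; the "few hours" of powered time is a rough guess, so the number of days is only indicative.

```python
WAKE_INTERVAL_MIN = 30   # wake up every 30 minutes
CHARGE_SECONDS = 10      # per direction
DIRECTIONS = 2           # charge once each way

wakeups_per_day = 24 * 60 // WAKE_INTERVAL_MIN
on_minutes_per_day = wakeups_per_day * CHARGE_SECONDS * DIRECTIONS / 60
print(wakeups_per_day, on_minutes_per_day)  # 48 16.0


def days_to_accumulate(hours_on):
    """Days of operation needed to rack up a given powered-on time."""
    return hours_on * 60 / on_minutes_per_day


print(days_to_accumulate(2), days_to_accumulate(4))  # 7.5 15.0
```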

Day IV (Weekend 4)

Wow really not getting as much time on this as I’d like. So, what about capacitive measurements? Having insulated electrodes should preclude any corrosion problems and I’ll bet that using soil as dielectric will increase the capacitance as the water content increases. First, an initial experiment:

The sensor is a bit of electrical tape over some strip board.

This indicates we’re into the realms of possibility, though the number of picofarads is still small enough to be pretty irritating. It’s also a pretty useless soil sensor: soil/water get through the holes behind and the result is a measurable resistance between the electrodes (about 9M or so). And the insulator is pretty thick which is going to make the capacitance low and reduce the sensitivity.

I do have some enameled wire in various grades. The finest (0.14mm) has a very thin coating. I made a couple of different sensors using the wire, mostly wrapping it around lots to get a decent surface area:

A couple of attempts at a capacitive sensor. Left, interleaved wires, right the two plates are well separated.

One slight problem: it doesn’t measure open circuit; there’s about 15M between the two sides, rising as it dries off. The flat one measures about 28pF when in the air, and about 280 in damp soil, rising to about 2nF when just watered. Unfortunately I don’t know how much the resistance is affecting this, so I’m going to have to try again. I’m going to try the next thicker grade of wire I have (0.23mm) and incidentally it has a different color coating.

Having a nice large spacing between the two plates seemed to work well in that it was easy to clean and reset back to the dry state. So, on to version 2:

Version two of the sensor with thicker wire and nicely soldered joints. 40 turns of wire in each section.

As a reminiscent aside, I remember soldering lacquered wire back in the olden days with my fixed temperature iron. I could never settle on fine sandpaper versus a flame to remove it. Those days I do not miss, now I just crank up the iron temperature.

Anyway this seems to be going better: the resistance is greater than 2G. I think I might have mentioned it before but my multimeter skipped leg day. It can measure up to 2GOhm, but only down to 20mA. Weird. On to the capacitance measurements. So they are:

• BOGUS! that’s what they are, bogus!

Well shoot. It seemed to be working great, but after a bit of use the resistance is back to being about 10M. That’s disappointing. OK, try 3! I’m going to wrap the wires longitudinally so that they never even cross:

I forgot to take a picture of it! Look at the “fix” below with hot-melt to get the idea.

Anyway the wires are always separated by about 4mm. So the measurements are:

• First use: 12pF
• Finger lightly on one side: 28pF
• Fingers pressed on both sides: 100pF
• Damp soil: 185pF, 233pF, 202pF, 190pF
• Slightly compacted damp soil: 323pF
• Same place during watering: 680pF
• Resistance: 12M

OK well, this is getting suspicious. The 20M range is maxed out. But the 2G range reads low (there’s nothing in between, that’s only a few % difference). Now the capacitance reads in the nF range as well. Hitting it with a heat gun seems to reset everything.

So putting a blob of water on in the middle doesn’t do anything. Putting a blob of water on the end where the wires are bent round quickly drops the resistance back down to 10M. I think the act of wrapping the wire breaks the insulation very slightly. Well, that’s irritating. Let’s see:

If you use hot melt and don’t have a reflow style hot air gun, you’re really missing out.

OK, so the new measurements:

• Damp soil: 250pF, 360pF
• During watering: 1nF
• After watering: 600pF and dropping
• Resistance: 𝟚𝟘𝕄

Apparently there is something hot-melt can’t fix. Observing more, the resistance is climbing very slowly, up to 26 now, now 40. Well, it might not matter. If I keep my measurement resistors well under the 20M range (say 200k), then the small error incurred due to leakage won’t matter. Still, I’d prefer to have it work properly.

So where are we? The capacitance sensor definitely works after a fashion, but we need to measure it. It bottoms out at 30pF, and is well into useful readings at about 300pF or so. I think I could get away with a 1M resistor safely. For an RC circuit, that would give a time constant of about 30us, which is small, but that’s at the 0 end of the range. It’s just about measurable on an Attiny85 with the 16MHz clock.
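Here is that time constant worked through in Python, along with how many clock cycles it corresponds to. The 1M resistor, the 30pF and 300pF capacitances and the 16MHz clock are from the paragraph above; the function name is mine.

```python
R_OHMS = 1e6       # 1M charging resistor
CLOCK_HZ = 16e6    # ATtiny85 clock


def time_constant(c_farads, r_ohms=R_OHMS):
    """RC time constant in seconds."""
    return r_ohms * c_farads


for c_pf in (30, 300):
    tau = time_constant(c_pf * 1e-12)
    print(c_pf, "pF:", tau * 1e6, "us,", tau * CLOCK_HZ, "clock cycles")
```

So the dry end of the range is a few hundred clock cycles per time constant, and the useful readings are in the thousands.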

Additionally, the Attiny85 has a built in comparator. So, my current mental design has a relaxation oscillator in mind: charge up the capacitor through a 1M resistor, then discharge through a GPIO pin once the voltage crosses a threshold.
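As a rough sketch of how a charge time would turn into a capacitance, here is the standard RC charging equation in Python. It assumes the capacitor starts fully discharged and that the comparator trips at a fixed fraction of the supply; the fraction of 0.5 is a placeholder, not a value I have settled on.

```python
import math


def charge_time(c_farads, r_ohms=1e6, threshold_fraction=0.5):
    """Time for an RC circuit charging from 0V to reach the threshold.

    v(t) = Vcc * (1 - exp(-t / RC)), solved for v(t) = fraction * Vcc.
    """
    return -r_ohms * c_farads * math.log(1 - threshold_fraction)


def capacitance_from_time(t_seconds, r_ohms=1e6, threshold_fraction=0.5):
    """Invert charge_time to recover the capacitance."""
    return t_seconds / (-r_ohms * math.log(1 - threshold_fraction))


t = charge_time(300e-12)
print(t * 1e6, "us")                      # roughly 208 us for 300pF
print(capacitance_from_time(t) * 1e12)    # back to about 300 pF
```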

Sounds like a plan.

Building an automatic plant waterer (1/?): resistive sensing

This turned into a saga. Naturally this is Part 1.

• Part 1: resistive sensing
• Part 2: finding resistive sensing is bad and capacitive sensing is hard
• Part 3: another crack at a capacitive sensor
• Part 4: calibrating the sensor

This is a blow-by-blow account rather than a neat design story, so you get to see the experiments I did to prove/disprove ideas and the dead ends that I went down. All the dead ends. So many…

Day 1

I bought a pitcher plant. Unfortunately it turns out that I am less good at remembering to water it than I fooled myself into believing. So, instead of watering it, I’m in my lab building a device to water it for me. I’m also engaging in the entertaining game of minimizing the BoM on the electronics side as much as possible. My current thought is an old 12V supply, an attiny of some sort, a MOSFET, a soil probe and a peristaltic pump.

Through the magic of the internet I have some supplies:

A peristaltic pump (12V), some T-adapters which were supposed to just be couplers but I must have ordered the wrong ones and some slightly odd sized tube because the tube I ordered did not arrive.

I ordered a series of tubes

And because I have to take everything apart, here’s the pump:

There are no gears. There are only 12 parts (motor, 2 screws, mount, case top, case bottom, 3 rollers, roller holder and tube). Compare this to the older design of cheapie peristaltic pump:

Old cheap peristaltic pump.

The old design is more complex. It’s also noisier and doesn’t run as smoothly, it’s harder to put the tube in and it has a real tendency to split tubes if they’re not precisely the right size. I like the new design.

Exploration

The next bit is to figure out the moisture sensor. I’m going to measure the resistance between two conductors (on stripboard). Firstly, splitting a chunk off by bending it over in a vice is less reliable than I thought it might be…

Yuck.

Pre-scoring it heavily, then filing after was tedious but ultimately gave a much cleaner cut:

Now to try it on a victim plant. It’s a fuchsia which I’ve propagated from cuttings and I’ve been keeping indoors. I’ve not watered it in a while, so it’s very dry. I made a bunch of measurements with a spacing of zero and one columns between the electrodes.

Interestingly it didn’t make all that much difference. Either way the resistance was between about 0.8 and 3.3MΩ. Now for the other end of the scale:

Of course I didn’t wait half an hour. I waited more like an hour. Either way the moisture looks like it’s thoroughly propagated around. Time to measure. First a spacing of one row:

Interestingly, the resistance takes ages to settle, on the order of minutes where it keeps changing. The direction depends on the measurement range, so I suspect there’s some sort of electrochemical effect going on. Brief pulsed measurements (as brief as I can get) on the 200K range give about 30K resistance, but it rapidly climbs. On the 200k range long term it gives, well, it’s up to 118K and climbing very slowly. On the 2M range long term it gives about 62K resistance.

Now I find it’s a dodgy battery

And guess what! It’s a cell apparently (my trusty old multimeter has a 2GΩ range but nothing below 20mA. I think it skipped leg day):

Interestingly, measurements “shortly” after shorting it (hee hee) are also around 30k. Maybe less. There’s actually something interesting going on here, and it’s too fast really to see on one of these multimeters. Plus I don’t know what their characteristics are in general. So, I’ve set up a 5V supply, a 22k resistor and the moisture sensor, and I’ve put a scope across the moisture sensor. I start by shorting across the plant, then releasing the short. And this is what it looks like on both short and long timescales:

The results are… interesting. I suspect electrolysis is occurring. I’m going to have to try feeding it with AC to see what the results look like. Everything is always more complicated than I expect! And here’s the voltage recovery after stopping shorting it:

Trying to measure it

OK, so to the Arduino! I’m going to use one to generate AC. I use two GPIO pins as a very tiny H-bridge to generate square wave AC which is 5V pk-pk. The setup is the same as before, I’ve got the sensor in series with a 22kΩ resistor. I’m measuring the voltage across the sensor. For much of the rest of this post, the measurements are going to be done in the same way, with the results shown on a scope.
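Since the sensor and the 22kΩ resistor form a potential divider, a scope reading converts to an apparent resistance like this. It is a minimal Python sketch assuming a 5V drive and treating the sensor as a plain resistor, which the traces show it very much isn’t, so take it as the idealised case.

```python
def sensor_resistance(v_sensor, v_supply=5.0, r_series=22e3):
    """Apparent sensor resistance from the voltage measured across it.

    The same current flows through both parts of the divider, so
    R_sensor / R_series = V_sensor / (V_supply - V_sensor).
    """
    if not 0 <= v_sensor < v_supply:
        raise ValueError("sensor voltage must be between 0 and the supply")
    return r_series * v_sensor / (v_supply - v_sensor)


print(sensor_resistance(2.5))   # 22000.0: half the supply means equal resistances
print(sensor_resistance(1.0))   # 5500.0
```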

For interest I’m going to always be showing the voltage in yellow and the absolute value of the voltage in red. All things being equal, you’d expect it to be symmetric, so both halves of the trace will look the same. Hey, here’s a question: do you think it’ll be nice and simple?

So, here’s the first measurement…

Look how asymmetric the measurements are: despite the voltage reversing every cycle, all the yellow measurements are negative!

but after a while it looked like this:

Very symmetric measurements. The y axis has been doubled.

OK so what is going on here?

Day II

You know I suspect now that I was making measurements with the same polarity every single time and I made a very crude rechargeable battery. During the AC measurements it eventually discharged which is why it went from biased to unbiased.

OK, so what about a 4 point measurement? That should eliminate effects on the driven electrodes. On the other hand, suddenly the complexity will have spiraled rather high, from essentially microcontroller and MOSFET to a whole analogue front end. Unless I can essentially do two 3 point measurements and subtract them. Then it’s just wires…

But first, the rechargeable battery hypothesis. I’m going to apply 5V across the electrodes for 30 seconds, at whatever current it will take. Then I’ll measure the open circuit voltage and the short circuit current. Of course, I’ll also take those measurements beforehand, with cleaned electrodes in an undisturbed location. Beforehand, I get 8mV and 0.3uA. I guess the electrodes weren’t perfectly clean or the soil is not perfectly isotropic…

Afterwards, the open circuit voltage is about 0.5V, rapidly decaying to about 0.3V and then 0.2V. By the time it has reached 0.2V it delivers about 15uA, which also decays rapidly, slowing down at about 4uA but continuing to decay. To investigate further, I’m going to use a square wave again, but with a more interesting pattern. I’m going to drive through the 22k resistor, then discharge through the 22k resistor, then the same but with the opposite polarity:

Charge, discharge, reverse charge, discharge. Also measured with averaging for a cleaner signal. It’s already lost a bit of symmetry.

You can see that after applying a voltage, some residual charge remains. Clearly though 100ms isn’t anything like enough to reach any kind of steady state. So, here’s a longer timescale:

10 second pulse, discharge, reverse pulse, discharge. Moderately symmetric this time.

That’s looking somewhat better. Most of them seem to have reached steady state after about 10s.

Day III (weekend 2)

OK, so the soil is drier than it was. That means the resistance will have gone up, and so I’d expect a higher voltage across the soil than the last time I did some measurements. I’m going to do very long, long and medium length measurements (100s, 10s and 1s). Mostly I picked those because my scope maxes out at 50s per division. Here’s how they look. Also wow, 50s per division takes aaaagessss.

Well, they are looking oddly asymmetric (again). I didn’t leave any time between the measurements. It looks settled after 20 seconds.

I wonder though, can I speed this up? At the moment, the battery is charging through a 22k resistor. Perhaps what I could do is put another pin in parallel with the resistor, so I can charge directly, then measure with the resistor. Time to add another pin and some more code…

The cycle is going to be charge directly, then measure using the 22k resistor, then discharge directly. Then repeat the cycle but in reverse. The first result is this:

That looks pretty promising. Those reads look pretty stable. But just to be sure, I’m going to go for some longer reads to see how they look. By the way the code for this is:

```
void setup() {
  pinMode(4, INPUT);  // Connects to the top of the moisture sensor.
  pinMode(2, OUTPUT); // Connects to the top of the 22k resistor.
  pinMode(3, OUTPUT); // Connects to the bottom of the moisture sensor.
  // The bottom of the 22k resistor and the top of the moisture sensor
  // are connected to form a potential divider.
}

// The loop routine runs over and over again forever:
void loop() {
  static const int32_t D1 = 1000; // Fast charge and discharge time (ms)
  static const int32_t D2 = 1000; // Measurement time (ms)

  // Fast charge: pin 4 drives the sensor directly, bypassing the resistor
  pinMode(4, OUTPUT);
  digitalWrite(4, HIGH);
  digitalWrite(2, HIGH);
  digitalWrite(3, LOW);
  delay(D1);

  // Slow charge / measure: pin 4 floats, so current flows through the 22k
  pinMode(4, INPUT);
  delay(D2);

  // Discharge: all pins low
  pinMode(4, OUTPUT);
  digitalWrite(4, LOW);
  digitalWrite(2, LOW);
  digitalWrite(3, LOW);
  delay(D1);

  // Fast charge, reverse polarity
  pinMode(4, OUTPUT);
  digitalWrite(4, LOW);
  digitalWrite(2, LOW);
  digitalWrite(3, HIGH);
  delay(D1);

  // Slow charge / measure, reverse polarity
  pinMode(4, INPUT);
  delay(D2);

  // Discharge: all pins low
  pinMode(4, OUTPUT);
  digitalWrite(4, LOW);
  digitalWrite(2, LOW);
  digitalWrite(3, LOW);
  delay(D1);
}
```

Note how the fast charge time (`D1`) and the measuring time (`D2`) can be changed independently. So using that I’m going to keep the fast charge at 1s and extend the measuring to 10s.

Uhmmmm what? I’m getting seriously confused here. It looks like the cell isn’t fully charged after the initial 1 second spike. And it looks like one direction holds more charge than the other (one flattens off, the other does not). I’m going to try bumping everything up to 10s.

If anything that seems worse than the 1s measurements.

Conclusions so far

• Water does indeed reduce the resistance of the soil.
• Weird electrochemical effects are happening.
• Longer measurements are not definitively better than shorter ones.
• AC seems to be necessary to stop really unpleasant memory effects.
• Shorter measurements might be better, causing less corrosion and electrolysis of the electrodes.
• It’s probably worth trying a higher resistor to capture more of the useful range.

Moving on

So, I decided to try watering it just to see what happens. Here’s what the measurement plots look like:

I don’t remotely understand what’s going on. The cycle times are 10x apart and yet the curves look really really similar. Either way though it looks like adding water makes it vary more over time. Bleh. Or maybe it makes the charging happen faster? I’m now really quite unsure what’s going on. In fact look at this one:

It decays down as usual after the first quick charge, but then the line slopes up slightly. It’s almost like it continues to charge.

The plan

Either way though it looks like the scheme will work. The plan is to wake up every half an hour or so, measure the resistance and dispense some water if it’s too high. It’s probably worth taking long measurements so they can be read on a multimeter, so the level can be calibrated easily. 10s seems decent for that.
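The decision step of that plan might look something like the sketch below. Everything here is hypothetical: the names, the idea of averaging the two polarities, and especially the threshold, which would have to come from calibrating against real readings.

```cpp
// Roughly "every half an hour", in milliseconds (assumed value).
const unsigned long kWakeIntervalMs = 30UL * 60UL * 1000UL;

// Hypothetical decision step for the watering plan.
// rForward / rReverse: resistance readings (ohms) from the two polarities.
// rThreshold: the "too dry" level, to be calibrated against a multimeter.
bool shouldWater(double rForward, double rReverse, double rThreshold) {
    // Average both polarities so that a residual offset in one direction
    // doesn't dominate the decision.
    double rMean = (rForward + rReverse) / 2.0;
    return rMean > rThreshold;
}
```

Keeping this as a plain function means it can be checked on a desktop before it goes anywhere near a pump.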

Oh yeah! I completely forgot about doing 4 point measurements. I should totally do that. Here’s how:

4 point measurement electrode. The voltage is applied to the outer two and measured on the inner two.

This should be good. Yes:

Yeah so I have even less idea what’s going on there. Time to abandon THAT line of inquiry.

The conclusion is that the resistive sensor is probably workable, and with simple 2 point measurements, which is nice.

(on to Part 2)