New paper: 3D Structure from 2D Microscopy images using Deep Learning

Benjamin J. Blundell, Christian Sieben, Suliana Manley, myself, QueeLim Ch’ng, and Susan Cox published a new paper in Frontiers in Bioinformatics (open access).

Abstract

Understanding the structure of a protein complex is crucial in determining its function. However, retrieving accurate 3D structures from microscopy images is highly challenging, particularly as many imaging modalities are two-dimensional. Recent advances in Artificial Intelligence have been applied to this problem, primarily using voxel based approaches to analyse sets of electron microscopy images. Here we present a deep learning solution for reconstructing the protein complexes from a number of 2D single molecule localization microscopy images, with the solution being completely unconstrained. Our convolutional neural network coupled with a differentiable renderer predicts pose and derives a single structure. After training, the network is discarded, with the output of this method being a structural model which fits the data-set. We demonstrate the performance of our system on two protein complexes: CEP152 (which comprises part of the proximal toroid of the centriole) and centrioles.

Don’t pay for what you don’t use in libCVD

libCVD has a pretty spiffy image loading function. You can do:

Image<Rgb<byte>> img = img_load("a_file.ext");

and you’re ready to go accessing pixels. The img_load function takes care of a lot for you: it determines the file type, calls the appropriate handler, then converts whatever the pixel type on disk is (it could be binary, greyscale or high bit depth) into the type in your program (and you don’t even need to provide the type to img_load).
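
For example (illustrative only, going by the behaviour described above rather than the exact libCVD documentation), the same call site can load into whatever pixel type the program wants:

Image<float> grey = img_load("a_file.ext");         // converted to greyscale float on load
Image<Rgb<byte>> colour = img_load("a_file.ext");   // converted to 8-bit RGB on load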

At this point, bear in mind that libCVD is a library for computer vision (especially frame-rate vision), where you know at compile time which pixel type you need. The automatic conversion would be a showstopper if you wanted to represent the file exactly as stored, but for this domain you simply want the data in the type your program works with.

This function is very easy to use, but potentially expensive because libCVD supports quite a wide variety of image types. That causes two problems for shipped code:

  1. The shipped binary will be larger than necessary because it will contain code to load image formats you probably don’t care to support in production (e.g. FITS).
  2. Increased attack surface. TIFF in particular is a very complex file format with a vast number of options, and as a result libtiff has even recently had a number of serious CVEs.

You can compile libCVD without an external library (e.g. TIFF), but currently there’s no way of switching off the built-in formats. I could add that, but it creates another problem: I didn’t add those formats for no reason. During debugging or analysis it can often be very useful to save and load internal state, such as floating point images (for which you’d need TIFF or FITS). You’d then need to build libCVD in multiple configurations, with and without various options, and switch them in and out as necessary.

That’s a big administrative disadvantage and adds a continual burden of wrangling multiple build configurations. The solution, it turns out, was remarkably simple:

Why not provide a type list? Then the linker can remove unused code.

– David McCabe

And that’s it, really! There are three minor variations:

Image<byte> i = img_load("file"); 
Image<byte> i = img_load<PNG::Reader, JPEG::Reader>("file"); 

using Files = std::tuple<PNG::Reader, JPEG::Reader>;
Image<byte> i = img_load<Files>("file");  

The first works as before and will load all supported image types. The second and third limit the list to only the specified readers, and code for the other types won’t be included anywhere in the resulting binary. The two new variations are both provided for ergonomics: the second because it’s nice to use directly, the third because you can’t store a parameter pack in an alias for later reuse, whereas you can store a tuple.

Internally it’s implemented using tuples because converting from a pack to a tuple is easy, but the reverse is more annoying.

The implementation is pretty straightforward (edited for brevity; the runtime error checking stuff isn’t relevant). First the code to load given a typelist in a tuple:

template<class I, class ImageTypeList, int N=0>
void img_load_tuple(Image<I>& im, std::istream& i, [[maybe_unused]] int c){
	if constexpr (N==std::tuple_size_v<ImageTypeList>) {
		throw Exceptions::Image_IO::UnsupportedImageType();
	}
	else{
		using ImageReader = std::tuple_element_t<N, ImageTypeList>;
	
		if(ImageReader::first_byte_matches(c))
			CVD::Internal::readImage<I,ImageReader>(im, i);
		else
			img_load_tuple<I, ImageTypeList, N+1>(im, i, c);
	}
}

It’s a pretty run-of-the-mill compile-time iteration scheme, which is now just a single function thanks to if constexpr. Note that loading makes its decision based on the first byte of the file, and each image loader has a function to test for a match. This replaces the old if-else chain, which I misremembered as a switch statement (edited):

	template<class I> void img_load(Image<I>& im, std::istream& i)
	{
	  unsigned char c = i.peek();
	 
	  if(c == 'P')
	    CVD::Internal::readImage<I, PNM::Reader>(im, i);
	  else if(c == 0xff)
	    CVD::Internal::readImage<I, JPEG::reader>(im, i);
	  else if(c == 'I' || c == 'M') //Little or big endian TIFF
	    CVD::Internal::readImage<I, TIFF::tiff_reader>(im, i);
	  else if(c == 0x89)
	    CVD::Internal::readImage<I, PNG::png_reader>(im, i);
	  else if(c == 'B')
	    CVD::Internal::readImage<I, BMP::Reader>(im, i);
	  else if(c == 'S')
	    CVD::Internal::readImage<I, FITS::reader>(im, i);
	  else if(c == 'C')
		CVD::Internal::readImage<I, CVDimage::reader>(im, i);
	  else if(c == ' ' || c == '\t' || isdigit(c) || c == '-' || c == '+')
	    CVD::Internal::readImage<I, TEXT::reader>(im, i);
	  else
	    throw Exceptions::Image_IO::UnsupportedImageType();
	}

Fortunately the image types are distinguishable from exactly 1 byte. This is handy because it allows all applicable types to be read from a non-seekable istream (e.g. one wrapping a pipe) without any modification because you can peek exactly one byte.
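
As an illustration (a sketch based on the magic bytes in the old chain above, not the actual libCVD source), a reader’s test might look something like this:

namespace PNG {
	struct Reader {
		// PNG files start with the byte 0x89, so that's all that needs checking.
		static bool first_byte_matches(int c) { return c == 0x89; }
		// ...the actual decoding machinery goes here...
	};
}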

The neat bit that allows the dual interface is a template which turns its input into a tuple: give it either a tuple or a parameter pack and you always get a tuple back:

template<class... T> struct as_tuple{
	using type = std::tuple<T...>;
};

template<class... T> struct as_tuple<std::tuple<T...>>{
	using type = std::tuple<T...>;
};
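
For illustration (not part of the library, and needing <type_traits> for std::is_same_v), a bare pack and an equivalent tuple collapse to the same type:

static_assert(std::is_same_v<
	as_tuple<PNG::Reader, JPEG::Reader>::type,
	as_tuple<std::tuple<PNG::Reader, JPEG::Reader>>::type>);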

This then allows a single function which has no opinions about what it is meant to operate on:

template <class I, class Head = Internal::AllImageTypes, class... ImageTypes>
void img_load(Image<I>& im, std::istream& i)
{
	// Peek the first byte; img_load_tuple dispatches on it.
	int c = i.peek();
	img_load_tuple<I, typename Internal::as_tuple<Head, ImageTypes...>::type>(im, i, c);
}

The only flourish is taking two template arguments for the loaders, one plain and one a pack, in order to allow the arguments to be defaulted.

The result, I think, gives the best of all worlds. It has a simple-to-use interface, lets the user avoid paying for what they don’t use, avoids compile-time configuration, and it’s a clear and straightforward implementation. It also enforces a more sensible separation: each image loader determines from the file’s magic byte whether it can operate, rather than having two different parts of the loading logic split across different places.

What’s all this new-fangled DBus rubbish?

The 1990s called…

DBus has been one of the larger changes that has swept through Linux in the last 20 years or so. I have mostly (with the exception of a small amount of whinging) ignored it. Usually I’ve encountered it when it hasn’t worked or has got in the way of something or other. Naturally, when things worked, I was generally unaware of it.

It’s basically:

  • Async RPC system with call, response, exception and asynchronous push.
  • Authentication (useful for talking to daemons running as root)
  • Introspection (the set of methods, arguments etc are exposed via DBus)
  • Central point for services to register objects and methods: you open a dbus connection as opposed to a random socket

Various tools exist to send and receive messages, e.g. Python libraries, shell commands via dbus-send, and so on. I’ve nothing against RPC in general, but it essentially replaces shell scripts triggered by the kernel and the like with programs exchanging messages. You can do more with the latter (though often that isn’t necessary), but it replaces easily discoverable scripts with reading documentation, something the programming community is not well known for, and as such it has a rather higher barrier to entry.

It has also been used to replace simple libraries with relatively complex IPC requiring complex setup. But I won’t blame a table saw for injuries it causes through misuse. Anyway, this post is a nearly linear stream of me learning DBus, with the mistakes and confusion removed.

The tools

I will be relying on the following:

  • python dbus module import dbus
  • The basic dbus commandline program: dbus-send
  • Qt’s qdbus, since in many cases it provides a nicer interface and introspection
  • Gnome’s gdbus for more verbose introspection

The basics

I’m interested in manipulating the system, so I will be working with the system bus. There’s usually a system bus (for talking to OS related daemons) and a session bus (for all the programs in a logged in session). You can make more if you like but no one does.

In order to make an RPC call, you need:

  1. A program to talk to, e.g. NetworkManager (this is called the bus name) and usually has a name like org.freedesktop.NetworkManager
  2. An object in the program. Each program exports a tree of objects, rooted at / and separated with forward slashes. Something like: /org/freedesktop/NetworkManager. Freedesktop likes redundant naming. They could equally well have exported /.
  3. The interface and method name. Just like any OO system each object can present zero or more interfaces with methods.
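
Putting those three together, a call with dbus-send has the general shape below (placeholders in angle brackets; concrete examples follow shortly):

dbus-send --system --print-reply --dest=<bus name> <object path> <interface>.<method> [arguments...]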

We can examine this. First, qdbus will show the names present on the system bus. For example, on my system (I don’t know what the numbers prefixed with colons are) I have:

qdbus --system  | grep -v :
 org.freedesktop.systemd1
 org.freedesktop.resolve1
 fi.epitest.hostap.WPASupplicant
 fi.w1.wpa_supplicant1
 org.freedesktop.NetworkManager
 org.gnome.DisplayManager
 org.freedesktop.ColorManager
 org.freedesktop.Avahi
 org.bluez
 org.freedesktop.UPower
 org.freedesktop.Accounts
 org.freedesktop.login1
 org.freedesktop.RealtimeKit1
 org.freedesktop.UDisks2
 org.freedesktop.ModemManager1
 org.freedesktop.bolt
 org.freedesktop.PackageKit
 org.freedesktop.PolicyKit1
org.freedesktop.DBus

There are a few one might recognise: UDisks2 for hotplug disks, NetworkManager for wifi control, etc., ModemManager1 because this machine actually has a real, physical 56k modem, and a few other well-established ones like systemd. You can go further and query what’s on the bus. For example, I can query systemd using gdbus, which as you may recall is the verbose, detailed one. I’m going to query the root object to see what’s there:

$gdbus introspect  --system --dest org.freedesktop.systemd1   --object-path / 
node / {
  interface org.freedesktop.DBus.Peer {
    methods:
      Ping();
      GetMachineId(out s machine_uuid);
    signals:
    properties:
  };
  interface org.freedesktop.DBus.Introspectable {
    methods:
      Introspect(out s data);
    signals:
    properties:
  };
  interface org.freedesktop.DBus.Properties {
    methods:
      Get(in  s interface,
          in  s property,
          out v value);
      GetAll(in  s interface,
             out a{sv} properties);
      Set(in  s interface,
          in  s property,
          in  v value);
    signals:
      PropertiesChanged(s interface,
                        a{sv} changed_properties,
                        as invalidated_properties);
    properties:
  };
  node org {
  };
};

If you’ve ever read IDLs of any sort this will look vaguely familiar. Systemd is exporting an object which presents three interfaces:

  • org.freedesktop.DBus.Peer
  • org.freedesktop.DBus.Introspectable
  • org.freedesktop.DBus.Properties

and a subobject, org. Since gdbus doesn’t recurse by default, that’s all that’s displayed. The first of those interfaces has two methods, neither of which takes any input arguments, which makes them easy to call. So I shall call one:

$dbus-send --print-reply --system --dest=org.freedesktop.systemd1 / org.freedesktop.DBus.Peer.GetMachineId
method return time=1632074720.055870 sender=:1.0 -> destination=:1.3006 serial=54374 reply_serial=2
   string "73f867af39124a3583c288e620019332"

and it responds. See how the bus name (dest), path (/) and interface.method (org.freedesktop.DBus.Peer.GetMachineId) are used. I can examine another common one. When given a bus but no object, qdbus will recurse, showing the object tree:

$qdbus --literal --system org.freedesktop.NetworkManager 
/
/org
/org/freedesktop
/org/freedesktop/NetworkManager
/org/freedesktop/NetworkManager/DnsManager
/org/freedesktop/NetworkManager/DHCP4Config
/org/freedesktop/NetworkManager/DHCP4Config/70
/org/freedesktop/NetworkManager/ActiveConnection
/org/freedesktop/NetworkManager/ActiveConnection/2
/org/freedesktop/NetworkManager/ActiveConnection/72
/org/freedesktop/NetworkManager/AccessPoint
/org/freedesktop/NetworkManager/AccessPoint/2686
/org/freedesktop/NetworkManager/Devices
/org/freedesktop/NetworkManager/Devices/3
/org/freedesktop/NetworkManager/Devices/2
/org/freedesktop/NetworkManager/Devices/1
/org/freedesktop/NetworkManager/Devices/25
/org/freedesktop/NetworkManager/Devices/4
/org/freedesktop/NetworkManager/AgentManager
/org/freedesktop/NetworkManager/Settings
/org/freedesktop/NetworkManager/Settings/8
/org/freedesktop/NetworkManager/Settings/7
/org/freedesktop/NetworkManager/Settings/6
/org/freedesktop/NetworkManager/Settings/5
/org/freedesktop/NetworkManager/Settings/4
/org/freedesktop/NetworkManager/Settings/3
/org/freedesktop/NetworkManager/Settings/2
/org/freedesktop/NetworkManager/Settings/1
/org/freedesktop/NetworkManager/Settings/24
/org/freedesktop/NetworkManager/Settings/9
/org/freedesktop/NetworkManager/IP6Config
/org/freedesktop/NetworkManager/IP6Config/3
/org/freedesktop/NetworkManager/IP6Config/204
/org/freedesktop/NetworkManager/IP6Config/203
/org/freedesktop/NetworkManager/IP6Config/6
/org/freedesktop/NetworkManager/IP4Config
/org/freedesktop/NetworkManager/IP4Config/3
/org/freedesktop/NetworkManager/IP4Config/204
/org/freedesktop/NetworkManager/IP4Config/203
/org/freedesktop/NetworkManager/IP4Config/6

For reference, gdbus (non recursive; recursive is too verbose for this blog post) gives:

$gdbus introspect  --system --dest org.freedesktop.NetworkManager   --object-path /
node / {
  node org {
  };
};

The root node isn’t very interesting: it just has the child org and nothing else. qdbus will also give a more compact method view; for example, I can query one of the devices:

$qdbus --literal --system org.freedesktop.NetworkManager /org/freedesktop/NetworkManager/Devices/3 
method QDBusVariant org.freedesktop.DBus.Properties.Get(QString interface_name, QString property_name)
method QVariantMap org.freedesktop.DBus.Properties.GetAll(QString interface_name)
signal void org.freedesktop.DBus.Properties.PropertiesChanged(QString interface_name, QVariantMap changed_properties, QStringList invalidated_properties)
method void org.freedesktop.DBus.Properties.Set(QString interface_name, QString property_name, QDBusVariant value)
method QString org.freedesktop.DBus.Introspectable.Introspect()
method QString org.freedesktop.DBus.Peer.GetMachineId()
method void org.freedesktop.DBus.Peer.Ping()
property read QDBusObjectPath org.freedesktop.NetworkManager.Device.ActiveConnection
property readwrite bool org.freedesktop.NetworkManager.Device.Autoconnect
property read QList<QDBusObjectPath> org.freedesktop.NetworkManager.Device.AvailableConnections
property read uint org.freedesktop.NetworkManager.Device.Capabilities
property read uint org.freedesktop.NetworkManager.Device.DeviceType
property read QDBusObjectPath org.freedesktop.NetworkManager.Device.Dhcp4Config
property read QDBusObjectPath org.freedesktop.NetworkManager.Device.Dhcp6Config
property read QString org.freedesktop.NetworkManager.Device.Driver
property read QString org.freedesktop.NetworkManager.Device.DriverVersion
property read bool org.freedesktop.NetworkManager.Device.FirmwareMissing
property read QString org.freedesktop.NetworkManager.Device.FirmwareVersion
property read QString org.freedesktop.NetworkManager.Device.Interface
property read uint org.freedesktop.NetworkManager.Device.Ip4Address
property read QDBusObjectPath org.freedesktop.NetworkManager.Device.Ip4Config
property read QDBusObjectPath org.freedesktop.NetworkManager.Device.Ip6Config
property read QString org.freedesktop.NetworkManager.Device.IpInterface
property read QDBusRawType::aa{sv} org.freedesktop.NetworkManager.Device.LldpNeighbors
property readwrite bool org.freedesktop.NetworkManager.Device.Managed
property read uint org.freedesktop.NetworkManager.Device.Metered
property read uint org.freedesktop.NetworkManager.Device.Mtu
property read bool org.freedesktop.NetworkManager.Device.NmPluginMissing
property read QString org.freedesktop.NetworkManager.Device.PhysicalPortId
property read bool org.freedesktop.NetworkManager.Device.Real
property read uint org.freedesktop.NetworkManager.Device.State
property read QDBusRawType::(uu) org.freedesktop.NetworkManager.Device.StateReason
property read QString org.freedesktop.NetworkManager.Device.Udi
method void org.freedesktop.NetworkManager.Device.Delete()
method void org.freedesktop.NetworkManager.Device.Disconnect()
method QDBusRawType::a{sa{sv}} org.freedesktop.NetworkManager.Device.GetAppliedConnection(uint flags, qulonglong& version_id)
method void org.freedesktop.NetworkManager.Device.Reapply(QDBusRawType::a{sa{sv}} connection, qulonglong version_id, uint flags)
signal void org.freedesktop.NetworkManager.Device.StateChanged(uint new_state, uint old_state, uint reason)
property read QList<QDBusObjectPath> org.freedesktop.NetworkManager.Device.Wireless.AccessPoints
property read QDBusObjectPath org.freedesktop.NetworkManager.Device.Wireless.ActiveAccessPoint
property read uint org.freedesktop.NetworkManager.Device.Wireless.Bitrate
property read QString org.freedesktop.NetworkManager.Device.Wireless.HwAddress
property read uint org.freedesktop.NetworkManager.Device.Wireless.Mode
property read QString org.freedesktop.NetworkManager.Device.Wireless.PermHwAddress
property read uint org.freedesktop.NetworkManager.Device.Wireless.WirelessCapabilities
signal void org.freedesktop.NetworkManager.Device.Wireless.AccessPointAdded(QDBusObjectPath access_point)
signal void org.freedesktop.NetworkManager.Device.Wireless.AccessPointRemoved(QDBusObjectPath access_point)
method QList<QDBusObjectPath> org.freedesktop.NetworkManager.Device.Wireless.GetAccessPoints()
method QList<QDBusObjectPath> org.freedesktop.NetworkManager.Device.Wireless.GetAllAccessPoints()
signal void org.freedesktop.NetworkManager.Device.Wireless.PropertiesChanged(QVariantMap properties)
method void org.freedesktop.NetworkManager.Device.Wireless.RequestScan(QVariantMap options)
property readwrite uint org.freedesktop.NetworkManager.Device.Statistics.RefreshRateMs
property read qulonglong org.freedesktop.NetworkManager.Device.Statistics.RxBytes
property read qulonglong org.freedesktop.NetworkManager.Device.Statistics.TxBytes
signal void org.freedesktop.NetworkManager.Device.Statistics.PropertiesChanged(QVariantMap properties)

Yikes! There’s a lot there. Actually, gdbus has the nice touch that it also queries the property values for you; I recommend trying that. Anyway, if you read through carefully, you can see the GetAccessPoints method. If I call it, I get a list of access points:

$dbus-send --print-reply --system --dest=org.freedesktop.NetworkManager /org/freedesktop/NetworkManager/Devices/3 org.freedesktop.NetworkManager.Device.Wireless.GetAccessPoints
method return time=1632075788.446527 sender=:1.12 -> destination=:1.3041 serial=431030 reply_serial=2
   array [
      object path "/org/freedesktop/NetworkManager/AccessPoint/2686"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2699"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2700"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2701"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2702"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2703"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2704"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2706"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2708"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2710"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2711"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2712"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2713"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2714"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2715"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2716"
      object path "/org/freedesktop/NetworkManager/AccessPoint/2717"
   ]

This is not very shell friendly, but I can persist. For example, if I examine one of the APs, I get:

$gdbus introspect  --system --dest org.freedesktop.NetworkManager   --object-path /org/freedesktop/NetworkManager/AccessPoint/2714
node /org/freedesktop/NetworkManager/AccessPoint/2714 {
  interface org.freedesktop.DBus.Properties {
    methods:
      Get(in  s interface_name,
          in  s property_name,
          out v value);
      GetAll(in  s interface_name,
             out a{sv} properties);
      Set(in  s interface_name,
          in  s property_name,
          in  v value);
    signals:
      PropertiesChanged(s interface_name,
                        a{sv} changed_properties,
                        as invalidated_properties);
    properties:
  };
  interface org.freedesktop.DBus.Introspectable {
    methods:
      Introspect(out s xml_data);
    signals:
    properties:
  };
  interface org.freedesktop.DBus.Peer {
    methods:
      Ping();
      GetMachineId(out s machine_uuid);
    signals:
    properties:
  };
  interface org.freedesktop.NetworkManager.AccessPoint {
    methods:
    signals:
      PropertiesChanged(a{sv} properties);
    properties:
      readonly u Flags = 0;
      readonly u WpaFlags = 0;
      readonly u RsnFlags = 0;
      readonly ay Ssid = [0x48, 0x50, 0x2d, 0x50, 0x72, 0x69, 0x6e, 0x74, 0x2d, 0x33, 0x31, 0x2d, 0x4f, 0x66, 0x66, 0x69, 0x63, 0x65, 0x6a, 0x65, 0x74, 0x20, 0x36, 0x36, 0x30, 0x30];
      readonly u Frequency = 2412;
      readonly s HwAddress = 'xx:xx:xx:xx:xx:xx';
      readonly u Mode = 2;
      readonly u MaxBitrate = 54000;
      readonly y Strength = 0x31;
      readonly i LastSeen = 2602413;
  };
};

That looks interesting. The SSID, which is a byte array, is a property. The way to read a property is with the Get method on the org.freedesktop.DBus.Properties interface, so you have to call that. Note it takes two arguments, both strings: one is the interface name, the other the property name, so you have to specify those. Querying AP number 2714 gives:

 $dbus-send --print-reply --system --dest=org.freedesktop.NetworkManager /org/freedesktop/NetworkManager/AccessPoint/2714 org.freedesktop.DBus.Properties.Get string:org.freedesktop.NetworkManager.AccessPoint string:Ssid
method return time=1632076552.174223 sender=:1.12 -> destination=:1.3082 serial=431816 reply_serial=2
   variant       array of bytes "HP-Print-31-Officejet 6600"

Oh look, one of my neighbours has one of the worst printers ever made. I had one of those printers.

Python provides a reasonable library for doing such things. Here’s some code which iterates over all available network interfaces and prints the SSID of whichever access points it finds:

import dbus
bus = dbus.SystemBus()

obj = bus.get_object('org.freedesktop.NetworkManager', '/org/freedesktop/NetworkManager')
network_manager = dbus.Interface(obj, 'org.freedesktop.NetworkManager')

#Iterate over all devices
for device in network_manager.GetDevices():

    #Get the wireless interface for each device
    obj = bus.get_object('org.freedesktop.NetworkManager', device)
    wlan =  dbus.Interface(obj, 'org.freedesktop.NetworkManager.Device.Wireless')
    
    #Note we don't get an error until we attempt to use the interface
    #I suspect there is a better way
    try:
        for ap_path in  wlan.GetAccessPoints():
            # Read the SSID property
            obj =  bus.get_object('org.freedesktop.NetworkManager', ap_path)
            ap_props =   dbus.Interface(obj, 'org.freedesktop.DBus.Properties')
            ssid = ap_props.Get('org.freedesktop.NetworkManager.AccessPoint', 'Ssid')
            print(''.join([str(v) for v in ssid]))

    except dbus.exceptions.DBusException as e:
        pass

So that’s it for the basics. There’s a whole introspection API as well which I believe returns the structure as XML.
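
For what it’s worth, here’s a minimal sketch of calling it from Python using the same dbus module as above; Introspect() does indeed hand back the structure as an XML string:

import dbus

bus = dbus.SystemBus()
obj = bus.get_object('org.freedesktop.NetworkManager', '/org/freedesktop/NetworkManager')

#org.freedesktop.DBus.Introspectable.Introspect returns the object's interfaces,
#methods, signals and child nodes as an XML document
print(dbus.Interface(obj, 'org.freedesktop.DBus.Introspectable').Introspect())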

But to call methods, introspection isn’t needed.

Writing a service

Writing a service is pretty easy in Python. Here’s an example where dbus doesn’t steal the entire main loop. Note that this runs on the session bus, not the system one, because of security.

from gi.repository import GLib
import dbus
import dbus.service
import dbus.mainloop.glib
import time

class Service(dbus.service.Object):
   def __init__(self):
      #Register dbus with GLib's main loop
      dbus.mainloop.glib.DBusGMainLoop(set_as_default=True)
      
      #Bus name 
      bus_name = dbus.service.BusName("com.hello.helloworld", dbus.SessionBus())

      #Object to export
      dbus.service.Object.__init__(self, bus_name, "/")

      self._count=0

   #Register two methods to the one interface
   @dbus.service.method("com.hello.helloworld.Message", in_signature='', out_signature='s')
   def get_message(self):
      self._count+=1
      return "Hello, world " + str(self._count)
    
   @dbus.service.method("com.hello.helloworld.Message", in_signature='i')
   def set_counter(self, i):
      self._count = i
    
   #A way of polling the main loop so that GLib doesn't 
   #steal the program's main loop
   def poll(self):
      loop = GLib.MainLoop()
      def quit():
          loop.quit()
      GLib.idle_add(quit)
      loop.run()

#Instantiate the service
service = Service()

#Poll it
while True:
    service.poll()
    time.sleep(.1)

And it works:

~ $qdbus --session com.hello.helloworld / get_message
Hello, world 1
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 2
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 3
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 4
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 5
~ $qdbus --session com.hello.helloworld / set_counter 0

~ $qdbus --session com.hello.helloworld / get_message
Hello, world 1
~ $qdbus --session com.hello.helloworld / get_message
Hello, world 2

qdbus is often a lot less verbose to use!

Writing a system service

If you try to run this on the system bus, it will fail, even if you run it as root. This is because of the security policy in place. The policies exist to allow non-root daemons to run as system services, and no one special-cased root to be always allowed, which is fine.

The policies are in /etc/dbus-1/system.d/ and they’re sort of understandable, but only sort of. It is documented: essentially it’s default deny with allow/deny rules applied top to bottom based on a matching scheme. In order to run the service as root, the following file works:

<!DOCTYPE busconfig PUBLIC
 "-//freedesktop//DTD D-BUS Bus Configuration 1.0//EN"
 "http://www.freedesktop.org/standards/dbus/1.0/busconfig.dtd">
<busconfig>

  <!-- Only root can own the service -->
  <policy user="root">
    <allow own="com.hello.helloworld"/>
  </policy>

  <policy context="default">
    <allow send_destination="com.hello.helloworld"/>
  </policy>
</busconfig>

Essentially this allows only root to own the service, but allows anyone (the default context) to send messages to it. Once the file is created, you then need to reload DBus, which can be done with SIGHUP or with systemctl reload dbus on systemd-based systems.

Starting a system service with systemd

I could do it manually, but perhaps I should bend to the winds of change.

This is simple, for this rather basic service. First, add #!/usr/bin/env python3 to the first line of the script and make it executable. Then copy it to /opt/helloservice/service.py. Then add a very simple unit file to /etc/systemd/system/hello.service:

[Unit]
Description=Hello world service

[Service]
ExecStart=/opt/helloservice/service.py

[Install]
WantedBy=multi-user.target

Is it me or is it weird that the old Windows 3 style .INI files have become a perverse sort of standard?

The future..

Anyway, now a simple:

sudo systemctl daemon-reload
sudo systemctl start hello

starts the service. And to enable it on boot:

sudo systemctl enable hello

which apparently symlinks it in a directory.

And that’s it

A new system service, running as root, controllable by a user. The purpose is to have some NeoPixels on a Raspberry Pi controlled as a user (the pixels can only be driven as root due to the need to access low-level hardware).

New paper: Y-Autoencoders: Disentangling latent representations via sequential encoding

Massimiliano Patacchiola, Patrick Fox-Roberts and myself published a new paper in Pattern Recognition Letters (PDF here).

This work presents a new way of training autoencoders to allow separation of style and content, which gives GAN-like performance with the ease of training of autoencoders.

Abstract

In the last few years there have been important advancements in generative models with the two dominant approaches being Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs). However, standard Autoencoders (AEs) and closely related structures have remained popular because they are easy to train and adapt to different tasks. An interesting question is if we can achieve state-of-the-art performance with AEs while retaining their good properties. We propose an answer to this question by introducing a new model called Y-Autoencoder (Y-AE). The structure and training procedure of a Y-AE enclose a representation into an implicit and an explicit part. The implicit part is similar to the output of an auto-encoder and the explicit part is strongly correlated with labels in the training set. The two parts are separated in the latent space by splitting the output of the encoder into two paths (forming a Y shape) before decoding and re-encoding. We then impose a number of losses, such as reconstruction loss, and a loss on dependence between the implicit and explicit parts. Additionally, the projection in the explicit manifold is monitored by a predictor, that is embedded in the encoder and trained end-to-end with no adversarial losses. We provide significant experimental results on various domains, such as separation of style and content, image-to-image translation, and inverse graphics.

Undefined behaviour and nasal demons, or, do not meddle in the affairs of optimizers

Discussions on undefined behaviour often degenerate into vague mumblings about “nasal demons”. This comes from a 1992 usenet post on comp.lang.c, where the poster “quotes” the C89 standard (this is oddly hard to find now) as:

“1.6 Definitions of Terms
In this standard, … “shall not” is to be interpreted as a prohibition
* Undefined behavior — behavior, upon use of a nonportable or erroneous program construct, … for which the standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to having demons fly out of your nose.”

John F. Woods

Essentially what he means is that the compiler is allowed to do anything, including things which seem completely nonsensical. Usually you feel you have a handle on what sort of things the compiler might do, but sometimes it can do some very unexpected things. Of course they make sense in the compiler’s internal logic and are not incorrect, but it can take a long time to figure out what line of reasoning led the compiler to its decision.

I have a nice example which was investigated and distilled down to a minimal example by my friend David McCabe who kindly allowed me to post it here (I’ve changed it slightly). The example involves accidental memory scribbling in the context of derived classes (so the vtable pointers get scribbled over) where the compiler can inline enough to “see” the scribbling and then reason about it.

Here is the code snippet (and Godbolt link to peruse):

#include <cstring>

void e();
void f();
void g();

class A {
public:
    virtual ~A() = default;
};

class B : public A {
public:
    virtual void x() {f();};
};


void r(A* a) {
    B c;
    std::memcpy(static_cast<void*>(a), static_cast<const void*>(&c), sizeof(A));
}

int main() {
    A a;
    r(&a);
    e();
    ((B*)&a)->x();
    g();
}

So, what’s going on, or what’s supposed to be going on? By “supposed”, I mean what one would expect a compiler with no optimizations to most likely do, given what I know about how C++ code translates to unoptimized assembly.

So, we have a base class with a vtable, A, and in main we create an instance a. Note that A and the derived class B happen to be the same size (the size of the vtable pointer). The function r is the memory scribbler: it creates an instance of B and memcpy()s it over the A. Intuitively, the instance a has been turned into an instance of B, because its vtable pointer has been changed to a B vtable pointer. At a low level this is really the only thing which causes these types to differ, so if you cast a pointer to a into a B pointer, it should now work as a B and you can make the virtual function call x on it.

And that’s what main does. The only purpose of e, f, and g are to place markers in the generated code so we can see what’s run.

Now, scribbling over vtables is wildly, ridiculously undefined, and nasal demons may fly (spoiler alert: they do). You also add some stubs for e, f and g in a separate source file (to prevent the compiler optimizing across them), like this:

#include <iostream>
void e(){ std::cerr << "e\n"; }
void f(){ std::cerr << "f\n"; }
void g(){ std::cerr << "g\n"; }

then compile and run it (first without optimization), and it prints:

e
f
g

which is what you might expect. But what about with optimizations? On my machine (gcc 7.5.0), if I compile the main file with -O3 and the stubs with no optimization, it prints out:

e
e
⋮
<repeats for a total of 37401 e's>
⋮
e
Segmentation fault (core dumped)

These are the nasal demons. How on earth can the optimizer have turned that code into a loop and a segfault? It turns out it makes sense, but only after a lot of investigation. But first I’m going to consider some other cases before getting on to GCC with -O3. The Godbolt link does nice colourisation, so you can see which program lines correspond to which assembly lines.

First, here’s what unoptimized GCC 7.5.0 does (note I’ve snipped a lot of the setup and teardown and so on):

        lea     rax, [rbp-24]
        mov     rdi, rax
        call    r(A*)
        call    e()
        lea     rax, [rbp-24]
        mov     rax, QWORD PTR [rax]
        add     rax, 16
        mov     rax, QWORD PTR [rax]
        lea     rdx, [rbp-24]
        mov     rdi, rdx
        call    rax
        call    g()
        lea     rax, [rbp-24]
        mov     rdi, rax
        call    A::~A() [complete object destructor]

I’ve highlighted the most relevant lines, which are, in order: a call to r, a call to e, an indirect call via the vtable, then a call to g. It then calls the destructor for A, not B. The destructor call is non-virtual because a is an instance, not a pointer, so it ignores the scribbled vtable and calls the “wrong” destructor.

I’m now going to look at what other compilers do first because GCC 5 and newer definitely do the most unexpected thing. First, the last pre-5 version on Godbolt, GCC 4.9.4 with -O3:

main:
        sub     rsp, 40
        mov     QWORD PTR [rsp+16], OFFSET FLAT:vtable for B+16
        mov     QWORD PTR [rsp], OFFSET FLAT:vtable for B+16
        call    e()
        mov     rax, QWORD PTR [rsp]
        mov     rdi, rsp
        call    [QWORD PTR [rax+16]]
        call    g()
        xor     eax, eax
        add     rsp, 40
        ret

I’ve stripped out everything except main(). Note it’s inlined both r() and the call to memcpy. On line 3, it’s creating the concrete b, and copying in the vtable, on the following line, it’s memcpying that over a, but it’s noticed it can take the data from the source. then it calls e, makes a virtual call (which we know will be f) and then calls g. It inlines and removes the trivial ~A(), tears down main() and leaves.

So this program behaves as “expected”: it would print out e, f, g like the unoptimized one. MSVC 19.14 with /Ox does the same:

main    PROC
$LN19:
        sub     rsp, 40                             ; 00000028H
        lea     rax, OFFSET FLAT:const B::`vftable'
        mov     QWORD PTR a$[rsp], rax
        call    void e(void)                         ; e
        mov     rax, QWORD PTR a$[rsp]
        lea     rcx, QWORD PTR a$[rsp]
        call    QWORD PTR [rax+8]
        call    void g(void)                         ; g
        xor     eax, eax
        add     rsp, 40                             ; 00000028H
        ret     0
main    ENDP

Using 19.28 and compiling for x86 rather than x64 adds an extra flourish: it doesn’t make an indirect call; instead it checks whether the object has a B vtable and, if so, makes a direct call. But that’s the same in essence, since it’s still deciding based on the vtable:

_main   PROC
        push    ecx
        mov     DWORD PTR _a$[esp+4], OFFSET const B::`vftable'
        call    void e(void)                         ; e
        mov     eax, DWORD PTR _a$[esp+4]
        mov     eax, DWORD PTR [eax+4]
        cmp     eax, OFFSET virtual void B::x(void)      ; B::x
        jne     SHORT $LN6@main
        call    void f(void)                         ; f
        call    void g(void)                         ; g
        xor     eax, eax
        pop     ecx
$LN6@main:
        lea     ecx, DWORD PTR _a$[esp+4]
        call    eax
        call    void g(void)                         ; g
        xor     eax, eax
        pop     ecx
        ret     0

Clang on the other hand will apply its powerful optimizer to this case:

main:                                   # @main
        push    rax
        call    e()
        call    f()
        call    g()
        xor     eax, eax
        pop     rcx
        ret

It’s likely using dataflow, so I am guessing it follows the provenance of the pointer, finds that it’s been copied from the B vtable, and then devirtualises. Note that if you put in destructors it will call ~A(), not ~B(), because, like GCC, it never makes a virtual call in the first place. So far, so sort-of sensible. You can nod approvingly at the power of clang’s optimizer and not feel pessimistic about VS2017 just punting on that and doing the “obvious” thing (except on x86, where it makes the same deduction but is much more conservative with its actions).

But what about GCC? This behaviour is present in all versions from 5 onwards, and I’m picking 7.5 (my machine’s version, though 10.2, the latest on Godbolt, does the same). With -O3 it does:

main:
        sub     rsp, 8
        call    e()
WAT

That’s it. The whole of main(). It calls e(), then nothing. No tear down, no return, nothing, it just falls off the end into whatever code happens to be lying there. This is the source of the nasal demons.

Undefined behaviour is not allowed, according to the standard. Therefore, according to GCC, it does not happen. The compiler goes one step further than clang: it notices that the line after e() makes a virtual call through the wrong vtable, which is undefined behaviour, and so it deduces that the line is never reached. The only way for that to happen is if e() never returns, therefore it marks the code after e() as unreachable and deletes it.

When the standard says “undefined”, the standard means it, and the compiler is allowed to reason backwards in time to make optimizations with the assumption that the program is valid. This is very unintuitive but entirely legal and part of a very powerful optimization pass.

This isn’t the compiler being, as some people feel, perversely pedantic just to mess with the unwary programmer.

It’s really handy: GCC can take some inlined code and figure out that a pointer is non-null based on how it’s used, and can then, say, travel back through the code with that knowledge and remove all the tests for nullness and the alternative branches based on such tests, making the inlined code both faster and more compact. So you write your code to be generic, and GCC gives you an extra-fast, extra-compact version when it finds a special use case. It’s exactly the sort of thing a person might do, but it’s automatic, and woe betide the person who violates the preconditions.
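
Here is a minimal sketch of that kind of reasoning (an illustration, not code from the example above): the dereference lets the compiler assume the pointer is non-null, so the later test is provably dead and can be deleted.

int first_element(const int* p)
{
    int x = *p;         // dereferencing p would be undefined if p were null...
    if (p == nullptr)   // ...so the compiler may assume p != nullptr here
        return -1;      //    and remove this branch entirely
    return x;
}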

So the last question is why this specific behaviour? Why does it print out many lines and then quit? Well, here’s an objdump of the relevant part of the resulting executable:

int main() {
 8d0:	48 83 ec 08          	sub    $0x8,%rsp
    A a;
    r(&a);
    e();
 8d4:	e8 51 01 00 00       	callq  a2a <e()>
 8d9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)

00000000000008e0 <_start>:
 8e0:	31 ed                	xor    %ebp,%ebp
 8e2:	49 89 d1             	mov    %rdx,%r9
 8e5:	5e                   	pop    %rsi
 8e6:	48 89 e2             	mov    %rsp,%rdx
 8e9:	48 83 e4 f0          	and    $0xfffffffffffffff0,%rsp
 8ed:	50                   	push   %rax
 8ee:	54                   	push   %rsp
 8ef:	4c 8d 05 8a 02 00 00 	lea    0x28a(%rip),%r8        # b80 <__libc_csu_fini>
 8f6:	48 8d 0d 13 02 00 00 	lea    0x213(%rip),%rcx        # b10 <__libc_csu_init>
 8fd:	48 8d 3d cc ff ff ff 	lea    -0x34(%rip),%rdi        # 8d0 <main>
 904:	ff 15 d6 16 20 00    	callq  *0x2016d6(%rip)        # 201fe0 <__libc_start_main@GLIBC_2.2.5>
 90a:	f4                   	hlt    
 90b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)

Look what happens to sit right after main(): it’s _start, the executable’s entry point, where execution begins when the program is loaded. So after falling off the end of main, execution runs into that and simply restarts the program from scratch. The program then makes a call to main, which never returns because it runs off the end… and you get the idea. Eventually the program exhausts the stack and segfaults. If you change the optimization settings of the file with the stubs in it, the behaviour changes, because a different piece of code gets randomly wandered into.

I thought this was fascinating: I know that undefined behaviour can in principle do some very strange things but it’s interesting to see an example where it really does. I would never have predicted infinite recursion as an outcome. So don’t mess with undefined behaviour: one day the compiler might really make demons fly from your nose.

Static binaries

So, in a previous post, I claimed static binaries “last forever”. The oldest binaries I have to hand are from 2006. These were either RedHat or Suse, but I don’t remember which. We were transitioning to Suse at around that time.

Anyway, they work. At least the Linux/x86 one does. I don’t have a handy PPC machine or SGI or Sun around to try the more exotic architectures. I mostly keep those binaries around to buff my nerd credentials. Anyway, the point is, the 14 year old binary works just fine.

So how far back can we go?

I do remember there was a “compatibility break” between libc5 and glibc back in the day, but that was with dynamically linked code. I remember though one could fix it by finding the loader and libc from an old system and simply copying them across.

Time to dig out an old static executable!

Hm, well, no idea where I’d find one. The easiest way is to make one. Sadly, because I’m not a complete tech hoarder I threw away my RedHat 5.2 CDs which came in a spiffy box set for something like £50 at the local bookshop, bundled with a bunch of books on CD. Best £50 I ever spent; it’s what got me into Linux.

See that? A complete computing environment IN ONE BOX!!!!!11

Fortunately good Linux vendors still make their ancient distributions available for historic interest and presumably maintenance of ancient systems. RedHat are among them and so for old time’s sake I’ll install RedHat 5.2.

The easiest way is to start with QEmu. First thing to do is to make a disk image. I think I installed it on my brand new lavishly huge 9G drive. I reckon 1G is OK, since it came on a CD:

qemu-img create deadrat-5.2.img 1G

You’ll then need the boot image (boot.img) and as I discovered after some trial and error, the supplemental floppy image (supp.img) from here:

ftp://archive.download.redhat.com/pub/redhat/linux/5.2/en/os/i386/images/

FTP for the win! I’m using FTP since it’s necessary for later. RedHat no longer offers the ISO images to download, so I can’t grab one and do a CD install. They do, however, still have the FTP server running (see above), so you can do an FTP install if you have network access. A little bit of trial and error found me this incantation of QEmu to get on the ‘net:

qemu-system-i386 -fda boot.img -hda deadrat-5.2.img  -netdev user,id=mynet0 -device ne2k_pci,netdev=mynet0 

This emulates an NE2000 PCI network card, a popular model with many almost compatible clones in the mid 90s. So…

OK now to try some very old school 1997 era C++ code (/tmp/prog.cc):

#include <iostream.h>
int main(){
    cout << "Hello, world.\n";
}

Don’t complain that it’s not standards compliant; there wasn’t a standard in 1997. Compile it, check it and run it:

[root@hax /tmp]# g++ --version
egcs-2.90.29 980515 (egcs-1.0.3 release)
[root@hax /tmp]# g++ -static prog.cc
[root@hax /tmp]# ldd a.out
        not a dynamic executable
[root@hax /tmp]# ./a.out
Hello, world.

OK seems to work. Now to get that file onto my machine. First power it off, then mount it. Mounting isn’t completely trivial since you first need to examine the partition table and pass in an offset to the loopback device before mounting:

$ fdisk -lu deadrat-5.2.img 
Disk deadrat-5.2.img: 1 GiB, 1073741824 bytes, 2097152 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x00000000

Device           Boot   Start     End Sectors   Size Id Type
deadrat-5.2.img1           63   34271   34209  16.7M 83 Linux
deadrat-5.2.img2        34272 2064383 2030112 991.3M  5 Extended
deadrat-5.2.img5        34335 1997855 1963521 958.8M 83 Linux
deadrat-5.2.img6      1997919 2064383   66465  32.5M 82 Linux swap / Solaris
$ sudo losetup -o $((512 * 34335)) /dev/loop99 deadrat-5.2.img 
$ sudo mount /dev/loop99 /mnt/

And running it…

$ cd /mnt/prog
$ ./a.out
Hello, world.

Woahhhhh. A 24 year old (well, simulated 24 year old) binary just runs on a current Linux machine. Actually, mostly the whole thing runs: I can chroot into it and just run stuff. It takes a little setup: modern Ubuntu seems to have blocked TCP access to the X server, so I have to bind mount /tmp to /mnt/tmp. But once that’s done, I can just run stuff! For example:

Huh, a filament bulb. That’s pretty retro.

Static executables just work. Dynamic ones are harder: they work, but not without a chroot, because while the kernel mostly maintains compatibility, libc does not.
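
The chroot setup mentioned above amounts to something like this (a sketch of the idea, not the exact commands I ran):

# Bind /tmp into the image so chrooted X clients can reach the X server's socket,
# then drop into the old userland.
sudo mount --bind /tmp /mnt/tmp
sudo chroot /mnt /bin/bash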

But anyway, point proven. Static executables just work out of the box 24 years later. I wonder if my docker images will work in a quarter of a century’s time…

Adventures in Docker: dockerizing an old build process

So I don’t really hold with this new-fangled “container” stuff.

Well, OK, that’s not really true. I don’t really hold with its massive overuse. A lot of the time it seems to be used because people don’t know how to distribute executables on Linux. Actually, the way you make distributable executables on Linux is the same as on other systems.

However, Linux also has an extra-easy way of compiling locally which isn’t great for distribution, leading to the “well, it works on my machine” problem and to people claiming that you can’t distribute binaries. So people tend to use a container to ship their entire machine. At least that’s better than using an entire VM, which was briefly in vogue.

A frequently better way is to statically link what you can, dynamically link what you can’t and ship the executable and any dynamic dependencies. Not only is this lightweight, it also works when people don’t have large container systems installed (most people), or where it won’t work (e.g. for a plugin).
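
For the curious, gathering those dynamic dependencies is only a couple of commands (a rough sketch; myprog and dist/ are placeholder names, and you would normally leave out libc and the dynamic loader):

mkdir -p dist && cp myprog dist/
ldd myprog | awk '/=> \// {print $3}' | grep -v -e 'libc\.so' -e 'ld-linux' | xargs -I{} cp {} dist/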

And that’s precisely what I do for the 3B system.

THE END

OK, that was a short post. What about building, though? Building is hard, and a different matter entirely. To build reliably, and to build a version you know works, you need your source code at a specific version, your dependencies at a specific version, your compiler at a specific version, and ideally a “clean” system with nothing that might disturb the build in some unexpected way.

And you want this on several platforms (e.g. 32 and 64 bit Linux and Windows).

In an ideal world you wouldn’t need these, but libraries have compatibility breaks, and even compilers do. Even if you try hard to write compliant code (bear in mind I wrote this code starting in 2009), there have been some formal breaks since then (auto_ptr leaving us), as well as bits I didn’t get 100% right which came back to bite many years later.

Now I could in principle maintain the code to keep it up to date with new versions of everything and so on. Maybe I should, but the project is essentially finished and in maintenance mode, and that takes time and testing.

Oh, and one of the main use cases is on clusters, which often run very stable versions of RedHat or some equivalent, so they tend to be years old; my maintained version would need to be buildable on ancient RedHat, which I don’t have to hand. Either way, Linux tends to be backwards compatible, not forwards, so your best bet is to build on an old system.

So I solved this problem years ago with this hideous script. Since you haven’t read it: it creates Ubuntu images (in various configurations) using debootstrap, sets them up with all the packages, compiles the code in all the different ways I want, and assembles a release.

It took ages to write, but it really took the pain out of making releases, since a release requires lots of configurations: 32 and 64 bit Windows and Linux static executables, plus JNI plugins for all of those (compiled with MinGW). It even has a caching mechanism: it builds a base system, then constructs a system with all the dependencies from that, then constructs a clean system for building the code from those.

It’s quite neat: if all you need to do is rebuild, it’s pretty quick, because it only needs to copy the image and build the code. And it still (mostly) works. I ran it for the first time in ages and, apart from some of the URLs going stale (the Ubuntu packages have moved for the now-historic 10.04, as have libpng and libtiff), it worked as well today as it did 10 years ago.

The downside is that it has to run as root, because it needs to run debootstrap and chroot, which at the time required root. That makes it hard to run on restricted systems (clusters), and it builds the entire thing every time, making it hard for people to modify the code. I could update it to use things like fakechroot, but this sort of thing is precisely what Docker does well.

Docker basically makes shipping and managing whole OS images easier, has a built-in and easy-to-use caching mechanism, and so on. The Dockerfile, if you care, looks like this:

FROM ubuntu@sha256:51523b5adbc67853e73d7e5faff234614942f9ff3872f329d2bb59478baf53db
LABEL description="Builder for 3B on an ancient system"

# Since 10.04 (lucid) is long out of support, the packages have moved
RUN echo 'deb http://old-releases.ubuntu.com/ubuntu/ lucid main restricted universe' > /etc/apt/sources.list

#Install all the packages needed
RUN apt-get update
RUN apt-get install -y --force-yes openjdk-6-jre-headless && \
	apt-get install -y --force-yes openjdk-6-jdk wget zip vim && \
	apt-get install -y --force-yes libjpeg-dev libpng-dev libtiff-dev && \
	apt-get install -y --force-yes build-essential g++ 

RUN mkdir -p /tmp/deps /usr/local/lib /usr/local/include

WORKDIR /tmp/deps

#Build lapack
#Note Docker automatically untars with ADD
ADD clapack.tgz   /tmp/deps
ADD clapack-make-linux.patch /tmp/deps
ADD clapack_mingw.patch /tmp/deps
WORKDIR /tmp/deps/CLAPACK-3.2.1
RUN cp make.inc.example make.inc && patch make.inc < ../clapack-make-linux.patch 
RUN patch -p1 < ../clapack_mingw.patch
RUN make -j8 blaslib && make -j8 f2clib && cd INSTALL && make ilaver.o slamch.o dlamch.o lsame.o && echo > second.c && cd .. && make -j8 lapacklib
RUN cp blas_LINUX.a /usr/local/lib/libblas.a && cp lapack_LINUX.a /usr/local/lib/liblapack.a && cp F2CLIBS/libf2c.a /usr/local/lib/libf2c.a

ADD TooN-2.0.tar.gz   /tmp/deps
WORKDIR /tmp/deps/TooN-2.0
RUN ./configure && make install

ADD gvars-3.0.tar.gz   /tmp/deps
WORKDIR /tmp/deps/gvars-3.0
RUN ./configure --without-head --without-lang && make -j8 && make install 

ADD libcvd-20121025.tar.gz   /tmp/deps
WORKDIR /tmp/deps/libcvd-20121025
RUN ./configure --disable-fast7 --disable-fast8 --disable-fast9 --disable-fast10 --disable-fast11 --disable-fast12 && make -j8 && make install

RUN mkdir -p /home/build
WORKDIR /home/build

It’s not very interesting. It gets an Ubuntu 10.04 base image, updates it to point at the historic archive, and then patches, builds and installs the dependencies. It’s more or less what the shell script did before, minus downloading the dependencies: it’s nearly 2021, not 2011, and I no longer care about 16M of binary blobs checked into a git repository that’s otherwise unchanging.

The result is a docker environment that’s all set up with everything needed to build the project. Building the docker environment is easy:

docker build -t edrosten/threeb-static-build-env:latest .

But the really neat bit is executing it. With all that guff out of the way from a user’s point of view, building is a slight modification of the usual ./configure && make. Basically share the current directory with docker as a mount (that’s what -v does), and run in the container:

docker run -v $PWD:/home/build edrosten/threeb-static-build-env ./configure
docker run -v $PWD:/home/build edrosten/threeb-static-build-env make -j 8

The funny thing about this is that it gives a very modern way of partying like it’s 1999, or 2009 at any rate. The environment is stock Ubuntu 10.04, with a C++98 compiler. Obsolete? Sure, but it works and will likely keep working for a long time yet. And static binaries last forever: I last touched the FAST binaries, probably in 2006 on some Red Hat machine, and they still work fine today.

I’m not likely to update the builder for the ImageJ plugin any time soon. No one except me builds that, since anyone who is in deep enough to modify the code likely wants to analyse large datasets, for which ImageJ isn’t the right tool.

New paper: Large Scale Photometric Bundle Adjustment

Myself and Olly Woodford published a new paper: Large Scale Photometric Bundle Adjustment, PDF here, at BMVC 2020.

This work presents a fully photometric formulation for bundle adjustment. Starting from a classical system (such as COLMAP), the system performs a structure and pose refinement, where the cost function is essentially the normalised correlation cost of patches reprojected into the source images.

Abstract


Direct methods have shown promise on visual odometry and SLAM, leading to greater accuracy and robustness over feature-based methods. However, offline 3-d reconstruction from internet images has not yet benefited from a joint, photometric optimization over dense geometry and camera parameters. Issues such as the lack of brightness constancy, and the sheer volume of data, make this a more challenging task. This work presents a framework for jointly optimizing millions of scene points and hundreds of camera poses and intrinsics, using a photometric cost that is invariant to local lighting changes. The improvement in metric reconstruction accuracy that it confers over feature-based bundle adjustment is demonstrated on the large-scale Tanks & Temples benchmark. We further demonstrate qualitative reconstruction improvements on an internet photo collection, with challenging diversity in lighting and camera intrinsics.

Make hacks: embedded version info, detecting non timestamp change and automatically generated dependencies

For the purposes of traceability you may wish to embed the version of a program into it. If you’re using full on CI, you very much shouldn’t need this. If your CI system doesn’t record such things, then you need to fix it now.

But if you’re hacking around locally, especially for research, it can be really useful to know where an executable came from, or more specifically where some results came from, because it’s easy to lose track of that. An easy way to do this is to embed the git hash of the repository into the executable, so it can be written out alongside the data.

Essentially, if git status --porcelain prints nothing, then the repository is entirely clean. With that in mind, here’s a short script which generates a C++ source file containing the git hash if the repository is clean, and prints a loud, red warning if it is not. Here is get_version.sh:

git_hash=`git rev-parse HEAD`

if [[ "$(git status --porcelain)" != "" ]]
then
	echo -e "\033[31mThere are uncommitted changes. This means that the build" 1>&2
	echo -e "Will not represent a traceable version.\033[0m" 1>&2
	time=`date +%s`
	version="${git_hash}-${time}"
else
	version="${git_hash}"
fi

cat <<FOO
namespace version{
	const char* version_string = "$version";
}
FOO

It’s easy enough to use from make:

.PHONY: FORCE
versioninfo.cc: FORCE
	bash get_version.sh > versioninfo.cc

and of course make your program depend on versioninfo.o. But it’s not very clean: this will regenerate, rebuild and re-link every single time. The key is to make versioninfo.cc depend on FORCE only when something has actually changed.

This script (get_version_target.sh) reruns get_version.sh and compares the output to the existing versioninfo.cc. If there’s a change, it prints the target (FORCE); otherwise it prints nothing.

if ! diff -q versioninfo.cc <( bash get_version.sh 2> /dev/null ) > /dev/null 2>&1
then
        echo FORCE
fi

You then need to plumb this into make using the $(shell) function:

version_target=$(shell bash get_version_target.sh)
.PHONY: FORCE 
versioninfo.cc: $(version_target)
	bash get_version.sh > versioninfo.cc

This will now only re-generate versioninfo.cc (and hence the .o and the final executables) if the git hash changes.

With the basics in place, you can make the version info as detailed as you like: for example, you could record any tags and branch names, and whether it’s an actual point release, and so on. The downside of the shell commands is that they run every time make is invoked, so you will want to keep them fast, otherwise incremental rebuilds will become annoyingly slow.
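
For example, here’s a sketch of recording the branch and nearest tag as well (this would slot into get_version.sh above; the exact format is whatever you find useful):

branch=`git rev-parse --abbrev-ref HEAD`
described=`git describe --tags --always 2> /dev/null`
version="${git_hash} (branch: ${branch}, describe: ${described})"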

Using this mechanism, make can be much more malleable than expected. This is an immensely powerful feature. But remember:

Realtime AR world transformations with occlusions

This is what me, my team and collaborators have been working on recently: world-transforming AR, specifically the floor.

You can see occlusions, such as the pillars occluding the floor effect, but we have more sophisticated occlusion handling too:

You can’t tell from this video, but the occlusion handling is dynamic, so if the postbox managed to move, the occlusions would stay up to date. And here’s a gallery of nice shots:

If you have Snapchat and want to try it for yourself, here are the snapcodes: