Occasional ENOSYS with l2cap connect() on the RPi 3.

I’ve been working a lot with Bluetooth, and my library has recently been giving me occasional errors. When calling connect() on an L2CAP socket (this is the only kind of socket used for Bluetooth Low Energy, and it is packet oriented with retries), I occasionally, and at random, get ENOSYS.

That means “Function not implemented” or “System call not implemented”. Very strange. Even stranger is that the error did not come from connect() itself, but from the later getsockopt(). This is an asynchronous call, so connect() fails with either EINPROGRESS (if all is well) or another error. Connection errors (e.g. ETIMEDOUT, ECONNREFUSED, or if you’re unlucky EIO or ENOMEM) are then collected when you come back later and pick them up with getsockopt().
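For readers who have not met this pattern, here is a minimal sketch of it using plain POSIX calls. The function name connect_and_collect_error and the use of poll() with a timeout are assumptions made for illustration, not code from the library. It works on any socket type; for L2CAP the address passed in would be a sockaddr_l2 from the BlueZ headers.

```cpp
#include <cerrno>
#include <fcntl.h>
#include <poll.h>
#include <sys/socket.h>

//Hypothetical helper: start a non-blocking connect on an existing socket and
//wait for the outcome. Returns 0 on success, otherwise an errno value.
int connect_and_collect_error(int fd, const sockaddr* addr, socklen_t len, int timeout_ms)
{
	//Put the socket into non-blocking mode so connect() returns at once.
	int flags = fcntl(fd, F_GETFL, 0);
	if(flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
		return errno;

	if(connect(fd, addr, len) == 0)
		return 0;           //Connected immediately.
	if(errno != EINPROGRESS)
		return errno;       //Failed before the attempt even started.

	//The socket becomes writable when the attempt finishes, well or badly.
	pollfd p{};
	p.fd = fd;
	p.events = POLLOUT;
	int n = poll(&p, 1, timeout_ms);
	if(n == 0)
		return ETIMEDOUT;   //Our own timeout, not one reported by the stack.
	if(n < 0)
		return errno;

	//The real result of the connection attempt is parked in SO_ERROR.
	int err = 0;
	socklen_t err_len = sizeof(err);
	if(getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &err_len) == -1)
		return errno;
	return err;
}
```

In the failing case described here, it is the final getsockopt() that hands back ENOSYS, long after connect() itself has reported EINPROGRESS.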

My first thought was that my program had bugs (of the pointer-related memory error kind) and I was somehow corrupting my system calls. strace revealed that my system calls were exactly as expected, and identical between the calls that worked and the calls that failed, so it (maybe) wasn’t my fault.

I then came across https://www.raspberrypi.org/forums/viewtopic.php?f=63&t=145144&p=962612, which seemed to imply that this was a problem on the RPi 3, and not entirely unique to me. I was beginning to really suspect that it wasn’t my fault and something else was causing it. I then started (unfairly as it transpires) cursing the kernel devs responsible in my head because that should never happen: either the syscall is implemented or it is not. There’s no dynamic behaviour there.

Anyway it’s clearly a problem with bluetooth, so it was time to break out btmon, a bluez tool which gets copies of all HCI packets, parses them and pretty prints them. And I was getting this:

> HCI Event: LE Meta Event (0x3e) plen 12                    [hci0] 461.313621
      LE Read Remote Used Features (0x04)
        Status: Connection Failed to be Established (0x3e)
        Handle: 64
        Features: 0x1f 0x00 0x00 0x00 0x00 0x00 0x00 0x00
          LE Encryption
          Connection Parameter Request Procedure
          Extended Reject Indication
          Slave-initiated Features Exchange
          LE Ping
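To make the dump above a little less opaque, here is a rough sketch of how the bytes of that event are laid out. The byte array is reconstructed by hand from the btmon output rather than captured raw, and the struct and function names are made up for illustration. Note that 0x3e appears twice with two unrelated meanings: once as the event code and once as the status.

```cpp
#include <cstddef>
#include <cstdint>

//Hypothetical holder for the decoded fields.
struct LeRemoteFeaturesEvent
{
	std::uint8_t event_code;  //0x3e means "LE Meta Event"
	std::uint8_t subevent;    //0x04 is LE Read Remote Used Features
	std::uint8_t status;      //0x00 is success, anything else is an error code
	std::uint16_t handle;     //Connection handle, little endian on the wire
	std::uint64_t features;   //LE feature bit mask, little endian
};

//Returns false if the buffer is not a well formed event of this kind.
bool parse_le_remote_features(const std::uint8_t* buf, std::size_t len, LeRemoteFeaturesEvent& out)
{
	//2 header bytes (event code, parameter length) plus 12 parameter bytes.
	if(len != 14 || buf[0] != 0x3e || buf[1] != 12 || buf[2] != 0x04)
		return false;
	out.event_code = buf[0];
	out.subevent = buf[2];
	out.status = buf[3];
	out.handle = static_cast<std::uint16_t>(buf[4] | (buf[5] << 8));
	out.features = 0;
	for(int i = 0; i < 8; i++)
		out.features |= static_cast<std::uint64_t>(buf[6 + i]) << (8 * i);
	return true;
}

//Bytes reconstructed from the dump, not a raw capture.
const std::uint8_t example_event[14] = {
	0x3e, 0x0c,                                     //Event code, plen 12
	0x04,                                           //Subevent
	0x3e,                                           //Status
	0x40, 0x00,                                     //Handle 64
	0x1f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00  //Features
};
```

Parsing example_event gives a status of 0x3e, a handle of 64 and a feature mask of 0x1f, which is the five low bits that btmon lists by name.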

It had me going for a while, because you’ll see that the error code (0x3e) is by irritating coincidence the same number as the code indicating it’s a BLE related event. To cut a rather long and winding story short, I eventually ended up digging into the kernel sources to find where bluetooth errors got translated into system call errors. And I found this:

http://lxr.free-electrons.com/source/net/bluetooth/lib.c#L45

The rather handily named “bt_to_errno()” function. Now, 0x3e was missing from the list. Checking with the Bluetooth 4 spec, we eventually find the list of error codes in table 1.1 in Volume 4, Part D, Section 1.1, and 0x3e corresponds to “Connection Failed to be Established”. There’s no real explanation, and this code seems to mean “something went wrong”.

As I mentioned, that code was missing from bt_to_errno(). I’m guessing that’s because it’s rare in the wild, and possibly no hardware until recently ever actually returned it. I’m generally in favour of the idea of never writing code to handle a condition you’ve never seen, since it’s awfully hard to test.

And flipping to the end you can see that if a code arrives and no code has been written to handle it, the function returns ENOSYS. And you know that’s kind of sensible. The list of errors is not very rich, and there isn’t really anything more suitable.
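The shape of that function is roughly the following. This is an illustrative stand-in, not the kernel source: the name sketch_bt_to_errno is made up, only three of the many entries are shown, and the two non-zero mappings are quoted from memory, so check lib.c for the real table.

```cpp
#include <cerrno>
#include <cstdint>

//Illustrative stand-in for the shape of bt_to_errno().
//Takes an HCI status code, returns a positive errno value (0 for success).
int sketch_bt_to_errno(std::uint8_t hci_status)
{
	switch(hci_status)
	{
		case 0x00: return 0;          //Success
		case 0x07: return ENOMEM;     //Memory Capacity Exceeded
		case 0x08: return ETIMEDOUT;  //Connection Timeout
		//...the real function has many more cases here...
		default:   return ENOSYS;     //Anything unlisted, such as 0x3e
	}
}
```

With a table like this, sketch_bt_to_errno(0x3e) falls through to the default branch, which is how a perfectly ordinary connection failure ends up reported as “Function not implemented”.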

Of course now we know this happens and seems to correspond to a sporadic error from the hardware, I think the correct choice is to return EAGAIN, which is more or less “try again: it might work, fail again or fail with a new error”. I’ll see if the kernel bluetooth people agree.
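Until the kernel maps the code to something better, user space can apply the same “try again” reading itself. A minimal sketch, assuming a caller-supplied attempt function that returns 0 or an errno value (the helper name and the choice of which codes count as retryable are assumptions for illustration):

```cpp
#include <cerrno>
#include <functional>

//Hypothetical helper: run attempt() up to max_attempts times.
//attempt() returns 0 on success or an errno value, as SO_ERROR would.
//Returns the result of the last attempt, or ENOSYS if max_attempts <= 0.
int connect_with_retries(const std::function<int()>& attempt, int max_attempts)
{
	int err = ENOSYS;
	for(int i = 0; i < max_attempts; i++)
	{
		err = attempt();
		//ENOSYS is what the unmapped 0x3e status currently turns into.
		//EAGAIN is included in case the mapping ever changes to that.
		if(err != ENOSYS && err != EAGAIN)
			break;
	}
	return err;
}
```

Any other error, such as a refused connection, is returned straight away rather than retried.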

Edit: they don’t: EAGAIN and EINPROGRESS are the same error code! Time to figure out a better code.


libble++: simple Bluetooth Low Energy on Linux.

Here’s the code on GitHub!

I’ve been working on a Bluetooth Low Energy library for Linux. The emphasis is on modern C++ design, and simplicity of use. I started before BlueZ supported BLE in a meaningful manner, so this library uses bits and bobs from BlueZ, some linked from libbluetooth, and some extracted from the source code.

I don’t make use of any of the GATT (that’s the high level interface, which is implemented over the ATT protocol which itself is sent over bluetooth sockets) code within BlueZ, which is all implemented over DBus. I just construct the packets (that’s the code I pulled out of BlueZ) and send them over a bluetooth socket myself.

With the brief introduction out of the way, how about some code :).

Here’s a complete program which scans for BLE devices:

#include <blepp/lescan.h>
#include <vector>

int main()
{
	BLEPP::log_level = BLEPP::LogLevels::Info;
	BLEPP::HCIScanner scanner;
	while (1) {
		std::vector<BLEPP::AdvertisingResponse> ads = scanner.get_advertisements();
	}
}

Is that about the shortest scanning example you’ve ever seen? 😎 I cheated very slightly by bumping up the log level. Otherwise it’d sit there scanning and generating no output whatsoever, which would not be all that much use.

I’ve also not commented this because I think it’s reasonably self-evident. There are only two active lines: creating a scanner object and getting the scan results. Bumping up the logging level means the scan results show up without having to print fields from the ads struct by hand. Not that that is hard, but since more or less everything in there is optional information, it’s mildly tedious.

With that little hook out of the way, the library follows a socket-and-function design, much like Xlib if you happen to know that. In other words, there’s a socket which all the functions use and which is created for you. If you have a simple interaction with BLE to write, you can just use the functions and never worry further. If you have a slightly more complex interaction, for example if you don’t want to block, then you can get the socket and use select(), poll(), epoll() or any other of your favourite system calls.

So as a result, there’s no framework. It’s just a library. It won’t steal your main loop or make it hard to integrate with any other I/O system you happen to have kicking around.

Interacting with a bluetooth device beyond scanning is a bit more complicated, because devices with interaction are a bit more complicated. It’s more or less the same model, except it uses callbacks (set via std::functions), because various bits of information can arrive asynchronously.

Here’s a complete example which shows how to connect to a temperature monitor and log the readings it gives. Most of it is comments, so you can see how little code and how little boilerplate there is (not much!):

#include <iostream>
#include <iomanip>
#include <functional>
#include <cstdlib>
#include <blepp/blestatemachine.h>
#include <blepp/float.h>  //BLE uses the obscure IEEE11073 decimal exponent floating point values
#include <unistd.h>
#include <chrono>
using namespace std;
using namespace chrono;
using namespace BLEPP;

int main(int argc, char **argv)
{
	if(argc != 2)
	{
		cerr << "Please supply address.\n";
		cerr << "Usage:\n";
		cerr << "prog <address>\n";
		exit(1);
	}

	log_level = Error;

	//This class does all of the GATT interactions for you.
	//It's a callback based interface, so you need to provide 
	//callbacks to make it do anything useful. Don't worry: this is
	//a library, not a "framework", so it won't steal your main loop.
	//In other examples, I show you how to get and use the file descriptor, so 
	//you will only make calls into BLEGATTStateMachine when there's data
	//to process.
	BLEGATTStateMachine gatt;

	//This function will be called when a push notification arrives from the device.
	//Not much sanity/error checking here, just for clarity.
	//Basically, extract the float and log it along with the time.
	std::function<void(const PDUNotificationOrIndication&)> notify_cb = [&](const PDUNotificationOrIndication& n)
	{
		auto ms_since_epoch = duration_cast<milliseconds>(system_clock::now().time_since_epoch());
		float temp = bluetooth_float_to_IEEE754(n.value().first+1);

		cout << setprecision(15) << ms_since_epoch.count()/1000. << " " << setprecision(5) << temp << endl;
	};
	
	//This is called when a complete scan of the device is done, giving
	//all services and characteristics. This one simply searches for the 
	//standardised "temperature" characteristic (aggressively cheating and not
	//bothering to check if the service is correct) and sets up the device to 
	//send us notifications.
	//
	//This will simply sit there happily connected in blissful ignorance if there's
	//no temperature characteristic.
	std::function<void()> found_services_and_characteristics_cb = [&gatt, &notify_cb](){
		for(auto& service: gatt.primary_services)
			for(auto& characteristic: service.characteristics)
				if(characteristic.uuid == UUID("2a1c"))
				{
					characteristic.cb_notify_or_indicate = notify_cb;
					characteristic.set_notify_and_indicate(true, false);
				}
	};
	
	//This is the simplest way of using a bluetooth device. If you call this 
	//helper function, it will put everything in place to do a complete scan for
	//services and characteristics when you connect. If you want to save a small amount
	//of time on a connect and avoid the complete scan (you are allowed to cache this 
	//information in certain cases), then you can provide your own callbacks.
	gatt.setup_standard_scan(found_services_and_characteristics_cb);

	//I think this one is reasonably clear?
	gatt.cb_disconnected = [](BLEGATTStateMachine::Disconnect d)
	{
		cerr << "Disconnect for reason " << BLEGATTStateMachine::get_disconnect_string(d) << endl;
		exit(1);
	};
	
	//This is how to use the blocking interface. It is very simple. You provide the main 
	//loop and just hammer on the state machine struct. 
	gatt.connect_blocking(argv[1]);
	for(;;)
		gatt.read_and_process_next();

}
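The one opaque step in that listing is the float conversion. As a sketch of what a function like bluetooth_float_to_IEEE754 has to do (this is not the library’s implementation, and the name decode_11073_float is made up), the 32 bit IEEE 11073 FLOAT is a signed 24 bit mantissa plus a signed 8 bit power-of-ten exponent, sent little endian. The +1 in the callback skips the flags byte which precedes the value in a temperature measurement. The reserved special values (NaN, infinities and so on) are deliberately not handled here.

```cpp
#include <cmath>
#include <cstdint>

//Decode a 32 bit IEEE 11073 FLOAT from 4 little endian bytes.
//Sketch only: reserved special values are not recognised.
double decode_11073_float(const std::uint8_t* p)
{
	//Low three bytes: signed 24 bit mantissa.
	std::int32_t mantissa = p[0] | (p[1] << 8) | (p[2] << 16);
	if(mantissa & 0x800000)
		mantissa -= 0x1000000;  //Sign extend from 24 bits.

	//Top byte: signed 8 bit power-of-ten exponent.
	int exponent = static_cast<std::int8_t>(p[3]);

	return mantissa * std::pow(10.0, exponent);
}
```

For example, the bytes 0x6c 0x01 0x00 0xff hold a mantissa of 364 and an exponent of -1, which is a reading of 36.4.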

OK, so there are some caveats.

Mostly, it’s not finished. I wrote the library to support the work I’m currently doing, and so I’ve not needed to do anything like everything that the BLE protocol supports. BLE also has a lot of features I doubt anyone’s seen in the wild. I don’t like implementing code I can’t use and test, so I’ve not got any interface to features I don’t currently use.

It’ll likely grow over time and get more features, but if there’s something you’d like in there, let me know. It might be easy for me to put in, or I can give pointers to help you write the necessary code.