Threading


In 2.5 on Win32 we made all calls into the Widcomm API come from one thread.  This was due to the note in the Widcomm documentation, as discussed at https://32feetnetdev.wordpress.com/2009/12/04/widcomm-single-thread-access/.  However, since then two people have noted in the forums that on Windows Mobile some operations failed when called from an async callback.  One noted that calling BeginDiscoverDevices from within a DiscoverDevices callback failed, and another noted that a Connect failed when called from there.

So what do we learn from this?  If a library doesn’t state its threading requirements then assume the worst.  This is unlike the managed code I’ve created, where everything is safe when called from multiple threads and from a callback.  I ensure that all invariants hold and that I am holding no locks when I raise a callback.  So nothing is forbidden, and no special main thread is required.

So, since then in the Widcomm support code on both platforms we’ve made all user callbacks occur on a thread-pool thread.  Hopefully that’ll solve all the threading issues once and for all.  On the downside it could make a small change in behaviour, for instance the callbacks could take a slightly longer period to start after an event, and if there were multiple reads or writes released by a stack event then they could now run in parallel.
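As a rough illustration of raising user callbacks on a thread-pool thread, here is a minimal Python sketch.  It is not the 32feet.NET code: the names `on_stack_event`, `user_callback` and the pool itself are hypothetical, and Python’s `ThreadPoolExecutor` stands in for the .NET thread pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical pool used only for user callbacks (assumption, not library code).
_callback_pool = ThreadPoolExecutor(max_workers=4, thread_name_prefix="user-callback")


def on_stack_event(callback, result):
    """Runs on the stack's own thread; hands the user callback to the pool.

    The stack thread returns immediately, so the user code never runs in the
    stack's callback context.
    """
    return _callback_pool.submit(callback, result)


def user_callback(result):
    # User code: it runs on a pool thread, so it may call back into the
    # library (e.g. start another discovery) without restriction.
    return (result, threading.current_thread().name)
```

Because each callback is queued separately, two callbacks released by the same stack event may run at the same time on different pool threads, which is the change in behaviour mentioned above.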


Following on from Widcomm single thread access, I’ve now moved all operations onto the ‘special thread’ and not just the connection-oriented ones.  So now StartInquiry and StartDiscovery, for instance, are run on that single thread.  There’s still some tidying-up to do at some time.

This has been followed by a big round of testing.  I repeated my testing from Widcomm faults in OBEX operations, where Win32 stopped sending after many megabytes.  I wasn’t able to reproduce this, so it seemed to have been fixed.  However I then tried repeating the test on the 2.4 version and couldn’t reproduce it either…  Until I switched user (XP’s fast-user switching) and then the data transfer stopped!  So it seems that there’s actually no problem with “buffer-empty” events stopping due to calls not being thread-affine, but instead that Widcomm keeps state in the current logon session and thus breaks if the session changes.  Yikes!  So be aware of this.  I wonder what happens when calling the API from an NT service…

Also the change doesn’t seem to have had any effect on the conflict with the Microsoft stack.  Oh well…

Following on from Widcomm Thread restrictions on Win32 and Widcomm BluetoothListener stops getting connections, I first tried putting all calls to the Widcomm CRfCommPort API onto a new thread; that fulfilled the first part of the Widcomm statement: “An additional implication of this guideline is that derived functions must not call back into the stack with another SDK API.”

However it did not fix the problem where the WidcommBluetoothListener stopped accepting connections after a while.  So it seems like the second part of the statement really does mean we must use only one thread: “Any SDK API that must be called as a result of a callback from the stack must be executed from the application main thread, not the callback execution context.”

So now I’ve implemented a scheme where we create a thread and force all API calls to be done from that thread only.  In class WidcommPortSingleThreader we have commands (PortWriteCommand, OpenServerCommand, etc.) which we add to a queue; the caller then waits for the single thread to take the command off the queue and action it.
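The queue-and-single-thread scheme can be sketched roughly as follows.  This is a Python illustration under stated assumptions, not the WidcommPortSingleThreader source: the class `SingleThreader`, its `invoke` and `close` methods, and the stand-in `port_write` function are all hypothetical names.

```python
import queue
import threading
from concurrent.futures import Future


class SingleThreader:
    """Runs every submitted command on one dedicated thread (hypothetical sketch)."""

    def __init__(self):
        self._commands = queue.Queue()
        self._thread = threading.Thread(
            target=self._run, name="widcomm-single", daemon=True)
        self._thread.start()

    def _run(self):
        # The only thread that ever executes a command.
        while True:
            func, args, future = self._commands.get()
            if func is None:  # shutdown sentinel queued by close()
                break
            try:
                future.set_result(func(*args))
            except Exception as exc:  # hand the failure back to the caller
                future.set_exception(exc)

    def invoke(self, func, *args):
        """Queue a command and block until the single thread has run it."""
        if threading.current_thread() is self._thread:
            # Already on the single thread; queueing and waiting would deadlock.
            return func(*args)
        future = Future()
        self._commands.put((func, args, future))
        return future.result()

    def close(self):
        self._commands.put((None, (), None))
        self._thread.join()


def port_write(data):
    # Stand-in for a stack API call such as CRfCommPort.Write (assumption).
    return (len(data), threading.current_thread().name)
```

A caller on any thread would then use something like `single.invoke(port_write, b"hello")`; the call blocks until the single thread has executed the command, and any exception raised by the command is re-raised in the caller.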

Some more testing is required, but this seems to have fixed this problem.  I was able to create one thousand connections one after the other without problem.  That’s many more than it could manage previously, so it looks good.

Hopefully this also fixes the problem where send operations were getting stuck on Win32 (Widcomm faults in OBEX operations), since one of the operations now done on the single thread is CRfCommPort.Write.  When I get a moment I’ll test that too.

It’s been reported in the forums that BluetoothListener eventually stops accepting new connections.  I’ve reproduced this on Windows XP, where new connections stop being accepted even though the client device reports that it is still managing to form new connections.  I’ve still to go and test this on Windows Mobile.  If it’s Win32 only, then maybe it’s again related to Widcomm Thread restrictions on Win32.  A workaround according to the reporter is to stop the BluetoothListener and create a new one.

I’ve added a new method to the ConsoleMenuTesting/DeviceMenuTesting test apps to test these scenarios.  Use “ListenAcceptMultiple” and “ConnectMultipleTimes” on the two sides.

The fault again seems to be that Widcomm stops reporting the respective event(s)…

UPDATE: Testing on WM (on my iPAQ) shows that this problem doesn’t occur there.  It just keeps on accepting connections.  So the problem is Win32 only!

Widcomm/Broadcom produces different SDKs for their different platforms, but the PDF documentation files in the various SDKs look very similar, so I printed the Windows CE/WM version and worked from that.  Recently however I was reading from the Win32 version and found this statement:

“An additional implication of this guideline is that derived functions must not call back into the stack with another SDK API.  Any SDK API that must be called as a result of a callback from the stack must be executed from the application main thread, not the callback execution context.”

So in a Widcomm-callback handler we shouldn’t call back into the stack. 😦  In the current code we do that: for instance, when we have pending data to send and the stack sends us a ‘finished sending’ event, we call Write from there.  Maybe it’s this that causes the “gets stuck sending” problem.

Anyway I’ve got some work in progress to fix this, being tracked by bug 25410.

I intend just to:

  1. Mark threads with a flag noting whether they’re from a Widcomm callback or not.
  2. Any time I make a call back into the stack, check that flag, and if set force the call onto another thread, e.g. via a Delegate.BeginInvoke/EndInvoke pair.

Other much more complex mechanisms are available, e.g. one I tried earlier was to action all Widcomm-callbacks on a new thread, but that was very slow.

Note however that the documentation says “from the application main thread”.  As a library, we can’t know the application’s main thread.  We could though create a single thread and pass all calls to the Widcomm API via it, but is that really required??  It would make things much more complex…