IODevice ownership and destruction

We’re having a problem with our LabJack driver not letting go of the USB port when an experiment is unloaded after being run.

If you load the experiment, then don’t run it, then unload it, the destructor is called as expected (although the call stack reveals it is being called when an IODeviceVariableNotification is being destroyed, which is not what I would expect.)

If you load the experiment, then run it, then stop it, then unload the experiment, the destructor is not being called. As a result the cleanup is not happening and the USB port is not being released. If we then load the experiment, the load fails, because the driver can’t acquire the USB port.

It’s not clear why the destructor isn’t being called after a run, but I suspect a shared_ptr is involved, keeping it alive. Some of the shared_ptrs in the API, such as the one held by IODeviceVariableNotification, probably ought to be weak_ptrs because they don’t really ‘own’ the device, and ownership should be limited to the IODeviceManager.

Any suggestions on tracking down the ownership issue, or otherwise making sure the device is destroyed and cleaned up when we want it to be?

Thx,

Jon

Hi Jon,

I’ve reproduced the problem (using a FakeMonkey instance as the I/O device, so it’s not specific to your plugin). You’re probably right that there’s a shared_ptr hanging around somewhere. I’ll dig around some more and see if I can figure out what’s happening.

Thanks,
Chris

Hi Jon,

Quick question: Does your device class derive from IODevice or LegacyIODevice?

Thanks,
Chris

IODevice. It was written for 0.4.3.

using a FakeMonkey instance as the I/O device, so it’s not specific to your plugin

I take that back. Although I did get the same symptoms with both FakeMonkey and MSSWGamepad, it turns out that the plugins themselves were at fault.

Both plugins were creating ScheduleTask instances, which they held onto via shared_ptr’s but failed to release in stopDeviceIO. Since the ScheduleTask instances hold a shared_ptr to the IODevice, this left a reference cycle in place after the experiment was run and closed, meaning neither object was destroyed.

I resolved the problem by adding a bit of code like the following to stopDeviceIO:

if (schedule_node) {
    schedule_node->cancel();
    schedule_node.reset();  // Release the reference
}

Maybe your plugin needs a similar fix?

IODevice. It was written for 0.4.3.

OK. Although you shouldn’t need to, if you do end up subclassing LegacyIODevice, be sure that LegacyIODevice::stopAllScheduleNodes gets called in your stopDeviceIO. (One option is just to call LegacyIODevice::stopDeviceIO for your subclass’s stopDeviceIO.)

Chris

Thanks! I’ll check what we’re doing.

  • Jon

We have:

if (pollScheduleNode != NULL) {
boost::mutex::scoped_lock(pollScheduleNodeLock);
pollScheduleNode->cancel();
}

We have another scheduleNode that we use, for a non-repeating timed function invocation to turn off the reward. That one is only canceled in the destructor.

However, I’m seeing the problem even if I comment out the code blocks where the scheduleNodes are created and bound.

  • Jon

In addition to cancel(), you need to call reset():

pollScheduleNode->cancel(); 
pollScheduleNode.reset();

Otherwise, you keep a reference to the ScheduleTask instance, and it can’t be deallocated.

However, I’m seeing the problem even if I comment out the code blocks where the scheduleNodes are created and bound.

OK, but I still suspect the problem is in your code. If you send me a URL, I’ll take a look and see if I can spot any issues.

Chris

Hm. I’m not getting anywhere with this. And I noticed no use of reset() on shared_ptrs in other devices.

I’ve committed some changes.

The repository is:
git@github.com:jonhendry/MWorks.git

The branch is wip/labjack

The commit id is 937bc6a417533078d3c6

(Not sure if there’s a better way of getting a URL you can use from github.)

(Not sure if there’s a better way of getting a URL you can use from github.)

This works:

http://github.com/jonhendry/MWorks/tree/937bc6a417533078d3c6

And I noticed no use of reset() on shared_ptrs in other devices.

Normally, you don’t need it. But you do need it in the situation I described, where there’s a reference cycle between the IODevice and a ScheduleTask (because each holds a shared_ptr to the other).

In your code, the only place you would need it is in LabJackU6Device::stopDeviceIO, where you cancel pollScheduleNode (and probably should cancel pulseScheduleNode, too). I say “would” because you’ve presumably eliminated the reference cycle by passing weak_ptr’s to boost::bind when creating the schedule nodes.

As it is, I don’t see any obvious reference cycles or leaks in your code. However, I also don’t think the problem you’re experiencing applies to IO devices in general, since (after making the bug fixes I described previously) the destructors for FakeMonkey and MSSWGamepad are being called consistently. At least, that’s true with the latest HEAD code. If you’re still using your own fork of “0.4.3”, then there may be other issues that you have and I don’t.

As a workaround, you could make the USB port handle a static class member, so that it’s shared by all instances. That way, if one instance fails to release it, the next one can take over ownership. (This assumes that you only have one physical device.)

Chris

Thanks, Chris.

Perhaps the thing to do is try it on 0.4.4, and if that doesn’t help, go to the static member.

  • Jon

I believe this issue was resolved a while back. (See this dicussion and the accompanying bug fix). If you run into it again, please let us know.

Chris