ehFeedback error

Hi Chris and Lindsey,

I suspect that this could be a hardware issue – with the Labjack, or the USB hub, or perhaps the Mac. If Chris can’t get the crash replicated, perhaps he should look to remote in.
We replaced one of our Macs, which we think eliminated one source of these issues on one rig, and we’re watching to see if we see this again on the second rig that was suspect.

Mark

Hi Mark,
We replaced the Mac and it didn’t help, but it could easily be the Labjack
or hub. I’ve got new labjacks arriving on Weds, and I think I have a spare
hub that I can try tomorrow.
Thanks,
Lindsey.

Quick update: I have now seen the ehFeedback error. MWServer didn’t subsequently crash, but it did seem to deadlock. I’m continuing to investigate.

Chris

Ok- good to know.
Now that you’ve at least partially replicated it, I’m leaning away from a
hardware issue (though one could cause increased susceptibility). But just to
follow up on our end- we eliminated the USB hub (just plugged the labjack
directly into the mac) and it still crashed.
Lindsey.

Hi Lindsey,

I think I found the problem:

In the method LabJackU6Device::stopDeviceIO, there are two calls to ljU6WriteDO that execute without holding ljU6DriverLock. This allows them to overlap with other commands sent to the LabJack device in update_lever, which runs on a separate thread associated with pollScheduleNode. Since stopDeviceIO is called at the end of every trial, an overlap eventually happens. This causes commands to the device to be interleaved, leading to unexpected responses and the associated errors.

The fix is simply to acquire the lock before calling ljU6WriteDO. Prior to making the fix, I found that the error consistently occurred 10-15 minutes after starting the experiment. Since making the fix, I’ve run the experiment continuously for up to 1.5 hours, and I haven’t seen any errors.
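To illustrate the idea, here is a minimal sketch of the locking pattern. It is not the actual plugin code: MockDevice, driverLock, writeDO, pollLoop, and runDemo are hypothetical stand-ins, and I'm assuming for the sake of the example that the lock behaves like a std::mutex. Each command is modeled as a request followed by a response read, and both threads hold the lock for the whole exchange so the two steps can't be interleaved with another command.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the device. A command is a request followed by a
// response read; the two steps must not be interleaved with another command.
struct MockDevice {
    int pending = -1;    // id of the request currently "on the wire"
    int mismatches = 0;  // responses that did not match their request

    void sendRequest(int id) { pending = id; }
    void readResponse(int id) {
        if (pending != id) ++mismatches;
        pending = -1;
    }
};

// Plays the role of ljU6DriverLock (its real type is an assumption here).
std::mutex driverLock;

// Plays the role of ljU6WriteDO: one complete request/response exchange.
// The caller must already hold driverLock.
void writeDO(MockDevice &dev, int id) {
    dev.sendRequest(id);
    std::this_thread::yield();  // give the other thread a chance to run
    dev.readResponse(id);
}

// Analogous to the polling thread that runs update_lever.
void pollLoop(MockDevice &dev, int iterations) {
    for (int i = 0; i < iterations; ++i) {
        std::lock_guard<std::mutex> guard(driverLock);
        writeDO(dev, 1);
    }
}

// Analogous to the fixed stopDeviceIO: take the lock, then issue both writes.
void stopDeviceIO(MockDevice &dev) {
    std::lock_guard<std::mutex> guard(driverLock);
    writeDO(dev, 2);
    writeDO(dev, 3);
}

// Runs the polling thread alongside repeated end-of-trial calls and returns
// the number of responses that did not match their request.
int runDemo(int trials) {
    MockDevice dev;
    std::thread poller(pollLoop, std::ref(dev), trials);
    for (int i = 0; i < trials; ++i) {
        stopDeviceIO(dev);
    }
    poller.join();
    return dev.mismatches;
}
```

Because every exchange happens under the lock, runDemo reports no mismatched responses. Removing the lock_guard from stopDeviceIO would recreate the kind of overlap described above, where one thread's request can be answered by a response meant for the other.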

I’ve attached another MWorks 0.10 build of the plugin that includes the fix. Can you try this out and see if it resolves the issue for you, too? If so, I’ll submit a pull request with the changes on GitHub.

Mark: The problematic code in Lindsey’s plugin isn’t present in yours, so clearly this isn’t the cause of the issues you’re seeing. It’s possible that there’s a similar problem elsewhere, or it could be the case that your issues are hardware-related, as you’ve suggested.

Chris

Hi Chris,
So, good news is that MWorks no longer crashes with the new Labjack plugin.
However, now there seems to be a problem with the communication with
Matlab. I’m still not sure what the exact issue is, but our plotting
function gets hung up on variables it should recognize. I’m fairly positive
that it’s related to the new plugin (after this happened on the first
computer I updated, I tested the old plugin on a different computer and
then installed the new plugin and tested again, and it no longer worked).
Maybe related- the python script that we use to load our variable set also
no longer works (this issue came up after the first plugin you sent us, but
I didn’t make the connection until now).
There are a few Matlab files that you would need to replicate, but I’m happy
to send if it’s helpful.
Thanks,
Lindsey.

Hi Lindsey,

I just realized I made a mistake with both of the previous plugin builds I sent you: I had changed the path to the file led.txt, and I built the plugin for you without changing it back to its original value. I’ve attached a new build of the plugin with the correct path restored.

Is it possible that the bad path was the source of your MATLAB and Python issues?

Chris

Attachment: LabJackU6Plugin.bundle.zip (125 KB)

Hi Chris,
Turns out the Matlab bug was on our side after all (the python bug is still
a mystery). But I got your new plug-in with the updated path and everything
looks great!
Thank you so much for your help.
Lindsey.

Hi Lindsey,

I’ve submitted a pull request with the bug fix. Also in that PR are changes that allow the plugin to natively support both Intel and Apple Silicon processors, as we’ve discussed elsewhere.

Cheers,
Chris

Hi Lindsey,

Following up on this comment:

“we have a resurgence of the ehFeedback error with the new nightly. We merged your pull request from July that should fix this- so maybe the nightly introduces some new issue?”

I just ran your experiment for over two hours, and I didn’t observe any errors. The fix I made was entirely in your plugin code, so I don’t see how changes in the nightly build could re-introduce the issue.

Given that Mark reported seeing similar errors, it’s possible that there’s another, independent issue at work. Are you seeing the errors consistently, or do you only get them occasionally? Is there a particular experiment, or a particular variable set for the experiment you shared previously, that seems to trigger the error?

Thanks,
Chris

Hi Chris,
Thanks for trying to test it. We haven’t tried to replicate it with that
experiment (we can try)- it’s been happening with a similar experiment that
requires triggers received from our microscope (and is therefore harder for
you to replicate). However, even with that experiment, unlike before it
doesn’t happen with doRobot = 1 (so also harder for you to replicate- and
strange). However, one thing is the same: it only happens when we have the
variable doFeedbackMotion = 1. This makes me think it’s the same problem,
but it doesn’t have to be. In any case, when these conditions are true, it
reliably crashes within 5-15 trials.

Here is the error from std out:

ehFeedback error : write failed
Error : getCalibrationInfo write failed
ehFeedback error : write failed
ehFeedback error : write failed

And here is the error from the console:

ERROR: Error reading DI, stopping IO and returning FALSE
00:19:13: ERROR: Error Calibrating LabJack U6.
00:19:13: ERROR: bug: ehFeedback error, see stdout
00:19:13: ERROR: bug: writing laser trigger state; device likely to be broken (state 0)
00:19:13: ERROR: bug: ehFeedback error, see stdout
00:19:13: ERROR: LJU6: error in readLeverDI()

Lindsey.

We tried but could not replicate the error with the experiment that you
have.
I have attached the experiment that is problematic. There are multiple
reasons why it will be difficult for you to use it (it requires two
different forms of TTL inputs), but perhaps the errors above give you clues
about where to look?
Lindsey.

Hi Lindsey,

“I have attached the experiment that is problematic.”

Sorry, I didn’t get the attachment. Can you try sending it again?

Thanks,
Chris

May have been a security issue. Trying again with a compressed file…

Attachment: Lego_unity_frames.mwel.zip (13.5 KB)

It came through this time. Thanks!

Chris

I’ll ask if anyone has seen this on my end.

Hi Chris,

Do you have any new thoughts on this error?

We’ve also been getting this (or a similar) error when using a different experiment. It only happens sporadically, and unlike the other experiment I haven’t figured out the conditions that cause it. This experiment only requires triggers sent from a pulse generator to run- so it will potentially be easier to replicate.

I’ve attached a screenshot of the error in the console window and the .xml that I was using.

Thanks,
Lindsey

Attachments: