Question about mwk2 event structure

Hi Chris,

When I extracted events from .mwk files with different event codes, the extracted events were sorted in the order of the events’ occurrence. However, with .mwk2, I noticed that the data is sorted according to the type of the event and not chronologically. Is this the only way of extracting data now or can I specify some option to get them in chronological order?
I can always re-order them according to the time stamps in Matlab, but I wanted to get more info from you.

Thanks in advance,

Cheers,
Beshoy

Hi Beshoy,

Yes, MWK2 files are sorted first by event code, then by event time. This makes extracting events by code tremendously faster than with the old MWK format. (Filtering by time is usually faster, too, but to a smaller degree.)

At present, there’s no way to request that getEvents (or MWKFile in Python) return the events for multiple event codes sorted first by time. I could add support for this pretty easily. (Under the hood, MWK2 files are SQLite databases, so it’d just be a matter of adding an optional ORDER BY clause to the selection query.) However, this would result in longer read times, since you’d be jumping around through the event file (which is naturally sorted by code first).
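
As a rough illustration of the ORDER BY idea, here is a sketch using Python's built-in sqlite3 module on an in-memory database. The table name `events`, its columns (`code`, `time`, `data`), and the helper `select_events` are assumptions made for this example; they are not a description of the actual MWK2 schema or of any MWorks function.

```python
import sqlite3

# Hypothetical layout: one row per event (the real MWK2 schema may differ).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (code INTEGER, time INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [(1, 100, "a"), (1, 300, "b"), (2, 50, "c"), (2, 200, "d")],
)

def select_events(conn, codes, sort_by_time=False):
    """Return (code, time, data) rows for the given event codes."""
    placeholders = ", ".join("?" for _ in codes)
    # The "native" order is spelled out here so the result is deterministic;
    # the optional time-first ordering is the only thing that changes.
    order = "time, code" if sort_by_time else "code, time"
    query = (
        "SELECT code, time, data FROM events "
        f"WHERE code IN ({placeholders}) ORDER BY {order}"
    )
    return conn.execute(query, list(codes)).fetchall()

native = select_events(conn, [1, 2])                      # by code, then time
by_time = select_events(conn, [1, 2], sort_by_time=True)  # by time first
```

With the sample rows above, `native` groups the two events for code 1 ahead of those for code 2, while `by_time` interleaves them by time stamp.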

I’ll do some tests to assess exactly how much slower sorted-by-time reads take. If performance isn’t drastically worse, I’ll add that as an option to MWorks’ event-reading tools.

Cheers,
Chris

Hi Chris,

Thanks for the quick reply. Indeed, it’s much faster than before. It shouldn’t be a problem really; I was just curious if I was missing anything. I assume it would be easier to do it after extraction.

Cheers,
Beshoy

Hi Beshoy,

I’ll do some tests to assess exactly how much slower sorted-by-time reads take. If performance isn’t drastically worse, I’ll add that as an option to MWorks’ event-reading tools.

I just did a quick test of reading MWK2 files with and without sorting by time. I used an 842 MB data file containing 38,895,014 events. Without sorting by time, reading all events in a Python script took 40.4 seconds. With sorting by time, reading all events took 73.8 seconds. So that’s an ~83% increase in read time.

Given this result, my inclination is to keep things as they are (so events are returned in “native” sort order, i.e. by code then time), and leave it up to users to sort by time when desired. Thoughts?
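
For anyone who wants to sort after extraction, here is a minimal sketch in Python. `Event` is a stand-in for whatever event objects your reader returns; it assumes each event exposes `code`, `time`, and `data` attributes. Because the events for each code are already in time order, a k-way merge of the per-code runs is enough, though a plain `sorted(events, key=...)` gives the same result.

```python
from collections import namedtuple
from heapq import merge
from itertools import groupby
from operator import attrgetter

# Stand-in for extracted events (assumed attributes: code, time, data).
Event = namedtuple("Event", ["code", "time", "data"])

def sort_by_time(events):
    """Turn events ordered by (code, time) into one time-ordered list."""
    # Split the input into one run per event code; each run is already
    # sorted by time, so the runs only need to be merged.
    runs = [list(group) for _, group in groupby(events, key=attrgetter("code"))]
    return list(merge(*runs, key=attrgetter("time")))

events = [
    Event(1, 100, "a"), Event(1, 300, "b"),
    Event(2, 50, "c"), Event(2, 200, "d"),
]
ordered = sort_by_time(events)
```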

Chris

Hi Chris,

Thanks for checking. I would say leave it as it is, since extracting some data can already take a very long time (eye tracking, for example, since it’s sampled at 1 kHz).

To facilitate trial by trial extraction, maybe add a quick guide on how to use system variables for this very task. For example, extract events trial by trial by using the time stamps from #anouncetrials (or have a separate variable as I did).

Also, I have a question: is it possible to get the information regarding the running protocols (like name or a value for each protocol?). A number assigned to each protocol, coupled with the use of #anouncetrials, would make trial extraction much easier. Of course, this can be done by the experiment programmer using custom variables, but I am wondering if I can get this info from MWorks system variables as well.

Thanks again for checking.

Cheers,
Beshoy

Hi Beshoy,

To facilitate trial by trial extraction, maybe add a quick guide on how to use system variables for this very task. For example, extract events trial by trial by using the time stamps from #anouncetrials (or have a separate variable as I did).

I’ve been meaning to expand the data analysis section of the user manual with some more complicated examples. This will be a good one to include.
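
In the meantime, here is a rough sketch of the idea in Python: take the time stamps of the trial-marker events and use them to bin the events of interest. `Event`, the event codes, and `split_into_trials` are all hypothetical and only for illustration; the sketch assumes each event exposes `code`, `time`, and `data` attributes.

```python
from bisect import bisect_right
from collections import namedtuple

# Stand-in for extracted events (assumed attributes: code, time, data).
Event = namedtuple("Event", ["code", "time", "data"])

def split_into_trials(trial_starts, events):
    """Group events by the most recent trial start at or before each event."""
    starts = sorted(trial_starts)
    trials = [[] for _ in starts]
    for event in events:
        index = bisect_right(starts, event.time) - 1
        if index >= 0:  # events before the first trial start are skipped
            trials[index].append(event)
    return trials

TRIAL_CODE, EYE_CODE = 10, 20  # hypothetical event codes
events = [
    Event(TRIAL_CODE, 1000, 1), Event(TRIAL_CODE, 2000, 2),
    Event(EYE_CODE, 900, 0.0), Event(EYE_CODE, 1100, 0.1),
    Event(EYE_CODE, 1900, 0.2), Event(EYE_CODE, 2500, 0.3),
]
trial_starts = [e.time for e in events if e.code == TRIAL_CODE]
eye_events = [e for e in events if e.code == EYE_CODE]
trials = split_into_trials(trial_starts, eye_events)
```

Since each event is binned independently with a binary search, this works whether the events arrive sorted by code or by time.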

is it possible to get the information regarding the running protocols (like name or a value for each protocol?)

When you start an experiment, MWorks announces the protocol name in a message, e.g.

00:05:23:  Setting protocol to Hello World
00:05:23:  Running MWorks 0.9.dev (2019.05.02)

If you find the #announceMessage event containing that message, you can use its time stamp to mark the start of the protocol.
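
A minimal sketch of that search in Python follows. It assumes the message text is available under a `"message"` key in each event's data (or that the data is the text itself); that layout, the `Event` stand-in, and the event code used below are assumptions for illustration, so check what your own files contain.

```python
from collections import namedtuple

# Stand-in for extracted events (assumed attributes: code, time, data).
Event = namedtuple("Event", ["code", "time", "data"])

PREFIX = "Setting protocol to "

def find_protocol_starts(message_events):
    """Return (time, protocol_name) for each protocol announcement found."""
    starts = []
    for event in message_events:
        data = event.data
        # Assumed layout: a dict with a "message" entry, else plain text.
        text = data.get("message", "") if isinstance(data, dict) else str(data)
        position = text.find(PREFIX)
        if position >= 0:
            starts.append((event.time, text[position + len(PREFIX):]))
    return starts

message_events = [
    Event(7, 5000, {"message": "Setting protocol to Hello World"}),
    Event(7, 5001, {"message": "Running MWorks 0.9.dev (2019.05.02)"}),
]
protocol_starts = find_protocol_starts(message_events)
```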

Alternatively, you could get the component code for the protocol out of the component codec, and then look for #announceCurrentState events that contain that code.

At present, there’s no better way to identify the active protocol. I suppose we could add a new system variable (e.g. #announceProtocol) that would contain the active protocol’s name or code. (I was already planning to add a #announceTaskSystemState variable to report the current task system state.) Does that sound like it would meet your needs?

Chris