Non-unique task system state names lead to control-flow errors

The attached experiment defines a single protocol composed of two task systems, each of which contains three states. The task systems have the same name (“Task System”), and in both the three states are named “First State”, “Second State”, and “Third State”. Each of the states includes a “report” action that announces the task system number (1 or 2) and state number (1, 2, or 3). When the experiment is run, the report actions produce the following output:

TS 1, TSS 1
TS 2, TSS 2
TS 2, TSS 3
TS 2, TSS 1
TS 2, TSS 2
TS 2, TSS 3

Note that state 1 in task system 1 is followed by state 2 in task system 2. This happens because the two “Second State” states have the same full name (“New Protocol/Task System/Second State”). Since the state in TS 2 is created after the one in TS 1, it overwrites the first one in the ComponentRegistry.

Najib and I ran into this problem in lab the other day. Initially we thought it was a range replicator issue, but my attached example does not include any replicators. Najib says that this problem has come up (and been fixed) in the past. Looking at the ComponentRegistry source, I see that there used to be code that prevented experiments from using non-unique component names. However, this commit disabled it, with a note mentioning that it caused problems with replicated stimuli. Maybe Dave can shed some light on that.

One way or another, we need to prevent this from happening.

Attachment: name_confusion.xml (3.44 KB)

Yes, this needs to get fixed.

I think the fix that Najib is remembering was actually when we moved to registering paradigm components by their full paths (e.g. “/New Protocol/Task System/Second State”), rather than just their tag name. There’s unit test coverage for the problem that initially prompted this (e.g. “TwoProtocolsWithTaskSystemsWithTheSameNameRandomWithoutReplacementSelection.xml”). These still pass, as far as I know.

It’s never been okay to have two components with the same “full path” name, though there is, of course, little way for the user to know that without documentation or active warnings. I tried putting a prohibition against this in the parser (if I recall, this was pre-emptive, rather than reactive to a particular problem someone was having), but then that actively broke someone’s protocol, so we backed off quickly. I don’t remember the details, beyond what was in the note (i.e. something about stimuli in replicators doing something wrong). Guess I never got back to putting that check back in.

I’ll take a quick look into this. We should make sure that we leave unit test coverage behind this time, one way or another.

– Dave

FYI, there’s a new branch error_on_duplicate_components in mw_core that enables errors on invalid duplicates. Your test protocol now raises errors, and stimulus replicator protocols no longer false alarm. There is still work to be done, but please feel free to check it out and test / comment.

The strategy is a little bit different from before: attempts to register more than one component to the same name don’t immediately cause an error, but rather accumulate information into an AmbiguousComponentReference object. Attempting to access an ambiguous name results in an error, and provides the opportunity to display accumulated information about all of the objects that resulted in the name collision. The infrastructure for this approach was already present in the code, but hadn’t been pressed into service yet.

In cases where the name collision is irrelevant (which is currently the case for the replicated stimulus objects), there is no error, because the protocol never attempts to access these objects by their ambiguous names. With some care, it should be possible to remove these “irrelevant” duplications in at least this one particular case, though the parser is sufficiently complex that I didn’t want to enter into these modifications lightly. The access-driven approach avoids making an issue out of such cases, while providing the opportunity to give additional debug information on error, so it seemed like a reasonable place to start for now.

On another note, we will probably want the editor to highlight an error if this rule is broken at edit-time. I think that this rule is expressible as a schematron rule, so that could be a good place to insert it, assuming online error checking is eventually re-enabled in the editor.

The editor already actively works to prevent name collisions on new object creation and copy/paste operations. Did Najib’s situation arise from editing in text editor, or did he manually edit the names to be the same? Or is there some other loophole in the editor that allowed this to happen (if so, we should close it)? Obviously, while not a substitute for parser error, the closer to creation-time that we can highlight (or avert) errors like these, the better.

Finally, if we choose to, we could at some point safely relax the no duplicates rule specifically for task systems. Paradigm components that maintain randomization state (e.g. Trials, Blocks, etc.) need to have unique names at least so that one can refer to them to manipulate their randomization (e.g. rejectSelections → trial). Task systems don’t have anything like this, so one could make these object “anonymous”. The current choice was made for the sake of consistency, though I think an argument could also be made for enabling as much syntax to work as is cleanly interpretable.

Thanks for looking into this, Dave.

There’s unit test coverage for the problem that initially prompted this (e.g. “TwoProtocolsWithTaskSystemsWithTheSameNameRandomWithoutReplacementSelection.xml”). These still pass, as far as I know.

Yep, that test and its friends still run (and pass) nightly.

FYI, there’s a new branch error_on_duplicate_components in mw_core that enables errors on invalid duplicates.

Looks good. I’ll check it out and do some testing.

On another note, we will probably want the editor to highlight an error if this rule is broken at edit-time.

Agreed. I opened a ticket for this.

Chris

Hey Dave,

I ran all the tests against the error_on_duplicate_components branch, and everything looks good. Is it OK to merge it into master?

Chris

Hey,

I’m in the middle of tracking down a problem with an old protocol that I have that doesn’t appear to parse correctly under this branch. I’ll let you know when I get to the bottom of it.

– Dave


David Cox, Ph.D.
The Rowland Institute at Harvard
cox@rowland.harvard.edu
http://www.rowland.harvard.edu/rjf/cox
office: 617-497-4682

Hey Chris,

I just pushed some more edits to the ‘error_on_duplicate_components’ branch. Basically all exceptions / error reporting stuff – trying to get comprehensible error messages out at the end as much as possible. More generally, there’s a lot of work to be done on the user-friendliness of error reporting. This should probably be a priority for 0.4.5.

Turns out the problem that I was encountering on this branch with an old experiment was due to an image appearing in more than one stimulus group, with the same (replicator-generated) tag name. This was the correct intent for the experiment, and it worked correctly, but this should still be flagged as an error since there are opportunities for confusion/errors here.

Feel free to merge the branch into master if everything looks / works / tests okay for you.

– Dave


David Cox, Ph.D.
The Rowland Institute at Harvard
cox@rowland.harvard.edu
http://www.rowland.harvard.edu/rjf/cox
office: 617-497-4682

Feel free to merge the branch into master if everything looks / works / tests okay for you.

Done. I’m marking this discussion resolved.

Chris