Hey Mark,
[UPDATE: I just saw your response come in while I’ve been writing. I
will respond to it after this]
Thanks for bringing this up. Here’s my current thinking on the
subject. To shortcut ahead to the upshot: while I understand where
you’re coming from, I don’t think using an existing scripting language
(by itself) is a good idea, but I do think there is a path where we
can have our cake and eat it too, using a (very) simple domain-specific
language (DSL). I’m certainly open to outside opinions and arguments,
and obviously everything is open source, so folks are welcome to go in
a different direction if they don’t agree with me.
Here’s the longer discussion:
Some desirables:
1. something editable in any text editor, something that plays well
   with version control
2. something lightweight: an experiment should be expressible in very
   few characters with very little extra “line noise” junk
3. something that fits the multi-threaded, asynchronous nature of
   these experiments without exposing the user to threading issues
4. it would be nice if non-programmer users had some route whereby
   they can create always-guaranteed-to-be-valid experiments with a
   shallower learning curve
5. it should be possible to verify that an experiment won’t
   crash/raise prior to setting it in motion
6. as much as possible, all experiment state should be logged by design
XML arguably fails on #1 and decidedly fails on #2. The editor
enables #4, but yeah, the current editor irritates me too. We haven’t
had enough development bandwidth to do as good a job on it as I would
have liked.
An obvious alternative is to scrap the XML and replace it with a
scripting front-end. Personally, I’d rather eat my own face than do
this in MATLAB, so basically we’re talking about Python. For the
record, I come pretty close to zealotry when it comes to Python, so
bear that in mind in what follows.
Python is great, but it doesn’t excel at everything. A huge
blind-spot for Python is multi-threading (#3). While Python includes
threading libraries, CPython has a global interpreter lock that makes
it effectively only run on one processor/core at a time. You can’t
even have more than one interpreter live in the same process. MW
conceptually relies heavily on multithreading, so it would be hard to
imagine replacing much of the core functionality of MW with Python,
unless you’re content to effectively run on one core.
This isn’t a problem, per se, since it is easy to wrap C or C++ code
with Python bindings, and let the C/C++ do the threading behind the
scenes. I’m a big fan of Python as a wrapper language in this
capacity, which is why I started building out the infrastructure for
conduit and analysis bindings in Python.
So why not build experiments in Python using more extensive bindings
like this? This could certainly work, and I am in favor of us
building out first class Python bindings so that you can do this if
you want. I’m not super excited about it, though, for my own use,
because explicit “experiment building” semantics in my eyes fail on #2
(lightweight) above. Actually, in a lot of the realistic experiments
that I’ve mocked up in this kind of syntax, the result isn’t much less
verbose than the XML (even if it is somewhat easier to look at), and
the structure of the experiment isn’t terribly obvious at a casual
glance (it’s easy to come up with contrived simple experiments where
this isn’t evident, but that’s not terribly helpful). Plus, the door
to #4 (use of an editor for non-programmers) is completely closed.
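For illustration, an explicit experiment-building script might look
something like the sketch below. The `Node` class, its fields, and the
attribute names are all hypothetical, made up for this example; they are
not an existing MW binding. It builds the same small state that appears
in the DSL snippet later in this message, which gives a rough sense of
the verbosity involved.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical generic experiment component (not a real MW class)."""
    kind: str
    tag: str = ""
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def add(self, child):
        """Append a child and return it so nested building can continue."""
        self.children.append(child)
        return child


def build_experiment():
    """Build one protocol with one task system and one state, by hand."""
    experiment = Node("experiment", "My experiment")
    protocol = experiment.add(Node(
        "protocol", "Test protocol",
        {"randomization": "random_with_replacement", "draw": 4}))
    task_system = protocol.add(Node("task_system", "My task system"))
    state = task_system.add(Node("state", "Start state"))

    # Actions: every one needs its own explicit constructor call.
    state.add(Node("action", attrs={"type": "wait", "duration": "100ms"}))
    state.add(Node("action", attrs={"type": "report", "message": "s[1]"}))
    state.add(Node("action", attrs={
        "type": "assignment", "variable": "x", "value": "4 * (3 + y) * 2"}))

    # Transitions out of the state.
    state.add(Node("transition", attrs={
        "type": "timer_expired", "timer": "blah", "target": "State 2"}))
    state.add(Node("transition", attrs={
        "type": "conditional", "condition": "lick_sensor1 > 5",
        "target": "Initiated"}))
    return experiment
```

Even with a helper like `add`, the nesting of the experiment is carried
by variable names rather than by the layout of the text.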
There are more aggressive ways of using Python (e.g. putting the
entirety of MW under the control of a Python interpreter) that perhaps
mitigate the #2 argument a bit, but I would argue that these violate
(or potentially violate) #5 and 6 above, and make a potentially huge
mess on the multithreading (#3) front. It’s also not clear to me that
if you go too far down this route that it wouldn’t make more sense to
start with something like VisionEgg anyways (which would be fine, but
that’s a different discussion).
Here’s what I propose as an alternative:
I think a simple DSL to specify experiment structure is the way to go.
This isn’t as scary or as error prone as it might sound. I’ve
already constructed a simple DSL with a lightweight syntax (and a
working parser) that maps pretty obviously onto the original XML.
The parser is written in just a few hundred lines of Python (using
pyparsing), and it reduces the number of characters needed to specify
an experiment by around 60% relative to XML. It’s also quite a bit
less verbose than any realistic Python-to-build-up-the-experiment type
approaches that I’ve been able to come up with. Here’s a quick
example snippet:
    experiment["My experiment"]{
        protocol["Test protocol", randomization="random_with_replacement", draw = 4]{
            task_system["My task system"]{
                state["Start state"]{
                    # actions
                    wait(100ms)
                    python{
                        print("These can go anywhere an action can. But use sparingly")
                    }
                    report(s[1])
                    x = 4 * (3 + y) * 2
                } transition {
                    timer_expired(blah) -> "State 2"
                    (lick_sensor1 > 5) -> "Initiated"
                }
            }
        }
    }
As you can see, it’s pretty obvious what’s going on here, and there is
some basic syntactic sugar that makes creation of assignment actions,
etc. less cumbersome to specify. Just to be clear, the parser for all
of this already exists and works (and already has much better error
reporting than the current MW parser). There’s a switch to make it
work with “Python-like” significant whitespace syntax, if you prefer;
this is a detail. There are also some fancier features I’ve
implemented on top of this, like template expansion, which I think
could greatly enhance the maintainability of large experiments. I’m
planning to check in a copy of this prototype parser soon so that
others can take a look, kick the tires, and offer feedback.
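To give a rough feel for how little machinery a block-structured syntax
like this needs, here is a minimal sketch of a parser for a small subset
of it. This is not the prototype parser mentioned above (which uses
pyparsing); it is a standard-library-only stand-in, and the function
name, the dict layout, and the restriction to `kind["tag"]{ ... }`
blocks with one action per line are all assumptions made for the
example. Attributes, `transition` blocks and `python` blocks are left
out.

```python
import re

# Block header such as: state["Start state"]{   (attributes omitted in this sketch)
HEADER = re.compile(r'\s*(\w+)\s*\[\s*"([^"]*)"\s*\]\s*\{')
# Closing brace of the current block.
CLOSE = re.compile(r'\s*\}')
# Anything else, up to the end of the line, is kept as a raw action string.
ACTION = re.compile(r'\s*([^\s{}][^{}\n]*)')


def parse_block(text, pos=0):
    """Parse one block starting at pos; return (node, position after it)."""
    header = HEADER.match(text, pos)
    if header is None:
        line = text.count("\n", 0, pos) + 1
        raise SyntaxError(f"expected a block header near line {line}")
    node = {"kind": header.group(1), "tag": header.group(2),
            "actions": [], "children": []}
    pos = header.end()
    while True:
        close = CLOSE.match(text, pos)
        if close:
            return node, close.end()
        if HEADER.match(text, pos):
            # Nested block: recurse and continue after it.
            child, pos = parse_block(text, pos)
            node["children"].append(child)
            continue
        action = ACTION.match(text, pos)
        if action is None:
            line = text.count("\n", 0, pos) + 1
            raise SyntaxError(
                f"unclosed block \"{node['tag']}\" near line {line}")
        content = action.group(1).strip()
        if not content.startswith("#"):  # skip comment lines
            node["actions"].append(content)
        pos = action.end()


SOURCE = '''
experiment["My experiment"]{
    protocol["Test protocol"]{
        state["Start state"]{
            # actions
            wait(100ms)
            report(s[1])
        }
    }
}
'''

tree, _ = parse_block(SOURCE)
```

Here `tree` is a nested dict mirroring the block structure, and a
missing closing brace raises a `SyntaxError` with an approximate line
number rather than failing later at run time.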
I also propose that we include “interpreted code” actions (+ good
bindings) to serve as a stop-gap for anything we don’t anticipate
(that’s what the “python” block in the snippet above is). Python is
the first, obvious candidate here, but I’d also like to see at least
Ruby represented here as well. This lets you get stuff done quickly
if there’s a hole in what MW covers that hasn’t been patched yet, but
it doesn’t turn everyone’s experiments into the wild west.
I think that this approach enables us to satisfy #1-6 – it’s very
terse, it’s easy to see the structure of the experiment, experiments
are verifiable at parse-time, it retains the existing basic structure
of MW (enabling native multi-threadedness), state changes are all done
within the infrastructure of MW so it all gets logged properly, and
you can use a full-featured scripting language in a pinch if you need
to.
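As one concrete example of what "verifiable at parse-time" could mean,
the sketch below checks that every transition points at a state that
actually exists before anything is run. The `check_transitions` function
and the plain dict it takes are hypothetical simplifications for
illustration, not part of MW or of the prototype parser.

```python
def check_transitions(states):
    """Return error strings for transitions that point at unknown states.

    `states` maps a state name to the list of state names it can
    transition to (a simplified stand-in for a parsed experiment).
    """
    errors = []
    for name, targets in states.items():
        for target in targets:
            if target not in states:
                errors.append(
                    f'state "{name}": unknown transition target "{target}"')
    return errors


states = {
    "Start state": ["State 2", "Initiated"],
    "State 2": ["Start state"],
}
for problem in check_transitions(states):
    print(problem)
```

With the example data, the transition to "Initiated" is reported
because no state of that name is defined, so the mistake surfaces
before the experiment is set in motion.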
This approach also offers an interim solution whereby users can start
using the new DSL with the existing infrastructure, since the DSL is
one-to-one compilable into XML, as an option. I’ve also got the
beginnings of a translator that will automatically convert old XML
experiments into the new DSL. Of course, even with the availability
of the new DSL, XML and the editor would/could naturally remain a
one-to-one equivalent path into basically the same parser.
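The one-to-one mapping from a parsed block structure to XML can be
sketched in a few lines. The element and attribute names below simply
reuse the block names from the DSL snippet and are assumptions for
illustration; they are not the actual MW XML schema, and `to_xml` is a
hypothetical helper.

```python
import xml.etree.ElementTree as ET


def to_xml(node):
    """Convert a nested dict into an XML element, one element per node."""
    attrs = {"tag": node["tag"]}
    # XML attribute values must be strings, so stringify everything.
    attrs.update({key: str(value)
                  for key, value in node.get("attrs", {}).items()})
    element = ET.Element(node["kind"], attrs)
    for child in node.get("children", []):
        element.append(to_xml(child))
    return element


tree = {
    "kind": "experiment",
    "tag": "My experiment",
    "children": [
        {"kind": "protocol", "tag": "Test protocol",
         "attrs": {"draw": 4}, "children": []},
    ],
}

xml_text = ET.tostring(to_xml(tree), encoding="unicode")
print(xml_text)
```

Because each block becomes exactly one element, the reverse direction
(XML back to the DSL) is the same walk in the other direction.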
Anyhow, lots of options, and lots of room for debate.
Dave