Expression parser and floats/ints

Hi Chris,

Are functions like round() and int() available in the MWorks expression parser? I want to do some casting between floating point values and boolean (i.e. valueOfOneOrTwo = (rand(1) > 0.5) + 1).

I’m wondering whether I have to worry about floating point errors on comparisons like this – i.e. the old “0.1 == 1.0/10 is false on some machines” issue. With a round/int/floor function I could do this explicitly.

Thank you,
Mark

Hi Mark,

The expression parser supports C-style type casts of the form (type)value. The supported types are bool; signed integer types char, short, int (aka integer), and long; unsigned integer types byte, word, dword, and qword; floating-point types float and double; and string.

You can use casts to truncate a float to an integer. For example, (int)1.9 evaluates to 1.
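For comparison, here is how the same truncation and the one-or-two idiom from the original question look in Python. This is only an analogy in another language, not MWorks expression syntax, and `one_or_two` is a hypothetical helper name. Note that the comparison itself yields an exact boolean, so no floating-point equality test is involved.

```python
import random

# Truncation toward zero, analogous to a C-style (int) cast.
truncated_positive = int(1.9)   # 1
truncated_negative = int(-1.9)  # -1


def one_or_two(rng=random):
    """Return 1 or 2 by converting a boolean comparison to an integer."""
    # The comparison produces exactly True or False; int() maps it to 1 or 0.
    return int(rng.random() > 0.5) + 1
```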

Cheers,
Chris

Thanks, Chris. I figured out parts of this from the online version of the STX parser:
https://idlebox.net/2007/stx-exparser/online.htt

One final question - the parser clearly maintains a type for all numbers. I found that the server will exit on an assertion if you try to add a number to an uncasted value of type boolean. Do you use a float/double internally for all scalar numeric consts or does MWorks use a mix of ints/floats?

Thank you
Mark

One final question - the parser clearly maintains a type for all numbers. I found that the server will exit on an assertion if you try to add a number to an uncasted value of type boolean. Do you use a float/double internally for all scalar numeric consts or does MWorks use a mix of ints/floats?

Yeah, the expression parser (via the stx::AnyScalar class) forbids arithmetic operations involving booleans. However, if the boolean value is coming from an MWorks variable, then the expression parser sees it as an integer, so arithmetic is allowed.

For example, if bool_var is an MWorks variable of type “boolean” with a default value of “true”, then bool_var + 2 is valid and evaluates to 3 (see attached example). However, (bool)bool_var + 2 will fail to parse and produce a “No binary operators are allowed on bool values” error. Whether this distinction is wise or helpful is another question.
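The rule described above can be modelled in a few lines of Python. This is a sketch of the behaviour as reported in this thread, not the stx::AnyScalar implementation; `checked_add` is a hypothetical name, and the error text is copied from the message quoted above.

```python
def checked_add(a, b):
    """Add two scalars, rejecting explicitly boolean operands."""
    if isinstance(a, bool) or isinstance(b, bool):
        raise TypeError("No binary operators are allowed on bool values")
    return a + b


# A boolean MWorks variable reaches the parser as an integer (per the thread),
# so it is modelled here as the int 1 rather than True.
bool_var = 1
from_variable = checked_add(bool_var, 2)  # allowed

try:
    checked_add(bool(bool_var), 2)  # models (bool)bool_var + 2
    cast_rejected = False
except TypeError:
    cast_rejected = True
```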

Internally, both the expression parser and MWorks (via the mw::Datum class) maintain type info for all values. In particular, both distinguish between booleans, integers, and floats.

Cheers,
Chris

Attachment: bool_var.xml (961 Bytes)

Great - this is good to know; for now I’m going to be using explicit casts.

Internally, both the expression parser and MWorks (via the mw::Datum class) maintain type info for all values. In particular, both distinguish between booleans, integers, and floats.

But didn’t you say that variables in the XML ignore whether they are type=“integer” or type=“float”?
This is important if expressions like intVariable/2 don’t automatically cast the int to a float.
Mark

But didn’t you say that variables in the XML ignore whether they are type=“integer” or type=“float”? This is important if expressions like intVariable/2 don’t automatically cast the int to a float.

Variables themselves don’t care and will happily store values of any type. However, the “type” parameter does matter to the XML parser, which uses it to decide how the text in the “default_value” parameter is converted to a Datum instance. So if intVariable has type="integer", then intVariable/2 will indeed result in integer (i.e. truncating) division.
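As a rough illustration of why the declared type changes the result of intVariable/2, here is a Python sketch. The helper names `parse_default` and `divide` are hypothetical, and the conversion table is an assumption made for the example; it does not reproduce the MWorks XML parser.

```python
def parse_default(text, declared_type):
    """Convert default_value text to a value using the declared type."""
    converters = {"integer": int, "float": float, "string": str}
    return converters[declared_type](text)


def divide(a, b):
    """C-style division: truncating for two ints, floating-point otherwise."""
    if isinstance(a, int) and isinstance(b, int):
        quotient = abs(a) // abs(b)
        # Truncate toward zero, as C does (Python's // floors instead).
        return quotient if (a < 0) == (b < 0) else -quotient
    return a / b


as_integer = divide(parse_default("7", "integer"), 2)  # integer division
as_float = divide(parse_default("7", "float"), 2)      # float division
```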

Chris

Hi Chris,
Do you think you could document in more detail how types are assigned and converted by MWorks?
You seem to be saying the parser respects the supplied type. Can the variable type then change based on e.g. “assignment” actions? If so how does MWorks detect the type of a literal, or of an expression? What about on save and restore of variables?
What about passing to the matlab and python bridges?

Sorry - I now realize this is more complicated than I had assumed and people that write XML probably need to know the type assignment and casting rules. No rush on this, my current code works.

Thank you
Mark

Hello,

Just wanted to revive this thread for discussion since one of my students is getting bitten by type issues like this.

The problem arises when you assign a value, such as “0”, to a variable that is marked as a float. This action “demotes” the variable to an integer, which isn’t necessarily a problem, but it can have weird consequences (e.g. if it is participating elsewhere in a division… suddenly what was a float division is now a truncating integer division). Very confusing and sometimes hard to reproduce, especially if that “0” gets entered from the client. Simple test case attached.
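A minimal Python model of the "demotion" described above, assuming the declared type is applied only to the default value and later assignments take the type of whatever literal is entered. `LooseVariable` and `c_div` are hypothetical names and this is not the MWorks Variable class.

```python
import ast


class LooseVariable:
    """Variable whose declared type only affects the default value."""

    def __init__(self, declared_type, default_text):
        convert = {"integer": int, "float": float}[declared_type]
        self.value = convert(default_text)

    def assign(self, text):
        # The new value's type comes from the literal: "5" -> int, "5.0" -> float.
        self.value = ast.literal_eval(text)


def c_div(a, b):
    """Integer division for two ints (valid here for non-negative operands)."""
    if isinstance(a, int) and isinstance(b, int):
        return a // b
    return a / b


v = LooseVariable("float", "5")
before = c_div(v.value, 2)  # float division
v.assign("5")               # e.g. typed into the client without a decimal point
after = c_div(v.value, 2)   # now integer division
```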

If those variable “type” fields are going to be binding in some contexts, perhaps we should make them binding in all contexts? This would simply require enforcing types whenever an assignment is made, and should be a fairly surgical change. I’m not sure if we’re deriving any benefit from having numerical values be loosely typed like this.

Thoughts?

  • Dave

Attachment: test_types.xml (1.81 KB)

Aha. This may describe what has bitten me in the past.

I’m in favor of specifying the variable type in the XML and keeping it the same throughout the lifetime of the variable.

Mark

Hi Dave & Mark,

If those variable “type” fields are going to be binding in some contexts, perhaps we should make them binding in all contexts?

I’d like to point out again that the “type” field matters only when parsing the default value of the variable. The declared type of should_be_a_float_variable in Dave’s example could be any of “integer”, “float”, “boolean”, or “string”, and the output of the experiment would be identical (seriously, try it), because the value of the variable is changed before it’s ever used.

As I recall, the reason why the “type” field is needed is that it was impossible to give a variable a default value of string type without it, e.g.

<variable tag="name" default_value="Chris" type="string" ...

Thinking about it now, it seems like this shouldn’t be necessary, since the expression parser is fully capable of recognizing string literals (see attached example). But for some reason I opted to rely on the “type” parameter, and if I thought about it long enough I’d probably remember why.

Anyway, regarding the specific issues at hand, I think there’s a simpler solution. The two examples of unexpected/confusing behavior cited in this thread (i.e. (bool)bool_var + 2 raising an exception, and 1/5 evaluating to 0) are both the result of design decisions in the STX expression evaluator. In my opinion, neither behavior is very useful. I don’t see any danger in treating boolean true and false as integer 1 and 0 (and off the top of my head, I can’t think of any programming language that doesn’t do that), so it seems pointless to disallow it. And I assume that most MWorks users would expect 1/5 to evaluate to 0.2; in the unlikely case that someone really wants truncating division, they can get it by casting the result of floating-point division to an integer.

So, why not just change the expression evaluator to eliminate those behaviors? That is, allow boolean true and false in arithmetic expressions, and change the division operation to always return a floating-point result (as is the case with division in Python 3). That would resolve these issues without requiring any changes to MWorks’ XML parser or the Variable and Datum classes.
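Python 3 already behaves the way this proposal describes, so its semantics give a quick preview. The snippet shows Python only and makes no claim about how the MWorks evaluator would implement the change.

```python
# Booleans take part in arithmetic as 1 and 0.
bool_sum = True + 2

# Division of two integers always yields a float.
fraction = 1 / 5
even_split = 4 / 2

# Truncating division is still available by casting the float result.
truncated = int(7 / 2)
```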

What do you think? Is there some disadvantage to this approach that I’m not seeing?

Chris

Attachment: say_hello.xml (941 Bytes)

This would be okay by me. Integer division is nothing but trouble in 99% of cases.

The bool thing is fine by me too. Incidentally, there are plenty of languages that don’t allow booleans in arithmetic expressions (e.g. Scala), and there are principled reasons to want this behavior. However, we were just inheriting it from STX – it wasn’t a principled decision – and I agree that most users will be most familiar with languages where true == 1 and false == 0.

  • Dave

If you’re going to implicitly promote integers to floats for division,
there are some edge cases.

In particular, expressions like
3/3 == 1 may not be true (depending on the base-2 floating point
representation of 2)

More subtly:
b = 3/3
and later in your code:
b==1 may not be true

This basically rules out any use of logical comparisons for numeric values
in MWorks, as you won’t know in general whether numeric values are floats
or ints at the time of comparison. Maybe you’ll need to add a
‘withintol(a,b)’ floating point comparison.
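A tolerance-based comparison of the kind suggested here could look like the following Python sketch. The name `within_tol` and the default tolerances are assumptions chosen for illustration; no such function is claimed to exist in MWorks.

```python
import math


def within_tol(a, b, rel_tol=1e-9, abs_tol=1e-12):
    """Return True if a and b are equal within a relative or absolute tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)


exact_equal = (0.1 + 0.2 == 0.3)        # exact comparison
close_enough = within_tol(0.1 + 0.2, 0.3)  # tolerant comparison
```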

Matlab does something similar - treats everything as a float. But to get
around the comparison issue they proactively detect integer representations
in floating point. (They call them ‘flints’ internally). I believe they
special case logical operations for this.

I’m not sure if there are other subtle issues beyond comparison that we are
missing.

My predisposition would be to keep both integers and float as first class
fixed types specified in the XML initialization code, and make users deal
with the differences between int and float math, handling casts themselves,
with no implicit conversions at assignment time. (I think my bug was due
to entering integers into the client window).

If you want to implicitly cast up that’s also fine with me but I’d suggest
adding a new logical comparison operator.

Mark

This basically rules out any use of logical comparisons for numeric values
in MWorks, as you won’t know in general whether numeric values are floats
or ints at the time of comparison. Maybe you’ll need to add a
‘withintol(a,b)’ floating point comparison.

This is a little strong. I haven’t worked it through completely in my
head; maybe most comparisons are done on variables that never are set
through math expressions. And maybe the implicit conversion to float is no
harder to understand than the current situation and documentation on it
will take care of this.
You guys should decide what you prefer.

Mark

Machine representations of numbers are never going to make everyone happy all of the time. Either we expect a fractional value and don’t get it, or we expect a specific comparison to work and we don’t get it. As Chris notes, there are examples of languages that make various choices along this spectrum, so there is no obvious consensus on the one “right” answer. The best we can do is strive for consistency and maximal clarity.

We can and must document whichever path we choose to achieve better clarity, but fundamentally, I think the options are:

  1. (old behavior) All numeric values are really floats. “1/2” results in “0.5”, but “1/2” isn’t guaranteed to be the same as “0.5” (but it often is). Best practice: all users need to know to be careful with “==”. A “compare-with-tol” would be a useful tool for advanced users.

  2. (current behavior) The “type” field only applies to the default value, which is potentially confusing, but could maybe be finessed with better labeling in the editor (for those who use it) and documentation (for those who don’t). Beyond that, Python 2.x rules basically apply.

  3. (my original suggestion from today) the “type” field is binding in all contexts (does not just apply to the default value). In this scenario, setting a “float” value to “1” would be the same as setting it to “1.0”. You can still get in trouble by setting a “float” to “1/2”, since this is an integer division (result would be “0.0”). This would basically be something like C/C++ rules.

  4. (Chris’s suggestion) the “type” field could remain non-binding or be removed (see #2 for issues / ways to improve), but division would always result in floats. Integer division is no longer possible (good riddance I say), though “30/10” might not exactly evaluate to “3”. This is basically Python 3.x rules.
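To make the difference between options 3 and 4 concrete, here is a small Python sketch of a variable whose declared type is binding on every assignment. `TypedVariable` is a hypothetical class; Python's `//` stands in for integer division and `/` for always-float division.

```python
class TypedVariable:
    """Variable that coerces every assigned value to its declared type."""

    def __init__(self, declared_type, default):
        self._convert = {"integer": int, "float": float}[declared_type]
        self.value = self._convert(default)

    def assign(self, value):
        self.value = self._convert(value)


x = TypedVariable("float", 0.5)

x.assign(1)
set_to_one = x.value       # coerced to float, same as assigning 1.0

x.assign(1 // 2)           # option 3: integer division happens before coercion
option_3_half = x.value

x.assign(1 / 2)            # option 4: division always yields a float
option_4_half = x.value
```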

Are there any other options I’m missing?

  • Dave

My main request is that it should be possible to assume whether a variable is a float or an int at any point in the XML code. So I’m in favor of your (1) or (3) below, or (4), as long as all variables are considered floats. I’d prefer to not have the assignment code guess the right type. Chris points out that type-guessing can cause problems with division and suggests making all division float division. I’d raised the point that type-guessing may also cause problems with comparisons.

I’ll agree with anything you guys decide as long as it’s documented.

I see it as two decisions
(a) is the ‘type’ field binding, or does variable assignment code try to guess the right type (or are all variables floats)?
(b) Does all division result in a float, or does int division exist?

Mark

I just gave a quick look at several different languages’ rules for type-guessing on assignment and division.
Python3 does what you say - all division is float division but numbers are ‘duck-typed’. And there hasn’t been a comparison outcry; I think largely because sums, differences, and products of mixed ints and integer-valued floats can be compared to ints safely (in all languages; excepting overflows).
If it works for Python3 it’s fine with me. So I support Chris’s suggestion from Friday.
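A few quick checks in Python 3 illustrate which comparisons are safe. They assume IEEE 754 double precision, which Python uses for floats on all common platforms; the last two lines show where exact comparison stops working.

```python
# Integer-valued results compare exactly against ints.
quotient_one = (3 / 3 == 1)
quotient_three = (30 / 10 == 3)
mixed_sum = (2 + 3.0 == 5)
mixed_product = (4 * 2.5 == 10)

# Non-representable fractions do not compare exactly.
fraction_mismatch = (0.1 * 3 != 0.3)

# Beyond 2**53, floats can no longer represent every integer.
big = 2.0 ** 53
precision_lost = (big + 1 == big)
```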

Mark

I’m still okay with that suggestion as well. However, I think we should additionally do some improved labeling/documentation around the “type” field, or just remove it altogether.

  • Dave