Seed for randomization derived from non-numeric-looking string is always 0 #800

Open
brontolosone opened this issue Oct 14, 2024 · 0 comments · May be fixed by #801


Related: getodk/web-forms#240
Related: getodk/web-forms#49

Here is a form that uses choice randomization.
Yes, the referenced seed node is a text input! If you put non-numeric-looking text in it then, by my reading of OpenRosa, it evaluates to a NaN double, which becomes 0 when converted to the Long that the Park-Miller PRNG takes as a seed. This means that basically any text that doesn't happen to look like a number results in the same seed.
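
To illustrate the collapse described above, here is a minimal Java sketch. It is not the actual JavaRosa code: the class `NanSeedDemo` and the helper `seedFromText` are hypothetical names, and `Double.parseDouble` is only a stand-in for the XPath string-to-number conversion (it accepts a few inputs, such as hex floats, that XPath would not). What it does show is the Java rule that casting NaN to `long` yields 0.

```java
public class NanSeedDemo {

    // Hypothetical helper: parse the text as a double, fall back to NaN
    // when it does not look like a number, then narrow to long.
    static long seedFromText(String text) {
        double value;
        try {
            value = Double.parseDouble(text);
        } catch (NumberFormatException e) {
            value = Double.NaN; // stand-in for "cannot be parsed as a number"
        }
        // Java's narrowing conversion maps NaN to 0.
        return (long) value;
    }

    public static void main(String[] args) {
        System.out.println(seedFromText("42"));     // numeric-looking text keeps its value
        System.out.println(seedFromText("apple"));  // non-numeric text collapses to 0
        System.out.println(seedFromText("banana")); // ... and so does any other such text
    }
}
```

Under these assumptions, `"apple"` and `"banana"` both produce the seed 0, so a PRNG seeded with either one would yield the same choice order.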

It's not a huge disaster (which is probably why it went unnoticed), but on the other hand nothing indicates this behaviour: the spec (XLSForm, not XForms) doesn't say "only use integer / numeric types or number-like text", ODK Validate doesn't say anything about it, and there is no runtime warning (not that we'd want one).
So to a user doing a superficial test of their survey, it looks like the choice list gets randomized when they put some text in the designated seed field. With more extensive testing they might find out that any non-numeric-looking string results in the same sort order, and only at that point would they start to think about how what was promised relates to these observations, and how those observations relate to what they want. That sort of ambiguity is probably not what we want.

I stumbled upon this in the context of getodk/web-forms#49.
Of the alternatives I could come up with ("do nothing but port the behaviour to Web Forms", "clarify the current behaviour in the spec", and "fix things"), @lognaturel is leaning towards "fix things", which I have thought about as follows:

  • in the fallthrough case (the string can't be parsed into a Double), which currently results in NaN, we take the original string value instead, hash-digest it (UTF-8 encoded), and read the first 8 bytes of the resulting digest as a double. The goal is stable, reproducible (cross-platform) seeded randomization regardless of the nature of the input, while giving the same results as today for inputs that are currently not squashed to 0 (and thus staying backwards-compatible).
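
The proposal above could be sketched roughly as follows. This is a sketch under stated assumptions, not the change in the linked pull request: the class and method names are hypothetical, SHA-256 is an assumed choice (the issue does not name a hash function), and the first 8 digest bytes are read big-endian directly as a `long` rather than as a double, because the PRNG seed ends up as a Long anyway. `Double.parseDouble` again stands in for the real string-to-number conversion.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class HashedSeedSketch {

    // Hypothetical helper: numeric-looking text keeps its current seed,
    // everything else is hashed instead of collapsing to 0.
    static long seedFromText(String text) {
        try {
            double value = Double.parseDouble(text);
            if (!Double.isNaN(value)) {
                return (long) value; // unchanged behaviour for numeric input
            }
        } catch (NumberFormatException e) {
            // fall through to the hash-based seed below
        }
        return hashToLong(text);
    }

    // Assumption: SHA-256 over the UTF-8 bytes, first 8 bytes read big-endian.
    static long hashToLong(String text) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(text.getBytes(StandardCharsets.UTF_8));
            return ByteBuffer.wrap(digest, 0, 8).getLong();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(seedFromText("42"));
        System.out.println(seedFromText("apple"));
        System.out.println(seedFromText("banana"));
    }
}
```

Fixing the encoding (UTF-8), the digest algorithm, and the byte order is what would make the seed reproducible across platforms and implementations; any other implementation would need to make the same three choices to produce matching orderings.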