Strange results from a non-equi join with multiple conditions #2275
Fixing #2360 also takes care of this. Please write back if not. Just issued the PR. Should be merged shortly, assuming tests pass. TODO: update the SO post linked by Frank.
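To check whether a given data.table version behaves correctly, one option is to compare the `.N`, `by=.EACHI` result against a slow row-by-row check. The following is only a minimal sketch under stated assumptions: the toy values are made up (this is not the reproducer from the SO post, which needed a much larger `n`), the column names follow the issue, and the join is written as `DT2[DT1, ...]` with the inequalities oriented accordingly, which is an assumption about how the original call was laid out.

```r
library(data.table)

# Hypothetical toy data; column names follow the issue, values are invented.
DT1 <- data.table(
  RANDOM_STRING = c("a", "a", "b", "c"),
  DATE = as.IDate(c("2017-01-05", "2017-03-01", "2017-01-10", "2017-01-01"))
)
DT2 <- data.table(
  RANDOM_STRING = c("a", "a", "b"),
  START_DATE    = as.IDate(c("2017-01-01", "2017-01-03", "2017-02-01")),
  EXPIRY_DATE   = as.IDate(c("2017-01-31", "2017-01-20", "2017-02-28"))
)

# Count matches in DT2 for each row of DT1 with a non-equi join;
# a row of DT1 counts as matched when at least one row of DT2 joins to it.
via_join <- DT2[DT1,
  on = .(RANDOM_STRING, START_DATE <= DATE, EXPIRY_DATE >= DATE),
  .N, by = .EACHI]$N > 0L

# Slow reference: test every row of DT1 against all rows of DT2.
brute_force <- vapply(seq_len(nrow(DT1)), function(i) {
  any(DT2$RANDOM_STRING == DT1$RANDOM_STRING[i] &
      DT2$START_DATE  <= DT1$DATE[i] &
      DT2$EXPIRY_DATE >= DT1$DATE[i])
}, logical(1))

# Any disagreement between the two vectors would be worth reporting back.
which(via_join != brute_force)
```

The brute-force loop is quadratic, so it only makes sense as a cross-check on a sample of rows, not as a replacement for the join.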
mattdowle added a commit that referenced this issue on Nov 8, 2017
arunsrinivasan pushed a commit that referenced this issue on Nov 11, 2017
This was brought up on SO.

The goal is to find out if each row in DT1 has a match in DT2 in the sense of `on=.(RANDOM_STRING, DATE >= START_DATE, DATE <= EXPIRY_DATE)`:

My usual approach is to do a join, counting matches with `.N` and `by=.EACHI`. However, the OP found that this fails here:

There is probably a way to come up with the correct result in a less slow way (`foverlaps`?), but my point is that I expect the `.N, by=.EACHI]$N > 0L` way to work. Is it failing thanks to a bug or am I mistaken in using it here?

I had trouble making a smaller example. Drop the `n` parameter by a factor of 10 and you'll see that the problem disappears. Stranger, the OP noticed that if you repeatedly run the `DT1[!(MATCHED), MATCHED := ... ]` line, it will keep making changes over many iterations. Also, the OP said they couldn't construct an example when the `on=` condition only contained one inequality.

EDIT: one faster way of coming up with the correct result, thanks to SO OP: