joins with type coercion: 1.5 == 1 is TRUE #2592
Good spot and agree. With the caveat that it should only coerce to double if
I have done a bit more extensive testing on joins with non-matching types. The table below lists the findings and my proposed change of behaviour. Code for examples can be found at the end of this post.
Here is the code with examples for all cases:
I have updated the table with type conversions in the last post to reflect all combinations.
If a column is all NA in dt2 and, say, character in dt1 (or really any other type in dt1), shouldn't the merge still work rather than fail, treating the NA column in dt2 as having the type it has in dt1? Sometimes you do not know that you are getting NA, and there is no reason for an all-NA column to count as logical.
Is that supposed to fail?
@Fablepongiste Thanks for reporting.
dt1 <- data.table("a" = 1, b = NA_character_)
dt2 <- data.table("a" = 2, b = NA_integer_)
merge(dt1, dt2, by = intersect(names(dt1), names(dt2)), all = TRUE, sort = FALSE)
@jangorecki Do you expect this fix to be released soon, at least on the develop branch?
@Fablepongiste I cannot promise, but we are trying to clear out the PR queue, so it will definitely be on our list.
Interesting case @Fablepongiste. We will have to check whether PR #2734 covers that and amend it if not. Unfortunately, the PR is very old, so merging will be painful and I have very limited time at the moment. I will nonetheless try to look into it in the coming weeks.
I managed to take a look. #2734 now covers all-NA coercion. Thanks @Fablepongiste for raising.
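The all-NA coercion discussed above can be sketched in base R. This is only an illustration of the idea, not the code from #2734: the helper name coerce_all_na is hypothetical, and it simply gives an entirely-NA column the storage type of its join partner before the two are compared.

```r
# Hypothetical helper, not part of data.table: if `col` is entirely NA,
# give it the storage type of `target`; otherwise return it unchanged.
coerce_all_na <- function(col, target) {
  if (all(is.na(col)) && typeof(col) != typeof(target)) {
    col <- as.vector(col, mode = typeof(target))
  }
  col
}

b1 <- NA_character_                    # column b as in dt1
b2 <- NA_integer_                      # column b as in dt2
b2_fixed <- coerce_all_na(b2, b1)      # all NA, so it takes b1's type
kept <- coerce_all_na(c(1L, NA), b1)   # not all NA, so it is left alone
```

A column that holds any real value keeps its own type, so the helper only touches the case where the type carries no information.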
Maybe this is a known issue, but to me it came as a very bad surprise.
During a join, if i is coerced to integer to match x's column type, 1.5 is joined to 1:
The reason is the blind coercion to integer of dt2$x during bmerge.
We should adopt the base R logic of merge.data.frame, where the join columns get coerced to the 'highest' involved type (https:/wch/r-source/blob/e690b0d6998dfbc360f0fa14492eb8648df20949/src/main/unique.c), lines 902ff:
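The difference can be sketched in base R alone, which avoids depending on any particular data.table version. The data frames x and i below are illustrative and not taken from the original report. Truncating the double key to integer makes 1.5 collide with 1, whereas match() and merge.data.frame compare the integer and double keys as doubles, so only 2 matches.

```r
# Illustrative data (hypothetical, not from the original report).
x <- data.frame(x = 1:3, a = c("a", "b", "c"))    # integer key
i <- data.frame(x = c(1.5, 2), b = c("p", "q"))   # double key

# Blind coercion to integer truncates 1.5 to 1, creating a false match.
truncated <- as.integer(i$x)
false_match <- truncated[1] == x$x[1]

# match() promotes both keys to double, so 1.5 matches nothing.
pos <- match(i$x, x$x)

# merge.data.frame keeps only the true match on x == 2.
res <- merge(x, i, by = "x")
print(res)
```

Coercing towards the higher type (integer to double) loses no information, which is why it is the safer direction for a join key.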