FR: warn/error when updating and i has duplicates #2837
Comments
Detecting duplicates in … The current behavior is "correct" in the following sense: as we go along vector …
Is it not possible to use a hash table to say "seen"/"not seen" at each element? What about an option to green-light the speed hit? Also, in the case of a keyed subset, the cost is trivial.
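The "seen"/"not seen" idea above can be sketched in a few lines. This is a language-agnostic illustration in Python, not data.table's implementation; the function name `first_duplicate` is hypothetical. It makes a single pass over the index vector and stops at the first repeat, so the extra cost is one hash lookup per element.

```python
def first_duplicate(indices):
    """Return the position of the first repeated index, or None if all are unique.

    Hypothetical helper for illustration only.
    """
    seen = set()  # hash table of indices encountered so far
    for pos, idx in enumerate(indices):
        if idx in seen:
            return pos  # idx was already seen at an earlier position
        seen.add(idx)
    return None


print(first_duplicate([1, 2, 1]))  # position of the repeated 1
print(first_duplicate([3, 1, 2]))  # no repeats
```

Stopping at the first repeat keeps the check cheap when duplicates appear early; when the indices are all unique, the whole vector is scanned once.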
Somewhat related: the same problem during a join, #2022. Fwiw, I find that I use both joins and Michael's approach, since often … is more expedient than adding a row-number column... so my code is vulnerable to both idioms until/unless I add a dupe check.
If we want to add so many checks, we should also collect more attributes. We already keep info on whether an object is sorted; we could also store whether any value is NA, whether there are duplicates, or uniqueN. That way we can at least reduce the overhead of the extra checks.
Related: #2879. Agree with Pasha that the potential speed hit should be the primary consideration re: whether to implement this. At a minimum, we should make sure there's a quick blurb in …
It seems the final element to be assigned wins (here 4 corresponds to the second instance of 1 in i). Not clear what the right behavior is in this case; my guess is most often it's a user mistake, hence a warning. But also possibly an error, since "correct" behavior is ambiguous.
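To make the "last assignment wins" behavior and the proposed warning/error concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not data.table code: `assign_at` and its `on_duplicate` parameter are hypothetical names, and the indices are 0-based here, whereas the example in the issue uses R's 1-based indexing.

```python
import warnings


def assign_at(values, indices, new_values, on_duplicate="warn"):
    """Assign new_values into a copy of values at the given 0-based indices.

    Hypothetical helper: on_duplicate is "warn", "error", or "ignore".
    """
    if len(indices) != len(new_values):
        raise ValueError("indices and new_values must have the same length")
    if on_duplicate != "ignore" and len(set(indices)) != len(indices):
        msg = "indices contain duplicates; later assignments overwrite earlier ones"
        if on_duplicate == "error":
            raise ValueError(msg)
        warnings.warn(msg)
    out = list(values)
    # Assign in order, so a repeated index keeps the last value written to it.
    for idx, val in zip(indices, new_values):
        out[idx] = val
    return out


# Index 0 appears twice; the second value (4) overwrites the first (3).
print(assign_at([0, 0, 0], [0, 0], [3, 4], on_duplicate="ignore"))
```

With `on_duplicate="warn"` the same call returns the same result but emits a warning first, and with `"error"` it refuses the ambiguous assignment outright, which mirrors the two options raised in the request.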