"not in" operator %!in% #4152

Closed
jangorecki opened this issue Dec 31, 2019 · 5 comments · Fixed by #4931

Comments

@jangorecki
Member

jangorecki commented Dec 31, 2019

FR to propose a not-in operator

`%!in%` = function(x, y) !(x%in%y)
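
A minimal usage sketch of the proposed operator, in base R only. The input values are illustrative assumptions, not taken from the issue.

```r
# Definition as proposed above: negate the result of %in%
`%!in%` <- function(x, y) !(x %in% y)

# Illustrative inputs (assumed for this sketch)
res <- c(1, 5, 20) %!in% 1:10
print(res)  # FALSE FALSE TRUE

# NA is not found in 1:10, so it counts as "not in"
na_res <- NA %!in% 1:10
print(na_res)  # TRUE
```

Because `%in%` never returns NA, the negated form never returns NA either.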

The motivation is to have a single call that we could optimise. Otherwise we have to inspect the unevaluated expression tree more deeply (and in a much more complex way) to know when the user is doing a not-in operation.
Pseudocode in the R C API:

CAR(x %!in% 1:10)
#`%!in%`
CAR(! x %in% 1:10)
#`!`
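
The same comparison can be sketched at the R level, assuming that `quote()` plus `[[1]]` is an acceptable stand-in for `CAR()` on a language object. Nothing is evaluated here, so `%!in%` does not need to be defined for the expressions to parse.

```r
# Unevaluated calls; only the parse tree is inspected
e1 <- quote(x %!in% 1:10)
e2 <- quote(!x %in% 1:10)

head1 <- as.character(e1[[1]])       # "%!in%": the not-in intent is at the top level
head2 <- as.character(e2[[1]])       # "!": %in% binds tighter than unary !
inner <- as.character(e2[[2]][[1]])  # "%in%": only visible one level down

print(c(head1, head2, inner))
```

With the dedicated operator the head of the call already identifies the operation, whereas the base R spelling requires looking at the argument of `!` as well.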
@jangorecki changed the title from "not in" operator to "not in" operator %!in% on Dec 31, 2019
@jangorecki
Member Author

jangorecki commented Jan 2, 2020

Actually, we still need to handle ! because !x%in% was just one example; another is !x%like%y (where we would need x%!like%y). Asking users to rewrite their code from base R style to our functions just so we can optimize it is definitely not robust. And it turns out that handling ! was not that difficult. I will leave this issue open because there are upvotes, so it seems such a helper function might be useful anyway.
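
As a rough illustration of what handling `!` in the expression tree could look like, here is a sketch at the R level. The helper name `is_negated_call` is hypothetical and is not data.table's implementation, which works on the C side; it only shows that one check covers `%in%`, `%like%`, and similar operators.

```r
# Hypothetical helper: TRUE when `expr` has the form !(lhs %op% rhs)
is_negated_call <- function(expr, op) {
  if (!is.call(expr) || !identical(expr[[1L]], as.name("!"))) return(FALSE)
  inner <- expr[[2L]]
  # Skip explicit parentheses, e.g. !(x %in% y)
  while (is.call(inner) && identical(inner[[1L]], as.name("("))) {
    inner <- inner[[2L]]
  }
  is.call(inner) && identical(inner[[1L]], as.name(op))
}

# The expressions are only parsed, so %like% need not be defined here
r1 <- is_negated_call(quote(!x %in% 1:10), "%in%")     # TRUE
r2 <- is_negated_call(quote(!(x %like% "a")), "%like%") # TRUE
r3 <- is_negated_call(quote(x %in% 1:10), "%in%")      # FALSE
print(c(r1, r2, r3))
```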

@jaapwalhout

Such a function is certainly useful! I would prefer to write it as %nin% though, which is easier to type (imho)

@legendre6891

I would really welcome the addition of %notin%, even if unoptimized; it is quite handy to have.

mczek added a commit to mczek/data.table that referenced this issue Mar 7, 2021
@mczek
Contributor

mczek commented Mar 9, 2021

I've been working on this and I'm wondering if someone can explain something for me. How does this function call get evaluated?
chmatch("a", c(), 5)
It returns 5, but as far as I can tell it dispatches to chmatchMain, which should return a vector allocVector(chin?LGLSXP:INTSXP, xlen) of length 0, i.e. logical(0). Somehow it returns 5 from R, even though there's no reference to the nomatch parameter before it returns.

@MichaelChirico
Member

MichaelChirico commented Mar 9, 2021

That's happening here:

data.table/src/chmatch.c

Lines 30 to 36 in 788c585

  const int tablelen = length(table);
  if (tablelen==0) {
    const int val=(chin?0:nomatch), n=xlen;
    for (int i=0; i<n; ++i) ansd[i]=val;
    UNPROTECT(nprotect);
    return ans;
  }

Note that this does match the corresponding match() behavior:

match("a", NULL, 5)
# [1] 5
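
The empty-table case can be checked with base R alone. This sketch uses only `match()` and `%in%`; it assumes, per the comment above, that `chmatch()` mirrors this behaviour, and it does not call data.table itself.

```r
# With an empty table every element of x is unmatched, so each gets `nomatch`
m1 <- match("a", NULL, 5)                          # nomatch is coerced to integer
m2 <- match(c("a", "b"), character(0), nomatch = 0L)

# %in% is match(..., nomatch = 0L) > 0L, so nothing is "in" an empty table
in_empty     <- c("a", "b") %in% character(0)
not_in_empty <- !(c("a", "b") %in% character(0))

print(m1)
print(m2)
print(not_in_empty)
```

The result has the length of `x`, not of `table`, which is why a length-one input yields a single `nomatch` value rather than a zero-length vector.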

@mattdowle added this to the 1.14.3 milestone on Jul 19, 2022
@jangorecki modified the milestones: 1.14.9, 1.15.0 on Oct 29, 2023

6 participants