rbindlist could use an "ignore.attributes" argument #3911

Closed
arunsrinivasan opened this issue Sep 25, 2019 · 5 comments · Fixed by #5446
Labels
idate/itime rbindlist top request One of our most-requested issues

Comments

@arunsrinivasan
Member

Binding integer64 with integer/numeric works silently and correctly, whereas binding IDate with Date fails, as does binding columns that carry attributes when one or more of the tables are empty (0 rows).

integer64 case:

require(data.table) # 1.12.2, the current CRAN version (I don't think things change with current devel)
require(bit64)

ll <- list(data.table(a=1, b=as.integer64(2)), data.table(a=2, b=1))
str(rbindlist(ll))
# Classes ‘data.table’ and 'data.frame':	2 obs. of  2 variables:
#  $ a: num  1 2
#  $ b:integer64 2 1 
#  - attr(*, ".internal.selfref")=<externalptr> 

The 2nd element's col b seems to be implicitly converted to integer64 type without a message/warning.

IDate case:

ll <- list(data.table(a=1, b=as.IDate(Sys.Date())), data.table(a=2, b=Sys.Date()))
rbindlist(ll)
# Error in rbindlist(ll) : 
#   Class attribute on column 2 of item 2 does not match with column 2 of item 1.

The first element's b column is IDate and the second's is Date, so rbindlist errors.
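
One workaround for this case is to coerce the column to a common class before binding. A minimal sketch, assuming the column is named b as in the example; the fixed dates are illustrative, and copy() is used only so the input tables are not modified by reference:

```r
library(data.table)

ll <- list(
  data.table(a = 1, b = as.IDate("2019-09-25")),
  data.table(a = 2, b = as.Date("2019-09-26"))
)

# Coerce b to IDate in a copy of each table so the class attributes match.
ll_idate <- lapply(ll, function(dt) copy(dt)[, b := as.IDate(b)])
res <- rbindlist(ll_idate)
```

The result keeps the IDate class; coercing everything to Date instead would work the same way.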

0-row data.table case:

ll <- list(data.table(a=1, b=Sys.time()), data.table(a=numeric(0), b=numeric(0)))
rbindlist(ll)
# Error in rbindlist(ll) : 
#   Class attribute on column 2 of item 2 does not match with column 2 of item 1.

Even if the empty table has the wrong type, it could be skipped, since it contributes no rows to the final table. (This used to work in previous versions when fill=TRUE -- probably by accident.)
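
A workaround sketch for the 0-row case, assuming it is acceptable to drop empty tables entirely (which also discards whatever column types they carried):

```r
library(data.table)

ll <- list(
  data.table(a = 1, b = Sys.time()),
  data.table(a = numeric(0), b = numeric(0))
)

# Keep only the tables that actually contribute rows, then bind.
non_empty <- Filter(function(dt) nrow(dt) > 0L, ll)
res <- rbindlist(non_empty)
```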


In essence, I think it would be immensely convenient to have an argument that relaxes the strict attribute check, e.g., ignore.attr=FALSE (together with a global option, so that code written against the old, lenient behaviour continues to work), which, when enabled, would simply retain the first element's class.

And if the default is to be strict, then the first case (integer64) should also error, for consistency.

@statquant

Since data.table 1.13.0, where fread reads dates as IDate, I come across this issue all the time.

rbindlist(list(data.table(x=1, y = Sys.Date()), data.table(x=1, y = as.IDate(Sys.Date()))))
# Error in rbindlist(list(data.table(x = 1, y = Sys.Date()), data.table(x = 1,  :
#   Class attribute on column 2 of item 2 does not match with column 2 of item 1.

I think rbindlist should indeed ignore attributes in cases like this.

@pkress

pkress commented Nov 24, 2020

I just wanted to bump this issue as well - it's a real pain to have to specify dt[, date.col:=as.Date(date.col)] before binding data.tables together.
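
To avoid repeating that conversion per column, it can be wrapped in a small helper. This is only a sketch: idate_to_date is a hypothetical name, not part of data.table, and it assumes converting every IDate column to Date is what you want.

```r
library(data.table)

# Hypothetical helper: convert every IDate column of one table to Date.
idate_to_date <- function(dt) {
  dt <- copy(dt)  # leave the caller's table untouched
  cols <- names(dt)[vapply(dt, inherits, logical(1), what = "IDate")]
  if (length(cols)) dt[, (cols) := lapply(.SD, as.Date), .SDcols = cols]
  dt
}

ll <- list(
  data.table(x = 1, date.col = as.Date("2020-11-24")),
  data.table(x = 2, date.col = as.IDate("2020-11-25"))
)

# Normalise each table first, then bind.
res <- rbindlist(lapply(ll, idate_to_date))
```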

@therosko

therosko commented Oct 6, 2021

I can only agree with this. Seems like a completely unnecessary step to add to a process, if the attributes could have just been ignored.

@Nj221102
Contributor

Nj221102 commented Mar 20, 2024

I'm working on this issue and will raise a PR soon.

@MichaelChirico MichaelChirico added the top request One of our most-requested issues label Apr 14, 2024
@Nj221102 Nj221102 removed their assignment Jul 6, 2024