Allow functions in setnames input #3703

Closed
smingerson opened this issue Jul 13, 2019 · 9 comments · Fixed by #3860

Comments

@smingerson

Below are two proposed enhancements to setnames(), contrasted with current alternatives. I think they coincide with data.table's desire to provide concise syntax. I'd be happy to work on a PR for either or both of these if you think they are worth adding.

library(data.table)

dt <- data.table(a = 1:2, b = c('fa', 'fb'), d = NA, yes = 1, no = 0)
dt
#>    a  b  d yes no
#> 1: 1 fa NA   1  0
#> 2: 2 fb NA   1  0

setnames(dt, c('A', 'B', 'D', 'YES', 'NO'))[]
#>    A  B  D YES NO
#> 1: 1 fa NA   1  0
#> 2: 2 fb NA   1  0
# Offer  `setnames(dt, function)` as an option to rename all using a function.
# This would be `setnames(dt, toupper)`
# Right now, I would do setnames(dt, toupper(names(dt))) -- not much more verbose,
# but I'd have to type descriptive table names twice.

to_change <- c("YES", "NO")
setnames(dt, to_change, paste0("IND_", to_change))[]
#>    A  B  D IND_YES IND_NO
#> 1: 1 fa NA       1      0
#> 2: 2 fb NA       1      0
# Offer `setnames(dt, patterns(...), function)` as an option to rename a subset.
# This would be `setnames(dt, patterns("[A-Za-z]{2,}"), function(x) paste0("IND_", x))`.
# In this simple example, the proposed syntax doesn't have much benefit, but in
# more complicated cases I think it would be more concise than
# to_change <- grep(pattern, names(dt), value = TRUE)
# setnames(dt, to_change, renaming_function(to_change))

Created on 2019-07-13 by the reprex package (v0.2.1)
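
The subset-renaming proposal above can be approximated today with a small wrapper. This is only a sketch under stated assumptions: `rename_matching` is a hypothetical helper, not part of data.table, and it uses a plain regular expression with `grep()` rather than the proposed `patterns()` syntax.

```r
library(data.table)

# Hypothetical helper (not part of data.table): rename the columns whose
# names match `pattern` by applying the function `fn` to the matched names.
rename_matching <- function(dt, pattern, fn) {
  old <- grep(pattern, names(dt), value = TRUE)  # matched names, not indices
  if (length(old)) setnames(dt, old, fn(old))    # setnames() renames by reference
  invisible(dt)
}

dt <- data.table(A = 1:2, B = c("fa", "fb"), YES = 1, NO = 0)
rename_matching(dt, "^(YES|NO)$", function(x) paste0("IND_", x))
names(dt)  # the columns are now A, B, IND_YES, IND_NO
```

Because `setnames()` modifies the table by reference, the wrapper needs no assignment at the call site, and the table name is typed only once.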

@shrektan
Member

Looks interesting to me.

@g3o2

g3o2 commented Jul 14, 2019

one could imagine a whole range of functions: lower, upper, proper, camelcase, snakecase ...

@MichaelChirico
Member

Makes sense and should be relatively easy for the basic functionality. New functions (proper/camelCase/snake_case) are a bit tougher, esp if we plan to export, though the following would seem to work:

# unexported
proper = function(x) ...
camelCase = function(x) ...
snake_case = function(x) ...
# exported
setnames(x, proper)

@shrektan
Member

shrektan commented Sep 11, 2019

Or we could simply tweak setnames(x, old, new) so that new can be any function that returns a character vector, and let users decide what kind of transformation they need...

@smingerson
Author

I was thinking the user would provide their own functions. The snakecase package already exists for a variety of casing options, though it does depend on stringr/stringi.

@MichaelChirico
Member

To be clear, I have two extensions in mind:

setnames(x: data.table, old: function)
setnames(x: data.table, old: {character, integer}, new: function)

Full customization could be delegated to downstreams, or we could roll a few of our own for ubiquitous cases like those laid out here.
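
As a rough illustration of those two call forms, the dispatch could look like the sketch below. `setnames_fn` is a hypothetical wrapper written for this example; it is not the implementation that was merged, and it relies only on the long-standing `setnames(x, new)` and `setnames(x, old, new)` behaviour.

```r
library(data.table)

# Hypothetical wrapper (an assumption for illustration, not data.table API).
setnames_fn <- function(x, old, new) {
  if (missing(new)) {
    # Form 1: a function passed as `old` is applied to every column name
    new_names <- if (is.function(old)) old(names(x)) else old
    setnames(x, new_names)
  } else {
    # Form 2: a function passed as `new` is applied to the selected names only
    if (is.numeric(old)) old <- names(x)[old]  # positions -> names
    new_names <- if (is.function(new)) new(old) else new
    setnames(x, old, new_names)
  }
  invisible(x)
}

dt <- data.table(a = 1:2, b = c("fa", "fb"), yes = 1, no = 0)
setnames_fn(dt, toupper)                                        # all columns
setnames_fn(dt, c("YES", "NO"), function(x) paste0("IND_", x))  # by name
setnames_fn(dt, 1L, tolower)                                    # by position
names(dt)  # the columns are now a, B, IND_YES, IND_NO
```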

A third way is to take the is.wholenumber approach used by base R in ?integer: not officially support any extension functions, but simply document an example implementation in the manual and let users roll their own by copy-paste. Or just refer users to e.g. snakecase in the manual.
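
A copy-paste example of that kind might look like the following. `to_snake` is a hypothetical base-R helper invented for this illustration; it is not taken from data.table or snakecase, and it makes no attempt to handle acronyms or other edge cases.

```r
# Hypothetical helper, base R only: convert camelCase or separator-delimited
# names to snake_case.
to_snake <- function(x) {
  x <- gsub("([a-z0-9])([A-Z])", "\\1_\\2", x)  # split camelCase humps
  x <- gsub("[^A-Za-z0-9]+", "_", x)            # dots, spaces, etc. -> underscore
  tolower(x)
}

snake_names <- to_snake(c("firstName", "Total.Sales", "net income"))
snake_names  # first_name, total_sales, net_income
```

With the current interface this would be used as `setnames(dt, to_snake(names(dt)))`.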

@shrektan
Member

shrektan commented Sep 11, 2019

  1. It may be difficult to decide which function to use if the extension functions are not exported (e.g., a camelCase function may already exist in the calling frame);
  2. I can imagine most such use cases are less about styling and more about adding prefixes/suffixes or very customized transforms... (@smingerson's original post is about prefixes, and so is my personal experience);
  3. For styling, maybe toupper() and tolower() are enough for many users, like myself... if not, they can write the function themselves or, as @smingerson mentioned, use other packages like snakecase.

So I personally vote for just referring users to e.g. snakecase in the manual...

(and we can always add the extension functions in the future if users request them...)

@jangorecki
Member

jangorecki commented Sep 11, 2019

As for me, I prefer the old style:

setnames(dt, fun(names(dt)))

@shrektan
Member

But if you only want to change a subset of the names, you may need to define a separate variable above like vars = c('a','b'), then call setnames(dt, vars, fun(vars)) ...

while @smingerson's idea is:

setnames(dt, c('a','b'), fun)
