
Dataloaders with socket connection #803

Merged
merged 7 commits into from
Mar 31, 2022

Conversation

dfalbel (Member) commented Mar 30, 2022

Passing data between the workers can now be accelerated by using socket connections instead of files.

> bench_ds <- dataset(
+   initialize = function(size_of_tensor) {
+     self$size_of_tensor <- size_of_tensor
+   },
+   .getitem = function(i) {
+     list(x = torch_randn(self$size_of_tensor))
+   },
+   .length = function() {32*20}
+ )
> system.time({
+   withr::with_options(list(torch.dataloader_use_socket_con = TRUE), {
+     dl <- dataloader(bench_ds(256*256*3), batch_size = 32, num_workers = 4)
+     coro::loop(for(x in dl) {
+       k <- x
+     })
+   })
+ })
   user  system elapsed 
 48.416   2.419  58.796 
> system.time({
+   withr::with_options(list(torch.dataloader_use_socket_con = FALSE), {
+     dl <- dataloader(bench_ds(256*256*3), batch_size = 32, num_workers = 4)
+     coro::loop(for(x in dl) {
+       k <- x
+     })
+   })
+ })
   user  system elapsed 
 61.417   2.273  74.165 

Next steps should be investigating torch_save speedups, especially whether we could avoid the compression.
Perhaps also finding a different way of serializing the tensors.
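As a rough illustration of the compression cost mentioned above, here is a base-R sketch using saveRDS as a stand-in serializer (an assumption: this is not torch_save's actual serialization path, only a way to see how much time compression alone can add for a tensor-sized buffer of doubles):

```r
# Compare serialization time and file size with and without compression,
# using base R's saveRDS. Random doubles are nearly incompressible, so
# the compressed write mostly pays the compression cost for little gain.
x <- matrix(rnorm(256 * 256 * 3), nrow = 256)

f_compressed   <- tempfile()
f_uncompressed <- tempfile()

t_compressed   <- system.time(saveRDS(x, f_compressed,   compress = TRUE))
t_uncompressed <- system.time(saveRDS(x, f_uncompressed, compress = FALSE))

cat("compressed:  ", t_compressed[["elapsed"]], "s,",
    file.size(f_compressed), "bytes\n")
cat("uncompressed:", t_uncompressed[["elapsed"]], "s,",
    file.size(f_uncompressed), "bytes\n")

unlink(c(f_compressed, f_uncompressed))
```

The same trade-off applies to batches moving between dataloader workers: the payload is written once and read once, so time spent compressing is rarely recovered.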

dfalbel added the lantern label Mar 30, 2022
dfalbel (Member, Author) commented Mar 30, 2022

With further improvements in the serialization function we get:

> system.time({
+   withr::with_options(list(torch.dataloader_use_socket_con = TRUE), {
+     dl <- dataloader(bench_ds(256*256*3), batch_size = 32, num_workers = 4)
+     coro::loop(for(x in dl) {
+       k <- x
+     })
+   })
+ })
   user  system elapsed 
  2.330   0.942   6.623 
> system.time({
+   withr::with_options(list(torch.dataloader_use_socket_con = FALSE), {
+     dl <- dataloader(bench_ds(256*256*3), batch_size = 32, num_workers = 4)
+     coro::loop(for(x in dl) {
+       k <- x
+     })
+   })
+ })
   user  system elapsed 
  4.409   0.814  14.548

dfalbel (Member, Author) commented Mar 30, 2022

Further improvements from callr (r-lib/callr#223) make it slightly faster, especially when not using the socket connections.

> system.time({
+   withr::with_options(list(torch.dataloader_use_socket_con = TRUE), {
+     dl <- dataloader(bench_ds(256*256*3), batch_size = 32, num_workers = 4)
+     coro::loop(for(x in dl) {
+       k <- x
+     })
+   })
+ })
   user  system elapsed 
  2.248   0.987   6.796 
> system.time({
+   withr::with_options(list(torch.dataloader_use_socket_con = FALSE), {
+     dl <- dataloader(bench_ds(256*256*3), batch_size = 32, num_workers = 4)
+     coro::loop(for(x in dl) {
+       k <- x
+     })
+   })
+ })
   user  system elapsed 
  2.166   0.896   8.003

dfalbel merged commit c3bfb63 into main Mar 31, 2022
dfalbel deleted the compress branch Mar 31, 2022 00:17
hsbadr mentioned this pull request Mar 31, 2022
dfalbel added a commit that referenced this pull request Mar 31, 2022