Faster dataloader #900

dfalbel · 2022-10-10T20:35:04Z

Improve dataloader performance.

No longer use coro::generator() in critical parts, as it adds a lot of overhead.
Don't use coro::as_iterator(), as it dispatches to coro's <<-, which is slower than base R's.
nit: Don't recreate the sliced tensor using Tensor$new(), as it adds a small overhead and is no longer necessary.
Vectorized batch samplers allow for faster loading, especially when using .getbatch() instead of .getitem().
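To illustrate the last point, here is a minimal base-R sketch, not the code from this PR. `make_batches`, `getitem` and `getbatch` are hypothetical stand-ins for a batch sampler and a dataset's `.getitem()` / `.getbatch()` methods, and a plain matrix stands in for a tensor.

```r
# Hypothetical stand-in for a dataset: 10 rows, 3 features.
data <- matrix(seq_len(30), nrow = 10, ncol = 3)

# Vectorized batch sampler (illustrative): compute every batch of indices
# in one call instead of yielding indices one at a time.
make_batches <- function(n, batch_size) {
  idx <- seq_len(n)
  unname(split(idx, ceiling(idx / batch_size)))
}

# Per-item access: one call per index, results combined afterwards.
getitem <- function(i) data[i, ]

# Batch access: a single vectorized subset for the whole index vector.
getbatch <- function(indices) data[indices, , drop = FALSE]

batches <- make_batches(nrow(data), batch_size = 4)

# Both strategies return the same rows; getbatch needs one call per batch.
slow <- do.call(rbind, lapply(batches[[1]], getitem))
fast <- getbatch(batches[[1]])
```

With per-item access the number of calls grows with the batch size, while batch access stays at one call per batch, which is presumably where the gain from `.getbatch()` comes from.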


dfalbel added 3 commits October 10, 2022 17:31

Small optimization by no longer re-creating the sliced tensor, as the C++ output is already the required object.

1334b64

Re-implement the batch sampler without using a coro::generator() so it's much faster. Also don't use `coro::as_iterator()` as it dispatches to coro's `<<-`, which is much slower than base R's.

1e26434
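The batch sampler re-implementation can be sketched in base R as a closure that keeps its own position, in place of a generator. This is an illustrative assumption about the approach, not the code merged here: `batch_iterator` and the use of `NULL` to signal exhaustion are hypothetical.

```r
# Hypothetical closure-based batch iterator: the position lives in the
# enclosing environment and is advanced with base R's `<<-`.
batch_iterator <- function(n, batch_size) {
  start <- 1L
  function() {
    if (start > n) return(NULL)  # NULL marks exhaustion in this sketch
    end <- min(start + batch_size - 1L, n)
    batch <- seq.int(start, end)
    start <<- end + 1L
    batch
  }
}

next_batch <- batch_iterator(n = 10L, batch_size = 4L)
first <- next_batch()   # indices of the first batch
second <- next_batch()  # indices of the second batch
third <- next_batch()   # last, shorter batch
done <- next_batch()    # iterator is exhausted
```

Each call is an ordinary function call that returns a whole vector of indices, so no generator machinery is involved.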

Vectorized batch samplers.

43974c2

dfalbel merged commit 48832d3 into main Oct 11, 2022

dfalbel deleted the faster-dataloader branch October 11, 2022 15:55
