Lazy arrays #140

Merged
merged 15 commits into from
Jan 6, 2023

Conversation

kushalkolar
Collaborator

@kushalkolar kushalkolar commented Dec 8, 2022

Implements #139

Dask was not a good option because it is too complex for our use case, so I just created a base array type.

todo:

  • base LazyArray
  • Add min and max properties that return the min and max of the array as if it were fully computed.
  • RCMArray
  • RCBArray
  • ResidualsArray
  • See how fast these are with very large datasets, use a LRU cache if necessary.
    • RCM
      • single-frame indexing is very fast: 300+ Hz with 4 RCM movies of shape 400x250, with 600-1300 components each, indexed simultaneously
      • indexing a slice of 100 frames from all 4 simultaneously runs at 7 Hz, but that is probably more indexing than needed
      • indexing a slice of 50 frames from all 4, a reasonable upper limit on slice size, runs at 15 Hz; timed with:
      %%timeit
      i = np.random.randint(0, rcm.shape[0] - 200)
      rcm[i: i + 50]
      rcm2[i: i + 50]
      rcm3[i: i + 50]
      rcm4[i: i + 50]
    • For mesmerize-viz we can implement a "meta lazy array" which is an object containing attributes lazy_rcm_array and lazy_rcb_array so that rcm and rcb are computed only once. This object would return (RCMArray, RCBArray, and ResidualsArray) for given input indices.
  • tests
  • docstrings for the arrays and the cnmf extensions
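
The "meta lazy array" idea from the list above could look roughly like the following. This is only a sketch: the class name `MetaLazySketch` and the `movie` argument are hypothetical, and the residuals are assumed to be the movie minus RCM minus RCB. Plain NumPy arrays stand in for the lazy arrays, since anything that supports `__getitem__` works here.

```python
import numpy as np


class MetaLazySketch:
    """Hypothetical 'meta lazy array': one index in, three arrays out."""

    def __init__(self, movie, lazy_rcm_array, lazy_rcb_array):
        self.movie = movie  # raw input movie, [n_frames, x, y]
        self.lazy_rcm_array = lazy_rcm_array
        self.lazy_rcb_array = lazy_rcb_array

    def __getitem__(self, key):
        # RCM and RCB are each computed once per index...
        rcm = self.lazy_rcm_array[key]
        rcb = self.lazy_rcb_array[key]
        # ...and reused for the residuals instead of being recomputed.
        residuals = self.movie[key] - rcm - rcb
        return rcm, rcb, residuals


# Stand-in data: any object that supports __getitem__ can be passed in.
movie = np.arange(24, dtype=float).reshape(4, 2, 3)
meta = MetaLazySketch(movie, np.ones((4, 2, 3)), 2 * np.ones((4, 2, 3)))
rcm, rcb, residuals = meta[1:3]
```

A viewer that shows all three movies side by side would then index `meta` once per frame rather than indexing three separate arrays.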

⚠️ This will break the current API for cnmf.get_rcm() etc., but we are still in beta and this is much better than the current implementation.

Implemented for reconstructed movie so far, usage:

>>> rcm = RCMArray(cnmf.estimates.A, cnmf.estimates.C, cnmf.estimates.dims)
>>> rcm.shape
(3000, 170, 170)  # [n_frames, x, y]

# can do basic int indexing, works like any other array
>>> rcm[5].shape
(170, 170)

# all the fancy indexing works
>>> rcm[3:10].shape
(7, 170, 170)

>>> rcm[::2].shape
(1500, 170, 170)

>>> rcm[:1000:10].shape
(100, 170, 170)
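
For readers who want to see why indexing like this is cheap, here is a minimal sketch of the idea, not the actual RCMArray code. It assumes dense arrays with `A` of shape (n_pixels, n_components) and `C` of shape (n_components, n_frames), and it assumes frames are stored in column-major ("F") order; the class name `LazyRCMSketch` is hypothetical. Only the requested columns of `C` are multiplied, so the full movie is never built.

```python
import numpy as np


class LazyRCMSketch:
    """Hypothetical stand-in for RCMArray: frames are computed only when indexed."""

    def __init__(self, A, C, dims):
        self.A = np.asarray(A)   # spatial components, assumed (n_pixels, n_components)
        self.C = np.asarray(C)   # temporal components, assumed (n_components, n_frames)
        self.dims = tuple(dims)  # frame dimensions, e.g. (170, 170)

    @property
    def shape(self):
        return (self.C.shape[1], *self.dims)

    def __getitem__(self, key):
        if isinstance(key, (int, np.integer)):
            # One frame: multiply A by a single column of C.
            frame = self.A @ self.C[:, key]
            return frame.reshape(self.dims, order="F")  # "F" order is an assumption
        if isinstance(key, slice):
            # Several frames: multiply A by only the selected columns of C.
            frames = self.A @ self.C[:, key]  # (n_pixels, n_selected)
            n_selected = frames.shape[1]
            stacked = frames.reshape((*self.dims, n_selected), order="F")
            return stacked.transpose(2, 0, 1)  # -> [n_frames, x, y]
        raise TypeError("only ints and slices are supported in this sketch")


rng = np.random.default_rng(0)
A = rng.random((6, 3))   # 6 pixels (2x3 frame), 3 components
C = rng.random((3, 10))  # 3 components, 10 frames
rcm_sketch = LazyRCMSketch(A, C, dims=(2, 3))
```

With these toy inputs, `rcm_sketch.shape` is `(10, 2, 3)` and `rcm_sketch[3:10].shape` is `(7, 2, 3)`, mirroring the session above at a smaller size.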

Works with upcoming fastplotlib.ImageWidget

So far this works when indexing with single ints and slices, but there are issues when slice_avg is used.

lazy_arrays-2022-12-08_01.15.50.mp4

Need to implement for rcb & residuals too.

@kushalkolar kushalkolar marked this pull request as draft December 8, 2022 07:40
@kushalkolar kushalkolar mentioned this pull request Dec 8, 2022
13 tasks
This was referenced Dec 18, 2022
@kushalkolar
Collaborator Author

ResidualsArray requires subtracting the RCM and RCB from the movie, so it can accept rcm_lazy_array and rcb_lazy_array as optional kwargs. We can then put a simple @lru_cache(maxsize=5) on __getitem__ in RCMArray and RCBArray so that the RCM or RCB is not recomputed when they are used together in a visualization.
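
One detail to watch with the caching idea: `lru_cache` needs hashable arguments, and `slice` objects are only hashable from Python 3.12 onward, so decorating `__getitem__` directly can fail on older versions. A rough sketch of one way around this is shown below, assuming the key is first broken into plain hashable values. The class name `CachedLazySketch` and the `n_computed` counter are hypothetical and exist only for illustration, and frames are left flat to keep the example short.

```python
from functools import lru_cache

import numpy as np


class CachedLazySketch:
    """Hypothetical lazy array that caches its most recent index results."""

    def __init__(self, A, C, maxsize=5):
        self.A = np.asarray(A)  # assumed (n_pixels, n_components)
        self.C = np.asarray(C)  # assumed (n_components, n_frames)
        self.n_computed = 0     # counts real computations, for demonstration only
        # Per-instance cache, so separate arrays do not compete for the same slots.
        self._cached = lru_cache(maxsize=maxsize)(self._compute)

    def _compute(self, kind, start, stop, step):
        self.n_computed += 1
        index = start if kind == "int" else slice(start, stop, step)
        # Flat frames, one row per frame; reshaping to [x, y] is omitted here.
        return (self.A @ self.C[:, index]).T

    def __getitem__(self, key):
        # Convert the key into hashable pieces before it reaches the cache.
        if isinstance(key, slice):
            return self._cached("slice", key.start, key.stop, key.step)
        return self._cached("int", int(key), None, None)


arr = CachedLazySketch(np.ones((4, 2)), np.ones((2, 10)))
first = arr[0:5]
second = arr[0:5]  # served from the cache, nothing is recomputed
```

Cached results are returned by reference, so a caller that modifies a returned array in place would also modify the cached copy; returning a copy or a read-only view would avoid that at some cost.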

@kushalkolar kushalkolar marked this pull request as ready for review January 6, 2023 22:38
@kushalkolar kushalkolar changed the base branch from master to v0.1-dev January 6, 2023 22:39
@kushalkolar kushalkolar merged commit 0a2b7a5 into v0.1-dev Jan 6, 2023
@kushalkolar kushalkolar mentioned this pull request Jan 7, 2023
@kushalkolar kushalkolar deleted the lazy-arrays branch January 8, 2023 08:32