Speed up import #1948

Open
1 of 2 tasks
huard opened this issue Oct 8, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@huard
Collaborator

huard commented Oct 8, 2024

Addressing a Problem?

Importing xclim takes 2.5 s on my laptop.

Benchmark using
python -X importtime test.py
where test.py contains only import xclim
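
The raw -X importtime report is long, so a small helper can make it easier to read. The following is a minimal sketch, not part of xclim: parse_importtime and slowest_imports are hypothetical names, and the parser assumes the "import time: self [us] | cumulative | imported package" layout that CPython writes to stderr.

```python
import subprocess
import sys


def parse_importtime(stderr_text):
    """Parse `python -X importtime` output into (module, self_us, cumulative_us) tuples."""
    rows = []
    for line in stderr_text.splitlines():
        if not line.startswith("import time:"):
            continue
        fields = line[len("import time:"):].split("|")
        if len(fields) != 3:
            continue
        try:
            self_us = int(fields[0])
            cumulative_us = int(fields[1])
        except ValueError:
            # Header line ("self [us] | cumulative | imported package"): skip it.
            continue
        # Nested imports are indented in the report; strip the indentation.
        rows.append((fields[2].strip(), self_us, cumulative_us))
    return rows


def slowest_imports(statement, top=10):
    """Run `statement` in a fresh interpreter and return the `top` slowest imports."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(proc.stderr)
    # Sort by cumulative time (microseconds), slowest first.
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]
```

For instance, slowest_imports("import xclim") would list the ten entries with the largest cumulative time. Cumulative times include nested imports, so they overlap and should not be summed.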

Potential Solution

  • Lazy import of virtual modules (cf, anuclim, icclim): 0.2 s
  • Lazy import of xclim.indicators: 0.1 s
  • Lazy import of xclim.indices: 0.7 s
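
One way to defer submodule imports like these is a module-level __getattr__ (PEP 562, Python 3.7+). The sketch below is only an illustration under that assumption: make_lazy_getattr is a hypothetical helper, not something xclim provides, and which submodules are safe to defer would have to be checked.

```python
import importlib


def make_lazy_getattr(package_name, lazy_submodules):
    """Build a module-level __getattr__ that imports submodules on first access."""
    lazy_submodules = frozenset(lazy_submodules)

    def __getattr__(name):
        if name in lazy_submodules:
            # import_module also binds the submodule as an attribute of its
            # parent package, so this hook only runs on the first access.
            return importlib.import_module(f"{package_name}.{name}")
        raise AttributeError(f"module {package_name!r} has no attribute {name!r}")

    return __getattr__
```

In a package's __init__.py this could be wired up as __getattr__ = make_lazy_getattr(__name__, {"indices", "indicators"}). Names that are currently re-exported at the top level through a star import would not be covered by this hook and would need separate handling.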

For reference, here are import times for some of our dependencies. Note that these numbers are only valid in the xclim context; you'd get different results by testing them individually, since they import each other.

  • xarray: 0.4 s
  • pint: 0.4 s
  • cf_xarray.units: 0.3 s
  • numba: 0.2 s
  • scipy.stats: 0.2 s
  • numpy: 0.1 s

Additional context

Code for lazy import (https://docs.python.org/3/library/importlib.html#implementing-lazy-imports)

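import importlib.util
import sys


def lazy_import(name):
    # Locate the module without executing it.
    spec = importlib.util.find_spec(name)
    # Wrap the real loader so execution is deferred until first attribute access.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    # Register the placeholder so later imports reuse the same lazy module.
    sys.modules[name] = module
    # With LazyLoader this only sets up the deferral; the module body runs later.
    loader.exec_module(module)
    return module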

Note that if we lazily import indicators, they won't be in the xclim registry until something triggers the import. The virtual module creation, which relies on the registry, would therefore need to trigger their import.
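
To illustrate the registry concern, here is a rough sketch of a registry that runs deferred loaders before answering any lookup. It is not how the xclim registry is implemented: LazyRegistry, defer and register are hypothetical names, and in practice each loader would be something like a call to importlib.import_module on an indicator module.

```python
class LazyRegistry:
    """Registry whose entries are filled in by loaders that run on first lookup."""

    def __init__(self):
        self._entries = {}
        self._pending_loaders = []

    @property
    def is_loaded(self):
        """True when no deferred loader is waiting to run."""
        return not self._pending_loaders

    def defer(self, loader):
        """Queue a zero-argument callable that registers entries when called."""
        self._pending_loaders.append(loader)

    def register(self, name, obj):
        self._entries[name] = obj

    def load_pending(self):
        """Run every queued loader once, in the order it was deferred."""
        while self._pending_loaders:
            self._pending_loaders.pop(0)()

    def __getitem__(self, name):
        self.load_pending()
        return self._entries[name]

    def __contains__(self, name):
        self.load_pending()
        return name in self._entries
```

Code that builds virtual modules from the registry would then pay the import cost on its first lookup instead of at import time.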

Contribution

  • I would be willing/able to open a Pull Request to contribute this feature.

Code of Conduct

  • I agree to follow this project's Code of Conduct
huard added the enhancement label Oct 8, 2024
@aulemahal
Collaborator

I ran the same tests and piped the output through tuna, as I did in #1135. Here's a snapshot:

[image: tuna snapshot of the import profile]

I fear that most of the time is not lost loading indicators. xclim.indices shows up at the top only because of the order of operations. The longest-loading submodule seems to be in the fire indicators, and that might be numba jitting functions eagerly rather than lazily. Some gain could be made there.
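
If eager jitting does turn out to be the cause, the compile step could be moved from import time to the first call. As far as I understand numba, passing an explicit signature to its jit decorators compiles at decoration time, while omitting the signature defers compilation to the first call. The sketch below is a library-agnostic stand-in for that idea: compile_on_first_call is a hypothetical helper, not a numba or xclim API, and compile_func stands for whatever expensive compile step is in use.

```python
import functools


def compile_on_first_call(compile_func):
    """Decorator factory: postpone `compile_func(func)` until `func` is first called."""

    def decorator(func):
        compiled = None

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            nonlocal compiled
            if compiled is None:
                # The expensive step happens here, not when the module is imported.
                compiled = compile_func(func)
            return compiled(*args, **kwargs)

        return wrapper

    return decorator
```

The trade-off is that the first call becomes slower by roughly the amount saved at import, so this only helps users who never call the affected functions.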

@huard
Collaborator Author

huard commented Oct 9, 2024

Regarding the load time of indices: I commented out from indices import * in the __init__, and also commented out another side import of indices elsewhere in indicators.py. I then computed the difference between the import time in this scenario and in the base scenario.
