Azure import try/except over doubles pylint memory usage #9442

Closed
Jamie- opened this issue Feb 16, 2024 · 5 comments · Fixed by #9504 or #9619
Labels
Enhancement ✨ Improvement to a component Good first issue Friendly and approachable by new contributors Needs astroid update Needs an astroid update (probably a release too) before being mergable Needs PR This issue is accepted, sufficiently specified and now needs an implementation performance

@Jamie-
Contributor

Jamie- commented Feb 16, 2024

Bug description

We run pylint on a reasonably large codebase, ~900 .py files, ~350k lines.
Recently we've needed to support two similar versions of the Azure SDK and as such made the change below in a single file.

from

from azure.mgmt.network.v2022_01_01 import NetworkManagementClient
from azure.mgmt.network.v2022_01_01.models import Resource as NetworkResource

to

try:
    from azure.mgmt.network.v2022_01_01 import NetworkManagementClient
    from azure.mgmt.network.v2022_01_01.models import Resource as NetworkResource
except ImportError:
    from azure.mgmt.network import NetworkManagementClient
    from azure.mgmt.network.v2022_07_01.models import Resource as NetworkResource

This change has increased pylint's run time but, more importantly, more than doubled its resident memory usage! I've tried various combinations of disables and module ignores, including ignoring the azure module altogether with --ignored-modules=azure, but cannot get anywhere close to the previous figures without reverting the code change.

Numbers below are taken from /usr/bin/time -v python3 -m pylint package1 package2 -f colorized -r n -j 1

Old code:

Elapsed (wall clock) time (h:mm:ss or m:ss): 5:13.65
Maximum resident set size (kbytes): 1589112

New code:

Elapsed (wall clock) time (h:mm:ss or m:ss): 6:30.88
Maximum resident set size (kbytes): 3993300

New code with --ignored-modules=azure:

Elapsed (wall clock) time (h:mm:ss or m:ss): 6:42.64
Maximum resident set size (kbytes): 3994000
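
For anyone wanting to compare runs like these without reading the numbers by eye, here is a minimal sketch that parses the two relevant lines of a `/usr/bin/time -v` report and computes the growth factors. The helper name `parse_time_report` is hypothetical and is not part of pylint or any other tool; the figures are the "Old code" and "New code" measurements above.

```python
import re


def parse_time_report(report: str) -> dict:
    """Pull wall-clock seconds and peak RSS (kbytes) out of `/usr/bin/time -v` output."""
    wall = re.search(
        r"Elapsed \(wall clock\) time \(h:mm:ss or m:ss\): ([\d:.]+)", report
    )
    rss = re.search(r"Maximum resident set size \(kbytes\): (\d+)", report)
    if wall is None or rss is None:
        raise ValueError("not a recognisable /usr/bin/time -v report")
    # Fold "h:mm:ss" or "m:ss.xx" into seconds, most significant field first.
    seconds = 0.0
    for part in wall.group(1).split(":"):
        seconds = seconds * 60 + float(part)
    return {"wall_seconds": seconds, "max_rss_kib": int(rss.group(1))}


OLD = """\
Elapsed (wall clock) time (h:mm:ss or m:ss): 5:13.65
Maximum resident set size (kbytes): 1589112
"""
NEW = """\
Elapsed (wall clock) time (h:mm:ss or m:ss): 6:30.88
Maximum resident set size (kbytes): 3993300
"""

old = parse_time_report(OLD)
new = parse_time_report(NEW)
rss_ratio = new["max_rss_kib"] / old["max_rss_kib"]
time_ratio = new["wall_seconds"] / old["wall_seconds"]
print(f"peak RSS grew {rss_ratio:.2f}x, wall time grew {time_ratio:.2f}x")
```

With the figures above this reports a peak RSS growth of about 2.51x and a wall-time growth of about 1.25x.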

To make this more generic, I'm running without a pylintrc file, using the latest version from PyPI. This has the side effect of producing masses of warning/error output, as we usually have a fair number of disables. However, the issue described is still present in this state.

Configuration

No response

Command used

/usr/bin/time -v python3 -m pylint package1 package2 -f colorized -r n -j 1
and
/usr/bin/time -v python3 -m pylint --ignored-modules=azure package1 package2 -f colorized -r n -j 1

Pylint output

n/a

Expected behavior

Original run times and memory usage

Pylint version

pylint 3.0.3
astroid 3.0.3
Python 3.9.18 (main, Aug 25 2023, 13:20:14) 
[GCC 11.4.0]

OS / Environment

Ubuntu 22.04

Additional dependencies

# non-Azure modules removed
azure-common==1.1.28
azure-core==1.26.4
azure-identity==1.12.0
azure-keyvault-certificates==4.7.0
azure-keyvault-secrets==4.7.0
azure-mgmt-automation==1.0.0
azure-mgmt-compute==29.1.0
azure-mgmt-core==1.4.0
azure-mgmt-keyvault==10.2.1
azure-mgmt-monitor==6.0.0
azure-mgmt-network==21.0.1
azure-mgmt-resource==23.0.0
azure-mgmt-storage==21.0.0
azure-storage-blob==12.16.0
msrestazure==0.6.4
@Jamie- Jamie- added the Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling label Feb 16, 2024
@jacobtylerwalls jacobtylerwalls added performance Needs astroid update Needs an astroid update (probably a release too) before being mergable Needs PR This issue is accepted, sufficiently specified and now needs an implementation Good first issue Friendly and approachable by new contributors and removed Needs triage 📥 Just created, needs acknowledgment, triage, and proper labelling labels Feb 24, 2024
@jacobtylerwalls
Member

jacobtylerwalls commented Feb 24, 2024

Thanks for the report! Sounds reasonable to expect ignored-modules to be more useful than it is.

I'm showing this is solved with two tiny patches to pylint and astroid. If you're up for it, would you be willing to confirm and open the PRs? We'd need just a little test and some documentation spruce-ups for the ignored-modules option.

diff --git a/pylint/lint/pylinter.py b/pylint/lint/pylinter.py
index 30250154e..473cd0900 100644
--- a/pylint/lint/pylinter.py
+++ b/pylint/lint/pylinter.py
@@ -1073,6 +1073,7 @@ class PyLinter(
         MANAGER.always_load_extensions = self.config.unsafe_load_any_extension
         MANAGER.max_inferable_values = self.config.limit_inference_results
         MANAGER.extension_package_whitelist.update(self.config.extension_pkg_allow_list)
+        MANAGER.module_denylist.extend(self.config.ignored_modules)
         if self.config.extension_pkg_whitelist:
             MANAGER.extension_package_whitelist.update(
                 self.config.extension_pkg_whitelist
diff --git a/astroid/manager.py b/astroid/manager.py
index c499fe55..386a3838 100644
--- a/astroid/manager.py
+++ b/astroid/manager.py
@@ -59,6 +59,7 @@ class AstroidManager:
         "optimize_ast": False,
         "max_inferable_values": 100,
         "extension_package_whitelist": set(),
+        "module_denylist": [],
         "_transform": TransformVisitor(),
     }
 
@@ -70,6 +71,7 @@ class AstroidManager:
         self.extension_package_whitelist = AstroidManager.brain[
             "extension_package_whitelist"
         ]
+        self.module_denylist = AstroidManager.brain["module_denylist"]
         self._transform = AstroidManager.brain["_transform"]
 
     @property
@@ -200,6 +203,8 @@ class AstroidManager:
         # importing a module with the same name as the file that is importing
         # we want to fallback on the import system to make sure we get the correct
         # module.
+        if modname in self.module_denylist:
+            raise AstroidImportError("Skipping")
         if modname in self.astroid_cache and use_cache:
             return self.astroid_cache[modname]
         if modname == "__main__":
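
To illustrate the idea behind the astroid hunk in isolation, here is a minimal, self-contained sketch. `ToyManager` and `ModuleSkipped` are hypothetical stand-ins invented for this example; they are not astroid's `AstroidManager` or `AstroidImportError`, and the "tree" is just a string. The point is only the ordering: the denylist check runs before the cache lookup and before any expensive load.

```python
class ModuleSkipped(ImportError):
    """Hypothetical stand-in for the import error raised in the patch."""


class ToyManager:
    """Toy model of a module loader with a denylist consulted before the cache."""

    def __init__(self, module_denylist=()):
        self.module_denylist = list(module_denylist)
        self.cache = {}
        self.loads = []  # records every module that was actually "built"

    def ast_from_module_name(self, modname):
        # Denylisted modules are rejected up front, so nothing is built or cached.
        if modname in self.module_denylist:
            raise ModuleSkipped(f"Skipping {modname}")
        if modname in self.cache:
            return self.cache[modname]
        # Stand-in for the expensive part: locating and parsing the module.
        self.loads.append(modname)
        tree = f"<tree for {modname}>"
        self.cache[modname] = tree
        return tree


manager = ToyManager(module_denylist=["pkg.heavy"])
manager.ast_from_module_name("pkg.light")
try:
    manager.ast_from_module_name("pkg.heavy")
    skipped = False
except ModuleSkipped:
    skipped = True
```

After this runs, `manager.loads` contains only `pkg.light`, and nothing was cached for `pkg.heavy`.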

@jacobtylerwalls jacobtylerwalls added this to the 3.2.0 milestone Feb 24, 2024
@jacobtylerwalls jacobtylerwalls added the Enhancement ✨ Improvement to a component label Feb 24, 2024
@Jamie-
Contributor Author

Jamie- commented Mar 7, 2024

Aha ok. I incorrectly assumed ignored-modules did that already.
I checked out the pylint and astroid repos, made those changes locally and ran some benchmarks on my codebase (this time using all cores to get answers faster).

From what I'm seeing, those changes have no effect on speed or memory usage when adding --ignored-modules=azure. I'm not quite sure why: having added some debug output around the modified code, I can see we are hitting the raise AstroidImportError("Skipping") line for the azure module.

Any pointers?

@jacobtylerwalls
Member

Make sure you're using the full module name, e.g. azure.mgmt.network.v2022_07_01.models
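
The reason the full name matters is that the check in the patch, `modname in self.module_denylist`, is an exact membership test, so an entry of `azure` does not cover `azure.mgmt.network.v2022_07_01.models`. A minimal sketch of the difference follows; both helper functions are hypothetical, and the prefix-matching variant is shown only for contrast, not as something the patch does.

```python
def is_denied_exact(modname, denylist):
    """Mirror of the patch's check: the name must appear verbatim."""
    return modname in denylist


def is_denied_prefix(modname, denylist):
    """Hypothetical alternative: an entry also covers all of its submodules."""
    return any(
        modname == entry or modname.startswith(entry + ".") for entry in denylist
    )


submodule = "azure.mgmt.network.v2022_07_01.models"

top_level_only = is_denied_exact(submodule, ["azure"])         # False
full_name = is_denied_exact(submodule, [submodule])            # True
prefix_variant = is_denied_prefix(submodule, ["azure"])        # True
unrelated = is_denied_prefix("azurelike.module", ["azure"])    # False
```

The last case shows why a prefix check would need the trailing dot: without it, `azure` would also swallow unrelated packages whose names merely begin with the same letters.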

@Jamie-
Contributor Author

Jamie- commented Mar 7, 2024

Yep, that helps! Seeing a considerable improvement now. Thanks for that!
Will look to get the changes PRed up.

@jacobtylerwalls
Member

Wonderful. Feel free to make improvements (e.g. for consistency, that collection should probably be a set...)
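
As a small illustration of that suggestion, here is a sketch of what changes if the collection is a set rather than a list. The variable names are illustrative only; this is not a statement about what the merged PRs ended up doing.

```python
ignored_modules = [
    "azure.mgmt.network.v2022_07_01.models",
    "azure.mgmt.network.v2022_07_01.models",  # same option supplied twice
]

# List version, as in the patch above: duplicates accumulate via extend().
denylist_as_list = []
denylist_as_list.extend(ignored_modules)

# Set version: update() replaces extend(), and duplicates collapse.
denylist_as_set = set()
denylist_as_set.update(ignored_modules)

# The membership test itself reads the same either way.
hit = "azure.mgmt.network.v2022_07_01.models" in denylist_as_set
```

A set also matches how `extension_package_whitelist` is stored in the same `brain` dictionary, and its membership test takes roughly constant time instead of scanning the whole list on every module lookup.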
