Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ExportImportResolutionBenchmark #10043

Merged
merged 29 commits into from
Jul 15, 2024

Conversation

Akirathan
Copy link
Member

@Akirathan Akirathan commented May 22, 2024

Pull Request Description

Vast majority of CPU time in ExportsResolution is spent in BindingsMap.SymbolRestriction.optimize. #10369 dealt with this. This PR only adds the ExportImportResolutionBenchmark.

Important Notes

  • Introduce new ExportImportResolutionBenchmark that measures the performance of import and export resolution only.
    • Note that the already existing ImportStandardLibrariesBenchmark is probably fine for the purpose, but I just wanted to be sure that ONLY the import/export resolution is measured and nothing else, so I have isolated that into a new benchmark.

@Akirathan Akirathan self-assigned this May 22, 2024
@Akirathan Akirathan added CI: No changelog needed Do not require a changelog entry for this PR. CI: Clean build required CI runners will be cleaned before and after this PR is built. labels May 22, 2024
@Akirathan
Copy link
Member Author

@Akirathan
Copy link
Member Author

My previous finding about hot methods can be proofed by applying

diff --git a/engine/runtime-compiler/src/main/scala/org/enso/compiler/phase/ExportsResolution.scala b/engine/runtime-compiler/src/main/scala/org/enso/compiler/phase/ExportsResolution.scala
index 78344651a..dfe1a6fd9 100644
--- a/engine/runtime-compiler/src/main/scala/org/enso/compiler/phase/ExportsResolution.scala
+++ b/engine/runtime-compiler/src/main/scala/org/enso/compiler/phase/ExportsResolution.scala
@@ -170,7 +170,7 @@ class ExportsResolution(private val context: CompilerContext) {
         case _: BindingsMap.ResolvedType =>
         case ResolvedModule(module) =>
           getBindings(module.unsafeAsModule()).resolvedExports =
-            exports.map(ex => ex.copy(symbols = ex.symbols.optimize))
+            exports.map(ex => ex.copy(symbols = ex.symbols))
       }
     }
   }

After that patch, the benchmark score is 4, before that, it was 188

Copy link
Member

@JaroslavTulach JaroslavTulach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider integrating the refactoring before asking for review of the import/export improvements. Important changes would be hidden among the refactorings....

@JaroslavTulach
Copy link
Member

  •        exports.map(ex => ex.copy(symbols = ex.symbols))
    

After that patch, the benchmark score is 4, before that, it was 188

What exactly gets broken when we don't optimize?


import java.nio.file.{Files, Path}

class ExportResolutionTest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Scala. Eh!

@JaroslavTulach
Copy link
Member

When your are at refactorings, I suggest to move SymbolRestriction and subclasses out of BindingsMap and make it as private as possible. Then, as far as I can see, the only usage is in ExportedModule. When I hide it:

diff --git engine/runtime-compiler/src/main/scala/org/enso/compiler/data/BindingsMap.scala engine/runtime-compiler/src/main/scala/org/enso/compiler/data/BindingsMap.scala
index d9c8d8d654..ab0326c872 100644
--- engine/runtime-compiler/src/main/scala/org/enso/compiler/data/BindingsMap.scala
+++ engine/runtime-compiler/src/main/scala/org/enso/compiler/data/BindingsMap.scala
@@ -624,7 +624,7 @@ object BindingsMap {
   case class ExportedModule(
     target: ImportTarget,
     exportedAs: Option[String],
-    symbols: SymbolRestriction
+    private val symbols: SymbolRestriction
   ) {
 
     /** Convert the internal [[ModuleReference]] to an abstract reference.

three usages remain in ExportsResolution - encapsulating them somehow (to isolate the impact of SymbolRestriction and simplify refactoring) might be good as well.

@JaroslavTulach
Copy link
Member

private val symbols: SymbolRestriction

three usages remain in ExportsResolution

However, from what I see, I don't understand why an ExportedModule couldn't have a java.util.Map<String, String> mapping nameToFqn. Maybe there is some trickery with re-export of all[1], but libraries aren't using that - they re-export name by name. That must mean it is clear what a module is exporting without checking what it is importing. In general the whole resolution shall be linear to number of export/import statements in the system.

[1] Only if we allow a wildcard re-export and cycles among the modules we may get into some NP-complete problems when we would need to analyze the modules non-linear number of times. I believe. If that would be slowing us down, we just ban export all.

@Akirathan
Copy link
Member Author

Akirathan commented May 28, 2024

Consider integrating the refactoring before asking for review of the import/export improvements. Important changes would be hidden among the refactorings....

@JaroslavTulach Good point. Moved into #10112. I will revert the relevant changes in this PR once that PR is merged.

@Akirathan
Copy link
Member Author

  •        exports.map(ex => ex.copy(symbols = ex.symbols))
    

After that patch, the benchmark score is 4, before that, it was 188

What exactly gets broken when we don't optimize?

Let's see in 4512650

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Jun 5, 2024

I see only one failure in ExportCycleDetectionTest.detectSimpleCycle. It is a test added as part of this PR.

Is the test even valid?

Why it should fail? I don't see cycle. A_Type is defined in A_Module and exported from both modules. B_Type is defined in Main and exported from both modules. Where is the cycle?

Why import and export?

The test is using:

from project.Main import B_Type
from project.Main export B_Type

e.g. verbose, repeat yourself syntax that was made obsolete by

We shall be using only from project.Main export B_Type. Or support

from project.Main import B_Type
export B_Type

as suggested and refused. However certainly don't promote the repeat yourself form in new tests.

Local Only Check

In any case, local only test for duplicity shall be enough to report errors (none in this case of detectSimpleCycle test in my opinion), as A_Module exports A_Type and B_Type and Main exports A_Type and B_Type.

@Akirathan
Copy link
Member Author

Is the test even valid?

@JaroslavTulach I understand the controversy of ExportCycleDetectionTest. The initial reason why I added that test was that I wanted to rewrite export cycle detection from scratch and I wanted to have a set of simple unit tests to help me with that. However, now, I want to decrease the scope of this PR as much as possible and I want just to experiment with removing the problematic optimize method call. So I have removed that test in b7fbb85

@JaroslavTulach
Copy link
Member

I've got an answer from @kustosz.

My comment was:

However, from what I see, I don't understand why an ExportedModule couldn't have a java.util.Map<String, String> mapping nameToFqn.

Marcin wrote:

What the code does is it flattens the exported names into something that is a like the map you suggest. The whole thing is a solver for the combined constraints of "re-export all", "re-export selected" and "re-export all except (hiding)". It looks at all imports and within them all re-export chains and decides what is visible based on the combination of all limitations along the way (i.e. "am I importing a name, if the module exports it through a "hiding" statement, by importing it from a module that exports it through a whitelist, blah blah blah. From what you're describing, it sounds like there's some cache missing along the way or something like that.

I suggest to ban "re-export all" and "re-export all except (hiding)" for now. Keep only "re-export explicitly named":

  • change IR to yield an error for re-export all
  • completely rewrite the algorithm (in Java) to just compute list of exports for a module by processing IR of that module

Marcin claims...

I don't think it's NP-complete locally (but it may inside a package manager)
i.e. the compiler-side problem is rather easy to solve and if it's currently slow it's because I didn't care enough to make it fast 😛
(because it didn't seem important back then)

...it should be easy to fix the algorithm, but I feel we don't want a complex algorithm for a feature that we don't use in the standard libraries.

@JaroslavTulach
Copy link
Member

JaroslavTulach commented Jun 6, 2024

I want to decrease the scope of this PR as much as possible and I want just to experiment with removing the problematic optimize method call.

ModuleManagementTest is failing on mac: https:/enso-org/enso/actions/runs/9399918677/job/25888370018#step:7:2738

should allow for module deletion on Linux: https:/enso-org/enso/actions/runs/9399918677/job/25888369800#step:7:2352 - does somebody remove the mmapped file?

ExportedSymbolsTest on Windows: https:/enso-org/enso/actions/runs/9399918677/job/25888370524#step:7:5229 - cannot delete a directory because there is an mmapped file(?).

Would clean build help?

@Akirathan
Copy link
Member Author

Akirathan commented Jun 6, 2024

ModuleManagementTest is failing on mac: https:/enso-org/enso/actions/runs/9399918677/job/25888370018#step:7:2738

Looking at the stack trace:

  INFO ide_ci::program::command: sbtℹ️ [info] - should allow importing literal-source modules *** FAILED *** (1 second, 801 milliseconds)
 INFO ide_ci::program::command: sbtℹ️ Warning: Unable to read from client, please check on client for futher details of the problem.
 INFO ide_ci::program::command: sbtℹ️ [info]   org.graalvm.polyglot.PolyglotException: java.lang.InternalError: a fault occurred in an unsafe memory access operation
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.persist.PerInputImpl.readObject(PerInputImpl.java:70)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.compiler.core.ir.IrPersistance$PersistScalaList.readObject(IrPersistance.java:218)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.compiler.core.ir.IrPersistance$PersistScalaList.readObject(IrPersistance.java:195)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.persist.Persistance.readWith(Persistance.java:154)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.persist.PerInputImpl.readInline(PerInputImpl.java:63)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.interpreter.caches.PersistBindingsMap$SymbolRestriction$Intersect.readObject(PersistBindingsMap$SymbolRestriction$Intersect.java:12)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.interpreter.caches.PersistBindingsMap$SymbolRestriction$Intersect.readObject(PersistBindingsMap$SymbolRestriction$Intersect.java:4)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.persist.Persistance.readWith(Persistance.java:154)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.persist.PerInputImpl.readIndirect(PerInputImpl.java:214)
 INFO ide_ci::program::command: sbtℹ️ [info]   at org.enso.persist.PerInputImpl.readObject(PerInputImpl.java:70)
 INFO ide_ci::program::command: sbtℹ️ [info]   ...

it seems to me that this is related to the removal of the optimize method call - I can see that a BindingsMap.SymbolRestriction.Intersect is being serialized. Removing optimize method call means that there will be many of SymbolRestriction objects that will be serialized.

Would clean build help?

There is "CI: Clean build required" label on my PR since its creation.

GitHub
Hybrid visual and textual functional programming. Contribute to enso-org/enso development by creating an account on GitHub.

@Akirathan
Copy link
Member Author

Now we know that simply removing the SymbolRestriction.optimize method call done in 4512650 is not enough. Let's continue with the reimplementation.

@Akirathan
Copy link
Member Author

Blocked by #10274

…exp-perf

# Conflicts:
#	engine/runtime-integration-tests/src/test/java/org/enso/compiler/ExportCycleDetectionTest.java
#	engine/runtime-integration-tests/src/test/java/org/enso/compiler/ExportedSymbolsTest.java
#	lib/java/test-utils/src/main/java/org/enso/test/utils/ProjectUtils.java
@Akirathan Akirathan changed the title Improve performance of import export resolution Add ExportImportResolutionBenchmark Jul 12, 2024
@Akirathan Akirathan marked this pull request as ready for review July 12, 2024 11:29
@Akirathan Akirathan added the CI: Ready to merge This PR is eligible for automatic merge label Jul 15, 2024
Copy link
Contributor

@hubertp hubertp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benchmark will give inconsistent results because stdlib team continues to add new modules. The list that we import should be fixed.

But I guess this is better than nothing.

@mergify mergify bot merged commit 79a1fdb into develop Jul 15, 2024
42 checks passed
@mergify mergify bot deleted the wip/akirathan/9236-improve-impexp-perf branch July 15, 2024 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI: Clean build required CI runners will be cleaned before and after this PR is built. CI: No changelog needed Do not require a changelog entry for this PR. CI: Ready to merge This PR is eligible for automatic merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants