Proposal: optimize LINQ code #7580

Closed
jods4 opened this issue Dec 17, 2015 · 5 comments
Labels
Area-Compilers, Resolution-Duplicate (the described behavior is tracked in another issue)

Comments

@jods4

jods4 commented Dec 17, 2015

LINQ (the Enumerable flavor) is a huge productivity boost.
Unfortunately, it comes with an additional runtime cost: more delegate calls, often lambda captures, more allocations...
This means that on hot paths, LINQ may not be an appropriate choice. Actually, I've read that Roslyn avoids LINQ on hot paths for this very reason.

In many (most?) cases, the compiler could turn the LINQ code into optimized for (foreach) loops. The proof that this is possible is that LinqOptimizer from Nessos does just that, at runtime.

I suggest that Roslyn perform the transformation done by LinqOptimizer at compile time when it can (i.e. no calls to unknown methods, no unsupported constructs). If it can't, it falls back to the library-based approach used today.
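
To make the kind of rewrite concrete, here is a minimal sketch of a LINQ query next to a hand-fused loop that computes the same value. It is only an illustration: the class and method names are made up for this example, and it does not claim to show what LinqOptimizer or any compiler pass actually emits.

```csharp
using System;
using System.Linq;

public static class LinqVersusLoop
{
    // LINQ version: each element flows through iterator objects
    // and a delegate invocation per lambda.
    public static int SumOfEvenSquaresLinq(int[] values)
    {
        return values.Where(x => x % 2 == 0).Select(x => x * x).Sum();
    }

    // Hand-fused version (hypothetical target of the transformation):
    // one loop, no delegates, no iterator allocations.
    public static int SumOfEvenSquaresLoop(int[] values)
    {
        int sum = 0;
        foreach (int x in values)
        {
            if (x % 2 == 0)
            {
                // The lambda's multiplication is unchecked by default,
                // while Enumerable.Sum adds in a checked context.
                // Mirroring both keeps the overflow behavior the same.
                int square = x * x;
                sum = checked(sum + square);
            }
        }
        return sum;
    }

    public static void Main()
    {
        int[] values = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine(SumOfEvenSquaresLinq(values)); // prints 56
        Console.WriteLine(SumOfEvenSquaresLoop(values)); // prints 56
    }
}
```

Even in this small case the rewrite has to preserve details such as overflow checking and evaluation order, which is why a bail-out path for unsupported constructs matters.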

Benefits:

  • everyone gets a speed boost (15x on some queries) and reduced memory usage for free.
  • even people who don't know about LinqOptimizer (i.e. almost everyone).
  • this makes LINQ usable again in more situations.
  • transformation happens at compile-time. Today this can be done with LinqOptimizer, but at the cost of creating Expression trees and compiling them at runtime. :(

Main drawback is certainly that this is a large and complex code transformation to include and support in the compiler. I think that it fits well with the current theme of creating a leaner, more performant language with reduced memory allocations; and I hope you'd think it's worthy of including in the compiler proper.

In the case you don't want to include this in the compiler itself, would there be any way to make the compiler extensible so that such transformation passes could easily be included by projects that want them?
Just as we can use custom Roslyn diagnostics by including NuGet packages, it would be nice to be able to use an AST transform in the same way. Download a NuGet package and have all your LINQ code optimized at compile time. This approach would bring almost all the benefits: the LINQ code is optimized, but you don't have to support it in the compiler itself. Devs would need to discover it, though.

@svick
Contributor

svick commented Dec 18, 2015

There is a relevant discussion in #275. The short version is: the compiler shouldn't depend on the exact implementation. But this optimization can be done by the CLR, which does have access to the version of the library you're using.

So I suggest you propose this over there.

@jods4
Author

jods4 commented Dec 18, 2015

Yes, I did a quick search and I saw this ticket. I felt it was more focused on lambdas specifically and broader in scope, although the discussion did drift toward LINQ and mentioned LinqOptimizer as well.

It ends with the question: why not do this by JIT at runtime?
It's probably possible and I would certainly be very happy with that as well (existing programs get a speed boost!).

The JIT tries to be fast though, sometimes to the detriment of code quality. Given the complexity of the transformation, I am unsure whether the JIT is a good place to do that. The new AOT .NET compilers might be, though.

I am no expert, but I think the JIT will have a much harder time reinterpreting the IL code than the compiler has reinterpreting the syntax tree. IL is a lot lower level and many things have to be figured out (what a display class captures, whether it's used in several places or just one, etc.)
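
As a rough illustration of what would have to be recovered from IL, here is a sketch of a capturing lambda next to a hand-written approximation of its lowered form. The names `ClosureLowering`, `DisplayClass` and `Lambda` are hypothetical stand-ins; the real compiler-generated names differ and the exact shape of the generated code is an implementation detail.

```csharp
using System;

public static class ClosureLowering
{
    // What the author writes: the lambda captures the parameter 'threshold'.
    public static Func<int, bool> MakeFilter(int threshold)
    {
        return x => x > threshold;
    }

    // Approximation of the compiler-generated closure ("display") class:
    // the captured variable becomes a field, the lambda body a method.
    private sealed class DisplayClass
    {
        public int threshold;

        public bool Lambda(int x)
        {
            return x > threshold;
        }
    }

    // Approximation of the lowered method: allocate the closure object,
    // copy the captured value in, and build a delegate over its method.
    public static Func<int, bool> MakeFilterLowered(int threshold)
    {
        var closure = new DisplayClass();
        closure.threshold = threshold;
        return new Func<int, bool>(closure.Lambda);
    }

    public static void Main()
    {
        Console.WriteLine(MakeFilter(3)(5));        // prints True
        Console.WriteLine(MakeFilterLowered(3)(2)); // prints False
    }
}
```

At the syntax-tree level the capture is visible directly in the lambda; at the IL level an optimizer would only see the class, the field writes and the delegate construction, and would have to reconstruct the relationship between them.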

I'm going to open a ticket in the CoreCLR repo and we'll see what they think.

@gafter closed this as completed Dec 20, 2015
@gafter added the Resolution-Duplicate and Area-Compilers labels Dec 20, 2015
@jods4
Author

jods4 commented Dec 20, 2015

I hope you might still give some consideration to an extensibility point in the compiler that would allow that (i.e. transforming the Roslyn AST before emit). It would allow this (optimized LINQ) but also many other interesting scenarios: AOP, providing default implementations (e.g. auto-props with INotifyPropertyChanged), injecting additional code (e.g. a repository of all implementations of an interface in the assembly, but cached at compile time), etc.

I know there are several proposals for various ways to add AOP to C#, please keep this kind of scenario in mind!

@gafter
Member

gafter commented Dec 22, 2015

We are unlikely to add compiler features whose purpose would be to make the compiler violate the language spec by changing the semantics of code. However, we are working on code generation compiler plug-in facilities that will address these scenarios.

@jods4
Author

jods4 commented Dec 22, 2015

whose purpose ... "which could also be used to..."
But fair point, something too extensible will certainly be misused by some.

I like your idea of manipulating the syntax tree (C#). It prevents abusing the language itself, while still allowing that kind of optimization.

Note that most proposals I've seen around AOP work at the method level. A trick like LINQ optimization would require working at the expression level.

3 participants