Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand motivation section of overview #19

Open
eqrion opened this issue Dec 12, 2022 · 7 comments
Open

Expand motivation section of overview #19

eqrion opened this issue Dec 12, 2022 · 7 comments

Comments

@eqrion
Copy link
Contributor

eqrion commented Dec 12, 2022

Some feedback I've heard about this proposal is that the motivation is unclear for why this has to be an operation in core wasm.

Specifically I've had it asked, "why can't this be done by compilers targeting wasm"? I don't think the motivation section goes into this at all, and I think that would be useful.

The best answer I've been able to give is that this is not possible for return_call_indirect, which is basically an indirect jump and has no close analogue in core wasm. I'm not as knowledgeable though about the general case of a functional language wanting to compile to wasm and the feasibility of performing tail call elimination in a compiler. I think for posterity it would be nice to have something to point to on this.

@rossberg
Copy link
Member

It's also not possible to emulate return_call in core Wasm. Perhaps the person asking was missing that tail calls are more general than just tail recursion?

@eqrion
Copy link
Contributor Author

eqrion commented Dec 12, 2022

It's also not possible to emulate return_call in core Wasm. Perhaps the person asking was missing that tail calls are more general than just tail recursion?

I'll just talk for myself because I'm not super knowledgeable here.

Understood that tail recursion is different than general tail calls, and that tail recursion is more amenable to compiler transformations. Are there no feasible compiler transformations for general direct tail calls? That's probably the detail I am missing.

@fgmccabe
Copy link

Implementing tail calls was issue #1 for the JVM, back before there was issue trackers. Never got fixed though.
As it happens, if you have reliable exception handling, you can fake tail calls: you have to be able to intercept stack overflow as an exception. But much better to have it built in.

@rossberg
Copy link
Member

There are only non-local transformations, and they are either very expensive (trampolining through exception handlers, like @fgmccabe is alluding to, provided you even have them) or only practical in simple cases (inlining the callee) or require whole program rewriting (defunctionalisation) to eliminate the calls, merging everything into very large functions (which may hit implementation limits and is incompatible with separate compilation).

@eqrion
Copy link
Contributor Author

eqrion commented Dec 12, 2022

There are only non-local transformations, and they are either very expensive (trampolining through exception handlers, like @fgmccabe is alluding to, provided you even have them) or only practical in simple cases (inlining the callee) or require whole program rewriting (defunctionalisation) to eliminate the calls, merging everything into very large functions (which may hit implementation limits and is incompatible with separate compilation).

Okay, that is the detail I was missing. I think a paragraph about this in the overview would be helpful, it's definitely one of the FAQs I get about wasm.

@lorenzleutgeb
Copy link

lorenzleutgeb commented Dec 13, 2022

More background for @eqrion (adds nothing new to this issue, but provides a little more context):

In 2018, when I was working on compiling a functional language to wasm, I asked @TerrorJack for advice on tail calls and about what they did in https:/tweag/asterius (Haskell to wasm compiler). His response was:

About tail call optimization:

  • GHC does not use the C stack to pass parameters. It has its own
    notion of "stack". Likewise, asterius uses a region on the linear
    memory as its own "stack".

  • When "calling" a function, we don't simply use the "call"
    instruction and pass parameters as wasm function parameters. Instead,
    we pass parameters either on the stack or by some virtual registers
    (which can simply be implemented as wasm globals), then we find a way
    to jump to the new function without growing wasm call stack.

  • Since we don't use wasm function parameters at all, we can assume
    all wasm type signatures of our functions are i32(). The returned i32
    indicates the call target (all calls become tail calls). Then we can
    use a trampolined mini-interpreter which given the first function to
    enter, repeatedly invoke the function, perform some side-effects,
    retrieve the next call target, then use the call_indirect instruction
    to call into the next target. This is the current approach of
    asterius.

  • You can consider doing a CPS-styled compilation to make everything
    tail-called in order to use the trampolined tail call method.

  • You may go even further by marshalling each function to a wasm
    block, collecting all "functions" into one wasm function, and instead
    of using call_indirect, use the br_table instruction to jump to a
    block. When a block finishes its job, it sets an i32 wasm local which
    is the "call target". It may have higher performance than the approach
    above, but when the resulting wasm function becomes too large you may
    find it slow to validate (be it binaryen validation or V8's own
    validation).

I also contacted @msprotz who responded:

Tail-calls: yes, if you care about mutually recursive functions, then definitely, a relooper-like algorithm will be needed, or perhaps a trampoline? At least that's what js_of_ocaml used, last I checked.

Indeed, I would love for the tail-calls proposal in Wasm to get some traction.

@TyOverby
Copy link

I also contacted @msprotz who responded:

Tail-calls: yes, if you care about mutually recursive functions, then definitely, a relooper-like algorithm will be needed, or perhaps a trampoline? At least that's what js_of_ocaml used, last I checked.

As it stands, Js_of_ocaml is able to translate mutually-recursive functions into loops, but it can not do the same for recursive calls to dynamically-determined functions, like continuation parameters. This means that libraries like angstrom (which use continuations for running parser-combinators) to have separate code-paths for js_of_ocaml that implement a trampoline.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants