This repository has been archived by the owner on Apr 13, 2023. It is now read-only.

Invocation syntax for functions accepting single argument lambdas #7190

Open
lucono opened this issue Aug 13, 2017 · 68 comments

Comments

@lucono

lucono commented Aug 13, 2017

A very common scenario in many projects and libraries is one involving functions which accept a single argument, where the single argument is a zero- or single-argument function or lambda. In these cases, it would be useful to have a syntax which allows the body of the lambda directly in the function call's braces without needing to wrap the lambda with additional curly braces, improving readability quite a bit.

The syntax would need to be able to disambiguate this usage from other possible usages of the named argument function invocation syntax.

To invent a fictional implementation as an example, one might be similar to the invoke operator but where the opening curly brace of the named argument invocation syntax is immediately followed by a colon, resulting in a new function invocation operator which opens with {: as an indicator that the invocation braces directly contain the body (i.e. not wrapped in additional curly braces) of the single lambda argument being supplied.

{Integer*} ints = { 1, 2, 3 };

// 1.

ints.each((it) => print(it));      // today
ints.each {: print(it) };          // after - lambda statements directly in {: } invoke operator

// 2.

ints.map((it) {                    // today
    print(it);
    return it * 2;
});

ints.map {:                        // after - single param list omitted
    print(it);
    return it * 2;
};

// 3.

ints.map((it) => it * 2);          // today
ints.map {: => it * 2 };           // after - with fat arrow for explicit return

Basically:

  • For single-argument lambdas, a special keyword (e.g. it) only available inside the lambda could be supported for accessing its single argument, and the need for an explicit parameter list relaxed. (Nested lambdas that would create nested its or shadowed its could be disallowed).
  • When using the new syntax, relax the requirement for a terminating ; at the end of the last statement or expression within the lambda block, as the ambiguity with other possible usages of the named argument syntax would not exist when using the new invoke operator. This will particularly eliminate the noise of the terminating semi-colon ; for lambdas with a single statement or expression.
  • For single expression lambdas that return a value, the fat arrow is used to satisfy explicit return.
  • Finally, for multi-argument lambdas, which are not a primary use-case of this feature, the argument list could be between the opening curly { and the colon :, surrounded by parentheses.
ints.sort((a, b) {          // today
    print("``a`` +");
    print(b);
    return a <=> b;
});

ints.sort {(a, b):          // after
    print(a);
    print(b);
    return a <=> b;
};
@ghost

ghost commented Aug 20, 2017

I really don't like this. It looks nice, but isn't at all regular. I especially dislike the "magical identifier" "it".

Additionally, what would the following print?

void foo(Boolean(Boolean)|Boolean() predicate)
{
	if(is Boolean(Boolean) predicate)
	{
		if(predicate(true))
		{
			print("1");
		}
		else
		{
			print("2");
		}
	}
	else
	{
		if(predicate())
		{
			print("3");
		}
		else
		{
			print("4");
		}
	}
}

shared void run()
{
	Boolean it = false;
	
	foo{: => it};
}

@ghost ghost mentioned this issue Sep 2, 2017
@gavinking gavinking added this to the 1.4.0 beta milestone Sep 2, 2017
@gavinking
Contributor

I'm now in favor of adding a very limited form of this feature to 1.4, but by making it a keyword. Thus, the syntax would be:

ints.each(print(it));

The difference between my proposal, and what is proposed above, is this would be only for single-expression anonymous functions, and not for blocks.

@gavinking
Contributor

Oh, I see, there's an ambiguity here with respect to expressions like foo(bar(baz(it))). Which of foo(), bar(), and baz() does it belong to? Still it should be possible to resolve that ambiguity during typechecking.

And I think requiring:

ints.each(=>print(it));

Would undermine most of what's nice about the feature.

@ghost

ghost commented Sep 2, 2017

I dislike this feature as you proposed, @gavinking, but there is a more disciplined version that I feel fits in more in the language.

Considering #5989, I feel like it could be the keyword for inferring the type of static references. This way, one would write what you wrote in #7215 as the following:

persons
	.filter((person) => 18 <= person.age)
	.sortedBy(it.name)
	.map(it.email)
	.forEach(print);

In addition, if there were an operator for function composition (I'll use "/compose/", for lack of a good symbol), one could write the .filter call as .filter(18.notLargerThan /compose/ it.age). It's interesting to note that such an operator would be useful in its own right, not only when used together with this feature.

The problem would be when calling functions, as in .forEach(it.setAge(20)), which would need to be written as .forEach(shuffle(it.setAge)(20)).

@gavinking
Contributor

@Zambonifofex I think what you're proposing is almost completely useless, since it doesn't reduce the token count from what I can already write today in Ceylon. And I don't see how, if it is so objectionable as a way to infer a parameter, it could possibly be unobjectionable as a way to refer to the type of that parameter.

Nor do I see how it's more "disciplined". It's merely less useful.

@gavinking
Contributor

I have pushed a prototype implementation to the 7190 branch, so folks who are interested can play with it. (There are surely bugs; I've spent about one hour working on this.)

@ghost

ghost commented Sep 2, 2017

@gavinking

it doesn't reduce the token count from what I can already write today in Ceylon

Types can be longer than a single token.

And if you don't like "it" as the name, you could use "type", for example. Either way, I felt the need for a feature like this several times in the past; I just feel like it would be more generally useful than what you're proposing.

But if you really don't like my suggestion, I could get used to your idea if you introduced some indication of which function the "it" actually belongs to. Since you don't like "=>", what about ":"?

persons
    .filter(:it.age >= 18)
    .sortedBy(:it.name)
    .map(:it.email)
    .forEach(:print(it));

And I also don't like the fact "it" isn't a keyword in your implementation. As I've said, I dislike "magical identifiers" like these; there is no indication that they are declared, which might hurt readability.

gavinking added a commit that referenced this issue Sep 3, 2017
@lucaswerkmeister
Contributor

Replying to @gavinking’s comment on another issue:

persons
    .filter(it.age >= 18)
    .sortedBy(it.name)
    .map(it.email)
    .forEach(print(it))

Is far more readable than:

persons
    .filter((p) => p.age >= 18)
    .sortedBy((p) => p.name)
    .map((p) => p.email)
    .forEach(print)

Perhaps, but I think this is the most readable incarnation of the same code:

persons
    .filter((person) => person.age >= 18)
    .sortedBy(Person.name)
    .map(Person.email)
    .forEach(print)

In each line that does something person-related, you’re reminded that you’re operating on persons, not ps or its. This may be slightly more work to type, but code is more often read than written and so on.

Which of foo(), bar(), and baz() does it belong to? Still it should be possible to resolve that ambiguity during typechecking.

Yes, but we all know the typechecker is smarter than the programmer. Don’t force me to repeat the typechecker’s reasoning in my head to make sense of this code.

In the end, I personally don’t believe a further abbreviation syntax for this is necessary. We already have static references for the simplest cases (map(Person.email)), and if your lambda is more complicated, I personally don’t think a short (it) => is too much to ask.

@gavinking
Contributor

@lucaswerkmeister

Perhaps, but I think this is the most readable incarnation of the same code:
persons
    .filter((person) => person.age >= 18)
    .sortedBy(Person.name)
    .map(Person.email)
    .forEach(print)

I agree that Person.name, Person.email is quite good. And it works as long as all I'm doing is using attributes. But the approach quickly breaks down as soon as I need to call a method of Person, and so in practice I've found it sometimes useful, but more often than not I do have to use the longer form with the (p) =>.
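
To illustrate the limitation of static references described above, here is a minimal sketch in current Ceylon syntax. The Person class, its olderThan method, and the adultNames function are hypothetical names invented for this example, not code from the thread. The attribute reference Person.name can be passed directly, while the call to a method that takes an argument still needs the explicit (p) => form.

```ceylon
// Hypothetical class, for illustration only.
class Person(shared String name, shared Integer age) {
    shared Boolean olderThan(Integer years) => age > years;
}

// Hypothetical helper: names of all persons older than 17.
String[] adultNames({Person*} persons)
    => persons
        // A method call with an argument needs an explicit lambda today.
        .filter((p) => p.olderThan(17))
        // A plain attribute can be passed as a static reference.
        .map(Person.name)
        .sequence();
```

A call such as adultNames({ Person("Ann", 30), Person("Bob", 12) }) would keep only the first person's name.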

This may be slightly more work to type, but code is more often read than written and so on.

I'm not motivated by a desire to avoid typing. I'm motivated by how nice I think the code for stream processing looks in some other languages that have this feature.

if your lambda is more complicated, I personally don’t think a short (it) => is too much to ask.

It doesn't have to be very complicated at all: it.longerThan(10), it.sequence(), it==' '.

@gavinking
Contributor

I dislike "magical identifiers" like these; there is no indication that they are declared, which might hurt readability.

Well the IDE would of course give a visual indication that it is special. But sure, on reflection, I agree. For consistency with this, super, outer, which also don't really need to be reserved words, it should probably be a keyword too. (Unless we made none of them reserved words.)

@gavinking
Contributor

gavinking commented Sep 3, 2017

I've pretty much finished the implementation of this. And after trying it out in my example apps, I'm sorry to conclude that I pretty much love it:

    value evenCubes =
            LongStream
                .iterate(1, it+1)
                .parallel()
                .map(it^3)
                .filter(2.divides)
                .mapToObj(it)
                .limit(20)
                .collect(toList<Integer>());
    value map =
            Stream
                .with(*text.split(it==','))
                .parallel()
                .map(it.trimmed)
                .filter(it.shorterThan(10))
                .collect(toMap(String.string, String.size));
    Stream
        .concat(Stream.with(*text.split()),
                Stream.iterate(1, 2.times))
        .parallel()
        .filter(switch (it)
            case (is String) it.longerThan(4)
            case (is Integer) it>4)
        .limit(10)
        .forEachOrdered(print);

Yes, I agree that it sits right on the border of the class of "magical implicit stuff" that makes me rather suspicious, and that I've worked very, very hard to keep out of the language.

However:

  1. It's not implicit. The it keyword is absolutely explicit in its intent.
  2. Nor is it ever really ambiguous as far as I can tell. Sure, it borders on ambiguity with stuff like foo(bar(baz(it))), where the "scope" of it can only be resolved after first assigning a type to foo, bar, and baz. But that's also the case with regular identifier resolution.
  3. Finally, by so severely limiting this feature to (a) single-expression anonymous functions with (b) just one parameter, I've made it pretty abuse-proof.

So, we need to decide whether we should go for it and merge this work. Can we live without it? Definitely. We've been living without it for years now. Does it make the language nicer? It seems to me that it does. I like the look of the code with it. In particular, it means that Ceylon doesn't lack something that people like in other competing languages.

@gavinking gavinking self-assigned this Sep 3, 2017
@ghost

ghost commented Sep 3, 2017

@gavinking

You've never said what you think about using ":" for indicating which function call it belongs to.

@gavinking
Contributor

You've never said what you think about using ":" for indicating which function call it belongs to.

Well, I think it's ugly. The only motivation for adding this feature is that it makes the code look cleaner and more visually pleasing. Plus, the choice of : would be totally ad hoc here. There is no other context in the language in which : means "function".

@gavinking
Contributor

@Zambonifofex To be clear, we could use => instead, which would make much more sense, given that => already means "function" in the language:

    value evenCubes =
            LongStream
                .iterate(1, => it+1)
                .parallel()
                .map(=> it^3)
                .filter(2.divides)
                .mapToObj(=> it)
                .limit(20)
                .collect(toList<Integer>());

But, again, all of the value of this feature is in terms of reducing visual clutter. And => is still clutter.

@jvasileff
Contributor

I was going to agree, until I saw this abomination:

It seems to me that it does.

:)

More seriously, I think this is overall a good feature. There are downsides:

  • I agree with @lucaswerkmeister that the original example isn't convincing
  • I really dislike having a third way to provide Callable arguments, even without considering named arguments (Person.name, (person)=>person.name, and it.name). (The second is ugly, but the first and third may both be preferable depending on context, even in the same file.)

However, it's so concise and convenient, and as often as I have fun((val) => expressionInvolvingVal) in my code, I think it would be worth it. I'd probably say the same even if the only benefit were to save me from having to type (val) => which requires way too much use of the shift key.

@gavinking
Contributor

@jvasileff I think it's actually pretty nice that it.name boils down to the same thing as Person.name in the case where both are allowed. By which I mean, that this proposal is actually a strict superset of what @Zambonifofex proposed above.

@jvasileff
Contributor

My point is, I'd normally prefer:

persons
    .sortedBy(Person.name)
    .map(Person.email)
    .forEach(print)

but then, if we add the filter, for consistency I might prefer:

persons
    .filter(it.age >= 18)
    .sortedBy(it.name)
    .map(it.email)
    .forEach(print)

It's another decision (another stylistic choice). But overall, less important to me than the added convenience and reduced clutter.

gavinking added a commit that referenced this issue Sep 3, 2017
@MikhailMalyutin
Contributor

Hm. So, what is new syntax?

I like syntax like:

persons
    .filter(=> element.age >= 18)
    .sort(=> x.lastName <=> y.lastName)
    .map(=> element.name -> element.email)
    .each(print)

And yes, I like syntax:

get("/", => renderTodos(request));

But I don't understand how it will work. I don't like reserved "it" keywords or something else. But I don't understand how new magic syntax works?
How does the compiler understand what request or element or x is mapped to ( (request) => renderTodos(request) )? In the current syntax I can write get("/", (req) => renderTodos(req));
But I don't understand how the compiler knows that request is a lambda parameter. By finding any undefined token, and if there is only one such token, treating it as the lambda parameter? That looks very unreliable; I think in this case it is very easy to produce bugs. Or lambda parameter name obtained from named arguments of function?

@lucaswerkmeister
Contributor

Or lambda parameter name obtained from named arguments of function?

Yes. For example, Iterable.sort is declared with the parameter Comparison comparing(Element x, Element y), so x and y are the names of the two parameters in the lambda x.lastName <=> y.lastName.
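
As a small sketch of where those names come from: the explicit form below compiles with current Ceylon, and happens to reuse the declared names x and y. The byLength function is a hypothetical helper made up for this example. The abbreviated form appears only in a comment, since it is the syntax under discussion here and whether it compiles depends on the compiler build.

```ceylon
// Hypothetical helper: sorts words by their length.
// Iterable.sort declares its parameter as
//     Comparison comparing(Element x, Element y)
// so x and y below match the declared names, although any names
// would be accepted in the explicit form.
String[] byLength({String*} words)
    => words.sort((x, y) => x.size <=> y.size);

// Abbreviated form as proposed in this thread (assumption: needs a
// compiler that includes this feature):
//     words.sort(=> x.size <=> y.size)
```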

@gavinking
Contributor

gavinking commented Sep 11, 2017

I don't like reserved "it" keywords or something else.

@MikhailMalyutin This proposed feature doesn't rely on the it keyword.

Instead it uses the parameter names explicitly declared in the function you're calling.

@MikhailMalyutin
Contributor

MikhailMalyutin commented Sep 11, 2017

Hm. Very interesting idea. I think new syntax is good. And I see only one problem if function have bad name, I have to use this bad name in my code. But old syntax will works - right?

But I think it is necessary to implement good support of new feature in IDE. Refactoring and etc. If IDE will work good - this will be perfect feature.

@MikhailMalyutin
Contributor

MikhailMalyutin commented Sep 11, 2017

But I think next syntax will be better:

persons
    .filter(element.age >= 18)
    .sort(x.lastName <=> y.lastName)
    .map(element.name -> element.email)
    .each(print)

And yes, I like syntax:

get("/", renderTodos(request));

Or with blocks and return:

persons
    .filter{ return element.age >= 18;}
    .sort{ return x.lastName <=> y.lastName;}
    .map{ return element.name -> element.email;}
    .each(print)

get("/", { return renderTodos(request);});

@gavinking
Contributor

gavinking commented Sep 11, 2017

@MikhailMalyutin

if function have bad name, I have to use this bad name in my code. But old syntax will works - right?

Of course. If you don't like the name of the parameter, give it an explicit name, just like before.

But I think it is necessary to implement good support of new feature in IDE.

I've already added some support, but yes, I also need to add it to the Rename refactoring, that's true.

But I think next syntax will be better:

Well in the discussion above, we decided that implicitly-scoped anonymous functions were problematic because of nesting. They're even more problematic when you don't have a keyword like it to activate the implicit anonymous function.

Or with blocks and return:

Well that syntax won't work because the parser can't disambiguate it from a named argument list.

@gavinking
Contributor

To be clear, you are allowed to write:

get("/", function { return renderTodos(request); });

Where the function keyword acts to disambiguate this from stream construction.

gavinking added a commit to eclipse-archived/ceylon.formatter that referenced this issue Sep 11, 2017
gavinking added a commit to eclipse-archived/ceylon-ide-intellij that referenced this issue Sep 11, 2017
@luolong

luolong commented Sep 12, 2017

The thing that I find disturbing with this proposal is that it will make argument names an API — i.e. a change in nothing else but argument name will become a breaking API change.

We already have this to some extent because of the named arguments syntax, so what's one more feature on top of that? Just wanted to point this out, that's all.
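
For readers less familiar with named argument lists, a minimal sketch of why parameter names already matter to callers. The greet function is a hypothetical example, not part of any library mentioned in this thread.

```ceylon
// Hypothetical function, for illustration only.
String greet(String name, String greeting = "Hello")
    => "``greeting``, ``name``!";

shared void run() {
    // Positional call: unaffected if the parameters are renamed.
    print(greet("Ceylon"));
    // Named argument list: the call site spells out the parameter names,
    // so renaming `name` or `greeting` in the declaration would break it.
    print(greet { name = "Ceylon"; greeting = "Hi"; });
}
```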

@gavinking
Contributor

The thing that I find disturbing with this proposal is that it will make argument names an API

Parameter names are already part of the API in Ceylon. Think of named argument lists.

@gavinking
Contributor

Parameter names are already part of the API in Ceylon.

(Oops, sorry, you already said that.)

@lucaswerkmeister
Contributor

Parameter names are already part of the API in Ceylon.

But not parameter names of function parameters, right? This compiles without error:

void f(void g(Integer i)) {
    g { i = 1; };
}

shared void run() {
    f {
        void g(Integer j) {
            print(j);
        }
    };
}

@gavinking
Contributor

Correct.

@jvasileff
Contributor

jvasileff commented Sep 24, 2017

Another option for it is to fully support nesting of anonymous functions, but disallow use of it except immediately within its associated anonymous function. IOW, it would no longer be in scope within nested declarations.

This breaks with Ceylon's strict lexical scoping rules, but seems quite reasonable since it is implicit.

There are a few significant advantages:

  1. Implicit variables (its) would only be used close to where they are "declared", so there would be no confusion with complex expressions
  2. Refactoring (cut, copy & paste of whole functions) would always work. An anonymous function's containing scope would no longer be relevant with respect to this feature
  3. The rule and the rule's application is very simple

@guai

guai commented Jun 6, 2018

I like 'it' over declared arg names because of:

  1. groovy and kotlin have them. this concept is familiar to programmers
  2. no need to guess where those symbols came from. when reading someone else's code there can be a lot of variables in a scope. with 'it' it's just the same symbol everywhere, with declared names it can take some time to guess where they came from
  3. what about name shadowing? I just want to copy-paste some snippet but with named args instead of 'it' there is a chance that I'll have to rename existing variables first. If a language have 'it' keyword you will not name regular var 'it', but with implicit arg names came from the declaration you don't know what names to avoid
  4. you should not disallow nested usages of 'it' for the same reason. copy-paste some code with 'it' to the place with 'it' already in scope should not require refactoring. it may and should be a warning from IDE, but let it just work. let 'it' be bound to the closest scope. I think Gavin did that already in his first prototype because it is also the easiest solution to implement. No need to be overpatronizing here. groovy and kotlin allow multiple 'it's in the scope. IDE warns and thats it, no problem

@guai

guai commented Jun 7, 2018

another reason not to restrict nested 'it': what if lambda is a single-arg lambda, but I do not use its argument? will it count as 'it' taken and you cannot use it anywhere down the hierarchy?
what if I have 10 levels of nested lambdas one of which is a single-arg lambda, and others are no-arg lambdas, and I need to paste some snippet in there? what clues do I have on which level should I introduce named argument to avoid 'it' collision?

@xkr47
Contributor

xkr47 commented Sep 10, 2018

I feel I have to speak for => here.

  • familiar ceylon style
  • people not familiar with the construct will recognize it as such and look it up in the documentation if they can't guess how it works.

Quoting @guai:

with declared names it can take some time to guess where they came from

I think common sense can be used here. If the function is anything other than a simple construct where it is clear what is going on, explicitly declare the variable names.

there is a chance that I'll have to rename existing variables first

Or just explicitly declare variable names. It's not the end of the world imo.

you should not disallow nested usages of 'it'

Not disallowed with => syntax..

IDE warns and thats it, no problem

I strongly dislike having warnings everywhere in my code. I grow insensitive to them and miss the important ones.

@davidfestal
Contributor

+1 for the => syntax as well.

lucaswerkmeister added a commit to ceylon/ceylon.ast that referenced this issue Sep 17, 2018
Since eclipse-archived/ceylon#7190, an expression like `=>a*b+c` is valid: it is
a function expression with zero parameter lists (see also #142). And
apparently the parser will parse it as such when asked to parse this
code as a (non-lazy) specifier (though I’m not sure why), whereas we
actually want this to be parsed as a lazy specifier in this context. We
can avoid this by attempting the lazy specifier parse first and falling
back to the non-lazy case second. (Even better would be to only call the
parser once, but there’s no direct rule for “any specifier”, and I’m not
convinced that the alternative – turning the code into a complete
example for some other rule, e. g. typedMethodOrAttributeDeclaration –
would be better.)
Voiteh pushed a commit to Voiteh/ceylon that referenced this issue Mar 23, 2020
includes an improved algorithm for assigning names to parameters of SAMs
Voiteh pushed a commit to Voiteh/ceylon that referenced this issue Mar 23, 2020