Name Lookup for Discards (C# 7.0) #284

Closed
gafter opened this issue Mar 27, 2017 · 4 comments
Labels
resolved: by-design The issue was examined, and we decided it wasn't a problem after all

@gafter
Member

gafter commented Mar 27, 2017

@gafter commented on Tue Nov 01 2016

We are considering a number of small changes to deconstruction, out variables, and pattern-matching to allow the use of wildcards where one could declare a variable, but the user has no intention of using the value captured in that variable. Here we discuss an open issue around name lookup, and propose a resolution.

The summary of this proposal is that the use of a wildcard has no effect on name lookup elsewhere in the code.

Original (LDM) approach

The original approach described by LDM was that the lookup of the identifier _ would treat it as a wildcard if nothing were found:

class Sample
{
    void M()
    {
        M2(out _); // ok wildcard
    }
}

Multiple declarations of the identifier _ in the same scope would cause _ to be considered a wildcard:

class Sample
{
    void M()
    {
        int _ = e1;
        int _ = e2;
        M(out _); // ok, wildcard
    }
}

Existing cases, where the identifier is declared once, would still introduce a variable named _:

class Sample
{
    void M(object o)
    {
        if (o is int _)
        {
            Console.WriteLine(_); // ok, uses declared pattern variable _
        }
    }
}

Criticism of this approach

This approach treats an unbound identifier _ as a wildcard, a single definition of _ as causing _ to bind to that definition, and multiple definitions again treating the use of _ as a wildcard. It requires careful definition of what "multiple definitions" are. Would a declaration of _ in an outer scope, and redeclaration in an inner scope cause a use to be a wildcard?

class Sample
{
    void M(object o)
    {
        if (o is int _)
        {
            M(out int _); // allowed? wildcard?
        }
    }
}

There are corresponding implementation difficulties that may make this approach unattractive.

Proposed alternative

The proposed alternative is that

  1. the declaration of an identifier _ in the following contexts would always be treated as a wildcard, and would have no effect on name lookup elsewhere:
    • an out argument,
    • the left-hand-side of a deconstruction, or
    • a pattern variable
  2. the use of the simple identifier _ as the expression in an out argument, or as a variable in the left-hand-side of a deconstruction, would be bound to a variable of that name. If no variable of that name is found, it is treated as a wildcard. This is similar to the treatment of var as a type.

class Sample
{
    void M(object o)
    {
        if (o is int _) // wildcard by 1.
        {
            M(out int _); // wildcard by 1.
        }
        (int x, int _) = e; // wildcard by 1.

        M(out _); // wildcard by 2.
        (o, _) = e; // wildcard by 2.
        (int x, _) = e; // wildcard by 2.

        Console.WriteLine(_); // error: no _ in scope
    }
}

This is much simpler to implement and I believe much easier to explain.
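
To make the two rules concrete, here is a minimal sketch of how they would play out, assuming a compiler that implements the proposal as written (C# 7.0 or later). The class and method names (DiscardRulesSketch, Produce, and so on) are made up for illustration and are not part of the proposal.

```csharp
public static class DiscardRulesSketch
{
    // Hypothetical helper with an out parameter.
    static void Produce(out int value) { value = 123; }

    // Rule 1: a *declaration* named _ is always a wildcard.
    // Several of them can coexist in one scope and none introduces a name.
    public static bool DeclaredDiscards(object o)
    {
        Produce(out int _);
        Produce(out var _);
        (int x, int _) = (1, 2);
        return o is int _ && x == 1;
    }

    // Rule 2, fallback: a bare _ with no variable named _ in scope
    // is treated as a wildcard.
    public static int BareDiscardWhenNothingInScope()
    {
        int o;
        Produce(out _);
        (o, _) = (7, 8);
        return o;
    }

    // Rule 2, lookup succeeds: a bare _ binds to the existing local,
    // so the call below really writes to it.
    public static int BareUnderscoreBindsToLocal()
    {
        int _ = 0;
        Produce(out _);
        return _;
    }
}
```

The last method is the case the rest of the thread argues about: the same text, out _, is a wildcard or a real assignment depending on what is in scope.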


@gafter commented on Tue Nov 01 2016

/cc @dotnet/ldm @jcouv @AlekseyTs @jaredpar @VSadov


@HaloFour commented on Tue Nov 01 2016

To clarify, if there are no ambiguous overloads of a method declaring out parameters you may use _ as a simple wildcard without the declaration syntax?

//given
void M(out int x) { x = 123; }

// all are equivalent statements?  (assuming no variable _ in scope)
M(out var _);
M(out int _);
M(out _);

If no variable of that name is found, it is treated as a wildcard.

While I do think that this is an improvement on the original proposal I still think that there is too much room for ambiguity and the potential for accidentally overwriting legally declared fields/variables:

void Foo() {
    M(out _); // intended to throw away result
}

private int _; // oops, just added this field and changed the meaning of my program!

I'd argue that for out parameters you could only use a wildcard with the declaration:

void Foo() {
    M(out var _); // no ambiguity, definitely a wildcard
}
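
A runnable sketch of the hazard described above, assuming rule 2 of the proposal is implemented as stated (a bare _ binds to a variable of that name when one is in scope). FieldCapture and its members are hypothetical names used only for this illustration.

```csharp
public class FieldCapture
{
    private int _; // a field that happens to be named _

    static void M(out int x) { x = 123; }

    // Bare _ finds the field, so this is an ordinary assignment to it.
    public int BareOut()
    {
        M(out _);
        return _;
    }

    // The declaration form is always a wildcard; the field is untouched.
    public int DeclaredOut()
    {
        M(out var _);
        return _;
    }
}
```

Under those assumptions BareOut reads back 123 from the field, while DeclaredOut leaves it at its default of 0. Deleting the field would turn the first call into a wildcard without any other change to the method.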

As for deconstruction, I'd argue that the use of _ without a declaration should always be considered a compiler error if there is a field/local with the name _ in scope:

private int _;

void Foo() {
    {
        int o, _;
        (o, _) = e; // compiler error, cannot deconstruct into _
        (o, var _) = e; // no ambiguity, definitely a wildcard
    }
    {
        int o;
        (o, _) = e; // no _ in scope, definitely a wildcard
    }
}

I acknowledge that this is a hard-line approach, but in my opinion it's worth being a little more obnoxiously stricter in order to avoid the potential that any legit variable/field is overwritten. For people who don't use the _ identifier there would effectively be no difference.


@vbcodec commented on Wed Nov 02 2016

@HaloFour
Forcing to deconstructions like (x, var _, var _, var _) and inability to deconstruct into _ is antithesis of improvement.

For private / protected / external _ there may be warning (where using _) for such cases.

@gafter
Preventing var _ in a deconstruction from leaking into the enclosing scope further complicates the already complicated scoping rules described in #12939

Do we need wildcards in typeswitch at all ? imo, it is useless here

Can you explain why characters #, @, $, % can't be used for wildcards ?

While presented alternative approach is simpler, I would stick with previous proposal which is bit more complex, but more aligned with rules for other variables.


@gafter commented on Tue Nov 01 2016

Do we need wildcards in typeswitch at all ? imo, it is useless here

Providing the user a way to avoid declaring an identifier allows us to warn when the declared identifier is unused.

Can you explain why characters #, @, $, % can't be used for wildcards ?

We don't think any of those are an improvement over _, and most other languages with a similar concept appear to agree. A wildcard, representing as it does something that the user intends to ignore, should be as low profile as possible, and nothing is lower profile than the Unicode LOW LINE. Users already declare _ when they intend a variable to be ignored, and this choice recognizes and blesses user preference.


@dsaf commented on Tue Nov 01 2016

Shouldn't that 0.01% of developers who are currently actually using _'s value be somehow punished anyway? I am more worried about general leaking to outer scope #14697.

@vbcodec

Do we need wildcards in typeswitch at all ? imo, it is useless here

Yeah, not sure why we need two ways:

if (o is int _) {}
if (o is int) {}

@gafter commented on Tue Nov 01 2016

Do we need wildcards in typeswitch at all ? imo, it is useless here

Yeah, not sure why we need two ways:

if (o is int _) {}
if (o is int) {}

That isn't a typeswitch.

In the context of an is expression, however, this may disambiguate some situations, thereby enabling the use of tuple types:

if (o is (int x, int y) _) {} // test for the type ValueTuple<int, int>; nothing placed in scope
if (o is (int x, int y)) {}  // test for any tuple whose value contains two integers; places x and y in scope

@dsaf commented on Tue Nov 01 2016

It makes sense in such situations I guess. Although the names are probably not relevant in first case:

if (o is (int, int) _) {} 

or

if (o is (int _, int _) _) {} 

@dsaf commented on Tue Nov 01 2016

Also why not this?

if (o is (int, int)) {} 

@HaloFour commented on Tue Nov 01 2016

@vbcodec

Forcing to deconstructions like (x, var _, var _, var _) and inability to deconstruct into _ is antithesis of improvement.

Hey, that's my word! 😁

The only thing worse than a slightly more verbose improvement is a syntax that can accidentally lead the developer into a subtle runtime bug. Allowing deconstruction into an existing _ is a big potential pit of failure.

Note that this would apply specifically to deconstructing into existing variables which I expect will be a minority case. For people using declaration deconstruction syntax you'd only need var (x, _, _, _). I doubt most people would even notice.

For private _ there may be warning for such cases.

It's my opinion that a warning wouldn't be sufficient. You could add a protected field named _ in some base class and have no idea what code you impact further down the pipeline, including in other projects.

@gafter

We don't think any of those are an improvement over _,

I tend to agree. The closest I might suggest is . but I imagine that comes with its own baggage of parsing issues. I still favor * over all, but I understand the type ambiguity that introduces (and I'm jealous of VB.NET for the first time in a very long time).

@dsaf

Shouldn't that 0.01% of developers who are currently actually using _'s value be somehow punished anyway?

You know that the most active "abusers" would be those Fortune 50s with massive code bases that have every other MS VP on speed dial and would call them up screaming if they even suspected that their builds might break. There are some threads on the coreclr repo sharing some interesting stories of just this sort of thing happening with updates to .NET impacting internals that nobody should have ever relied on anyway, updates that had to be rolled back.

Anyhow, MS made _ a legal identifier, why should people who took advantage of it be penalized?


@dsaf commented on Tue Nov 01 2016

@HaloFour

You know that the most active "abusers" would be those Fortune 50s with massive code bases that have every other MS VP on speed dial and would call them up screaming if they even suspected that their builds might break. There are some threads on the coreclr repo sharing some interesting stories of just this sort of thing happening with updates to .NET impacting internals that nobody should have ever relied on anyway, updates that had to be rolled back.

And they are rushing to use C# vNext? And they will scream you say? Did it help e.g. save Silverlight? Based on what Microsoft is generally doing, enterprise is irrelevant. Unless it uses Node.js with Git and hosts it on Azure.


@vbcodec commented on Tue Nov 01 2016

@gafter

We don't think any of those are an improvement over _

But these characters do not conflict with any name, and there won't be needed any tricks to avoid ambiguities. Doesn't this outweigh 'low profile' of _ ?


@HaloFour commented on Tue Nov 01 2016

@dsaf

And they are rushing to use C# vNext?

This is also why the team has refused to add any new warnings to the compiler for existing code. Can't risk breaking some build that happens to be riddled with awful code.

https:/dotnet/corefx/issues/1420#issuecomment-96430564

We have had to revert refactorings because customer applications were taking a dependency on the method names in various stacks. Ugh.

Who knows, maybe Microsoft could stand to be significantly more abusive towards their customer base. It seems to work for Apple.


@dsaf commented on Wed Nov 02 2016

@HaloFour more abusive than the whole Sinofski fiasco? https://blog.ailon.org/how-one-announcement-destroyed-the-net-ecosystem-on-windows-19fb2ad1aa39#.2b9c211pz Renaming a method is kind of easier than switching a GUI framework.


@CyrusNajmabadi commented on Wed Nov 02 2016

Let's keep the conversation on topic. Thanks :)

Doesn't this outweigh 'low profile' of _ ?

Currently we don't think so. "_" as a wildcard fits into how people are generally using that identifier already today. So we think it's nice to be able to just codify that concept, even if it means we do need to put in the due diligence to make sure we don't break any existing code.


@dsaf commented on Wed Nov 02 2016

@CyrusNajmabadi sorry, I'll stop now :).


@DavidArno commented on Wed Nov 02 2016

Regarding the new proposals, I think @HaloFour has nicely summarised what I also see as a better solution than @gafter is proposing. I too think we need very strict - _ variable in scope or _ is a wildcard - dichotomy solution here to minimise the chance of really hard to find bugs.

There is one use-case that's missing from both the OP and @HaloFour's post. I think it is important to clarify that the following extra wildcard case should also be supported:

_ = e; // wildcard, but not explicitly covered by @gafter's 2nd case
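
A small sketch of what that standalone form looks like in use, assuming it is accepted as a wildcard when no variable named _ is in scope (which, as far as I know, is how C# 7.0 ended up behaving). StandaloneDiscard, Calls and Next are hypothetical names.

```csharp
public static class StandaloneDiscard
{
    public static int Calls;

    // Hypothetical method whose return value we do not care about.
    static int Next() { Calls++; return Calls; }

    public static int Run()
    {
        Calls = 0;
        _ = Next();            // the call still happens; only the result is dropped
        _ = Next().ToString(); // a different type is fine because _ is not a variable here
        return Calls;
    }
}
```

Assigning an int and then a string to the same _ only compiles because no variable is involved, which is one way to tell the wildcard apart from an identifier.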

@bondsbw commented on Wed Nov 02 2016

How about using ~ (tilde)?

if (o is int ~)
{
    M(out int ~);
}
(int x, int ~) = e;

M(out ~);
(o, ~) = e;
(int x, ~) = e;

It is almost as low profile as _, and has an appearance that conveys the ephemeral nature of wildcards.


@DavidArno commented on Wed Nov 02 2016

@bondsbw,

~ is the bitwise complement operator. Just like -, #, * etc, it therefore already has a meaning. _ is already used by many of us as a pseudo wildcard, so already has that meaning in the minds of 99% (pure guess) of those that use it.


@bondsbw commented on Wed Nov 02 2016

What makes ~ better, in my opinion, is that it cannot be used in similar circumstances. Nobody is going to confuse bitwise complement with wildcard.

And to echo @HaloFour's point, it doesn't suddenly change meaning when you are working elsewhere.


@bbarry commented on Wed Nov 02 2016

If you were considering other symbols, I'd suggest @. It is already a valid character at the start of an identifier (though it doesn't affect the name of the identifier). Reading if (o is int @) as if it were valid C# without a wildcard, it appears to be a pattern into a variable with no name.

That said, I like _ as a wildcard even if it is ambiguous with existing code. I'd be happy with the rule:

  1. Any pattern or deconstruction syntax involving a wildcard where there is a variable by the name _ in scope is an error.

@CyrusNajmabadi commented on Wed Nov 02 2016

As mentioned already, we are trying to avoid using other symbols. There's no end to the number of syntactically unambiguous options we have if we go with something else. But that's not the goal here. The goals are:

  1. to use the character today that people already routinely use to mean 'i don't care about this variable'.
  2. to use a variable today that has a familiar meaning for people using other languages that support this concept.

It's similar to the issue we had when we introduced generics into the language. We could have gone with different syntax than <>. It would have helped us avoid ambiguities in the language. However, we felt that there was enough merit in those characters (esp. with how parametric polymorphism occurs in other languages) to use them, even though they were more difficult to fit into our language.

To that end specified above, '_' is an ideal wildcard character. It just needs to be added to the language in a careful manner as we view back compat as a very high bar.


@bondsbw commented on Wed Nov 02 2016

Then I would prefer to change rule 2 so that _ is always treated as a wildcard in those circumstances.

@_ would still be allowed if you want the identifier.


@HaloFour commented on Wed Nov 02 2016

@CyrusNajmabadi

I know that I can be a big pain in the butt but all I'm trying to do is help in finding that point where it's safe to determine that _ would be a wildcard while trying to mitigate scenarios where the compiler/developer might do the wrong thing. I know that my code won't be affected as I've never resorted to using _ as an identifier. But I know people who have, both in scenarios where they wished to ignore the result and in cases where they wanted a simple syntax for selecting a property. I don't think I've seen it used elsewhere, but it wouldn't surprise me if it has.

Which leads me to ask what the expected/wanted behavior would be in the following scenario:

var (x, _) = e;
var query = employees.OrderBy(_ => _.FirstName);

This change might also piss off the fledgeling efforts to port Underscore to .NET. 😁


@gafter commented on Wed Nov 02 2016

@HaloFour

var (x, _) = e;
var query = employees.OrderBy(_ => _.FirstName);

That code is perfectly fine. The first _ is a wildcard, and the second one means the same as it always did.
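
A self-contained version of that snippet, as a sketch: the anonymous-type employee list and the UnderscoreLambda class are stand-ins invented here, since the original e and employees are not defined in the thread.

```csharp
using System.Linq;

public static class UnderscoreLambda
{
    public static string Run()
    {
        // The _ here is a wildcard: the second tuple element is dropped.
        var (x, _) = (1, 2);

        // Hypothetical data standing in for "employees".
        var employees = new[]
        {
            new { FirstName = "Neal" },
            new { FirstName = "Mads" },
        };

        // The _ here is an ordinary lambda parameter, exactly as before.
        var first = employees.OrderBy(_ => _.FirstName).First();
        return x + ":" + first.FirstName;
    }
}
```

The two uses do not interact because the lambda parameter is only in scope inside the lambda body.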


@DavidArno commented on Wed Nov 02 2016

@HaloFour,

var (x, _) = e;
var query = employees.OrderBy(_ => _.FirstName);

Even by the stricter "no wildcard use if an _ var is in scope" rules, that code would be fine as the lambda parameter isn't in scope when the wildcard is used in the deconstruction.

I know that my code won't be affected as I've never resorted to using _ as an identifier. But I know people who have, both in scenarios where they wished to ignore the result and in cases where they wanted a simple syntax for selecting a property

I have used it, quite often as a pseudo-wildcard in lambdas. I don't get using it for "simple syntax for selecting a property" though: x is a far more established convention there. I would far prefer the team just risked breaking my code and made _ a wildcard only (requiring @_ for existing identifiers). The - at most - few hours of rewriting would be well worth avoiding making the language more complex. But that's (one of the many reasons) why @MadsTorgersen is running the language team; not me 😄


@HaloFour commented on Wed Nov 02 2016

@DavidArno

x is a far more established convention there.

Or any number of single-letter identifiers. But it was never wrong to use an underscore.

I would far prefer the team just risked breaking my code and made _ a wildcard only (requiring @_ for existing identifiers). The - at most - few hours of rewriting would be well worth avoiding making the language more complex.

To be honest, so would I. Rip the Band-aid and eliminate all possibility of ambiguity. I'm not even sure that would be possible, though. Code like var _ = 123; and M(out _) would suddenly take on a completely new meaning. Of course if you then never used that variable the difference probably wouldn't really matter.

I'd be all game for changing the rules of legal identifiers to require an _ to be followed by an alphanumeric character.


@eyalsk commented on Thu Nov 03 2016

Maybe we need to add a strict option for C#. haha.. j/k

I like the underscore approach a lot aka time to disallow underscores completely, well at least single ones.

@gafter What happens when you use more than a single underscore character? is it still a wildcard? honest question. 😆

void M()
{
    M2(out __); // Allowed?
}
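
For reference, a sketch of how this behaves if only the single underscore gets special treatment, which is my understanding of what shipped in C# 7.0: __ stays an ordinary identifier and has to be declared before it can be passed. DoubleUnderscore is a hypothetical class name, and M2 is given an assumed signature.

```csharp
public static class DoubleUnderscore
{
    // Assumed signature for M2; the thread never defines it.
    static void M2(out int value) { value = 123; }

    public static int Run()
    {
        int __ = 0;  // a regular local whose name is two underscores
        M2(out __);  // binds to that local; without the declaration this would not compile
        return __;
    }
}
```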

@alrz commented on Thu Nov 03 2016

If we ever want "sequence wildcards" in array/tuple patterns, I think that would be the syntax:

tuple is (1, __) // tuple is (1, _, _, _)
arr is { 1, 2, __ } // matches an array with length > 2 
arr is { 1, 2, ___ } // matches an array with length >= 2
arr is { 1, 2 } // matches array with length = 2

Rel. #5811, #10631


@DavidArno commented on Thu Nov 03 2016

@alrz,

Whilst *** might have been OK for sequence patterns, it really doesn't work with underscores as many fonts join them together, making __ and ___ far too hard to tell apart.


@HaloFour commented on Thu Nov 03 2016

@alrz

@DavidArno just beat me to it. The _ character really does not lend itself to being repeated as visually they would just blend together. Maybe ... could be a "sequence wildcard", although then we'd have two completely different wildcards.


@alrz commented on Thu Nov 03 2016

@HaloFour @DavidArno This is the exact syntax that is used in Mathematica for the same purpose. However, I think a single "zero or more" wildcard is sufficient as you can simulate the other one with the help of a regular wildcard: arr is { 1, 2, ___ } vs arr is { 1, 2, _, ___ }.


@lawrencejohnston commented on Thu Nov 03 2016

One thing that disappoints me about this approach (if I understand it correctly) is that it doesn't allow using wildcards for parameters to lambda expressions. Sometimes callbacks include parameters you don't have a use for and it would be nice to have a way to ignore them.

For example when you only have a self-signed certificate in your development environment and you want to disable checking the certificate in development:

ServicePointManager.ServerCertificateValidationCallback = (_, _, _, _) => true;

Or if you wanted to do partial validation you might only need the certificate and policy errors but not the sender or chain:

ServicePointManager.ServerCertificateValidationCallback = (_, certificate, _, sslPolicyErrors) => {...};

I understand the new approach is more straightforward (and it may be worth going with it for that reason) but this was the use case that immediately came to my mind when wildcards were proposed in the first place so it's too bad it won't be supported.


@DavidArno commented on Thu Nov 03 2016

@lawrencejohnston,

If you take a look at #14794, @gafter does give an example of using _ as a wildcard parameter in a lambda. Whether it will happen is, like everything else in the discussion, not yet decided and subject to change.


@lawrencejohnston commented on Thu Nov 03 2016

@DavidArno

Right, the original proposal included it but my understanding is that this alternate proposal would not support them. By "One thing that disappoints me about this approach" I was referring to the alternate approach proposed in this issue.


@eyalsk commented on Fri Nov 04 2016

@alrz, @HaloFour, @DavidArno Do you guys know whether repeated underscores are going to be considered a wildcard? I mean I'd assume it won't but dunno based on other features like digit separators it seems a reasonable question where it's perfectly valid there.


@DavidArno commented on Fri Nov 04 2016

@eyalsk,

Absolutely no idea and I'm beginning to think that the team don't know either yet. I think we are seeing the start of C# being genuinely "developed in the open" in that these are just ideas @gafter is throwing around in the open to gauge community reaction before taking it to a design meeting for discussion. So anything could happen, I guess (or I'm being naive 😁)


@JesperTreetop commented on Fri Nov 04 2016

For those of us who missed it, what's the reason * can't be used here? * is more universally understood as a catch-all and a wildcard, it won't change the meaning of previously legal programs, and you can't define a _ variable in scope and change the meaning at a distance of the wildcard. (Far more likely than, say, defining a var class...) There's precedent for otherwise valid identifiers being keywords in C#, but I can't think of one that isn't a word or at least alphanumeric.


@HaloFour commented on Fri Nov 04 2016

@JesperTreetop

That int * is a valid type in C#, although pointers can't be used in is expressions nor type-switch expressions, nor could it be used without an identifier in out declarations.


@CyrusNajmabadi commented on Fri Nov 04 2016

@JesperTreetop From above:

The goals are:

  1. to use the character today that people already routinely use to mean 'i don't care about this variable'.
  2. to use a variable today that has a familiar meaning for people using other languages that support this concept.

@JesperTreetop commented on Fri Nov 04 2016

@HaloFour

Right, gotcha, missed this.

@CyrusNajmabadi

Still don't think, no matter what some people use _ for, that the C# team should codify an intent now that we're seven versions in with many millions of lines of source code. Overloading possibly valid identifiers seems dicey because it makes people possibly have to carry two meanings around, the exception being if it had been ignore or skip where you could (to my mind) make a much stronger argument that the intended use overlaps completely with any ad-hoc convention. It would still have the problem where if someone defined something in scope with that name, the behavior would "magically" change. (And as with anything introduced after C# 1.0, you couldn't make those words keywords, only contextual keywords, because you'd break existing programs.) So I'd still prefer something outside of the overlap with possibly valid identifiers.


@CyrusNajmabadi commented on Fri Nov 04 2016

I personally disagree. I very much dislike it when languages go out of their way to avoid these problems, only to end up with an end result that feels inconsistent and unpleasant to use. I'd rather codify patterns and practices and move forward with a healthier feeling set of code.

In practice these issues just have not caused problems for us in the past. I'd prefer for us to not bend over to obsessive paranoia, and just be cognizant of how code is actually written and what the end impact will be.

In this case, the impact seems extremely tiny. If you're using _ today as a wildcard, you'll continue being able to do so. if you're using _ as an actual variable, you'll still be able to do so. And, if you end up adding a wildcard that conflicts with that you'll know immediately because your code won't compile anymore.

I think that's fine. Back compat is ensured. Any mistakes are found immediately. And there is a simple way to express this concept in a manner that fits with our ecosystem and the greater programming community.


@HaloFour commented on Fri Nov 04 2016

@CyrusNajmabadi

Whose goal is that, exactly? Again, seems like a case of the destination being determined long before the journey has been plotted.

And, if you end up adding a wildcard that conflicts with that you'll know immediately because your code won't compile anymore.

This is patently not the case. I've provided numerous examples where even the stricter rules above produce terribly ambiguous situations that can lead to very subtle overwrite bugs. Simply put, said situations are impossible to avoid, not without further serious restrictions to the applicability of wildcards.


@CyrusNajmabadi commented on Fri Nov 04 2016

This is patently not the case. I've provided numerous examples where even the stricter rules above produce terribly ambiguous situations that can lead to very subtle overwrite bugs.

I've found your cases super-non compelling. For example, the case of someone adding a field named _ in their file.

Such a situation is similar to someone adding an alias to 'var' in their file. Sure, it can happen. Sure, it can change meaning. Do i think that's a relevant case to consider? No.

We have to practically look at how our language is actually used out there. And working to avoid pathological conditions is not something we do. It simply over-constrains language development and hurts the end language for most users to avoid problems that do not actually arise in practice.

I personally do not think it's healthy to design the language in such a fashion. And while there is a contingent of people giving feedback that they would prefer that, it's not something that we honestly feel is appropriate to do.

Our experience has borne this out through numerous releases. Many (all?) interesting features have had this sort of potential 'gotcha'. And in reality it's not an issue, and the language has been healthier and more enjoyable to use precisely because we did not try to ensure that such cases could never exist.


@HaloFour commented on Fri Nov 04 2016

@CyrusNajmabadi

I've found your cases super-non compelling.

You could've just said, "_ you." I'll pretend that's a wildcard and not an identifier. Of course with your compiler nobody could ever really know for sure.


@CyrusNajmabadi commented on Fri Nov 04 2016

@HaloFour. Please keep the discussion respectful. I realize that we may make decisions you disagree with. It is not personal. It is because we simply weigh things differently. I've heard your cases that you are concerned with, and i'm simply stating that i do not find myself concerned with them.

This has come up on several threads, but i think it bears repeating: It is a non-goal when we make language changes to make everyone happy. We weigh out a lot of options every time, and it is nearly always the case that for every interesting language area there is no perfect answer that maximizes value for every consideration. We end up going with a result that we feel is the best balance of positives vs negatives. If that results in something you don't like, it's not personal. It's just how things work when you have to design for an extraordinarily large audience with many different desires.


@JesperTreetop commented on Fri Nov 04 2016

Such a situation is similar to someone adding an alias to 'var' in their file. Sure, it can happen. Sure, it can change meaning.

Then, as long as we're talking about a new feature, why not take it the way it is intended - as a knock against that option and a reason to keep exploring other options too? Subtly changing the meaning of people's code in some rare circumstances could of course be the best option, but it's hard to know without anything else to compare it to, and without having had that discussion first. Preempting that discussion makes it harder; doing it in the face of provided examples to the contrary seems downright hostile.

We're just trying to be in the room too and bring the healthy critique and alternative viewpoints that are unavoidable parts of testing a good design - please don't fault us for that. It may be the case that there has been a grand discussion about this, carefully weighing every possible alternative, but it's hard to tell when only the conclusions are shared.


@CyrusNajmabadi commented on Fri Nov 04 2016

No discussion has been preempted.

doing it in the face of provided examples to the contrary seems downright hostile.

Nothing was preempted there. I simply stated that i found the use case non-compelling. That's an honest and true assessment of the example that was provided. If i'm to care about a potential pitfalls i have to at least truly believe that they're really going to happen. :)

but it's hard to tell when only the conclusions are shared.

This is a proposal. :)


@CyrusNajmabadi commented on Fri Nov 04 2016

why not take it the way it is intended - as a knock against that option and a reason to keep exploring other options too?

Because, not all things are equal. Literally any language change we make can have some knock against it. At a certain point though, if the knocks are not important enough, then exploring other options is not appropriate.

Furthermore, as has been mentioned in the thread, the other options also have knocks against them. There are rarely "slam dunk" cases :). Almost all the time, there's just a series of tradeoffs. In this case i've mentioned why i believe those other options are inferior to the option specified. That's not to say that there are not pros/cons to each side. Simply that, under my own weighting of things, one is better than the other.


@vbcodec commented on Fri Nov 04 2016

I agree with @CyrusNajmabadi here. Irrationally shaping a feature just to support careless programming does not make sense. With the current (and previous) rules, existing code compiles flawlessly. There must be a small warning anyway if the modified variable _ is defined outside the function, just to make the dev aware of a potential problem. There are dozens of such warnings already defined, and there aren't any problems with them.


@HaloFour commented on Fri Nov 04 2016

To me ambiguity is the epitome of bad language design. I dislike the new scope rules, but at least the result is unambiguous and cannot lead to subtle bugs.

The example using a field is just a simple one and to demonstrate that said identifier doesn't even need to be declared in the same method (or even the same class), leading to the developer not even knowing that they're accidentally overwriting anything. The same situation could arise within a single method with a legally-declared variable.

This proposal is obviously a step to avoid those circumstances. It's close. It can never be perfect, but it can make a few relatively minor changes which would help to avoid those potential edge case bugs.


@JesperTreetop commented on Fri Nov 04 2016

Nothing was preempted there. I simply stated that i found the use case non-compelling.

I think I see the problem. What I saw above was me posing a question, to which you replied "well the code won't compile anymore, so it's not an issue". @HaloFour then mentioned that yes, it could be an issue, I have provided details about such cases where it could be, and you then replied "but those use cases are not compelling". I think what both me and HaloFour were reacting against was the implication that just because those situations were seen as unlikely, that they were effectively ignored.

I certainly agree that tradeoffs have to be made, and that various solutions will have pros and cons associated with them. I also think that the tradeoffs are best made when all the evidence is collected and weighed together. I read the above as a statement of intent that because this evidence "smells funny", that it's not going to be considered - I now think that's probably not what was intended.

Because, not all things are equal. Literally any language change we make can have some knock against it. At a certain point though, if the knocks are not important enough, then exploring other options is not appropriate.

Absolutely. I just hadn't seen any option being explored at all, and that's likely attributable to personal error on my part. As mentioned above, I don't mean a knock to be a death knell, but something to be noted and tallied for every option so that the best option can be chosen with a good rationale.


@CyrusNajmabadi commented on Fri Nov 04 2016

To me ambiguity is the epitome of bad language design.

That's fair. And there's nothing wrong with thinking that. But it's not how we've generally operated. In practice ambiguity can arise, but we've erred toward making the language much more pleasant to use despite that. Yes, people can write pathological code that is affected by this (and we've had many people on the team over the years who are gleeful to find and discover these issues). But in practice, we've seen it borne out that it's NBD.


@CyrusNajmabadi commented on Fri Nov 04 2016

I think what both me and HaloFour were reacting against was the implication that just because those situations were seen as unlikely, that they were effectively ignored.

I see. My point was that in real code situations these types of issues would be caught. Yes, there are pathological cases that won't be caught. I just don't think they're important.

So, to an extent, these issues are being ignored because they are unlikely. That's how we do language design. The less and less likely something becomes the less and less impact it should have on language design. The alternative is to spend an inordinate amount of effort painstakingly making sure there is 0 ambiguity anywhere, and ending with a language that would be awful to use.

As an example, if we went this route, then we wouldn't have been able to put in generics without tons of extra cruft. We'd likely need a whole new syntax for generics just to avoid the ambiguity problems inherent in the grammar around it.

In practice though... it's totally irrelevant. I think we've heard of one person ever actually hitting the ambiguity in practice. And even then they were very happy we chose to do things the way we did with generics.

Again, language design is a process of measuring pros/cons. If the cons are extremely unlikely/hypothetical, then they're simply going to be outweighed by the pros. Their mere existence is not sufficient to move us away from a proposal. They need to actually be important enough to do so.

Does that help clarify things?


@CyrusNajmabadi commented on Fri Nov 04 2016

The example using a field is just a simple one and to demonstrate that said identifier doesn't even need to be declared in the same method (or even the same class), leading to the developer not even knowing that they're accidentally overwriting anything.

The same is true if someone introduces a class into a namespace that has the name of some type that you're already referencing. Adding a method can mean existing code that calls an extension method changes meaning. Clashing can happen. And it can happen in practice with the normal symbols that people write all the time. I'm extremely unconcerned about clashes with declaration level symbols called "_" because we do not see people using "_" as a declaration level symbol name commonly enough for this to matter in practice.

Again, this is not saying that the situation you mention cannot arise. It is simply stating that it is so unlikely and uncommon that we are not trying to design around that possibility occurring.

If you can present a realistic case where this would happen commonly, then that would warrant possible reweighting.
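As an illustration of the kind of clash mentioned above (adding a method changes which method an existing call binds to), here is a minimal sketch. The names `Widget`, `WidgetExtensions`, and `ClashDemo` are hypothetical and not taken from this thread.

```csharp
using System;

class Widget
{
    // Hypothetical later addition. Uncommenting this instance method would
    // silently rebind the call in ClashDemo.Run, because instance methods
    // take precedence over extension methods during overload resolution.
    // public string Describe() => "instance";
}

static class WidgetExtensions
{
    // Existing extension method that callers currently bind to.
    public static string Describe(this Widget w) => "extension";
}

static class ClashDemo
{
    // With the instance method commented out, this binds to the extension.
    public static string Run() => new Widget().Describe();
}
```

Neither version produces a compiler error or warning at the call site, which is the sense in which such clashes already exist for ordinary symbol names.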


@JesperTreetop commented on Fri Nov 04 2016

As an example, if we went this route, then we wouldn't have been able to put in generics without tons of extra cruft. We'd likely need a whole new syntax for generics just to avoid the ambiguity problems inherent in the grammar around it.
In practice though... it's totally irrelevant. I think we've heard of one person ever actually hitting the ambiguity in practice. And even then they were very happy we chose to do things the way we did with generics.

Oh yes. I am sure this happened. I'm also sure that at some point someone brought up these things and that it was given consideration. All I'm saying is that for a moment, it felt like we were fast-forwarding past that step, or whatever the measured equivalent of that would be for this issue. I'm personally reassured that that wasn't the intention.

Communicating over the internet; boy, it sure is hard sometimes. :P


@CyrusNajmabadi commented on Fri Nov 04 2016

Indeed! :)


@HaloFour commented on Fri Nov 04 2016

Indeed, tradeoffs do need to be made. I'm aware of the breaking changes required to make generics happen. But at least the ambiguous code there would result in compiler errors. That's not the case here, the ambiguous code will happily compile and may not behave as expected. That's why I believe significantly more caution is warranted.

What I'm recommending above is considering most of these ambiguous cases to be compiler errors. If, as you say, these cases are strictly pathological, then only pathological code would be impacted, and the barrier between ambiguous and intentional is strengthened.


@vbcodec commented on Fri Nov 04 2016

@HaloFour

tradeoffs do need to be made

It is already made

But at least the ambiguous

There is no ambiguity. Feature and behaviour is fully determined

there would result in compiler errors

Errors appear only for logically incorrect code, not for lack of awareness of programmer

may not behave as expected

Depends for whom

strictly pathological

Should compiler post error on this ?
if (false) ((int?)null).ToString();

The only ambiguity that exists is your misunderstanding between technical and human aspects.


@HaloFour commented on Fri Nov 04 2016

@vbcodec

I'll happily point you to @CyrusNajmabadi own comment that this is a proposal. If these decisions were set in stone then this proposal wouldn't even exist as it arose from the objections to the previous LDM notes.


@bondsbw commented on Fri Nov 04 2016

Great discussion, but it really feels like the direction is to implement the original proposal despite the issues raised regarding ambiguity and potential to use different characters for the sequence wildcard, and despite suggestions for other characters and for modifying rules.

Is this assessment correct?


@vbcodec commented on Fri Nov 04 2016

@HaloFour

Do not think that it was set in stone, but this is result of using rules of language design, explained by @CyrusNajmabadi. These rules are narrowing space of possible proposals, while you want to drag proposal outside that space, because of extremely rare pathological cases. I think that pathological cases should be wiped out, than supported by destroying new features, In this view this proposal not only provide wildcards, but also force devs to use correct patterns.


@bondsbw commented on Fri Nov 04 2016

If ambiguity problems are so rare as to consider them "pathological", then why worry so much about breaking BC? Just kill the cases that, as you say, people aren't using.


@CyrusNajmabadi commented on Fri Nov 04 2016

Because back-compat is broken only when there is massive good seen from it. We do not see that here. Breaking changes face an enormously high bar to go through.


@eyalsk commented on Fri Nov 04 2016

@CyrusNajmabadi The issue I see is that with C# 7+ we will need to educate people about the quirks of the language where before features were more definite and straightforward. First it was scoping, now it's underscores, and who knows what the future holds, right? I mean these quirks can pile up and grow fairly easily as the language evolves, and more of these mines might be introduced with each version, so inevitably people will have to use analyzers and conventions even more than before just to avoid these mines.

Personally, I love underscores but I hate the fact that I will need to wonder whether it's a wildcard or a variable so I really hope that the IDE would help us distinguish between these cases, I'll probably figure it out based on the context fairly easily but when you're fixing a bug or just reading the code coloring them differently can help.


@CyrusNajmabadi commented on Fri Nov 04 2016

we will need to educate people about the quirks of the language where before features were more definite and straightforward,

That's your perspective :) Personally, i found the language before less straightforward. Others also felt the same way :)


@CyrusNajmabadi commented on Fri Nov 04 2016

so I really hope that the IDE would help us distinguish between these cases.

Sure, we can consider some sort of IDE treatment for this.

Personally, I love underscores but I hate the fact that I will need to wonder whether it's a wildcard or a variable

Is this really a concern in practice? How would this actually affect you. Either you use _ for some actual purpose (like a lambda parameter). In this case, you'll know it's an actual parameter, because if it was a wildcard, referencing it would not work.

Otherwise, you're likely using _ to mean "i don't care" in which case... what does it matter what it is at the end of the day? :)


@CyrusNajmabadi commented on Fri Nov 04 2016

Like, if i see _ used, i'm just going to go: "oh, this is something i don't care about" and move on.


@eyalsk commented on Fri Nov 04 2016

@CyrusNajmabadi

That's your perspective :) Personally, i found the language before less straightforward. Others also felt the same way :)

I guess we have different definitions for straightforward then. For me straightforward would be as simple as explaining multiplication, and it seems like for some people straightforward means explaining division with the "but you can't divide by zero because..." :)

I guess what I'm trying to say is that if there's going to be many "buts" in the language then it wouldn't be as straightforward as the version before it.

I value what you do and I trust your decisions but I also understand what @HaloFour is saying and I think that acknowledging our concerns as a community opens the door for more quality discussions.

Is this really a concern in practice? How would this actually affect you. Either you use _ for some actual purpose (like a lambda parameter). In this case, you'll know it's an actual parameter, because if it was a wildcard, referencing it would not work.

It might when reading the code, it's like having a wall of text without line breaks, sure line breaks aren't that important but they are pretty helpful making the text more readable.

Personally, I don't use single underscores at all as variables even in lambdas but others do so I just don't want to be in a situation where I wonder what it is, especially for complex and large codebases so if the IDE can help improve these situations I see no reason why not to but that's just my opinion. :)


@bondsbw commented on Fri Nov 04 2016

About sequence wildcards, I think I like something similar to @AdamSpeight2008's syntax in #5811 that builds on common regex syntax:

tuple is (1, _*) // tuple is oneple (1), or (1, _), or (1, _, _), or (1, _, _, _), etc.
tuple is (1, _+) // tuple is (1, _, _*)
tuple is (1, _?) // tuple is oneple (1) or (1, _) only

@miloush commented on Fri Nov 04 2016

Not that I believed someone used it, but how is nameof handling this change?

I occasionally (well, once as far as I could find) used two underscores to ignore event handler arguments in a lambda, i.e.

Loaded += (_, __) => { /* not using the arguments */ };

Not sure what I was thinking preferring it over delegate... anyway, if I understand the proposal, this will still be treated as two variables. (i.e. it is not deconstruction, out variable or pattern match).

And it follows that wildcards cannot be used to simplify this, correct?

Loaded += _ => { /* not using the arguments */ };
Func<bool> f1 = _ => true;
Func<int, bool> f2 = _ => true; // the only one that compiles
Func<int, int, bool> f3 = _ => true;
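A minimal sketch of the C# 7.0 behavior described in this comment, assuming nothing beyond the standard `Func` delegates: in a lambda parameter list, `_` and `__` are ordinary parameter names, so both can be read in the body, and a single `_` only matches a one-parameter delegate. The class name `LambdaUnderscoreDemo` is hypothetical.

```csharp
using System;

static class LambdaUnderscoreDemo
{
    public static int Run()
    {
        // _ and __ are two distinct, ordinary parameters here, not discards.
        Func<int, int, int> add = (_, __) => _ + __;

        // A single _ is an ordinary parameter of a one-parameter delegate.
        Func<int, bool> always = _ => true;

        return always(0) ? add(2, 3) : -1;
    }
}
```

Because the names are real parameters, the body is free to use them, which is why the two-underscore convention compiled before discards existed and keeps its meaning under this proposal.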

@bondsbw commented on Fri Nov 04 2016

@miloush I think your last example would be an appropriate place for sequence wildcards (this example using the syntax I mentioned):

Loaded += _* => { /* not using the arguments */ };
Func<bool> f1 = _* => true;
Func<int, bool> f2 = _* => true;
Func<int, int, bool> f3 = _* => true;

@miloush commented on Fri Nov 04 2016

@bondsbw You might want to file a separate proposal for such syntax as it doesn't really depend on how this thread is resolved (and it might get lost).


@eyalsk commented on Sat Nov 05 2016

@miloush Personally, when it comes to lambdas and only lambdas I'd love to ignore the parameters completely or partially, so something like this:

Loaded += () => { /* not using any parameters */ };

Loaded += (s) => { /* using a single parameter */ };

Loaded += (_, e) => { /* using a single parameter */ };

And the compiler will use the current feature of wildcards to emit the following code:

Loaded += (_, _) => { /* not using any parameters */ };

Loaded += (s, _) => { /* using a single parameter */ };

Loaded += (_, e) => { /* using a single parameter */ };

But this probably requires a new proposal. :)


@vbcodec commented on Sat Nov 05 2016

@bondsbw
Don't think that this sequence wildcard _* is good idea, as it may create ambiguity for overloaded methods.

void f1(Action<int, int> x) { }
void f1(Action<int, int, int> x) { }

f1((_*) => { ... }); // which f1 is picked ?

@HaloFour commented on Sat Nov 05 2016

@vbcodec

These rules are narrowing space of possible proposals, while you want to drag proposal outside that space, because of extremely rare pathological cases. I think that pathological cases should be wiped out, than supported by destroying new features, In this view this proposal not only provide wildcards, but also force devs to use correct patterns.

That's also my opinion. But the rules as they are proposed here don't "wipe out" the pathological case. They silently give the pathological cases priority over the "correct patterns".

My suggestions would "wipe out" the pathological cases by turning them into compiler errors when (intentionally or unintentionally) mixed with wildcards. They can be considered individually, but together they reduce the ambiguity between whether the compiler might choose wildcards vs. variables to zero.

  1. Make it illegal to deconstruct into an existing identifier named _.
  2. Eliminate the short form wildcard from out declarations.

If you're not dealing with the pathological case then this will have very little impact on your code. The second point is the only one that would make your code slightly longer when using wildcards.


@HaloFour commented on Sat Nov 05 2016

@vbcodec

Don't think that this sequence wildcard _* is good idea, as it may create ambiguity for overloaded methods.

That would be a compiler error, just as it is today:

void f1(Action<int, int> x) { }
void f1(Action<int, int, int> x) { }

f1(delegate { ... }); // compiler error

To disambiguate you'd be required to specify the same number of wildcards as there are parameters. If there are multiple overloads accepting delegates with the same number of parameters with different types then you disambiguate by specifying the type of the parameter prior to the wildcard.


@HaloFour commented on Sat Nov 05 2016

Opened a separate proposal to consider the use of _ in wildcards for lambdas. I haven't touched on sequence wildcards, only the simple case.

#15027

In short, if you declare a lambda with multiple parameters of the name _ then those parameters are all considered wildcards and the name is not available in the lambda body:

Func<int, int, bool> func1 = (_, _) => true; // legal
Func<int, int, bool> func2 = (_, _) => _ + 4; // CS0103: The name '_' does not exist in the current context

@vbcodec commented on Sat Nov 05 2016

@HaloFour

turning them into compiler errors when (intentionally or unintentionally)

Compiler post error only for technically incorrect statements, which is not case here.
How compiler should know that variable is unintentionally changed ? Compiler do not track how code is written, so it don't know if deconstructions or variable was created as first. That's why I propose only warning here (for deconstruction and calls with out).

Eliminate the short form wildcard from out declarations.

This is asking for breaking compatibility, because "Code like var _ = 123; and M(out _) would suddenly take on a completely new meaning." (citation of your post).

Every dev must learn every new feature, and be responsible while creating code. Team and compiler should not take responsibility for irresponsible devs. There are already features that can introduce 'subtle ambiguities' that are much more serious than changing value of some 'do not care' variable.


@HaloFour commented on Sat Nov 05 2016

@vbcodec

Compiler post error only for technically incorrect statements, which is not case here.

This would make that the case here. The compiler already does this in numerous examples today. For example, requiring a break in a switch to avoid unintentional fall-through.

This is asking for breaking compatibility, because "Code like var _ = 123; and M(out _) would suddenly take on a completely new meaning." (citation of your post).

No it wouldn't. That would mean exactly what it has meant in C# 1.0 through C# 6.0. In fact, it would also mean the same thing that it means based on the proposed changes in this proposal.

Every dev must learn every new feature, and be responsible while creating code. Team and compiler should not take responsibility for irresponsible devs.

It's called the "pit of success". The language does have the responsibility of providing an environment that leads developers towards success. The C# team has invoked it on numerous occasions already.

There are already features that can introduce 'subtle ambiguities' that are much more serious than changing value of some 'do not care' variable.

Name 'em. Specifically those that don't cause compiler errors and introduce subtle value changes.


@HaloFour commented on Sat Nov 05 2016

@vbcodec

Note that I believe that based on the reconsidered rules above that the following would already fail to compile:

void M(out string s) { s = "456"; }

var _ = 123;
M(out _); // CS1503: cannot convert from "out int" to "out string"

@vbcodec commented on Sat Nov 05 2016

@HaloFour

But now this correct code

void M(out string s) { s = "456"; }

var _ = "123";
M(out _); // fine

will fail to compile with your second rule.


@HaloFour commented on Sat Nov 05 2016

@vbcodec

You misunderstand. That code would compile just fine, as it does today. What wouldn't compile is M (out _) when there is no variable named _ in scope, which is also not a change from today. If you wanted to use a wildcard you'd use M (out var _) or M (out string _), which would always be a wildcard and never introduce a new variable into scope.


@vbcodec commented on Sat Nov 05 2016

@HaloFour

So, there is no need for an error. Some devs may protect themselves by placing var before _, while others may use just _. If someone never uses _ for variables, why force him to always write var? It destroys most of this feature.


@alrz commented on Sat Nov 19 2016

What's the reason for var _? Is there any use case where _ is ambiguous and var would help with that?


@eyalsk commented on Sat Nov 19 2016

@alrz Not sure but maybe it's slightly less work? I mean otherwise, this introduces a special case, just a hunch. :)

However, I agree, M(out _) is more attractive.


@jcouv commented on Sat Nov 19 2016

@alrz @eyalsk Yes, var _ is there for consistency with typed discards (int _). It may also be useful to clarify you want a discard when there is already a local named underscore (but do not want to reference that local).
Like you, I expect people will mostly use the short discard (just _ where allowed).
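The distinction drawn here can be sketched as follows, assuming the rules as shipped in C# 7.0: a bare `_` in an `out` argument refers to a local named `_` if one is in scope, while `out var _` and `out string _` are always discards. The helper methods `M` and `N` and the class `DiscardDemo` are hypothetical names used only for illustration.

```csharp
using System;

static class DiscardDemo
{
    static void M(out string s) { s = "456"; }
    static void N(out int i) { i = 7; }

    public static int Run()
    {
        int _ = 123;      // an ordinary local variable named _

        N(out _);         // binds to the local above and overwrites it
        M(out var _);     // always a discard; the local is not touched
        M(out string _);  // typed discard, likewise leaves the local alone

        return _;         // reads the local, which N set to 7
    }
}
```

When no local named `_` is in scope, the short form `M(out _)` is itself a discard, which is the case most code is expected to hit.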


@MattGertz commented on Sat Jan 14 2017

@gafter @jaredpar I'm cleaning up the RTW list starting today -- is this one a done deal?


@gafter commented on Sun Jan 15 2017

@MattGertz The compiler work is all done. The language specification may require spec text to be written, which is why this LanguageDesign issue is still open.

@jcouv jcouv changed the title from "Name Lookup for Wildcards" to "Name Lookup for Discards (C# 7.0)" Feb 2, 2018
@BillWagner
Member

moving to dotnet/csharpstandard

@BillWagner BillWagner transferred this issue from dotnet/csharplang Apr 30, 2021
@BillWagner BillWagner added this to the C# 7.x milestone Apr 30, 2021
@gafter
Member Author

gafter commented May 11, 2022

This is a record of discussions leading to the final form of the C# 7 features. I don't think any independent action on this issue is needed.

@gafter gafter added resolved: by-design The issue was examined, and we decided it wasn't a problem after all meeting: discuss This issue should be discussed at the next TC49-TG2 meeting labels May 11, 2022
@jskeet
Contributor

jskeet commented Mar 1, 2023

@MadsTorgersen: PR #664 introduces discards; do you believe it addresses this issue enough that it would be reasonable to close this now?

@jskeet jskeet removed the meeting: discuss This issue should be discussed at the next TC49-TG2 meeting label Mar 1, 2023
@jskeet
Contributor

jskeet commented Mar 1, 2023

Everyone's happy for us to close this and leave it as resolved.

@jskeet jskeet closed this as completed Mar 1, 2023