-
My aim with this discussion is to explore how pattern matching could be supported over any Taking the second concern first, typical applications of pattern matching over a collection involve some sort of matching against the head and then recursively re-applying the patterns to the tail. Leaving this as a recusive implementation leads too easily to a stack overflow. So tail-call optimisation is used to turn it into an iterative implementation. Such an implementation naturally lends itself to using an enumerator. So my gut feeling here is that trying to avoid Regarding the problem of re-enumerating an The advantages I see to the linked list approach are:
However, the linked list approach has an obvious disadvantage over copying to an array, namely that elements cannot be accessed via an index and must be found be starting at the beginning of the list each time. This would likely add huge overheads to all but the most trivial of patterns. For the rest of this discussion, I'll ignore which method is used as the correct way to determine the best solution would be through performance testing, rather than me guessing. Basic pattern matchingTaking ideas from Roslyn #10631 (ie, @gafter's link above), the simplest form of pattern matching would match the whole collection. I've changed the syntax here as @alrz points out that the proposed property pattern syntax would likely clash with the ideas of using if (collection is [1, 2, 3])
// true if collection contains exactly the three elements, [1,2,3]
if (collection is [])
// empty collection matches
if (collection is [1, _, 3])
// matches if collection contains exactly the three elements and
// element 0 is 1 and element 2 is 3. Element 1 can be any value
if (collection is [1, var x, 3])
// matches as above, but element 2's value is assigned to x Element wildcard matchingHaving to match the whole collection is restrictive though. Again borrowing the syntax from the above linked issue, we could use if (collection is [.., 99, 100])
// matches on the last two elements only
if (collection is [1, 2, ..])
// matches on the first two elements
if (collection is [1, .., 100])
// matches the first and last elements Recursive matchingAs I said at the beginning, such matches are rather limited when dealing with collections. What's commonly required is a means of recursing over each element in turn. A possible example of this in action is shown below: int Sum(IEnumerable<int> collection) =>
collection switch (
case [] : 0, // empty collection handler
case [var last] : last,
case var head :: tail : Sum(tail) + head
); Also as previously mentioned, whilst it's useful to express the pattern recursively, tail call optimisation is normally used in such situations to avoid stack overflows. There is a problem with the above syntax: the use of To avoid this, I considered the idea of the framework providing a deconstruct for The solution that I think avoids both of these issues and avoids yet more new syntax is to use the previously discussed if (collection is [1, 2, .. var theRest])
// for array [1,2 3, 4], theRest will be [3, 4] here So to split the head and tail, we could just use int Sum(IEnumerable<int> collection) =>
collection switch (
case [] : 0, // empty collection handler
case [var last] : last,
case var [head, ..tail] : Sum(tail) + head
); Recursing just within the
|
Beta Was this translation helpful? Give feedback.
Replies: 16 comments 4 replies
-
I think it's OK to iterate over It maybe awful for performance but it's still a case for the multiple Developer should be aware of multiple iterations and could cache collection himself before pattern matching. And then there is a method to reset enumerator to its initial state. Developer could use multiple-iteration-friendly implementations of |
Beta Was this translation helpful? Give feedback.
-
The biggest question to me is: why would someone pattern match against a collection? The last example in the post would probably be clearer when written as a loop. |
Beta Was this translation helpful? Give feedback.
-
It looks great on paper but I really keep on asking myself what's the benefit of this over the traditional approach? and before this happens don't we need to standardize a syntax for collections? range? etc.. |
Beta Was this translation helpful? Give feedback.
-
My thoughts:
* Some examples, which I think were mostly successful:
|
Beta Was this translation helpful? Give feedback.
-
I don't think matching against I'm not really a fan of "cons patterns" in C# mostly because of (3) in @svick's comment. F# has it because list literals in F# create an immutable linked list, so a cons pattern is expected there. You can always write an extension pattern for any other data structures (like a link list, immutable list, etc). |
Beta Was this translation helpful? Give feedback.
-
@orthoxerox and @eyalsk, With regard to the question, "why would one pattern match a collection", I asked that same question a while back. Ironically @orthoxerox provided the answer, so I'm amused he's now asking why one would do it! A specific example can be found here:
tokens = tokens.Zip(tokens.Skip(1).Concat(new [] { Token.Dummy }), (t1, t2) => IsNewline(t1) && IsNewline(t2) ? new Token(Kind.Separator) : t1); Using pattern matching, this becomes: IEnumerable<Token> ConvertPairedNewlinesToTokens(IEnumerable<Token> tokenSet)
{
switch (tokenSet)
{
case var [t1, t2, .. tail] when IsNewline(t1) && IsNewline(t2) :
yield return Token(Kind.Separator);
switch(tail);
case var [t, .. tail] :
yield return t;
switch(tail);
case [] :
yield break;
}
} The syntax is longer*, but to my mind at least, it's far simpler to follow. *I think the syntax could be improved hugely, but for now I'm sticking with C#-style syntax for this discussion to keep it looking familiar. |
Beta Was this translation helpful? Give feedback.
-
I dismissed
(and well spotted with regard to my naive use of |
Beta Was this translation helpful? Give feedback.
-
@DavidArno I think that that token parsing problem is indeed a good example of something that current C# doesn't handle well, while functional-style pattern matching does. Trying to fit that into my "imperative pattern matching" idea, I think it could look something like: IEnumerable<Token> ConvertPairedNewlinesToTokens(IEnumerable<Token> tokenSet)
{
while (true)
{
switch (tokenSet)
{
case [var t1, var t2, .. tokenSet] when IsNewline(t1) && IsNewline(t2):
yield return Token(Kind.Separator);
break;
case [var t, .. tokenSet]:
yield return t;
break;
case []:
yield break;
}
}
} I also had a crazy idea: introduce a IEnumerable<Token> ConvertPairedNewlinesToTokens(IEnumerable<Token> tokenSet)
{
foreach switch (tokenSet)
{
case var [t1, t2] when IsNewline(t1) && IsNewline(t2):
yield return Token(Kind.Separator);
break;
case var [t]:
yield return t;
break;
}
} I'm not sure adding a |
Beta Was this translation helpful? Give feedback.
-
Could this syntax clash with attribute pattern matching in #807? private bool Serializable { get; set; }
...
if (T is [Serializable]) ... |
Beta Was this translation helpful? Give feedback.
-
Yes, but that copying happens infrequently; amortized time complexity of adding an item is still O(1). In practice, I believe the overheads of linked lists are much bigger, which is why real C# code uses
I don't think so. The count is initialized to 0, so if the first item matches |
Beta Was this translation helpful? Give feedback.
-
Hmm, you are right. It does work. So we can conclude that it's actually me that has difficulty understanding code that solves problems with loops. I love your idea of the @bondsbw, |
Beta Was this translation helpful? Give feedback.
-
Definitely not, I'm afraid. Since they're ultimately generated with a An public class EnigmaticObject : IEnumerable<int>
{
private static readonly Random Random = new Random();
public IEnumerator<int> GetEnumerator()
{
var count = 10;
while (count-- > 0)
{
yield return Random.Next();
}
}
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
} |
Beta Was this translation helpful? Give feedback.
-
I do, however, support the idea of pattern matching on collections. Maybe only on persistent or recursive data structures? I'm thinking that there are many types that can be destructured into a tuple, such as a cons-style list that could be split into its head and tail: var (hd, tl) = list; So maybe var [x, y, z] = a can be shorthand for the recursive pattern: (var x, (var y, (var z, _))) = a I'm just brainstorming here, but it seems possible, if a type can be split into two, to be able to recursively split it further. This could also work with Array Slices, when they arrive. |
Beta Was this translation helpful? Give feedback.
-
@Richiban Is list deconstruction into tuples part of an existing proposal? |
Beta Was this translation helpful? Give feedback.
-
Since all the required infra is being done as part of C# 9.0, I've started to work on a proposal for this feature, this is the latest revision: https://gist.github.com/alrz/84addd150849a0b8c014deb85b75211d/ I plan to implement a prototype for the first two sections with the hopes that it'll get championed. Feedback welcome. |
Beta Was this translation helpful? Give feedback.
-
@DavidArno I've just noticed this discussion. I think the latest iterations of list-patterns are getting pretty close to what you're proposing. I've written a proposal for how we might implement list-patterns on enumerable types (ie. that are not indexable) without enumerating multiple times: https:/dotnet/csharplang/blob/main/proposals/list-patterns-enumerables.md Regarding the head/tail pattern, I don't think we need to use |
Beta Was this translation helpful? Give feedback.
@DavidArno I've just noticed this discussion. I think the latest iterations of list-patterns are getting pretty close to what you're proposing.
I've written a proposal for how we might implement list-patterns on enumerable types (ie. that are not indexable) without enumerating multiple times: https:/dotnet/csharplang/blob/main/proposals/list-patterns-enumerables.md
In short, we'd generate a wrapper type around the enumerable and buffer a few elements. More elements need to be buffered if the list-patterns are longer and involve slices at the start. This type provides a count-in-progress and emulates indexers for start-indexes (such as
0
) and end-indexes (such as^1
).Regarding…