Subject: fixing ~~ harder. a lot harder
From: Ricardo Signes (perl@rjbs.manxome.org)
Date: Jul 6, 2011 2:24:25 pm
List: org.perl.perl5-porters

Boy do I ever dislike ~~. Every time I see somebody suggest it, this is what I imagine:

<fellow> Maybe Perl isn't given to over-magic line-noisy crap. I hear there's even a new version. What'd it get us?

<japh> ~~, for smart matching, with 27-way recursive runtime dispatch by operand type!

<fellow> ...

The massive dispatch table is a red flag so large that it could be sewn into a red tent. The bizarre grammar exceptions in when() conditions are nuts.

I think there is real value in ~~, though. When I tried to justify it when presenting Perl 5.10 changes, I said that the value was that you could have different kinds of things that represented tests, or predicates. Then you could say "this method expects an arrayref and a predicate." What does that mean? It means that the second argument can be anything that goes on the right-hand side of ~~. I still think this is a very good idea. What I don't like is the massive number of things that can go on the right-hand side of a ~~ operator, or the way that dispatch is based on both the left and right sides, which is what gives us that massive table.

Here is the table I propose instead:

$a      $b              Meaning
======= =============== ======================
Any     undef           ! defined $a
Any     ~~-overloaded   invokes the ~~ overload on the object, $a as arg
Any     Regexp, qr-OL   $a =~ $b
Any     CodeRef         $b->($a)
Any     Any             fatal

~~ works the same way as it does now for undef (a trivial case), ~~-overloaded right-hands (ignoring the bug I pointed out earlier today), and many cases of CodeRef. We get rid of the way that containers on the lhs are currently groveled over. Regexps, too, just do what they were already doing to non-containers.

This table is very, very easy for anyone to memorize, unlike the current table. It is also very easy to guess at, again unlike the current one. That is, you can see that someone has put a coderef on the rhs and guess what it will do.

I also propose that the other object overloading be allowed to work: if your object currently overloads 'qr' or '&{}', but not '~~', those overloads will be used for ~~. If none of those is overloaded, the smart match will be fatal.
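To make the proposed rules concrete, here is a rough sketch of them as an ordinary function. It is only an emulation: dumb_match() is a made-up name, the real thing would live in the ~~ operator itself, and the order in which the 'qr' and '&{}' fallbacks are tried is my assumption, since the proposal does not rank them.

```perl
use strict;
use warnings;
use Scalar::Util qw(blessed reftype);
use overload ();

# Hypothetical emulation of the proposed table; not part of any real API.
sub dumb_match {
    my ($x, $y) = @_;

    # Any ~~ undef: true only when the left side is undefined.
    return !defined $x unless defined $y;

    if (blessed $y) {
        # A ~~ overload on the right-hand object takes priority.
        # The third argument tells the handler the operands are swapped.
        if (my $handler = overload::Method($y, '~~')) {
            return $handler->($y, $x, 1);
        }

        # Plain qr// objects, and objects that overload 'qr'.
        return scalar($x =~ $y)
            if ref($y) eq 'Regexp' || overload::Method($y, 'qr');

        # Objects that overload code dereference.
        return $y->($x) if overload::Method($y, '&{}');
    }

    # CodeRef on the right: call it with the left side as its argument.
    return $y->($x) if (reftype($y) // '') eq 'CODE';

    # Everything else is fatal.
    die "dumb_match: nothing sensible on the right-hand side\n";
}
```

With this, dumb_match("foobar", qr/oba/) and dumb_match(4, sub { $_[0] == 4 }) are both true, while dumb_match(1, "1") dies rather than guessing whether a string or numeric comparison was meant.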

I don't see any real need for Str or Num in the $b column. Something like Yuval or Leon T.'s modules can provide streq(4) or numeq(4) to generate coderefs.
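Such generators are tiny. The following is a sketch of what I mean, not the actual interface of anybody's module; the names streq and numeq are just the ones used above, and treating undef as a non-match is my own choice.

```perl
use strict;
use warnings;

# Hypothetical predicate generators: each returns a coderef that can sit
# on the right-hand side of the proposed ~~.

# String equality against a fixed value.
sub streq {
    my ($want) = @_;
    return sub { defined $_[0] && $_[0] eq $want };
}

# Numeric equality against a fixed value.
sub numeq {
    my ($want) = @_;
    return sub { defined $_[0] && $_[0] == $want };
}
```

The point is that the caller states the comparison explicitly: numeq(4) accepts "4.0", streq(4) does not, and nothing has to be inferred from what the operands happen to look like at runtime.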

Finally, I am not sure how I feel about non-scalars on the left-hand side. Assume that the code in question (for ~~ or code dispatch) gets a reference in the case of a non-scalar, so that:

@array ~~ sub { ... }

Ends up calling $sub->(\@array)

I don't like that it is indistinguishable from:

[ 1, 2, 3 ] ~~ sub { ... }

I can't yet produce a case where this is a serious problem, though. Even so, I think it may be sufficient to pass a second argument, which is whether the tested thing was en-referenced on demand.

@array ~~ sub { ... }; # @_ == [ \@array, 1 ]
\@array ~~ sub { ... }; # @_ == [ \@array, undef ]
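A predicate that cares about the difference could then look something like the sketch below. The proposed ~~ does not exist, so the calls are written out by hand with the arguments it would pass; has_three() and its two behaviors are invented purely for illustration.

```perl
use strict;
use warnings;

# Hypothetical predicate that uses the proposed second argument.
# Given a bare array, it asks "does the array contain a 3?".
# Given an explicit reference, it asks "is this a three-element arrayref?".
sub has_three {
    my ($thing, $enreferenced) = @_;

    if ($enreferenced) {
        my $count = grep { $_ == 3 } @$thing;
        return $count > 0;
    }

    return ref($thing) eq 'ARRAY' && @$thing == 3;
}

my @array = (3, 4);

# What  @array ~~ \&has_three  would call under the proposal:
my $direct = has_three(\@array, 1);

# What  \@array ~~ \&has_three  would call under the proposal:
my $explicit = has_three(\@array, undef);
```

Here $direct is true and $explicit is false, even though both calls receive the same reference as their first argument.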

What does this mean for matches against Regexp? Well, it doesn't matter:

@array ~~ qr/.../; # ...is equivalent to...
\@array ~~ qr/.../; # ...which is just dumb.

Since we'll all be using "no stringification", this should end up being fatal, anyway. No problem, right?

This should be available with "use feature 'dumb_match'," enabled also by "use 5.x.0" where x is whenever it's ready.