Subject: Re: Pattern matching in SNOBOL4 (long, digression)
From: Ton Hospel (thos@mail.dma.be)
Date: Apr 17, 1998 1:39:02 pm
List: org.perl.perl5-porters

In article <1998@chapin.edu>, kst@chapin.edu writes:

Ton Hospel said:

In article <c=UK%a=_%p=Origin-it%l=UKRU@ukrax001.ras.uk.origin-it.com>, "Moore, Paul" <Paul@uk.origin-it.com> writes:

From: Ilya Zakharevich[SMTP:il@math.ohio-state.edu]

When we have a patch-receptive pumpking (will we ever?), $& and friends will work in (?e ), so

'MISSISSIPPI' =~ /(is|si|ip|pi)(?e print $1 )(?!)/

will be the Perlian way.
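
For readers who want to try the idea: the (?e ) spelling above is the proposal as written in this thread. The sketch below assumes a later Perl in which embedded code is written (?{ ... }), and it adds /i, since the lowercase alternatives would otherwise never match the uppercase string. The array name @found is made up for illustration.

```perl
use strict;
use warnings;

my @found;   # hypothetical name; collects each alternative the engine tries

# (?{ ... }) runs the code every time the engine reaches that point.
# (?!) always fails, which forces backtracking into the remaining
# alternatives and then on to the next starting position, so the match
# as a whole never succeeds but every candidate is visited.
'MISSISSIPPI' =~ /(is|si|ip|pi)(?{ push @found, $1 })(?!)/i;

print "$_\n" for @found;
```

Collecting into an array rather than printing inside the pattern keeps the side effect small and easy to inspect.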

I don't think I would like that too much. Now I quite often use the fact that you can do something like:

    chomp($input=<STDIN>);
    $foo =~ /$input/;

where whatever the user types, he can't damage the integrity of your program. With that proposed extension he would be able to execute anything he wants. I would at least want a letter I could put after the // to forbid (?e ) execution.

Your concerns are well-founded; this is the class of problem which Taint mode is designed to diagnose and pre-empt. To execute a user-supplied regex safely, you really need to confirm that the pattern only contains the kinds of things you're willing to accept:

    chomp ($input=<STDIN>);
    $input =~ s/(.*?)\(\?e.*/$1/;   # Eliminate ``(?e ...'' expressions
    $foo =~ /$input/;
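
A variation on the snippet above, offered only as a sketch: instead of silently truncating the pattern, refuse it, and compile it inside eval so that a malformed pattern cannot abort the program. The helper name safe_pattern is hypothetical, qr// assumes a Perl newer than the one discussed in this thread, and the check also covers the (?{ ... }) and (??{ ... }) spellings as an assumption about where the proposal ends up. It does nothing about patterns that are merely slow or memory-hungry.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: returns a compiled regex,
# or undef if the pattern is refused or does not compile.
sub safe_pattern {
    my ($input) = @_;

    # Deliberately conservative: refuse anything that looks like "(?e",
    # "(?{" or "(??{", even if the parenthesis happens to be escaped.
    return if $input =~ /\(\?\??[e{]/;

    # A syntax error in the pattern dies at compile time; eval traps it
    # and leaves $re undefined.
    my $re = eval { qr/$input/ };
    return $re;
}

for my $input ('is+', '(', '(?e print 1)') {
    my $verdict = defined safe_pattern($input) ? 'accepted' : 'refused';
    print "$verdict: $input\n";
}
```

Refusing outright rather than stripping means the user finds out that the pattern was not used as typed.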

Keep in mind that user-supplied regular expressions can already have unwanted side effects -- e.g., ``((.*)*)'' will produce a fatal warning; ``(.)'' will set $1; Ilya can probably come up with an example which will gobble up all available memory . . . . :-)

Ok, you convinced me. I always thought the eval I put around these constructs took care of evil user input, but reading your post and checking teaches me that bad expressions are indeed fatal. So I'll just stop using that particular construct :-)

Which leads me to: why are they fatal? (I ask more out of curiosity: making them non-fatal would indeed solve the ?e problem, since Ilya wants to make them taint-checked, but it wouldn't solve the easily written out-of-memory regex.)