[RndTbl] Wrong time of night for doing regex?

Hartmut W Sager hwsager at marityme.net
Mon Jan 6 03:35:38 CST 2020


Thanks, Trevor, for your useful comments.  As a result, I've spent some
time in the PCRE regex documentation, and have discovered just how feeble
the regex implementation is in my Vedit (no, not vi!) text editor.  Even
tonight, I've run into more problems.

Other than the lousy regex implementation, though, Vedit has served me well
continuously since 1982 (with many upgrades along the way, of course).

Hartmut W Sager - Tel +1-204-339-8331


On Sun, 5 Jan 2020 at 04:10, Trevor Cordes <trevor at tecnopolis.ca> wrote:

> On 2020-01-04 Hartmut W Sager wrote:
> >
> > It turns out, at least in this regex implementation, that a pair of
> > enclosing parentheses can serve only one of two purposes at a time,
> > not both.  Those two purposes are:
> >
> > 1.  Mark a group that can then be referred to by a variable like "\3"
> > in the replacement string.
> > 2.  Enclose a group with alternation (regex terminology) containing
> > several alternatives separated by the "or" operator "|".
>
> That's just plain evil.  Nasty!
>
> The de facto standard is (obviously) PCRE, and your program (you said
> vi?) is obviously not PCRE.  I'd be shocked if vi doesn't offer you
> some way to replace the regex engine, or at least to outsource the
> regex work to a filter.  Not sure, though; I don't use vi.
>
> In PCRE each () serves both purposes, unless you use (?:) in which case
> you only get purpose #2 (and save CPU cycles).
>
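
To make the point about () versus (?:) concrete, here is a minimal sketch.
It assumes Python's re module rather than PCRE itself (the two agree on this
particular point), and the sample text and variable names are made up for
illustration:

```python
import re

# Made-up sample text: a space-padded day and a two-digit day.
text = "Jan  6, 2020 and Jan 16, 2020"

# One pair of parentheses doing both jobs: it holds the alternation
# (two digits, or whitespace plus one digit) and captures it as group 1.
capturing = re.compile(r"Jan (\d\d|\s\d), (\d{4})")

# The same alternation in a non-capturing group: it still groups the
# alternatives, but the year now becomes group 1.
non_capturing = re.compile(r"Jan (?:\d\d|\s\d), (\d{4})")

days = [m.group(1) for m in capturing.finditer(text)]
years = [m.group(1) for m in non_capturing.finditer(text)]

# Both captured groups are usable in the replacement string.
reordered = capturing.sub(r"\2-Jan-\1", text)

print(days)       # [' 6', '16']
print(years)      # ['2020', '2020']
print(reordered)  # 2020-Jan- 6 and 2020-Jan-16
```
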
> The others are correct, using \s in the right hand side is not PCRE.
> In PCRE \s means "(most) any whitespace" in the regex, and will be just
> "s" in the substitution.
>
> PCRE = One Ring^H^H^H^HRegex to rule them all.  Most programs with
> regex use the PCRE library now, or give the option, and if you always
> use -P with grep you'll basically never have to touch another
> substandard regex engine again! :-)  All the perl-haters might find it
> amusing that they use "perl" on a daily basis because of PCRE :-)
> (Well, sort of.)
>
> > I am a bit suspicious of the ([0-9][0-9]|\s[0-9]) group re operator
> > precedence of the "or"
>
> In most (all?) regex engines (especially PCRE; but pretty sure all!)
> the rule is "first, most".  So the order in which you list your
> alternatives may matter.  In the above case, order probably doesn't
> matter because the things surrounding that bit must be space/comma.
> Order matters where the surrounding bits can match the same text, and
> in things like eating escaped chars, like escaped double-quotes in CSVs:
> /"(\\"|[^"])+"/ works, but
> /"([^"]|\\")+"/ doesn't.
>
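
The two CSV patterns above can be tried side by side.  This is only a
sketch, assuming Python's re module, which like PCRE is a backtracking
engine that tries alternatives left to right.  (POSIX leftmost-longest
engines are a different story, since they pick the longest overall match
regardless of the order of the alternatives.)  The sample line is made up:

```python
import re

# Made-up input containing backslash-escaped double quotes.
line = r'say "a \"quoted\" word" end'

# Escaped quote listed first: a backslash followed by a quote is
# consumed as one unit before the plain [^"] alternative is tried.
escaped_first = re.compile(r'"(\\"|[^"])+"')

# Plain character listed first: [^"] swallows the backslash by itself,
# so the quote after it is taken as the closing delimiter.
escaped_last = re.compile(r'"([^"]|\\")+"')

good = escaped_first.search(line).group(0)
bad = escaped_last.search(line).group(0)

print(good)  # "a \"quoted\" word"
print(bad)   # "a \"
```

With the second pattern the match stops at the first escaped quote, which
is the failure described above.
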
> As always, the O'Reilly regex book is an amazing way to fully
> understand exactly what is going on and will really open a lot of eyes!!