Fw: [RndTbl] Oh great RE master

Gilles Detillieux grdetil at scrc.umanitoba.ca
Wed May 9 16:45:00 CDT 2007


The problem with 's/.*\([[:digit:]]*\).*/\1/g' is that the first .* is 
greedy: it swallows as many characters as it can while still letting 
the rest of the expression match.  Because * means 0 or more of the 
preceding item, the [[:digit:]]* and the trailing .* will happily 
match nothing at all, so the initial .* swallows the whole line and \1 
comes out empty.  The fix is to make the first part more restrictive 
than .*, e.g. [^0-9]* or [^[:digit:]]*, so it won't chew up your 
digits.  Sean's RE is even simpler, though, so long as you want all 
the digits and it doesn't matter where they are.  If you needed to 
extract only the first of possibly several contiguous strings of 
digits, you'd need to get more elaborate.
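
As a rough sketch of the "more restrictive first part" fix, assuming a POSIX sed and input that contains at least one digit (the function name first_digits is made up for illustration):

```shell
# Hypothetical helper: print the first contiguous run of digits in $1.
first_digits() {
    # [^0-9]* can only consume non-digits, so it stops at the first digit;
    # [0-9][0-9]* requires at least one digit, so the group cannot be empty;
    # the trailing .* throws away everything after the first run.
    printf '%s\n' "$1" | sed 's/^[^0-9]*\([0-9][0-9]*\).*/\1/'
}

first_digits BUILD-AM005-a      # prints 005
first_digits BUILD12-AM005-a    # prints 12
```

If the input has no digits at all, the pattern doesn't match and sed prints the line unchanged, so a caller that cares would have to check for that separately.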

An equivalent to Sean's command would be:

    echo BUILD-AM005-a | tr -dc '0-9'

This strips the trailing newline as well, but that doesn't matter if 
you're capturing the result in a variable with var=`...` or 
var=$(...), since command substitution drops trailing newlines anyway.
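
A small sketch of both cases, assuming a POSIX shell (the variable name num is just for illustration):

```shell
# Captured in a variable: the newline that tr removed is not missed,
# because command substitution strips trailing newlines in any case.
num=$(echo BUILD-AM005-a | tr -dc '0-9')
echo "$num"    # prints 005

# Printed directly: add the newline back yourself, or the next shell
# prompt ends up on the same line as the digits.
echo BUILD-AM005-a | tr -dc '0-9'; echo
```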

Gilles

On 05/09/2007 04:14 PM, Steve Moffat wrote:
> Well, ya... I guess I did the equivalent (though not so concise) method 
> after sending the first email to roundtable...
> 
> echo APP-AM005-a | sed 's/[[:alpha:]]//g;s/[[:punct:]]//g'
> 
> I like the search inversion though Sean. Much cleaner!
> 
> So the problem I have is solved, thanks Sean. But why won't my original 
> method work?
> The [[:digit:]]* should have matched all the consecutive digits 
> shouldn't it? And then the ( ) brackets should place the match into 
> buffer 1.
> 
> Steve
> 
> IBM Global Services
> sjm at ca.ibm.com
> (204)792-3245
> 
> ----- Forwarded by Steve Moffat/CanWest/IBM on 05/09/2007 04:08 PM -----
> 
>                         *"Sean Walberg" <sean at ertw.com>*
>                         Sent by: swalberg at gmail.com
> 
>                         05/09/2007 04:05 PM
> 
> 	
> 
> To:      Steve Moffat/CanWest/IBM at IBMCA
> cc:      roundtable at muug.mb.ca
> Subject: Re: [RndTbl] Oh great RE master
> 
> 
> # echo BUILD-AM005-a | sed 's/[^0-9]//g'
> 005
> 
> Sean
> 
> 
> 
> On 5/9/07, Steve Moffat <Steve.Moffat at ca.ibm.com> wrote:
> 
>       Hi All;
>       I've been trying to write a sed function to return only a numeric
>       portion of a string, but can't seem to get it working.
>       The input is a single string of letters and numbers, with the
>       numbers always consecutive.
>       For example: BUILD-AM005-a
> 
>       I want to get the 005 out of this string.
> 
>       echo BUILD-AM005-a | sed 's/.*\([[:digit:]]\).*/\1/g'
> 
>       will return the digit 5. This is good!
> 
>       So I add an asterisk to try to match multiple digits like:
>       echo BUILD-AM005-a | sed 's/.*\([[:digit:]]*\).*/\1/g'
> 
>       and instead of returning 005, it doesn't match anything, so
>       returns nothing.
> 
>       Can any of you RE masters help me out?
> 
>       Steve Moffat
>       IBM Global Services
>       sjm at ca.ibm.com
>       (204)792-3245

-- 
Gilles R. Detillieux              E-mail: <grdetil at scrc.umanitoba.ca>
Spinal Cord Research Centre       WWW:    http://www.scrc.umanitoba.ca/
Dept. Physiology, U. of Manitoba  Winnipeg, MB  R3E 3J7  (Canada)

