[RndTbl] html pretty code

Trevor Cordes trevor at tecnopolis.ca
Thu Nov 18 09:12:38 CST 2010

Adam thought someone might find this useful, so here it is.  It's a perl 
program I wrote to "pretty" html for easy readability/debugging by 
applying indenting.  The neat thing is, it's entirely contained in 1 
regular expression; no loops!  Well, except for the weird nl while loop.  
This ain't your father's regex!

Yes, there's a zillion html pretty programs out there but none did what I 
wanted in a few ways:

1. Just fire & forget, no 200 options to worry about.

2. Works on random html snippets not just whole pages, so you can view 
source from the web and just paste a few lines from the middle of any web 
page and it will pretty it up.

3. Challenge to do something like this only using regex!

There may be a few tags it doesn't catch yet (just add them to the main 
tag list), but that won't hurt the output very much.
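
As a rough sketch of what "adding a tag" amounts to, assuming the tag list is a plain regex alternation: the names @indent_tags, $tag_re and indent_tag_name, and the tags themselves, are made up for illustration and are not taken from the program.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical list of tags that should trigger indenting.
my @indent_tags = qw(html head body div table tr td ul ol li);

# "Adding a tag" is just one more alternative in the list.
push @indent_tags, qw(section article);

# Build the alternation; \b stops "li" from matching <link> or "tr" <track>.
my $alt    = join '|', map { quotemeta } @indent_tags;
my $tag_re = qr{<(/)?($alt)\b([^>]*)>}i;

# Returns the lower-cased tag name if the tag triggers indenting, else undef.
sub indent_tag_name {
    my ($tag) = @_;
    return $tag =~ $tag_re ? lc($2) : undef;
}

for my $t ('<section id="a">', '</td>', '<link rel="x">', '<br>') {
    my $name = indent_tag_name($t);
    printf "%-18s %s\n", $t, defined $name ? "indents ($name)" : "no indent";
}
```

A tag that is missing from the list simply falls through to the "no indent" case, which is why an incomplete list only costs a little indentation.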

Run like:
html-pretty < html-file | less

#!/usr/bin/perl -w
# Copyright 2010 Trevor E Cordes, Tecnopolis Enterprises
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.

$l=-1; # indentation level

undef $/; $s=<STDIN>; # slurp all of stdin (reconstructed; the original read was lost)

$s=~s#>\s+#>#g; # rm ws after tags
$s=~s#\s+<#<#g; # rm ws before tags
while ($s=~s#<([^>]*)\n([^>]*)>#<$1 $2>#g) { 1; };	# nl in tags -> space, so attributes don't run together

$s=~s#>#>\n#g;  # put newlines after all tags
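
# The pattern and the three tests were lost from the archived copy; they are
# rebuilt here from how $1..$4 are used below, and the tag list is a guess.
$s=~s!
        <(/)?(?:(html|head|body|div|table|thead|tbody|tfoot|tr|td|th|ul|ol|li|form|select)\b)?([^>]*>\n)
        |([^<]+)
        !
        defined($4)
        ?	# non-tag text
          (' 'x($l+1)).$4."\n"
        : defined($2)
           ?    # a recognized triggers-indent tag
             defined($1)
             ?  # tag starts with /, decrease indent
               ( $l--,($l<-1 and $l=-1) , ((' 'x($l+1)).'<'.(defined($1)?$1:'').$2.$3) )
             :  # tag is opening tag, increase indent
               ( $l++ , ((' 'x$l).'<'.(defined($1)?$1:'').$2.$3) )
           :    # a non-triggers-indent tag
            ( (' 'x($l+1)).'<'.(defined($1)?$1:'').$3 )
        !gemx; # indent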

print $s;
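
On the "weird nl while loop": a substitution of this shape removes only one newline per tag on each /g pass, because every match spans a whole tag and matches cannot overlap, so it has to be repeated until nothing changes. A minimal sketch, with a made-up helper name (join_tag_lines) and made-up input; this variant puts a space where the newline was so that adjacent attributes stay separated.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: replace newlines inside <...> tags with a space.
sub join_tag_lines {
    my ($s) = @_;
    # Each pass fixes at most one newline per tag, so loop until no
    # tag contains a newline any more.
    1 while $s =~ s#<([^>]*)\n([^>]*)>#<$1 $2>#g;
    return $s;
}

my $in = "<a\nhref='x'\nclass='y'>text</a>";

# A single pass: the greedy first group keeps the first newline,
# and only the last newline in the tag is replaced.
(my $once = $in) =~ s#<([^>]*)\n([^>]*)>#<$1 $2>#g;

print "one pass: $once\n";
print "looped:   ", join_tag_lines($in), "\n";
```

After one pass the tag still contains a newline between "a" and "href"; the looped version leaves the whole tag on one line.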
