On my blog, I use the excellent highlight.js library to apply syntax highlighting to source code in the browser. This lets me copy and paste source code directly into the post (enclosed in a [% FILTER html %] block) instead of having to transform it somehow. It also keeps the number of tag-enclosed pieces of text to a minimum, so the original DOM stays simple which, one hopes, means faster downloads and faster initial rendering.
A recent Stack Overflow question introduced me to the PPI::HTML module, which uses the amazing PPI module to parse Perl source code and associate CSS classes with the various elements.
If you ask the module to produce a complete HTML page, it will also embed the relevant CSS in the page, and will produce a pretty, colorful document. By default, the class names are rather verbose, and the module offers limited flexibility, but PPI::HTML::CodeFolder provides some enhancements that may be useful.
What if you want to produce a self-contained chunk of syntax-highlighted Perl without depending on external CSS or JavaScript? In that case, you can resort to a somewhat grungy technique I use when generating HTML email: post-process the HTML to replace the class attributes on elements with inline style attributes.
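To show the idea in isolation, here is a minimal sketch that uses only core Perl. The `%style_for` table, the `inline_styles` helper, and the sample fragment are all hypothetical names made up for illustration, and the regex assumes simple, double-quoted class attributes holding a single class name. Real-world HTML is better handled with a proper parser.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical class-to-style table; names and colors are illustrative only.
my %style_for = (
    keyword => 'color:#00f',
    comment => 'color:#088',
);

# Swap class="..." for style="..." when the class is known,
# and drop the attribute entirely when it is not.
# Assumption: double-quoted attributes with exactly one class name.
sub inline_styles {
    my ($html, $styles) = @_;
    $html =~ s/\s+class="([^"]*)"/ exists $styles->{$1} ? qq{ style="$styles->{$1}"} : '' /ge;
    return $html;
}

my $fragment = '<span class="keyword">my</span> <span class="unknown">$x</span>';
print inline_styles($fragment, \%style_for), "\n";
```

The known class becomes an inline style and the unknown one is simply removed, so the result no longer needs a stylesheet to render.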
Here is an example Perl script which generates a syntax-highlighted version of its own source code:
#!/usr/bin/env perl

use strict;
use warnings;

use PPI;
use PPI::HTML;
use HTML::TokeParser::Simple;

# Map PPI::HTML class names to colors
my %colors = (
    cast            => '#339999',
    comment         => '#008080',
    core            => '#FF0000',
    double          => '#999999',
    heredoc_content => '#FF0000',
    interpolate     => '#883333',
    keyword         => '#0000FF',
    line_number     => '#666666',
    literal         => '#999999',
    magic           => '#0099FF',
    match           => '#9900FF',
    number          => '#990000',
    operator        => '#DD7700',
    pod             => '#008080',
    pragma          => '#990000',
    regex           => '#9900FF',
    single          => '#664444',
    substitute      => '#9900FF',
    transliterate   => '#9900FF',
    word            => '#40c080',
);

my $highlighter = PPI::HTML->new(line_numbers => 0);

# Slurp this script's own source: one-argument open takes the file name from $0
my $html = $highlighter->html(\ do { local $/; open 0; <0> });

print qq{<pre style="background-color:#fff;color:#000">},
    map_class_to_style($html, \%colors),
    qq{</pre>\n}
;

sub map_class_to_style {
    my $html   = shift;
    my $colors = shift;

    my $parser = HTML::TokeParser::Simple->new(string => $html);
    my $out = '';

    while (my $token = $parser->get_token) {
        # Skip <br> tags; the surrounding <pre> already preserves line breaks
        next if $token->is_tag('br');

        my $class = $token->get_attr('class');
        if ($class) {
            $token->delete_attr('class');
            if (defined(my $color = $colors->{$class})) {
                # shave off some characters if possible
                $color =~ s{
                    \A \#
                    ([[:xdigit:]])\1
                    ([[:xdigit:]])\2
                    ([[:xdigit:]])\3
                    \z
                }{#$1$2$3}x;
                $token->set_attr(style => "color:$color");
            }
        }
        $out .= $token->as_is;
    }

    $out;
}

And the output, in a rather distasteful color scheme, I admit:
The original script is 1,690 bytes. The syntax-highlighted chunk above, on the other hand, is 8,599 bytes: roughly five times the size, or an increase of about 409%.
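For anyone who wants to check the arithmetic, here is a small sketch using the two byte counts quoted above. The `percent_increase` helper is a hypothetical name introduced only for this illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Sizes in bytes, as quoted in the post.
my ($plain, $highlighted) = (1_690, 8_599);

# Relative growth, expressed as a percentage of the original size.
sub percent_increase {
    my ($before, $after) = @_;
    return 100 * ($after - $before) / $before;
}

printf "ratio:    %.2fx\n",  $highlighted / $plain;
printf "increase: %.1f%%\n", percent_increase($plain, $highlighted);
```

This prints a ratio of 5.09x and an increase of 408.8%, which is the price of repeating a style attribute on every highlighted token.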
PS: You can discuss this post on /r/perl.