This is a review of Perl CPAN modules that can be used to convert markdown to HTML. If you're looking for a quick answer, you could skip to the Comparison section, but you can't go too far wrong with Text::Markdown.
The following is a list of the modules I'm aware of so far. Please let me know if I've missed any: neilb at cpan dot org.
| Module | Doc | Version | Author | # bugs | # users | Last update |
|---|---|---|---|---|---|---|
| DR::SunDown | pod | 0.02 | Dmitry E. Oboukhov | 1 | 0 | 2012-08-09 |
| Markdent | pod | 0.22 | Dave Rolsky | 0 | 2 | 2012-07-23 |
| Markup::Unified | pod | 0.0401 | עידו פרלמוטר (Ido Perlmuter) | 2 | 2 | 2012-11-22 |
| Text::Markdown | pod | 1.000031 | Tomas Doran | 8 | 38 | 2010-03-20 |
| Text::Markdown::Discount | pod | 0.10 | Masayoshi Sekimura | 1 | 1 | 2013-08-09 |
| Text::Markdown::Hoedown | pod | 0.07 | MATSUNO★ Tokuhiro | 0 | 1 | 2013-10-02 |
| Text::Markup | pod | 0.18 | David E. Wheeler | 0 | 1 | 2013-06-08 |
| Text::Markup::Any | pod | 0.03 | Masayuki Matsuki | 0 | 1 | 2013-10-08 |
| Text::MultiMarkdown | pod | 1.000034 | Tomas Doran | 0 | 18 | 2011-04-26 |
Each module is presented in turn, with a SYNOPSIS-style code sample. Then all the converter modules are compared, and I end with recommendations. The review closes with a See Also section that lists markdown-related modules that don't meet the criteria for this review.
With each module I convert the following small snippet of markdown:
# Sample markdown

This is a paragraph of text.

* This is a bullet
* Another bullet

And here's a code sample

    print "Hello, World!\n";

And inline formatting:
*italic*, **bold**, and **_bold italic_**.
The first thing to consider is whether you want a full HTML document to be generated, or just a fragment that could be embedded. Most of the modules listed here generate fragments rather than complete HTML documents.
DR::SunDown is a wrapper around the sundown C library. Sundown is a fork of the soldout library.
The module provides a markdown2html function, which takes a markdown string and returns an HTML one:
use DR::SunDown;
my $html = markdown2html($markdown);
print $html, "\n";
Which produces the following output:
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
This module doesn't offer anything beyond basic conversion, but its main advantage is its speed.
The underlying sundown library has been frozen since November 2012 and has a lot of outstanding bugs, so I wouldn't recommend using it. The freeze notice says that GitHub and others are apparently working on a formal definition of markdown and a new parser.
Markdent is a toolkit for parsing markdown, which can also be used to convert a markdown document:
use Markdent::Simple::Document;
my $parser = Markdent::Simple::Document->new();
my $html = $parser->markdown_to_html(
    title    => 'sample document',
    markdown => $markdown
);
print $html, "\n";
Which produces the following output:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html><head><title>sample document</title></head><body><h1>Sample markdown
</h1><p>This is a paragraph of text.
</p><ul><li>This is a bullet
</li><li>Another bullet
</li></ul><p>And here's a code sample
</p><pre><code>print "Hello, World!\n";
</code></pre><p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.
</p></body></html>
Note that this is for generating documents, rather than processing snippets of markdown. The documentation says to look at Markdent::Handler::HTMLStream::Fragment if you don't want to produce a complete document. I wrote a simple class using that, and here's the output that resulted:
<h1>Sample markdown
</h1><p>This is a paragraph of text.
</p><ul><li>This is a bullet
</li><li>Another bullet
</li></ul><p>And here's a code sample
</p><pre><code>print "Hello, World!\n";
</code></pre><p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.
</p>
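A wrapper along those lines might look like the following minimal sketch. It is not the class used to produce the output above. It assumes that Markdent::Parser accepts a `handler` argument and that Markdent::Handler::HTMLStream::Fragment writes to the filehandle passed as `output`, as the Markdent documentation describes; the `markdown_fragment` name is made up for illustration.

```perl
use strict;
use warnings;

use Markdent::Parser;
use Markdent::Handler::HTMLStream::Fragment;

# Hypothetical helper, not part of Markdent: convert a markdown string
# to an HTML fragment (no doctype, <html> or <body> wrapper).
sub markdown_fragment {
    my ($markdown) = @_;

    # The handler writes HTML to a filehandle, so capture it in a scalar.
    my $html = '';
    open my $fh, '>', \$html
        or die "Cannot open in-memory filehandle: $!";

    # Assumption: 'output' and 'handler' are the constructor arguments,
    # as shown in the Markdent documentation.
    my $handler = Markdent::Handler::HTMLStream::Fragment->new(output => $fh);
    my $parser  = Markdent::Parser->new(handler => $handler);

    $parser->parse(markdown => $markdown);
    close $fh;

    return $html;
}

print markdown_fragment("# Sample markdown\n\nThis is a paragraph of text.\n"), "\n";
```

The in-memory filehandle is there because the stream handlers emit HTML as they receive parse events, rather than returning a string.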
The documentation says that if you just want to convert markdown to HTML, then look at Text::Markdown.
Markup::Unified provides a common interface to conversion of three simple markup languages: markdown, BBCode, and Textile. The conversion of each format is handled by other modules; for Markdown it is Text::Markdown.
Here's how you convert markdown:
use Markup::Unified;
my $u = Markup::Unified->new();
my $html = $u->format($markdown, 'markdown');
print $html, "\n";
Which produces the following output:
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
If you need to support multiple markup formats, for example in a blogging engine, then this kind of module might be useful.
Text::Markdown supports both a functional and OO interface. For simple conversion of markdown, you can import the markdown() function:
use Text::Markdown qw(markdown);
my $html = markdown($markdown);
print $html, "\n";
Which produces the following output:
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
So this can be used to process a snippet of markdown to embed into another document / page.
The OO interface gives you a little bit of control:
use Text::Markdown;
my $parser = Text::Markdown->new(
    empty_element_suffix   => '>',
    tab_width              => 4,
    trust_list_start_value => 1,
);
my $html = $parser->markdown($markdown);
print $html, "\n";
This appears to be the most widely used module, and it's the one that I've been using to date. It does have some outstanding bugs on GitHub, but for 'regular markdown usage' it's fine.
Text::Markdown::Discount is a Perl interface to Discount, a markdown parser written in C. It exports a single function, markdown():
use Text::Markdown::Discount qw(markdown);
my $html = markdown($markdown);
print $html, "\n";
Which produces the following output:
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
The discount library also supports a number of extensions, which are described on the discount home page.
Text::Markdown::Hoedown, from the prolific TOKUHIROM, is a wrapper around the hoedown C library. Hoedown is a fork of the sundown library (used in DR::SunDown, described above), which is itself a fork of the soldout library.
The simplest usage is similar to the other modules:
use Text::Markdown::Hoedown;
my $html = markdown($markdown);
print $html, "\n";
Which produces the following output:
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
The markdown function can also take options, which are used to enable certain markdown extensions, and control the HTML that is generated.
use Text::Markdown::Hoedown;
my $html = markdown($markdown,
    extensions   => HOEDOWN_EXT_SPACE_HEADERS
                  | HOEDOWN_EXT_DISABLE_INDENTED_CODE,
    html_options => HOEDOWN_HTML_TOC
                  | HOEDOWN_HTML_USE_XHTML
);
print $html, "\n";
See the documentation for a list of all options. The doc's a bit thin though, so you'll have to guess / experiment to work out what all the options are.
The module also provides a markdown_toc() function, which will generate an HTML table of contents for a markdown string:
use Text::Markdown::Hoedown;
my $html = markdown_toc($markdown);
print $html, "\n";
Note that if you want to generate a TOC, you'll also need to pass the HOEDOWN_HTML_TOC option to markdown(), as shown above. That option generates the id attributes on the headers, which the TOC links to.
This module is fast, and appears quite flexible. Once I've worked out what all the options are for, this might be a good option, if you don't mind a module that requires a C compiler.
Text::Markup is a parser that can handle a number of formats, converting them to HTML.
use Text::Markup;
my $parser = Text::Markup->new();
my $html = $parser->parse(file => $path);
print $html, "\n";
Which produces the following output:
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet
</li>
<li>Another bullet
</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
</body>
</html>
So this seems to be oriented towards generating documents rather than embeddable snippets.
Text::Markup::Any provides a single interface to a number of other markup conversion modules. The following shows conversion using Text::Markdown:
use Text::Markup::Any;
my $tma = Text::Markup::Any->new('Text::Markdown');
my $html = $tma->markup($markdown);
print $html, "\n";
Which produces the following output:
<h1>Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
The markup modules supported are: Text::Markdown, Text::MultiMarkdown, Text::Markdown::Discount, Text::Xatena, and Text::Textile.
Compared with Markup::Unified, this module is a little strange: you have to specify the name of the conversion module rather than the format name (i.e. 'Text::Markdown' instead of 'markdown'), and it supports three different markdown modules, with no discussion of why you might choose one over the others.
Text::MultiMarkdown converts MultiMarkdown to HTML, and is written by Tomas Doran, who is also the current maintainer of Text::Markdown. MultiMarkdown is a superset of Markdown defined by Fletcher Penney. Since it's a superset, you can use Text::MultiMarkdown as a converter for regular Markdown:
use Text::MultiMarkdown qw(markdown);
my $html = markdown($markdown);
print $html, "\n";
Which produces the following output:
<h1 id="samplemarkdown">Sample markdown</h1>
<p>This is a paragraph of text.</p>
<ul>
<li>This is a bullet</li>
<li>Another bullet</li>
</ul>
<p>And here's a code sample</p>
<pre><code>print "Hello, World!\n";
</code></pre>
<p>And inline formatting:
<em>italic</em>, <strong>bold</strong>, and <strong><em>bold italic</em></strong>.</p>
Unsurprisingly, the output is nearly identical to that produced by Text::Markdown. One difference is the id attribute on the h1 element.
I benchmarked eight of the modules using a slightly longer markdown sample (2K), which contains most of the different notations. I converted this 10,000 times with each module. I didn't include Text::Markup, as it just uses Text::Markdown under the hood.
| Module | Time (s) |
|---|---|
| DR::SunDown | 0.35 |
| Text::Markdown::Hoedown | 0.44 |
| Text::Markdown::Discount | 2.62 |
| Text::Markdown | 90.13 |
| Markup::Unified | 90.40 |
| Text::Markup::Any | 91.39 |
| Text::MultiMarkdown | 149.54 |
| Markdent | 348.31 |
At the moment I'm using Benchmark. I've been meaning to try Dumbbench, and noticed the dist includes Benchmark::Dumb, which is billed as a "Benchmark.pm compatibility layer". But it's not a complete drop-in, so I'll come back to that.
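For readers who want to run something similar, here is a minimal sketch of a Benchmark-based comparison. It is not the script behind the table above: the input is a short stand-in rather than the 2K sample, the iteration count is lower, and the `build_tests` helper is made up for illustration. It only covers the modules whose examples above use a plain conversion function, and it skips any that are not installed.

```perl
use strict;
use warnings;
use Benchmark qw(timethese cmpthese);

# Conversion function for each module, as used in the examples above.
my %function_for = (
    'DR::SunDown'              => 'markdown2html',
    'Text::Markdown'           => 'markdown',
    'Text::Markdown::Discount' => 'markdown',
    'Text::Markdown::Hoedown'  => 'markdown',
    'Text::MultiMarkdown'      => 'markdown',
);

# Hypothetical helper: build the name => code hash that timethese() expects,
# skipping any module that is not installed.
sub build_tests {
    my ($markdown, %wanted) = @_;
    my %tests;
    for my $module (sort keys %wanted) {
        (my $file = "$module.pm") =~ s{::}{/}g;
        next unless eval { require $file; 1 };
        my $convert = $module->can($wanted{$module}) or next;
        $tests{$module} = sub { $convert->($markdown) };
    }
    return \%tests;
}

# Stand-in input: the 2K sample used for the table above is not reproduced here.
my $markdown = "# Sample markdown\n\nThis is a paragraph of text.\n\n"
             . "* This is a bullet\n* Another bullet\n\n"
             . "*italic*, **bold**, and **_bold italic_**.\n";

my $tests = build_tests($markdown, %function_for);
if (%$tests) {
    # 'none' suppresses the per-test lines; cmpthese() prints a comparison table.
    my $results = timethese(1_000, $tests, 'none');
    cmpthese($results);
}
else {
    print "None of the listed modules are installed.\n";
}
```

With the C-based modules, 1,000 iterations may be too few for a reliable figure, so raise the count (the table above used 10,000) if the rates look noisy.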
The following table shows the number of run-time dependencies for each module, when running the example code given for each module above.
| Module | # dependencies |
|---|---|
| Text::Markdown::Discount | 4 |
| Text::Markdown::Hoedown | 7 |
| DR::SunDown | 8 |
| Text::Markup | 24 |
| Text::Markup::Any | 26 |
| Text::Markdown | 26 |
| Text::MultiMarkdown | 27 |
| Markup::Unified | 43 |
| Markdent | 299 |
Note that the first three modules are all based on C libraries, so while they report far fewer dependencies, they require the relevant C library and a C compiler.
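One rough way to get numbers of this kind is to count the entries in %INC after loading a module in a fresh perl process. The sketch below does that; it is an assumption about method, not necessarily how the table above was produced. It counts loaded files rather than distributions, it only loads the module rather than running the example code, and the `count_loaded_files` name is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count the files listed in %INC after loading a module
# in a fresh perl process. Returns nothing if the module cannot be loaded.
sub count_loaded_files {
    my ($module) = @_;
    die "Not a module name: $module\n"
        unless $module =~ /\A\w+(?:::\w+)*\z/;

    # $module interpolates into this string; %INC does not.
    my $code = "exit 1 unless eval { require $module; 1 }; print scalar(keys %INC)";

    # Run a separate interpreter so that nothing loaded here is counted.
    open my $pipe, '-|', $^X, '-e', $code
        or die "Cannot run $^X: $!";
    my $count = <$pipe>;
    close $pipe;

    return unless defined $count && $count =~ /\A\d+\z/;
    return $count - 1;    # leave out the module's own file
}

for my $module (qw(Text::Markdown Text::MultiMarkdown Text::Markdown::Discount)) {
    my $count = count_loaded_files($module);
    printf "%-26s %s\n", $module, defined $count ? $count : 'not installed';
}
```

Modules that load extra code lazily at conversion time will be under-counted by this approach.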
I've been building a corpus of markdown samples, which I used to compare the output generated by the different modules. This is similar to Gruber's testsuite (zip file), but has smaller files, to make it easier to identify exactly what the differences are.
I've mainly focussed on Text::Markdown, DR::SunDown, Text::Markdown::Discount, and Text::Markdown::Hoedown. The differences I've found so far:
List items can wrap across multiple lines:
* This is the first bullet,
which is longer than one line.
* This is the second bullet,
which is also on multiple lines.
* You don't have to indent the bullet
But there must be a space following the bullet marker.
The third bullet is the critter. Discount is the only one that gets it right.
I've reported bugs on the others.
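A comparison like this can be automated by grouping the modules according to the HTML they produce for a given sample. The following is a minimal sketch of that idea, not the tooling used for this review: the `group_by_output` helper is made up for illustration, the indentation in the sample is a guess, and only the installed modules are compared.

```perl
use strict;
use warnings;

# Hypothetical helper: group converter names by the HTML they produce for one
# markdown sample, so that any disagreement stands out.
sub group_by_output {
    my ($markdown, %converters) = @_;
    my %names_for;
    for my $name (sort keys %converters) {
        my $html = $converters{$name}->($markdown);
        # Collapse whitespace so layout-only differences are ignored.
        # This also hides whitespace changes inside <pre>, which may matter.
        $html =~ s/\s+/ /g;
        $html =~ s/^ | $//g;
        push @{ $names_for{$html} }, $name;
    }
    return \%names_for;
}

# Conversion function for each module, as used in the examples above.
my %function_for = (
    'DR::SunDown'              => 'markdown2html',
    'Text::Markdown'           => 'markdown',
    'Text::Markdown::Discount' => 'markdown',
    'Text::Markdown::Hoedown'  => 'markdown',
);

# Use whichever of the four modules are installed.
my %converters;
for my $module (sort keys %function_for) {
    (my $file = "$module.pm") =~ s{::}{/}g;
    next unless eval { require $file; 1 };
    my $convert = $module->can($function_for{$module}) or next;
    $converters{$module} = $convert;
}

# Part of the list sample above; the indentation here is an assumption.
my $sample = <<'MARKDOWN';
* This is the first bullet,
  which is longer than one line.
* You don't have to indent the bullet
But there must be a space following the bullet marker.
MARKDOWN

print "None of the listed modules are installed.\n" unless %converters;

my $groups = group_by_output($sample, %converters);
for my $html (sort keys %$groups) {
    print join(', ', @{ $groups->{$html} }), ":\n$html\n\n";
}
```

If every module agrees, a single group is printed; each extra group is a behaviour worth looking at by hand.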
Text::Markdown is pretty battle-hardened, but Discount, SunDown and Hoedown are a lot faster. SunDown is no longer maintained, so don't use that.
For basic usage, and if you want a pure-Perl solution, Text::Markdown is the one to go with.
If you want better performance, then look at Discount or Hoedown.
If you want to write your own markdown processor, then Markdent looks like your best option.
This is a list of modules that don't meet the criteria for this review, but which might be of interest, as they're markdown-related.