Posted by: tonygurney | October 28, 2010

Pandoc – a universal document converter

If you need to convert files from one markup format into another, pandoc is your Swiss Army knife. Need to generate a man page from a markdown file? No problem. LaTeX to DocBook? Sure. HTML to MediaWiki? Yes, that too. Pandoc can read markdown and (subsets of) reStructuredText, HTML, and LaTeX, and it can write plain text, markdown, reStructuredText, HTML, LaTeX, ConTeXt, RTF, DocBook XML, OpenDocument XML, ODT, GNU Texinfo, MediaWiki markup, groff man pages, EPUB ebooks, and S5 and Slidy HTML slide shows. PDF output (via LaTeX) is also supported with the included markdown2pdf wrapper script.
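As a rough illustration of the three conversions mentioned above, here is a minimal Python sketch that builds the corresponding pandoc command lines. It uses pandoc's `-f` (from), `-t` (to), `-s` (standalone) and `-o` (output file) options; the helper names and the file names are made up for the example, and `run_conversion` assumes a `pandoc` executable is installed and on your PATH.

```python
import shutil
import subprocess


def pandoc_command(source, target, from_format, to_format):
    """Build the argument list for a single pandoc conversion."""
    return ["pandoc", "-f", from_format, "-t", to_format, "-s", "-o", target, source]


def run_conversion(source, target, from_format, to_format):
    """Run one conversion; raises if pandoc is missing or exits with an error."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc executable not found on PATH")
    subprocess.run(pandoc_command(source, target, from_format, to_format), check=True)


# Hypothetical file names, one entry per conversion named in the post.
EXAMPLES = [
    pandoc_command("README.md", "tool.1", "markdown", "man"),       # markdown to man page
    pandoc_command("paper.tex", "paper.xml", "latex", "docbook"),   # LaTeX to DocBook
    pandoc_command("page.html", "page.wiki", "html", "mediawiki"),  # HTML to MediaWiki
]

if __name__ == "__main__":
    # Only print the commands; call run_conversion() to actually execute one.
    for argv in EXAMPLES:
        print(" ".join(argv))
```

Keeping the command construction separate from the call that runs it makes it easy to inspect what would be executed before touching any files.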

Pandoc understands a number of useful markdown syntax extensions, including document metadata (title, author, date); footnotes; tables; definition lists; superscript and subscript; strikeout; enhanced ordered lists (start number and numbering style are significant); delimited code blocks; markdown inside HTML blocks; and TeX math. Other options include “smart” quotes, dashes, and ellipses; syntax highlighting; and automatically generated tables of contents. If strict markdown compatibility is desired, all of these extensions can be turned off with a command-line flag.

Pandoc includes a Haskell library and a standalone executable. The library has a separate module for each input and output format, so adding a new input or output format just requires adding a new module.
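To make the one-module-per-format idea concrete, here is a toy sketch in Python (not pandoc's actual Haskell code). It assumes that readers and writers meet at a shared in-memory document structure, so each new format needs only one new function registered in a table. Every name here (`Header`, `Para`, `READERS`, `WRITERS`, `convert_text`) is hypothetical, and the "markdown" reader handles only headers and plain paragraphs.

```python
from dataclasses import dataclass


@dataclass
class Header:
    level: int
    text: str


@dataclass
class Para:
    text: str


def read_markdown(source):
    """Tiny reader: '#'-style headers and plain paragraphs only."""
    blocks = []
    for chunk in source.strip().split("\n\n"):
        chunk = " ".join(chunk.split())  # collapse internal line breaks
        if not chunk:
            continue
        if chunk.startswith("#"):
            level = len(chunk) - len(chunk.lstrip("#"))
            blocks.append(Header(level, chunk[level:].strip()))
        else:
            blocks.append(Para(chunk))
    return blocks


def write_html(blocks):
    """Tiny HTML writer for the two block types above."""
    out = []
    for block in blocks:
        if isinstance(block, Header):
            out.append(f"<h{block.level}>{block.text}</h{block.level}>")
        else:
            out.append(f"<p>{block.text}</p>")
    return "\n".join(out)


def write_mediawiki(blocks):
    """Tiny MediaWiki writer; adding it required no change to the reader."""
    out = []
    for block in blocks:
        if isinstance(block, Header):
            marks = "=" * block.level
            out.append(f"{marks} {block.text} {marks}")
        else:
            out.append(block.text)
    return "\n\n".join(out)


# One entry per format: supporting a new format means adding one function here.
READERS = {"markdown": read_markdown}
WRITERS = {"html": write_html, "mediawiki": write_mediawiki}


def convert_text(source, from_format, to_format):
    """Parse with the chosen reader, then render with the chosen writer."""
    return WRITERS[to_format](READERS[from_format](source))


if __name__ == "__main__":
    sample = "# Title\n\nHello world"
    print(convert_text(sample, "markdown", "html"))
    print(convert_text(sample, "markdown", "mediawiki"))
```

With this layout, M readers and N writers give M × N conversions from only M + N pieces of code, which is why a new format is cheap to add.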

Pandoc is free software, released under the GPL. © 2006–2010 John MacFarlane.

