Markdown/Textile conversion script

WattsM
Posts: 40
Joined: 2003-11-13 13:50:17
Location: San Jose, CA

Post by WattsM »

On the exceedingly off-chance someone else might want this...

I'm starting a webzine using Textpattern as its content management system. The stories in it are going to be formatted with Markdown, a simple way of marking up text. Markdown and its second cousin Textile appear in a great deal of weblog and open source CMS software, either natively or as plugins.

The thing is, I'm getting submissions for the magazine in Word and RTF format, and I wanted to have some way of making the conversion a little easier. So, I came up with this script, which I've saved in the Macros folder as "Markdown.pl":

#Nisus Macro Block
#source clipboard
#destination clipboard
#Send as RTF
#Before Execution
#Clipboard 1
#Copy
#After Execution
#Paste
#Clipboard 0
#End Nisus Macro Block
use strict;
use warnings;

while (my $line = <>) {
    # Inline styles: each pattern looks for a brace group that contains the
    # style control word and a \langNNNN word, followed by the styled text.
    $line =~ s/\{.*?\\ul.*\\lang\d+\s+?(.*?)\}/_${1}_/ig;    # underline
    $line =~ s/\{.*?\\i.*\\lang\d+\s+?(.*?)\}/_${1}_/ig;     # italic
    $line =~ s/\{.*?\\b.*\\lang\d+\s+?(.*?)\}/**${1}**/ig;   # bold
    $line =~ s/\\tab\s//ig;             # delete tabs
    $line =~ s/^\\par$//ig;             # delete blank lines
    $line =~ s/\\par/\\par\r\\par/ig;   # make double CRs
    $line =~ s/\\'c9/.../ig;            # ellipses (Mac Roman 0xC9)
    $line =~ s/\\emdash/--/ig;          # em-dash
    $line =~ s/\\endash/---/ig;         # en-dash
    print $line;
}
This turns underlined and italic text into _this_ and boldface text into **this**. To use Textile instead, the only change you'd need to make is in the bold line: change the part that reads "**${1}**" to "*${1}*" (so then boldface will be *this,* with one asterisk instead of two). The script also tries to normalize the paragraphs to one blank line between each and to remove tabs. (It makes em-dashes into "--" and en-dashes into "---"; if you're not using SmartyPants--and you know if you are--you might want to remove the en-dash line.)
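
For anyone who would rather switch between Markdown and Textile without editing the regex each time, here is a minimal sketch that takes the bold marker as an argument. The convert_inline helper and the sample RTF fragments are made up for illustration; they assume styled runs shaped like {\b\lang1033 text}, which is a guess at the kind of input the macro expects, not confirmed NWX output.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the macro above: converts the inline
# styles on one line of RTF, using whatever bold marker is passed in.
sub convert_inline {
    my ($line, $bold) = @_;
    $line =~ s/\{.*?\\ul.*\\lang\d+\s+?(.*?)\}/_${1}_/ig;              # underline
    $line =~ s/\{.*?\\i.*\\lang\d+\s+?(.*?)\}/_${1}_/ig;               # italic
    $line =~ s/\{.*?\\b.*\\lang\d+\s+?(.*?)\}/${bold}${1}${bold}/ig;   # bold
    return $line;
}

# Assumed sample input: a single bold run inside ordinary text.
my $sample = 'Some {\b\lang1033 bold} text';

print convert_inline($sample, '**'), "\n";   # Markdown: Some **bold** text
print convert_inline($sample, '*'),  "\n";   # Textile:  Some *bold* text
```

Only the bold marker differs between the two syntaxes here; underline and italic both come out as _this_ either way.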

It's not perfect: if an entire paragraph is italicized, it misses it completely, and it doesn't know anything about paragraph-level formatting or styles. I can't guarantee that it'll work at all on non-English documents, or even on different versions of Nisus Writer Express (NWX); it parses the RTF based on what NWX 2.0.1 seemed to be generating. It also only works on selected text. (I'd like to have it work on the whole document if no text is selected, but it doesn't look like that's an option. And incidentally, Nisus guys, using "source front" and "destination front replace" makes NWX crash!)
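
A related caveat, stated under an assumption about the input: the ".*" in the style patterns is greedy, so if two styled runs ever land on the same line, one match can stretch from the first "{" to the last "}" and swallow everything in between. Whether NWX actually writes RTF that way is not something this sketch can confirm. The sample line and both helper names below are hypothetical, and the second pattern is just one possible tightening (using [^{}]* so a match cannot leave its own brace group), not a tested replacement for the macro.

```perl
use strict;
use warnings;

# The italic pattern as used in the macro (braces escaped).
sub italic_greedy {
    my ($line) = @_;
    $line =~ s/\{.*?\\i.*\\lang\d+\s+?(.*?)\}/_${1}_/ig;
    return $line;
}

# Hypothetical variant: [^{}]* in place of .* keeps each match inside a
# single brace group.
sub italic_per_group {
    my ($line) = @_;
    $line =~ s/\{[^{}]*?\\i[^{}]*\\lang\d+\s+?([^{}]*?)\}/_${1}_/ig;
    return $line;
}

# Assumed input: an italic run and a bold run on the same line.
my $sample = '{\i\lang1033 one} and {\b\lang1033 two}';

print italic_greedy($sample), "\n";      # prints: _two_
print italic_per_group($sample), "\n";   # prints: _one_ and {\b\lang1033 two}
```

With the per-group version, the bold run is left untouched for the bold pattern to pick up afterwards.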
rmark
Official Nisus Person
Posts: 430
Joined: 2003-02-11 10:49:05
Location: Solana Beach, CA

Re: Markdown/Textile conversion script

Post by rmark »

Great work, WattsM. I hope someone else finds this useful.

Regarding
WattsM wrote:Nisus guys, using "source front" and "destination front replace" makes NWX crash!)
Indeed. We know about this one and have it on the "To Fix" list already.
Write On!
Mark Hurvitz
VP for Communications *RETIRED*
Nisus Software Inc.