Okay, I need to go through ~2700 articles, some large, some very large, and programmatically decide which of them use an interesting amount of Textile formatting. Don’t ask why; it’s not polite to stare at the unfortunate. Just smile and nod and think about how you’d puzzle this one out.

So I’ve got a bunch of content, some of which may be in Textile format; how can I tell without looking at every piece and making a judgement call? First, I tried comparing the text before and after RedCloth processing, but that didn’t work: RedCloth does some whitespace cleanup (among other things), which means that feeding it a string devoid of Textile markup doesn’t guarantee you’ll get the same string back. A small part of me is offended by this – it breaks the principle of least astonishment – but I’m enough of a situational ethicist to value the good in what it’s doing for me and move on.

I started poking around in RedCloth 3.0.4, found the bits which were perpetually going to alter some of my strings (for the better!) and discovered something wonderful – you can pass an ordered list of rules into RedCloth#to_html, allowing you to tweak it just the way you like it. Do you like using – gasp! – long dashes? The :glyphs_textile rule converts them to an HTML entity (&#8212;, to be precise) along with a bunch of other conversions. And there are even handy shortcuts to specify all of the Textile markup rules (:textile) and all of Markdown (:markdown). Yay! Flexibility rewards those who read the code closely!
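Here is a rough sketch of the kind of comparison an ordered rule list makes possible: run the text once with only the always-on cleanup, once with the markup rules added, and score the difference between the two outputs instead of comparing against the raw input. Everything below is a stand-in I made up for illustration (cleanup_only, with_markup and markup_hits are hypothetical and do not call RedCloth), so treat it as the shape of the idea, not the real thing.

```ruby
# Hypothetical stand-ins for two conversion passes. Neither one is RedCloth;
# they only mimic "cleanup without markup rules" and "cleanup plus markup rules".

# Tidies whitespace but leaves any markup alone.
def cleanup_only(text)
  text.strip.gsub(/[ \t]+/, ' ')
end

# Same cleanup, plus two toy rules: *strong* and _emphasis_.
def with_markup(text)
  cleanup_only(text)
    .gsub(/\*(\S[^*]*?)\*/) { "<strong>#{$1}</strong>" }
    .gsub(/(?<!\w)_(\S[^_]*?)_(?!\w)/) { "<em>#{$1}</em>" }
end

# Number of opening tags the markup rules added on top of the cleanup pass.
def markup_hits(text)
  count_tags = ->(html) { html.scan(/<\w+[^>]*>/).size }
  count_tags.call(with_markup(text)) - count_tags.call(cleanup_only(text))
end

plain   = "  just   some plain text  "
textile = 'this is *bold* and _emphasised_'

puts with_markup(plain) != plain  # naive before/after check: cleanup alone changes the string
puts markup_hits(plain)           # cleanup is factored out of the score
puts markup_hits(textile)         # only the markup conversions are counted
```

An article would then count as "interesting" when its score passes whatever threshold you pick.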

But why, then, if I do something like:

    RedCloth.new('foo -- bar -- baz').to_html(:glyphs_textile, :textile)

do I get this:

    foo - bar - baz

It turns out that the glyph and inline rules are run in the same method, inline, which looks like this:

    def inline( text ) 
        [/^inline_/, /^glyphs_/].each do |meth_re|
            @rules.each do |rule_name|
                method( rule_name ).call( text ) if rule_name.to_s.match( meth_re )
            end
        end
    end

See the problem? The outer loop walks the prefixes in a fixed order, inline first, so every inline rule runs before any glyph rule no matter where you put them in the list you passed in. It should look more like this:

    def inline( text )
        # Single pass: inline_* and glyphs_* rules run in the order given in @rules
        @rules.each do |rule_name|
            method( rule_name ).call( text ) if rule_name.to_s.match(/^(inline|glyphs)_/)
        end
    end

… and that’s exactly what I monkey-patched into my app.
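To see the ordering difference in isolation, here is a small self-contained sketch. ToyProcessor and its two rules (inline_toy_del, glyphs_toy_dash) are hypothetical, heavily simplified stand-ins invented for this demo; they are not RedCloth's real rules or regular expressions. Only the two dispatch loops mirror the snippets above.

```ruby
# Toy stand-in for the rule dispatcher. It is NOT RedCloth; the two rules
# below are simplified, hypothetical versions made up for this demo.
class ToyProcessor
  attr_reader :calls

  def initialize(*rules)
    @rules = rules
    @calls = []
  end

  # Original shape: one pass over @rules per prefix, inline_ first.
  def inline_grouped(text)
    text = text.dup
    [/^inline_/, /^glyphs_/].each do |meth_re|
      @rules.each do |rule_name|
        method(rule_name).call(text) if rule_name.to_s.match(meth_re)
      end
    end
    text
  end

  # Patched shape: a single pass, so the caller's ordering wins.
  def inline_ordered(text)
    text = text.dup
    @rules.each do |rule_name|
      method(rule_name).call(text) if rule_name.to_s.match(/^(inline|glyphs)_/)
    end
    text
  end

  private

  # Toy span rule: -some text- becomes <del>some text</del>.
  def inline_toy_del(text)
    @calls << :inline_toy_del
    text.gsub!(/(^|\s)-(\S.*?\S|\S)-(?=\s|$)/) { "#{$1}<del>#{$2}</del>" }
  end

  # Toy glyph rule: a double hyphen becomes an em dash entity.
  def glyphs_toy_dash(text)
    @calls << :glyphs_toy_dash
    text.gsub!(/\s?--\s?/, '&#8212;')
  end
end

sample  = 'foo -- bar -- baz'
grouped = ToyProcessor.new(:glyphs_toy_dash, :inline_toy_del)
ordered = ToyProcessor.new(:glyphs_toy_dash, :inline_toy_del)

puts grouped.inline_grouped(sample)  # span rule runs first despite being listed last
puts ordered.inline_ordered(sample)  # glyph rule runs first, as requested
```

Both processors get the same rule list, glyph rule first. With the grouped loop the toy span rule still gets to the hyphens before the glyph rule can; with the single loop the order you asked for is the order you get.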

Meanwhile, judging from the Textile demo, Textile should handle my double-dash example above; this is also a bug in the Ruby implementation – and all this time I blamed Textile … sorry! I think I’ll send in a patch as penance.

Things I’ve Learned

  • Making your code flexible is good for other developers.
  • Being good to other developers means they might send you patches when they find problems.
  • There’s a hidden feature in Textile which allows you to disable Textile for a chunk of text: <notextile>-not struck-</notextile> -or- ==-also not struck-==
  • Monkey-patching is a nice way to get something fixed in your app quickly.
  • It’s handy to be able to silence constant-redefinition warnings when you have to monkey-patch a constant.
  • Even smart people make bugs (so what hope is there for me?)
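
On the monkey-patching and constant-warning points in the list above, here is a minimal plain-Ruby sketch of one way to do it. The quietly helper, ToyLib and DASH are all made up for illustration; Rails has a similar helper in ActiveSupport (silence_warnings), which is what you would more likely reach for inside a Rails app.

```ruby
# Hypothetical helper: mute Ruby warnings for the duration of the block,
# then restore the previous setting even if the block raises.
def quietly
  old_verbose = $VERBOSE
  $VERBOSE = nil
  yield
ensure
  $VERBOSE = old_verbose
end

# Made-up library with a constant we want to override.
module ToyLib
  DASH = '-'
end

# Without the wrapper, Ruby prints an "already initialized constant" warning here.
quietly { ToyLib.const_set(:DASH, '&#8212;') }

puts ToyLib::DASH
```

Keeping the muted region down to the one reassignment means every other warning still shows up.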



Published

19 December 2007

Category

ruby/rails