Chicken Scratches

Developing ideas on developing.

Archive for January, 2008

Programming for Not-so-dummies

January 30th, 2008 by Eddie Sullivan

Somebody posted a reply to my last post about Django's autoescaping mechanism. (They were too cowardly to post on my site, so they posted it at reddit.com.) The person said something like, "you shouldn't trust yourself to remember to escape your own variables." Oh, heaven forbid I trust myself to be a good programmer! That really got me thinking about the recent trends towards designing frameworks, APIs, even languages for mediocre programmers. We are sacrificing speed, simplicity and efficiency to make common bugs less common, trying to design away the mistakes inexpensive and poorly-trained computer scientists make.

Now, of course when I say "recent trends," I should acknowledge that this type of thinking has been around for decades. It was first truly popularized with the introduction of Java. Some people forget to free memory, so add garbage collection. Some people forget to bounds-check arrays, so make that automatic. Ooh, pointers are scary! Let's get rid of them. We can't allow our outsourced foreign coders direct access to memory!

Good training, along with working for nearly a decade as an embedded software engineer, has taught me good programming habits. I've learned to be conscious of memory leaks, to always check return values, to program defensively, to bounds-check. I've created software for shipping products in such low-level and "unprotected" languages as C++, C, and even Assembly. I've written production code that fit in less memory than a Java byte-code interpreter requires. And of course, I'm not alone in this. There is a large subset of software developers who had to learn to program carefully, due to constraints out of their control. These types of good programming habits carry over into whatever platform or language is used.

I feel a lot of the new safety-net style approaches are simply enabling poor programmers to work on increasingly sophisticated projects. To get back to the example from my last post, Django is a wonderful tool. You can program a sophisticated database-centered multi-user web application without even knowing how to spell SQL. Django's recent addition of autoescaping, and more importantly, the enabling of autoescaping globally by default, is yet another example of API-design for the lowest common denominator. (I should note that I love Django. It saves me writing a lot of redundant code and provides a lot of things for free that I would otherwise need to write from scratch, so I don't mean to pick on Django here. It just happened to be the catalyst for this discussion.)

It's not all bad

I know I'm starting to sound like an old curmudgeon. "In my day, we didn't have variables, we just had to carry around rocks to count!" I'm not that old, really. And I'm certainly not advocating we go back to the days before garbage collection and bounds checking. Given the potential security ramifications of memory-management bugs, these things are especially important. I just want to urge caution before binding developers in a straitjacket. Rather than trying to design away all potential bugs at the level of the language or API, emphasize and facilitate good programming and testing practices. I've never once bought a For Dummies book, and I never will. Please don't force me to use a For Dummies application framework.

Escaping autoescape in Django

January 28th, 2008 by Eddie Sullivan

I've been pleased with the Django web-application platform. Programming in Python is fun and fast, and Django provides many things for free that would be a lot of work to program from scratch. I've also enjoyed developing with the "bleeding edge" development version of Django. I like being able to use the latest features before they make it into the official releases. Whenever I stumble across what I think is a bug in Django, the first thing I do is "svn update" in my Subversion checkout, and most of the time the bug has already been fixed in the trunk. Recently, however, a major change in the Django development version has caused all of my projects to stop working! Needless to say, this was a bit frustrating. The change was the addition of "autoescaping" in Django.

What it is

The autoescape setting, referred to in the Django documentation as Automatic HTML escaping, means that any variable inserted into a rendered template gets the function django.utils.html.escape called on it. You can see what this function does in the file trunk/django/utils/html.py, but essentially as of today's code base it applies the following set of substitutions:
your_string.replace('&', '&amp;').replace('<', '&lt;'). \
    replace('>', '&gt;').replace('"', '&quot;').replace("'", '&#39;')
(Ironically, I had a lot of trouble getting that code fragment to look right, due to WordPress's own autoescaping, which I ended up disabling altogether. Aaarrgh!) On the surface, this seems like a useful feature. It seems to have been done to "idiot-proof" the template language, and to prevent cross-site-scripting vulnerabilities in case there is user-generated text stored in variables and the programmer forgets to call the appropriate escape function.
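
For illustration, the substitutions above can be written as a small standalone Python function. This is a sketch of the idea rather than Django's actual code: the name escape_html and the sample string are made up for this example. The ampersand is replaced first so that the entities produced by the later replacements are not themselves escaped.

```python
def escape_html(text):
    """Replace the five HTML-significant characters with entities.

    Hypothetical helper for illustration; not Django's implementation.
    """
    return (
        text.replace("&", "&amp;")   # first, so later entities stay intact
            .replace("<", "&lt;")
            .replace(">", "&gt;")
            .replace('"', "&quot;")
            .replace("'", "&#39;")
    )


# A made-up value that mixes markup, quotes and an ampersand.
sample = '<a href="/x">Tom & Jerry\'s</a>'
print(escape_html(sample))
# prints: &lt;a href=&quot;/x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```

Note that running the function on text that is already escaped escapes it a second time, which is exactly what happens to generated HTML stored in a template variable.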

The problems

In general, I hate this kind of stuff. I can't stand it when Microsoft Word capitalizes the first letter of my sentences. If I wanted a capital letter, I would have held down the shift key! I hate it when the rear defroster in my car shuts off automatically after 30 minutes. Hello! Just because 30 minutes have passed doesn't mean it's not still raining out; doesn't mean I don't still live in New England! Essentially, I don't like it when machines think they are smarter than I am, or when they try to do what I mean, rather than what I say. If I had wanted to escape my variables, I would have escaped them. That's the first problem. This would not be such a big issue if it weren't for the second problem: this new disruptive feature is turned on by default, with no easy way to disable it across the board. I have a lot of programmatically generated HTML and JavaScript code contained in template variables. As you can imagine: instant breakage!

How to turn it off

The Django documentation does not have much good information on how to disable this new feature. Supposedly you can add the text "|safe" to every variable reference. Obviously this is impractical on even the smallest sites. Supposedly also, you can surround every template with "{% autoescape off %}" and "{% endautoescape %}". This could be a viable option for a small site, but for someone who has to manage several sites, each with a large number of templates, this quickly becomes cumbersome. The documentation claims that the "autoescape off" setting will cascade to child templates, but as of the version I have, this doesn't work. After some grepping, I came to a temporary solution. It turns out the constructor to the Context class has an undocumented new boolean parameter called, appropriately enough, autoescape. Its default value is True. I briefly considered adding "autoescape=False" to every call to the Context constructor, but quickly abandoned that idea. The solution I came up with was to actually edit the Django source code, in trunk/django/template/context.py. I modified the constructor so that the default value of autoescape is now False. On line 12:
class Context(object):
    "A stack container for variable context"
    def __init__(self, dict_=None, autoescape=False):
Hopefully, Django will provide a more permanent solution in the future. Ideally, it would be a setting in the "settings.py" file. For now, this small change allows me to continue developing with this useful set of tools.
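
A less invasive way to get a similar effect, sketched here under some assumptions, is to define a factory once and call it in place of Context, instead of editing Django's source. The Context class below is only a stand-in with the same two constructor parameters, so the example runs without Django installed; in a real project you would import django.template.Context instead. UnescapedContext is a hypothetical name, not something Django provides.

```python
from functools import partial


class Context(object):
    """Stand-in for django.template.Context (assumption: same two parameters)."""

    def __init__(self, dict_=None, autoescape=True):
        self.dicts = [dict_ or {}]
        self.autoescape = autoescape


# Hypothetical factory: builds a Context with autoescaping off unless told otherwise.
UnescapedContext = partial(Context, autoescape=False)

ctx = UnescapedContext({"body": "<b>generated markup</b>"})
print(ctx.autoescape)  # prints: False
```

This still means touching each call site, but the policy lives in one place and the Django checkout stays unmodified, so an "svn update" will not conflict with it.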

Important note (Added Feb 26, 2008)

This page is getting a lot of hits, so I want to make clear that I do not recommend making the above change to the Django source permanently. This should be viewed as a TEMPORARY fix only, until you have time to migrate all your templates and code to deal with autoescaping correctly, that is, to keep it on except where you really need it to be off.