Dec 12

I use base36 for almost all object ids in tagz. I used to manually convert every base36 value into an integer before doing a lookup. Yesterday it dawned on me that I could subclass QuerySet and remove a lot of boilerplate code. The idea is that I can do something like

Model.objects.get(pk_base36='2kk')

instead of something like

Model.objects.get(pk=_decode('2kk'))

The QuerySet wrapper internally decodes the base36 string into an integer and then looks up by that id. Code follows.

import string

from django.db import models
from django.db.models.query import QuerySet

DIGITS = string.digits + string.ascii_lowercase
BASE = len(DIGITS)

def _encode(n) :
    '''Encode a non-negative integer as a base36 string.'''
    l = []
    while True :
        n, rem = divmod(n, BASE)
        l.append(DIGITS[rem])
        if n < 1 :
            break
    l.reverse()
    return ''.join(l)

def _decode(s) :
    '''Decode a base36 string (case-insensitive) into an integer.'''
    num = 0
    for c in s.lower() :
        pos = DIGITS.index(c)
        num = (num * BASE) + pos
    return num

class Base36KeyedQuerySet(QuerySet) :
    def _handle_pk_base36(self, kwargs) :
        if 'pk_base36' in kwargs :
            pk_base36 = kwargs['pk_base36']
            assert ('id' not in kwargs and 'pk' not in kwargs)
            kwargs['pk'] = _decode(pk_base36)
            del kwargs['pk_base36']
        return kwargs

    # Name the class explicitly in super(): super(self.__class__, self)
    # recurses forever as soon as this QuerySet is subclassed.
    def get(self, *args, **kwargs) :
        kwargs = self._handle_pk_base36(kwargs)
        return super(Base36KeyedQuerySet, self).get(*args, **kwargs)

    def filter(self, *args, **kwargs) :
        kwargs = self._handle_pk_base36(kwargs)
        return super(Base36KeyedQuerySet, self).filter(*args, **kwargs)

    def exclude(self, *args, **kwargs) :
        kwargs = self._handle_pk_base36(kwargs)
        return super(Base36KeyedQuerySet, self).exclude(*args, **kwargs)

    def get_or_create(self, *args, **kwargs) :
        kwargs = self._handle_pk_base36(kwargs)
        return super(Base36KeyedQuerySet, self).get_or_create(*args, **kwargs)

class Base36KeyedManager(models.Manager) :
    def get_query_set(self) :
        return Base36KeyedQuerySet(self.model)

class Base36KeyedModel(models.Model) :
    def get_base36_id(self) :
        return _encode(self.id)

    class Meta :
        abstract = True

class ExampleModel(Base36KeyedModel) :
    field1 = models.CharField(max_length=255)
    field2 = models.CharField(max_length=255)

    objects = Base36KeyedManager()
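
As a quick sanity check of the encoding itself, here is a standalone sketch that runs without Django. The names `encode_base36` and `decode_base36` are made up for this illustration and are not part of the code above. The decoder relies on Python's built-in `int(s, 36)` instead of a hand-written loop.

```python
import string

DIGITS = string.digits + string.ascii_lowercase  # '0'..'9' followed by 'a'..'z'

def encode_base36(n):
    """Encode a non-negative integer as a lowercase base36 string."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    chars = []
    while True:
        n, rem = divmod(n, 36)
        chars.append(DIGITS[rem])
        if n == 0:
            break
    return ''.join(reversed(chars))

def decode_base36(s):
    """Decode a base36 string; int() accepts base 36 and ignores case."""
    return int(s, 36)

print(decode_base36('2kk'))   # 3332
print(encode_base36(3332))    # 2kk
```

If the built-in is acceptable, `_decode` above could probably be reduced to the same one-liner.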
Nov 04

For various reasons beyond the scope of this blog, I’ve had to quit my current job as a Programming Specialist at Position2. So, if you’re looking for a Python / Django hacker in Bangalore (although telecommuting certainly is an option I’d consider), drop me a line.

Oct 01

This one’s pretty basic (from the docs), but I end up using it all the time. Being able to “AND” and “OR” django querysets can really simplify a lot of code. Here’s an instance (a simplification of the setup I have in tagz). Let’s start by defining 2 models.

from django.db import models

class Tag( models.Model ) :
    text = models.CharField(max_length=255)

class Post( models.Model ) :
    link  = models.URLField(max_length=2048)
    title = models.CharField(max_length=255)
    tags  = models.ManyToManyField(Tag)

Now, let’s say I want to get all Posts tagged as django or python.

qs = Post.objects.filter(tags__text='python') | Post.objects.filter(tags__text='django')

Now, as intuitive as it might seem, using AND to get the Posts tagged with both doesn’t seem to work.

qs = Post.objects.filter(tags__text='python') & Post.objects.filter(tags__text='django')
# qs.count() returns 0

So, we end up chaining filters like this:

qs = Post.objects.filter(tags__text='python').filter(tags__text='django')

Which boils down to a simple for loop.

def filter_tags( tags ) :
    '''
    tags: a list of strings
    '''
    p = Post.objects.all()
    for t in tags :
        p = p.filter(tags__text=t)
    return p
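
To make the intended semantics concrete without a database, here is a plain-Python sketch. The `POSTS` dict and both function names are invented for this illustration. It only mimics what the querysets above are meant to return and says nothing about how Django builds the SQL.

```python
# In-memory stand-in for the Post/Tag relation: title -> set of tag texts.
POSTS = {
    'intro to django': {'python', 'django'},
    'python tips': {'python'},
    'rails vs django': {'ruby', 'django'},
}

def posts_with_any(tags):
    """OR semantics: posts carrying at least one of the tags."""
    wanted = set(tags)
    return sorted(title for title, post_tags in POSTS.items() if post_tags & wanted)

def posts_with_all(tags):
    """AND semantics: start with every post and narrow once per tag."""
    remaining = set(POSTS)
    for tag in tags:
        remaining = {title for title in remaining if tag in POSTS[title]}
    return sorted(remaining)

print(posts_with_any(['python', 'django']))  # all three posts
print(posts_with_all(['python', 'django']))  # ['intro to django']
```

The second function has the same shape as `filter_tags`: each pass through the loop can only shrink the result.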
Aug 29

In one of my django projects, I use a lot of recursive template tags, which seem to slow rendering down quite a bit. I looked at the code in django.template.loaders.filesystem:

def load_template_source(template_name, template_dirs=None):
    tried = []
    for filepath in get_template_sources(template_name, template_dirs):
        try:
            return (open(filepath).read().decode(settings.FILE_CHARSET), filepath)
        except IOError:
            tried.append(filepath)
    if tried:
        error_msg = "Tried %s" % tried
    else:
        error_msg = "Your TEMPLATE_DIRS setting is empty. Change it to point to at least one template directory."
    raise TemplateDoesNotExist, error_msg
load_template_source.is_usable = True

Looks like the template is re-read from the filesystem every time it is loaded. This gets really bad with custom template tags and inclusion tags. Couldn’t we just load the file into memory and, the next time it’s needed, call os.stat() on the file to check whether it has been modified, reloading from disk only if it has? In the end I settled on a compromise: don’t reload templates in production mode, and disable the template cache in debug mode.

Here’s template_cache.py

# -*- coding: utf-8 -*-

from django.template import loader, TemplateDoesNotExist
from django.conf import settings

template_cache = {}
def cached_loader( template_name, template_dirs=None ) :
    global template_cache
    t = template_cache.get(template_name)
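    if not t :
        # Temporarily drop the first loader (assumed to be cached_loader itself)
        # so that the remaining loaders do the actual lookup.
        old_loaders = settings.TEMPLATE_LOADERS[:]
        settings.TEMPLATE_LOADERS = old_loaders[1:]
        loader.template_source_loaders = None   # Force the loader list to be rebuilt
        try :
            template_cache[template_name] = t = loader.find_template_source( template_name, template_dirs )
        finally :
            settings.TEMPLATE_LOADERS = old_loaders # To avoid recursively calling cached_loader
            loader.template_source_loaders = None   # Rebuild again with the full list, even on error
    return t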
cached_loader.is_usable = not settings.DEBUG    # Avoid caching in debug mode