Jeethu’s Blog

October 2, 2008

tagz.in is up again

Filed under: Linux, tagz — admin @ 1:49 pm

Finally, after 2 hours of hard work, I got tagz up and running. It’s currently running on my VPS (the same machine that was running the staging server).

Here’s the background on the issue.
The main server was running Ubuntu 8.04 (Hardy Heron).
Last night, I ran a standard apt-get update; apt-get upgrade.
It upgraded libc and libc-dev, and that was it. After that, all Perl processes would just hang, spinning busy on the CPU. I couldn’t do a thing. I tried running a couple of the offending scripts under strace, and they all hung on a clone() syscall. I tried restarting the AMI, but the problem persisted. The worst part was that I couldn’t even get a db dump because pg_dump wouldn’t work, and the last snapshot I had was about 17 hours old.

So, here’s what I did. I terminated the postmaster instance, took a backup of the db directory, scp’d it to my VPS and tried using it there. Then I found that I had to recompile postgresql with --enable-integer-datetimes for it to accept the database. I did that and a few other tweaks (I’d switched DNS to point to the VPS early on), and here we have it, up and running.

I’ve got to move back to EC2 soon (the VPS won’t be able to handle the load for long). But this time I’m going back to Debian Stable. I’ve had enough of Ubuntu; I have no idea how something as innocuous as a libc upgrade can barf things up so badly.

tagz.in is down :(

Filed under: tagz — admin @ 11:40 am

Due to a libc upgrade gone awry, tagz.in has been down for the past 30 minutes. I’m working on bringing it up asap.

UPDATE: It’s up again, on a different machine.

October 1, 2008

Merging Django querysets

Filed under: Django, Python — admin @ 8:20 pm

This one’s pretty basic (from the docs), but I end up using it all the time. Being able to “AND” and “OR” django querysets can really simplify a lot of code. Here’s an instance (a simplification of the setup I have in tagz). Let’s start by defining 2 models.


from django.db import models

class Tag( models.Model ) :
    text = models.CharField(max_length=255)

class Post( models.Model ) :
    link  = models.URLField(max_length=2048)
    title = models.CharField(max_length=255)
    tags  = models.ManyToManyField(Tag)

Now, let’s say I want to get all Posts tagged as django or python.


qs = Post.objects.filter(tags__text='python') | Post.objects.filter(tags__text='django')

Now, as intuitive as it might seem, using AND doesn’t seem to work.


qs = Post.objects.filter(tags__text='python') & Post.objects.filter(tags__text='django')
# qs.count() returns 0

So, we end up chaining filters like this:


qs = Post.objects.filter(tags__text='python').filter(tags__text='django')

Which boils down to a simple for loop.


def filter_tags( tags ) :
    '''
    tags: a list of strings
    '''
    p = Post.objects.all()
    for t in tags :
        p = p.filter(tags__text=t)
    return p
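
A plausible reading of why `&` came back empty above, offered as an assumption rather than something checked against Django's source: with `&`, both conditions get tested against the same joined tag row, and no single tag can be both 'python' and 'django'. Chained filter() calls let each condition match a different tag row. Later Django releases may well treat `&` differently. The pure-Python sketch below only models the two readings; the `posts` data and the helper names are made up for illustration and are not Django APIs.

```python
# Toy data (hypothetical): post title -> set of tag texts attached to it.
posts = {
    "Django tips": {"python", "django"},
    "Perl hangs": {"perl", "linux"},
    "Python news": {"python"},
}

def same_row_and(posts, a, b):
    """Both conditions are tested against ONE tag row at a time."""
    return [title for title, tags in posts.items()
            if any(tag == a and tag == b for tag in tags)]

def chained_and(posts, wanted):
    """Each condition may be satisfied by a different tag row."""
    return [title for title, tags in posts.items()
            if all(any(tag == w for tag in tags) for w in wanted)]

def any_of(posts, wanted):
    """The OR case: one matching tag row is enough."""
    return [title for title, tags in posts.items()
            if any(tag in wanted for tag in tags)]
```

Here `same_row_and(posts, "python", "django")` finds nothing, while `chained_and(posts, ["python", "django"])` finds the one post carrying both tags.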

September 25, 2008

Tagz update

Filed under: tagz — admin @ 10:35 am

Just added a new feature to Tagz yesterday (Subscriptions). Subscriptions allow all registered users to subscribe to a set of tags. Internally we ‘AND’ all the tags for every subscription and ‘OR’ the results of every subscription. Additionally, we also support stemming (like we do on every other feed). I see that a couple of users have already started using it. I just wish more people would start posting more links (people tend to keep posts private) and commenting more often.
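
The matching rule described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code running on Tagz: `crude_stem` is a toy stand-in for a real stemmer, and the function names and sample data are hypothetical.

```python
def crude_stem(tag):
    """Toy stand-in for a real stemmer: lowercase and drop a trailing 's'."""
    tag = tag.lower()
    return tag[:-1] if tag.endswith("s") and len(tag) > 3 else tag

def matches_subscription(post_tags, subscription):
    """AND: every tag in the subscription must appear on the post (after stemming)."""
    stems = {crude_stem(t) for t in post_tags}
    return all(crude_stem(t) in stems for t in subscription)

def feed(posts, subscriptions):
    """OR: a post is included if it matches at least one subscription."""
    return [title for title, tags in posts
            if any(matches_subscription(tags, s) for s in subscriptions)]

# Hypothetical sample data: (title, tags) pairs and one user's subscriptions.
posts = [
    ("Django tips", ["django", "python"]),
    ("Perl hangs", ["perl", "linux"]),
    ("Python news", ["Pythons"]),
]
subscriptions = [["python", "django"], ["linux"]]
result = feed(posts, subscriptions)
```

With this data, `result` holds the first two titles: "Python news" stems to `python` but lacks `django`, so it fails the first subscription and has nothing for the second.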

September 2, 2008

Tagz is now live

Filed under: tagz — admin @ 9:24 pm

We silently launched tagz on September 1st. I posted about it on proggit yesterday afternoon. The response has been pretty positive. We had some trouble initially. I did consider launching a 2nd EC2 instance at one point, but then the load reduced. It’s relatively fast to launch another instance, change some settings and set up round robin DNS scheduling. I’ve also got a whole lot of feature requests to implement.

August 31, 2008

I’ve just bought a MacBook

Filed under: Life — admin @ 10:37 am

I’ve just bought a new MacBook and am busy setting up my dev environment on it. Things aren’t really as hard as I thought they’d be. With macports, OSX doesn’t really feel much different superficially from any linux distro.

August 29, 2008

It is true

Filed under: Uncategorized — admin @ 5:59 pm

My dear brother Thilak met with a minor accident this afternoon, and in the confusion that ensued, he’s spilt the beans on Tagz. It must’ve been painful to singlehandedly type the 228 word post (he’s got a cast on his right hand because of the accident). The UI is kinda crude, but functional. Actually, a couple of friends are already using/testing it. Well, we plan to release it sometime soon, but I honestly wish he hadn’t made it public so soon.

We’d been discussing this “`better` delicious reddit chimera” idea for quite some time now. Due to difficult personal circumstances in the past couple of months, I’ve been suffering from a terrible bout of insomnia. When the usual remedies for this (reading Nietzsche, driving through the city all night long, etc.) didn’t work, I started working on it. Then, on one of my infrequent visits to Mangalore, I showed a very crude prototype to Thilak and he was pretty enthusiastic about it. We set up a redmine instance, moved the mercurial repository to my VPS and we were up and running, with a couple of commits every night.

We’ve got a long way to go before I can call it release ready. Until then, all I can say is it’s written using django and python, with postgresql for the db. And the `undumb` or `not dumb` (or whatever) tags thing he’s hinting about isn’t really all that smart; it’s just plain old tagging with porter stemming to identify similar tags.

Low level template file caching in Django

Filed under: Django — admin @ 4:44 pm

In one of my django projects, I use a lot of recursive template tags, which seem to cause quite a bit of slowdown while rendering. I looked at the code in django.template.loaders.filesystem:


def load_template_source(template_name, template_dirs=None):
    tried = []
    for filepath in get_template_sources(template_name, template_dirs):
        try:
            return (open(filepath).read().decode(settings.FILE_CHARSET), filepath)
        except IOError:
            tried.append(filepath)
    if tried:
        error_msg = "Tried %s" % tried
    else:
        error_msg = "Your TEMPLATE_DIRS setting is empty. Change it to point to at least one template directory."
    raise TemplateDoesNotExist, error_msg
load_template_source.is_usable = True

It looks like the template is re-read from the filesystem every time a template is loaded. This gets really bad with custom template tags and inclusion tags. Couldn’t we just keep the file in memory and, the next time it’s needed, call os.stat() on it to check whether it has been modified, reloading from disk only if it has? In the end I settled on a compromise: don’t reload templates in production mode, and disable the template cache in debug mode.

Here’s template_cache.py


# -*- coding: utf-8 -*-

from django.template import loader, TemplateDoesNotExist
from django.conf import settings

template_cache = {}
def cached_loader( template_name, template_dirs=None ) :
    global template_cache
    t = template_cache.get(template_name)
    if not t :
        old_loaders = settings.TEMPLATE_LOADERS[:]
        # Temporarily drop this loader (assumed to be first in TEMPLATE_LOADERS)
        # to avoid recursively calling cached_loader
        settings.TEMPLATE_LOADERS = old_loaders[1:]
        loader.template_source_loaders = None
        try :
            template_cache[template_name] = t = loader.find_template_source( template_name, template_dirs )
        finally :
            settings.TEMPLATE_LOADERS = old_loaders
            # Reset inside finally so the loader list is rebuilt even if the lookup fails
            loader.template_source_loaders = None
    return t
cached_loader.is_usable = not settings.DEBUG    # Avoid caching in debug mode

June 26, 2008

Fun with mercurial precommit hooks

Filed under: Python — admin @ 1:47 pm

I use Mercurial for all my coding projects. Today I hit upon the idea of using mercurial precommit hooks to run django tests before committing. I didn’t really expect it to be so easy :)

I use postgresql on the server, but I found that on win32, running tests with postgresql is excruciatingly slow.

For comparison, 23 tests with postgresql on my Vista laptop take about 193 seconds (the same tests take ~14 seconds on my linux VPS). With an sqlite in-memory database on the same machine they take about 13.5 seconds. Moral of the story: on Windows, use sqlite in-memory databases to test django apps.

Ok, so I had to figure out a way to use sqlite only for testing while using postgresql otherwise. I decided to instrument settings.py:


import os

# Use the in-memory sqlite database only when DJANGO_TEST_DB is set in the environment
_TEST_DB = 'DJANGO_TEST_DB' in os.environ

if _TEST_DB :
    DATABASE_ENGINE = 'sqlite3'
    DATABASE_NAME =  ':memory:'
    DATABASE_USER = ''
    DATABASE_PASSWORD = ''
    DATABASE_HOST = ''
    DATABASE_PORT = ''
else :
    DATABASE_ENGINE   = 'postgresql_psycopg2'
    DATABASE_NAME     = 'project'
    DATABASE_USER     = 'projectdb'
    DATABASE_PASSWORD = 'dbpassword'
    DATABASE_HOST     = ''
    DATABASE_PORT     = ''

I thought of using the in-process Python hooks feature of Mercurial, but it seemed to be too much of a hassle (the hook script needs to be on PYTHONPATH), so I decided to write a simple command line script:


import os, sys, subprocess
PROJ_DIR = r'C:\Users\Jeethu\code\project'

def main() :
    proj_dir = os.path.normpath(PROJ_DIR)
    os.chdir(proj_dir)
    os.environ['DJANGO_TEST_DB'] = 'True'
    r = subprocess.call(['python','manage.py','test'])
    return r

if __name__ == '__main__' :
    sys.exit(main())

Then simply add 2 lines to the .hg\hgrc file in the repo.

[hooks]
precommit.tests = c:\Users\Jeethu\code\project\hooks\test_hook.py
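
For comparison, the in-process hook that was set aside above might look roughly like the sketch below. It assumes Mercurial's Python hook convention (a callable taking `ui`, `repo` and keyword arguments, where a true return value fails the hook); the module name `test_hook`, the function name `run_django_tests` and the `TEST_CMD` constant are made up for illustration, and the module would still have to be importable by Mercurial.

```python
import os
import subprocess
import sys

# Command that runs the test suite (assumption: manage.py lives in the repo root).
TEST_CMD = [sys.executable, "manage.py", "test"]

def run_django_tests(ui, repo, **kwargs):
    """In-process precommit hook: returning a true value aborts the commit."""
    # Same switch that settings.py looks for to select the sqlite test database.
    env = dict(os.environ, DJANGO_TEST_DB="True")
    status = subprocess.call(TEST_CMD, cwd=repo.root, env=env)
    if status != 0:
        ui.warn(b"tests failed, aborting commit\n")
    return status != 0
```

It would then be wired up in the [hooks] section with something like `precommit.tests = python:test_hook.run_django_tests` instead of the script path.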

Hopefully, this will set the incentives right for me to write more tests.

References:
The hgrc manpage
Chapter 10 of the Mercurial Book

May 31, 2008

Building Python Extensions on Win32 with MS VC++ 2008 Express

Filed under: Python, Windows — admin @ 2:16 pm

These days, I’m mostly working on linux and I haven’t had to use the Microsoft dev tools for a long time. I’m using Vista on my new laptop. Yesterday, I had to install pycrypto from source on Windows. The only compiler I had was MinGW, which doesn’t quite cut it with distutils. I figured that I’d have an easier time with the free Visual C++ 2008 Express Edition compiler. Apparently, distutils on Python 2.5.2 doesn’t like this compiler either.
Here’s what I had to do to get distutils to work with it.
First, patch distutils.
File: C:\Python25\Lib\distutils\msvccompiler.py
One of the things I observed in this file was that if the environment variables DISTUTILS_USE_SDK and MSSdk are set and distutils can find “cl.exe” in the system path, it assumes that everything is set up.
To quote from a comment in the source:

# Assume that the SDK set up everything alright; don’t try to be
# smarter

Find the method load_macros in the MacroExpander class.
Replace the following lines inside the try block


if version > 7.0:
     self.set_macro("FrameworkSDKDir", net, "sdkinstallrootv1.1")
else:
     self.set_macro("FrameworkSDKDir", net, "sdkinstallroot")

With:


if version > 7.09:
     pass
elif version > 7.0:
     self.set_macro("FrameworkSDKDir", net, "sdkinstallrootv1.1")
else:
     self.set_macro("FrameworkSDKDir", net, "sdkinstallroot")

Now launch a command prompt window and run the following commands.

"C:\Program Files\Microsoft Visual Studio 9.0\VC\vcvarsall.bat"
SET DISTUTILS_USE_SDK=1
SET MSSdk=1

Now you can run “setup.py build” and it works just fine.
The compiler spits out a couple of warnings about the option ‘GX’ being deprecated, but that’s just an innocuous warning.
