The Scratch Blog

Tech Blog of Subhranath Chunder - "A blog from scratch"

Enable APIs for your django apps in djangoish way

A small django helper to let you create and manage app-centric API url mappings.

django-api-enabler lets you:

  • Enable/disable APIs associated with any specific app.
  • Enable/disable entire project APIs.
  • Use a prefix like '/api/' to distinguish API URLs from non-API URLs.
  • Keep the URL-to-view maps within the app code.
  • Customize app-specific prefixes.
  • Stay urlconf compliant.

Simple, yet powerful and useful!
For details: https://github.com/subhranath/django-api-enabler
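
To make the idea concrete, here is a rough, framework-free sketch of prefix-based API routing with per-app and project-wide switches. It is not the package's API: every name below (API_PREFIX, ENABLED_APPS, APP_PREFIXES, resolve_api_app) is hypothetical and only illustrates the behaviour listed above. The real settings and helpers are described in the repository.

```python
# Hypothetical settings, NOT taken from django-api-enabler itself.
API_PREFIX = '/api/'                                  # marks a URL as an API URL
ENABLED_APPS = {'blog': True, 'shop': False}          # per-app on/off switches
APP_PREFIXES = {'blog': 'blog/', 'shop': 'store/'}    # app-specific prefixes


def resolve_api_app(path, project_enabled=True):
    """Return the name of the app that should serve 'path', or None.

    None is returned when the project-wide switch is off, when the path
    does not start with the API prefix, or when the owning app is disabled.
    """
    if not project_enabled or not path.startswith(API_PREFIX):
        return None
    rest = path[len(API_PREFIX):]
    for app, prefix in APP_PREFIXES.items():
        if rest.startswith(prefix) and ENABLED_APPS.get(app, False):
            return app
    return None
```

With these made-up settings, '/api/blog/posts/' resolves to the 'blog' app, while '/api/store/items/' resolves to nothing because the 'shop' app is switched off.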

Reviews42 platform Architecture

A few months back, I was given the responsibility to architect and deliver a community-based reviews platform, the first of its kind in India.

After initial discussions, the priorities finally came to:

  • Scalability
  • Easy to integrate with external platforms
  • Ready to port on different application platforms

Being a Django fanatic, it was my obvious tool of choice to start with, and to build a platform around. The platform itself is now based on open source software/tools, and its technology stack comprises:

All these things were put together to create a SOA based software platform which is not only easy to scale and integrate with other external platforms and services, but also easy to expand into other platforms/devices.

The current platform architecture can be logically represented as something like the following:

Eventually, the Reviews42 platform made its debut on 30th March 2012, with the launch of its Django-powered Web App; the rest of the apps will possibly follow in the future.

Share your views about the architectural design, the nice and also the nasty ones.

Google+ API and Django Authentication Backend

The Google+ developers API was released a couple of days back, and I was really excited to try it out. I saw that the initial release comes with a starter project in Python for Google App Engine only. While trying it out, I thought: why not create a Django pluggable app for it? So, I decided to create a Django app that provides a fully integrated Django-style authentication backend for Google+. It is the newest addition to my django-custom-auths project. Check the project on GitHub for details.

I guess this is the first application which aims at integrating the Google+ API with Django. Please feel free to try it out. Any comments, criticism and suggestions are most welcome.

Find it here: https://github.com/subhranath/django-custom-auths

Django timedelta custom model field (with full example)

This post shows a full working example of a Django custom model field.

TimeDeltaField uses Python object serialization, namely pickle, to store a native datetime.timedelta object in the database as character data. This example implements only the elements that are essential for it to work.

import datetime
import pickle
from django.db import models

class TimeDeltaField(models.Field):
    """Custom model field to store python native datetime.timedelta
    object in database, in serialized form.
    """
    __metaclass__ = models.SubfieldBase

    def __init__(self, *args, **kwargs):
        # Set the max_length to something long enough to store the data
        # in string format.
        kwargs['max_length'] = 200

        # Make sure the default specified is also serialized, else the
        # objects own string representation would be used.
        if 'default' in kwargs:
            kwargs['default'] = pickle.dumps(kwargs['default'])

        super(TimeDeltaField, self).__init__(*args, **kwargs)

    def get_internal_type(self):
        # Store the serialized data as the default 'CharField' type in
        # the database.
        return 'CharField'

    def to_python(self, value):
        if isinstance(value, basestring):
            # De-Serialize into timedelta.
            return pickle.loads(str(value))
        return value

    def get_prep_value(self, value):
        # Serialize the object.
        return pickle.dumps(value)

class MyModel(models.Model):
    """Dummy implementation of a model.
    """
    timedelta = TimeDeltaField(default=datetime.timedelta(days=30))

    def __unicode__(self):
        return unicode(self.id)

This lets us use APIs such as:

# Creates a model instance with the default timedelta value.
MyModel.objects.create()

# Creates an instance with your own specified timedelta value.
MyModel.objects.create(
    timedelta=datetime.timedelta(days=10, seconds=100)
)

This code has been written and tested in Django 1.4 pre-alpha version.
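
The serialization round trip that the field relies on can be tried without Django. The field above targets Python 2, where pickle.dumps returns a str. On Python 3 it returns bytes, so the sketch below base64-encodes the pickle to get text that fits a character column. The base64 step and the helper names serialize_timedelta and deserialize_timedelta are my own assumptions for illustration and are not part of the field above.

```python
import base64
import datetime
import pickle


def serialize_timedelta(value):
    """Pickle a timedelta and return it as ASCII text for a character column."""
    return base64.b64encode(pickle.dumps(value)).decode('ascii')


def deserialize_timedelta(text):
    """Reverse of serialize_timedelta. Only unpickle data you trust."""
    return pickle.loads(base64.b64decode(text.encode('ascii')))


original = datetime.timedelta(days=10, seconds=100)
stored = serialize_timedelta(original)      # what would be written to the database
restored = deserialize_timedelta(stored)    # what to_python would hand back
```

The stored text stays well under the max_length of 200 used by the field, and the restored object compares equal to the original timedelta.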

Django full text search, purely Python

This is a Django full text search module, written purely in Python, using the power of regular expressions, and is fully compatible with Google App Engine server.

The search module is given below, along with a code snippet to show how it works with Django or non-rel based GAE projects. The example has been simplified to show how it works with text search on the different 'Post' entries in a 'Blog' application.

import re

def get_keywords(query_string):
    """Accepts a search query string and returns a list of search keywords.
    """
    # Regex to split on double-quoted phrases, single-quoted phrases, and
    # runs of non-whitespace characters.
    split_pattern = re.compile(r'("[^"]+"|\'[^\']+\'|\S+)')

    # Pattern to collapse runs of two or more whitespace characters.
    remove_inter_spaces_pattern = re.compile(r'\s{2,}')

    # Return the list of keywords.
    return [remove_inter_spaces_pattern.sub(' ', t.strip(' "\''))
            for t in split_pattern.findall(query_string)
            if len(t.strip(' "\'')) > 0]

def get_results(objects, query, field_list):
    """Returns a QuerySet of filtered objects based on the 'query',
    upon a QuerySet of 'objects', on the fields specified by the list
    'field_list'.
    """
    
    # Create a string representing the actual query condition for filtering.
    # NOTE: this string is passed to eval() below, so 'field_list' must only
    # contain trusted, hard-coded attribute expressions, never user input.
    condition = ' or '.join(
        'query_pattern.search(obj.%(field)s)' % {'field': field}
        for field in field_list)

    # Apply the query condition for all the keywords.
    for keyword in get_keywords(query):
        # List where the partially filtered object ids are stored.
        filtered_ids = []

        # Compile the keyword once; it is treated as a regular expression.
        query_pattern = re.compile(keyword, re.IGNORECASE)

        # Check the filter condition for the current keyword, on all the
        # presently filtered objects.
        for obj in objects:
            if eval(condition):
                filtered_ids.append(obj.id)

        # For the next iteration work with the currently filtered objects only.
        objects = objects.filter(id__in=filtered_ids)

    return objects
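
To see the keyword splitting and the "every keyword must match" rule in isolation, here is a small self-contained sketch that works on plain strings instead of a QuerySet. It repeats get_keywords so that it runs on its own. The matches helper and the sample posts are hypothetical, and unlike the module above it escapes each keyword with re.escape, so keywords are matched literally rather than as regular expressions.

```python
import re

_SPLIT = re.compile(r'("[^"]+"|\'[^\']+\'|\S+)')
_SPACES = re.compile(r'\s{2,}')


def get_keywords(query_string):
    """Split a query into keywords, keeping quoted phrases together."""
    cleaned = (t.strip(' "\'') for t in _SPLIT.findall(query_string))
    return [_SPACES.sub(' ', t) for t in cleaned if t]


def matches(texts, query):
    """Hypothetical helper: True when every keyword occurs in at least one text."""
    for keyword in get_keywords(query):
        pattern = re.compile(re.escape(keyword), re.IGNORECASE)
        if not any(pattern.search(text) for text in texts):
            return False
    return True


# Made-up sample data standing in for 'Post' objects.
posts = [
    {'title': 'Custom feed tutorial',
     'content': 'Using the Django syndication framework.'},
    {'title': 'Index definition mystery',
     'content': 'App Engine and index.yaml.'},
]
query = 'tutorial "custom feed" django'
found = [p['title'] for p in posts
         if matches([p['title'], p['content']], query)]
```

Only the first sample post contains all three keywords, so it is the only title collected in found.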

Now, to use this module with your existing apps and models, all you need to do is define a new Custom Manager in your 'models.py' file to add a more Django-like API, and connect it to the model as its manager.

You may refer to the Custom Manager section to get a better view of how it's done, or you can simply continue with the following example code.

from django.db import models

# Our own custom search module.
import search

class PostManager(models.Manager):
    """Custom Manager to implement the full text search. 
    """
    # Our new custom API to fetch 'Post's based on the 'query' provided.
    def search(self, query=None):
        # QuerySet containing all the 'Post's.
        posts = super(PostManager, self).get_query_set().order_by('-published_date')
        
        # When a query has been actually specified.
        if query is not None:
            # Fetch the filtered QuerySet of 'Post's, based on the query
            # over the specified fields.
            posts = search.get_results(posts, query, \
                ['title', 'content', 'author.user.get_full_name()'])
        
        # QuerySet representing the filtered 'Post's.
        return posts
    
class Post(models.Model):
    """Represents a single article on the 'Blog'.
    """
    title   = models.CharField(max_length=100)
    content = models.TextField()
    author  = models.ForeignKey('Author')
    # Other model fields.
    # ...
    
    # Directive to use our own overridden custom manager.
    objects = PostManager()

    def __unicode__(self):
        return self.title
    
    # Other model methods.
    # ...
    

This would extend the existing model manager API to provide simple method calls over the 'Post' model objects, to use our custom full text search module, as shown below.

from blog.models import Post

Post.objects.search(query='django based platform')
# This would search and return a QuerySet of 'Post' objects which matched
# all the keywords ['django', 'based', 'platform'].

Post.objects.search(query='tutorial "custom feed" django')
# Similarly, returns a QuerySet matching the keywords
# ['tutorial', 'custom feed', 'django'].

Simple full text search

The full text search has now been implemented.

ScratchBlog can now perform full-text search on the current blog posts. It has been implemented using an entirely new search module written from scratch in plain Python, combined with Python regular expressions. This new module can be used with other Django projects, as well as with django-nonrel projects on GAE, if required. It doesn't intend to replace existing text-search engines meant for complex or scalable searches over huge amounts of data. You may use it for simple full-text search on the fly, without any sort of indexing. It is ideal for small projects, or wherever the volume of text data stays modest.

This search module will be released here soon. Till then, keep searching.

Improving link discovery

ScratchBlog is now sitemap enabled, which has been done using the Django sitemap framework, and it's accessible here.

Additionally, the 'robots.txt' file was recently added to the root url for specifying crawler access permissions. And all the external links are now updated and fixed.

Tutorial: Custom Feed Generation using Django Syndication Feed Framework 1.2 and later

Some time back, when I was writing an application for my company, I took up the task of implementing feed generation. My initial choice was obviously Django's own Syndication Feed Framework, which probably came with some changes in version 1.2.

Initially it looked like a piece of cake. But, for the Atom feeds the client wanted to have something which required a fair bit of customization. The current Django documentation turned out to be falling short of what I really needed, and a proper concrete example seemed to be missing.

After spending a day with the available material, snippet examples, the Django core code for the feed framework, and some help from Google, I was finally able to get what was desired.

So, for the rest of the post I would like to take you through the "missing tutorial" for Custom Feed Generation using Django Syndication Feed Framework 1.2 and later. This tutorial is basically for people who want to use the framework with their own customizations, for example, adding a new tag element to the item, or introducing some pre-processing checks.

It is assumed that you have already gone through the official documentation. Rather than giving a step-by-step tutorial, I'm presenting a full example showing how to get started with the customization:

from django.contrib.syndication.views import Feed
from django.utils.feedgenerator import Atom1Feed
from django.shortcuts import get_object_or_404
from django.http import HttpResponseForbidden
from apps.news.models import Entry, Tag

# We basically will provide an url like:
# /feeds/rss/<tag-name>/  => /feeds/rss/(?P<tag_name>[\w]+)/
# /feeds/atom/<tag-name>/ => /feeds/atom/(?P<tag_name>[\w]+)/
# And the generated RSS feed will list all the sorted 'Entry' which
# has been 'Tag'ed with <tag-name>.

class RssEntryFeed(Feed):
    """This is the class generating the simple RSS feed.
    """
    title = "MySiteName"
    link = "/"
    description = "My root description."
           
    def get_object(self, request, tag_name):
        """The 'tag_name' is passed from the urls.py
        """
        return get_object_or_404(Tag, name=tag_name)

    
    def title(self, obj):
        """The root title tag content.
        """
        return "MySiteName: %s" % (obj.name)

    
    def items(self, obj):
        """The 'item' elements.
        """
        # 'obj' is a single 'Tag', so filter on it directly ('__in' expects an iterable).
        return Entry.objects.filter(tags=obj).order_by('-pub_date')[:10]

        
    def item_title(self, item):
        return item.headline

    
    def item_link(self, item):
        return item.get_absolute_url()

    
    def item_description(self, item):
        return item.summary


class AtomEntryCustomFeed(Atom1Feed):
    """Custom Atom feed generator.
    This class will form the structure of the custom Atom feed.
    It's used to add a new tag element to each of the 'item's.
    """
    def add_item_elements(self, handler, item):
        # Invoke this same method of the super-class to add the standard
        # elements to the 'item's.
        super(AtomEntryCustomFeed, self).add_item_elements(handler, item)

        # Add a new custom element named 'content' to each of the 'item's.
        handler.addQuickElement(u"content", item['content'])


class AtomEntryFeed(RssEntryFeed):
    """Class used to generate the final customized Atom feed.
    Since this is a subclass of the RSS feed class it'll inherit its methods.
    """
    feed_type = AtomEntryCustomFeed  # Custom Atom feed.
    # Use the description of the RSS feed for this root tag.
    subtitle = RssEntryFeed.description

    
    def __call__(self, request, *args, **kwargs):
        """Place for intercepting the call to the custom Atom feed, and perform
        pre-processing checks.
        """
        # Only authenticated users can view the feed.
        if not request.user.is_authenticated():
            return HttpResponseForbidden("<h3>Access Forbidden. User must be authenticated to access page.</h3>")
        else:
            return super(AtomEntryFeed, self).__call__(request, *args, **kwargs)

       
    def item_extra_kwargs(self, item):
        """
        Returns an extra keyword arguments dictionary that is used with
        the `add_item` call of the feed generator.
        Add the 'content' field of the 'Entry' item, to be used by the custom feed generator.
        """
        return { 'content': item.content, }
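
The two URL patterns named in the comments at the top of the example can be exercised on their own with Python's re module. The sketch below assumes the patterns are anchored with '^' and '$' and matched against a path without a leading slash. The feed_kind_and_tag helper is hypothetical and only shows which tag_name value would reach get_object; the actual urls.py wiring is not shown in this post.

```python
import re

# Patterns copied from the comments in the example; the anchoring is an assumption.
FEED_PATTERNS = {
    'rss': re.compile(r'^feeds/rss/(?P<tag_name>[\w]+)/$'),
    'atom': re.compile(r'^feeds/atom/(?P<tag_name>[\w]+)/$'),
}


def feed_kind_and_tag(path):
    """Hypothetical helper: return (feed kind, tag_name) for a path, or None."""
    for kind, pattern in FEED_PATTERNS.items():
        match = pattern.match(path)
        if match:
            return kind, match.group('tag_name')
    return None
```

Note that [\w]+ does not match hyphens, so a tag named 'two-words' would not resolve with these patterns.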

Atom feed supported

Scratch Blog now also supports Atom feed as specified in http://www.atomenabled.org/developers/syndication/atom-format-spec.php.

This is also implemented using the Django Syndication Feeds Framework 1.2 and later.

Some customizations have been used to let the framework generate the feed as per our requirement.

RSS feeds with Django syndication feed framework

Feeds are now working on this site.

You can now subscribe to the live RSS feed from this site, which has been implemented using the Django Syndication Feed Framework 1.2 and later.

Currently it supports RSS feeds following the RSS v2 specification. Atom feeds will be added in the future.

To Subscribe to live feeds, Click Here.

Backend supported Backup/Restore

The current application data is now ready to be backed-up, and restored as and when required.

Present application data can now be exported as JSON fixtures, to get a snapshot of the current application datastore state. At present it'll export the following objects:

  • Blogs
  • Users
  • Authors
  • Posts
  • Comments

The backup scripts use Django serialization to store the objects in files, and the files are deserialized into native Python objects when restoring.
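
The backup scripts themselves are not shown here, but the shape of a JSON fixture is easy to illustrate with the standard library alone. The sketch below mimics the model/pk/fields layout of Django's JSON serializer on plain dictionaries. The helper names dump_fixture and load_fixture, the 'blog.post' label and the sample records are all made up for illustration; a real script would normally go through django.core.serializers instead.

```python
import json


def dump_fixture(records, model_label):
    """Render plain dicts in the model/pk/fields shape of a Django JSON fixture."""
    return json.dumps([
        {
            'model': model_label,
            'pk': record['id'],
            'fields': {k: v for k, v in record.items() if k != 'id'},
        }
        for record in records
    ], indent=2)


def load_fixture(text):
    """Turn fixture text back into plain dicts (the 'restore' direction)."""
    return [dict(entry['fields'], id=entry['pk']) for entry in json.loads(text)]


# Made-up sample records standing in for 'Post' objects.
posts = [
    {'id': 1, 'title': 'Scratchy Start'},
    {'id': 2, 'title': 'Free to comment'},
]
snapshot = dump_fixture(posts, 'blog.post')   # backup
restored = load_fixture(snapshot)             # restore
```

A snapshot taken this way restores to the same records, which is the property a backup/restore pair needs.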

Thus, now backup and restore of the online project contents is possible. Currently, this can only be done from the backend, but later a more restricted version of it will be available from the front-end too.

Free to comment

The commenting system is now available.

Although it is still under development, and more features might be added to it, it's now totally usable by others nonetheless.

The current development version of Django does provide a comments framework to work with comments on other existing data models. I still decided not to use it at this point of time. There are two main reasons for that:

  • The comments framework is used as just another Django application (which I don't want in this case), and its development has not yet stabilized; it is still going through modular and implementation changes.
  • I'm not sure whether I would like to stick to the current system either, and may switch to something like DISQUS. In such a case the porting might be a little less complex.

Although I have certain advanced plans for the commenting system, I don't expect them to get implemented until a more mature version of ScratchBlog comes through.

Index definition mystery

According to the App Engine documentation, the index definitions in index.yaml should be added automatically by the development web server. But in my current project setup it's not happening that way, and I have to make the entries manually, at least for the admin application.

Not sure whether this is a bug in App Engine, in Django-nonrel, or at my end. But I would surely like to get to the bottom of this, and when I do, I will get back with more info about it here.

Ported to Django-nonrel

Working with the google-app-engine-django helper was proving to be a pain. Not only did I have to write the code twice, it was also getting very hectic to test the application! So, I decided to look for alternatives. After searching through the list of alternatives, which wasn't long anyway, I decided to try out Django-nonrel.

After going through its documentation and trying it out, I finally decided to switch from the "google-app-engine-django" helper to the "Django-nonrel" port.

Although it doesn't yet support all of Django's APIs out of the box, it supports them to an acceptable extent. The best part seems to be the way in which the port is supposed to work. And being a very active project currently, its future prospects seem good. They're using the latest version of Django and hacking its core to get support for non-relational databases and the App Engine database backend.

The manage.py looks loaded with many useful features beyond the regular ones. Opening a remote Python shell on the GAE application server is now just too easy. In fact, the remote feature can be used with its other subcommands too.

Overall, it seemed to be the most promising thing for my need, and thus now my application has already been migrated to use this new Django port.

I don't have to code twice now, to say the least.

As the official documentation of Django-nonrel says:

Django should mostly behave as described in the Django documentation.

Scratchy Start

Porting this Django application to Google App Engine is proving to be some pain!

Initially I had thought that it wouldn't be much of a thing. And since the google-app-engine-django helper was already out there, I thought things wouldn't be that difficult.

Started with including the Django development version, and I quickly realized that things might not be that easy. Had to switch back to Django 1.1.1, and after putting it in place of the SVN version, the helper can launch the Django development server, to say the least. The helper still fails one of the tests it performs, but who cares, my App Engine server is up and running and ready to serve Django.

The Django Models had to be replaced with the App Engine Models. I'm using the Google App Engine backed datastore, as I'm not willing to spend on hosting yet. The views also had to be changed due to the change in APIs from Django to App Engine. The development server is working pretty much fine, and after a little time, the local datastore is created using my simple Python scripts.

Now the deployment. It was too easy to get the files up here. But then came the biggest problem: I couldn't get the data from the local datastore to the deployed datastore. Looked for many possible alternatives to get it up, but to no avail. Spent hours trying to find some remote shell to the deployed server. Finally had to give up, and was already cursing this so-called "helper". It wasn't helping me in any way to get a remote shell to the backend.

After spending countless hours trying to figure out a solution for my problem, I finally figured out a way to populate the datastore to get the site up! Created a new url point and got my initialization scripts to run from the view associated to that url. I admit, it was a pretty bad way of getting the datastore up, but somehow it is created now. Phew!!!

Now I can work on the other features of the application. "Comments" to start with.

I miss the Django ORM!!! (now when I'm writing the views).

This is a sample title

This is a sample content used for testing this new blogging application.

The content is being used for testing my new Django application, codenamed as "ScratchBlog".

Let's see how it goes through.