Back when I was hacking on PHP and rolling my own webapp frameworks ("in the day" as it were) I'd often use the singleton pattern to create an app "registry" where global data and variables could be stored and accessed by objects anywhere within the framework. Since moving most of my web development projects to Python a couple of years ago I haven't really had any use for such global, shared-state objects. That is, until I received this issue ticket for Photologue.
Some background...
Photologue provides a way to define "Photo Sizes" within your database that all your uploaded images can be resized to automatically. When a photo is requested in a certain size, the system loads the appropriate size description and resizes the image as specified (if the sized image has not already been cached). This gives the programmer/designer/site-admin a lot of freedom when developing apps and pages, as any number of photo sizes can be defined at any time without changing any code or restarting the server. It was also causing Django's ORM to make an unreasonably large number of queries when a page full of images was loaded, as each image queried for its photo size object as it was loaded.
The fix was twofold. First, when a Photo object is initialized a function is called that adds a number of convenience functions for accessing the defined photo sizes. For instance, if you define a "thumbnail" photo size, a method named "get_thumbnail_url" is added to the instance and returns the URL of the photo sized to the "thumbnail" specification. These functions were originally added by "curry"ing existing functions on the model ("get_SIZE_url", "get_SIZE_path", etc.) with the names of all found photo sizes. When called, these functions would use the name supplied to load the photo size from the database. The first fix was simply to pass the actual photo size object itself to these functions, eliminating the need to load it later:
for photosize in sizes:
    setattr(self, 'get_%s_size' % photosize.name, curry(self._get_SIZE_size, photosize=photosize))
    setattr(self, 'get_%s_url' % photosize.name, curry(self._get_SIZE_url, photosize=photosize))
    setattr(self, 'get_%s_path' % photosize.name, curry(self._get_SIZE_path, photosize=photosize))
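To see the binding in isolation, here is a minimal, self-contained sketch. It assumes the `curry` above behaves like the standard library's `functools.partial`, and it uses plain hypothetical classes (`PhotoSize`, `Photo`) and a made-up URL layout in place of the real Photologue models:

```python
from functools import partial


class PhotoSize:
    """Hypothetical stand-in for the PhotoSize model."""

    def __init__(self, name, width, height):
        self.name = name
        self.width = width
        self.height = height


class Photo:
    """Hypothetical stand-in for the Photo model."""

    def __init__(self, filename, sizes):
        self.filename = filename
        # Bind each size object to the generic helpers up front, so no
        # lookup by name is needed when the accessor is called later.
        for photosize in sizes:
            setattr(self, 'get_%s_size' % photosize.name,
                    partial(self._get_SIZE_size, photosize=photosize))
            setattr(self, 'get_%s_url' % photosize.name,
                    partial(self._get_SIZE_url, photosize=photosize))

    def _get_SIZE_size(self, photosize):
        return (photosize.width, photosize.height)

    def _get_SIZE_url(self, photosize):
        # The URL layout here is invented purely for illustration.
        return '/media/cache/%s_%s' % (photosize.name, self.filename)


sizes = [PhotoSize('thumbnail', 100, 75), PhotoSize('display', 640, 480)]
photo = Photo('cat.jpg', sizes)
print(photo.get_thumbnail_url())
print(photo.get_display_size())
```

Each generated accessor already carries its size object, so calling it touches nothing but the instance itself.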
Which brings us to the title of this post...
I needed a way to load the photo size models once and then let any and all photos access these sizes without having to load them again. A global variable would work (as suggested by the original issue submitter) but seems kludgey, and one thing I love about Python is being able to solve problems in a way that is as elegant as it is functional. So I did a little research and came across a pattern for creating classes with shared state. That is, all separate instances created from this class share the same state wherever they exist within your program. When one instance is modified, all instances reflect the change. For example, here's the class used to cache photo sizes within Photologue:
class PhotoSizeCache(object):
    __state = {"sizes": {}}

    def __init__(self):
        self.__dict__ = self.__state
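A quick way to convince yourself of the sharing (this pattern is often referred to as the "Borg" pattern) is to create two instances and write through one of them. The size values below are invented for the demonstration:

```python
class PhotoSizeCache(object):
    # Class-level dict that every instance adopts as its own __dict__.
    __state = {"sizes": {}}

    def __init__(self):
        self.__dict__ = self.__state


a = PhotoSizeCache()
b = PhotoSizeCache()

a.sizes["thumbnail"] = (100, 75)   # write through one instance...
print(b.sizes)                     # ...and read it back through another
print(a is b)                      # False: distinct objects, shared attributes
```

Unlike a classic singleton, the instances are separate objects; only their attribute dictionary is the same.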
Now, when a photo needs to access the list of photo sizes it simply instantiates PhotoSizeCache, which is automatically assigned the class's global state, and checks whether the "sizes" dictionary has a length. If not, it loads the full list of sizes and stores them within the cache for any other objects to find. The final result was a drastic reduction in the number of hits made on the database (three, down from over two thousand, on a test page loading around seven hundred images). It's still global state (me bad?) but I think it has a certain beauty that's only made possible by Python's dynamic nature.
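The check-then-load step described above can be sketched roughly as follows. This is not the Photologue source: `load_sizes_from_db` and `get_size` are hypothetical names, and the database query is simulated with a counter so the example runs on its own:

```python
class PhotoSizeCache(object):
    __state = {"sizes": {}}

    def __init__(self):
        self.__dict__ = self.__state


query_count = 0  # counts simulated database hits


def load_sizes_from_db():
    """Hypothetical stand-in for one ORM query returning every size."""
    global query_count
    query_count += 1
    return {"thumbnail": (100, 75), "display": (640, 480)}


def get_size(name):
    cache = PhotoSizeCache()
    if not len(cache.sizes):          # empty: the first caller fills the cache
        cache.sizes.update(load_sizes_from_db())
    return cache.sizes[name]          # later callers only read


# Simulate a page that renders seven hundred images.
for _ in range(700):
    get_size("thumbnail")
print(query_count)
```

One caveat of this sketch: if no sizes are defined at all, the dictionary stays empty and every call queries again.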
Would be nice to show code for how you use this.
One example for set, one for read.
Also, how do you deal with the problem that globals are only valid within a module? If it's a big program, surely not all the places that need the data are in the same module.
Anon,
You can see it in use in the source code for django-photologue here: http://code.google.com/p/django-photologue/
As for scope, my use case is primarily restricted to a single module. It's more a singleton-like cache object than a ubiquitous global object.