<?xml version="1.0" encoding="utf-8" ?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns="http://purl.org/rss/1.0/">




    



<channel rdf:about="http://www.stereoplex.com/search_rss">
  <title>Stereoplex</title>
  <link>http://www.stereoplex.com</link>

  <description>
    
            These are the search results for the query, showing results 1 to 15.
        
  </description>

  

  

  <image rdf:resource="http://www.stereoplex.com/logo.png"/>

  <items>
    <rdf:Seq>
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/deferred-foreign-keys-with-django-dfk"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/mobile-api-design-thinking-beyond-rest"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/introducing-django-lazysignup"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/speeding-up-django-unit-test-runs-with-mysql"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/filtering-dropdown-lists-in-the-django-admin"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/swoop-travel-live"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/buildout-vs-pip-virtualenv-and-requirements-files"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/oserror-while-installing-django-buildout-and-djang"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/migrating-django-mingus"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/installing-geodjango-with-postgresql-and-zc-buildo"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/fez-djangoskel-django-projects-and-apps-as-eggs"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/a-django-development-environment-with-zc-buildout"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/testing-app-views"/>
      
      
        <rdf:li rdf:resource="http://www.stereoplex.com/blog/django-unit-tests-and-transactions"/>
      
    </rdf:Seq>
  </items>

</channel>


  <item rdf:about="http://www.stereoplex.com/blog/deferred-foreign-keys-with-django-dfk">
    <title>Deferred foreign keys with django-dfk</title>
    <link>http://www.stereoplex.com/blog/deferred-foreign-keys-with-django-dfk</link>
    <description>django-dfk allows deferred foreign keys to be declared on models, so that a concrete target can be set later.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p class="p1"><a class="external-link" href="http://pypi.python.org/pypi/django-dfk/">django-dfk</a> is a package that I developed for a recent project to allow foreign keys to be declared on models without an explicit target ('dfk' stands for 'deferred foreign key'). It provides an API to 'point' these foreign keys to a concrete target at a later date, and also allows you to forcefully 'repoint' foreign keys that have already been set up. This last facility should be used with caution - it's essentially akin to monkey-patching.</p>
<p class="p2">You can use GenericForeignKeys for this, and these are slightly more flexible in that each model <strong>instance</strong> foreign key may point to a different model. However, there is a performance cost associated with them, and joining can be problematic.</p>
<p class="p2">(The project actually rarely uses django-dfk directly - instead, it uses it as a basis for abstract foreign keys, which have a greater awareness of the application environment - however, that's a topic for another day.)</p>
<p class="p1">Before we go much further though - rather than using this package, a better long-term investment would be to look at the <a class="external-link" href="https://github.com/jezdez/django/tree/app-loading">app-loading branch</a> that <a class="external-link" href="https://twitter.com/arthurk">Arthur Koziel</a> and <a class="external-link" href="https://twitter.com/jezdez">Jannis Leidel</a> have been working on - testing it, and helping them to get it into a state to be merged into trunk. I think that should provide a more holistic approach to solving this kind of problem. Until then...</p>
<p class="p1">Deferred foreign keys are useful in applications where you know that a model will require a foreign key, but don't know what the target will be at the time you're writing the application. Taking an example from the project that django-dfk was created for, let's say you have a Django app that is an online game. Users are entered into the game by way of an 'Entry' model, and the Entry contains a foreign key back to a model which contains user information.</p>
<p class="p1">However, you will have several instances of this game deployed, and each game may have its own source of player data - hence, its own Player model.</p>
<p class="p1">Let's say that there are two applications involved: one called 'core', which contains the core game logic, and one called 'mygame1', which contains models and logic specific to an individual game deployment.</p>
<pre class="mceContentBody documentContent">core/models.py:<br />import datetime<br /><br />from django.db import models<br /><br />class Entry(models.Model):<br />    created = models.DateTimeField(default=datetime.datetime.now)<br />    player = models.ForeignKey( …. uh, what do I type here?</pre>
<p class="p1">Remember - the core application is going to be deployed in several places, and there are several possible places that FK might need to point.</p>
<p class="p1">One approach to solving this would be to introduce a model which relates an entry to the game-specific Player model, resulting in models that look something like this:</p>
<pre class="p1">core/models.py:<br />import datetime<br /><br />from django.db import models<br /><br />class Entry(models.Model):<br />    created = models.DateTimeField(default=datetime.datetime.now)<br /><br /><br />mygame1/models.py:<br /><br />from django.db import models<br />from core.models import Entry<br /><br />class MyPlayer(models.Model):<br />    name = models.CharField(max_length=50)<br />    entry = models.OneToOneField(Entry)</pre>
<p class="p1">This works fine - however, there are some games which share player data with each other. This means that the key needs to live on the Entry model - but, we don't know which player model to point the FK at, as this model might be deployed in multiple places.</p>
<p class="p1">We can solve this using django-dfk.</p>
<pre class="mceContentBody documentContent">core/models.py:<br />import datetime<br /><br />from django.db import models<br />from dfk.models import DeferredForeignKey<br /><br />class Entry(models.Model):<br />    created = models.DateTimeField(default=datetime.datetime.now)<br />    player = DeferredForeignKey(unique=True)<br /><br /><br />mygame1/models.py:<br /><br />from django.db import models<br />from dfk import point<br />from core.models import Entry<br /><br />class MyPlayer(models.Model):<br />    name = models.CharField(max_length=50)<br /><br />point(Entry, 'player', MyPlayer)</pre>
<p class="p1">The first thing to notice is that our Entry model now sports a DeferredForeignKey instance. It's important to realise that this isn't a real field, it's just a placeholder. Any arguments (except for the special 'name' argument, more on this below) are simply stored.</p>
<p class="p1">The action happens during the call to 'point', in mygame1's models.py. As the name implies, this points the DFK called 'player' on Entry to the MyPlayer class. (Actually, under the hood, it simply replaces the DeferredForeignKey instance with a real ForeignKey instance complete with the arguments which were originally passed to the DFK). Note that we do this at the module level in models.py - all your pointing (and repointing) needs to be done before the application is ready to use to ensure that syncdb outputs the correct SQL.</p>
<p class="p1">After all this is done, the definition of Entry will effectively look like this (although of course your code won't have changed!):</p>
<pre class="p1">class Entry(models.Model):<br />   created = models.DateTimeField(default=datetime.datetime.now)<br />   player = models.ForeignKey('mygame1.MyPlayer', unique=True)</pre>
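<p class="p1">To make the placeholder-then-replace idea concrete, here is a minimal sketch in plain Python, with no Django involved. The classes DeferredKey and RealKey and this version of point are hypothetical stand-ins written for illustration. They are not django-dfk's actual implementation, which also has to deal with Django's model machinery and caches.</p>

```python
class DeferredKey:
    """Placeholder: remembers its keyword arguments until a target is known."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs


class RealKey:
    """Stand-in for a concrete foreign key with a known target."""

    def __init__(self, target, **kwargs):
        self.target = target
        self.kwargs = kwargs


def point(model, field_name, target):
    """Swap the placeholder on `model` for a RealKey aimed at `target`."""
    placeholder = model.__dict__.get(field_name)
    if not isinstance(placeholder, DeferredKey):
        # Mirrors the article: pointing only works on deferred keys.
        raise TypeError(f"{model.__name__}.{field_name} is not a deferred key")
    setattr(model, field_name, RealKey(target, **placeholder.kwargs))


class Entry:
    player = DeferredKey(unique=True)


class MyPlayer:
    pass


point(Entry, "player", MyPlayer)
```

<p class="p1">After the call, Entry.player is a RealKey whose target is MyPlayer and which still carries unique=True. A second call to point on the same attribute raises TypeError, because the placeholder is gone.</p>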
<p class="p1">Other game applications (say, mygame2 and mygame3) would point the foreign key to the Player model that is appropriate for their game.</p>
<p class="p1">It's quite common to need to point a number of these keys at once - there might be several models which refer to a player. Rather than writing lots of 'point' statements (and having to add to them if a new key is added), django-dfk allows deferred foreign keys to be named:</p>
<pre class="p1">core/models.py<br />class Entry(models.Model):<br />   created = models.DateTimeField(default=datetime.datetime.now)<br />   player = DeferredForeignKey(name='Player', unique=True)<br /><br />class StatusUpdate(models.Model):<br />   text = models.CharField(max_length=140)<br />   player = DeferredForeignKey(name='Player')<br /><br /><br />mygame1/models.py:<br /><br />from django.db import models<br />from dfk import point_named<br /><br />class MyPlayer(models.Model):<br />   name = models.CharField(max_length=50)<br /><br />point_named('core', 'Player', MyPlayer)</pre>
<p class="p1">Roughly translated, this means 'point all the deferred foreign keys in all models in the core app which have the name 'Player' to the MyPlayer model'. This will affect both the 'Entry' and 'StatusUpdate' models above.</p>
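<p class="p1">The named form can be sketched the same way in plain Python. This is an illustration only: DeferredKey, RealKey and this point_named are hypothetical, and the sketch takes an explicit list of model classes where the real function takes an app label such as 'core'.</p>

```python
class DeferredKey:
    """Placeholder that may carry a name shared by several models."""

    def __init__(self, name=None, **kwargs):
        self.name = name
        self.kwargs = kwargs


class RealKey:
    """Stand-in for a concrete foreign key with a known target."""

    def __init__(self, target, **kwargs):
        self.target = target
        self.kwargs = kwargs


def point_named(models, name, target):
    """Point every DeferredKey called `name` on the given models at `target`."""
    pointed = []
    for model in models:
        # Copy the items first, because setattr changes the class as we go.
        for attr, value in list(vars(model).items()):
            if isinstance(value, DeferredKey) and value.name == name:
                setattr(model, attr, RealKey(target, **value.kwargs))
                pointed.append((model.__name__, attr))
    return pointed


class Entry:
    player = DeferredKey(name="Player", unique=True)


class StatusUpdate:
    player = DeferredKey(name="Player")
    author = DeferredKey(name="Author")  # different name: left untouched


class MyPlayer:
    pass


core_models = [Entry, StatusUpdate]  # stand-in for "all models in the core app"
pointed = point_named(core_models, "Player", MyPlayer)
```

<p class="p1">Both player attributes are replaced in a single call, while the key named 'Author' stays deferred until something points it.</p>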
<p class="p1">Finally, django-dfk also provides a 'repoint' function. This is a big hammer, and is not to be used lightly. 'point' only works on DeferredForeignKey instances, by design - it's meant to prevent you making mistakes. 'repoint' works on regular foreign keys too. It's useful if you've used 'point' to set the destination of a DFK, but later need to change it. (However, if you find yourself in this position, you should probably refactor your code to just use 'point'). Both 'point' and 'repoint' take care of cleaning up various internal Django caches to ensure things like filtering on related fields work properly after a point operation.</p>
<p class="p1"><a class="external-link" href="http://pypi.python.org/pypi/django-dfk/">django-dfk can be found on PyPI</a>, and the <a class="external-link" href="https://github.com/danfairs/django-dfk">source is on github</a> - forks, bug reports and patches with docs and tests are welcome. django-dfk is in production on several high-volume sites.</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2011-12-30T17:25:00Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/mobile-api-design-thinking-beyond-rest">
    <title>Mobile API Design - Thinking Beyond REST</title>
    <link>http://www.stereoplex.com/blog/mobile-api-design-thinking-beyond-rest</link>
    <description>This article explores the problems of optimising REST APIs for mobile device
performance, and suggests a way of allowing clients to request alternate
representations.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p><a class="external-link" href="http://twitter.com/natea">Nate Aune</a> and <a class="external-link" href="http://twitter.com/jazztpt">Anna Callahan</a> gave a <a class="external-link" href="http://reinout.vanrees.org/weblog/2011/06/08/iphone-python.html">great talk</a> at this year's EuroDjangoCon about a service that they'd built in 24 hours, <a class="external-link" href="http://valentun.es">valentun.es</a>. Along with a great story, the meat of the talk was about the concessions you have to make with a mobile API with respect to data transfer rates and connectivity. Some of the things they said struck a chord with my own experiences of designing mobile APIs, and inspired me to write this post about those experiences, principles, problems, solutions - and an idea for the future.</p>
<h3>What's a REST API?</h3>
<p>There are probably as many definitions of what a REST API is as there are implementations (so naturally I'll add my own), but they all share some common characteristics:</p>
<ul>
<li>They recognise that the web is composed of resources, and are structured around that</li>
<li>They use the underlying HTTP methods (GET, POST, PUT, DELETE) to interact with resources</li>
<li>Representations of those resources are what actually flow back and forth between REST API servers and clients</li>
<li>URIs (usually URLs, which I'll use for the remainder of the article) are used to identify application state - particularly information on what the client can do next - and are contained within the representations received from the server.</li>
</ul>
<p>Take a look at the <a class="external-link" href="http://en.wikipedia.org/wiki/Representational_State_Transfer">Wikipedia article on REST</a> for one description of what it means to be RESTful.</p>
<h3>Resources</h3>
<p>When you design a REST API, the first task is usually to consider what resources you want to expose to web clients. This is often pretty straightforward: a reasonable approach is often to use the same arrangement as your underlying data model (be that tables in a relational database, document types in a document management system, and so on). This doesn't have to be the case, of course - in particular, you'd be unlikely to expose a link table that facilitates a many-to-many relation in a relational database as an individual resource, for example.</p>
<p>Let's say we're building an API for a pizza shop. We're probably going to have a Pizza resource, a Topping resource, and an Order resource.</p>
<p>Resources tend to be arranged hierarchically, probably due to the nature of the URLs used to address them. It's therefore common to have resources in an API that work as collections of other resources. The URL for a topping resource might be `http://pizza.com/toppings/cheese`. The URL of the collection resource for all toppings would be `http://pizza.com/toppings/`. (The presence or absence of a trailing slash is meaningless, but for the purpose of this article I'll use the convention of collection resources having a trailing slash.)</p>
<h3>Representations</h3>
<p>Representations are how the client and server talk about a resource. Prior to the advent of machine-usable APIs, HTML was the most common form of representation. These days, XML and JSON representations are commonly used for machine-readable representations. PDF, JPEG, and MP3 are also all perfectly good representations of a resource as is, of course, good old HTML.</p>
<p>Keep in mind that a resource may have more than one representation. You might be able to fetch a resource as HTML (useful for humans) and JSON (useful for machines). It's also important to realise that representations are not simply content types: there's no reason a resource can't have multiple JSON representations, for example. This idea becomes important a little later on.</p>
<p>Once you've defined your resources, therefore, thinking of representations is usually the next step. The default choice of a simple JSON representation, with a key per resource attribute, is a common choice. That will do for now. A representation of our Topping resource might therefore look like this:</p>
<pre>{<br /> "calories": 100, <br /> "name": "Cheese"<br />}</pre>
<p>Resources may need to refer to other resources. This should be done in a self-describing way: the client should not need to have any knowledge of the server application to build its own URLs. Seeing keys like 'foo_id' in a JSON representation is usually a sign of this design error. For example, this is a reasonable representation:</p>
<pre>{<br /> "favourite_topping": "/toppings/cheese"<br />}</pre>
<p>This isn't so good:</p>
<pre>{<br /> "favourite_topping_id": "36"<br />}</pre>
<p>That 'favourite_topping_id' is meaningless to the client - it has to know how to construct topping resource URLs to be able to use that data.</p>
<p>Incidentally, note that the examples in this article only include URL paths, rather than full, absolute URLs. Either is fine, as long as the client can resolve them.</p>
<h3>Interactions</h3>
<p>The next piece of the usual REST API design process is to consider the operations available for each resource, and what they mean. There are four key operations provided by HTTP - GET, POST, PUT and DELETE. (Actually, HTTP does provide more, but this quartet is what is most normally used in RESTful API design.)</p>
<p>These are often compared to the core SQL-based relational database operations of SELECT, INSERT, UPDATE and DELETE. This is slightly misleading. SELECT and GET are fairly similar, as are the two DELETEs; POST and PUT are different beasts though. POST is used for a write operation on a resource that has side-effects. PUT writes to a resource, but has no side-effects.</p>
<p>Put another way, PUT is idempotent - if you do the same PUT twice (and there are no other state changes in between) then the system state will be the same. POST carries no such guarantee.</p>
<p>A good example might be to compare the creation of a new Topping resource with the creation of an Order for a pizza. Creating a Topping would probably consist of something like the following:</p>
<pre>PUT http://pizza.com/toppings/jalapenos<br />{<br />  "calories": 25,<br />  "name": "Jalapeños"<br />}</pre>
<p>This would create the Topping resource at http://pizza.com/toppings/jalapenos. Re-running the request would not make any difference. Changing the 'calories' field in the JSON to a new value would replace the existing resource with the new one. So - PUT has a 'create-or-update' semantic.</p>
<p>The response for this request would probably be an HTTP 201 Created when the topping is new, or simply a 200 with an empty body when it replaces an existing one (or more strictly, a 204, which tells the client to maintain its view of the representation).</p>
<p>Contrast this with what creating an order might look like:</p>
<pre>POST http://pizza.com/orders/<br />{<br /> "toppings": [<br />   "/toppings/cheese", <br />   "/toppings/jalapenos"<br /> ], <br /> "card": "1234567890"<br />}</pre>
<p>We've made this a POST request because it has side-effects: it bills your credit card, and sets off the process of making a delicious pizza. Making that same request twice would bill your card twice, and get you two pizzas. POST is not idempotent.</p>
<p>The HTTP response here would probably be 201 Created, with the representation of some Confirmation resource in the body, perhaps looking like this:</p>
<pre>201 Created<br />{<br /> "order": "/orders/432544"<br />}</pre>
<p>Note once more how the response contains a self-describing URI, rather than some opaque order ID, which is not meaningful out of context.</p>
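<p>The difference between the two verbs can be sketched with a small in-memory model. PizzaShop and its methods are hypothetical names used only for illustration. No HTTP is involved, and the sequential order IDs are an assumption.</p>

```python
import itertools


class PizzaShop:
    """In-memory stand-in for the server; not a real HTTP implementation."""

    def __init__(self):
        self.toppings = {}
        self.orders = {}
        self._order_ids = itertools.count(1)

    def put_topping(self, slug, representation):
        # PUT: create-or-replace at a URL the client chose.
        self.toppings[slug] = dict(representation)
        return f"/toppings/{slug}"

    def post_order(self, representation):
        # POST: the server picks a new URL each time, so repeats add up.
        order_id = next(self._order_ids)
        self.orders[order_id] = dict(representation)
        return f"/orders/{order_id}"


shop = PizzaShop()
jalapenos = {"calories": 25, "name": "Jalapeños"}
shop.put_topping("jalapenos", jalapenos)
shop.put_topping("jalapenos", jalapenos)  # same state as after the first PUT

order = {"toppings": ["/toppings/cheese", "/toppings/jalapenos"]}
first = shop.post_order(order)
second = shop.post_order(order)  # a second, separate order
```

<p>Repeating the PUT leaves a single topping in the store. Repeating the POST leaves two orders with two different URLs, which is why a client should never blindly retry it.</p>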
<h3>Reality Bites</h3>
<p>So we've identified our resources, the representations of them, and what operations the HTTP verbs actually correspond to. We're good to go, right?</p>
<p>Well, yes, basically. It'll work. But you'll almost certainly run into some problems. Before we get to the meat of selecting resource representation though, let's take a minute to consider a couple of real-world implementation problems you're likely to encounter.</p>
<h3>Aside 1: Bad HTTP clients</h3>
<p>There are some broken HTTP clients out there. I've run into one: Flash (circa 2009). Flash gets upset if it doesn't receive a 200 response from the server. In particular, if your server returns a 4xx HTTP code in an API response, Flash will not even pass the response to the Flash application.</p>
<p>On the Django project on which we ran into this, we ended up writing a custom middleware that looked for the presence of a magic query string parameter on the URL and, if it was found, replaced the status code with a 200 and put the real status code on the first line of the body. The Flash app then parsed out the response code from the response body. Ugly, but workable.</p>
<p>Flash also (at the time, it may have changed) was unable to perform PUT or DELETE requests. Our solution was similar: the Flash application would always perform a POST when it actually wanted to do a PUT or DELETE, and the real intended method went into another magic query string parameter. The aforementioned middleware would then rewrite the HTTP method on incoming API requests that carried this flag.</p>
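<p>A rough sketch of that kind of shim, written here as generic WSGI middleware rather than the Django middleware we actually used. The query string parameter names '_method' and '_always200' are made up for this example, and the sketch assumes the wrapped application does not use the legacy WSGI write() callable.</p>

```python
from urllib.parse import parse_qs


class FlashCompatMiddleware:
    """WSGI wrapper for clients that need POST-only requests and 200-only replies."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))

        # Hypothetical "_method" parameter: treat a POST as the method it names.
        override = params.get("_method", [""])[0].upper()
        if environ.get("REQUEST_METHOD") == "POST" and override in ("PUT", "DELETE"):
            environ["REQUEST_METHOD"] = override

        # Without the hypothetical "_always200" flag, pass the response through.
        if "_always200" not in params:
            return self.app(environ, start_response)

        captured = {}

        def capture(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        body = b"".join(self.app(environ, capture))
        # Put the real status on the first line of the body, then report 200.
        body = captured["status"].encode("ascii") + b"\n" + body
        headers = [(k, v) for k, v in captured["headers"] if k.lower() != "content-length"]
        headers.append(("Content-Length", str(len(body))))
        start_response("200 OK", headers)
        return [body]
```

<p>The client then reads the first line of the body to recover the real status code, and treats the rest as the response proper.</p>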
<h3>Aside 2: Rich Error Handling</h3>
<p>The standard approach to expressing errors in a REST API is to use HTTP status codes. As is often the case with REST, this works fine for simple systems, but is simply too limiting for more sophisticated systems, particularly those which might submit a JSON or XML document to describe a POST request. If a client does submit such a rich request, which perhaps does not validate on the server, it is useful to be able to provide more than just a 400 Bad Request error.</p>
<p>Since I primarily use Django these days, and Django uses forms and formsets for validation, I have found that providing a standardised JSON representation of form and formset errors works well. The format can be specified in advance, and allows the server to inform the client of rich validation errors (down to field-level validation, with decent error messages) even over an API. I hope to write more on this in a future post.</p>
<p>Normal service resumes...</p>
<h3>Normalised resource representations</h3>
<p>OK, let's say we've got an iPhone app which uses our REST API to place orders and display the calorific content of the toppings we added. Let's look at the response that it might receive from a GET request to our Order resource:</p>
<pre>GET /orders/432544<br /><br />200 OK<br />{<br /> "toppings": [<br />   "/toppings/cheese", <br />   "/toppings/jalapenos"<br /> ]<br />}</pre>
<p>That's cool - the client can see that there are a couple of toppings there, so it fetches each one to get the calorific content:</p>
<pre>GET /toppings/cheese<br /><br />200 OK<br />{<br /> "calories": 100, <br /> "name": "Cheese"<br />}</pre>
<p>And then:</p>
<pre>GET /toppings/jalapenos<br /><br />200 OK<br />{<br /> "calories": 25, <br /> "name": "Jalapeños"<br />}</pre>
<p>That's great. Our iPhone app can tell our user that the pizza they ordered has 125 calories.</p>
<p>What's not so great is that our iPhone app has had to make three separate requests. This works fine in development, across the office wifi network to the dev server. It doesn't work so well when a user's ordering a pizza on the train home from work, and the train goes into a tunnel halfway through this multi-request conversation (and the user was outside a 3G signal anyway).</p>
<h3>Nesting resource representations</h3>
<p>The natural response (and probably the right response) is simply to extend the representation of an Order resource to include the required representations of our toppings. This means that our Order resource representation now looks like this:</p>
<pre>GET /orders/432544<br /><br />200 OK<br />{<br />    "toppings": [<br />    {<br />        "calories": 100, <br />        "name": "Cheese"<br />    },<br />    {<br />        "calories": 25, <br />        "name": "Jalapeños"<br />    }<br />    ]<br />}</pre>
<p>That's cool. Our iPhone app now only needs to make one request, and it gets all the information on the toppings as well. We've traded the brevity of the original representation of the Order resource for not having to make multiple requests. The individual Topping representations are still available, of course.</p>
<p>To (ab)use database parlance, we've denormalised our Order representation, trading size for performance (that is, the larger representation takes a little longer to download, but saves the extra round trips).</p>
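<p>The trade-off is easy to see by counting requests. The sketch below uses canned responses in place of the pizza API, so no network is involved. CountingClient and the two helper functions are hypothetical names for illustration.</p>

```python
# Canned responses standing in for the pizza API; no network is involved.
NORMALISED = {
    "/orders/432544": {"toppings": ["/toppings/cheese", "/toppings/jalapenos"]},
    "/toppings/cheese": {"calories": 100, "name": "Cheese"},
    "/toppings/jalapenos": {"calories": 25, "name": "Jalapeños"},
}
NESTED = {
    "/orders/432544": {
        "toppings": [
            {"calories": 100, "name": "Cheese"},
            {"calories": 25, "name": "Jalapeños"},
        ]
    }
}


class CountingClient:
    """Hypothetical client that records how many GETs it performs."""

    def __init__(self, responses):
        self.responses = responses
        self.requests = 0

    def get(self, path):
        self.requests += 1
        return self.responses[path]


def calories_normalised(client, order_path):
    order = client.get(order_path)
    # One extra round trip per topping URL.
    return sum(client.get(url)["calories"] for url in order["toppings"])


def calories_nested(client, order_path):
    order = client.get(order_path)
    return sum(topping["calories"] for topping in order["toppings"])


slow = CountingClient(NORMALISED)
fast = CountingClient(NESTED)
totals = (
    calories_normalised(slow, "/orders/432544"),
    calories_nested(fast, "/orders/432544"),
)
```

<p>Both clients arrive at 125 calories, but the normalised one needs three requests where the nested one needs a single request. The gap grows with every topping on the pizza.</p>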
<p>Now, wind the clock forward a few months. We've extended our iPhone app and the supporting REST API to cover table reservations, meal pre-ordering, and so on. We've run into the problem described above, so we've heavily optimised our API responses to minimise HTTP round trips. Life is good. Right?</p>
<p>We're approached by a company who have developed an Android app that can use our API, wondering if we'd be happy to make it the official Android app. It's an awesome piece of software. They've thought about the user experience in a totally different way, and it works well on Android (though the current iPhone app approach works best on iPhone). The only real problem is that some of the API requests they make download a ton of data that they simply don't need; and they have to make lots of other smaller requests to make other parts of their app work as they want.</p>
<p>In other words, they want a <b>different</b> set of optimisation choices for the API.</p>
<p>What do we do? Add a new API version with different optimisations (even though it's still dealing with the same set of resources) in the representations? That doesn't sound so great, as we'd be maintaining two APIs. It doesn't scale, either - what happens when someone writes an app for the TV, which needs another set of tradeoffs?</p>
<h3>A Possible Solution: Choosing Representations</h3>
<p>(Note that this section outlines a possible solution - I don't have an implementation for this yet.)</p>
<p>The key insight is that the applications do not require different resources. They merely need different representations of those resources. Some frameworks allow this to be done by specifying an extension on the end of the URL. If we did that, we'd end up with:</p>
<pre>http://pizza.com/toppings/cheese.xml<br />http://pizza.com/toppings/cheese.json</pre>
<p>While not strictly wrong, this isn't ideal when thinking in resources. Different URLs mean different resources. Just because both XML and JSON representations are available doesn't double the number of resources - it's all the same cheese topping. We need some way of expressing what representation we want for a given resource. Ideally, we shouldn't change the URL.</p>
<p>Fortunately, HTTP gives us the tools to do this: the HTTP Accept header. It's usually used for specifying content types: text/html, application/json, and so on. However, the specification allows for extra parameters. Let's take a look (from <a class="external-link" href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html">http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html</a>):</p>
<pre>Accept         = "Accept" ":"<br />#( media-range [ accept-params ] )<br /><br />media-range    = ( "*/*"<br /> | ( type "/" "*" )<br /> | ( type "/" subtype )<br /> ) *( ";" parameter )<br />accept-params  = ";" "q" "=" qvalue *( accept-extension )<br />accept-extension = ";" token [ "=" ( token | quoted-string ) ]</pre>
<p>That means that a client could send a request like this:</p>
<pre>GET /orders/432544<br />Accept: application/json;q=1;depth=1,application/json</pre>
<p>Here, we've added a depth parameter to the acceptable types. This depth parameter works a little like the numeric depth parameter to a Django queryset's `select_related` method call - it asks the server to traverse one level deep into related resources. The standard also specifies that more specific content types take precedence over less specific types - which means that we have a mechanism for the client to allow fallback to a normal JSON representation of the resource. If the server cannot provide a representation that fulfils any of the values in the Accept header, it would return a 406 Not Acceptable.</p>
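<p>On the server, reading the proposed parameter could look roughly like the sketch below. It is deliberately naive: it splits on commas and semicolons, so quoted strings containing those characters are not handled, and it ignores q values altogether. The function names, and the choice to treat a missing depth as 0, are assumptions made for this example.</p>

```python
def parse_accept(header):
    """Split an Accept header into (media_type, params) pairs, in header order."""
    ranges = []
    for part in header.split(","):
        media_type, *raw_params = [piece.strip() for piece in part.split(";")]
        params = {}
        for raw in raw_params:
            key, _, value = raw.partition("=")
            params[key.strip().lower()] = value.strip().strip('"')
        ranges.append((media_type.lower(), params))
    return ranges


def requested_depths(header, media_type="application/json"):
    """Depths the client will accept for `media_type`; 0 means the normal form."""
    return sorted(
        {
            int(params.get("depth", 0))
            for accepted, params in parse_accept(header)
            if accepted == media_type
        },
        reverse=True,
    )


depths = requested_depths("application/json;q=1;depth=1,application/json")
```

<p>For the header in the example request, this yields depth 1 first, with depth 0 (the normal form) as the fallback.</p>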
<h3>Aside: q?</h3>
<p>What's that 'q' parameter? It appears to be a historical quirk of the spec: q specifies a 'quality' value between 0 and 1, which lets a client rank a variety of representations by degrading quality, e.g. sample rate for sound. It's baked into the accept-params definition, so we include it here.</p>
<h3>Example Requests</h3>
<p>Let's compare two requests, one before using the proposed Accept convention, and one after:</p>
<pre style="padding-left: 1em; ">GET /orders/432544<br />Accept: application/json<br /><br />200 OK<br />{<br /> "toppings": [<br />   "/toppings/cheese", <br />   "/toppings/jalapenos"<br /> ]<br />}</pre>
<p>This is exactly the same request as we saw before, but with an explicit Accept header, specifying that the client is prepared to accept the normal form of the response. Let's see what might happen if a depth is specified:</p>
<pre style="padding-left: 1em; ">GET /orders/432544<br />Accept: application/json,application/json;q=1;depth=1<br /><br />200 OK<br />{<br />    "toppings": [<br />    {<br />        "calories": 100, <br />        "name": "Cheese"<br />    },<br />    {<br />        "calories": 25, <br />        "name": "Jalapeños"<br />    }<br />    ]<br />}</pre>
<p>We can now see the client expressing that it can accept resource representations nested to a single level - and the server responds in kind. Note that the client also specified that it could accept the normal form - so it would be valid for the server to respond with the first, normal form even to this second request.</p>
<p>What if the server could not fulfil the request? Perhaps the backend is unable to perform the join required to provide the nested data, and the client specified that it could only accept that nested representation:</p>
<pre style="padding-left: 1em; ">GET /orders/432544<br />Accept: application/json;q=1;depth=1<br /><br />406 Not Acceptable</pre>
<p>It might be sensible for the server to provide the normal representation in the body anyway, in case the client was able to process it.</p>
<p>In particular, this response might be used to prevent (or mitigate) DoS attacks; depending on the application, the server might impose a depth limit of 2 or 3 levels.</p>
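<p>Putting the pieces together, the server side might be sketched like this. Since there is no implementation yet, everything here is an assumption: the RESOURCES table, the expand and get functions, and the MAX_DEPTH limit of 2 are all invented for illustration. The sketch takes the list of depths the client said it would accept and either expands links or answers 406.</p>

```python
RESOURCES = {
    "/orders/432544": {"toppings": ["/toppings/cheese", "/toppings/jalapenos"]},
    "/toppings/cheese": {"calories": 100, "name": "Cheese"},
    "/toppings/jalapenos": {"calories": 25, "name": "Jalapeños"},
}
MAX_DEPTH = 2  # assumed server-side limit


def expand(value, depth):
    """Replace links to known resources with their representations."""
    if isinstance(value, str) and value in RESOURCES and depth > 0:
        # Following a link uses up one level of depth.
        return expand(RESOURCES[value], depth - 1)
    if isinstance(value, list):
        return [expand(item, depth) for item in value]
    if isinstance(value, dict):
        return {key: expand(item, depth) for key, item in value.items()}
    return value


def get(path, acceptable_depths):
    """Return (status, body) for the deepest acceptable depth within the limit."""
    allowed = [d for d in acceptable_depths if d in range(MAX_DEPTH + 1)]
    if not allowed:
        return 406, None
    return 200, expand(RESOURCES[path], max(allowed))
```

<p>With this sketch, get('/orders/432544', [1, 0]) returns the nested form, [0] returns the normal form with links intact, and a client that insists on a depth beyond the limit gets a 406.</p>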
<h3>Thoughts: Advantages, Limitations and Questions</h3>
<p>The approach outlined above would afford clients some flexibility in 'denormalising' the data that they receive on request, avoiding both the need for developers to create custom code to nest related data and the tendency for APIs to become over-specialised to the needs of one particular API client over time.</p>
<p>It's also cacheable - the Accept header should form part of the cache key, and provides a graceful degradation for servers which cannot perform the data nesting request.</p>
<p>However, it's not a silver bullet: in particular, it's not clear how one might implement an analogue to Django's .select_related('foo', 'bar') form, where the API consumer could specify which resources it wished to be nested in the main resource requested. Instead, all the client can specify is a simple depth, and may therefore receive far more nested information than it might actually need.</p>
<p>As far as I can see, there are two problems to solve before being able to implement this more specific second form:</p>
<ul>
<li>How to provide a self-describing way for a server to indicate that this facility is available for linked resources</li>
<li>How a client can tell the server which individual resource links to expand</li>
<ul>
<li>Perhaps including an XPath-like parameter in the Accept header might work here (tweaked suitably to apply to JSON documents too); but how would the client know which paths were expandable without requesting the normal form to start with?</li>
</ul>
</ul>
<p> </p>
<p>Implementation details aside, I do think that setting a convention (even of a limited form, such as the depth approach described above) would increase the usefulness of APIs. Perhaps I'll even get around to a Tastypie/Piston extension to implement it!</p>
<p>Thoughts on the approach - improvements and especially criticisms - are welcome in the comments.</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>mobile</dc:subject>
    
    
      <dc:subject>web</dc:subject>
    
    
      <dc:subject>api</dc:subject>
    
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2011-06-16T07:25:00Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/introducing-django-lazysignup">
    <title>Introducing django-lazysignup</title>
    <link>http://www.stereoplex.com/blog/introducing-django-lazysignup</link>
    <description>django-lazysignup is a package designed to allow users to interact with a site
as if they were authenticated users, but without signing up. At any time, they
can convert their temporary user account to a real user account. Read more
about it below.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p><a class="external-link" href="http://pypi.python.org/pypi/django-lazysignup/">django-lazysignup</a> is a Django application that was partly inspired by a talk that <a class="external-link" href="http://twitter.com/simonw">Simon Willison</a> gave at EuroPython a few years back (perhaps 2008, or 2009?) and partly to scratch an itch I had with an application I was building at the time. The problem it tries to solve is that making users sign up with a web site just to try out your app is quite a high barrier - potential users just bounce right off that registration form.</p>
<p>I'd seen some efforts to solve this problem before. Most seemed to involve stashing the data for some predetermined part of the website somewhere (often in the session) and then reconstituting it into real application data when the user eventually bit the bullet and signed up. This worked OK, but you had to write it anew for every web site, as clearly the data you'd want to store would change from site to site. You also ended up effectively developing a miniature version of your site that would work with some limited data set.</p>
<p>This didn't really seem good enough.</p>
<p>So I started wondering - what if we just created a real user for every person who visited the site? Django already has support for creating users with unusable passwords - so we can just create a user with an unusable password every time a new person comes along, and log them in. Then, at some future point (presumably once they've fallen in love with your site), they can set themselves up with a real username and password. And as a bonus, all the data that they created while messing about with the site sticks around, and carries over into their 'real' user.</p>
<p>This is, in essence, how django-lazysignup works.</p>
<p>Let's take a look in a bit more detail. You can <a class="external-link" href="http://pypi.python.org/pypi/django-lazysignup/">grab an official release from PyPI</a>, or <a class="external-link" href="https://github.com/danfairs/django-lazysignup">clone it on GitHub</a>.</p>
<h3>What's in the package?</h3>
<p>Once you've installed django-lazysignup (I'll let you <a class="external-link" href="http://pypi.python.org/pypi/django-lazysignup/">read the docs</a> to see how to do that), you've got a few tools to play with:<br /><br /></p>
<ul>
<li>An authentication backend</li>
<li>A user conversion view</li>
<li>The allow_lazy_user view decorator</li>
<li>The is_lazy_user template filter</li>
<li>The user agent blacklist</li>
<li>Custom user models</li>
</ul>
<p> </p>
<h3>Authentication backend</h3>
<p>The authentication backend needs to be installed for django-lazysignup to work. This backend is required so that we can authenticate the temporary user accounts without a password. We refer to these temporary users as 'lazy' users - they haven't bothered to sign up yet.</p>
<h3>The user conversion view</h3>
<p>In many cases, you're going to want your users to sign up eventually. The package includes a view to allow you to do this, converting a lazy user to a real user. In practice, this simply involves the user setting a username and password for their temporary user account. This approach means they get to keep all the data that they've already created in your application.</p>
<h3>The allow_lazy_user decorator</h3>
<p>The temporary user creation process is potentially an expensive one (and on a high-traffic site, may cause contention on the user table). django-lazysignup therefore provides a decorator that allows the developer to specify which views can trigger this process. Sites I've developed that use this package tend to apply the decorator to the views that do interesting things, but exclude the homepage and any static pages on the site.</p>
<h3>The is_lazy_user template filter</h3>
<p>It's often useful - particularly in templates - to find out whether the current user is a lazy (i.e. temporary) user or not. In particular, you may want to show a link to the convert view if they're a lazy user. This template filter is provided for this purpose. Note that lazy users will appear as authenticated (i.e. is_authenticated() returns True). For now, has_usable_password() also returns False for lazy users, though this should not be relied on. The canonical way of detecting a lazy user is through the is_lazy_user filter (and its associated function in the utils module).</p>
<h3>The user agent blacklist</h3>
<p>You probably don't want every request to a view that's opted in to lazy user creation to actually create a user. Principally, you probably don't want search engines to do this. The user agent blacklist is a crude way of filtering out such robots.</p>
<p>Note that this means that views that have the allow_lazy_user decorator aren't guaranteed to always have an authenticated user. You still need to make sure that your views work with unauthenticated users (or mark them with the login_required decorator or similar).</p>
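<p>To give a flavour of how crude this kind of filtering is, here's a minimal standalone sketch of matching a user agent string against a list of patterns. The patterns and the names BLACKLIST and may_create_lazy_user are made up for illustration - they aren't the package's own API, and its actual list may well differ.</p>

```python
import re

# Hypothetical patterns, for illustration only
BLACKLIST = [re.compile(pattern, re.I)
             for pattern in ('googlebot', 'slurp', 'msnbot')]


def may_create_lazy_user(user_agent):
    """Return False when the user agent matches any blacklisted pattern."""
    return not any(pattern.search(user_agent) for pattern in BLACKLIST)
```

<p>Anything that doesn't identify itself with one of the listed strings slips straight through, which is why views still need to cope with unauthenticated requests.</p>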
<h3>Custom user models</h3>
<p>As of version 0.7, django-lazysignup also has limited support for custom user models. Just set the LAZYSIGNUP_USER_MODEL setting appropriately (by default, it's auth.User to support contrib.auth out of the box). As alluded to before, support for custom users is basically predicated on the user model looking pretty much like a normal Django user - especially in that it's stored in the database.</p>
<p>I anticipate that this mechanism would change in the distant future if Django were to adopt some form of pluggable model (or at least, a pluggable user model).</p>
<h3>Restrictions</h3>
<p>django-lazysignup was built with Django's own contrib.auth application in mind. If you're not using this for authentication, then integrating django-lazysignup will be more of a challenge. (If you try it, and there are some changes to the package that would help you, please let me know!) In particular, it expects that you will have a user model stored somewhere in the database.</p>
<h3>The Future</h3>
<p>Honestly, I'd expected to have a 1.0 release out the door by now, but people keep suggesting great features that I'd like to get in before committing to API stability. Currently on the list are:</p>
<ul>
<li>Global view opt in, with opt-out decorator (suggested by Rob Hudson)</li>
<li>Support for deferred user validation - for example, to support email address validation (suggested by Alex Ehlke)</li>
</ul>
<p><br />In addition to Rob and Alex, thanks also to Luke Zapart for suggesting and providing an initial implementation and tests for the custom user model feature.<br /><br />If you have a play with the app, and there's something you'd like to see, then do get in touch - or just fork it on GitHub, add the feature (with docs and tests, preferably!) and send me a pull request.</p>
<p>And that application that django-lazysignup was originally built for? Well, I still want to do it. Someday.</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2011-04-24T16:00:22Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/speeding-up-django-unit-test-runs-with-mysql">
    <title>Speeding up Django unit test runs with MySQL</title>
    <link>http://www.stereoplex.com/blog/speeding-up-django-unit-test-runs-with-mysql</link>
    <description>Here are a couple of tips to speed up unit test runs on Mac OS X and Linux when running MySQL.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>When I'm developing Django sites, my database of choice is usually PostgreSQL. However, lots of clients use MySQL. And there lies a problem: table creation on MySQL seems to be an order of magnitude slower on Mac OS X than on Linux. This makes repeated unit test runs extremely painful.</p>
<p>I researched this a little bit a while ago, and noticed that it had been <a class="external-link" href="http://bugs.mysql.com/bug.php?id=56550">reported as a bug</a> in the MySQL tracker. At the time, there were no fixes or workarounds.</p>
<p>A recent update, however, has revealed the use of the skip-sync-frm option. Put it in your MySQL config file in the [mysqld] section for a quick speedup:</p>
<pre id="content">[mysqld]<br />default-table-type=innodb<br />transaction-isolation=READ-COMMITTED<br />default-character-set=utf8<br />skip-sync-frm=OFF</pre>
<div></div>
<p>Of course, nothing in this life is free, as Daniel Fischer explains in a comment:</p>
<p class="callout">The reason why it's slower on Mac OS X than on Linux is that on Mac OS X, fcntl(F_FULLFSYNC) is available, and mysqld prefers this call to fsync(). The difference is that fsync() only flushes data to the disk - both on Linux and Mac OS X -, while fcntl(F_FULLFSYNC) also asks the disk to flush its own buffers and blocks until the data is physically written to the disk.<br /><br />In a nutshell, it's slower because it's safer.</p>
<p>So, we're trading data integrity for performance - but this is a development machine, so trashing and recreating databases (or the MySQL installation for that matter) is fine, if necessary.</p>
<h3>Et tu, Linux?</h3>
<p>My colleague was having similar problems on the latest Ubuntu, 10.10. The tweak above helped him too, but his test runs were still painfully slow, even though he'd already added the 'noatime' option to fstab.</p>
<p>It turns out that the newest Ubuntu ships with ext4 as the default file system. By default, ext4 makes absolutely sure that all data has been written out to the filesystem journal before writing the journal commit record. This is done through the use of filesystem barriers. Again - this is done to prefer data integrity over performance. Since this is a dev machine, it's disposable, and performance is more important. So, we can turn this off by adding barrier=0 to the filesystem's mount options in /etc/fstab. Once it has been remounted, the output of mount should show the option:</p>
<pre>/dev/sda3 on / type ext4 (noatime,rw,errors=remount-ro,barrier=0)</pre>
<p>Read more about the <a class="external-link" href="http://kernelnewbies.org/Ext4#head-25c0a1275a571f7332fa196d4437c38e79f39f63">barrier setting on Kernelnewbies.org</a>.</p>
<p>Just to reiterate - it's probably best not to do this on a machine that's important without thinking about it carefully. Those settings have conservative defaults for a reason!</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>python</dc:subject>
    
    
      <dc:subject>linux</dc:subject>
    
    
      <dc:subject>django</dc:subject>
    
    
      <dc:subject>mysql</dc:subject>
    
    
      <dc:subject>tips</dc:subject>
    
    
      <dc:subject>macintosh</dc:subject>
    
    <dc:date>2010-12-07T14:22:21Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/filtering-dropdown-lists-in-the-django-admin">
    <title>Filtering Dropdown Lists in the Django Admin</title>
    <link>http://www.stereoplex.com/blog/filtering-dropdown-lists-in-the-django-admin</link>
    <description>It's not immediately obvious how to filter dropdown lists in the Django admin interface. This article will talk about how ForeignKeys can be filtered in Django ModelForms and then the Django admin. </description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Automatically-generated dropdown lists can seem a little mysterious at first - particularly when you first want to customise what they contain in the Django admin. I'm going to go through a number of examples of increasing complexity of customising the content of dropdowns in various contexts: ModelForms, and then into the Django admin.</p>
<p>Here are the models that the examples will work with. They're abbreviated and slightly modified versions of some models from the <a class="external-link" href="http://www.swooptravel.co.uk/">Swoop</a> project I'm currently working on:</p>
<pre># GeoDjango's models module provides the geometry fields used below<br />from django.contrib.gis.db import models<br /> <br />class Area(models.Model):<br />    title = models.CharField(max_length=100)<br />    area = models.MultiPolygonField(blank=True, null=True)<br /> <br />class Trip(models.Model):<br />    title = models.CharField(max_length=100)<br />    area = models.ForeignKey(Area)<br /> <br />class Landmark(models.Model):<br />    title = models.CharField(max_length=100)<br />    point = models.PointField()<br /> <br />class MountaineeringInfo(models.Model):<br />    trip = models.ForeignKey(Trip)<br />    area = models.ForeignKey(Area, blank=True, null=True)<br />    base_camp = models.ForeignKey(Landmark, blank=True, null=True)</pre>
<p>As you can see, we're using GeoDjango here - I'm not going to talk much about that here, but it should be obvious what's going on when we get to it. Note that these examples assume Django 1.2.</p>
<div>Here are the cases that this article will cover:</div>
<div>
<ul>
<li>Filtering a forms.ModelForm's ModelChoiceField</li>
<li>Filtering a Django admin dropdown</li>
<li>Filtering a Django admin dropdown in an inline, based on a value on the main instance (phew!)</li>
</ul>
</div>
<h2>Filtering a form's ModelChoiceField</h2>
<p>Consider this form:</p>
<pre>class MountaineeringForm(forms.ModelForm):<br />    class Meta:<br />        model = MountaineeringInfo</pre>
<p>This'll generate a simple form for us, including dropdowns with options for every Landmark, Trip and Area we have defined. Let's look at the area foreign key first. Note how the area attribute of the Area model is nullable. Let's say we only wanted to be able to select Areas from our MountaineeringForm which had a valid area attribute set - put another way, we want to filter out those records which are null.</p>
<p>This is pretty straightforward, and is in fact covered in the docs. And there are, in fact, two ways to do it:</p>
<pre>class MountaineeringForm(forms.ModelForm):<br />    area = forms.ModelChoiceField(queryset=Area.objects.exclude(area=None))<br />    class Meta:<br />        model = MountaineeringInfo</pre>
<p>This is probably the simplest way, and works well whenever the filtering you need to do does not depend on any request-specific or context-specific information.</p>
<p>The other option we have is to allow the ModelForm base class to do its usual thing, and then modify the fields that were generated directly.</p>
<pre>class MountaineeringForm(forms.ModelForm):<br />    class Meta:<br />        model = MountaineeringInfo<br /><br />    def __init__(self, *args, **kwargs):<br />        super(MountaineeringForm, self).__init__(*args, **kwargs)<br />        # Restrict the generated field to Areas that have a geometry set<br />        self.fields['area'].queryset = Area.objects.exclude(area=None)</pre>
<p>Now, this is slightly more verbose, and to my eye, not so clear as the first version. However, this approach of modifying the form after the fields have been constructed is a pattern we'll see in the coming examples.</p>
<h2>Filtering a Django Admin Dropdown</h2>
<p>Now, let's say that we want to edit MountaineeringInfo instances in the Django admin. At the moment, we just have this in our admin.py:</p>
<pre>admin.site.register(MountaineeringInfo)</pre>
<p>This generates a form much as we had previously with our simple ModelForm definition. However, we still want to filter out those Area instances which don't have an area set. We do this by providing a custom ModelAdmin subclass, and overriding the formfield_for_foreignkey method:</p>
<pre>class MountaineeringInfoAdmin(admin.ModelAdmin):<br />    def formfield_for_foreignkey(self, db_field, request, **kwargs):<br />        if db_field.name == 'area':<br />            kwargs['queryset'] = Area.objects.exclude(area=None)<br />        # Hand back to the default implementation, which builds the field<br />        return super(MountaineeringInfoAdmin, self).formfield_for_foreignkey(db_field, request, **kwargs)<br /><br />admin.site.register(MountaineeringInfo, MountaineeringInfoAdmin)<br /></pre>
<p>This is pretty straightforward, and is indeed documented in the Django docs. Note that the request is passed into this method, so it's easy to perform some filtering based on some aspect of the request - the currently logged-in user, for example.</p>
<h2>Filtering an inline's dropdown based on the inline instance</h2>
<p>This is slightly more complicated. Let's say we're editing a Trip, and we're editing MountaineeringInfo instances by way of an inline. In code terms, we've got something like this:</p>
<pre>class MountaineeringInfoInline(admin.TabularInline):<br />    model = MountaineeringInfo<br /><br />class TripAdmin(admin.ModelAdmin):<br />    inlines = [MountaineeringInfoInline]</pre>
<p>Now, referring back to our models, let's say we want to filter the available Landmarks for an inline depending on what Area is selected. There are two cases we have to consider: what happens when the inline is displayed but blank (ie. it's not bound to a MountaineeringInfo instance); and then, once we've got a MountaineeringInfo instance to bind to.</p>
<p>This can be solved using custom inline formsets. Let's extend the MountaineeringInfoInline class:</p>
<pre>class MountaineeringInfoInline(admin.TabularInline):
    model = MountaineeringInfo
    formset = MountaineeringInfoInlineFormset
</pre>
<p>Note we've defined an extra attribute, specifying a custom formset. Let's go ahead and define that formset:</p>
<pre>from django.forms.models import BaseInlineFormSet

class MountaineeringInfoInlineFormset(BaseInlineFormSet):
    def add_fields(self, form, index):
        super(MountaineeringInfoInlineFormset, self).add_fields(form, index)
        # Default to no landmarks until an area has been chosen
        landmarks = Landmark.objects.none()
        if form.instance:
            try:
                area = form.instance.area
            except Area.DoesNotExist:
                pass
            else:
                # An unset nullable foreign key comes back as None
                if area is not None:
                    landmarks = Landmark.objects.filter(point__within=area.area)
        form.fields['base_camp'].queryset = landmarks
</pre>
<p>Here, we override the inline formset's add_fields() method which - unsurprisingly - is called for each form in the inline formset as its fields are set up. Note that the form is passed in as an argument. Since this is a ModelForm, the underlying instance (which will be a MountaineeringInfo instance, remember) is available using the instance attribute on the form. Now, if Django is generating a new, blank inline form, there is no saved MountaineeringInfo behind it and so no area to filter on (a ModelForm creates a fresh, unsaved instance when it isn't given one). In this case, we don't want any landmarks to display - we want the user to have chosen an area first. Hence, we assign an empty QuerySet to the base camp field on the form.</p>
<p>On the other hand, if the instance has been saved, then we have an existing MountaineeringInfo to work with. In this case, we get the area associated with it (note that area is nullable, so we have to guard against the possibility that no area has been set) and create a QuerySet of all landmarks whose point lies within the area. So, when a user selects an area from the dropdown and presses Save, the landmarks contained in the base camp dropdown filter themselves to only those within the specified area.</p>
<p>Depending on your app, you might want the default value for landmarks to be Landmark.objects.all() rather than none() as per the example above - if so, remember that all() is the default, so you could eliminate some of that code.</p>
<p>The only thing to be aware of with the above code is if a user selects an area and a landmark, then changes the area so that the selected landmark is no longer valid for the area, the old landmark will remain set in the database. Of course, the base_camp dropdown would be blank, and would be reset to None when Save was pressed. If this were a problem, it would be possible to set the queryset to be Landmark.objects.filter(pk=form.instance.base_camp.pk).</p>
<h2>Filtering an inline's dropdown based on the parent</h2>
<p>OK, confession first - this feels like a hack. But I haven't found a cleaner way to do it - let me know if you know how to!</p>
<p>Note the Trip model has an Area foreign key as well. How might we go about filtering the 'area' dropdown in new MountaineeringInfo instances to only contain areas that are within the parent Trip's area?</p>
<p>Well, there are two parts to this:</p>
<ol>
<li>Figuring out what the area of the parent Trip is</li>
<li>Filtering our own area dropdown depending on this value</li>
</ol>
<p>We've actually done most of the work necessary to understand how to do the second part already. So let's do that part first. The key thing to know is that inline classes such as TabularInline, like ModelAdmin classes, have a formfield_for_dbfield method. We can therefore override this to restrict the queryset used in fields contained within the inline. Let's extend our existing definition:</p>
<pre>class MountaineeringInfoInline(admin.TabularInline):
    model = MountaineeringInfo
    formset = MountaineeringInfoInlineFormset
    
    def formfield_for_dbfield(self, field, **kwargs):
        if field.name == 'area':
            # Note - get_object hasn't been defined yet
            parent_trip = self.get_object(kwargs['request'], Trip)
            # get_object returns None when the URL carries no object ID (e.g. the add page)
            if parent_trip is not None:
                # 'within' selects Areas lying inside the parent Trip's area
                contained_areas = Area.objects.filter(area__within=parent_trip.area.area)
                return forms.ModelChoiceField(queryset=contained_areas)
        return super(MountaineeringInfoInline, self).formfield_for_dbfield(field, **kwargs)
</pre>
<p>As you can see - very similar to what we've seen before. We use the get_object call to extract the Trip that this MountaineeringInfo instance is (or will be) related to, and find all Area instances which are contained by that parent Trip's area.</p>
<p>So, what does that get_object() method look like?</p>
<pre>    def get_object(self, request, model):
        object_id = request.META['PATH_INFO'].strip('/').split('/')[-1]
        try:
            object_id = int(object_id)
        except ValueError:
            return None
        return model.objects.get(pk=object_id)
</pre>
<p>This clearly isn't ideal, as it depends on the URL structure used by the Django admin: it extracts the object ID by stripping off leading and trailing slashes, splitting on slashes, and taking the last element. It then looks up the appropriate object using the model class passed in.</p>
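<p>To make that dependency on the URL concrete, here's the same parsing step pulled out into a standalone function that can be tried without Django. The name object_id_from_path and the example paths are hypothetical; they just assume an admin change URL that ends in the object ID.</p>

```python
def object_id_from_path(path_info):
    """Return the trailing integer ID from an admin-style path, or None."""
    # '/admin/trips/trip/42/' becomes ['admin', 'trips', 'trip', '42']
    last_segment = path_info.strip('/').split('/')[-1]
    try:
        return int(last_segment)
    except ValueError:
        # e.g. the 'add' page, where there is no object yet
        return None
```

<p>Any URL whose last segment isn't the ID - an add page, say, or a change in the admin's URL layout - gives None, so callers need to be ready for that.</p>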
<div>So that class in full:</div>
<pre>class MountaineeringInfoInline(admin.TabularInline):
    model = MountaineeringInfo
    formset = MountaineeringInfoInlineFormset

    def formfield_for_dbfield(self, field, **kwargs):
        if field.name == 'area':
            parent_trip = self.get_object(kwargs['request'], Trip)
            # get_object returns None when the URL carries no object ID (e.g. the add page)
            if parent_trip is not None:
                # 'within' selects Areas lying inside the parent Trip's area
                contained_areas = Area.objects.filter(area__within=parent_trip.area.area)
                return forms.ModelChoiceField(queryset=contained_areas)
        return super(MountaineeringInfoInline, self).formfield_for_dbfield(field, **kwargs)

    def get_object(self, request, model):
        object_id = request.META['PATH_INFO'].strip('/').split('/')[-1]
        try:
            object_id = int(object_id)
        except ValueError:
            return None
        return model.objects.get(pk=object_id)
</pre>
<p>(In the real app code, that get_object() is in a base class, for easier reuse - hence the parameterisation of the model.)</p>
<h2>Can you do better?</h2>
<p>This kind of filtering is often required in non-trivial applications: be it filtering on security (which is relatively easy, as the request is usually present in most ModelAdmin APIs) or filtering on other data values - which seems a lot trickier than you might want. However, once you understand how the various ModelAdmin classes, inlines and formsets fit together, it's not too bad.</p>
<p>I'm keen to hear how you're tackling this in your Django apps - and whether this can be simplified!</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>tips</dc:subject>
    
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2010-10-01T20:50:00Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/swoop-travel-live">
    <title>Swoop Travel Live!</title>
    <link>http://www.stereoplex.com/blog/swoop-travel-live</link>
    <description>Foundry's first GeoDjango site, Swoop Travel, has gone live.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I'm pleased to announce that <a href="http://wearefoundry.com">Foundry's</a> first web site has gone live: <a href="http://www.swooptravel.co.uk/">Swoop Travel</a>.</p>
<p>Swoop Travel is a GeoDjango site using PostgreSQL and PostGIS, although very little GIS functionality is in the public-facing site at the moment - it's mainly in the backend.</p>
<p>We built this from scratch in about a month (while juggling other projects too!) and are pretty pleased with how it's turned out. This is just the first iteration and we're looking forward to expanding the site.</p>
<p>GeoDjango is pretty awesome. As we work on the site and get more experience, I'll post some more about some of the innards: there's quite a lot of interesting stuff in the backend, particularly admin customisations like filtering dropdowns based on landmarks within geographic regions, and so on. I'll also try to talk about the production server configuration, as GeoDjango needs a slightly different WSGI configuration to the standard run-of-the-mill Django site.</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2010-07-30T08:20:44Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/buildout-vs-pip-virtualenv-and-requirements-files">
    <title>buildout vs pip, virtualenv and requirements files</title>
    <link>http://www.stereoplex.com/blog/buildout-vs-pip-virtualenv-and-requirements-files</link>
    <description>The Python world is blessed with two mainstream choices for integrating Python packages into an application: buildout and pip. This post looks at the pros and cons of each, to try to help you pick which one is best for you.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Repeatable software configurations are crucial for reliable software deployments. Applications (particularly web applications based on frameworks) are often made up of dozens, if not hundreds of separate packages. When you deploy your application, you want to be sure that the packages and their versions are the same as those that were tested through development and staging.</p>
<h3>Buildout</h3>
<p>Buildout, developed originally by Jim Fulton at Zope Corporation, was an attempt to address this problem. zc.buildout 1.0.0, released back in January 2008 (after over a year of betas), has become the accepted way of managing the hundreds of packages that make up Zope, BlueBream and Plone. Buildout isn't restricted to Zope, of course - it's usable for any setuptools-aware Python software.</p>
<p>Buildout is based around configuration files, most commonly a single file called buildout.cfg. Buildout itself doesn't do very much; most functionality is provided by add-on modules. These modules are called 'recipes'. They're easy to write, and there are <a href="http://pypi.python.org/pypi?:action=browse&show=all&c=512">dozens of custom recipes</a> to perform specialised tasks. However, the key piece of functionality that buildout offers is to allow the developer to simply list the packages (eggs) required by their application, and optionally their versions. Minimum, maximum and ranges of acceptable versions can be specified, and buildout will do its best to satisfy version requirements.</p>
<p>Buildout is based on setuptools (which is in turn based upon distutils, part of the Python standard library). It uses setuptools to do the heavy lifting of package search and installation. Buildout environments are isolated from the system Python: packages installed via buildout don't end up anywhere near system Python's site-packages directory.</p>
<h3>pip and virtualenv</h3>
<p>Pip and virtualenv both come originally from the ever-productive Ian Bicking. pip is a replacement for easy_install (which is part of setuptools), and offers a number of advantages over it. Like buildout, pip also offers the ability to install a set of Python packages of a given version, using a requirements file. The requirements file lists which eggs to install, their versions, editable eggs checked out of source control, alternative mirrors for the location of packages, and so on.</p>
<p>pip installs into the current Python environment - which by default, will be the system Python. Pip is therefore commonly used with virtualenv, which isolates the Python environment.</p>
<p>buildout dependency resolution - build can fail halfway through due to version conflicts</p>
<p><span style="font-size: small;">A hybrid approach? gp.recipe.pip</span></p>
<p>Would be nice if it could provide data to feed mr.developer [sources], and [versions]</p>
<h3>Conclusion</h3>
<p>Choose pip - simplicity</p>
<p>Choose buildout - sophistication</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>buildout</dc:subject>
    
    
      <dc:subject>plone</dc:subject>
    
    
      <dc:subject>python</dc:subject>
    
    
      <dc:subject>zope</dc:subject>
    
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2010-02-17T22:44:34Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/oserror-while-installing-django-buildout-and-djang">
    <title>OSError while installing Django with buildout and djangorecipe</title>
    <link>http://www.stereoplex.com/blog/oserror-while-installing-django-buildout-and-djang</link>
    <description>A corrupted Django tarball can cause mysterious errors from djangorecipe.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Sometimes, I see the following error when trying to run a Django buildout:</p>
<pre>File "/Users/dan/.eggs/djangorecipe-0.20-py2.6.egg/djangorecipe/recipe.py", line 271, in install_release
    os.listdir(extraction_dir)[0]
OSError: [Errno 2] No such file or directory: '/Users/dan/.downloads/django-archive'
</pre>
<p>After a bit of poking around, I found that this is to do with a corrupted Django tarball. In my case, this is usually because I've interrupted a download with Ctrl-C. Unfortunately it seems that the tarfile module in the Python standard library (at least as invoked by setuptools) treats a broken tar.gz file as an empty archive, without throwing an exception. Since there's no exception, djangorecipe assumes everything was uncompressed without problems, and is therefore rather surprised when the unpacked Django package isn't where it expected it to be.</p>
<p>The short term solution is to delete the bad Django archive from your download cache. This will likely be a 'downloads' directory in your buildout, or you may have a global one (as I do). When you next run buildout, the tarball will be freshly downloaded.</p>
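<p>If you'd rather check the download cache than delete things blindly, something along these lines can spot a truncated archive. It's a minimal sketch, assuming Python 3 and a gzip-compressed tarball; tarball_is_intact is a name I've made up for illustration, and it isn't anything djangorecipe or setuptools provides.</p>

```python
import gzip
import tarfile
import zlib


def tarball_is_intact(path):
    """Return True if path is a readable, non-empty .tar.gz archive."""
    try:
        # Decompress the whole stream; a truncated download raises EOFError
        # because the gzip end-of-stream marker is never reached.
        with gzip.open(path, "rb") as stream:
            while stream.read(1024 * 1024):
                pass
        # The stream is complete, so check it holds at least one member.
        with tarfile.open(path, "r:gz") as archive:
            return len(archive.getnames()) != 0
    except (EOFError, OSError, zlib.error, tarfile.TarError):
        # Missing file, not gzip data, truncated stream or not a tar archive.
        return False
```

<p>Any archive in the cache for which this returns False is a candidate for deletion before the next buildout run.</p>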
<p>When I get a moment I'll see if I can modify djangorecipe to notice this condition and not proceed with the build.</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    
      <dc:subject>buildout</dc:subject>
    
    
      <dc:subject>tips</dc:subject>
    
    <dc:date>2010-01-18T17:28:53Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/migrating-django-mingus">
    <title>Migrating to Django Mingus</title>
    <link>http://www.stereoplex.com/blog/migrating-django-mingus</link>
    <description>I've migrated my blog from a creaking Plone 2.5 to a fork of Django Mingus. This is the process I went through.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I've been running a blog since 2007 (check the archives!). At the time, it made most sense for me to go for a Plone-based blog. Plone was what I was most familiar with, and there was a simple blog product out there that I could use called, sensibly enough, SimpleBlog. In fact, you can still grab it - it's where you'd expect on the <a href="http://plone.org/products/simpleblog">Plone Products section</a>. And as you can see, the release there was the one that I used at the time - SimpleBlog 2.0, for Plone 2.5.</p>
<p>Fast-forward to the start of 2010, and things have moved on. Plone's moved on, for sure. Plone 4 is just around the corner, and there's some really, really cool stuff in there: Dexterity, a new content types framework, finally looks like it'll make content type creation as easy as it should be, and simple types and behaviours can be created through the web. Deco, slated (last time I heard) for Plone 5, is quite literally going to make publishers wet themselves. And Deliverance has got great potential in helping to unify the many disparate systems which make the typical corporate user's daily life such a grind. Thing is, I'm not a big corporate user; a lot of what makes Plone great for that environment is overhead for me and my little blog.</p>
<p>The big change for me personally, though, is that more and more of my work is now Django. About two thirds of 2009 was Zope 2/Five, and a third Django; 2010 is looking like it'll be the other way around. So, having decided that my blog was looking a little tired, and seemed to be a bit of a spam magnet, I decided to bite the bullet and do a complete rebuild in Django.</p>
<p>I didn't want to write yet-another-blog from scratch, so I picked what seemed to have the most buzz around it at the time - <a href="http://github.com/montylounge/django-mingus">Django Mingus</a> - and started from there. And this is how I did it.</p>
<p>Oh, before I get stuck in, all the code for this site is open source. Fortunately, there's not much of it. There's no documentation as such (this is open source, after all) because the target audience is, well, me. I've stayed true to my Zope roots and used buildout; the buildout and Django project can be checked out from GitHub:</p>
<p><a href="http://github.com/danfairs/stereoplex-buildout">Stereoplex buildout and project</a></p>
<p>Onwards.</p>
<h2>Planning</h2>
<p>There's more to moving a blog than just installing your blog software and bashing out new articles. I had to consider a number of migration issues, specifically:</p>
<ul>
<li>Obviously, I need to move all the content over from my old blog. That's not just articles: that's images, comments, the whole shebang.</li>
<li>I needed to either keep all the old article URLs from my previous blog working, or have them redirect to the new URLs.</li>
<li>Similarly, there are lots of RSS URLs out in the wild; I've handed a couple of them out to the Django community aggregator and the Planet Plone aggregator. There's also the main site feed.</li>
<li>I decided not to move user accounts over, as I'd already decided that I wasn't going to require a login to comment. I was going to go with a ReCaptcha integration.</li>
</ul>
<p>So, the first step was obviously going to be to extract all the content from my old blog.</p>
<h2>Getting the content out of the ZODB</h2>
<p>Plone stores data in the ZODB. The ZODB is amazing. It was years ahead of its time, and provides a really natural way to store and interact with data in document-oriented systems. It's only really in the last year or so that we've seen the rise of broadly similar data stores, most of which talk HTTP and don't have the fine-grained transaction control that ZODB provides. The only thing to remember with the ZODB is that you can only access it directly with Python. That meant that I had to write a Python script to dump out all my content into a platform-neutral form. (I could have written a script which read the ZODB directly and created Django models, I guess. Using an XML intermediate format seemed easier though, and was probably quicker to work with - loading up the Plone 2.5 codebase is pretty slow.)</p>
<p>Most Plone applications, SimpleBlog included, stored their data using the Archetypes content type framework. Archetypes (often just AT) is a schema-driven approach to creating content. This is a pretty handy approach (even if AT's implementation was a bit awkward in places) as it's really simple to write code that can introspect that content. This was invaluable for writing an export script.</p>
<p>The result is on github: <a href="http://github.com/danfairs/Simple-AT-XML-Dump">Simple-AT-XML-Dump</a>. I should note that AT does have support for native XML dump and load. Thing is, I've got an ancient version of AT, and SimpleBlog used CMF (a layer underneath Plone) comments. I had no idea if XML marshalling would work properly, so I just did my own custom XML format. It's an instance script. It should work pretty well on any folderish Archetypes types. Invoke it like this:</p>
<pre>bin/instance run Simple-AT-XML-Dump/run.py -p /zodb/path/to/plone/content -o out.xml</pre>
<p>/zodb/path/to/plone/content is the physical path in the ZODB to dump. This needs to be the root folderish AT item. out.xml is the output file.</p>
<p>I won't go into the specifics of how the script works (if there's anything you're interested in especially, mail me or post a comment below). In a nutshell though, it'll dump out AT types by schema (including ImageFields with base64-encoded data) and any CMF discussions that are associated with them.</p>
<h2>Starting with Django Mingus</h2>
<p>Many years of pain have taught me to automate my build. The XML dumper instance script above is pretty much the smallest thing I'll do without a build (and even now, I feel a twinge of guilt at not having packaged it properly.) For me, therefore, the first thing to do was to get a build up and running.</p>
<p>The system du jour seems to be to use pip and virtualenv. Pip (a replacement for easy_install) lets you specify requirements files, which define which Python packages your application uses and what versions of them, and lets you install directly from the major source control systems. (This ability seems to have caused lots of requirements files with github links in them to spring up in 'released' packages, especially in the Django world. I regard this as a Bad Thing; but that's an opinion piece for another time). However, grizzled Zope veterans tend to reach for buildout in such a circumstance so, wanting to restrict New Fangled stuff to learning how Mingus was put together, I stuck with that. Buildout predates pip and virtualenv, and so does stuff that you'd normally just do with virtualenv, like creating an isolated environment. Buildout config files are also a touch more verbose than requirements files. That said, I'm still pretty sure that buildout is the more extensible system, with a vast array of custom recipes; and, in common with a lot of software from the Zope world, the narrative documentation is atrocious.</p>
<p>But I know buildout, and I'm not giving up these scars so easily, so that's what I went with, basing my buildout config on the requirements file that comes with Mingus. This process was actually pretty simple:</p>
<ol>
<li>Set up a standard Django buildout using djangorecipe</li>
<li>Pop mr.developer in as an extension, add [sources] for all of the editable (-e) eggs in the pip requirements file, and add them all as auto-checkout</li>
<li>Put all the non-editable eggs from the requirements file into the eggs section of the buildout config</li>
<li>Use the versions supplied in the requirements file to create a [versions] section in the buildout config</li>
</ol>
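<p>The steps above are mechanical enough to sketch in code. This is only an illustration, assuming Python 3 and a requirements file that contains nothing fancier than comments, -e lines and name==version pins; split_requirements and buildout_fragment are hypothetical helpers, not part of pip, buildout or mr.developer, and the editable entries would still need writing into [sources] by hand.</p>

```python
def split_requirements(lines):
    """Separate editable (-e) requirements from pinned name==version ones."""
    editable, pinned = [], {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("-e "):
            editable.append(line[3:].strip())
        elif "==" in line:
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
        else:
            pinned[line] = None  # unpinned: listed as an egg, no version
    return editable, pinned


def buildout_fragment(pinned):
    """Render eggs and [versions] text for the non-editable requirements."""
    eggs = "\n".join("    " + name for name in sorted(pinned))
    versions = "\n".join(
        "%s = %s" % (name, version)
        for name, version in sorted(pinned.items())
        if version is not None
    )
    return "eggs =\n%s\n\n[versions]\n%s\n" % (eggs, versions)
```

<p>The editable list is what ends up as mr.developer checkouts; the fragment covers the eggs and [versions] parts.</p>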
<p>This gives me a buildout.cfg file that mirrors the Mingus requirements file, but will also:</p>
<ul>
<li>Create the django management script and django WSGI file automatically</li>
<li>Create a project and select an appropriate settings file for me</li>
</ul>
<p>The end result of this is the GitHub project I linked to above, <a href="http://github.com/danfairs/stereoplex-buildout">stereoplex-buildout</a>.</p>
<h2>Stereoplex</h2>
<p>I created another egg, called <a href="http://github.com/danfairs/stereoplex">stereoplex</a>, to contain all my site-specific customisations and scripts. I used another package of mine, <a href="http://pypi.python.org/pypi/fez.djangoskel/">fez.djangoskel</a>, to create the basic layout. Specifically, it has:</p>
<ul>
<li>A Django management command to import the XML file created with Simple-AT-XML-Dump</li>
<li>A Django ModelAdmin subclass (actually a basic.blog.admin.PostAdmin subclass) to let me use TinyMCE as my editor</li>
<li>A ReCaptcha Django form field, widget, and custom comment form</li>
<li>A single extra view, which returns all items posted on the blog</li>
<li>A URLConf which brought together Mingus' URLs plus those for the extra view, and for TinyMCE</li>
<li>And of course, all the template overrides and CSS, JavaScript and images required for the new Stereoplex look and feel</li>
</ul>
<h2>Importing the Data</h2>
<p>The next step was to write a data importer. This had to do a number of things (data migrations are never simple!):</p>
<ul>
<li>Import all the images in the XML data file, creating basic.media.models.Photo instances for each of them</li>
<li>Rewrite all image links in the body text of posts to contain &lt;inline&gt; elements used by Mingus</li>
<li>Create basic.blog.models.Post instances for each blog post in the file</li>
<li>Create django.contrib.comments.models.Comment instances for every comment, and associate them with the appropriate post </li>
<li>Create django.contrib.redirect.models.Redirect objects for each imported post, to allow existing inbound links to be redirected to the new location.</li>
</ul>
<p>Automated content import is one of those things that you tell clients is usually impossible. And when they're migrating from a legacy CMS platform (or indeed, hand-maintained HTML), then that's usually right. The inevitable gigabytes of hand-rolled HTML are at best poorly formed, and at worst represent content which needs throwing away anyway.</p>
<p>I was more fortunate. I didn't have that much content to migrate - 60-odd posts, and a few images - and Plone 2.5's default editor Kupu is actually pretty good at producing good HTML. I was able to directly use the existing HTML, and only needed to replace the &lt;img&gt; tags with the appropriate &lt;inline&gt; expected by Mingus.</p>
<h2>Changes to Mingus packages</h2>
<p>I did make some modifications to Mingus' packages. These were as follows:</p>
<h3>django-mingus</h3>
<ul>
<li><a href="http://github.com/danfairs/django-mingus/commit/f94ad242fe9f216651333352b982738bcd1bdf2e">Improve the way subscription links are generated</a></li>
<li><a href="http://github.com/danfairs/django-mingus/commit/7c11807bd80c5d8f1b88185aa7f9bddbea0f99dc">On a category page, include the RSS link for that category in the page header</a></li>
<li><a href="http://github.com/danfairs/django-mingus/commit/898874e65a131db2dc2bc6cf8b695655d837290f">Hide teaser-related markup if there's no teaser</a></li>
</ul>
<h3>django-basic-apps</h3>
<ul>
<li><a href="http://github.com/danfairs/django-basic-apps/commit/482daf65b1f372b5cbe518eed5a34d1e1ca72404">Fix a bug where a template tag library was not loaded</a></li>
</ul>
<h3>django-sugar</h3>
<ul>
<li><a href="http://github.com/danfairs/django-sugar/commit/5a748578b0e19a6102e75fa68a2160da80572c2a">Enhance the pygmentize filter to not be restricted to &lt;code&gt;</a></li>
</ul>
<p>All really very minor.</p>
<h2>Server Configuration</h2>
<h3>Apache</h3>
<p>The Plone instance used a standard small setup: a single ZEO client talking to a ZEO server, fronted by Apache with the magic RewriteRule. The Zope configuration had been tweaked slightly to be usable on a small (by Plone's standard!) 512MB RAM host, but I hadn't had time to do any other optimisations (for example, serving static files from Apache) that I would normally do.</p>
<p>Django forces you to do at least some of these. Static files are always served by an external web server (unless you are really determined to sail through all the warnings in the documentation). I also finally got around to turning on gzip compression.</p>
<p>One remaining task that I planned to do in the Apache configuration was to provide redirects for my RSS feeds. This wasn't as easy as I might have liked, since the old feed URLs had query string elements; and indeed, it was some of those values that I needed to formulate a correct redirect. I ended up with the following:</p>
<pre class="code">RewriteEngine On<br />RewriteCond %{query_string} ^(.*)?EntryCategory=(.*)?&amp;(.*)<br />RewriteRule ^/search_rss /feeds/categories/%2/ [R=permanent,L,NC]<br /></pre>
<p>This simply declares three match groups in the RewriteCond regex, the second of which (hence %2) is the category slug. The RewriteRule then issues a permanent redirect to the new URL. We have to use RewriteCond, because RewriteRule regexes won't match a query string, only the URL path.</p>
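<p>The match groups are easy to try out away from Apache. Here's a rough sketch, assuming Python 3 and that Python's re module treats this particular pattern the same way mod_rewrite does; category_feed_url is a made-up name, and the sample query string in the note below is invented rather than a real old feed URL.</p>

```python
import re

# Same pattern as the RewriteCond above, applied to a bare query string.
QUERY_PATTERN = re.compile(r"^(.*)?EntryCategory=(.*)?&(.*)")


def category_feed_url(query_string):
    """Return the new feed path for an old search_rss query, or None."""
    match = QUERY_PATTERN.match(query_string)
    if match is None:
        return None
    # Group 2 plays the role of %2 in the RewriteRule substitution.
    return "/feeds/categories/%s/" % match.group(2)
```

<p>Calling category_feed_url('EntryCategory=django&sort_on=effective') gives '/feeds/categories/django/'. A query string with no ampersand after the category doesn't match at all, which is worth knowing before relying on the rule.</p>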
<p>Next up is the mod_wsgi configuration. I use a fairly standard configuration, which is generally as follows:</p>
<pre class="code">WSGIScriptAlias / /var/websites/www.stereoplex.com/bin/django.wsgi<br />WSGIDaemonProcess stereoplex user=stereoplex group=stereoplex processes=3 threads=25 maximum-requests=1000 stack-size=524288<br />WSGIProcessGroup stereoplex<br /></pre>
<p>This runs the Stereoplex web application in its own process group, and with its own user and group membership. This is a safety net: if the site is compromised or (to be honest, more likely) I screw up and the app tries to write to the filesystem, its rights are limited by the host's file access control. The WSGI script file is generated by buildout. Probably the most interesting parameter here is the stack size. This is set lower than Linux's default value; for this sort of app, it doesn't need to be as large as the default. Setting this value to lower than the default led to a massive memory saving (remember, this is only a 512MB host; and this site is one of around half a dozen running on the same machine).</p>
<h3>Memcached</h3>
<p>Mingus makes extensive use of Django's caching support. Other Django sites on the same host use a Memcached instance, so I just pointed Stereoplex at that one. Memcached is essentially in a default configuration, except only bound to the localhost IP address, rather than the public IP address. It's configured to use a maximum of 64MB of RAM. This doesn't sound a great deal, but even with two or three websites using it, I haven't seen it go over 40MB.</p>
<h2>So... All Done?</h2>
<p>Nearly.</p>
<p>If I'm honest, the content editing experience isn't as nice in Django as it was in Plone. This is expected. Plone is a CMS, and has an administrative interface that's focussed on the business of managing content. Django isn't a CMS, it's a more general web framework. The experience is more like editing content in the ZMI.</p>
<p>That said, there are advantages. I've found third-party Django software much easier to integrate and customise than third-party Plone products ever were. I'm not fighting reams of configuration all the time. There's a strong mindset of developing apps to be reusable in the Django world; Mingus itself is little more than some UI glue (as, really, is Stereoplex). If I were tasked with delivering a large CMS with flexible authentication and authorisation, workflow, and so forth: I'd go for Plone in an instant. But for this job, many of Plone's strengths simply don't apply, and something simpler and more lightweight was more appropriate.</p>
<p>Anyway - I'm quite pleased with the results. There are still a few kinks to be worked out (pygmentize only seems to be applied on the home page, not individual post pages, for example) but I'll get there over the next couple of weeks.</p>
<p>If there's anything you'd like to have more information on, then leave a comment (click on the article heading - yes, a proper link to comments is on the list!) or of course, just go and grab the code at <a href="http://github.com/danfairs/stereoplex">Github</a>.</p>
<p>Enjoy!</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>plone</dc:subject>
    
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2010-01-14T23:38:34Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror">
    <title>Python, Unicode and UnicodeDecodeError</title>
    <link>http://www.stereoplex.com/blog/python-unicode-and-unicodedecodeerror</link>
    <description>In the years I've been developing in Python, Unicode seems to be the topic which causes the greatest amount of confusion amongst developers. Hopefully much of this confusion should go away in Python 3, for reasons I'll come to at the end; but until then, the UnicodeDecodeError is the bane of many developers' lives.
</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<h3>Unicode and Encodings</h3>
<p>OK, let's take a step away from text for a moment. I want you to think of a number between one and ten. Got one? Great - now, grab a pen and paper, and write it down.<br /><br />What number did you think of? Well, I thought of the number six. And when I wrote it down, it looked like this:<br /><br /><img alt="6" class="image-inline" src="numeric___png_580x500_crop_q85.jpg" /></p>
<p>Of course, if I were an ancient Roman (or possibly a clockmaker), I could have written this:<br /> <br /></p>
<p><img alt="Six in Roman Numerals" class="image-inline" src="bars___png_580x500_crop_q85.jpg" /><br /><br /> <br />They all mean the same thing - the number six. But we've written them in different ways. In other words, we've 'encoded' our idea of the number six in our head in three different ways - three different encodings.<br /><br />The separation of the idea of 'the number six' from its actual representation is basically all Unicode is. The Unicode Character set (UCS) defines a set of things (loosely, a set of letters) that we can represent. How we represent each of those letters is called an encoding. There's only one Unicode, but there are many encodings. In Unicode parlance, each of those 'things' (letters) is known as a 'code point'. Unicode separates the characters' meaning from their representation.<br /><br />For historical reasons, the most common encoding (in Western Europe and the US, anyway) is ASCII. This is also Python's default encoding. <br /><br />Let's think about ASCII for a moment. It's an encoding that uses 7 bits, which limits it to 128 possible values. That's enough for the basic Latin alphabet in both cases, the digits, punctuation and some control characters, but not for letters with diacritics. Therefore, Unicode strings that only include code points that are in these 128 ASCII characters can be encoded as ASCII. Conversely, any ASCII encoded string can be decoded to Unicode.<br /><br />It's worth reiterating that terminology, as you'll come across it a lot: the transformation from Unicode to an encoding like ASCII is called 'encoding'. The transformation from ASCII back to Unicode is called 'decoding'.<br /><br /></p>
<pre>    Unicode  ---- encode ----&gt; ASCII<br />    ASCII    ---- decode ----&gt; Unicode</pre>
<h3>Non-ASCII encodings</h3>
<p>Most people don't live in the US or Western Europe, and therefore have a requirement to store more characters than can be represented with ASCII. What those folk need to represent *is* part of the Unicode set (Unicode is massive!) - so a different encoding is required. Common encodings have familiar names: UTF-8 and UTF-16. UTF-8, for example, uses a single byte for encoding all the ASCII values, then variable numbers of bytes to encode further characters. (The ins and outs of these encodings are beyond the scope of this article - check out their respective Wikipedia entries for the gory details.)<br /><br />The fact that UTF-8 encodes the ASCII characters using the same single bytes as ASCII itself is important, since it means that the encoding is backwards-compatible with ASCII. However, it can mask problems in software. We'll come to this shortly.</p>
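<p>The variable-length behaviour is easy to see for yourself. A small sketch, assuming Python 3 (where a plain string literal is already Unicode and encode() returns bytes); utf8_length is just an illustrative helper name.</p>

```python
def utf8_length(character):
    """Return how many bytes UTF-8 needs for a single character."""
    return len(character.encode("utf-8"))


# ASCII characters keep their single ASCII byte under UTF-8.
ascii_letter = "b".encode("utf-8")            # b'b'
# Characters beyond ASCII take two or more bytes.
pound_sign = "\u00a3".encode("utf-8")         # b'\xc2\xa3'
euro_sign = "\u20ac".encode("utf-8")          # b'\xe2\x82\xac'
```

<p>So the letter b needs one byte, the pound sign two and the Euro sign three, and the one-byte case is byte-for-byte what the ascii codec would have produced.</p>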
<h3>Some terminology</h3>
<p>Unicode-related terminology can get confusing. Here's a quick glossary:<br /><br /></p>
<ul>
<li>To encode</li>
<ul>
<li>Encoding (the verb) means to take a Unicode string and produce a byte string</li>
</ul>
<li>To decode</li>
<ul>
<li>Decoding (the verb) means to take a byte string and produce a Unicode string</li>
</ul>
<li>An encoding</li>
<ul>
<li>An encoding (the noun) is a mapping that describes how to represent a Unicode character as a byte or series of bytes. Encodings are named (like 'ascii', or 'utf-8') and are used both when encoding (verb!) Unicode strings and decoding byte strings.</li>
</ul>
</ul>
<p><br />In other words, when you encode or decode, you need to specify the encoding that you're using. This will become clearer shortly.</p>
<h3>Python, bytes and strings</h3>
<p>You've probably noticed that there seem to be a couple of ways of writing down strings in Python. One looks like this:</p>
<pre>  'this is a string'<br /></pre>
<p>Another looks like this:</p>
<pre>  u'this is a string'<br /></pre>
<p>There's a good chance that you also know that the second one of those is a Unicode string. But what's the first one? And what does it actually mean to 'be a Unicode string'?<br /><br />The first one is simply a sequence of bytes. This byte sequence is, by convention, an ASCII representation (ie. encoding) of a string. The whole Python standard library, and most third-party modules, happily deal with strings natively in this encoding. As long as you live in the US or Western Europe, then that's probably fine for you.<br /><br />The second one is a representation of a Unicode string. This can therefore contain any of the Unicode code points. It's possible that whatever you're using to edit the Python code (or just view it) might not be able to display the entire Unicode character set - for instance, a terminal usually assumes that the data it's trying to display is in one particular encoding. There's a special notation, therefore, for representing arbitrary Unicode code points within a Python Unicode string: the \u and \U escapes. These will be followed by four or eight hex digits respectively; there's some subtlety here (see the Python string reference for further information) but you can simply think of the number after the \u (or \U) as representing the Unicode code point of the character. So, for example, the following Python string:</p>
<pre>  u'\u0062'<br /></pre>
<p>represents LATIN SMALL LETTER B, or more simply:</p>
<pre>  u'b'<br /></pre>
<p>To summarise then: the Unicode character set encompasses all characters that we may wish to represent. Individual encodings (ASCII, UTF-8, UTF-16, etc.) are representations of all or some of that full Unicode character set.</p>
<h3>Encoding and Decoding</h3>
<p>Byte strings and Unicode strings provide methods to perform the encoding and decoding for you. Remembering that you *encode* from Unicode to an encoding, you might try the following:<br /><br /></p>
<pre>&gt;&gt;&gt; u'\u0064'.encode('ascii')<br />'d'</pre>
<p>As you'd expect, the Unicode string has an 'encode' method. You tell Python which encoding you want ('ascii' in this case, there are lots more supported by Python - check the docs) using the first parameter to the encode() call.<br /><br />Conversely, byte strings have a decode() method:<br /><br /></p>
<pre>&gt;&gt;&gt; 'b'.decode('ascii')<br />u'b'</pre>
<p><br />Here, we're telling Python to take the byte string 'b', decode it based on the ASCII decoder and return a Unicode string. <br /><br />Note that in both these previous cases, we didn't really need to specify 'ascii' manually, since Python uses that as a default.</p>
<h3>UnicodeEncodeError</h3>
<p>So, we've established that there are encodings which can represent Unicode, or more usually, a certain subset of the Unicode character set. We've already talked about how ASCII can only represent 128 characters. So, what happens if you have a Unicode string that contains code points that are outside those 128 characters? Let's try something all too familiar to UK users: the £ sign. The Unicode code point for this character is 0x00A3:<br /><br /></p>
<pre>&gt;&gt;&gt; u'\u00A3'.encode('ascii')<br />Traceback (most recent call last):<br />  File "&lt;stdin&gt;", line 1, in &lt;module&gt;<br />UnicodeEncodeError: 'ascii' codec can't encode character u'\xa3' <br />in position 0: ordinal not in range(128)</pre>
<p>Boom. This is Python telling you that it encountered a character in the Unicode string which it can't represent in the requested encoding. There's a fair amount of information in the error: it's giving you the character that it's having problems with, what position it was at in the string, and (in the case of ASCII) it's telling you that the number it was expecting was in the range 0 - 127.<br /><br />How do you fix a UnicodeEncodeError? Well, you've got a couple of options:</p>
<ul>
<li>Pick an encoding that does have a representation for the problematic character</li>
<li>Use one of the error handling arguments to encode()</li>
</ul>
<p>The first option is obviously ideal, although its practicality depends on what you're doing with the encoded data. If you're passing it to another system that (for example) requires its text files in ASCII format, you're stuck. In that case, you're left with the second option: you can pass 'ignore', 'replace', 'xmlcharrefreplace' or 'backslashreplace' to the encode call:<br /><br /></p>
<pre>&gt;&gt;&gt; u'\u0083'.encode('ascii', 'ignore')<br />''<br />&gt;&gt;&gt; u'\u0083'.encode('ascii', 'replace')<br />'?'<br />&gt;&gt;&gt; u'\u0083'.encode('ascii','xmlcharrefreplace')<br />'&amp;#131;'<br />&gt;&gt;&gt; u'\u0083'.encode('ascii','backslashreplace')<br />'\\x83'</pre>
<p><br />If you choose one of those options, you'll have to let the eventual consumer of your encoded text know how to handle these.</p>
<h3>UnicodeDecodeError</h3>
<p>This one is probably more familiar to most developers. A UnicodeDecodeError occurs when you ask Python to decode a byte string using a specified encoding, but Python encounters a byte sequence in that string that isn't valid in the encoding that you specified (phew!). This one probably benefits from an example.<br /><br />Consider once more the ASCII encoding. Being a 7-bit representation, ASCII only has 128 characters, represented by the numbers 0 - 127. So let's imagine the ASCII-encoded string below:</p>
<pre>'Hi!'</pre>
<p><br />In terms of ASCII numbers, that is:<br /><br /> 72 105 33<br /><br />Or in actual Python:<br /><br /></p>
<pre>&gt;&gt;&gt; s = chr(72) + chr(105) + chr(33)<br />&gt;&gt;&gt; s<br />'Hi!'<br />&gt;&gt;&gt; s.decode('ascii')<br />u'Hi!'</pre>
<p>That's all great. But what happens if we add a byte that's not in the ASCII range?<br /><br /></p>
<pre>&gt;&gt;&gt; s = s + chr(128)<br />&gt;&gt;&gt; s.decode('ascii') <br />Traceback (most recent call last):<br />  File "&lt;stdin&gt;", line 1, in &lt;module&gt;<br />UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 <br />in position 3: ordinal not in range(128)</pre>
<p>Boom. Python is saying that it encountered a byte 0x80 (which is 128 in hex, the one we added) at position 3 (counting from zero) in the source byte string, and that this byte was not in the range 0 - 127.<br /><br />This is normally caused by using the incorrect encoding to try to decode a byte string to Unicode. So, for example, if you were given a UTF-8 byte string, and tried to decode it as ASCII, then you might well see a UnicodeDecodeError.<br /><br />But why only might?<br /><br />Well, remember what I mentioned before - UTF-8 shares the first 128 characters with ASCII. That means that you can take a UTF-8 byte sequence, and decode it with the ASCII decoder, and *as long as there are no characters outside the ASCII range* it will work. *Only* when that byte string starts featuring characters which don't exist within the ASCII encoding do errors start being thrown.</p>
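<p>To make the 'might' concrete, here's a short sketch, assuming Python 3 (byte strings are the bytes type there, and decode() behaves as described above). decodes_as_ascii is an illustrative helper, and the two sample strings are mine.</p>

```python
def decodes_as_ascii(raw):
    """Return True if the byte string can be decoded with the ascii codec."""
    try:
        raw.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


plain = "Hi!".encode("utf-8")            # only ASCII characters
priced = "Hi! \u00a35".encode("utf-8")   # contains a pound sign
```

<p>The first UTF-8 byte string decodes happily as ASCII; the second only decodes with the codec it was really written in.</p>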
<h3>ASCII - the default codec</h3>
<p>Lots of Python programmers (well, US and Western European ones) can get quite a way into their Python careers converting byte strings to unicode like this:</p>
<pre>&gt;&gt;&gt; unicode('hi!')<br />u'hi!'<br /></pre>
<p>What's going on here? Well, Python uses the ascii codec by default. So, the above is equivalent to:</p>
<pre>&gt;&gt;&gt; 'hi!'.decode('ascii')<br />u'hi!'<br /></pre>
<p>And, because most US/European test data is composed of this byte string:</p>
<pre>  'test'<br /></pre>
<p>... nobody notices the problem until the Japanese office complains the intranet is broken.</p>
<h3>Unicode Coercion</h3>
<p>If you try to interpolate a byte string with a Unicode string, or vice-versa, Python will try and convert the byte string to Unicode using the default (ie. ascii) codec. So:</p>
<pre>&gt;&gt;&gt; u'Hi' + ' there'<br />u'Hi there'<br />&gt;&gt;&gt; u'Hi %s' % 'there'<br />u'Hi there'<br />&gt;&gt;&gt; 'Hi %s' % u'there'<br />u'Hi there'</pre>
<p>These all work fine, because all the strings that we're working with can be represented with ASCII. Look what happens when we try a character which can't be represented with ASCII though:</p>
<pre>&gt;&gt;&gt; u'Hi ' + chr(128)</pre>
<pre>Traceback (most recent call last):<br />  File "&lt;stdin&gt;", line 1, in &lt;module&gt;<br />UnicodeDecodeError: 'ascii' codec can't decode byte 0x80 <br />in position 0: ordinal not in range(128)</pre>
<p><br />Python sees we're trying to combine a Unicode string with a byte string, so tries to decode the byte string to Unicode using the ASCII codec. Since byte 128 (the Euro symbol in Windows-1252, as it happens) isn't valid ASCII, Python throws a UnicodeDecodeError.<br /><br />In my experience, Unicode coercion is often where UnicodeDecodeErrors manifest themselves. The programmer has a Unicode string (probably a template) into which they're trying to put some data from a database. Relational databases tend to supply byte strings. Usually the encoding is a property on the database connection. Often, however, developers simply assume it's ASCII (or don't do anything special at all, which in Python amounts to the same thing). They try to stick the data from the database (perhaps in UTF-8 or ISO-8859-1) into a Unicode string using the %s format specifier, Python tries to decode the byte string using the ascii codec, and the whole thing falls flat on its face.</p>
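<p>The fix is to decode explicitly, with the right encoding, before the value gets anywhere near a Unicode string. A minimal sketch, assuming Python 3 (which won't coerce bytes to text for you) and assuming you know the encoding your database connection uses; to_text and row_value are illustrative names, and the sample bytes are a stand-in rather than output from a real driver.</p>

```python
def to_text(value, encoding):
    """Decode byte strings with a known encoding; pass text through as-is."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Stand-in for a value handed back by a database driver as UTF-8 bytes.
row_value = b"Caf\xc3\xa9"
greeting = "Welcome to %s" % to_text(row_value, "utf-8")
```

<p>Passing 'ascii' as the encoding here reproduces exactly the UnicodeDecodeError described above, which is rather the point: the encoding has to come from the connection, not from a guess.</p>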
<h3>Why do Python byte strings have an encode() method?</h3>
<p>The sharp-eyed amongst you will have noticed that byte strings have an encode() method as well as a decode() method. What does this do? Quite simply, it does a decode-then-encode. The byte string is first decoded to Unicode using the default (ascii) encoding, and the result is then encoded using the encoding passed to encode(). As you'd expect, fun and games ensue if the original byte string isn't actually encoded in ASCII at all.</p>
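<p>Spelled out as two explicit steps, that behaviour looks roughly like the sketch below. The helper name encode_byte_string is made up for illustration, and it only mimics what Python 2 does internally. It's written so that it should run on both Python 2.6+ and Python 3.3+ (Python 3's bytes type has no encode() method at all).</p>

```python
def encode_byte_string(raw, target, default='ascii'):
    """Hypothetical helper mimicking encode() on a Python 2 byte string."""
    text = raw.decode(default)   # step 1: the implicit decode with the default codec
    return text.encode(target)   # step 2: the encode that was actually requested

# Pure ASCII input survives the round trip.
ascii_result = encode_byte_string(b'test', 'utf-16-le')

# UTF-8 bytes for 'cafe' with an acute accent: step 1 fails, so a call
# that looks like an encode ends in a UnicodeDecodeError.
try:
    encode_byte_string(b'caf\xc3\xa9', 'utf-8')
    failure = None
except UnicodeDecodeError as exc:
    failure = exc
```

<p>This is why calling encode() on a byte string can, confusingly, produce a decode error.</p>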
<h3>Avoiding Unicode Errors</h3>
<p>So - this is really what you care about, right? How do you avoid these Unicode problems? Well, there are three simple rules:</p>
<ul>
<li>Within your application, always use Unicode</li>
<li>When you're reading text in to your application, decode it as soon as possible with the correct encoding</li>
<li>When you're outputting text from your application, encode at that point and do it explicitly</li>
</ul>
<p>What does this mean in practice? Well, it means:</p>
<ul>
<li>Whenever you're writing string literals in code, always use u''.</li>
<li>Whenever you read any text in, call .decode('encoding') on the byte string to obtain Unicode</li>
<li>Whenever you're writing text out, pick an appropriate encoding to handle whatever Unicode you're outputting - remember that ASCII can only represent a very limited subset</li>
</ul>
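<p>As a rough sketch of those rules applied at the edges of a program: the function below takes bytes in, works purely in Unicode, and hands bytes back. The function name, the greeting and the choice of encodings are all assumptions for illustration, and the code should run on Python 2.6+ and Python 3.3+.</p>

```python
def process(raw, source_encoding, target_encoding):
    """Hypothetical example of decode in, Unicode inside, encode out."""
    text = raw.decode(source_encoding)             # decode on the way in
    greeting = u'Hello, ' + text.strip().title()   # Unicode-only work in the middle
    return greeting.encode(target_encoding)        # encode explicitly on the way out

# Pretend this arrived from an ISO-8859-1 data file: b'z\xfcrich\n'
incoming = u'z\xfcrich\n'.encode('iso-8859-1')
outgoing = process(incoming, 'iso-8859-1', 'utf-8')
```

<p>Notice that the encodings only ever appear at the boundaries; nothing in the middle needs to know or care how the text was stored.</p>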
<p><br />There are more places than you probably realise that text can get into your application. Here are some:</p>
<ul>
<li>An incoming request from a web browser</li>
<li>Some text read in from a data file on disk</li>
<li>A template file read in from disk</li>
<li>Some user's input from a form</li>
<li>Some data from a database</li>
<li>Data returned from a web services call</li>
</ul>
<p><br />Frameworks help a lot here. Many frameworks handle the common encoding and decoding cases (usually the template encoding, and data encoding from a database) for you, and just pass you back Unicode strings. Watch out for web request variables - many of those may be plain byte strings. Also watch out for web service responses; you might need to inspect the response headers to find out the encoding. And even then be careful; I've come across situations with in-house apps where the declared encoding was simply wrong, leading to unexpected UnicodeDecodeErrors.</p>
<h3>Figuring out which encoding to use</h3>
<p>When you're faced with a byte string, how do you know which encoding to use to decode it? The answer is, unfortunately, simple: you don't. Some environments (such as the Web) may help you - HTTP requests and responses can carry headers which specify the encoding used within them. You can inspect those, and if they're wrong - well, at least you've got someone else to blame.<br /><br />If you're lucky, you know the byte string is encoding some XML. XML gets a lot of flak, but one of the things it does right is to specify explicitly a default encoding that's actually useful (UTF-8) and provide a mechanism to declare a different encoding. So with XML, you can scan the first few bytes of the file, decode using UTF-8, and look for the magic encoding declaration. If there isn't one, then you can safely decode the rest of the file using UTF-8. If there is one, then switch encoding. Of course, your XML library of choice will do all this for you, and should give you Unicode text back once you've read your XML in.<br /><br />If you're unlucky, then you've got two more options. First off, you can talk to the people who run your source (or destination) system - find out what encodings they're using, or accept, and use those.</p>
<p>The final, last resort option is to simply have a range of common encodings to try. A list I often use is ASCII, UTF-8, UTF-16 and ISO-8859-1. Keep trying to decode with each of those in turn until one works. Order matters: ISO-8859-1 maps every possible byte to a character, so it never fails, and anything listed after it will never be tried. Which encodings you pick of course depends on what kind of files you're expecting to see. You may also run into problems if you have a byte string in encoding X which also happens to be valid when decoded using encoding Y - in this case, you'll just get garbage data. This is the cause of many of the 'funny character' bugs you see in web applications: byte strings being decoded using an encoding which happened to work, but was in fact not the original encoding used to create the byte string.</p>
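<p>A minimal sketch of that last-resort approach might look like this. The function name guess_decode and the default candidate list are my own choices for illustration. ISO-8859-1 is placed last as a catch-all because it accepts any bytes, and UTF-16 is left out of the default since, without a byte order mark, it can also accept bytes that were never UTF-16. The code should run on Python 2.6+ and Python 3.3+.</p>

```python
CANDIDATES = ('ascii', 'utf-8', 'iso-8859-1')  # strictest first, catch-all last

def guess_decode(raw, candidates=CANDIDATES):
    """Hypothetical helper: return (text, encoding) for the first codec that fits."""
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    raise ValueError('none of the candidate encodings fit')

plain = guess_decode(b'test')            # decodes as ASCII
utf8 = guess_decode(b'caf\xc3\xa9')      # not ASCII, but valid UTF-8
latin = guess_decode(b'caf\xe9')         # not valid UTF-8, falls through to ISO-8859-1

# Put the catch-all too early and UTF-8 data "works" but comes out as garbage.
garbage = guess_decode(b'caf\xc3\xa9', ('ascii', 'iso-8859-1', 'utf-8'))
```

<p>The last call is the 'funny character' bug in miniature: nothing raises an error, but the accented character comes back as two wrong ones.</p>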
<h3>Python 3</h3>
<p>I'm not going to talk too much about Python 3, since I haven't actually used it yet. <br /><br />But - you rarely hear .NET or Java programmers complaining about Unicode errors. This is simply because both .NET and Java define a string to *be* Unicode in the first place. Anything involving the String class (in either runtime) is Unicode anyway; the developer sees encoding problems much less frequently as it's much less common for unexpected byte data to creep into applications. This doesn't mean the problems don't exist, of course: at the end of the day, text is still being encoded to and from byte strings; it's just done explicitly. (The fact that the native string encoding on MS Windows, the OS on which many of these systems run, is UTF-16 helps here too - many more characters can be encoded in UTF-16 than ASCII).<br /><br />My understanding is that Python 3 takes this general approach. Python 2's byte-oriented 'str' type is gone. In its place, 'str' is now the Unicode type (what Python 2 called 'unicode', and equivalent to Java and .NET's String class), alongside a separate 'bytes' type. String operations are done on 'str' instances.</p>
<h3>Coding in a Unicode world</h3>
<p>Unicode is here to stay. The days of writing software that would only need to work in American universities, where the only language and script used was US English in Latin text, are long gone. There's no magic to Unicode and the various encodings, and once you understand what's going on, there's no reason to have that sick feeling in the pit of your stomach the next time you see a UnicodeDecodeError. Just remember these rules:</p>
<ul>
<li>Decode on the way in</li>
<li>Unicode everywhere in your application</li>
<li>Encode on the way out</li>
</ul>
<p> </p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>python</dc:subject>
    
    
      <dc:subject>zope</dc:subject>
    
    
      <dc:subject>plone</dc:subject>
    
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2010-09-30T15:21:39Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/installing-geodjango-with-postgresql-and-zc-buildo">
    <title>Installing GeoDjango with PostgreSQL and zc.buildout</title>
    <link>http://www.stereoplex.com/blog/installing-geodjango-with-postgresql-and-zc-buildo</link>
    <description>The installation of the PostgreSQL requirements is somewhat daunting. I've spent a bit of time putting together a buildout.cfg to try to make this easier.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>I've been wanting to play with GeoDjango for a while, since my database of choice (PostgreSQL) has excellent spatial support. However, getting all the dependencies up and running is pretty complicated.</p>
<p>I've been working on a buildout to get at least most of the steps done for you. There are a couple of manual steps at the end, which I hope to automate when I next have time to work on this.</p>
<p>The buildout installs the following items:</p>
<ul>
<li>PostgreSQL</li>
<li>PostGIS</li>
<li>GDAL</li>
<li>Proj</li>
<li>GEOS</li>
<li>psycopg2</li>
<li>Django</li>
</ul>
<p>It should also perform initial setup of the PostGIS database template, load some sample SQL files, and set up some convenience symlinks for the PostgreSQL command-line programs.</p>
<p>It's not finished - in particular, it just assumes that the user running the buildout is to be used as the database owner and such like. Anyway, here it is:</p>
<p>&nbsp;</p>
<pre>[buildout]<br />parts =<br />  postgresql<br />  postgis<br />  gdal<br />  init-pgsql<br />  pgsql-symlinks<br />  django<br />  <br />eggs =<br />    psycopg2<br /><br />[postgresql]<br />recipe = zc.recipe.cmmi<br />url = http://wwwmaster.postgresql.org/redir/198/h/source/v8.3.7/postgresql-8.3.7.tar.gz<br />extra_options =<br />  --with-readline<br />  --enable-thread-safety<br />  <br />[postgis]<br />recipe = hexagonit.recipe.cmmi<br />url = http://postgis.refractions.net/download/postgis-1.3.5.tar.gz<br />configure-options =<br />    --with-pgsql=${postgresql:location}/bin/pg_config<br />    --with-geos=${geos:location}/bin/geos-config<br />    --with-proj=${proj:location}<br /><br />[proj]<br />recipe = zc.recipe.cmmi<br />url = http://download.osgeo.org/proj/proj-4.6.1.tar.gz<br /><br />[geos]<br />recipe = zc.recipe.cmmi<br />url = http://download.osgeo.org/geos/geos-3.0.3.tar.bz2<br /><br />[gdal]<br />recipe = zc.recipe.cmmi<br />url = http://download.osgeo.org/gdal/gdal-1.6.0.tar.gz<br />extra_options = <br />    --with-python<br />    --with-geos=${geos:location}/bin/geos-config<br /><br />[init-pgsql]<br />recipe = iw.recipe.cmd<br />on_install = true<br />on_update = false<br />cmds = <br />    ${postgresql:location}/bin/initdb -D ${postgresql:location}/var/data -E UNICODE<br />    ${postgresql:location}/bin/pg_ctl -D ${postgresql:location}/var/data start<br />    sleep 30   <br />    ${postgresql:location}/bin/createdb -E UTF8 template_postgis<br />    ${postgresql:location}/bin/createlang -d template_postgis plpgsql<br />    ${postgresql:location}/bin/psql -d template_postgis -f ${postgis:location}/share/lwpostgis.sql<br />    ${postgresql:location}/bin/psql -d template_postgis -f ${postgis:location}/share/spatial_ref_sys.sql<br />    ${postgresql:location}/bin/psql -d template_postgis -c "GRANT ALL ON geometry_columns TO PUBLIC;"<br />    ${postgresql:location}/bin/psql -d template_postgis -c "GRANT ALL ON spatial_ref_sys TO PUBLIC;"<br />    
${postgresql:location}/bin/pg_ctl -D ${postgresql:location}/var/data stop<br /><br />[pgsql-symlinks]<br />recipe = cns.recipe.symlink<br />symlink_target = ${buildout:directory}/bin<br />symlink_base = ${postgresql:location}/bin<br />symlink =<br />    clusterdb<br />    createdb<br />    createlang<br />    createuser<br />    dropdb<br />    droplang<br />    dropuser<br />    ecpg<br />    initdb<br />    ipcclean<br />    pg_config<br />    pg_controldata<br />    pg_ctl<br />    pg_dump<br />    pg_dumpall<br />    pg_resetxlog<br />    pg_restore<br />    postgres<br />    postmaster<br />    psql<br />    reindexdb<br />    vacuumdb<br /><br />[django]<br />recipe = djangorecipe<br />version = 1.0.2<br />project = project<br />eggs =<br />    ${buildout:eggs}<br /><br /></pre>
<p>Note that running this will actually attempt to start up and shut down the database server, as it needs to be running in order for some of the initialisation scripts to run. That 'sleep 30' in the middle is to allow the database server to start, and (if you're on OS X and running it) to give you a chance to enter your username and password for the firewall!</p>
<p>There are still some manual steps to be taken (which I'd like to automate in due course). These are the fairly standard things that you do when starting any Django project, plus an extra step for bootstrapping PostGIS.</p>
<h3>Create your database</h3>
<p>From the command line, you'll need to create the database for your application. You need to specify the PostGIS template, so use something like:</p>
<pre>$ bin/createdb -T template_postgis &lt;db name&gt;</pre>
<h3>Change the settings for your application</h3>
<p>Edit the settings.py for your application, and make sure that you're using 'postgresql_psycopg2' as the database engine. Set the database name as appropriate for your application. You should also add 'django.contrib.gis' to your INSTALLED_APPS setting, and you'll also need to add the following two lines to your settings.py:</p>
<pre>GDAL_LIBRARY_PATH = '/path/to/buildout/parts/gdal/lib/libgdal.dylib'<br />GEOS_LIBRARY_PATH = '/path/to/buildout/parts/geos/lib/libgeos_c.dylib'</pre>
<h3>Add Google projection</h3>
<p>I'll confess: I'm only doing this because the GeoDjango docs say you should! I don't know enough about GeoDjango yet to understand why. But you should do the following:</p>
<pre>$ bin/django shell<br />&gt;&gt;&gt; from django.contrib.gis.utils import add_postgis_srs<br />&gt;&gt;&gt; add_postgis_srs(900913)<br />&gt;&gt;&gt; ^D<br />$<br /></pre>
<p>If you get an error when importing add_postgis_srs, then double check you got the GDAL_LIBRARY_PATH and GEOS_LIBRARY_PATH correct, and that the files specified were built. (I'm on Mac OS X - I suspect the exact file name may change depending on platform.)</p>
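<p>If you need the same settings.py to work on more than one platform, something along these lines might help. It's only a sketch: the helper name buildout_library is made up, and the assumption that shared libraries end in .dylib on OS X and .so elsewhere is a common convention rather than anything GeoDjango or the buildout guarantees, so check what was actually built in your parts directory.</p>

```python
import posixpath
import sys

def buildout_library(buildout_dir, part, name, platform=sys.platform):
    """Hypothetical helper: guess the path of a library built by the buildout."""
    # Assumed convention: .dylib on Mac OS X, .so on other Unix-like systems.
    suffix = '.dylib' if platform == 'darwin' else '.so'
    return posixpath.join(buildout_dir, 'parts', part, 'lib', name + suffix)

# '/path/to/buildout' is a placeholder, as in the settings shown earlier.
GDAL_LIBRARY_PATH = buildout_library('/path/to/buildout', 'gdal', 'libgdal')
GEOS_LIBRARY_PATH = buildout_library('/path/to/buildout', 'geos', 'libgeos_c')
```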
<h3>Done!</h3>
<p>Once all that's done, you should hopefully be able to bin/django syncdb, start a new app (using <a href="http://pypi.python.org/pypi/fez.djangoskel">fez.djangoskel</a>, of course!) and start using GeoDjango.</p>
<p>I shall refine the above process over time (in particular, there are some modifications I'd like to make to djangorecipe to remove the manual steps at the end), and I'll post extra parts when I've done that.</p>
<p>&nbsp;</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    
      <dc:subject>software</dc:subject>
    
    
      <dc:subject>python</dc:subject>
    
    
      <dc:subject>buildout</dc:subject>
    
    <dc:date>2009-04-17T11:06:10Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/fez-djangoskel-django-projects-and-apps-as-eggs">
    <title>fez.djangoskel: Django projects and apps as eggs</title>
    <link>http://www.stereoplex.com/blog/fez-djangoskel-django-projects-and-apps-as-eggs</link>
    <description>I've made an initial release of fez.djangoskel, which provides simple paster templates for egg-based Django projects and applications.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>This is just a brief note to say that I've released fez.djangoskel, a package which provides paster templates for creating egg-based Django projects and reusable applications. This is all part of my crusade to get the Django community to package software as eggs. As well as source and binary egg releases on PyPI, the code is <a href="http://github.com/danfairs/fez.djangoskel/tree/master">available on GitHub</a> (representing my first foray into git).<br /></p>
<p>The easiest way to get this is, as usual, with easy_install. (See my <a href="creating-a-python-2-4-plone-and-zope-development-environment-on-mac-os-x-leopard">previous</a> <a href="a-django-development-environment-with-zc-buildout">posts</a> on how to set this up):</p>
<pre>easy_install fez.djangoskel</pre>
<p>Once that and its dependencies have installed, you should find that paster can now create Django projects and apps:</p>
<pre>$ paster create --list-templates<br />Available templates:<br />  basic_package:   A basic setuptools-enabled package<br />  django_app:      Template for a basic Django reusable application<br />  django_project:  Template for a Django project<br />  paste_deploy:    A web application deployed through paste.deploy<br /></pre>
<p>It's standard paste from here on in: to create a project, use:</p>
<pre>paster create -t django_project</pre>
<p>To create an app, use:</p>
<pre>paster create -t django_app</pre>
<p>Paster will then ask you a bunch of questions (the most crucial one being the name of your app/project!) and generate the file layout for you.<br /></p>
<p>I plan to do another release later this week which includes additional templates for creating Django eggs using namespace packages, as well as improved documentation.</p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2008-12-02T21:05:59Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/a-django-development-environment-with-zc-buildout">
    <title>A Django Development Environment with zc.buildout</title>
    <link>http://www.stereoplex.com/blog/a-django-development-environment-with-zc-buildout</link>
    <description>This article will show you how to create a repeatable Django development environment from scratch using zc.buildout.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Setting up environments is a pain. Whether it's Django, Zope, ASP.NET, whatever - a typical web stack often has dozens of components with dependencies on each other and underlying libraries. How do you manage this? How do you make sure that the software you're running on your development environment is configured the same way, and is the same version, as what gets into your production environment? How do you make sure that the third-party Python library you've just started using is correctly deployed?</p>
<p>One answer is zc.buildout. Buildout is a tool for reliably creating reproducible software builds. It was originally developed by Zope Corporation, and is often used in Zope builds; however, there's no dependency on Zope. You can use it to build pretty much anything. And I'm going to show you how to get a Django build up and running using it.</p>
<p>I shall use PostgreSQL as the database in my examples, but there's nothing stopping you using MySQL or any other Django-supported database, if you wish.</p>
<p>You'll also need the standard development tools (gcc, etc.) available since we're going to be getting buildout to compile some binary eggs for us.</p>
<p><br /></p>
<h3>The Basics: Python and PostgreSQL<br /></h3>
<p>The only thing you need to get going is some version of Python installed. The system Python is probably fine, as long as it's version 2.3 or later. Get a database installed too: I'm using PostgreSQL here. (You can use buildout to install a database too; I'm not going to cover that here, though, since most people have their database installed through system packages.)</p>
<p>We are going to install two packages in the system python: setuptools and virtualenv. <br /></p>
<p>(If you don't want to touch the system Python at all, that's fine; <a href="http://www.stereoplex.com/2008/may/7/creating-a-python-2-4-plone-and-zope-development-e/">check out my earlier article</a> on how to compile and install a local version of Python. You might want to do this if your system only offers an old version of Python. I do it as a matter of course, but then get shouted at by system admins who want to use a package manager to keep their Python up to date. Your mileage may vary.)</p>
<p><a href="http://peak.telecommunity.com/dist/ez_setup.py">Download ez_setup.py</a>, and run it as a user who can write to Python's site-packages directory (root if you're using your system python):</p>
<pre>wget http://peak.telecommunity.com/dist/ez_setup.py<br />python ez_setup.py<br /></pre>
<p>Let's also take the opportunity to create a database for the project. This is for PostgreSQL; obviously, substitute whatever's appropriate for your platform:</p>
<pre>createdb djangodevdb <br /></pre>
<h3>virtualenv</h3>
<p>Before we get stuck into buildout, let's talk about virtualenv.<br /></p>
<p>You're probably going to have more than one project on the go. You want to keep them separate from each other, and want to avoid polluting your system Python installation with application-specific third-party modules. This is what virtualenv does. It lets you create isolated sandboxes: modules installed in one virtualenv don't interfere with other virtualenvs. Let's install virtualenv and create an environment for our Django project:</p>
<pre>easy_install virtualenv</pre>
<p>That will download and install virtualenv. Next up, let's create ourselves a sandbox called 'djangodev' to work in:<br /></p>
<pre>hornet:2.4 dan$ virtualenv --no-site-packages djangodev<br />New python executable in djangodev/bin/python<br />Installing setuptools.............done.<br />hornet:2.4 dan$</pre>
<p>Finally, we need to 'activate' the sandbox. This isolates the environment from the system Python, and ensures that any modules installed are local to this environment.</p>
<pre>hornet:2.4 dan$ cd djangodev/<br />hornet:djangodev dan$ source bin/activate <br />(djangodev)hornet:djangodev dan$ <br /></pre>
<p>Note how the command prompt changes as a visual indicator that we're now working in a virtualenv. You can type 'deactivate' if you want to exit the virtualenv.<br /></p>
<h3>Initial Configuration<br /></h3>
<p>Getting going with buildout is very straightforward. Create a top-level directory to hold your application and  download the bootstrap.py file into it:</p>
<pre>mkdir app<br />cd app<br />wget http://svn.zope.org/*checkout*/zc.buildout/trunk/bootstrap/bootstrap.py</pre>
<p>Don't run this yet.</p>
<p><br /></p>
<h3>A Basic Buildout Configuration<br /></h3>
<p>Next, create a file in that same directory called buildout.cfg, and put the following into it:</p>
<pre>[buildout]<br />parts =<br /><br /></pre>
<p>This is pretty much the simplest buildout configuration you can start with.</p>
<p><br /></p>
<h3>Bootstrapping</h3>
<p>The very first time you use buildout, you have to bootstrap it. This installs buildout itself, and generates scripts to run the buildout. Bootstrapping the buildout is simply a matter of running the bootstrap.py file you downloaded earlier. You should see output resembling this:<br /></p>
<pre>(djangodev)hornet:app dan$ python bootstrap.py <br />Creating directory '/Users/dan/opt/virtual/2.4/djangodev/app/bin'.<br />Creating directory '/Users/dan/opt/virtual/2.4/djangodev/app/parts'.<br />Creating directory '/Users/dan/opt/virtual/2.4/djangodev/app/develop-eggs'.<br />Generated script '/Users/dan/opt/virtual/2.4/djangodev/app/bin/buildout'.<br />(djangodev)hornet:app dan$ <br /></pre>
<p>That just creates buildout's initial scripts and directory layouts. You don't have to run it again for this environment.</p>
<h3>Installing Django</h3>
<p>So, all that was boring. And it probably seemed fiddly: after all, couldn't we just have installed Django in the virtualenv's site-packages manually? Yes, we could; but then we'd have had to do that every time we deployed an environment. We can now start to use buildout to automate our builds. Let's install Django 1.0 with buildout.</p>
<p>Open up your buildout.cfg again, and change it so that it looks like this:</p>
<pre>[buildout]<br />parts = django<br /><br />[django]<br />recipe = djangorecipe<br />version = 1.0</pre>
<p></p>
<p>Now, go ahead and run buildout. This will download the Django 1.0 distribution, so may take a few minutes depending on your connection:<br /></p>
<pre>(djangodev)hornet:app dan$ bin/buildout <br />Unused options for buildout: 'download-directory'.<br />Installing django.<br />django: Downloading Django from: http://www.djangoproject.com/download/%s/tarball/<br />Generated script '/Users/dan/opt/virtual/2.4/djangodev/app/bin/django'.<br />(djangodev)hornet:app dan$</pre>
<p>Buildout (via the 'djangorecipe' extension) has done a few things for us:</p>
<ul><li>It has created a script called bin/django to run the Django management commands</li><li>It has created an initial Django project (called, imaginatively, 'project') for us with some default settings</li><li>It has installed Django<br /></li></ul>
<p>Note that buildout created a script called 'django' in the bin directory. This script is the exact equivalent of django-admin.py, or of running python manage.py when manually installing Django; that is, you can run bin/django syncdb, bin/django sqlall, everything you would expect. So let's go ahead and try to run the Django development server:</p>
<pre>(djangodev)hornet:app dan$ bin/django runserver<br />Traceback (most recent call last):<br />  File "bin/django", line 20, in ?<br />    djangorecipe.manage.main('project.development')<br />  File "/Users/dan/.buildout/eggs/djangorecipe-0.13-py2.4.egg/djangorecipe/manage.py", <br />    line 15, in main<br />    management.execute_manager(mod)<br /> [ ... snip ... ]<br />  File "/Users/dan/opt/virtual/2.4/djangodev/app/parts/django/django/db/<br />    backends/sqlite3/base.py", line 26, in ?<br />    raise ImproperlyConfigured, "Error loading %s module: %s" % (module, e)<br />django.core.exceptions.ImproperlyConfigured: Error loading pysqlite2 module: <br />    No module named pysqlite2</pre>
<p>Well - that didn't go so well!</p>
<p>The problem here of course is that we've installed Django, but we haven't specified the database to connect to, or installed the python module required to connect to the database. Let's do both now.</p>
<h3>Installing Dependencies<br /></h3>
<p>Configuring Django to connect to a database is <a href="http://docs.djangoproject.com/en/dev/intro/tutorial01/#database-setup">well covered in the Django documentation</a>, so for now I'll just tell you to edit the project/settings.py file and change DATABASE_ENGINE to 'postgresql_psycopg2', and to set your DATABASE_NAME, DATABASE_USER etc. appropriately for your installation.</p>
<p>Installing the connector is more interesting. The connector for PostgreSQL is called psycopg2. Let's tell buildout to install that. Open up your buildout.cfg again, and change it so it now looks like this:</p>
<pre>[buildout]<br />parts = django<br /><br />[django]<br />recipe = djangorecipe<br />version = 1.0<br />eggs = psycopg2<br /></pre>
<p>All we've done is add psycopg2 to the list of eggs to install with Django.<br /></p>
<p>Now just rerun buildout:</p>
<pre>(djangodev)hornet:app dan$ bin/buildout <br />Uninstalling django.<br />Unused options for buildout: 'download-directory'.<br />Installing django.<br />Getting distribution for 'psycopg2'.<br />warning: no files found matching '*.html' under directory 'doc'<br />/Users/dan/opt/python-2.4.5/include/python2.4/datetime.h:186: <br />  warning: 'PyDateTimeAPI' defined but not used<br />psycopg/typecast.c:37: warning: 'skip_until_space' defined but <br />  not used<br />/Users/dan/opt/python-2.4.5/include/python2.4/datetime.h:186: warning: <br />  'PyDateTimeAPI' defined but not used<br />./psycopg/config.h:63: warning: 'Dprintf' defined but not used<br />./psycopg/config.h:63: warning: 'Dprintf' defined but not used<br />/Users/dan/opt/python-2.4.5/include/python2.4/datetime.h:186: warning: <br />  'PyDateTimeAPI' defined but not used<br />zip_safe flag not set; analyzing archive contents...<br />Got psycopg2 2.0.8.<br />Generated script '/Users/dan/opt/virtual/2.4/djangodev/app/bin/django'.<br />django: Skipping creating of project: project since it exists<br /></pre>
<p>As you can see, buildout noticed that we'd specified an extra requirement of psycopg2 and so downloaded it from PyPI, compiled and installed it for us. What buildout did is essentially analogous to running 'easy_install psycopg2'. Now we should be able to run the Django development server:</p>
<pre>(djangodev)hornet:app dan$ bin/django runserver<br />Validating models...<br />0 errors found<br /><br />Django version 1.0-final-SVN-unknown, using settings 'project.development'<br />Development server is running at http://127.0.0.1:8000/<br />Quit the server with CONTROL-C.</pre>
<p>It worked! You can now go ahead and check your buildout.cfg into source control. Anyone checking that out will get the same Django build as you.</p>
<p><br /></p>
<h3>PIL</h3>
<p>PIL, the Python Imaging Library, is always a bit of a pain to install. Django requires it if you want to work with images. It's also not packaged with setuptools, let alone as an egg. How can we get this into our build?</p>
<p>Fortunately, <a href="http://article.gmane.org/gmane.comp.web.zope.devel/13999">Chris McDonough has repackaged PIL with setuptools</a>, making it relatively straightforward to add to our build. Open up buildout.cfg again, and edit it so that it looks like this:</p>
<p><br /></p>
<pre>[buildout]<br />parts =<br />  PIL<br />  django<br /><br />[django]<br />recipe = djangorecipe<br />version = 1.0<br />eggs =<br />  psycopg2<br />  markdown<br />  PIL<br /><br />[PIL]<br />recipe = zc.recipe.egg<br />eggs = PIL==1.1.6<br />find-links = http://dist.repoze.org/</pre>
<p></p>
<p>Rerun buildout, as before, and PIL should be downloaded, compiled and installed.<br /></p>
<p>That should be enough to get you going with buildout. You can find lots more on buildout and the Django recipe in the links below:</p>
<ul><li><a href="http://buildout.zope.org/">http://buildout.zope.org/</a></li><li><a href="http://pypi.python.org/pypi/zc.buildout">http://pypi.python.org/pypi/zc.buildout</a></li><li><a href="http://pypi.python.org/pypi/djangorecipe/">http://pypi.python.org/pypi/djangorecipe/</a></li><li><a href="http://plone.org/documentation/tutorial/buildout">http://plone.org/documentation/tutorial/buildout</a> (Plone-oriented, but lots of useful background information there too)</li><li><a href="http://renesd.blogspot.com/2008/05/buildout-tutorial-buildout-howto.html">http://renesd.blogspot.com/2008/05/buildout-tutorial-buildout-howto.html</a></li><li><a href="http://wiki.python.org/moin/buildout/pycon2008_tutorial">http://wiki.python.org/moin/buildout/pycon2008_tutorial</a></li></ul>
<p>Google found me those links, so I'm sure it can find more for you too! I'd particularly encourage you to read the djangorecipe documentation for more detail on how it can configure Django for you.</p>
<p>In future articles, I intend to talk about</p>

<ul><li>Starting Django applications as eggs</li><li>Configuring applications as development eggs inside buildout<br /></li><li>Packaging applications and uploading them to PyPI, so that they're just an easy_install away</li><li>Dealing with third-party Django applications which have not been packaged as eggs</li><li>Using buildout to build non-python dependencies</li><li>How all this works with source control<br /></li></ul>

<p>Until next time.<br /></p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    
      <dc:subject>python</dc:subject>
    
    <dc:date>2008-11-12T19:58:02Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/testing-app-views">
    <title>Testing App Views</title>
    <link>http://www.stereoplex.com/blog/testing-app-views</link>
    <description>How to easily test views in applications which don't have a urls.py file.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p><i>First of all, thanks to those people who've been reading and pointing out improvements in the comments - it's much appreciated, and I've learned a lot from you!</i></p>
<p>Django applications tend to have views, in a views.py file. These are generally looked up by URL. URLConfs (various urls.py files) are used to map request URLs to views.</p>
<p>This can lead to a problem from a testing perspective. Not all applications have urls.py files; and for those that do, there's nothing stopping a project wiring up different URLs to those views through a custom urls.py. It therefore becomes difficult to use Django's test client to invoke a view using a GET or a POST because you don't know how that URL has been configured. After all, you run python manage.py test from your project, and the project's configuration is used.</p>
<p>Let's say I have a views.py that looks like this:</p>
<pre>from django.shortcuts import render_to_response<br />from django.template import RequestContext<br /><br />def say_hi(request):<br />    return render_to_response('app/hi.html', context_instance=RequestContext(request))</pre>
<p>I want to use Django's test client to ensure that when this view is called, it returns an HTTP 200 OK response. However, my app doesn't have a urls.py - the person using it is meant to wire this view to whatever URL they want. How do I test the view?</p>
<p>The solution is deceptively simple: inject a test URLs module.</p>
<p>Create a base test case that looks something like this:</p>
<pre>from django.conf import settings<br />from django.test import TestCase<br /><br />class TestUrlsTestCase(TestCase):<br /><br />    def setUp(self):<br />        # Remember the project's URLConf, then swap in the test one.<br />        self._old_root_urlconf = settings.ROOT_URLCONF<br />        settings.ROOT_URLCONF = 'app.testurls'<br /><br />    def tearDown(self):<br />        # Put the original URLConf back on the settings object.<br />        settings.ROOT_URLCONF = self._old_root_urlconf<br /></pre>
<p>Remember that setUp() is run just before each test, and tearDown() is run just after. What we're doing is replacing the URLConf just for the duration of the test. We have to put it back in the tearDown(), else we'll end up with test state 'leakage' into other tests.<br /></p>
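<p>To see the save-and-restore pattern on its own, here is a minimal sketch that runs without Django. The <code>settings</code> object is a hypothetical stand-in for <code>django.conf.settings</code>, and <code>run_swap_demo</code> is just a helper invented for this illustration; the only point is that the original value is back in place once the test has finished.</p>

```python
import types
import unittest

# Hypothetical stand-in for django.conf.settings, so the sketch runs without Django.
settings = types.SimpleNamespace(ROOT_URLCONF="project.urls")


class SwapUrlconfTestCase(unittest.TestCase):

    def setUp(self):
        # Remember the project's value, then point at the test URLs module.
        self._old_root_urlconf = settings.ROOT_URLCONF
        settings.ROOT_URLCONF = "app.testurls"

    def tearDown(self):
        # Put the original value back on the settings object.
        settings.ROOT_URLCONF = self._old_root_urlconf

    def test_sees_test_urlconf(self):
        self.assertEqual("app.testurls", settings.ROOT_URLCONF)


def run_swap_demo():
    """Run the test case; return (all tests passed, ROOT_URLCONF afterwards)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SwapUrlconfTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful(), settings.ROOT_URLCONF
```

<p>If tearDown() did not restore the value, the second element returned by <code>run_swap_demo()</code> would still be the test URLConf, which is exactly the 'leakage' described above.</p>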
<p>You then just have to add a testurls.py file to the 'app' application (the one being tested) which contains known URLs for the views you want to invoke. For example, testurls.py might look like this:</p>
<pre>from django.conf.urls.defaults import *<br /><br />urlpatterns = patterns('app.views',<br />    (r'^foo/',  'say_hi'),<br />)</pre>
<p>I can then write standard test client code to check that my views are working as expected:</p>
<pre>class RealTest(TestUrlsTestCase):<br />                <br />    def testGet(self):<br />        response = self.client.get('/foo/')<br />        self.assertEqual(200, response.status_code)</pre>
<p>It now doesn't matter how the project integrator has configured their urls.py. When the tests are run, the URLConf that I have specified will always be used.<br /></p>
]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2008-07-29T21:40:03Z</dc:date>
    <dc:type>Page</dc:type>
  </item>


  <item rdf:about="http://www.stereoplex.com/blog/django-unit-tests-and-transactions">
    <title>Django Unit Tests and Transactions</title>
    <link>http://www.stereoplex.com/blog/django-unit-tests-and-transactions</link>
    <description>While these are more properly integration tests than unit tests, it can be handy to have Django roll back the database transaction after each test method runs.</description>
    <content:encoded xmlns:content="http://purl.org/rss/1.0/modules/content/"><![CDATA[<p>Coming to automated testing in Django from the Zope and Plone world, I was pleased to find full support for all the testing machinery that I've become used to: regular Python unit tests, and doctests. Of course, these being unit tests, they don't do any 'framework' management out of the box. <br /></p>
<p>Unit tests are supposed to test your code, and just your code. However, once you're in a framework environment (be that Zope and Plone, Django, or anything else) then testing how your code integrates with that framework is vital. Zope and Plone provide unittest.TestCase subclasses (ZopeTestCase and PloneTestCase respectively) which provide a lot of scaffolding for you to be able to run integration tests. Part of that scaffolding is automatic transaction management. This hooks into Zope's transaction API to roll back the transaction after each test runs. <br /></p>
<p>I wanted to do something similar for my Django test cases; I was finding 'state pollution' between my unit test runs, since data created by one test method isn't automatically cleaned out.<br /></p>
<p>Django's transaction handling is much simpler than Zope's: it cares only about the one database transaction that the current request has, and only if the transaction support middleware is installed. This means that we can pretty easily crib the code from that middleware and use it in a test case base class:</p>
<pre>import unittest<br /><br />from django.db import transaction<br /><br />class TransactionalTestCase(unittest.TestCase):<br /><br />    def setUp(self):<br />        super(TransactionalTestCase, self).setUp()<br /><br />        # Take manual control of the transaction for this test.<br />        transaction.enter_transaction_management()<br />        transaction.managed(True)<br /><br />    def tearDown(self):<br />        super(TransactionalTestCase, self).tearDown()<br /><br />        # Throw away anything the test wrote, then hand control back.<br />        if transaction.is_dirty():<br />            transaction.rollback()<br />        transaction.leave_transaction_management()</pre>
<p><b>UPDATE:</b> Fixed an error in the call to the base class' tearDown() method, which caused open transactions to hang around and (among other things) prevented the test database being cleanly dropped at the end of the test run.<br /></p>
<p>After this, you can simply derive your test fixture classes from TransactionalTestCase, and make sure that you call the base setUp() and tearDown() methods if you do need to override them to perform your own setup and teardown.</p>
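<p>The same rollback-in-tearDown idea can be tried outside Django. The sketch below is only an illustration using Python's standard sqlite3 module, not Django's transaction API: the shared <code>conn</code>, the <code>item</code> table and the helper names are all made up for the example, and it relies on sqlite3's default behaviour of opening a transaction implicitly before an INSERT.</p>

```python
import sqlite3
import unittest

# One shared in-memory connection plays the role of the test database
# (an assumption for this sketch; Django manages its own connection).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")
conn.commit()


def count_items():
    """Return the number of rows currently visible in the item table."""
    return conn.execute("SELECT COUNT(*) FROM item").fetchone()[0]


class RollbackTestCase(unittest.TestCase):

    def tearDown(self):
        # Discard whatever the test wrote, so the next test starts clean.
        conn.rollback()

    def test_insert(self):
        # The INSERT implicitly opens a transaction that is never committed.
        conn.execute("INSERT INTO item VALUES ('a')")
        self.assertEqual(1, count_items())


def run_rollback_demo():
    """Run the test case; return (all tests passed, rows left afterwards)."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(RollbackTestCase)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful(), count_items()
```

<p>The test sees its own row while it runs, but nothing is left behind once tearDown() has rolled the transaction back.</p>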
<p>My next spare time (hah!) project will be to integrate Django's transaction management into <a href="http://svn.repoze.org/repoze.tm/trunk/">repoze.tm</a> (which is Zope's transaction management suitably WSGI-fied). This would let a Django application participate in transactions with other transaction-aware components, making integration at the WSGI layer much more straightforward.<br /></p>]]></content:encoded>
    <dc:publisher>No publisher</dc:publisher>
    <dc:creator>Dan Fairs</dc:creator>
    <dc:rights></dc:rights>
    
      <dc:subject>django</dc:subject>
    
    <dc:date>2008-07-01T21:48:38Z</dc:date>
    <dc:type>Page</dc:type>
  </item>




</rdf:RDF>

