359 lines
14 KiB
Plaintext
359 lines
14 KiB
Plaintext
========================
|
|
Django's cache framework
|
|
========================
|
|
|
|
So, you got slashdotted_. Now what?
|
|
|
|
Django's cache framework gives you three methods of caching dynamic pages in
|
|
memory or in a database. You can cache the output of entire pages, you can
|
|
cache only the pieces that are difficult to produce, or you can cache your
|
|
entire site.
|
|
|
|
.. _slashdotted: http://en.wikipedia.org/wiki/Slashdot_effect
|
|
|
|
Setting up the cache
|
|
====================
|
|
|
|
The cache framework allows for different "backends" -- different methods of
|
|
caching data. There's a simple single-process memory cache (mostly useful as a
|
|
fallback) and a memcached_ backend (the fastest option, by far, if you've got
|
|
the RAM).
|
|
|
|
Before using the cache, you'll need to tell Django which cache backend you'd
|
|
like to use. Do this by setting the ``CACHE_BACKEND`` in your settings file.
|
|
|
|
The ``CACHE_BACKEND`` setting is a "fake" URI (really an unregistered scheme).
|
|
Examples:
|
|
|
|
============================== ===========================================
|
|
CACHE_BACKEND Explanation
|
|
============================== ===========================================
|
|
memcached://127.0.0.1:11211/ A memcached backend; the server is running
|
|
on localhost port 11211. You can use
|
|
multiple memcached servers by separating
|
|
them with semicolons.
|
|
|
|
db://tablename/ A database backend in a table named
|
|
"tablename". This table should be created
|
|
with "django-admin createcachetable".
|
|
|
|
file:///var/tmp/django_cache/ A file-based cache stored in the directory
|
|
/var/tmp/django_cache/.
|
|
|
|
simple:/// A simple single-process memory cache; you
|
|
probably don't want to use this except for
|
|
testing. Note that this cache backend is
|
|
NOT thread-safe!
|
|
|
|
locmem:/// A more sophisticated local memory cache;
|
|
this is multi-process- and thread-safe.
|
|
|
|
dummy:/// Doesn't actually cache; just implements the
|
|
cache backend interface and doesn't do
|
|
anything. This is an easy way to turn off
|
|
caching for a test environment.
|
|
============================== ===========================================
|
|
|
|
All caches may take arguments -- they're given in query-string style. Valid
|
|
arguments are:
|
|
|
|
timeout
|
|
Default timeout, in seconds, to use for the cache. Defaults to 5
|
|
minutes (300 seconds).
|
|
|
|
max_entries
|
|
For the simple and database backends, the maximum number of entries
|
|
allowed in the cache before it is cleaned. Defaults to 300.
|
|
|
|
cull_percentage
|
|
The percentage of entries that are culled when max_entries is reached.
|
|
The actual percentage is 1/cull_percentage, so set cull_percentage=3 to
|
|
cull 1/3 of the entries when max_entries is reached.
|
|
|
|
A value of 0 for cull_percentage means that the entire cache will be
|
|
dumped when max_entries is reached. This makes culling *much* faster
|
|
at the expense of more cache misses.
|
|
|
|
For example::
|
|
|
|
CACHE_BACKEND = "memcached://127.0.0.1:11211/?timeout=60"
|
|
|
|
Invalid arguments are silently ignored, as are invalid values of known
|
|
arguments.
|
|
|
|
.. _memcached: http://www.danga.com/memcached/
|
|
|
|
The per-site cache
|
|
==================
|
|
|
|
Once the cache is set up, the simplest way to use the cache is to cache your
|
|
entire site. Just add ``django.middleware.cache.CacheMiddleware`` to your
|
|
``MIDDLEWARE_CLASSES`` setting, as in this example::
|
|
|
|
MIDDLEWARE_CLASSES = (
|
|
"django.middleware.cache.CacheMiddleware",
|
|
"django.middleware.common.CommonMiddleware",
|
|
)
|
|
|
|
(The order of ``MIDDLEWARE_CLASSES`` matters. See "Order of MIDDLEWARE_CLASSES"
|
|
below.)
|
|
|
|
Then, add the following required settings to your Django settings file:
|
|
|
|
* ``CACHE_MIDDLEWARE_SECONDS`` -- The number of seconds each page should be
|
|
cached.
|
|
* ``CACHE_MIDDLEWARE_KEY_PREFIX`` -- If the cache is shared across multiple
|
|
sites using the same Django installation, set this to the name of the site,
|
|
or some other string that is unique to this Django instance, to prevent key
|
|
collisions. Use an empty string if you don't care.
|
|
|
|
The cache middleware caches every page that doesn't have GET or POST
|
|
parameters. Additionally, ``CacheMiddleware`` automatically sets a few headers
|
|
in each ``HttpResponse``:
|
|
|
|
* Sets the ``Last-Modified`` header to the current date/time when a fresh
|
|
(uncached) version of the page is requested.
|
|
* Sets the ``Expires`` header to the current date/time plus the defined
|
|
``CACHE_MIDDLEWARE_SECONDS``.
|
|
* Sets the ``Cache-Control`` header to give a max age for the page -- again,
|
|
from the ``CACHE_MIDDLEWARE_SECONDS`` setting.
|
|
|
|
See the `middleware documentation`_ for more on middleware.
|
|
|
|
.. _`middleware documentation`: http://www.djangoproject.com/documentation/middleware/
|
|
|
|
The per-page cache
|
|
==================
|
|
|
|
A more granular way to use the caching framework is by caching the output of
|
|
individual views. ``django.views.decorators.cache`` defines a ``cache_page``
|
|
decorator that will automatically cache the view's response for you. It's easy
|
|
to use::
|
|
|
|
from django.views.decorators.cache import cache_page
|
|
|
|
def slashdot_this(request):
|
|
...
|
|
|
|
slashdot_this = cache_page(slashdot_this, 60 * 15)
|
|
|
|
Or, using Python 2.4's decorator syntax::
|
|
|
|
@cache_page(60 * 15)
|
|
def slashdot_this(request):
|
|
...
|
|
|
|
``cache_page`` takes a single argument: the cache timeout, in seconds. In the
|
|
above example, the result of the ``slashdot_this()`` view will be cached for 15
|
|
minutes.
|
|
|
|
The low-level cache API
|
|
=======================
|
|
|
|
Sometimes, however, caching an entire rendered page doesn't gain you very much.
|
|
For example, you may find it's only necessary to cache the result of an
|
|
intensive database. In cases like this, you can use the low-level cache API to
|
|
store objects in the cache with any level of granularity you like.
|
|
|
|
The cache API is simple::
|
|
|
|
# The cache module exports a cache object that's automatically
|
|
# created from the CACHE_BACKEND setting.
|
|
>>> from django.core.cache import cache
|
|
|
|
# The basic interface is set(key, value, timeout_seconds) and get(key).
|
|
>>> cache.set('my_key', 'hello, world!', 30)
|
|
>>> cache.get('my_key')
|
|
'hello, world!'
|
|
|
|
# (Wait 30 seconds...)
|
|
>>> cache.get('my_key')
|
|
None
|
|
|
|
# get() can take a default argument.
|
|
>>> cache.get('my_key', 'has_expired')
|
|
'has_expired'
|
|
|
|
# There's also a get_many() interface that only hits the cache once.
|
|
# Also, note that the timeout argument is optional and defaults to what
|
|
# you've given in the settings file.
|
|
>>> cache.set('a', 1)
|
|
>>> cache.set('b', 2)
|
|
>>> cache.set('c', 3)
|
|
|
|
# get_many() returns a dictionary with all the keys you asked for that
|
|
# actually exist in the cache (and haven't expired).
|
|
>>> cache.get_many(['a', 'b', 'c'])
|
|
{'a': 1, 'b': 2, 'c': 3}
|
|
|
|
# There's also a way to delete keys explicitly.
|
|
>>> cache.delete('a')
|
|
|
|
That's it. The cache has very few restrictions: You can cache any object that
|
|
can be pickled safely, although keys must be strings.
|
|
|
|
Controlling cache: Using Vary headers
|
|
=====================================
|
|
|
|
The Django cache framework works with `HTTP Vary headers`_ to allow developers
|
|
to instruct caching mechanisms to differ their cache contents depending on
|
|
request HTTP headers.
|
|
|
|
Essentially, the ``Vary`` response HTTP header defines which request headers a
|
|
cache mechanism should take into account when building its cache key.
|
|
|
|
By default, Django's cache system creates its cache keys using the requested
|
|
path -- e.g., ``"/stories/2005/jun/23/bank_robbed/"``. This means every request
|
|
to that URL will use the same cached version, regardless of user-agent
|
|
differences such as cookies or language preferences.
|
|
|
|
That's where ``Vary`` comes in.
|
|
|
|
If your Django-powered page outputs different content based on some difference
|
|
in request headers -- such as a cookie, or language, or user-agent -- you'll
|
|
need to use the ``Vary`` header to tell caching mechanisms that the page output
|
|
depends on those things.
|
|
|
|
To do this in Django, use the convenient ``vary_on_headers`` view decorator,
|
|
like so::
|
|
|
|
from django.views.decorators.vary import vary_on_headers
|
|
|
|
# Python 2.3 syntax.
|
|
def my_view(request):
|
|
...
|
|
my_view = vary_on_headers(my_view, 'User-Agent')
|
|
|
|
# Python 2.4 decorator syntax.
|
|
@vary_on_headers('User-Agent')
|
|
def my_view(request):
|
|
...
|
|
|
|
In this case, a caching mechanism (such as Django's own cache middleware) will
|
|
cache a separate version of the page for each unique user-agent.
|
|
|
|
The advantage to using the ``vary_on_headers`` decorator rather than manually
|
|
setting the ``Vary`` header (using something like
|
|
``response['Vary'] = 'user-agent'``) is that the decorator adds to the ``Vary``
|
|
header (which may already exist) rather than setting it from scratch.
|
|
|
|
Note that you can pass multiple headers to ``vary_on_headers()``::
|
|
|
|
@vary_on_headers('User-Agent', 'Cookie')
|
|
def my_view(request):
|
|
...
|
|
|
|
Because varying on cookie is such a common case, there's a ``vary_on_cookie``
|
|
decorator. These two views are equivalent::
|
|
|
|
@vary_on_cookie
|
|
def my_view(request):
|
|
...
|
|
|
|
@vary_on_headers('Cookie')
|
|
def my_view(request):
|
|
...
|
|
|
|
Also note that the headers you pass to ``vary_on_headers`` are not case
|
|
sensitive. ``"User-Agent"`` is the same thing as ``"user-agent"``.
|
|
|
|
You can also use a helper function, ``patch_vary_headers()``, directly::
|
|
|
|
from django.utils.cache import patch_vary_headers
|
|
def my_view(request):
|
|
...
|
|
response = render_to_response('template_name', context)
|
|
patch_vary_headers(response, ['Cookie'])
|
|
return response
|
|
|
|
``patch_vary_headers`` takes an ``HttpResponse`` instance as its first argument
|
|
and a list/tuple of header names as its second argument.
|
|
|
|
.. _`HTTP Vary headers`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44
|
|
|
|
Controlling cache: Using other headers
|
|
======================================
|
|
|
|
Another problem with caching is the privacy of data and the question of where
|
|
data should be stored in a cascade of caches.
|
|
|
|
A user usually faces two kinds of caches: his own browser cache (a private
|
|
cache) and his provider's cache (a public cache). A public cache is used by
|
|
multiple users and controlled by someone else. This poses problems with
|
|
sensitive data: You don't want, say, your banking-account number stored in a
|
|
public cache. So Web applications need a way to tell caches which data is
|
|
private and which is public.
|
|
|
|
The solution is to indicate a page's cache should be "private." To do this in
|
|
Django, use the ``cache_control`` view decorator. Example::
|
|
|
|
from django.views.decorators.cache import cache_control
|
|
@cache_control(private=True)
|
|
def my_view(request):
|
|
...
|
|
|
|
This decorator takes care of sending out the appropriate HTTP header behind the
|
|
scenes.
|
|
|
|
There are a few other ways to control cache parameters. For example, HTTP
|
|
allows applications to do the following:
|
|
|
|
* Define the maximum time a page should be cached.
|
|
* Specify whether a cache should always check for newer versions, only
|
|
delivering the cached content when there are no changes. (Some caches
|
|
might deliver cached content even if the server page changed -- simply
|
|
because the cache copy isn't yet expired.)
|
|
|
|
In Django, use the ``cache_control`` view decorator to specify these cache
|
|
parameters. In this example, ``cache_control`` tells caches to revalidate the
|
|
cache on every access and to store cached versions for, at most, 3600 seconds::
|
|
|
|
from django.views.decorators.cache import cache_control
|
|
@cache_control(must_revalidate=True, max_age=3600)
|
|
def my_view(request):
|
|
...
|
|
|
|
Any valid ``Cache-Control`` directive is valid in ``cache_control()``. For a
|
|
full list, see the `Cache-Control spec`_. Just pass the directives as keyword
|
|
arguments to ``cache_control()``, substituting underscores for hyphens. For
|
|
directives that don't take an argument, set the argument to ``True``.
|
|
|
|
Examples:
|
|
|
|
* ``@cache_control(max_age=3600)`` turns into ``max-age=3600``.
|
|
* ``@cache_control(public=True)`` turns into ``public``.
|
|
|
|
(Note that the caching middleware already sets the cache header's max-age with
|
|
the value of the ``CACHE_MIDDLEWARE_SETTINGS`` setting. If you use a custom
|
|
``max_age`` in a ``cache_control`` decorator, the decorator will take
|
|
precedence, and the header values will be merged correctly.)
|
|
|
|
.. _`Cache-Control spec`: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
|
|
|
|
Other optimizations
|
|
===================
|
|
|
|
Django comes with a few other pieces of middleware that can help optimize your
|
|
apps' performance:
|
|
|
|
* ``django.middleware.http.ConditionalGetMiddleware`` adds support for
|
|
conditional GET. This makes use of ``ETag`` and ``Last-Modified``
|
|
headers.
|
|
|
|
* ``django.middleware.gzip.GZipMiddleware`` compresses content for browsers
|
|
that understand gzip compression (all modern browsers).
|
|
|
|
Order of MIDDLEWARE_CLASSES
|
|
===========================
|
|
|
|
If you use ``CacheMiddleware``, it's important to put it in the right place
|
|
within the ``MIDDLEWARE_CLASSES`` setting, because the cache middleware needs
|
|
to know which headers by which to vary the cache storage. Middleware always
|
|
adds something the ``Vary`` response header when it can.
|
|
|
|
Put the ``CacheMiddleware`` after any middlewares that might add something to
|
|
the ``Vary`` header. The following middlewares do so:
|
|
|
|
* ``SessionMiddleware`` adds ``Cookie``
|
|
* ``GZipMiddleware`` adds ``Accept-Encoding``
|