A new try for docs

Anssi Kääriäinen 2014-01-11 19:00:43 +02:00
parent 33aa18a6e3
commit f9cc039007
2 changed files with 257 additions and 241 deletions


@ -0,0 +1,257 @@
==============
Custom lookups
==============
.. module:: django.db.models.lookups
   :synopsis: Custom lookups

.. currentmodule:: django.db.models

By default Django offers a wide variety of lookups for filtering
(for example, `exact` and `icontains`). This documentation explains how to
write custom lookups and how to alter the behavior of existing lookups. It
also explains how to transform field values, for example how to extract the
year from a DateField. By writing a custom `YearExtract` transformer it is
possible to filter on the transformed value, for example::

    Author.objects.filter(birthdate__year__lte=1981)

Currently transformers are only available in filtering, so they cannot be
used in other parts of the ORM. For example, this will not work::

    Author.objects.values_list('birthdate__year')

A simple Lookup example
~~~~~~~~~~~~~~~~~~~~~~~
Let's start with a simple custom lookup. We will write a custom lookup `ne`
which works opposite to `exact`. The query
`Author.objects.filter(name__ne='Jack')` will translate to the SQL::

    "author"."name" <> 'Jack'

A custom lookup needs an implementation, and Django needs to be told about
the existence of the lookup. The implementation for this lookup is
simple to write::

    from django.db.models import Lookup

    class NotEqual(Lookup):
        lookup_name = 'ne'

        def as_sql(self, qn, connection):
            lhs, lhs_params = self.process_lhs(qn, connection)
            rhs, rhs_params = self.process_rhs(qn, connection)
            params = lhs_params + rhs_params
            return '%s <> %s' % (lhs, rhs), params

To register the `NotEqual` lookup we just need to call register_lookup
on the field class we want the lookup to be available on::

    from django.db.models.fields import Field
    Field.register_lookup(NotEqual)

Now Field and all its subclasses have a NotEqual lookup.
The first notable thing about `NotEqual` is the lookup_name. This name must
be supplied, and it is used by Django in the register_lookup() call so that
Django knows to associate `ne` to the NotEqual implementation.
A Lookup works against two values, lhs and rhs. The abbreviations stand for
left-hand side and right-hand side. The lhs is usually a field reference,
but it can be anything implementing the query expression API. The
rhs is the value given by the user. In the example `name__ne='Jack'`, the
lhs is a reference to Author's name field and 'Jack' is the value.
The lhs and rhs are turned into values that are possible to use in SQL.
In the example above lhs is turned into "author"."name", [], and rhs is
turned into "%s", ['Jack']. The lhs is just a raw string without parameters
but the rhs is turned into a query parameter 'Jack'.
Finally we combine the lhs and rhs by adding ` <> ` between them,
and supply all the parameters for the query.
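The combining step can be tried without a database. The following is a minimal sketch in plain Python, not Django code: `combine_not_equal` is a hypothetical helper, and its inputs are the compiled values quoted above for the `name__ne='Jack'` example.

```python
def combine_not_equal(lhs_sql, lhs_params, rhs_sql, rhs_params):
    """Join two compiled sides with <> and concatenate their parameters."""
    # The parameters must follow the order of the placeholders in the SQL,
    # so the lhs parameters come first.
    params = list(lhs_params) + list(rhs_params)
    return '%s <> %s' % (lhs_sql, rhs_sql), params


# Compiled lhs and rhs as described in the text above.
sql, params = combine_not_equal('"author"."name"', [], '%s', ['Jack'])
print(sql)     # "author"."name" <> %s
print(params)  # ['Jack']
```

The value 'Jack' never becomes part of the SQL string itself. It travels separately as a parameter and the database driver fills in the placeholder.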
A Lookup needs to implement a limited part of the query expression API. See
the query expression API for details.
A simple transformer example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Next we will write a simple transformer. The transformer will be called
`YearExtract`. It can be used to extract the year part from a `DateField`.
Let's start by writing the implementation::

    from django.db.models import Extract, IntegerField

    class YearExtract(Extract):
        lookup_name = 'year'
        output_type = IntegerField()

        def as_sql(self, qn, connection):
            lhs, params = qn.compile(self.lhs)
            return "EXTRACT(YEAR FROM %s)" % lhs, params

Next, let's register it for `DateField`::

    from django.db.models import DateField
    DateField.register_lookup(YearExtract)

Now any DateField in your project will have a `year` transformer. For example,
the following query::

    Author.objects.filter(birthdate__year__lte=1981)

would translate to the following query on PostgreSQL::

    SELECT ...
    FROM "author"
    WHERE EXTRACT(YEAR FROM "author"."birthdate") <= 1981

A YearExtract instance works only against self.lhs. Usually the lhs is
transformed in some way. Further lookups and extracts work against the
transformed value.
Note the definition of output_type in the `YearExtract`. The output_type is
a field instance. It informs Django that the Extract class transformed the
type of the value to an int. This is currently used only to check which
lookups the extract has.
The SQL used in this example works on most databases. Check your database
vendor's documentation to see if EXTRACT(YEAR FROM date) is supported.
Writing an efficient year__exact lookup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
When using the `year` lookup written above, the SQL produced will not use
indexes efficiently. We will fix that by writing a custom `exact` lookup
for YearExtract. For example, if the user filters on
`birthdate__year__exact=1981`, then we want to produce the following SQL::

    birthdate >= to_date('1981-01-01') AND birthdate <= to_date('1981-12-31')

The implementation is::

    from django.db.models import Lookup

    class YearExact(Lookup):
        lookup_name = 'exact'

        def as_sql(self, qn, connection):
            # Compile the original field, skipping the EXTRACT wrapper.
            lhs, lhs_params = qn.compile(self.lhs.lhs)
            rhs, rhs_params = self.process_rhs(qn, connection)
            # The params must appear in the same order as the placeholders.
            params = lhs_params + rhs_params + lhs_params + rhs_params
            return ("%s >= to_date(%s || '-01-01') AND %s <= to_date(%s || '-12-31')"
                    % (lhs, rhs, lhs, rhs), params)

    YearExtract.register_lookup(YearExact)

There are a couple of notable things going on. First, `YearExact` isn't
calling process_lhs(). Instead it skips that step and directly compiles the
lhs of self.lhs. This prevents `YearExtract` from adding the
EXTRACT clause to the query. Referring directly to self.lhs.lhs is safe because
`YearExact` can be accessed only through the `year__exact` lookup, that is,
the lhs is always a `YearExtract`.
Next, as both the lhs and rhs are used multiple times in the query the params
need to contain lhs_params and rhs_params multiple times.
The final query does string manipulation directly in the database. The reason
for doing this is that if self.rhs is something other than a plain integer
value (for example an `F()` reference) we can't do the transformations in
Python.
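The parameter ordering and the in-database string concatenation can be checked with Python's built-in `sqlite3` module. This is only a sketch under assumptions: SQLite stands in for the real backend, dates are stored as ISO text, `to_date()` is dropped because SQLite does not provide it, and the table and rows are made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE author (name TEXT, birthdate TEXT)')
conn.executemany(
    'INSERT INTO author VALUES (?, ?)',
    [('Ann', '1980-12-31'), ('Bob', '1981-01-01'),
     ('Cid', '1981-12-31'), ('Dee', '1982-01-01')],
)

# Same shape as YearExact.as_sql(): the SQL uses lhs, rhs, lhs, rhs,
# so the params are concatenated in exactly that order.
lhs, lhs_params = 'birthdate', []
rhs, rhs_params = '?', [1981]  # sqlite3 uses ? as its placeholder
sql = "%s >= (%s || '-01-01') AND %s <= (%s || '-12-31')" % (lhs, rhs, lhs, rhs)
params = lhs_params + rhs_params + lhs_params + rhs_params

range_rows = conn.execute(
    'SELECT name FROM author WHERE ' + sql + ' ORDER BY name', params
).fetchall()

# For comparison: filtering on the extracted year, which cannot use an
# ordinary index on birthdate.
extract_rows = conn.execute(
    "SELECT name FROM author "
    "WHERE CAST(strftime('%Y', birthdate) AS INTEGER) = ? ORDER BY name",
    [1981],
).fetchall()

print(range_rows)    # [('Bob',), ('Cid',)]
print(extract_rows)  # [('Bob',), ('Cid',)]
```

Both predicates select the same rows, but only the range form compares the bare column against constants.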
Writing alternative implementations for existing lookups
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sometimes different database vendors require different SQL for the same
operation. For this example we will write a custom implementation of the
NotEqual operator for MySQL. Instead of `<>` we will use the `!=`
operator.
There are two ways to do this. The first is to write a subclass with an
as_mysql() method and register the subclass over the original class::

    class MySQLNotEqual(NotEqual):
        def as_mysql(self, qn, connection):
            lhs, lhs_params = self.process_lhs(qn, connection)
            rhs, rhs_params = self.process_rhs(qn, connection)
            params = lhs_params + rhs_params
            return '%s != %s' % (lhs, rhs), params

    Field.register_lookup(MySQLNotEqual)

The alternative is to monkey-patch the existing class in place::

    def as_mysql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        params = lhs_params + rhs_params
        return '%s != %s' % (lhs, rhs), params

    NotEqual.as_mysql = as_mysql

The subclass way allows one to override methods of the lookup if needed. The
monkey-patch way allows writing different implementations for the same class
in different locations of the project.
Django knows to call as_mysql() instead of as_sql() as follows.
When qn.compile(notequal_instance) is called, Django first checks if there
is a method named 'as_%s' % connection.vendor. If that method doesn't exist,
as_sql() will be called.
The vendor names for Django's built-in backends are 'sqlite', 'postgresql',
'oracle' and 'mysql'.
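The dispatch rule can be sketched in plain Python. The classes `FakeConnection`, `FakeCompiler` and `NotEqualSketch` below are hypothetical stand-ins and not Django's classes; only the rule "try as_<vendor>() first, otherwise call as_sql()" is taken from the description above.

```python
class FakeConnection:
    """Hypothetical stand-in for a database connection."""

    def __init__(self, vendor):
        self.vendor = vendor


class FakeCompiler:
    """Hypothetical stand-in for qn; only the dispatch rule is modelled."""

    def __init__(self, connection):
        self.connection = connection

    def compile(self, node):
        # Prefer as_<vendor>() when the node defines it, else use as_sql().
        method = getattr(node, 'as_%s' % self.connection.vendor, None)
        if method is None:
            method = node.as_sql
        return method(self, self.connection)


class NotEqualSketch:
    """Returns fixed SQL so that the chosen method is easy to see."""

    def as_sql(self, qn, connection):
        return 'name <> %s', ['Jack']

    def as_mysql(self, qn, connection):
        return 'name != %s', ['Jack']


node = NotEqualSketch()
pg_sql, pg_params = FakeCompiler(FakeConnection('postgresql')).compile(node)
my_sql, my_params = FakeCompiler(FakeConnection('mysql')).compile(node)
print(pg_sql)  # name <> %s
print(my_sql)  # name != %s
```

Because the lookup happens on every compile() call, a method that is monkey-patched onto the class later is picked up just like one defined in a subclass.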
The Lookup API
~~~~~~~~~~~~~~
A lookup has attributes lhs and rhs. The lhs is something implementing the
query expression API and the rhs is either a plain value, or something that
needs to be compiled into SQL. Examples of SQL-compiled values include `F()`
references and querysets used as values.
A lookup needs to define lookup_name as a class level attribute. This is used
when registering lookups.
A lookup has three public methods. The as_sql(qn, connection) method needs
to produce a query string and the parameters used by the query string. The qn
has a method compile() which can be used to compile self.lhs. However, it is
usually better to call self.process_lhs(qn, connection) instead, which returns
the query string and parameters for the lhs. Similarly,
process_rhs(qn, connection) returns the query string and parameters for the rhs.
The Query Expression API
~~~~~~~~~~~~~~~~~~~~~~~~
A lookup can assume that the lhs responds to the query expression API.
Currently direct field references, aggregates and `Extract` instances respond
to this API.
.. method:: as_sql(qn, connection)

    Responsible for producing the query string and parameters for the expression.
    The qn has a compile() method that can be used to compile other expressions.
    The connection is the connection used to execute the query. The
    connection.vendor attribute can be used to return different query strings
    for different backends.

    Calling expression.as_sql() directly is usually an error; instead
    qn.compile(expression) should be used. The qn.compile() method will take
    care of calling vendor-specific methods of the expression.

.. method:: as_vendorname(qn, connection)

    Works like the as_sql() method. When an expression is compiled by
    qn.compile(), Django will first try to call as_vendorname(), where
    vendorname is the vendor name of the backend used for executing the query.
    The vendorname is one of 'postgresql', 'oracle', 'sqlite' or 'mysql' for
    Django's built-in backends.

.. method:: get_lookup(lookup_name)

    The get_lookup() method is used to fetch lookups. By default the lookup
    is fetched from the expression's output type, but it is possible to override
    this method to alter that behaviour.

.. attribute:: output_type

    The output_type attribute is used by the get_lookup() method to check for
    lookups. The output_type should be a field instance.

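How get_lookup() and output_type work together can be modelled with a few lines of plain Python. This is a simplified sketch and not Django's implementation: every class name ending in `Sketch`, the `class_lookups` registry and the `LessThanOrEqual` placeholder are hypothetical and exist only for this illustration.

```python
class FieldSketch:
    """Hypothetical, simplified field with a per-class lookup registry."""

    @classmethod
    def register_lookup(cls, lookup):
        # Each class keeps its own registry; subclasses see parent entries
        # through the MRO walk in get_lookup().
        if 'class_lookups' not in cls.__dict__:
            cls.class_lookups = {}
        cls.class_lookups[lookup.lookup_name] = lookup

    def get_lookup(self, lookup_name):
        for klass in type(self).__mro__:
            found = klass.__dict__.get('class_lookups', {}).get(lookup_name)
            if found is not None:
                return found
        return None


class IntegerFieldSketch(FieldSketch):
    pass


class DateFieldSketch(FieldSketch):
    pass


class LessThanOrEqual:
    lookup_name = 'lte'


class YearExtractSketch:
    lookup_name = 'year'
    output_type = IntegerFieldSketch()

    def get_lookup(self, lookup_name):
        # By default, lookups come from the output type, not from the lhs.
        return self.output_type.get_lookup(lookup_name)


IntegerFieldSketch.register_lookup(LessThanOrEqual)
DateFieldSketch.register_lookup(YearExtractSketch)

# Resolving birthdate__year__lte one step at a time.
extract_cls = DateFieldSketch().get_lookup('year')
final_lookup = extract_cls().get_lookup('lte')
print(extract_cls.__name__)   # YearExtractSketch
print(final_lookup.__name__)  # LessThanOrEqual
```

In this model `lte` is found for `birthdate__year__lte` only because the extract declares an integer output type; the date field itself has no `lte` registered here.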
Note that this documentation lists only the public methods of the API.


@ -1,241 +0,0 @@
==============
Custom lookups
==============
.. module:: django.db.models.lookups
   :synopsis: Custom lookups

.. currentmodule:: django.db.models

Django's ORM works using lookup paths when building query filters and other
query conditions. For example in the query Book.objects.filter(author__age__lte=30)
the part "author__age__lte" is the lookup path.
The lookup path consists of three different parts. First is the related
lookups. In the author__age__lte example the part author refers to Book's
related model Author. The second part of the lookup path is the field. This is
Author's age field in the example. Finally, the lte part is commonly called
just the lookup. Both the related lookups part and the final lookup part can
contain multiple parts, for example "author__friends__birthdate__year__lte"
has author, friends as related lookups, birthdate as the field and year, lte
as final lookup part.
This documentation concentrates on writing custom lookups. By writing custom
lookups it is possible to control how Django interprets the final lookup part.
Django will fetch a ``Lookup`` class from the final field using the field's
method get_lookup(lookup_name). This method is allowed to do these things:
1. Return a Lookup class
2. Raise a FieldError
3. Return None
Returning None is only available during the backwards compatibility period.
The interpretation is to use the old way of lookup handling inside the ORM.
The Lookup class
~~~~~~~~~~~~~~~~
A Lookup operates on two values and produces boolean results. The values
are called lhs and rhs. The lhs is usually a field reference, but it can be
anything implementing the query expression API. The rhs is a value to compare
against.
The API is as follows:
.. attribute:: lookup_name
A string used by Django to distinguish different lookups. For example
'exact'.
.. method:: __init__(lhs, rhs)
The lhs is something implementing the query expression API. For example in
author__age__lte=30 the lhs is a Col instance referencing the age field of
author model. The rhs is the value to compare against. It can be a Python value
(30 in the example) or an SQL reference (produced by using F() or a queryset, for
example).
.. attribute:: Lookup.lhs
The left hand side part of this lookup. You can assume it implements the
query expression interface.
.. attribute:: Lookup.rhs
The value to compare against.
.. method:: Lookup.process_lhs(qn, connection)
Turns the lhs into query string + params.
.. method:: Lookup.process_rhs(qn, connection)
Turns the rhs into query string + params.
.. method:: Lookup.as_sql(qn, connection)
This method is used to produce the query string of the Lookup. A typical
implementation is usually something like::

    def as_sql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        params = lhs_params + rhs_params
        return '%s <OPERATOR> %s' % (lhs, rhs), params

where the <OPERATOR> is some query operator. The qn is a callable that
can be used to convert strings to quoted variants (that is, colname to
"colname"). Note that the quotation is *not* safe against SQL injection.
In addition the qn implements method compile() which can be used to turn
anything with as_sql() method to query string. You should always call
qn.compile(part) instead of part.as_sql(qn, connection) so that 3rd party
backends have ability to customize the produced query string. More of this
later on.
The connection is the connection the SQL is compiled against.
In addition the Lookup class has some private methods - that is, implementing
just the above mentioned attributes and methods is not enough, instead you
must subclass Lookup.
The Extract class
~~~~~~~~~~~~~~~~~
An Extract is something that converts a value to another value in the query
string. For example you could have an Extract that produces modulo 3 of the
given value. In SQL this is something like "author"."age" % 3.
Extracts are used in nested lookups. The Extract class must implement the
query part interface.
Extracts should be written by subclassing django.db.models.Extract.
A simple Lookup example
~~~~~~~~~~~~~~~~~~~~~~~
This is how to write a simple mod3 lookup for IntegerField::

    from django.db.models import Lookup, IntegerField

    class Mod3(Lookup):
        lookup_name = 'mod3'

        def as_sql(self, qn, connection):
            lhs_sql, params = self.process_lhs(qn, connection)
            rhs_sql, rhs_params = self.process_rhs(qn, connection)
            params.extend(rhs_params)
            # We need double-escaping for the %%%% operator.
            return '%s %%%% %s' % (lhs_sql, rhs_sql), params

    IntegerField.register_lookup(Mod3)

Now all IntegerFields or subclasses of IntegerField will have
a mod3 lookup. For example you could do Author.objects.filter(age__mod3=2).
This query would return every author whose age % 3 == 2.
A simple nested lookup example
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here is how to write an Extract and a Lookup for IntegerField. The example
lookup can be used in the same way as the mod3 lookup above, and in addition it
supports nested lookups::

    class Mod3Extract(Extract):
        lookup_name = 'mod3'

        def as_sql(self, qn, connection):
            lhs, lhs_params = qn.compile(self.lhs)
            return '%s %%%% 3' % (lhs,), lhs_params

    IntegerField.register_lookup(Mod3Extract)

Note that if you already added Mod3 for IntegerField in the above
example, now Mod3Extract will override that lookup.
This lookup can be used like Mod3 lookup, but in addition it supports
nesting, too. The default output type for Extracts is the same type as the
lhs' output_type. So, the Mod3Extract supports all the same lookups as
IntegerField. For example Author.objects.filter(age__mod3__in=[1, 2])
returns all authors for which age % 3 in (1, 2).
A more complex nested lookup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
We will write a Year lookup that extracts the year from a date field. This
lookup will convert the output type of the field: the lhs (or "input")
field is DateField, but the output is of type IntegerField::

    from django.db.models import IntegerField, DateField
    from django.db.models.lookups import Extract

    class YearExtract(Extract):
        lookup_name = 'year'

        def as_sql(self, qn, connection):
            lhs_sql, params = qn.compile(self.lhs)
            # hmmh - this is internal API...
            return connection.ops.date_extract_sql('year', lhs_sql), params

        @property
        def output_type(self):
            return IntegerField()

    DateField.register_lookup(YearExtract)

Now you could write Author.objects.filter(birthdate__year=1981). This will
produce SQL like 'EXTRACT('year' from "author"."birthdate") = 1981'. The
SQL produced depends on the backend used. In addition you can use any lookup
defined for IntegerField, even mod3 if you added that. So,
Author.objects.filter(birthdate__year__mod3=2) will return every author
with birthdate.year % 3 == 2.
We could go further and add an optimized implementation for exact lookups::

    from django.db.models.lookups import Lookup

    class YearExtractOptimized(YearExtract):
        def get_lookup(self, lookup):
            if lookup == 'exact':
                return YearExact
            return super(YearExtractOptimized, self).get_lookup(lookup)

    class YearExact(Lookup):
        def as_sql(self, qn, connection):
            # We will need to skip the extract part, and instead go
            # directly with the originating field, that is self.lhs.lhs
            lhs_sql, lhs_params = self.process_lhs(qn, connection, self.lhs.lhs)
            rhs_sql, rhs_params = self.process_rhs(qn, connection)
            # Note that we must be careful so that we have params in the
            # same order as we have the parts in the SQL.
            params = []
            params.extend(lhs_params)
            params.extend(rhs_params)
            params.extend(lhs_params)
            params.extend(rhs_params)
            # We use PostgreSQL specific SQL here. Note that we must do the
            # conversions in SQL instead of in Python to support F() references.
            return ("%(lhs)s >= (%(rhs)s || '-01-01')::date "
                    "AND %(lhs)s <= (%(rhs)s || '-12-31')::date" %
                    {'lhs': lhs_sql, 'rhs': rhs_sql}, params)

Note that we used PostgreSQL specific SQL above. What if we want to support
MySQL, too? This can be done by registering a different compiling implementation
for MySQL::

    from django.db.backends.utils import add_implementation

    @add_implementation(YearExact, 'mysql')
    def mysql_year_exact(node, qn, connection):
        lhs_sql, lhs_params = node.process_lhs(qn, connection, node.lhs.lhs)
        rhs_sql, rhs_params = node.process_rhs(qn, connection)
        params = []
        params.extend(lhs_params)
        params.extend(rhs_params)
        params.extend(lhs_params)
        params.extend(rhs_params)
        return ("%(lhs)s >= str_to_date(concat(%(rhs)s, '-01-01'), '%%%%Y-%%%%m-%%%%d') "
                "AND %(lhs)s <= str_to_date(concat(%(rhs)s, '-12-31'), '%%%%Y-%%%%m-%%%%d')" %
                {'lhs': lhs_sql, 'rhs': rhs_sql}, params)

Now, on MySQL instead of calling as_sql() of the YearExact Django will use the
above compile implementation.