From f2dc4429a1da04c858364972eea57a35a868dab4 Mon Sep 17 00:00:00 2001
From: Marc Tamlyn
Date: Sun, 12 Jan 2014 13:15:05 +0000
Subject: [PATCH] Reworked custom lookups docs.

Mostly just formatting and rewording, but also replaced the example using
``YearExtract`` to use an example which is unlikely to ever be possible
directly in the ORM.
---
 docs/ref/models/custom_lookups.txt | 334 ++++++++++++++++-------------
 1 file changed, 188 insertions(+), 146 deletions(-)

diff --git a/docs/ref/models/custom_lookups.txt b/docs/ref/models/custom_lookups.txt
index aee70c5c77..ef8ec7f40e 100644
--- a/docs/ref/models/custom_lookups.txt
+++ b/docs/ref/models/custom_lookups.txt
@@ -2,37 +2,33 @@
 Custom lookups
 ==============
 
+.. versionadded:: 1.7
+
 .. module:: django.db.models.lookups
    :synopsis: Custom lookups
 
 .. currentmodule:: django.db.models
 
-By default Django offers a wide variety of different lookups for filtering
-(for example, `exact` and `icontains`). This documentation explains how to
-write custom lookups and how to alter the working of existing lookups. In
-addition how to transform field values is explained. fFor example how to
-extract the year from a DateField. By writing a custom `YearExtract`
-transformer it is possible to filter on the transformed value, for example::
-
-    Author.objects.filter(birthdate__year__lte=1981)
-
-Currently transformers are only available in filtering. So, it is not possible
-to use it in other parts of the ORM, for example this will not work::
-
-    Author.objects.values_list('birthdate__year')
+By default Django offers a wide variety of :ref:`built-in lookups
+<field-lookups>` for filtering (for example, ``exact`` and ``icontains``). This
+documentation explains how to write custom lookups and how to alter the working
+of existing lookups.
 
 A simple Lookup example
 ~~~~~~~~~~~~~~~~~~~~~~~
 
-Lets start with a simple custom lookup. We will write a custom lookup `ne`
-which works opposite to `exact`. A `Author.objects.filter(name__ne='Jack')`
-will translate to::
+Let's start with a simple custom lookup. We will write a custom lookup ``ne``
+which works opposite to ``exact``. ``Author.objects.filter(name__ne='Jack')``
+will translate to the SQL::
 
     "author"."name" <> 'Jack'
 
-A custom lookup will need an implementation and Django needs to be told
-the existence of the lookup. The implementation for this lookup will be
-simple to write::
+This SQL is backend independent, so we don't need to worry about different
+databases.
+
+There are two steps to making this work. Firstly we need to implement the
+lookup, then we need to tell Django about it. The implementation is quite
+straightforward::
 
     from django.db.models import Lookup
 
@@ -45,131 +41,165 @@ simple to write::
             params = lhs_params + rhs_params
             return '%s <> %s' % (lhs, rhs), params
 
-To register the `NotEqual` lookup we will just need to call register_lookup
-on the field class we want the lookup to be available::
+To register the ``NotEqual`` lookup we will just need to call
+``register_lookup`` on the field class we want the lookup to be available for. In
+this case, the lookup makes sense on all ``Field`` subclasses, so we register
+it with ``Field`` directly::
 
     from django.db.models.fields import Field
     Field.register_lookup(NotEqual)
 
-Now Field and all its subclasses have a NotEqual lookup.
+We can now use ``foo__ne`` for any field ``foo``. You will need to ensure that
+this registration happens before you try to create any querysets using it. You
+could place the implementation in a ``models.py`` file, or register the lookup
+in the ``ready()`` method of an ``AppConfig``.
 
-The first notable thing about `NotEqual` is the lookup_name. This name must
-be supplied, and it is used by Django in the register_lookup() call so that
-Django knows to associate `ne` to the NotEqual implementation.
-`
-An Lookup works against two values, lhs and rhs. The abbreviations stand for
-left-hand side and right-hand side. The lhs is usually a field reference,
-but it can be anything implementing the query expression API. The
-rhs is the value given by the user. In the example `name__ne=Jack`, the
-lhs is reference to Author's name field and Jack is the value.
+Taking a closer look at the implementation, the first required attribute is
+``lookup_name``. This allows the ORM to understand how to interpret ``name__ne``
+and use ``NotEqual`` to generate the SQL. By convention, these names are always
+lowercase strings containing only letters, but the only hard requirement is
+that it must not contain the string ``__``.
 
-The lhs and rhs are turned into values that are possible to use in SQL.
-In the example above lhs is turned into "author"."name", [], and rhs is
-turned into "%s", ['Jack']. The lhs is just raw string without parameters
-but the rhs is turned into a query parameter 'Jack'.
+A ``Lookup`` works against two values, ``lhs`` and ``rhs``, standing for
+left-hand side and right-hand side. The left-hand side is usually a field
+reference, but it can be anything implementing the :ref:`query expression API
+<query-expression>`. The right-hand side is the value given by the user. In the
+example ``Author.objects.filter(name__ne='Jack')``, the left-hand side is a
+reference to the ``name`` field of the ``Author`` model, and ``'Jack'`` is the
+right-hand side.
 
-Finally we combine the lhs and rhs by adding ` <> ` in between of them,
-and supply all the parameters for the query.
+We call ``process_lhs`` and ``process_rhs`` to convert them into the values we
+need for SQL. In the above example, ``process_lhs`` returns
+``('"author"."name"', [])`` and ``process_rhs`` returns ``('"%s"', ['Jack'])``.
+In this example there were no parameters for the left-hand side, but this would
+depend on the object we have, so we still need to include them in the
+parameters we return.
 
-A Lookup needs to implement a limited part of query expression API. See
-the query expression API for details.
+Finally we combine the parts into a SQL expression with ``<>``, and supply all
+the parameters for the query. We then return a tuple containing the generated
+SQL string and the parameters.
 
 A simple transformer example
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-We will next write a simple transformer. The transformer will be called
-`YearExtract`. It can be used to extract the year part from `DateField`.
+The custom lookup above is great, but in some cases you may want to be able to
+chain lookups together. For example, let's suppose we are building an
+application where we want to make use of the ``abs()`` operator.
+We have an ``Experiment`` model which records a start value, end value and the
+change (start - end). We would like to find all experiments where the change
+was equal to a certain amount (``Experiment.objects.filter(change__abs=27)``),
+or where it did not exceed a certain amount
+(``Experiment.objects.filter(change__abs__lt=27)``).
 
-Lets start by writing the implementation::
+.. note::
+    This example is somewhat contrived, but it demonstrates nicely the range of
+    functionality which is possible in a database backend independent manner,
+    and without duplicating functionality already in Django.
+
+We will start by writing an ``AbsoluteValue`` transformer. This will use the SQL
+function ``ABS()`` to transform the value before comparison::
 
     from django.db.models import Extract
 
-    class YearExtract(Extract):
-        lookup_name = 'year'
-        output_type = IntegerField()
+    class AbsoluteValue(Extract):
+        lookup_name = 'abs'
 
         def as_sql(self, qn, connection):
             lhs, params = qn.compile(self.lhs)
-            return "EXTRACT(YEAR FROM %s)" % lhs, params
+            return "ABS(%s)" % lhs, params
 
-Next, lets register it for `DateField`::
+Next, let's register it for ``IntegerField``::
 
-    from django.db.models import DateField
-    DateField.register_lookup(YearExtract)
+    from django.db.models import IntegerField
+    IntegerField.register_lookup(AbsoluteValue)
 
-Now any DateField in your project will have `year` transformer. For example
-the following query::
+We can now run the queries we had before.
+``Experiment.objects.filter(change__abs=27)`` will generate the following SQL::
 
-    Author.objects.filter(birthdate__year__lte=1981)
+    SELECT ... WHERE ABS("experiments"."change") = 27
 
-would translate to the following query on PostgreSQL::
+By using ``Extract`` instead of ``Lookup`` it means we are able to chain
+further lookups afterwards. So
+``Experiment.objects.filter(change__abs__lt=27)`` will generate the following
+SQL::
 
-    SELECT ...
-    FROM "author"
-    WHERE EXTRACT(YEAR FROM "author"."birthdate") <= 1981
+    SELECT ... WHERE ABS("experiments"."change") < 27
 
-An YearExtract class works only against self.lhs. Usually the lhs is
-transformed in some way. Further lookups and extracts work against the
-transformed value.
+Subclasses of ``Extract`` usually only operate on the left-hand side of the
+expression. Further lookups will work on the transformed value. Note that in
+this case where there is no other lookup specified, Django interprets
+``change__abs=27`` as ``change__abs__exact=27``.
 
-Note the definition of output_type in the `YearExtract`. The output_type is
-a field instance. It informs Django that the Extract class transformed the
-type of the value to an int. This is currently used only to check which
-lookups the extract has.
+When looking for which lookups are allowable after the ``Extract`` has been
+applied, Django uses the ``output_type`` attribute. We didn't need to specify
+this here as it didn't change, but supposing we were applying ``AbsoluteValue``
+to some field which represents a more complex type (for example a point
+relative to an origin, or a complex number) then we may have wanted to specify
+``output_type = FloatField``, which will ensure that further lookups like
+``abs__lte`` behave as they would for a ``FloatField``.
 
-The used SQL in this example works on most databases. Check you database
-vendor's documentation to see if EXTRACT(year from date) is supported.
+Writing an efficient abs__lt lookup
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-Writing an efficient year__exact lookup
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+When using the above written ``abs`` lookup, the SQL produced will not use
+indexes efficiently in some cases. In particular, when we use
+``change__abs__lt=27``, this is equivalent to ``change__gt=-27`` AND
+``change__lt=27``. (For the ``lte`` case we could use the SQL ``BETWEEN``).
 
-When using the above written `year` lookup, the SQL produced will not use
-indexes efficiently. We will fix that by writing a custom `exact` lookup
-for YearExtract. For example if the user filters on
-`birthdate__year__exact=1981`, then we want to produce the following SQL::
+So we would like ``Experiment.objects.filter(change__abs__lt=27)`` to generate
+the following SQL::
 
-    birthdate >= to_date('1981-01-01') AND birthdate <= to_date('1981-12-31')
+    SELECT ... WHERE "experiments"."change" < 27 AND "experiments"."change" > -27
 
 The implementation is::
 
     from django.db.models import Lookup
 
-    class YearExact(Lookup):
-        lookup_name = 'exact'
+    class AbsoluteValueLessThan(Lookup):
+        lookup_name = 'lt'
 
         def as_sql(self, qn, connection):
             lhs, lhs_params = qn.compile(self.lhs.lhs)
             rhs, rhs_params = self.process_rhs(qn, connection)
             params = lhs_params + rhs_params + lhs_params + rhs_params
-            return '%s >= to_date(%s || '-01-01') AND %s <= to_date(%s || '-12-31') % (lhs, rhs, lhs, rhs), params
+            return '%s < %s AND %s > -%s' % (lhs, rhs, lhs, rhs), params
 
-    YearExtract.register_lookup(YearExact)
+    AbsoluteValue.register_lookup(AbsoluteValueLessThan)
 
-There are a couple of notable things going on. First, `YearExact` isn't
-calling process_lhs(). Instead it skips and compiles directly the lhs used by
-self.lhs. The reason this is done is to skip `YearExtract` from adding the
-EXTRACT clause to the query. Referring directly to self.lhs.lhs is safe as
-`YearExact` can be accessed only from `year__exact` lookup, that is the lhs
-is always `YearExtract`.
+There are a couple of notable things going on. First, ``AbsoluteValueLessThan``
+isn't calling ``process_lhs()``. Instead it skips the transformation of the
+``lhs`` done by ``AbsoluteValue`` and uses the original ``lhs``. That is, we
+want to get ``"experiments"."change"`` not ``ABS("experiments"."change")``.
+Referring directly to ``self.lhs.lhs`` is safe as ``AbsoluteValueLessThan``
+can be accessed only from the ``AbsoluteValue`` lookup, that is the ``lhs`` is
+always an instance of ``AbsoluteValue``.
 
-Next, as both the lhs and rhs are used multiple times in the query the params
-need to contain lhs_params and rhs_params multiple times.
+Notice also that as both sides are used multiple times in the query the params
+need to contain ``lhs_params`` and ``rhs_params`` multiple times.
 
-The final query does string manipulation directly in the database. The reason
-for doing this is that if the self.rhs is something else than a plain integer
-value (for exampel a `F()` reference) we can't do the transformations in
-Python.
+The final query does the inversion (``27`` to ``-27``) directly in the
+database. The reason for doing this is that if ``self.rhs`` is something other
+than a plain integer value (for example an ``F()`` reference) we can't do the
+transformations in Python.
+
+.. note::
+    In fact, most lookups with ``__abs`` could be implemented as range queries
+    like this, and on most database backends it is likely to be more sensible to
+    do so as you can make use of the indexes. However with PostgreSQL you may
+    want to add an index on ``abs(change)`` which would allow these queries to
+    be very efficient.
 
 Writing alternative implemenatations for existing lookups
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
 Sometimes different database vendors require different SQL for the same
 operation. For this example we will rewrite a custom implementation for
-MySQL for the NotEqual operator. Instead of `<>` we will be using `!=`
-operator.
+MySQL for the NotEqual operator. Instead of ``<>`` we will be using ``!=``
+operator. (Note that in reality almost all databases support both, including
+all the official databases supported by Django).
 
-There are two ways to do this. The first is to write a subclass with a
-as_mysql() method and registering the subclass over the original class::
+We can change the behaviour on a specific backend by creating a subclass of
+``NotEqual`` with an ``as_mysql`` method::
 
     class MySQLNotEqual(NotEqual):
         def as_mysql(self, qn, connection):
@@ -179,80 +209,92 @@ as_mysql() method and registering the subclass over the original class::
             return '%s != %s' % (lhs, rhs), params
     Field.register_lookup(MySQLNotExact)
 
-The alternate is to monkey-patch the existing class in place::
+We can then register it with ``Field``. It takes the place of the original
+``NotEqual`` class as it has the same ``lookup_name``.
 
-    def as_mysql(self, qn, connection):
-        lhs, lhs_params = self.process_lhs(qn, connection)
-        rhs, rhs_params = self.process_rhs(qn, connection)
-        params = lhs_params + rhs_params
-        return '%s != %s' % (lhs, rhs), params
-    NotEqual.as_mysql = as_mysql
+When compiling a query, Django first looks for ``'as_%s' % connection.vendor``
+methods, and then falls back to ``as_sql``. The vendor names for the in-built
+backends are ``sqlite``, ``postgresql``, ``oracle`` and ``mysql``.
 
-The subclass way allows one to override methods of the lookup if needed. The
-monkey-patch way allows writing different implementations for the same class
-in different locations of the project.
+.. note::
+    If for some reason you need to change the lookup just for a specific query,
+    you can do that and reregister the original lookup afterwards. However you
+    need to be careful to ensure that your patch is in place until the queryset
+    is evaluated, not just created.
 
-The way Django knows to call as_mysql() instead of as_sql() is as follows.
-When qn.compile(notequal_instance) is called, Django first checks if there
-is a method named 'as_%s' % connection.vendor. If that method doesn't exist,
-the as_sql() will be called.
-
-The vendor names for Django's in-built backends are 'sqlite', 'postgresql',
-'oracle' and 'mysql'.
-
-The Lookup API
-~~~~~~~~~~~~~~
-
-An lookup has attributes lhs and rhs. The lhs is something implementing the
-query expression API and the rhs is either a plain value, or something that
-needs to be compiled into SQL. Examples of SQL-compiled values include `F()`
-references and usage of `QuerySets` as value.
-
-A lookup needs to define lookup_name as a class level attribute. This is used
-when registering lookups.
-
-A lookup has three public methods. The as_sql(qn, connection) method needs
-to produce a query string and parameters used by the query string. The qn has
-a method compile() which can be used to compile self.lhs. However usually it
-is better to call self.process_lhs(qn, connection) instead, which returns
-query string and parameters for the lhs. Similary process_rhs(qn, connection)
-returns query string and parameters for the rhs.
+.. _query-expression:
 
 The Query Expression API
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
 A lookup can assume that the lhs responds to the query expression API.
-Currently direct field references, aggregates and `Extract` instances respond
+Currently direct field references, aggregates and ``Extract`` instances respond
 to this API.
 
 .. method:: as_sql(qn, connection)
 
-Responsible for producing the query string and parameters for the expression.
-The qn has a compile() method that can be used to compile other expressions.
-The connection is the connection used to execute the query. The
-connection.vendor attribute can be used to return different query strings
-for different backends.
+    Responsible for producing the query string and parameters for the
+    expression. The ``qn`` has a ``compile()`` method that can be used to
+    compile other expressions. The ``connection`` is the connection used to
+    execute the query.
 
-Calling expression.as_sql() directly is usually an error - instead
-qn.compile(expression) should be used. The qn.compile() method will take
-care of calling vendor-specific methods of the expression.
+    Calling ``expression.as_sql()`` directly is usually incorrect - instead
+    ``qn.compile(expression)`` should be used. The ``qn.compile()`` method will
+    take care of calling vendor-specific methods of the expression.
 
 .. method:: as_vendorname(qn, connection)
 
-Works like as_sql() method. When an expression is compiled by qn.compile()
-Django will first try to call as_vendorname(), where vendorname is the vendor
-name of the backend used for executing the query. The vendorname is one of
-'postgresql', 'oracle', 'sqlite' or 'mysql' for Django's inbuilt backends.
+    Works like the ``as_sql()`` method. When an expression is compiled by
+    ``qn.compile()``, Django will first try to call ``as_vendorname()``, where
+    vendorname is the vendor name of the backend used for executing the query.
+    The vendorname is one of ``postgresql``, ``oracle``, ``sqlite`` or
+    ``mysql`` for Django's built-in backends.
 
-.. method:: get_lookup(lookup_name)::
+.. method:: get_lookup(lookup_name)
 
-The get_lookup() method is used to fetch lookups. By default the lookup
-is fetched from the expression's output type, but it is possible to override
-this method to alter that behaviour.
+    The ``get_lookup()`` method is used to fetch lookups. By default the lookup
+    is fetched from the expression's output type, but it is possible to
+    override this method to alter that behaviour.
 
 .. attribute:: output_type
 
-The output_type attribute is used by the get_lookup() method to check for
-lookups. The output_type should be a field instance.
+    The ``output_type`` attribute is used by the ``get_lookup()`` method to check for
+    lookups. The ``output_type`` should be a field.
 
 Note that this documentation lists only the public methods of the API.
+
+Lookup reference
+~~~~~~~~~~~~~~~~
+
+.. class:: Lookup
+
+    In addition to the attributes and methods below, lookups also support
+    ``as_sql`` and ``as_vendorname`` from the query expression API.
+
+.. attribute:: lhs
+
+    The ``lhs`` (left-hand side) of a lookup tells us what we are comparing the
+    rhs to. It is an object which implements the query expression API. This is
+    likely to be a field, an aggregate or a subclass of ``Extract``.
+
+.. attribute:: rhs
+
+    The ``rhs`` (right-hand side) of a lookup is the value we are comparing the
+    left-hand side to. It may be a plain value, or something which compiles
+    into SQL, for example an ``F()`` object or a ``QuerySet``.
+
+.. attribute:: lookup_name
+
+    This class level attribute is used when registering lookups. It determines
+    the name used in queries to trigger this lookup. For example, ``contains``
+    or ``exact``. This should not contain the string ``__``.
+
+.. method:: process_lhs(qn, connection)
+
+    This returns a tuple of ``(lhs_string, lhs_params)``. In some cases you may
+    wish to compile ``lhs`` directly in your ``as_sql`` methods using
+    ``qn.compile(self.lhs)``.
+
+.. method:: process_rhs(qn, connection)
+
+    Behaves the same as ``process_lhs`` but acts on the right-hand side.
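
Reviewer note: the vendor dispatch that the patched docs describe (try ``as_<vendor>`` first, then fall back to ``as_sql``) can be tried out without a database. The sketch below is a toy model of that behaviour and is not Django's implementation: ``FakeConnection``, ``FakeCompiler``, ``Column`` and ``render`` are hypothetical stand-ins invented for illustration, and ``NotEqual`` here is a plain class rather than a ``Lookup`` subclass.

```python
# Toy model of the dispatch described in the docs above. NOT Django's code:
# FakeConnection, FakeCompiler, Column and render are hypothetical stand-ins.


class FakeConnection:
    """Carries only the vendor name, like ``connection.vendor``."""

    def __init__(self, vendor):
        self.vendor = vendor


class FakeCompiler:
    """Plays the role of ``qn``: its compile() picks the method to call."""

    def __init__(self, connection):
        self.connection = connection

    def compile(self, node):
        # Prefer as_<vendor>() when the node defines it, else use as_sql().
        vendor_impl = getattr(node, 'as_%s' % self.connection.vendor, None)
        if vendor_impl is not None:
            return vendor_impl(self, self.connection)
        return node.as_sql(self, self.connection)


class Column:
    """Stand-in for a field reference used as the left-hand side."""

    def __init__(self, table, name):
        self.table, self.name = table, name

    def as_sql(self, qn, connection):
        return '"%s"."%s"' % (self.table, self.name), []


class NotEqual:
    lookup_name = 'ne'

    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs

    def process_lhs(self, qn, connection):
        return qn.compile(self.lhs)

    def process_rhs(self, qn, connection):
        # A plain value becomes one placeholder plus one parameter.
        return '%s', [self.rhs]

    def as_sql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        return '%s <> %s' % (lhs, rhs), lhs_params + rhs_params


class MySQLNotEqual(NotEqual):
    def as_mysql(self, qn, connection):
        lhs, lhs_params = self.process_lhs(qn, connection)
        rhs, rhs_params = self.process_rhs(qn, connection)
        return '%s != %s' % (lhs, rhs), lhs_params + rhs_params


def render(lookup, vendor):
    """Compile a lookup as if running against the given vendor."""
    return FakeCompiler(FakeConnection(vendor)).compile(lookup)
```

Calling ``render(MySQLNotEqual(Column('author', 'name'), 'Jack'), 'mysql')`` takes the ``as_mysql`` branch, while any other vendor name falls through to ``as_sql``. The toy ``process_rhs`` only handles plain values; the patched docs note that the real right-hand side may also be something that compiles into SQL.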