diff --git a/docs/custom_model_fields.txt b/docs/custom_model_fields.txt new file mode 100644 index 0000000000..c12d1844cd --- /dev/null +++ b/docs/custom_model_fields.txt @@ -0,0 +1,567 @@ +=================== +Custom Model Fields +=================== + +**New in Django development version** + +Introduction +============ + +The `model reference`_ documentation explains how to use Django's standard +field classes. For many purposes, those classes are all you'll need. Sometimes, +though, the Django version won't meet your precise requirements, or you'll want +to use a field that is entirely different from those shipped with Django. + +Django's built-in field types don't cover every possible database column type -- +only the common types, such as ``VARCHAR`` and ``INTEGER``. For more obscure +column types, such as geographic polygons or even user-created types such as +`PostgreSQL custom types`_, you can define your own Django ``Field`` subclasses. + +Alternatively, you may have a complex Python object that can somehow be +serialized to fit into a standard database column type. This is another case +where a ``Field`` subclass will help you use your object with your models. + +Our example object +------------------ + +Creating custom fields requires a bit of attention to detail. To make things +easier to follow, we'll use a consistent example throughout this document. +Suppose you have a Python object representing the deal of cards in a hand of +Bridge_. It doesn't matter if you don't know how to play Bridge. You only need +to know that 52 cards are dealt out equally to four players, who are +traditionally called *north*, *east*, *south* and *west*. Our class looks +something like this:: + + class Hand(object): + def __init__(self, north, east, south, west): + # Input parameters are lists of cards ('Ah', '9s', etc) + self.north = north + self.east = east + self.south = south + self.west = west + + # ... (other possibly useful methods omitted) ... 
+ +This is just an ordinary Python class, nothing Django-specific about it. We +would like to be able to do things like this in our models (we assume the +``hand`` attribute on the model is an instance of ``Hand``):: + + + example = MyModel.objects.get(pk=1) + print example.hand.north + + new_hand = Hand(north, east, south, west) + example.hand = new_hand + example.save() + +We assign to and retrieve from the ``hand`` attribute in our model just like +any other Python object. The trick is to tell Django how to handle saving and +loading such an object. + +In order to use the ``Hand`` class in our models, we **do not** have to change +this class at all. This is ideal, because it means you can easily write +model support for existing classes where you cannot change the source code. + +.. note:: + You might only want to take advantage of custom database column + types and deal with the data as standard Python types in your models; + strings, or floats, for example. This case is similar to our ``Hand`` + example and we'll note any differences as we go along. + +.. _model reference: ../model-api/ +.. _PostgreSQL custom types: http://www.postgresql.org/docs/8.2/interactive/sql-createtype.html +.. _Bridge: http://en.wikipedia.org/wiki/Contract_bridge + +Background Theory +================= + +Database storage +---------------- + +The simplest way to think of a model field is that it provides a way to take a +normal Python object -- string, boolean, ``datetime``, or something more +complex like ``Hand`` -- and convert it to and from a format that is useful +when dealing with the database (and serialization, but, as we'll see later, +that falls out fairly naturally once you have the database side under control). + +Fields in a model must somehow be converted to fit into an existing database +column type. Different databases provide different sets of valid column types, +but the rule is still the same: those are the only types you have to work +with. 
Anything you want to store in the database must fit into one of +those types. + +Normally, you're either writing a Django field to match a particular database +column type, or there's a fairly straightforward way to convert your data to, +say, a string. + +For our ``Hand`` example, we could convert the card data to a string of 104 +characters by concatenating all the cards together in a pre-determined order. +Say, all the *north* cards first, then the *east*, *south* and *west* cards, in +that order. So ``Hand`` objects can be saved to text or character columns in +the database. + +What does a field class do? +--------------------------- + +All of Django's fields (and when we say *fields* in this document, we always +mean model fields and not `form fields`_) are subclasses of +``django.db.models.Field``. Most of the information that Django records about a +field is common to all fields -- name, help text, validator lists, uniqueness +and so forth. Storing all that information is handled by ``Field``. We'll get +into the precise details of what ``Field`` can do later on; for now, suffice it +to say that everything descends from ``Field`` and then customises key pieces +of the class behaviour. + +.. _form fields: ../newforms/#fields + +It's important to realise that a Django field class is not what is stored in +your model attributes. The model attributes contain normal Python objects. The +field classes you define in a model are actually stored in the ``Meta`` class +when the model class is created (the precise details of how this is done are +unimportant here). This is because the field classes aren't necessary when +you're just creating and modifying attributes. Instead, they provide the +machinery for converting between the attribute value and what is stored in the +database or sent to the serializer. + +Keep this in mind when creating your own custom fields. 
The Django ``Field`` +subclass you write provides the machinery for converting between your Python +instances and the database/serializer values in various ways (there are +differences between storing a value and using a value for lookups, for +example). If this sounds a bit tricky, don't worry. It will hopefully become +clearer in the examples below. Just remember that you will often end up +creating two classes when you want a custom field. The first class is the +Python object that your users will manipulate. They will assign it to the model +attribute, they will read from it for displaying purposes, things like that. +This is the ``Hand`` class in our example. The second class is the ``Field`` +subclass. This is the class that knows how to convert your first class back and +forth between its permanent storage form and the Python form. + +Writing a ``Field`` subclass +============================= + +When you are planning your ``Field`` subclass, first give some thought to +which existing field your new field is most similar to. Can you subclass an +existing Django field and save yourself some work? If not, you should subclass the ``Field`` class, from which everything is descended. + +Initialising your new field is a matter of separating out any arguments that +are specific to your case from the common arguments and passing the latter to +the ``__init__()`` method of ``Field`` (or your parent class). + +In our example, the Django field we create is going to be called +``HandField``. It's not a bad idea to use a similar naming scheme to Django's +fields so that our new class is identifiable and yet clearly related to the +``Hand`` class it is wrapping. 
It doesn't behave like any existing field, so +we'll subclass directly from ``Field``:: + + from django.db import models + + class HandField(models.Field): + def __init__(self, *args, **kwargs): + kwargs['max_length'] = 104 + super(HandField, self).__init__(*args, **kwargs) + +Our ``HandField`` will accept most of the standard field options (see the list +below), but we ensure it has a fixed length, since it only needs to hold 52 +card values plus their suits; 104 characters in total. + +.. note:: + Many of Django's model fields accept options that they don't do anything + with. For example, you can pass both ``editable`` and ``auto_now`` to a + ``DateField`` and it will simply ignore the ``editable`` parameter + (``auto_now`` being set implies ``editable=False``). No error is raised in + this case. + + This behaviour simplifies the field classes, because they don't need to + check for options that aren't necessary. They just pass all the options to + the parent class and then don't use them later on. It is up to you whether + you want your fields to be more strict about the options they select, or + to use the simpler, more permissive behaviour of the current fields. + +The ``Field.__init__()`` method takes the following parameters, in this +order: + + - ``verbose_name`` + - ``name`` + - ``primary_key`` + - ``max_length`` + - ``unique`` + - ``blank`` + - ``null`` + - ``db_index`` + - ``core`` + - ``rel``: Used for related fields (like ``ForeignKey``). For advanced use + only. + - ``default`` + - ``editable`` + - ``serialize``: If ``False``, the field will not be serialized when the + model is passed to Django's serializers_. Defaults to ``True``. + - ``prepopulate_from`` + - ``unique_for_date`` + - ``unique_for_month`` + - ``unique_for_year`` + - ``validator_list`` + - ``choices`` + - ``radio_admin`` + - ``help_text`` + - ``db_column`` + - ``db_tablespace``: Currently only used with the Oracle backend and only + for index creation. 
You can usually ignore this option. + +All of the options without an explanation in the above list have the same +meaning they do for normal Django fields. See the `model documentation`_ for +examples and details. + +.. _serializers: ../serialization/ +.. _model documentation: ../model-api/ + +The ``SubfieldBase`` metaclass +------------------------------ + +As we indicated in the introduction_, field subclasses are often needed for +two reasons: either to take advantage of a custom database column type, or to +handle complex Python types. A combination of the two is obviously also +possible. If you are only working with custom database column types and your +model fields appear in Python as standard Python types direct from the +database backend, you don't need to worry about this section. + +If you are handling custom Python types, such as our ``Hand`` class, we need +to make sure that when Django initialises an instance of our model and assigns +a database value to our custom field attribute we convert that value into the +appropriate Python object. The details of how this happens internally are a +little complex. For the field writer, though, things are fairly simple. Make +sure your field subclass uses ``django.db.models.SubfieldBase`` as its +metaclass. This ensures that the ``to_python()`` method, documented below_, +will always be called when the attribute is initialised. + +Our ``HandField`` class now looks like this:: + + class HandField(models.Field): + __metaclass__ = models.SubfieldBase + + def __init__(self, *args, **kwargs): + # ... + +.. _below: #to-python-self-value + +Useful methods +-------------- + +Once you've created your ``Field`` subclass and set up the +``__metaclass__``, if necessary, there are a few standard methods you need to +consider overriding. Which of these you need to implement will depend on your +particular field behaviour. The list below is in approximately decreasing +order of importance, so start from the top. 
+ +``db_type(self)`` +~~~~~~~~~~~~~~~~~ + +Returns the database column data type for the ``Field``, taking into account +the current ``DATABASE_ENGINE`` setting. + +Say you've created a PostgreSQL custom type called ``mytype``. You can use this +field with Django by subclassing ``Field`` and implementing the ``db_type()`` +method, like so:: + + from django.db import models + + class MytypeField(models.Field): + def db_type(self): + return 'mytype' + +Once you have ``MytypeField``, you can use it in any model, just like any other +``Field`` type:: + + class Person(models.Model): + name = models.CharField(max_length=80) + gender = models.CharField(max_length=1) + something_else = MytypeField() + +If you aim to build a database-agnostic application, you should account for +differences in database column types. For example, the date/time column type +in PostgreSQL is called ``timestamp``, while the same column in MySQL is called +``datetime``. The simplest way to handle this in a ``db_type()`` method is to +import the Django settings module and check the ``DATABASE_ENGINE`` setting. +For example:: + + class MyDateField(models.Field): + def db_type(self): + from django.conf import settings + if settings.DATABASE_ENGINE == 'mysql': + return 'datetime' + else: + return 'timestamp' + +The ``db_type()`` method is only called by Django when the framework constructs +the ``CREATE TABLE`` statements for your application -- that is, when you first +create your tables. It's not called at any other time, so it can afford to +execute slightly complex code, such as the ``DATABASE_ENGINE`` check in the +above example. + +Some database column types accept parameters, such as ``CHAR(25)``, where the +parameter ``25`` represents the maximum column length. In cases like these, +it's more flexible if the parameter is specified in the model rather than being +hard-coded in the ``db_type()`` method. 
For example, it wouldn't make much +sense to have a ``CharMaxlength25Field``, shown here:: + + # This is a silly example of hard-coded parameters. + class CharMaxlength25Field(models.Field): + def db_type(self): + return 'char(25)' + + # In the model: + class MyModel(models.Model): + # ... + my_field = CharMaxlength25Field() + +The better way of doing this would be to make the parameter specifiable at run +time -- i.e., when the class is instantiated. To do that, just implement +``__init__()``, like so:: + + # This is a much more flexible example. + class BetterCharField(models.Field): + def __init__(self, max_length, *args, **kwargs): + self.max_length = max_length + super(BetterCharField, self).__init__(*args, **kwargs) + + def db_type(self): + return 'char(%s)' % self.max_length + + # In the model: + class MyModel(models.Model): + # ... + my_field = BetterCharField(25) + +Finally, if your column requires truly complex SQL setup, return ``None`` from +``db_type()``. This will cause Django's SQL creation code to skip over this +field. You are then responsible for creating the column in the right table in +some other way, of course, but this gives you a way to tell Django to get out +of the way. + + +``to_python(self, value)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Converts between all the ways your field can receive its initial value and the +Python object you want to end up with. The default version just returns +``value``, so it is useful if the database backend returns the data already in +the correct form (a Python string, for example). + +Normally, you will need to override this method. As a general rule, be +prepared to accept an instance of the right type (e.g. ``Hand`` in our ongoing +example), a string (from a deserializer, for example), and whatever the +database wrapper returns for the column type you are using. 
+ +In our ``HandField`` class, we are storing the data in a character field in +the database, so we need to be able to process strings and ``Hand`` instances +in ``to_python()``:: + + import re + + class HandField(models.Field): + # ... + + def to_python(self, value): + if isinstance(value, Hand): + return value + + # The string case: split the 104 characters into four 26-character + # hands, then split each hand into two-character cards. + p1 = re.compile('.{26}') + p2 = re.compile('..') + args = [p2.findall(x) for x in p1.findall(value)] + return Hand(*args) + +Notice that we always return a ``Hand`` instance from this method. That is the +Python object we want to store in the model's attribute. + +``get_db_prep_save(self, value)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This is the reverse of ``to_python()`` when working with the database backends +(as opposed to serialization). The ``value`` parameter is the current value of +the model's attribute (a field has no reference to its containing model, so it +cannot retrieve the value itself) and the method should return data in a +format that can be used as a parameter in a query for the database backend. + +For example:: + + class HandField(models.Field): + # ... + + def get_db_prep_save(self, value): + # ``value`` is the Hand instance; the field itself holds no cards. + return ''.join([''.join(l) for l in (value.north, + value.east, value.south, value.west)]) + + +``pre_save(self, model_instance, add)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +This method is called just prior to ``get_db_prep_save()`` and should return +the value of the appropriate attribute from ``model_instance`` for this field. +The attribute name is in ``self.attname`` (this is set up by ``Field``). If +the model is being saved to the database for the first time, the ``add`` +parameter will be ``True``, otherwise it will be ``False``. + +Often you won't need to override this method. However, at times it can be very +useful. For example, the Django ``DateTimeField`` uses this method to set the +attribute to the correct value before returning it in the cases when +``auto_now`` or ``auto_now_add`` are set on the field. 
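As a rough illustration of the ``pre_save()`` contract described above, here is a minimal sketch that runs without Django. ``StampField`` and ``Record`` are hypothetical stand-ins invented for this example (they are not Django classes), and the code is written for a current Python interpreter rather than the one used elsewhere in this document. It only shows the shape of the method: pick a value based on ``add``, store it on the instance under ``self.attname``, and return it.

```python
import datetime


class StampField:
    """Hypothetical stand-in for a Field subclass; not Django's own class."""

    def __init__(self, attname, auto_now=False, auto_now_add=False):
        # In Django, Field sets up attname for you; here we pass it in.
        self.attname = attname
        self.auto_now = auto_now
        self.auto_now_add = auto_now_add

    def pre_save(self, model_instance, add):
        if self.auto_now or (self.auto_now_add and add):
            value = datetime.datetime.now()
            # Update the instance so other code sees the value being saved.
            setattr(model_instance, self.attname, value)
            return value
        # Nothing to change: return the attribute's current value.
        return getattr(model_instance, self.attname)


class Record:
    """Plain object standing in for a model instance."""
    modified = None


field = StampField("modified", auto_now=True)
record = Record()
saved_value = field.pre_save(record, add=True)
```

After the call, ``record.modified`` and ``saved_value`` refer to the same ``datetime`` object, which is the property the real method is expected to preserve.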
+ +If you do override this method, you must return the value of the attribute at +the end. You should also update the model's attribute if you make any changes +to the value so that code holding references to the model will always see the +correct value. + +``get_db_prep_lookup(self, lookup_type, value)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Prepares the ``value`` for passing to the database when used in a lookup (a +``WHERE`` constraint in SQL). The ``lookup_type`` will be one of the valid +Django filter lookups: ``exact``, ``iexact``, ``contains``, ``icontains``, +``gt``, ``gte``, ``lt``, ``lte``, ``in``, ``startswith``, ``istartswith``, +``endswith``, ``iendswith``, ``range``, ``year``, ``month``, ``day``, +``isnull``, ``search``, ``regex``, and ``iregex``. + +Your method must be prepared to handle all of these ``lookup_type`` values and +should raise either a ``ValueError`` if the ``value`` is of the wrong sort (a +list when you were expecting an object, for example) or a ``TypeError`` if +your field does not support that type of lookup. For many fields, you can get +by with handling the lookup types that need special handling for your field +and pass the rest to the ``get_db_prep_lookup()`` method of the parent class. + +If you needed to implement ``get_db_prep_save()``, you will usually need to +implement ``get_db_prep_lookup()``. The usual reason is the +``range`` and ``in`` lookups. In these cases, you will be passed a list of +objects (presumably of the right type) and will need to convert them to a list +of things of the right type for passing to the database. Sometimes you can +reuse ``get_db_prep_save()``, or at least factor out some common pieces from +both methods into a helper function. + +For example:: + + class HandField(models.Field): + # ... + + def get_db_prep_lookup(self, lookup_type, value): + # We only handle 'exact' and 'in'. All others are errors. 
+ if lookup_type == 'exact': + return self.get_db_prep_save(value) + elif lookup_type == 'in': + return [self.get_db_prep_save(v) for v in value] + else: + raise TypeError('Lookup type %r not supported.' % lookup_type) + + +``formfield(self, form_class=forms.CharField, **kwargs)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Returns the default form field to use when this field is displayed +in a model. This method is called by the `helper functions`_ +``form_for_model()`` and ``form_for_instance()``. + +All of the ``kwargs`` dictionary is passed directly to the form field's +``__init__()`` method. Normally, all you need to do is set up a good default +for the ``form_class`` argument and then delegate further handling to the +parent class. This might require you to write a custom form field (and even a +form widget). See the `forms documentation`_ for information about this. Also +have a look at ``django.contrib.localflavor`` for some examples of custom +widgets. + +Continuing our ongoing example, we can write the ``formfield()`` method as:: + + class HandField(models.Field): + # ... + + def formfield(self, **kwargs): + # This is a fairly standard way to set up some defaults + # whilst letting the caller override them. + defaults = {'form_class': MyFormField} + defaults.update(kwargs) + return super(HandField, self).formfield(**defaults) + +This assumes we have some ``MyFormField`` field class (which has its own +default widget) imported. This document doesn't cover the details of writing +custom form fields. + +.. _helper functions: ../newforms/#generating-forms-for-models +.. _forms documentation: ../newforms/ + +``get_internal_type(self)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Returns a string giving the name of the ``Field`` subclass we are emulating at +the database level. This is used to determine the type of database column for +simple cases. 
+ +If you have created a ``db_type()`` method, you do not need to worry about +``get_internal_type()`` -- it won't be used much. Sometimes, though, your +database storage is similar in type to some other field, so you can use that +other field's logic to create the right column. + +For example:: + + class HandField(models.Field): + # ... + + def get_internal_type(self): + return 'CharField' + +No matter which database backend we are using, this will mean that ``syncdb`` +and other SQL commands create the right column type for storing a string. + +If ``get_internal_type()`` returns a string that is not known to Django for +the database backend you are using -- that is, it doesn't appear in +``django.db.backends..creation.DATA_TYPES`` -- the string will still +be used by the serializer, but the default ``db_type()`` method will return +``None``. See the documentation of ``db_type()`` above_ for reasons why this +might be useful. Putting a descriptive string in as the type of the field for +the serializer is a useful idea if you are ever going to be using the +serializer output in some other place, outside of Django. + +.. _above: #db-type-self + +``flatten_data(self, follow, obj=None)`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +.. admonition:: Subject to change + + Although implementing this method is necessary to allow field + serialization, the API might change in the future. + +Returns a dictionary, mapping the field's attribute name to a flattened string +version of the data. This method has some internal uses that aren't of +interest to us here (mostly having to do with manipulators). For our +purposes, it is sufficient to return a one-item dictionary that maps the +attribute name to a string. + +This method is used by the serializers to convert the field into a string for +output. You can ignore the input parameters for serialization purposes, +although calling ``Field._get_val_from_obj(obj)`` is the best way to get the +value to serialize. 
+ +For example, since our ``HandField`` uses strings for its data storage anyway, +we can reuse some existing conversion code:: + + class HandField(models.Field): + # ... + + def flatten_data(self, follow, obj=None): + value = self._get_val_from_obj(obj) + return {self.attname: self.get_db_prep_save(value)} + +Some general advice +-------------------- + +Writing a custom field can be a tricky process sometimes, particularly if you +are doing complex conversions between your Python types and your database and +serialization formats. A couple of tips to make things go more smoothly: + + 1. Look at the existing Django fields (in + ``django/db/models/fields/__init__.py``) for inspiration. Try to find a field + that is already close to what you want and extend it a little bit, in + preference to creating an entirely new field from scratch. + + 2. Put a ``__str__()`` or ``__unicode__()`` method on the class you are + wrapping up as a field. There are a lot of places where the default behaviour + of the field code is to call ``force_unicode()`` on the value (in our + examples in this document, ``value`` would be a ``Hand`` instance, not a + ``HandField``). So if your ``__unicode__()`` method automatically converts to + the string form of your Python object, you can save yourself a lot of work. + diff --git a/docs/model-api.txt b/docs/model-api.txt index b49963d8f5..ca84c84d09 100644 --- a/docs/model-api.txt +++ b/docs/model-api.txt @@ -1013,111 +1013,12 @@ See the `One-to-one relationship model example`_ for a full example. Custom field types ------------------ -**New in Django development version** +If one of the existing model fields cannot be used to fit your purposes, or if +you wish to take advantage of some less common database column types, you can +create your own field class. Full coverage of creating your own fields is +provided in the `Custom Model Fields`_ documentation. 
-Django's built-in field types don't cover every possible database column type -- -only the common types, such as ``VARCHAR`` and ``INTEGER``. For more obscure -column types, such as geographic polygons or even user-created types such as -`PostgreSQL custom types`_, you can define your own Django ``Field`` subclasses. - -.. _PostgreSQL custom types: http://www.postgresql.org/docs/8.2/interactive/sql-createtype.html - -.. admonition:: Experimental territory - - This is an area of Django that traditionally has not been documented, but - we're starting to include bits of documentation, one feature at a time. - Please forgive the sparseness of this section. - - If you like living on the edge and are comfortable with the risk of - unstable, undocumented APIs, see the code for the core ``Field`` class - in ``django/db/models/fields/__init__.py`` -- but if/when the innards - change, don't say we didn't warn you. - -To create a custom field type, simply subclass ``django.db.models.Field``. -Here is an incomplete list of the methods you should implement: - -``db_type()`` -~~~~~~~~~~~~~ - -Returns the database column data type for the ``Field``, taking into account -the current ``DATABASE_ENGINE`` setting. - -Say you've created a PostgreSQL custom type called ``mytype``. You can use this -field with Django by subclassing ``Field`` and implementing the ``db_type()`` -method, like so:: - - from django.db import models - - class MytypeField(models.Field): - def db_type(self): - return 'mytype' - -Once you have ``MytypeField``, you can use it in any model, just like any other -``Field`` type:: - - class Person(models.Model): - name = models.CharField(max_length=80) - gender = models.CharField(max_length=1) - something_else = MytypeField() - -If you aim to build a database-agnostic application, you should account for -differences in database column types. 
For example, the date/time column type -in PostgreSQL is called ``timestamp``, while the same column in MySQL is called -``datetime``. The simplest way to handle this in a ``db_type()`` method is to -import the Django settings module and check the ``DATABASE_ENGINE`` setting. -For example:: - - class MyDateField(models.Field): - def db_type(self): - from django.conf import settings - if settings.DATABASE_ENGINE == 'mysql': - return 'datetime' - else: - return 'timestamp' - -The ``db_type()`` method is only called by Django when the framework constructs -the ``CREATE TABLE`` statements for your application -- that is, when you first -create your tables. It's not called at any other time, so it can afford to -execute slightly complex code, such as the ``DATABASE_ENGINE`` check in the -above example. - -Some database column types accept parameters, such as ``CHAR(25)``, where the -parameter ``25`` represents the maximum column length. In cases like these, -it's more flexible if the parameter is specified in the model rather than being -hard-coded in the ``db_type()`` method. For example, it wouldn't make much -sense to have a ``CharMaxlength25Field``, shown here:: - - # This is a silly example of hard-coded parameters. - class CharMaxlength25Field(models.Field): - def db_type(self): - return 'char(25)' - - # In the model: - class MyModel(models.Model): - # ... - my_field = CharMaxlength25Field() - -The better way of doing this would be to make the parameter specifiable at run -time -- i.e., when the class is instantiated. To do that, just implement -``__init__()``, like so:: - - # This is a much more flexible example. - class BetterCharField(models.Field): - def __init__(self, max_length, *args, **kwargs): - self.max_length = max_length - super(BetterCharField, self).__init__(*args, **kwargs) - - def db_type(self): - return 'char(%s)' % self.max_length - - # In the model: - class MyModel(models.Model): - # ... 
- my_field = BetterCharField(25) - -Note that if you implement ``__init__()`` on a ``Field`` subclass, it's -important to call ``Field.__init__()`` -- i.e., the parent class' -``__init__()`` method. +.. _Custom Model Fields: ../custom_model_fields/ Meta options ============