From 500fe9c639f37533fb8525d53dacbe986c0e3a9f Mon Sep 17 00:00:00 2001 From: Aymeric Augustin Date: Sun, 19 Aug 2012 16:30:07 +0200 Subject: [PATCH] [py3] Wrote Django-specific porting tips and extended the existing Python 3 documentation. --- docs/topics/python3.txt | 311 +++++++++++++++++++++++++++++++++++----- 1 file changed, 276 insertions(+), 35 deletions(-) diff --git a/docs/topics/python3.txt b/docs/topics/python3.txt index 32463f2e512..589856bd4ed 100644 --- a/docs/topics/python3.txt +++ b/docs/topics/python3.txt @@ -1,25 +1,212 @@ -====================== -Python 3 compatibility -====================== +=================== +Porting to Python 3 +=================== Django 1.5 is the first version of Django to support Python 3. The same code runs both on Python 2 (≥ 2.6.5) and Python 3 (≥ 3.2), thanks to the six_ -compatibility layer and ``unicode_literals``. +compatibility layer. .. _six: http://packages.python.org/six/ -This document is not meant as a Python 2 to Python 3 migration guide. There -are many existing resources, including `Python's official porting guide`_. -Rather, it describes guidelines that apply to Django's code and are -recommended for pluggable apps that run with both Python 2 and 3. +This document is primarily targeted at authors of pluggable application +who want to support both Python 2 and 3. It also describes guidelines that +apply to Django's code. + +Philosophy +========== + +This document assumes that you are familiar with the changes between Python 2 +and Python 3. If you aren't, read `Python's official porting guide`_ first. +Refreshing your knowledge of unicode handling on Python 2 and 3 will help; the +`Pragmatic Unicode`_ presentation is a good resource. + +Django uses the *Python 2/3 Compatible Source* strategy. Of course, you're +free to chose another strategy for your own code, especially if you don't need +to stay compatible with Python 2. But authors of pluggable applications are +encouraged to use the same porting strategy as Django itself. + +Writing compatible code is much easier if you target Python ≥ 2.6. You will +most likely take advantage of the compatibility functions introduced in Django +1.5, like :mod:`django.utils.six`, so your application will also require +Django ≥ 1.5. + +Obviously, writing compatible source code adds some overhead, and that can +cause frustration. Django's developers have found that attempting to write +Python 3 code that's compatible with Python 2 is much more rewarding than the +opposite. Not only does that make your code more future-proof, but Python 3's +advantages (like the saner string handling) start shining quickly. Dealing +with Python 2 becomes a backwards compatibility requirement, and we as +developers are used to dealing with such constraints. + +Porting tools provided by Django are inspired by this philosophy, and it's +reflected throughout this guide. .. _Python's official porting guide: http://docs.python.org/py3k/howto/pyporting.html +.. _Pragmatic Unicode: http://nedbatchelder.com/text/unipain.html + +Porting tips +============ + +Unicode literals +---------------- + +This step consists in: + +- Adding ``from __future__ import unicode_literals`` at the top of your Python + modules -- it's best to put it in each and every module, otherwise you'll + keep checking the top of your files to see which mode is in effect; +- Removing the ``u`` prefix before unicode strings; +- Adding a ``b`` prefix before bytestrings. + +Performing these changes systematically guarantees backwards compatibility. + +However, Django applications generally don't need bytestrings, since Django +only exposes unicode interfaces to the programmer. Python 3 discourages using +bytestrings, except for binary data or byte-oriented interfaces. Python 2 +makes bytestrings and unicode strings effectively interchangeable, as long as +they only contain ASCII data. Take advantage of this to use unicode strings +wherever possible and avoid the ``b`` prefixes. + +.. note:: + + Python 2's ``u`` prefix is a syntax error in Python 3.2 but it will be + allowed again in Python 3.3 thanks to :pep:`414`. Thus, this + transformation is optional if you target Python ≥ 3.3. It's still + recommended, per the "write Python 3 code" philosophy. + +String handling +--------------- + +Python 2's :class:`unicode` type was renamed :class:`str` in Python 3, +:class:`str` was renamed :class:`bytes`, and :class:`basestring` disappeared. +six_ provides :ref:`tools ` to deal with these +changes. + +Django also contains several string related classes and functions in the +:mod:`django.utils.encoding` and :mod:`django.utils.safestring` modules. Their +names used the words ``str``, which doesn't mean the same thing in Python 2 +and Python 3, and ``unicode``, which doesn't exist in Python 3. In order to +avoid ambiguity and confusion these concepts were renamed ``bytes`` and +``text``. + +Here are the name changes in :mod:`django.utils.encoding`: + +================== ================== +Old name New name +================== ================== +``smart_str`` ``smart_bytes`` +``smart_unicode`` ``smart_text`` +``force_unicode`` ``force_text`` +================== ================== + +For backwards compatibility, the old names still work on Python 2. Under +Python 3, ``smart_str`` is an alias for ``smart_text``. + +.. note:: + + :mod:`django.utils.encoding` was deeply refactored in Django 1.5 to + provide a more consistent API. Check its documentation for more + information. + +:mod:`django.utils.safestring` is mostly used via the +:func:`~django.utils.safestring.mark_safe` and +:func:`~django.utils.safestring.mark_for_escaping` functions, which didn't +change. In case you're using the internals, here are the name changes: + +================== ================== +Old name New name +================== ================== +``EscapeString`` ``EscapeBytes`` +``EscapeUnicode`` ``EscapeText`` +``SafeString`` ``SafeBytes`` +``SafeUnicode`` ``SafeText`` +================== ================== + +For backwards compatibility, the old names still work on Python 2. Under +Python 3, ``EscapeString`` and ``SafeString`` are aliases for ``EscapeText`` +and ``SafeText`` respectively. + +:meth:`__str__` and :meth:`__unicode__` methods +----------------------------------------------- + +In Python 2, the object model specifies :meth:`__str__` and +:meth:`__unicode__` methods. If these methods exist, they must return +:class:`str` (bytes) and :class:`unicode` (text) respectively. + +The ``print`` statement and the :func:`str` built-in call :meth:`__str__` to +determine the human-readable representation of an object. The :func:`unicode` +built-in calls :meth:`__unicode__` if it exists, and otherwise falls back to +:meth:`__str__` and decodes the result with the system encoding. Conversely, +the :class:`~django.db.models.Model` base class automatically derives +:meth:`__str__` from :meth:`__unicode__` by encoding to UTF-8. + +In Python 3, there's simply :meth:`__str__`, which must return :class:`str` +(text). + +(It is also possible to define :meth:`__bytes__`, but Django application have +little use for that method, because they hardly ever deal with +:class:`bytes`.) + +Django provides a simple way to define :meth:`__str__` and :meth:`__unicode__` +methods that work on Python 2 and 3: you must define a :meth:`__str__` method +returning text and to apply the +:func:`~django.utils.encoding.python_2_unicode_compatible` decorator. + +On Python 3, the decorator is a no-op. On Python 2, it defines appropriate +:meth:`__unicode__` and :meth:`__str__` methods (replacing the original +:meth:`__str__` method in the process). Here's an example:: + + from __future__ import unicode_literals + from django.utils.encoding import python_2_unicode_compatible + + @python_2_unicode_compatible + class MyClass(object): + def __str__(self): + return "Instance of my class" + +This technique is the best match for Django's porting philosophy. + +Finally, note that :meth:`__repr__` must return a :class:`str` on all versions +of Python. + +:class:`dict` and :class:`dict`-like classes +-------------------------------------------- + +:meth:`dict.keys`, :meth:`dict.items` and :meth:`dict.values` return lists in +Python 2 and iterators in Python 3. :class:`~django.http.QueryDict` and the +:class:`dict`-like classes defined in :mod:`django.utils.datastructures` +behave likewise in Python 3. + +six_ provides compatibility functions to work around this change: +:func:`~six.iterkeys`, :func:`~six.iteritems`, and :func:`~six.itervalues`. +Django's bundled version adds :func:`~django.utils.six.iterlists` for +:class:`~django.utils.datastructures.MultiValueDict` and its subclasses. + +:class:`~django.http.HttpRequest` and :class:`~django.http.HttpResponse` objects +-------------------------------------------------------------------------------- + +According to :pep:`3333`: + +- headers are always :class:`str` objects, +- input and output streams are always :class:`bytes` objects. + +Specifically, :attr:`HttpResponse.content ` +contains :class:`bytes`, which may require refactoring your tests.This won't +be an issue if you use :meth:`~django.test.TestCase.assertContains` and +:meth:`~django.test.TestCase.assertNotContains`: these methods expect a +unicode string. + +Coding guidelines +================= + +The following guidelines are enforced in Django's source code. They're also +recommended for third-party application who follow the same porting strategy. Syntax requirements -=================== +------------------- Unicode -------- +~~~~~~~ In Python 3, all strings are considered Unicode by default. The ``unicode`` type from Python 2 is called ``str`` in Python 3, and ``str`` becomes @@ -36,29 +223,25 @@ In order to enable the same behavior in Python 2, every module must import my_string = "This is an unicode literal" my_bytestring = b"This is a bytestring" -In classes, define ``__str__`` methods returning unicode strings and apply the -:func:`~django.utils.encoding.python_2_unicode_compatible` decorator. It will -define appropriate ``__unicode__`` and ``__str__`` in Python 2:: - - from __future__ import unicode_literals - from django.utils.encoding import python_2_unicode_compatible - - @python_2_unicode_compatible - class MyClass(object): - def __str__(self): - return "Instance of my class" - If you need a byte string literal under Python 2 and a unicode string literal under Python 3, use the :func:`str` builtin:: str('my string') +In Python 3, there aren't any automatic conversions between :class:`str` and +:class:`bytes`, and the :mod:`codecs` module became more strict. +:meth:`str.decode` always returns :class:`bytes`, and :meth:`bytes.decode` +always returns :class:`str`. As a consequence, the following pattern is +sometimes necessary:: + + value = value.encode('ascii', 'ignore').decode('ascii') + Be cautious if you have to `slice bytestrings`_. .. _slice bytestrings: http://docs.python.org/py3k/howto/pyporting.html#bytes-literals Exceptions ----------- +~~~~~~~~~~ When you capture exceptions, use the ``as`` keyword:: @@ -71,17 +254,64 @@ This older syntax was removed in Python 3:: try: ... - except MyException, exc: + except MyException, exc: # Don't do that! ... The syntax to reraise an exception with a different traceback also changed. Use :func:`six.reraise`. +Magic methods +------------- + +Use the patterns below to handle magic methods renamed in Python 3. + +Iterators +~~~~~~~~~ + +:: + + class MyIterator(object): + def __iter__(self): + return self # implement some logic here + + def __next__(self): + raise StopIteration # implement some logic here + + next = __next__ # Python 2 compatibility + +Boolean evaluation +~~~~~~~~~~~~~~~~~~ + +:: + + class MyBoolean(object): + + def __bool__(self): + return True # implement some logic here + + __nonzero__ = __bool__ # Python 2 compatibility + +Division +~~~~~~~~ + +:: + + class MyDivisible(object): + + def __truediv__(self, other): + return self / other # implement some logic here + + __div__ = __truediv__ # Python 2 compatibility + + def __itruediv__(self, other): + return self // other # implement some logic here + + __idiv__ = __itruediv__ # Python 2 compatibility .. module: django.utils.six Writing compatible code with six -================================ +-------------------------------- six_ is the canonical compatibility library for supporting Python 2 and 3 in a single codebase. Read its documentation! @@ -90,8 +320,10 @@ a single codebase. Read its documentation! Here are the most common changes required to write compatible code. -String types ------------- +.. _string-handling-with-six: + +String handling +~~~~~~~~~~~~~~~ The ``basestring`` and ``unicode`` types were removed in Python 3, and the meaning of ``str`` changed. To test these types, use the following idioms:: @@ -104,7 +336,7 @@ Python ≥ 2.6 provides ``bytes`` as an alias for ``str``, so you don't need :attr:`six.binary_type`. ``long`` --------- +~~~~~~~~ The ``long`` type no longer exists in Python 3. ``1L`` is a syntax error. Use :data:`six.integer_types` check if a value is an integer or a long:: @@ -112,21 +344,27 @@ The ``long`` type no longer exists in Python 3. ``1L`` is a syntax error. Use isinstance(myvalue, six.integer_types) # replacement for (int, long) ``xrange`` ----------- +~~~~~~~~~~ Import :func:`six.moves.xrange` wherever you use ``xrange``. Moved modules -------------- +~~~~~~~~~~~~~ Some modules were renamed in Python 3. The :mod:`django.utils.six.moves ` module provides a compatible location to import them. -In addition to six' defaults, Django's version provides ``thread`` as -``_thread`` and ``dummy_thread`` as ``_dummy_thread``. +The ``urllib``, ``urllib2`` and ``urlparse`` modules were reworked in depth +and :mod:`django.utils.six.moves ` doesn't handle them. Django +explicitly tries both locations, as follows:: + + try: + from urllib.parse import urlparse, urlunparse + except ImportError: # Python 2 + from urlparse import urlparse, urlunparse PY3 ---- +~~~ If you need different code in Python 2 and Python 3, check :data:`six.PY3`:: @@ -141,9 +379,9 @@ function. .. module:: django.utils.six Customizations of six -===================== +--------------------- -The version of six bundled with Django includes a few additional tools: +The version of six bundled with Django includes one extra function: .. function:: iterlists(MultiValueDict) @@ -152,3 +390,6 @@ The version of six bundled with Django includes a few additional tools: :meth:`~django.utils.datastructures.MultiValueDict.iterlists()` on Python 2 and :meth:`~django.utils.datastructures.MultiValueDict.lists()` on Python 3. + +In addition to six' defaults moves, Django's version provides ``thread`` as +``_thread`` and ``dummy_thread`` as ``_dummy_thread``.