diff --git a/docs/ref/unicode.txt b/docs/ref/unicode.txt
index d03f388111..17c5edfad6 100644
--- a/docs/ref/unicode.txt
+++ b/docs/ref/unicode.txt
@@ -150,19 +150,16 @@ Web frameworks have to deal with URLs (which are a type of IRI_). One
 requirement of URLs is that they are encoded using only ASCII characters.
 However, in an international environment, you might need to construct a
 URL from an IRI_ -- very loosely speaking, a URI_ that can contain Unicode
-characters. Quoting and converting an IRI to URI can be a little tricky, so
-Django provides some assistance.
+characters. Use these functions for quoting and converting an IRI to a URI:
 
-* The function :func:`django.utils.encoding.iri_to_uri()` implements the
-  conversion from IRI to URI as required by the specification (:rfc:`3987#section-3.1`).
+* The :func:`django.utils.encoding.iri_to_uri()` function, which implements the
+  conversion from IRI to URI as required by :rfc:`3987#section-3.1`.
 
-* The functions ``django.utils.http.urlquote()`` and
-  ``django.utils.http.urlquote_plus()`` are versions of Python's standard
-  ``urllib.quote()`` and ``urllib.quote_plus()`` that work with non-ASCII
-  characters. (The data is converted to UTF-8 prior to encoding.)
+* The :func:`urllib.parse.quote` and :func:`urllib.parse.quote_plus`
+  functions from Python's standard library.
 
 These two groups of functions have slightly different purposes, and it's
-important to keep them straight. Normally, you would use ``urlquote()`` on the
+important to keep them straight. Normally, you would use ``quote()`` on the
 individual portions of the IRI or URI path so that any reserved characters
 such as '&' or '%' are correctly encoded. Then, you apply ``iri_to_uri()`` to
 the full IRI and it converts any non-ASCII characters to the correct encoded
@@ -181,13 +178,15 @@ like that.
 
 An example might clarify things here::
 
-    >>> urlquote('Paris & Orléans')
+    >>> from urllib.parse import quote
+    >>> from django.utils.encoding import iri_to_uri
+    >>> quote('Paris & Orléans')
     'Paris%20%26%20Orl%C3%A9ans'
-    >>> iri_to_uri('/favorites/François/%s' % urlquote('Paris & Orléans'))
+    >>> iri_to_uri('/favorites/François/%s' % quote('Paris & Orléans'))
     '/favorites/Fran%C3%A7ois/Paris%20%26%20Orl%C3%A9ans'
 
 If you look carefully, you can see that the portion that was generated by
-``urlquote()`` in the second example was not double-quoted when passed to
+``quote()`` in the second example was not double-quoted when passed to
 ``iri_to_uri()``. This is a very important and useful feature. It means that
 you can construct your IRI without worrying about whether it contains
 non-ASCII characters and then, right at the end, call ``iri_to_uri()`` on the
@@ -198,6 +197,7 @@ implements the conversion from URI to IRI as per :rfc:`3987#section-3.2`.
 
 An example to demonstrate::
 
+    >>> from django.utils.encoding import uri_to_iri
     >>> uri_to_iri('/%E2%99%A5%E2%99%A5/?utf8=%E2%9C%93')
     '/♥♥/?utf8=✓'
     >>> uri_to_iri('%A9hello%3Fworld')
@@ -240,14 +240,14 @@ handles this for you automatically.
 
 If you're constructing a URL manually (i.e., *not* using the ``reverse()``
 function), you'll need to take care of the encoding yourself. In this case,
-use the ``iri_to_uri()`` and ``urlquote()`` functions that were documented
+use the ``iri_to_uri()`` and ``quote()`` functions that were documented
 above_. For example::
 
+    from urllib.parse import quote
     from django.utils.encoding import iri_to_uri
-    from django.utils.http import urlquote
 
     def get_absolute_url(self):
-        url = '/person/%s/?x=0&y=0' % urlquote(self.location)
+        url = '/person/%s/?x=0&y=0' % quote(self.location)
         return iri_to_uri(url)
 
 This function returns a correctly encoded URL even if ``self.location`` is