Edited templates.txt and templates_python.txt auto-escaping changes from [6671]

git-svn-id: http://code.djangoproject.com/svn/django/trunk@6798 bcc190cf-cafb-0310-a4f2-bffc1f526a37
This commit is contained in:
Adrian Holovaty 2007-12-01 18:38:12 +00:00
parent cf21274b1a
commit 4a3084f3fc
2 changed files with 246 additions and 183 deletions

View File

@ -310,58 +310,104 @@ Automatic HTML escaping
**New in Django development version**
A very real problem when creating HTML (and other) output using templates and
variable substitution is the possibility of accidently inserting some variable
value that affects the resulting HTML. For example, a template fragment such as
::
When generating HTML from templates, there's always a risk that a variable will
include characters that affect the resulting HTML. For example, consider this
template fragment::
Hello, {{ name }}.
seems like a harmless way to display the user's name. However, if you are
displaying data that the user entered directly and they had entered their name as ::
At first, this seems like a harmless way to display a user's name, but consider
what would happen if the user entered his name as this::
<script>alert('hello')</script>
this would always display a Javascript alert box when the page was loaded.
Similarly, if you were displaying some data generated by another process and it
contained a '<' symbol, you couldn't just dump this straight into your HTML,
because it would be treated as the start of an element. The effects of these
sorts of problems can vary from merely annoying to allowing exploits via `Cross
Site Scripting`_ (XSS) attacks.
With this name value, the template would be rendered as::
.. _Cross Site Scripting: http://en.wikipedia.org/wiki/Cross-site_scripting
Hello, <script>alert('hello')</script>
In order to provide some protection against these problems, Django
provides automatic (but controllable) HTML escaping for data coming from
tempate variables. Inside this tag, any data that comes from template
variables is examined to see if it contains one of the five HTML characters
(<, >, ', " and &) that often need escaping and those characters are converted
to their respective HTML entities. It causes no harm if a character is
converted to an entity when it doesn't need to be, so all five characters are
always converted.
...which means the browser would pop-up a JavaScript alert box!
Since some variables will contain data that is *intended* to be rendered
as HTML, template tag and filter writers can mark their output strings as
requiring no further escaping. For example, the ``unordered_list`` filter is
designed to return raw HTML and we want the template processor to simply
display the results as returned, without applying any escaping. That is taken
care of by the filter. The template author need do nothing special in that
case.
Similarly, what if the name contained a ``'<'`` symbol, like this?
By default, automatic HTML escaping is always applied. However, sometimes you
will not want this to occur (for example, if you're using the templating
system to create an email). To control automatic escaping inside your template,
wrap the affected content in the ``autoescape`` tag, like so::
<b>username
That would result in a rendered template like this::
Hello, <b>username
...which, in turn, would result in the remainder of the Web page being bolded!
Clearly, user-submitted data shouldn't be trusted blindly and inserted directly
into your Web pages, because a malicious user could use this kind of hole to
do potentially bad things. This type of security exploit is called a
Cross Site Scripting`_ (XSS) attack.
To avoid this problem, you have two options:
* One, you can make sure to run each untrusted variable through the
``escape`` filter (documented below), which converts potentially harmful
HTML characters to unharmful ones. This was default the default solution
in Django for its first few years, but the problem is that it puts the
onus on *you*, the developer / template author, to ensure you're escaping
everything. It's easy to forget to escape data.
* Two, you can take advantage of Django's automatic HTML escaping. The
remainder of this section describes how auto-escaping works.
By default in the Django development version, every template automatically
escapes the output of every variable tag. Specifically, these five characters
are escaped:
* ``<`` is converted to ``&lt;``
* ``>`` is converted to ``&gt;``
* ``'`` (single quote) is converted to ``&#39;``
* ``"`` (double quote) is converted to ``&quot;``
* ``&`` is converted to ``&amp;``
Again, we stress that this behavior is on by default. If you're using Django's
template system, you're protected.
How to turn it off
------------------
If you don't want data to be auto-escaped, on a per-site, per-template level or
per-variable level, you can turn it off in several ways.
Why would you want to turn it off? Because sometimes, template variables
contain data that you *intend* to be rendered as raw HTML, in which case you
don't want their contents to be escaped. For example, you might store a blob of
HTML in your database and want to embed that directly into your template. Or,
you might be using Django's template system to produce text that is *not* HTML
-- like an e-mail message, for instance.
For individual variables
~~~~~~~~~~~~~~~~~~~~~~~~
To disable auto-escaping for an individual variable, use the ``safe`` filter::
This will be escaped: {{ data }}
This will not be escaped: {{ data|safe }}
Think of *safe* as shorthand for *safe from further escaping* or *can be
safely interpreted as HTML*. In this example, if ``data`` contains ``'<b>'``,
the output will be::
This will be escaped: &lt;b&gt;
This will not be escaped: <b>
For template blocks
~~~~~~~~~~~~~~~~~~~
To control auto-escaping for a template, wrap the template (or just a
particular section of the template) in the ``autoescape`` tag, like so::
{% autoescape off %}
Hello {{ name }}
{% endautoescape %}
The auto-escaping tag passes its effect onto templates that extend the
current one as well as templates included via the ``include`` tag, just like
all block tags.
The ``autoescape`` tag takes either ``on`` or ``off`` as its argument. At times, you might want to force auto-escaping when it would otherwise be disabled. For example::
The ``autoescape`` tag takes either ``on`` or ``off`` as its argument. At
times, you might want to force auto-escaping when it would otherwise be
disabled. Here is an example template::
Auto-escaping is on by default. Hello {{ name }}
@ -370,52 +416,60 @@ The ``autoescape`` tag takes either ``on`` or ``off`` as its argument. At times,
Nor this: {{ other_data }}
{% autoescape on %}
Auto-escaping applies again, {{ name }}
Auto-escaping applies again: {{ name }}
{% endautoescape %}
{% endautoescape %}
For individual variables, the ``safe`` filter can also be used to indicate
that the contents should not be automatically escaped::
The auto-escaping tag passes its effect onto templates that extend the
current one as well as templates included via the ``include`` tag, just like
all block tags. For example::
This will be escaped: {{ data }}
This will not be escaped: {{ data|safe }}
# base.html
Think of *safe* as shorthand for *safe from further escaping* or *can be
safely interpreted as HTML*. In this example, if ``data`` contains ``'<a>'``,
the output will be::
{% autoescape off %}
<h1>{% block title %}</h1>
{% block content %}
{% endautoescape %}
This will be escaped: &lt;a&gt;
This will not be escaped: <a>
Generally, you won't need to worry about auto-escaping very much. View
developers and custom filter authors need to think about when their data
shouldn't be escaped and mark it appropriately. They are in a better position
to know when that should happen than the template author, so it is their
responsibility. By default, all output is escaped unless the template
processor is explicitly told otherwise.
# child.html
You should also note that if you are trying to write a template that might be
used in situations where automatic escaping is enabled or disabled and you
don't know which (such as when your template is included in other templates),
you can safely write as if you were in an ``{% autoescape off %}`` situation.
Scatter ``escape`` filters around for any variables that need escaping. When
auto-escaping is on, these extra filters won't change the output -- any
variables that use the ``escape`` filter do not have further automatic
escaping applied to them.
{% extends "base.html" %}
{% block title %}This & that{% endblock %}
{% block content %}<b>Hello!</b>{% endblock %}
Because auto-escaping is turned off in the base template, it will also be
turned off in the child template, resulting in the following rendered HTML::
<h1>This & that</h1>
<b>Hello!</b>
Notes
-----
Generally, template authors don't need to worry about auto-escaping very much.
Developers on the Python side (people writing views and custom filters) need to
think about the cases in which data shouldn't be escaped, and mark data
appropriately, so things Just Work in the template.
If you're creating a template that might be used in situations where you're
not sure whether auto-escaping is enabled, then add an ``escape`` filter to any
variable that needs escaping. When auto-escaping is on, there's no danger of
the ``escape`` filter *double-escaping* data -- the ``escape`` filter does not
affect auto-escaped variables.
String literals and automatic escaping
--------------------------------------
Sometimes you will pass a string literal as an argument to a filter. For
example::
As we mentioned earlier, filter arguments can be strings::
{{ data|default:"This is a string literal." }}
All string literals are inserted **without** any automatic escaping into the
template, if they are used (it's as if they were all passed through the
``safe`` filter). The reasoning behind this is that the template author is in
control of what goes into the string literal, so they can make sure the text
is correctly escaped when the template is written.
template -- they act as if they were all passed through the ``safe`` filter.
The reasoning behind this is that the template author is in control of what
goes into the string literal, so they can make sure the text is correctly
escaped when the template is written.
This means you would write ::
@ -426,7 +480,7 @@ This means you would write ::
{{ data|default:"3 > 2" }} <-- Bad! Don't do this.
This doesn't affect what happens to data coming from the variable itself.
The variable's contents are still automatically escaped, if necessary, since
The variable's contents are still automatically escaped, if necessary, because
they're beyond the control of the template author.
Using the built-in reference
@ -1230,11 +1284,11 @@ once, after all other filters).
Escapes a string's HTML. Specifically, it makes these replacements:
* ``"&"`` to ``"&amp;"``
* ``<`` to ``"&lt;"``
* ``>`` to ``"&gt;"``
* ``'"'`` (double quote) to ``'&quot;'``
* ``"'"`` (single quote) to ``'&#39;'``
* ``<`` is converted to ``&lt;``
* ``>`` is converted to ``&gt;``
* ``'`` (single quote) is converted to ``&#39;``
* ``"`` (double quote) is converted to ``&quot;``
* ``&`` is converted to ``&amp;``
The escaping is only applied when the string is output, so it does not matter
where in a chained sequence of filters you put ``escape``: it will always be

View File

@ -727,134 +727,144 @@ Filters and auto-escaping
**New in Django development version**
When you are writing a custom filter, you need to give some thought to how
this filter will interact with Django's auto-escaping behaviour. Firstly, you
should realise that there are three types of strings that can be passed around
inside the template code:
When writing a custom filter, give some thought to how the filter will interact
with Django's auto-escaping behavior. Note that three types of strings can be
passed around inside the template code:
* raw strings are the native Python ``str`` or ``unicode`` types. On
output, they are escaped if auto-escaping is in effect and presented
unchanged, otherwise.
* **Raw strings** are the native Python ``str`` or ``unicode`` types. On
output, they're escaped if auto-escaping is in effect and presented
unchanged, otherwise.
* "safe" strings are strings that are safe from further escaping at output
time. Any necessary escaping has already been done. They are commonly used
for output that contains raw HTML that is intended to be intrepreted on the
client side.
* **Safe strings** are strings that have been marked safe from further
escaping at output time. Any necessary escaping has already been done.
They're commonly used for output that contains raw HTML that is intended
to be interpreted as-is on the client side.
Internally, these strings are of type ``SafeString`` or ``SafeUnicode``,
although they share a common base class in ``SafeData``, so you can test
for them using code like::
Internally, these strings are of type ``SafeString`` or ``SafeUnicode``.
They share a common base class of ``SafeData``, so you can test
for them using code like::
if isinstance(value, SafeData):
# Do something with the "safe" string.
if isinstance(value, SafeData):
# Do something with the "safe" string.
* strings which are marked as "needing escaping" are *always* escaped on
output, regardless of whether they are in an ``autoescape`` block or not.
These strings are only escaped once, however, even if auto-escaping
applies. This type of string is internally represented by the types
``EscapeString`` and ``EscapeUnicode``. You will not normally need to worry
about these; they exist for the implementation of the ``escape`` filter.
* **Strings marked as "needing escaping"** are *always* escaped on
output, regardless of whether they are in an ``autoescape`` block or not.
These strings are only escaped once, however, even if auto-escaping
applies.
When you are writing a filter, your code will typically fall into one of two
situations:
Internally, these strings are of type ``EscapeString`` or
``EscapeUnicode``. Generally you don't have to worry about these; they
exist for the implementation of the ``escape`` filter.
1. Your filter does not introduce any HTML-unsafe characters (``<``, ``>``,
``'``, ``"`` or ``&``) into the result that were not already present. In
this case, you can let Django take care of all the auto-escaping handling
for you. All you need to do is put the ``is_safe`` attribute on your
filter function and set it to ``True``. This attribute tells Django that
is a "safe" string is passed into your filter, the result will still be
"safe" and if a non-safe string is passed in, Django will automatically
escape it, if necessary. The reason ``is_safe`` is necessary is because
there are plenty of normal string operations that will turn a ``SafeData``
object back into a normal ``str`` or ``unicode`` object and, rather than
try to catch them all, which would be very difficult, Django repairs the
damage after the filter has completed.
Template filter code falls into one of two situations:
For example, suppose you have a filter that adds the string ``xx`` to the
end of any input. Since this introduces no dangerous HTML characters into
the result (aside from any that were already present), you should mark
your filter with ``is_safe``::
1. Your filter does not introduce any HTML-unsafe characters (``<``, ``>``,
``'``, ``"`` or ``&``) into the result that were not already present. In
this case, you can let Django take care of all the auto-escaping
handling for you. All you need to do is put the ``is_safe`` attribute on
your filter function and set it to ``True``, like so::
@register.filter
def myfilter(value):
return value
myfilter.is_safe = True
This attribute tells Django that if a "safe" string is passed into your
filter, the result will still be "safe" and if a non-safe string is
passed in, Django will automatically escape it, if necessary.
You can think of this as meaning "this filter is safe -- it doesn't
introduce any possibility of unsafe HTML."
The reason ``is_safe`` is necessary is because there are plenty of
normal string operations that will turn a ``SafeData`` object back into
a normal ``str`` or ``unicode`` object and, rather than try to catch
them all, which would be very difficult, Django repairs the damage after
the filter has completed.
For example, suppose you have a filter that adds the string ``xx`` to the
end of any input. Since this introduces no dangerous HTML characters to
the result (aside from any that were already present), you should mark
your filter with ``is_safe``::
@register.filter
def add_xx(value):
return '%sxx' % value
add_xx.is_safe = True
When this filter is used in a template where auto-escaping is enabled,
Django will escape the output whenever the input is not already marked as
"safe".
When this filter is used in a template where auto-escaping is enabled,
Django will escape the output whenever the input is not already marked as
"safe".
By default, ``is_safe`` defaults to ``False`` and you can omit it from
any filters where it isn't required.
By default, ``is_safe`` defaults to ``False``, and you can omit it from
any filters where it isn't required.
Be careful when deciding if your filter really does leave safe strings
as safe. Sometimes if you are *removing* characters, you can
inadvertently leave unbalanced HTML tags or entities in the result.
For example, removing a ``>`` from the input might turn ``<a>`` into
``<a``, which would need to be escaped on output to avoid causing
problems. Similarly, removing a semicolon (``;``) can turn ``&amp;``
into ``&amp``, which is no longer a valid entity and thus needs
further escaping. Most cases won't be nearly this tricky, but keep an
eye out for any problems like that when reviewing your code.
Be careful when deciding if your filter really does leave safe strings
as safe. If you're *removing* characters, you might inadvertently leave
unbalanced HTML tags or entities in the result. For example, removing a
``>`` from the input might turn ``<a>`` into ``<a``, which would need to
be escaped on output to avoid causing problems. Similarly, removing a
semicolon (``;``) can turn ``&amp;`` into ``&amp``, which is no longer a
valid entity and thus needs further escaping. Most cases won't be nearly
this tricky, but keep an eye out for any problems like that when
reviewing your code.
2. Alternatively, your filter code can manually take care of any necessary
escaping. This is usually necessary when you are introducing new HTML
markup into the result. You want to mark the output as safe from further
escaping so that your HTML markup isn't escaped further, so you'll need to
handle the input yourself.
2. Alternatively, your filter code can manually take care of any necessary
escaping. This is necessary when you're introducing new HTML markup into
the result. You want to mark the output as safe from further
escaping so that your HTML markup isn't escaped further, so you'll need
to handle the input yourself.
To mark the output as a safe string, use
``django.utils.safestring.mark_safe()``.
To mark the output as a safe string, use ``django.utils.safestring.mark_safe()``.
Be careful, though. You need to do more than just mark the output as
safe. You need to ensure it really *is* safe and what you do will often
depend upon whether or not auto-escaping is in effect. The idea is to
write filters than can operate in templates where auto-escaping is either
on or off in order to make things easier for your template authors.
Be careful, though. You need to do more than just mark the output as
safe. You need to ensure it really *is* safe, and what you do depends on
whether auto-escaping is in effect. The idea is to write filters than
can operate in templates where auto-escaping is either on or off in
order to make things easier for your template authors.
In order for you filter to know the current auto-escaping state, set the
``needs_autoescape`` attribute to ``True`` on your function (if you don't
specify this attribute, it defaults to ``False``). This attribute tells
Django that your filter function wants to be passed an extra keyword
argument, called ``autoescape`` that is ``True`` is auto-escaping is in
effect and ``False`` otherwise.
In order for you filter to know the current auto-escaping state, set the
``needs_autoescape`` attribute to ``True`` on your function. (If you
don't specify this attribute, it defaults to ``False``). This attribute
tells Django that your filter function wants to be passed an extra
keyword argument, called ``autoescape``, that is ``True`` is
auto-escaping is in effect and ``False`` otherwise.
An example might make this clearer. Let's write a filter that emphasizes
the first character of a string::
For example, let's write a filter that emphasizes the first character of
a string::
from django.utils.html import conditional_escape
from django.utils.safestring import mark_safe
from django.utils.html import conditional_escape
from django.utils.safestring import mark_safe
def initial_letter_filter(text, autoescape=None):
first, other = text[0] ,text[1:]
if autoescape:
esc = conditional_escape
else:
esc = lambda x: x
result = '<strong>%s</strong>%s' % (esc(first), esc(other))
return mark_safe(result)
initial_letter_filter.needs_autoescape = True
def initial_letter_filter(text, autoescape=None):
first, other = text[0], text[1:]
if autoescape:
esc = conditional_escape
else:
esc = lambda x: x
result = '<strong>%s</strong>%s' % (esc(first), esc(other))
return mark_safe(result)
initial_letter_filter.needs_autoescape = True
The ``needs_autoescape`` attribute on the filter function and the
``autoescape`` keyword argument mean that our function will know whether
or not automatic escaping is in effect when the filter is called. We use
``autoescape`` to decide whether the input data needs to be passed through
``django.utils.html.conditional_escape`` or not (in the latter case, we
just use the identity function as the "escape" function). The
``conditional_escape()`` function is like ``escape()`` except it only
escapes input that is **not** a ``SafeData`` instance. If a ``SafeData``
instance is passed to ``conditional_escape()``, the data is returned
unchanged.
The ``needs_autoescape`` attribute on the filter function and the
``autoescape`` keyword argument mean that our function will know whether
automatic escaping is in effect when the filter is called. We use
``autoescape`` to decide whether the input data needs to be passed through
``django.utils.html.conditional_escape`` or not. (In the latter case, we
just use the identity function as the "escape" function.) The
``conditional_escape()`` function is like ``escape()`` except it only
escapes input that is **not** a ``SafeData`` instance. If a ``SafeData``
instance is passed to ``conditional_escape()``, the data is returned
unchanged.
Finally, in the above example, we remember to mark the result as safe
so that our HTML is inserted directly into the template without further
escaping.
Finally, in the above example, we remember to mark the result as safe
so that our HTML is inserted directly into the template without further
escaping.
There is no need to worry about the ``is_safe`` attribute in this case
(although including it wouldn't hurt anything). Whenever you are manually
handling the auto-escaping issues and returning a safe string, the
``is_safe`` attribute won't change anything either way.
There's no need to worry about the ``is_safe`` attribute in this case
(although including it wouldn't hurt anything). Whenever you manually
handle the auto-escaping issues and return a safe string, the
``is_safe`` attribute won't change anything either way.
Writing custom template tags
----------------------------
@ -981,7 +991,7 @@ Auto-escaping considerations
The output from template tags is **not** automatically run through the
auto-escaping filters. However, there are still a couple of things you should
keep in mind when writing a template tag:
keep in mind when writing a template tag.
If the ``render()`` function of your template stores the result in a context
variable (rather than returning the result in a string), it should take care
@ -991,18 +1001,17 @@ time, so content that should be safe from further escaping needs to be marked
as such.
Also, if your template tag creates a new context for performing some
sub-rendering, you should be careful to set the auto-escape attribute to the
current context's value. The ``__init__`` method for the ``Context`` class
takes a parameter called ``autoescape`` that you can use for this purpose. For
example::
sub-rendering, set the auto-escape attribute to the current context's value.
The ``__init__`` method for the ``Context`` class takes a parameter called
``autoescape`` that you can use for this purpose. For example::
def render(self, context):
# ...
new_context = Context({'var': obj}, autoescape=context.autoescape)
# ... Do something with new_context ...
This is not a very common situation, but it is sometimes useful, particularly
if you are rendering a template yourself. For example::
This is not a very common situation, but it's useful if you're rendering a
template yourself. For example::
def render(self, context):
t = template.load_template('small_fragment.html')
@ -1010,7 +1019,7 @@ if you are rendering a template yourself. For example::
If we had neglected to pass in the current ``context.autoescape`` value to our
new ``Context`` in this example, the results would have *always* been
automatically escaped, which may not be the desired behaviour if the template
automatically escaped, which may not be the desired behavior if the template
tag is used inside a ``{% autoescape off %}`` block.
Registering the tag