Edited templates.txt and templates_python.txt auto-escaping changes from [6671]

git-svn-id: http://code.djangoproject.com/svn/django/trunk@6798 bcc190cf-cafb-0310-a4f2-bffc1f526a37
Adrian Holovaty 2007-12-01 18:38:12 +00:00
parent cf21274b1a
commit 4a3084f3fc
2 changed files with 246 additions and 183 deletions


@ -310,58 +310,104 @@ Automatic HTML escaping
**New in Django development version** **New in Django development version**
A very real problem when creating HTML (and other) output using templates and When generating HTML from templates, there's always a risk that a variable will
variable substitution is the possibility of accidently inserting some variable include characters that affect the resulting HTML. For example, consider this
value that affects the resulting HTML. For example, a template fragment such as template fragment::
::
Hello, {{ name }}. Hello, {{ name }}.
seems like a harmless way to display the user's name. However, if you are At first, this seems like a harmless way to display a user's name, but consider
displaying data that the user entered directly and they had entered their name as :: what would happen if the user entered his name as this::
<script>alert('hello')</script> <script>alert('hello')</script>
this would always display a Javascript alert box when the page was loaded. With this name value, the template would be rendered as::
Similarly, if you were displaying some data generated by another process and it
contained a '<' symbol, you couldn't just dump this straight into your HTML,
because it would be treated as the start of an element. The effects of these
sorts of problems can vary from merely annoying to allowing exploits via `Cross
Site Scripting`_ (XSS) attacks.
.. _Cross Site Scripting: http://en.wikipedia.org/wiki/Cross-site_scripting Hello, <script>alert('hello')</script>
In order to provide some protection against these problems, Django ...which means the browser would pop-up a JavaScript alert box!
provides automatic (but controllable) HTML escaping for data coming from
tempate variables. Inside this tag, any data that comes from template
variables is examined to see if it contains one of the five HTML characters
(<, >, ', " and &) that often need escaping and those characters are converted
to their respective HTML entities. It causes no harm if a character is
converted to an entity when it doesn't need to be, so all five characters are
always converted.
Since some variables will contain data that is *intended* to be rendered Similarly, what if the name contained a ``'<'`` symbol, like this?
as HTML, template tag and filter writers can mark their output strings as
requiring no further escaping. For example, the ``unordered_list`` filter is
designed to return raw HTML and we want the template processor to simply
display the results as returned, without applying any escaping. That is taken
care of by the filter. The template author need do nothing special in that
case.
By default, automatic HTML escaping is always applied. However, sometimes you <b>username
will not want this to occur (for example, if you're using the templating
system to create an email). To control automatic escaping inside your template, That would result in a rendered template like this::
wrap the affected content in the ``autoescape`` tag, like so::
Hello, <b>username
...which, in turn, would result in the remainder of the Web page being bolded!
Clearly, user-submitted data shouldn't be trusted blindly and inserted directly
into your Web pages, because a malicious user could use this kind of hole to
do potentially bad things. This type of security exploit is called a
`Cross Site Scripting`_ (XSS) attack.
To avoid this problem, you have two options:
* One, you can make sure to run each untrusted variable through the
``escape`` filter (documented below), which converts potentially harmful
HTML characters to unharmful ones. This was the default solution
in Django for its first few years, but the problem is that it puts the
onus on *you*, the developer / template author, to ensure you're escaping
everything. It's easy to forget to escape data.
* Two, you can take advantage of Django's automatic HTML escaping. The
remainder of this section describes how auto-escaping works.
By default in the Django development version, every template automatically
escapes the output of every variable tag. Specifically, these five characters
are escaped:
* ``<`` is converted to ``&lt;``
* ``>`` is converted to ``&gt;``
* ``'`` (single quote) is converted to ``&#39;``
* ``"`` (double quote) is converted to ``&quot;``
* ``&`` is converted to ``&amp;``
Again, we stress that this behavior is on by default. If you're using Django's
template system, you're protected.
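As a rough illustration of the five replacements listed above, here is a standard-library-only Python sketch. The function name ``escape_html`` is hypothetical and this is not Django's own implementation; the only assumption beyond the list is that ``&`` is replaced first, so that the ampersands in entities produced by the later replacements are not escaped a second time.

```python
def escape_html(text):
    """Replace the five HTML-sensitive characters with entities.

    Hypothetical helper for illustration only; not Django's own code.
    """
    # '&' goes first; otherwise the '&' in '&lt;' and friends would be
    # escaped again by a later replacement.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


# The troublesome name from the example above, rendered harmlessly.
rendered = "Hello, %s." % escape_html("<script>alert('hello')</script>")
print(rendered)
```

With escaping applied, the browser displays the script text instead of executing it.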
How to turn it off
------------------
If you don't want data to be auto-escaped, on a per-site, per-template level or
per-variable level, you can turn it off in several ways.
Why would you want to turn it off? Because sometimes, template variables
contain data that you *intend* to be rendered as raw HTML, in which case you
don't want their contents to be escaped. For example, you might store a blob of
HTML in your database and want to embed that directly into your template. Or,
you might be using Django's template system to produce text that is *not* HTML
-- like an e-mail message, for instance.
For individual variables
~~~~~~~~~~~~~~~~~~~~~~~~
To disable auto-escaping for an individual variable, use the ``safe`` filter::
This will be escaped: {{ data }}
This will not be escaped: {{ data|safe }}
Think of *safe* as shorthand for *safe from further escaping* or *can be
safely interpreted as HTML*. In this example, if ``data`` contains ``'<b>'``,
the output will be::
This will be escaped: &lt;b&gt;
This will not be escaped: <b>
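One way to picture what ``safe`` does is as a marker attached to the string, which the output step checks before escaping. The sketch below uses only the standard library; ``SafeText``, ``escape_html`` and ``render_variable`` are hypothetical names chosen for illustration and are not Django's internals.

```python
class SafeText(str):
    """Hypothetical marker type: a string that needs no further escaping."""


def escape_html(text):
    # Hypothetical helper; '&' is replaced first to avoid double escaping.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


def render_variable(value, autoescape=True):
    """Escape on output unless the value carries the 'safe' marker."""
    if autoescape and not isinstance(value, SafeText):
        return escape_html(str(value))
    return str(value)


data = "<b>"
print("This will be escaped:", render_variable(data))
print("This will not be escaped:", render_variable(SafeText(data)))
```

The marker changes nothing about the string's contents; it only tells the output step to leave the value alone.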
For template blocks
~~~~~~~~~~~~~~~~~~~
To control auto-escaping for a template, wrap the template (or just a
particular section of the template) in the ``autoescape`` tag, like so::
{% autoescape off %} {% autoescape off %}
Hello {{ name }} Hello {{ name }}
{% endautoescape %} {% endautoescape %}
The auto-escaping tag passes its effect onto templates that extend the The ``autoescape`` tag takes either ``on`` or ``off`` as its argument. At
current one as well as templates included via the ``include`` tag, just like times, you might want to force auto-escaping when it would otherwise be
all block tags. disabled. Here is an example template::
The ``autoescape`` tag takes either ``on`` or ``off`` as its argument. At times, you might want to force auto-escaping when it would otherwise be disabled. For example::
Auto-escaping is on by default. Hello {{ name }} Auto-escaping is on by default. Hello {{ name }}
@ -370,52 +416,60 @@ The ``autoescape`` tag takes either ``on`` or ``off`` as its argument. At times,
Nor this: {{ other_data }} Nor this: {{ other_data }}
{% autoescape on %} {% autoescape on %}
Auto-escaping applies again, {{ name }} Auto-escaping applies again: {{ name }}
{% endautoescape %} {% endautoescape %}
{% endautoescape %} {% endautoescape %}
For individual variables, the ``safe`` filter can also be used to indicate The auto-escaping tag passes its effect onto templates that extend the
that the contents should not be automatically escaped:: current one as well as templates included via the ``include`` tag, just like
all block tags. For example::
This will be escaped: {{ data }} # base.html
This will not be escaped: {{ data|safe }}
Think of *safe* as shorthand for *safe from further escaping* or *can be {% autoescape off %}
safely interpreted as HTML*. In this example, if ``data`` contains ``'<a>'``, <h1>{% block title %}</h1>
the output will be:: {% block content %}
{% endautoescape %}
This will be escaped: &lt;a&gt;
This will not be escaped: <a>
Generally, you won't need to worry about auto-escaping very much. View # child.html
developers and custom filter authors need to think about when their data
shouldn't be escaped and mark it appropriately. They are in a better position
to know when that should happen than the template author, so it is their
responsibility. By default, all output is escaped unless the template
processor is explicitly told otherwise.
You should also note that if you are trying to write a template that might be {% extends "base.html" %}
used in situations where automatic escaping is enabled or disabled and you {% block title %}This & that{% endblock %}
don't know which (such as when your template is included in other templates), {% block content %}<b>Hello!</b>{% endblock %}
you can safely write as if you were in an ``{% autoescape off %}`` situation.
Scatter ``escape`` filters around for any variables that need escaping. When Because auto-escaping is turned off in the base template, it will also be
auto-escaping is on, these extra filters won't change the output -- any turned off in the child template, resulting in the following rendered HTML::
variables that use the ``escape`` filter do not have further automatic
escaping applied to them. <h1>This & that</h1>
<b>Hello!</b>
Notes
-----
Generally, template authors don't need to worry about auto-escaping very much.
Developers on the Python side (people writing views and custom filters) need to
think about the cases in which data shouldn't be escaped, and mark data
appropriately, so things Just Work in the template.
If you're creating a template that might be used in situations where you're
not sure whether auto-escaping is enabled, then add an ``escape`` filter to any
variable that needs escaping. When auto-escaping is on, there's no danger of
the ``escape`` filter *double-escaping* data -- the ``escape`` filter does not
affect auto-escaped variables.
String literals and automatic escaping String literals and automatic escaping
-------------------------------------- --------------------------------------
Sometimes you will pass a string literal as an argument to a filter. For As we mentioned earlier, filter arguments can be strings::
example::
{{ data|default:"This is a string literal." }} {{ data|default:"This is a string literal." }}
All string literals are inserted **without** any automatic escaping into the All string literals are inserted **without** any automatic escaping into the
template, if they are used (it's as if they were all passed through the template -- they act as if they were all passed through the ``safe`` filter.
``safe`` filter). The reasoning behind this is that the template author is in The reasoning behind this is that the template author is in control of what
control of what goes into the string literal, so they can make sure the text goes into the string literal, so they can make sure the text is correctly
is correctly escaped when the template is written. escaped when the template is written.
This means you would write :: This means you would write ::
@ -426,7 +480,7 @@ This means you would write ::
{{ data|default:"3 > 2" }} <-- Bad! Don't do this. {{ data|default:"3 > 2" }} <-- Bad! Don't do this.
This doesn't affect what happens to data coming from the variable itself. This doesn't affect what happens to data coming from the variable itself.
The variable's contents are still automatically escaped, if necessary, since The variable's contents are still automatically escaped, if necessary, because
they're beyond the control of the template author. they're beyond the control of the template author.
Using the built-in reference Using the built-in reference
@ -1230,11 +1284,11 @@ once, after all other filters).
Escapes a string's HTML. Specifically, it makes these replacements: Escapes a string's HTML. Specifically, it makes these replacements:
* ``"&"`` to ``"&amp;"`` * ``<`` is converted to ``&lt;``
* ``<`` to ``"&lt;"`` * ``>`` is converted to ``&gt;``
* ``>`` to ``"&gt;"`` * ``'`` (single quote) is converted to ``&#39;``
* ``'"'`` (double quote) to ``'&quot;'`` * ``"`` (double quote) is converted to ``&quot;``
* ``"'"`` (single quote) to ``'&#39;'`` * ``&`` is converted to ``&amp;``
The escaping is only applied when the string is output, so it does not matter The escaping is only applied when the string is output, so it does not matter
where in a chained sequence of filters you put ``escape``: it will always be where in a chained sequence of filters you put ``escape``: it will always be


@ -727,134 +727,144 @@ Filters and auto-escaping
**New in Django development version** **New in Django development version**
When you are writing a custom filter, you need to give some thought to how When writing a custom filter, give some thought to how the filter will interact
this filter will interact with Django's auto-escaping behaviour. Firstly, you with Django's auto-escaping behavior. Note that three types of strings can be
should realise that there are three types of strings that can be passed around passed around inside the template code:
inside the template code:
* raw strings are the native Python ``str`` or ``unicode`` types. On * **Raw strings** are the native Python ``str`` or ``unicode`` types. On
output, they are escaped if auto-escaping is in effect and presented output, they're escaped if auto-escaping is in effect and presented
unchanged, otherwise. unchanged, otherwise.
* "safe" strings are strings that are safe from further escaping at output * **Safe strings** are strings that have been marked safe from further
time. Any necessary escaping has already been done. They are commonly used escaping at output time. Any necessary escaping has already been done.
for output that contains raw HTML that is intended to be intrepreted on the They're commonly used for output that contains raw HTML that is intended
client side. to be interpreted as-is on the client side.
Internally, these strings are of type ``SafeString`` or ``SafeUnicode``, Internally, these strings are of type ``SafeString`` or ``SafeUnicode``.
although they share a common base class in ``SafeData``, so you can test They share a common base class of ``SafeData``, so you can test
for them using code like:: for them using code like::
if isinstance(value, SafeData): if isinstance(value, SafeData):
# Do something with the "safe" string. # Do something with the "safe" string.
* strings which are marked as "needing escaping" are *always* escaped on * **Strings marked as "needing escaping"** are *always* escaped on
output, regardless of whether they are in an ``autoescape`` block or not. output, regardless of whether they are in an ``autoescape`` block or not.
These strings are only escaped once, however, even if auto-escaping These strings are only escaped once, however, even if auto-escaping
applies. This type of string is internally represented by the types applies.
``EscapeString`` and ``EscapeUnicode``. You will not normally need to worry
about these; they exist for the implementation of the ``escape`` filter.
When you are writing a filter, your code will typically fall into one of two Internally, these strings are of type ``EscapeString`` or
situations: ``EscapeUnicode``. Generally you don't have to worry about these; they
exist for the implementation of the ``escape`` filter.
1. Your filter does not introduce any HTML-unsafe characters (``<``, ``>``, Template filter code falls into one of two situations:
``'``, ``"`` or ``&``) into the result that were not already present. In
this case, you can let Django take care of all the auto-escaping handling
for you. All you need to do is put the ``is_safe`` attribute on your
filter function and set it to ``True``. This attribute tells Django that
is a "safe" string is passed into your filter, the result will still be
"safe" and if a non-safe string is passed in, Django will automatically
escape it, if necessary. The reason ``is_safe`` is necessary is because
there are plenty of normal string operations that will turn a ``SafeData``
object back into a normal ``str`` or ``unicode`` object and, rather than
try to catch them all, which would be very difficult, Django repairs the
damage after the filter has completed.
For example, suppose you have a filter that adds the string ``xx`` to the 1. Your filter does not introduce any HTML-unsafe characters (``<``, ``>``,
end of any input. Since this introduces no dangerous HTML characters into ``'``, ``"`` or ``&``) into the result that were not already present. In
the result (aside from any that were already present), you should mark this case, you can let Django take care of all the auto-escaping
your filter with ``is_safe``:: handling for you. All you need to do is put the ``is_safe`` attribute on
your filter function and set it to ``True``, like so::
@register.filter
def myfilter(value):
return value
myfilter.is_safe = True
This attribute tells Django that if a "safe" string is passed into your
filter, the result will still be "safe" and if a non-safe string is
passed in, Django will automatically escape it, if necessary.
You can think of this as meaning "this filter is safe -- it doesn't
introduce any possibility of unsafe HTML."
The reason ``is_safe`` is necessary is because there are plenty of
normal string operations that will turn a ``SafeData`` object back into
a normal ``str`` or ``unicode`` object and, rather than try to catch
them all, which would be very difficult, Django repairs the damage after
the filter has completed.
For example, suppose you have a filter that adds the string ``xx`` to the
end of any input. Since this introduces no dangerous HTML characters to
the result (aside from any that were already present), you should mark
your filter with ``is_safe``::
@register.filter @register.filter
def add_xx(value): def add_xx(value):
return '%sxx' % value return '%sxx' % value
add_xx.is_safe = True add_xx.is_safe = True
When this filter is used in a template where auto-escaping is enabled, When this filter is used in a template where auto-escaping is enabled,
Django will escape the output whenever the input is not already marked as Django will escape the output whenever the input is not already marked as
"safe". "safe".
By default, ``is_safe`` defaults to ``False`` and you can omit it from By default, ``is_safe`` defaults to ``False``, and you can omit it from
any filters where it isn't required. any filters where it isn't required.
Be careful when deciding if your filter really does leave safe strings Be careful when deciding if your filter really does leave safe strings
as safe. Sometimes if you are *removing* characters, you can as safe. If you're *removing* characters, you might inadvertently leave
inadvertently leave unbalanced HTML tags or entities in the result. unbalanced HTML tags or entities in the result. For example, removing a
For example, removing a ``>`` from the input might turn ``<a>`` into ``>`` from the input might turn ``<a>`` into ``<a``, which would need to
``<a``, which would need to be escaped on output to avoid causing be escaped on output to avoid causing problems. Similarly, removing a
problems. Similarly, removing a semicolon (``;``) can turn ``&amp;`` semicolon (``;``) can turn ``&amp;`` into ``&amp``, which is no longer a
into ``&amp``, which is no longer a valid entity and thus needs valid entity and thus needs further escaping. Most cases won't be nearly
further escaping. Most cases won't be nearly this tricky, but keep an this tricky, but keep an eye out for any problems like that when
eye out for any problems like that when reviewing your code. reviewing your code.
2. Alternatively, your filter code can manually take care of any necessary 2. Alternatively, your filter code can manually take care of any necessary
escaping. This is usually necessary when you are introducing new HTML escaping. This is necessary when you're introducing new HTML markup into
markup into the result. You want to mark the output as safe from further the result. You want to mark the output as safe from further
escaping so that your HTML markup isn't escaped further, so you'll need to escaping so that your HTML markup isn't escaped further, so you'll need
handle the input yourself. to handle the input yourself.
To mark the output as a safe string, use To mark the output as a safe string, use ``django.utils.safestring.mark_safe()``.
``django.utils.safestring.mark_safe()``.
Be careful, though. You need to do more than just mark the output as Be careful, though. You need to do more than just mark the output as
safe. You need to ensure it really *is* safe and what you do will often safe. You need to ensure it really *is* safe, and what you do depends on
depend upon whether or not auto-escaping is in effect. The idea is to whether auto-escaping is in effect. The idea is to write filters that
write filters that can operate in templates where auto-escaping is either can operate in templates where auto-escaping is either on or off in
on or off in order to make things easier for your template authors. order to make things easier for your template authors.
In order for your filter to know the current auto-escaping state, set the In order for your filter to know the current auto-escaping state, set the
``needs_autoescape`` attribute to ``True`` on your function (if you don't ``needs_autoescape`` attribute to ``True`` on your function. (If you
specify this attribute, it defaults to ``False``). This attribute tells don't specify this attribute, it defaults to ``False``). This attribute
Django that your filter function wants to be passed an extra keyword tells Django that your filter function wants to be passed an extra
argument, called ``autoescape`` that is ``True`` if auto-escaping is in keyword argument, called ``autoescape``, that is ``True`` if
effect and ``False`` otherwise. auto-escaping is in effect and ``False`` otherwise.
An example might make this clearer. Let's write a filter that emphasizes For example, let's write a filter that emphasizes the first character of
the first character of a string:: a string::
from django.utils.html import conditional_escape from django.utils.html import conditional_escape
from django.utils.safestring import mark_safe from django.utils.safestring import mark_safe
def initial_letter_filter(text, autoescape=None): def initial_letter_filter(text, autoescape=None):
first, other = text[0] ,text[1:] first, other = text[0], text[1:]
if autoescape: if autoescape:
esc = conditional_escape esc = conditional_escape
else: else:
esc = lambda x: x esc = lambda x: x
result = '<strong>%s</strong>%s' % (esc(first), esc(other)) result = '<strong>%s</strong>%s' % (esc(first), esc(other))
return mark_safe(result) return mark_safe(result)
initial_letter_filter.needs_autoescape = True initial_letter_filter.needs_autoescape = True
The ``needs_autoescape`` attribute on the filter function and the The ``needs_autoescape`` attribute on the filter function and the
``autoescape`` keyword argument mean that our function will know whether ``autoescape`` keyword argument mean that our function will know whether
or not automatic escaping is in effect when the filter is called. We use automatic escaping is in effect when the filter is called. We use
``autoescape`` to decide whether the input data needs to be passed through ``autoescape`` to decide whether the input data needs to be passed through
``django.utils.html.conditional_escape`` or not (in the latter case, we ``django.utils.html.conditional_escape`` or not. (In the latter case, we
just use the identity function as the "escape" function). The just use the identity function as the "escape" function.) The
``conditional_escape()`` function is like ``escape()`` except it only ``conditional_escape()`` function is like ``escape()`` except it only
escapes input that is **not** a ``SafeData`` instance. If a ``SafeData`` escapes input that is **not** a ``SafeData`` instance. If a ``SafeData``
instance is passed to ``conditional_escape()``, the data is returned instance is passed to ``conditional_escape()``, the data is returned
unchanged. unchanged.
Finally, in the above example, we remember to mark the result as safe Finally, in the above example, we remember to mark the result as safe
so that our HTML is inserted directly into the template without further so that our HTML is inserted directly into the template without further
escaping. escaping.
There is no need to worry about the ``is_safe`` attribute in this case There's no need to worry about the ``is_safe`` attribute in this case
(although including it wouldn't hurt anything). Whenever you are manually (although including it wouldn't hurt anything). Whenever you manually
handling the auto-escaping issues and returning a safe string, the handle the auto-escaping issues and return a safe string, the
``is_safe`` attribute won't change anything either way. ``is_safe`` attribute won't change anything either way.
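The two situations above can be approximated without Django. In the sketch below, ``SafeText``, ``escape_html`` and ``conditional_escape_sketch`` are hypothetical stand-ins for ``SafeData``, ``escape()`` and ``conditional_escape()``; they assume only the behavior described in the text and are not the real implementations.

```python
class SafeText(str):
    """Hypothetical stand-in for a string marked safe from escaping."""


def escape_html(text):
    # Hypothetical helper; '&' is replaced first to avoid double escaping.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


def conditional_escape_sketch(value):
    """Escape the value unless it is already marked safe."""
    if isinstance(value, SafeText):
        return value
    return escape_html(value)


def initial_letter(text, autoescape=None):
    """Emphasize the first character, escaping the input when asked to."""
    first, other = text[0], text[1:]
    esc = conditional_escape_sketch if autoescape else (lambda x: x)
    # The markup added here is ours, so the result is marked safe.
    return SafeText("<strong>%s</strong>%s" % (esc(first), esc(other)))


# Ordinary string operations silently drop the marker, which is why a
# filter has to declare that it keeps safe input safe.
lost_marker = "%sxx" % SafeText("<a>")
print(type(lost_marker) is str)

# Removing characters can leave unbalanced markup behind.
print(SafeText("<a>").replace(">", ""))

print(initial_letter("a<b", autoescape=True))
```

The last call escapes only the caller's data, while the ``<strong>`` tags introduced by the filter pass through untouched.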
Writing custom template tags Writing custom template tags
---------------------------- ----------------------------
@ -981,7 +991,7 @@ Auto-escaping considerations
The output from template tags is **not** automatically run through the The output from template tags is **not** automatically run through the
auto-escaping filters. However, there are still a couple of things you should auto-escaping filters. However, there are still a couple of things you should
keep in mind when writing a template tag: keep in mind when writing a template tag.
If the ``render()`` function of your template stores the result in a context If the ``render()`` function of your template stores the result in a context
variable (rather than returning the result in a string), it should take care variable (rather than returning the result in a string), it should take care
@ -991,18 +1001,17 @@ time, so content that should be safe from further escaping needs to be marked
as such. as such.
Also, if your template tag creates a new context for performing some Also, if your template tag creates a new context for performing some
sub-rendering, you should be careful to set the auto-escape attribute to the sub-rendering, set the auto-escape attribute to the current context's value.
current context's value. The ``__init__`` method for the ``Context`` class The ``__init__`` method for the ``Context`` class takes a parameter called
takes a parameter called ``autoescape`` that you can use for this purpose. For ``autoescape`` that you can use for this purpose. For example::
example::
def render(self, context): def render(self, context):
# ... # ...
new_context = Context({'var': obj}, autoescape=context.autoescape) new_context = Context({'var': obj}, autoescape=context.autoescape)
# ... Do something with new_context ... # ... Do something with new_context ...
This is not a very common situation, but it is sometimes useful, particularly This is not a very common situation, but it's useful if you're rendering a
if you are rendering a template yourself. For example:: template yourself. For example::
def render(self, context): def render(self, context):
t = template.loader.get_template('small_fragment.html') t = template.loader.get_template('small_fragment.html')
@ -1010,7 +1019,7 @@ if you are rendering a template yourself. For example::
If we had neglected to pass in the current ``context.autoescape`` value to our If we had neglected to pass in the current ``context.autoescape`` value to our
new ``Context`` in this example, the results would have *always* been new ``Context`` in this example, the results would have *always* been
automatically escaped, which may not be the desired behaviour if the template automatically escaped, which may not be the desired behavior if the template
tag is used inside a ``{% autoescape off %}`` block. tag is used inside a ``{% autoescape off %}`` block.
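To see why the setting has to be passed along, consider this standard-library-only sketch. ``ContextSketch`` and ``render_var`` are hypothetical and far simpler than Django's ``Context``; the only behavior assumed from the text is that a new context escapes by default unless told otherwise.

```python
def escape_html(text):
    # Hypothetical helper; '&' is replaced first to avoid double escaping.
    return (
        text.replace("&", "&amp;")
        .replace("<", "&lt;")
        .replace(">", "&gt;")
        .replace('"', "&quot;")
        .replace("'", "&#39;")
    )


class ContextSketch:
    """Hypothetical stand-in for a template context (escaping on by default)."""

    def __init__(self, values=None, autoescape=True):
        self.values = dict(values or {})
        self.autoescape = autoescape


def render_var(context, name):
    """Render one variable, honoring the context's autoescape setting."""
    value = str(context.values[name])
    return escape_html(value) if context.autoescape else value


# The tag is being used where auto-escaping has been turned off.
outer = ContextSketch({"var": "<b>Hi</b>"}, autoescape=False)

# Forgetting to pass the setting along falls back to the default (on).
forgot = ContextSketch({"var": outer.values["var"]})
# Passing it along keeps the sub-rendering consistent with the caller.
passed = ContextSketch({"var": outer.values["var"]}, autoescape=outer.autoescape)

print(render_var(forgot, "var"))
print(render_var(passed, "var"))
```

Only the second context respects the surrounding template's choice; the first escapes the value even though the caller asked for raw output.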
Registering the tag Registering the tag