172 lines
5.8 KiB
Plaintext
172 lines
5.8 KiB
Plaintext
|
====================================================
|
||
|
py.xml: Lightweight and flexible xml/html generation
|
||
|
====================================================
|
||
|
|
||
|
.. contents::
|
||
|
.. sectnum::
|
||
|
|
||
|
Motivation
|
||
|
==========
|
||
|
|
||
|
There are a plethora of frameworks and libraries to generate
|
||
|
xml and html trees. However, many of them are large, have a
|
||
|
steep learning curve and are often hard to debug. Not to
|
||
|
speak of the fact that they are frameworks to begin with.
|
||
|
|
||
|
The py lib strives to offer enough functionality to represent
|
||
|
itself and especially its API in html or xml.
|
||
|
|
||
|
.. _xist: http://www.livinglogic.de/Python/xist/index.html
|
||
|
.. _`exchange data`: execnet.html#exchange-data
|
||
|
|
||
|
a pythonic object model , please
|
||
|
================================
|
||
|
|
||
|
The py lib offers a pythonic way to generate xml/html, based on
|
||
|
ideas from xist_ which `uses python class objects`_ to build
|
||
|
xml trees. However, xist_'s implementation is somewhat heavy
|
||
|
because it has additional goals like transformations and
|
||
|
supporting many namespaces. But its basic idea is very easy.
|
||
|
|
||
|
.. _`uses python class objects`: http://www.livinglogic.de/Python/xist/Howto.html
|
||
|
|
||
|
generating arbitrary xml structures
|
||
|
-----------------------------------
|
||
|
|
||
|
With ``py.xml.Namespace`` you have the basis
|
||
|
to generate custom xml-fragments on the fly::
|
||
|
|
||
|
class ns(py.xml.Namespace):
|
||
|
"my custom xml namespace"
|
||
|
doc = ns.books(
|
||
|
ns.book(
|
||
|
ns.author("May Day"),
|
||
|
ns.title("python for java programmers"),),
|
||
|
ns.book(
|
||
|
ns.author("why"),
|
||
|
ns.title("Java for Python programmers"),),
|
||
|
publisher="N.N",
|
||
|
)
|
||
|
print doc.unicode(indent=2).encode('utf8')
|
||
|
|
||
|
will give you this representation::
|
||
|
|
||
|
<books publisher="N.N">
|
||
|
<book>
|
||
|
<author>May Day</author>
|
||
|
<title>python for java programmers</title></book>
|
||
|
<book>
|
||
|
<author>why</author>
|
||
|
<title>Java for Python programmers</title></book></books>
|
||
|
|
||
|
In a sentence: positional arguments are child-tags and
|
||
|
keyword-arguments are attributes.
|
||
|
|
||
|
On a side note, you'll see that the unicode-serializer
|
||
|
supports a nice indentation style which keeps your generated
|
||
|
html readable, basically through emulating python's white
|
||
|
space significance by putting closing-tags rightmost and
|
||
|
almost invisible at first glance :-)
|
||
|
|
||
|
basic example for generating html
|
||
|
---------------------------------
|
||
|
|
||
|
Consider this example::
|
||
|
|
||
|
from py.xml import html # html namespace
|
||
|
|
||
|
paras = "First Para", "Second para"
|
||
|
|
||
|
doc = html.html(
|
||
|
html.head(
|
||
|
html.meta(name="Content-Type", value="text/html; charset=latin1")),
|
||
|
html.body(
|
||
|
[html.p(p) for p in paras]))
|
||
|
|
||
|
print unicode(doc).encode('latin1')
|
||
|
|
||
|
Again, tags are objects which contain tags and have attributes.
|
||
|
More exactly, Tags inherit from the list type and thus can be
|
||
|
manipulated as list objects. They additionally support a default
|
||
|
way to represent themselves as a serialized unicode object.
|
||
|
|
||
|
If you happen to look at the py.xml implementation you'll
|
||
|
note that the tag/namespace implementation consumes some 50 lines
|
||
|
with another 50 lines for the unicode serialization code.
|
||
|
|
||
|
CSS-styling your html Tags
|
||
|
--------------------------
|
||
|
|
||
|
One aspect where many of the huge python xml/html generation
|
||
|
frameworks utterly fail is a clean and convenient integration
|
||
|
of CSS styling. Often, developers are left alone with keeping
|
||
|
CSS style definitions in sync with some style files
|
||
|
represented as strings (often in a separate .css file). Not
|
||
|
only is this hard to debug but the missing abstractions make
|
||
|
it hard to modify the styling of your tags or to choose custom
|
||
|
style representations (inline, html.head or external). Add the
|
||
|
Browers usual tolerance of messyness and errors in Style
|
||
|
references and welcome to hell, known as the domain of
|
||
|
developing web applications :-)
|
||
|
|
||
|
By contrast, consider this CSS styling example::
|
||
|
|
||
|
class my(html):
|
||
|
"my initial custom style"
|
||
|
class body(html.body):
|
||
|
style = html.Style(font_size = "120%")
|
||
|
|
||
|
class h2(html.h2):
|
||
|
style = html.Style(background = "grey")
|
||
|
|
||
|
class p(html.p):
|
||
|
style = html.Style(font_weight="bold")
|
||
|
|
||
|
doc = my.html(
|
||
|
my.head(),
|
||
|
my.body(
|
||
|
my.h2("hello world"),
|
||
|
my.p("bold as bold can")
|
||
|
)
|
||
|
)
|
||
|
|
||
|
print doc.unicode(indent=2)
|
||
|
|
||
|
This will give you a small'n mean self contained
|
||
|
represenation by default::
|
||
|
|
||
|
<html>
|
||
|
<head/>
|
||
|
<body style="font-size: 120%">
|
||
|
<h2 style="background: grey">hello world</h2>
|
||
|
<p style="font-weight: bold">bold as bold can</p></body></html>
|
||
|
|
||
|
Most importantly, note that the inline-styling is just an
|
||
|
implementation detail of the unicode serialization code.
|
||
|
You can easily modify the serialization to put your styling into the
|
||
|
``html.head`` or in a separate file and autogenerate CSS-class
|
||
|
names or ids.
|
||
|
|
||
|
Hey, you could even write tests that you are using correct
|
||
|
styles suitable for specific browser requirements. Did i mention
|
||
|
that the ability to easily write tests for your generated
|
||
|
html and its serialization could help to develop _stable_ user
|
||
|
interfaces?
|
||
|
|
||
|
More to come ...
|
||
|
----------------
|
||
|
|
||
|
For now, i don't think we should strive to offer much more
|
||
|
than the above. However, it is probably not hard to offer
|
||
|
*partial serialization* to allow generating maybe hundreds of
|
||
|
complex html documents per second. Basically we would allow
|
||
|
putting callables both as Tag content and as values of
|
||
|
attributes. A slightly more advanced Serialization would then
|
||
|
produce a list of unicode objects intermingled with callables.
|
||
|
At HTTP-Request time the callables would get called to
|
||
|
complete the probably request-specific serialization of
|
||
|
your Tags. Hum, it's probably harder to explain this than to
|
||
|
actually code it :-)
|
||
|
|
||
|
.. _`py.test`: test.html
|