314 lines
12 KiB
Plaintext
314 lines
12 KiB
Plaintext
=====================================================
|
|
py.magic.greenlet: Lightweight concurrent programming
|
|
=====================================================
|
|
|
|
.. contents::
|
|
.. sectnum::
|
|
|
|
Motivation
|
|
==========
|
|
|
|
The "greenlet" package is a spin-off of `Stackless`_, a version of CPython
|
|
that supports micro-threads called "tasklets". Tasklets run
|
|
pseudo-concurrently (typically in a single or a few OS-level threads) and
|
|
are synchronized with data exchanges on "channels".
|
|
|
|
A "greenlet", on the other hand, is a still more primitive notion of
|
|
micro-thread with no implicit scheduling; coroutines, in other words.
|
|
This is useful when you want to
|
|
control exactly when your code runs. You can build custom scheduled
|
|
micro-threads on top of greenlet; however, it seems that greenlets are
|
|
useful on their own as a way to make advanced control flow structures.
|
|
For example, we can recreate generators; the difference with Python's own
|
|
generators is that our generators can call nested functions and the nested
|
|
functions can yield values too. (Additionally, you don't need a "yield"
|
|
keyword. See the example in `test_generator.py`_).
|
|
|
|
Greenlets are provided as a C extension module for the regular unmodified
|
|
interpreter.
|
|
|
|
.. _`Stackless`: http://www.stackless.com
|
|
.. _`test_generator.py`: http://codespeak.net/svn/user/arigo/greenlet/test_generator.py
|
|
|
|
Example
|
|
-------
|
|
|
|
Let's consider a system controlled by a terminal-like console, where the user
|
|
types commands. Assume that the input comes character by character. In such
|
|
a system, there will typically be a loop like the following one::
|
|
|
|
def process_commands(*args):
|
|
while True:
|
|
line = ''
|
|
while not line.endswith('\n'):
|
|
line += read_next_char()
|
|
if line == 'quit\n':
|
|
print "are you sure?"
|
|
if read_next_char() != 'y':
|
|
continue # ignore the command
|
|
process_command(line)
|
|
|
|
Now assume that you want to plug this program into a GUI. Most GUI toolkits
|
|
are event-based. They will invoke a call-back for each character the user
|
|
presses. [Replace "GUI" with "XML expat parser" if that rings more bells to
|
|
you ``:-)``] In this setting, it is difficult to implement the
|
|
read_next_char() function needed by the code above. We have two incompatible
|
|
functions::
|
|
|
|
def event_keydown(key):
|
|
??
|
|
|
|
def read_next_char():
|
|
?? should wait for the next event_keydown() call
|
|
|
|
You might consider doing that with threads. Greenlets are an alternate
|
|
solution that don't have the related locking and shutdown problems. You
|
|
start the process_commands() function in its own, separate greenlet, and
|
|
then you exchange the keypresses with it as follows::
|
|
|
|
def event_keydown(key):
|
|
# jump into g_processor, sending it the key
|
|
g_processor.switch(key)
|
|
|
|
def read_next_char():
|
|
# g_self is g_processor in this simple example
|
|
g_self = greenlet.getcurrent()
|
|
# jump to the parent (main) greenlet, waiting for the next key
|
|
next_char = g_self.parent.switch()
|
|
return next_char
|
|
|
|
g_processor = greenlet(process_commands)
|
|
g_processor.switch(*args) # input arguments to process_commands()
|
|
|
|
gui.mainloop()
|
|
|
|
In this example, the execution flow is: when read_next_char() is called, it
|
|
is part of the g_processor greenlet, so when it switches to its parent
|
|
greenlet, it resumes execution in the top-level main loop (the GUI). When
|
|
the GUI calls event_keydown(), it switches to g_processor, which means that
|
|
the execution jumps back wherever it was suspended in that greenlet -- in
|
|
this case, to the switch() instruction in read_next_char() -- and the ``key``
|
|
argument in event_keydown() is passed as the return value of the switch() in
|
|
read_next_char().
|
|
|
|
Note that read_next_char() will be suspended and resumed with its call stack
|
|
preserved, so that it will itself return to different positions in
|
|
process_commands() depending on where it was originally called from. This
|
|
allows the logic of the program to be kept in a nice control-flow way; we
|
|
don't have to completely rewrite process_commands() to turn it into a state
|
|
machine.
|
|
|
|
|
|
Usage
|
|
=====
|
|
|
|
Introduction
|
|
------------
|
|
|
|
A "greenlet" is a small independent pseudo-thread. Think about it as a
|
|
small stack of frames; the outermost (bottom) frame is the initial
|
|
function you called, and the innermost frame is the one in which the
|
|
greenlet is currently paused. You work with greenlets by creating a
|
|
number of such stacks and jumping execution between them. Jumps are never
|
|
implicit: a greenlet must choose to jump to another greenlet, which will
|
|
cause the former to suspend and the latter to resume where it was
|
|
suspended. Jumping between greenlets is called "switching".
|
|
|
|
When you create a greenlet, it gets an initially empty stack; when you
|
|
first switch to it, it starts the run a specified function, which may call
|
|
other functions, switch out of the greenlet, etc. When eventually the
|
|
outermost function finishes its execution, the greenlet's stack becomes
|
|
empty again and the greenlet is "dead". Greenlets can also die of an
|
|
uncaught exception.
|
|
|
|
For example::
|
|
|
|
from py.magic import greenlet
|
|
|
|
def test1():
|
|
print 12
|
|
gr2.switch()
|
|
print 34
|
|
|
|
def test2():
|
|
print 56
|
|
gr1.switch()
|
|
print 78
|
|
|
|
gr1 = greenlet(test1)
|
|
gr2 = greenlet(test2)
|
|
gr1.switch()
|
|
|
|
The last line jumps to test1, which prints 12, jumps to test2, prints 56,
|
|
jumps back into test1, prints 34; and then test1 finishes and gr1 dies.
|
|
At this point, the execution comes back to the original ``gr1.switch()``
|
|
call. Note that 78 is never printed.
|
|
|
|
Parents
|
|
-------
|
|
|
|
Let's see where execution goes when a greenlet dies. Every greenlet has a
|
|
"parent" greenlet. The parent greenlet is initially the one in which the
|
|
greenlet was created (this can be changed at any time). The parent is
|
|
where execution continues when a greenlet dies. This way, greenlets are
|
|
organized in a tree. Top-level code that doesn't run in a user-created
|
|
greenlet runs in the implicit "main" greenlet, which is the root of the
|
|
tree.
|
|
|
|
In the above example, both gr1 and gr2 have the main greenlet as a parent.
|
|
Whenever one of them dies, the execution comes back to "main".
|
|
|
|
Uncaught exceptions are propagated into the parent, too. For example, if
|
|
the above test2() contained a typo, it would generate a NameError that
|
|
would kill gr2, and the exception would go back directly into "main".
|
|
The traceback would show test2, but not test1. Remember, switches are not
|
|
calls, but transfer of execution between parallel "stack containers", and
|
|
the "parent" defines which stack logically comes "below" the current one.
|
|
|
|
Instantiation
|
|
-------------
|
|
|
|
``py.magic.greenlet`` is the greenlet type, which supports the following
|
|
operations:
|
|
|
|
``greenlet(run=None, parent=None)``
|
|
Create a new greenlet object (without running it). ``run`` is the
|
|
callable to invoke, and ``parent`` is the parent greenlet, which
|
|
defaults to the current greenlet.
|
|
|
|
``greenlet.getcurrent()``
|
|
Returns the current greenlet (i.e. the one which called this
|
|
function).
|
|
|
|
``greenlet.GreenletExit``
|
|
This special exception does not propagate to the parent greenlet; it
|
|
can be used to kill a single greenlet.
|
|
|
|
The ``greenlet`` type can be subclassed, too. A greenlet runs by calling
|
|
its ``run`` attribute, which is normally set when the greenlet is
|
|
created; but for subclasses it also makes sense to define a ``run`` method
|
|
instead of giving a ``run`` argument to the constructor.
|
|
|
|
Switching
|
|
---------
|
|
|
|
Switches between greenlets occur when the method switch() of a greenlet is
|
|
called, in which case execution jumps to the greenlet whose switch() is
|
|
called, or when a greenlet dies, in which case execution jumps to the
|
|
parent greenlet. During a switch, an object or an exception is "sent" to
|
|
the target greenlet; this can be used as a convenient way to pass
|
|
information between greenlets. For example::
|
|
|
|
def test1(x, y):
|
|
z = gr2.switch(x+y)
|
|
print z
|
|
|
|
def test2(u):
|
|
print u
|
|
gr1.switch(42)
|
|
|
|
gr1 = greenlet(test1)
|
|
gr2 = greenlet(test2)
|
|
gr1.switch("hello", " world")
|
|
|
|
This prints "hello world" and 42, with the same order of execution as the
|
|
previous example. Note that the arguments of test1() and test2() are not
|
|
provided when the greenlet is created, but only the first time someone
|
|
switches to it.
|
|
|
|
Here are the precise rules for sending objects around:
|
|
|
|
``g.switch(obj=None or *args)``
|
|
Switches execution to the greenlet ``g``, sending it the given
|
|
``obj``. As a special case, if ``g`` did not start yet, then it will
|
|
start to run now; in this case, any number of arguments can be
|
|
provided, and ``g.run(*args)`` is called.
|
|
|
|
Dying greenlet
|
|
If a greenlet's ``run()`` finishes, its return value is the object
|
|
sent to its parent. If ``run()`` terminates with an exception, the
|
|
exception is propagated to its parent (unless it is a
|
|
``greenlet.GreenletExit`` exception, in which case the exception
|
|
object itself is sent to the parent).
|
|
|
|
Apart from the cases described above, the target greenlet normally
|
|
receives the object as the return value of the call to ``switch()`` in
|
|
which it was previously suspended. Indeed, although a call to
|
|
``switch()`` does not return immediately, it will still return at some
|
|
point in the future, when some other greenlet switches back. When this
|
|
occurs, then execution resumes just after the ``switch()`` where it was
|
|
suspended, and the ``switch()`` itself appears to return the object that
|
|
was just sent. This means that ``x = g.switch(y)`` will send the object
|
|
``y`` to ``g``, and will later put the (unrelated) object that some
|
|
(unrelated) greenlet passes back to us into ``x``.
|
|
|
|
Note that any attempt to switch to a dead greenlet actually goes to the
|
|
dead greenlet's parent, or its parent's parent, and so on. (The final
|
|
parent is the "main" greenlet, which is never dead.)
|
|
|
|
Methods and attributes of greenlets
|
|
-----------------------------------
|
|
|
|
``g.switch(obj=None or *args)``
|
|
Switches execution to the greenlet ``g``. See above.
|
|
|
|
``g.run``
|
|
The callable that ``g`` will run when it starts. After ``g`` started,
|
|
this attribute no longer exists.
|
|
|
|
``g.parent``
|
|
The parent greenlet. This is writeable, but it is not allowed to
|
|
create cycles of parents.
|
|
|
|
``g.gr_frame``
|
|
The current top frame, or None.
|
|
|
|
``bool(g)``
|
|
True if ``g`` is active, False if it is dead or not yet started.
|
|
|
|
``g.throw([typ, [val, [tb]]])``
|
|
Switches execution to the greenlet ``g``, but immediately raises the
|
|
given exception in ``g``. If no argument is provided, the exception
|
|
defaults to ``greenlet.GreenletExit``. The normal exception
|
|
propagation rules apply, as described above. Note that calling this
|
|
method is almost equivalent to the following::
|
|
|
|
def raiser():
|
|
raise typ, val, tb
|
|
g_raiser = greenlet(raiser, parent=g)
|
|
g_raiser.switch()
|
|
|
|
except that this trick does not work for the
|
|
``greenlet.GreenletExit`` exception, which would not propagate
|
|
from ``g_raiser`` to ``g``.
|
|
|
|
Greenlets and Python threads
|
|
----------------------------
|
|
|
|
Greenlets can be combined with Python threads; in this case, each thread
|
|
contains an independent "main" greenlet with a tree of sub-greenlets. It
|
|
is not possible to mix or switch between greenlets belonging to different
|
|
threads.
|
|
|
|
Garbage-collecting live greenlets
|
|
---------------------------------
|
|
|
|
If all the references to a greenlet object go away (including the
|
|
references from the parent attribute of other greenlets), then there is no
|
|
way to ever switch back to this greenlet. In this case, a GreenletExit
|
|
exception is generated into the greenlet. This is the only case where a
|
|
greenlet receives the execution asynchronously. This gives
|
|
``try:finally:`` blocks a chance to clean up resources held by the
|
|
greenlet. This feature also enables a programming style in which
|
|
greenlets are infinite loops waiting for data and processing it. Such
|
|
loops are automatically interrupted when the last reference to the
|
|
greenlet goes away.
|
|
|
|
The greenlet is expected to either die or be resurrected by having a new
|
|
reference to it stored somewhere; just catching and ignoring the
|
|
GreenletExit is likely to lead to an infinite loop.
|
|
|
|
Greenlets do not participate in garbage collection; cycles involving data
|
|
that is present in a greenlet's frames will not be detected. Storing
|
|
references to other greenlets cyclically may lead to leaks.
|