============================================================================== py.execnet: *elastic* distributed programming ============================================================================== ``execnet`` helps you to: * ad-hoc instantiate local or remote Python Processes * send code for execution in one or many processes * send and receive data between processes through channels One of it's unique features is that it uses a **zero-install** technique: no manual installation steps are required on remote places, only a basic working Python interpreter and some input/output connection to it. There is a `EuroPython2009 talk`_ from July 2009 with examples and some pictures. .. contents:: :local: :depth: 2 .. _`EuroPython2009 talk`: http://codespeak.net/download/py/ep2009-execnet.pdf Gateways: immediately spawn local or remote process =================================================== In order to send code to a remote place or a subprocess you need to instantiate a so-called Gateway object. There are currently three Gateway classes: * :api:`py.execnet.PopenGateway` to open a subprocess on the local machine. Useful for making use of multiple processors to to contain code execution in a separated environment. * :api:`py.execnet.SshGateway` to connect to a remote ssh server and distribute execution to it. * :api:`py.execnet.SocketGateway` a way to connect to a remote Socket based server. *Note* that this method requires a manually started :source:py/execnet/script/socketserver.py script. You can run this "server script" without having the py lib installed on the remote system and you can setup it up as permanent service. remote_exec: execute source code remotely =================================================== All gateways offer remote code execution via this high level function:: def remote_exec(source): """return channel object for communicating with the asynchronously executing 'source' code which will have a corresponding 'channel' object in its executing namespace.""" With `remote_exec` you send source code to the other side and get both a local and a remote Channel_ object, which you can use to have the local and remote site communicate data in a structured way. Here is an example for reading the PID:: >>> import py >>> gw = py.execnet.PopenGateway() >>> channel = gw.remote_exec(""" ... import os ... channel.send(os.getpid()) ... """) >>> remote_pid = channel.receive() >>> remote_pid != py.std.os.getpid() True .. _`Channel`: .. _`channel-api`: .. _`exchange data`: Channels: bidirectionally exchange data between hosts ======================================================= A channel object allows to send and receive data between two asynchronously running programs. When calling `remote_exec` you will get a channel object back and the code fragment running on the other side will see a channel object in its global namespace. Here is the interface of channel objects:: # # API for sending and receiving anonymous values # channel.send(item): sends the given item to the other side of the channel, possibly blocking if the sender queue is full. Note that items need to be marshallable (all basic python types are). channel.receive(): receives an item that was sent from the other side, possibly blocking if there is none. Note that exceptions from the other side will be reraised as gateway.RemoteError exceptions containing a textual representation of the remote traceback. channel.waitclose(timeout=None): wait until this channel is closed. Note that a closed channel may still hold items that will be received or send. Note that exceptions from the other side will be reraised as gateway.RemoteError exceptions containing a textual representation of the remote traceback. channel.close(): close this channel on both the local and the remote side. A remote side blocking on receive() on this channel will get woken up and see an EOFError exception. .. _xspec: XSpec: string specification for gateway type and configuration =============================================================== ``py.execnet`` supports a simple extensible format for specifying and configuring Gateways for remote execution. You can use a string specification to instantiate a new gateway, for example a new SshGateway:: gateway = py.execnet.makegateway("ssh=myhost") Let's look at some examples for valid specifications. Specification for an ssh connection to `wyvern`, running on python2.4 in the (newly created) 'mycache' subdirectory:: ssh=wyvern//python=python2.4//chdir=mycache Specification of a python2.5 subprocess; with a low CPU priority ("nice" level). Current dir will be the current dir of the instantiator (that's true for all 'popen' specifications unless they specify 'chdir'):: popen//python=2.5//nice=20 Specification of a Python Socket server process that listens on 192.168.1.4:8888; current dir will be the 'pyexecnet-cache' sub directory which is used a default for all remote processes:: socket=192.168.1.4:8888 More generally, a specification string has this general format:: key1=value1//key2=value2//key3=value3 If you omit a value, a boolean true value is assumed. Currently the following key/values are supported: * ``popen`` for a PopenGateway * ``ssh=host`` for a SshGateway * ``socket=address:port`` for a SocketGateway * ``python=executable`` for specifying Python executables * ``chdir=path`` change remote working dir to given relative or absolute path * ``nice=value`` decrease remote nice level if platforms supports it Examples of py.execnet usage =============================================================== Compare cwd() of Popen Gateways ---------------------------------------- A PopenGateway has the same working directory as the instantiatior:: >>> import py, os >>> gw = py.execnet.PopenGateway() >>> ch = gw.remote_exec("import os; channel.send(os.getcwd())") >>> res = ch.receive() >>> assert res == os.getcwd() >>> gw.exit() Synchronously receive results from two sub processes ----------------------------------------------------- Use MultiChannels for receiving multiple results from remote code:: >>> import py >>> ch1 = py.execnet.PopenGateway().remote_exec("channel.send(1)") >>> ch2 = py.execnet.PopenGateway().remote_exec("channel.send(2)") >>> mch = py.execnet.MultiChannel([ch1, ch2]) >>> l = mch.receive_each() >>> assert len(l) == 2 >>> assert 1 in l >>> assert 2 in l Asynchronously receive results from two sub processes ----------------------------------------------------- Use ``MultiChannel.make_receive_queue()`` for asynchronously receiving multiple results from remote code. This standard Queue provides ``(channel, result)`` tuples which allows to determine where a result comes from:: >>> import py >>> ch1 = py.execnet.PopenGateway().remote_exec("channel.send(1)") >>> ch2 = py.execnet.PopenGateway().remote_exec("channel.send(2)") >>> mch = py.execnet.MultiChannel([ch1, ch2]) >>> queue = mch.make_receive_queue() >>> chan1, res1 = queue.get() # you may also specify a timeout >>> chan2, res2 = queue.get() >>> res1 + res2 3 >>> assert chan1 in (ch1, ch2) >>> assert chan2 in (ch1, ch2) >>> assert chan1 != chan2 Receive file contents from remote SSH account ----------------------------------------------------- Here is a small program that you can use to retrieve contents of remote files:: import py # open a gateway to a fresh child process gw = py.execnet.SshGateway('codespeak.net') channel = gw.remote_exec(""" for fn in channel: f = open(fn, 'rb') channel.send(f.read()) f.close() """) for fn in somefilelist: channel.send(fn) content = channel.receive() # process content # later you can exit / close down the gateway gw.exit() Instantiate a socket server in a new subprocess ----------------------------------------------------- The following example opens a PopenGateway, i.e. a python child process, and starts a socket server within that process and then opens a second gateway to the freshly started socketserver:: import py popengw = py.execnet.PopenGateway() socketgw = py.execnet.SocketGateway.new_remote(popengw, ("127.0.0.1", 0)) print socketgw._rinfo() # print some info about the remote environment Sending a module / checking if run through remote_exec -------------------------------------------------------------- You can pass a module object to ``remote_exec`` in which case its source code will be sent. No dependencies will be transferred so the module must be self-contained or only use modules that are installed on the "other" side. Module code can detect if it is running in a remote_exec situation by checking for the special ``__name__`` attribute like this:: if __name__ == '__channelexec__': # ... call module functions ...