Automatic API documentation generation
======================================
Motivation
----------
* We want better automatic hyperlinking within the documentation.
* Test-driven documentation generation. Get more semantic info from the tests (return values, exceptions raised, etc.). Also, expose tests in the documentation itself as usage examples.
* Should work with generated code, even without source files.
* We want to expose some workflow/context information around the API: what are the inputs, and where do they come from?
* Ease of use and setup; perhaps an option to py.test to generate the docs.
* The generator itself should consist of small, well unit-tested modules.
Related projects
----------------
* pydoc
* epydoc
* eric3-doc/eric3-api
* doxygen
* pydoctor
* PyUMLGraph (source page down? http://www.python.org/pypi/PyUMLGraph/0.1.4)
* ...
Proposed features
-----------------
* Minimal dependencies
* Easy to setup, easy to use.
* Frontend independent, with multiple frontends: HTML, HTML + JavaScript, pygame.
* Whenever possible the documentation is hyperlinked to relevant info.
* The documents contain links to the source code.
* Can look at running code and generate docs from it.
* Can generate some sort of static document output which could be easily distributed in a zip/tarball.
* Works with generated code, even code without source files.
* Collected data should be usable for other purposes, for example websites, IDE code completion, etc.
* By default, only document the exposed API, don't dig too deep inside the objects.
* Try to learn additional semantic information from the tests. What types are passed to and returned from functions? In what order are functions usually called? Tests can become usage examples in the documentation. There is a nice social aspect to getting data from the tests, because it encourages the developer to create more complete tests.
* Report code coverage in the tests? (does this really belong in py.test?)
* Possibly create some kind of diagrams/graphs.
* Possibly use the same introspection techniques to produce very rich state information after a crash.
Which things to implement first
-------------------------------
* Create simple ReST output, just documenting the API.
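A minimal ReST emitter along these lines might look like the sketch below. All names are invented; it documents an explicitly given list of public names rather than doing real extraction:

```python
import inspect
import textwrap

def document_module(mod, public_names):
    """Render a minimal ReST page for the given public names.
    A sketch: the real extractor would discover the names itself."""
    lines = [mod.__name__, "=" * len(mod.__name__), ""]
    for name in public_names:
        obj = getattr(mod, name)
        sig = str(inspect.signature(obj)) if callable(obj) else ""
        title = name + sig
        # section title + matching-length underline, rst style
        lines += [title, "-" * len(title), ""]
        lines += [inspect.getdoc(obj) or "(undocumented)", ""]
    return "\n".join(lines)

# demo on a stdlib module
print(document_module(textwrap, ["dedent", "indent"]))
```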
Implementation ideas
--------------------
* Somehow stop a running application and introspect it.
* Use a tracehook while running tests to collect information.
* Use pypy to solve all problems. ;)
Automatic API documentation generation - architecture overview
==============================================================
Abstraction layers:
-------------------
* ``extractor`` - library API extraction
* ``presenter`` - presentation backend (pygame, html, http + ajax)
* ``test runner`` - just `py.test`_ (modified or not)
* ``tracer`` - middleware responsible for generating the necessary
  annotations
* ``reporter`` - object API for the presenter, which gets its info from the tracer and runner
Extractor:
----------
Right now this is quite easy - we'll just use the pylib ``__package__``
semantics, which lets us avoid guessing what is API and what is internal. (I
believe that having an explicit split is better than trying to guess it.)
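A toy sketch of the idea, assuming an explicit export map. The dict layout and names are invented for illustration and are not pylib's actual ``__package__`` format:

```python
import types

# A hypothetical package with an explicit export map: only mapped
# names are API, everything else is internal by definition.
impl = types.SimpleNamespace(
    parse=lambda s: s.split(),   # implementation detail being exposed
    _cache={},                   # internal, deliberately not exported
)
EXPORTS = {"toypkg.parse": impl.parse}

def extract_api(exports):
    """Return (dotted name, object) pairs -- the documented surface."""
    return sorted(exports.items())

for name, obj in extract_api(EXPORTS):
    print(name, callable(obj))
```

The extractor never has to inspect ``impl._cache`` at all; the explicit map is the whole contract.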
Presenter:
----------
XXX: Whatever fits here (use weird imagination)
Test runner:
------------
For the test runner we'll use `py.test`_ with some hooks from the ``tracer``
to perform the tasks we want it to do. Basically we'll run all of the tests
covering some explicitly specified part of the API and gather as much
information as possible. This will probably need special support for
calls like ``py.test.fail`` or ``py.test.raises`` to handle those
situations explicitly.
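One way such a hook could record semantic facts from ``py.test.raises``-style calls is sketched below. ``recording_raises`` is a hypothetical stand-in for illustration, not the real py.test helper:

```python
RAISED = []  # facts collected for the docs: (callee name, exception name)

def recording_raises(exc, func, *args, **kw):
    """Check that func raises exc, like py.test.raises, but also
    record the fact so the doc generator can mention it (sketch)."""
    try:
        func(*args, **kw)
    except exc:
        RAISED.append((getattr(func, "__name__", repr(func)), exc.__name__))
        return
    raise AssertionError("%r did not raise %s" % (func, exc.__name__))

recording_raises(ZeroDivisionError, lambda: 1 // 0)
print(RAISED)
```

After a test run, the collected facts could feed "raises ZeroDivisionError" notes straight into the generated API pages.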
Tracer:
-------
This is the most involved part of the application (and probably the one
requiring the most work). It should support as many operations as possible
to make the collected information meaningful. Mostly:

* Tracing argument and result types of calls performed by API-defined
  functions (via a debugger hook).
* Tracing variables that are passed around, to make crosslinking within
  the API possible.
* Various other (possibly endless) things that might be done by tracing
  flow graphs.
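The first bullet - collecting argument and result types via a debugger hook - can be sketched with the stdlib ``sys.settrace`` mechanism. The traced function and the record layout are invented for the example:

```python
import sys

def add(a, b):          # stand-in for an API-defined function
    return a + b

types_seen = {}  # func name -> {"args": arg type names, "ret": return type name}

def tracer(frame, event, arg):
    name = frame.f_code.co_name
    if event == "call" and name == "add":
        argnames = frame.f_code.co_varnames[:frame.f_code.co_argcount]
        args = tuple(type(frame.f_locals[n]).__name__ for n in argnames)
        types_seen.setdefault(name, {"args": args})
        return tracer  # keep tracing this frame to see its return event
    if event == "return" and name == "add":
        types_seen[name]["ret"] = type(arg).__name__
    return tracer

sys.settrace(tracer)
add(1, 2)
sys.settrace(None)
print(types_seen)
```

Running this under the test suite instead of a single call would accumulate the argument/result types actually exercised by the tests.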
Reporter:
---------
The reporter is the crucial glue between components. It gathers the
information coming from the ``extractor`` and the ``tracer`` and tries to
present it in an object-oriented way to any backend that wants to read it.
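A sketch of what such an object-oriented record might look like. All class and attribute names here are invented, not the actual reporter API:

```python
class FuncInfo:
    """One documented function, combining extractor data (name,
    signature) with tracer data (observed call types). A sketch."""

    def __init__(self, name, signature):
        self.name = name              # from the extractor
        self.signature = signature
        self.call_types = []          # (arg types, return type), from the tracer

    def as_rest(self):
        """Render the record as a small ReST section for a text backend."""
        title = self.name + self.signature
        lines = [title, "-" * len(title)]
        for args, ret in self.call_types:
            lines.append("* called with %s -> %s" % (args, ret))
        return "\n".join(lines)

info = FuncInfo("join", "(sep, parts)")
info.call_types.append(("(str, list)", "str"))
print(info.as_rest())
```

A pygame or HTML presenter would consume the same ``FuncInfo`` objects and simply render them differently.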
.. _`py.test`: http://codespeak.net/py/test/