Intelligent Documentation Generation

Student: Teodor Shaterov
Title: Intelligent Documentation Generation
Type: bachelor thesis
Advisors: Veldema, R.; Philippsen, M.
State: submitted on November 27, 2012

At Informatik 2, we are in the process of creating a new language named Tapir. Tapir is intended to allow a programmer to write ultra-safe, bug-free programs. We support this by allowing extensive testing of the program, including a simulation tool that tests all possible program executions and further testing support.

Another aspect of making a program 'safe' is to ensure that the program does what it is supposed to do by generating 'safe' documentation. This means that the documentation should describe what the code does (and not what it once did or what it should do).

Other documentation systems are either very invasive or keep the documentation completely separate from the code.

  • Javadoc is separate from the code: the programmer writes the documentation in front of each class/method using a special commenting style, and the javadoc preprocessor extracts each comment and binds them together to build the documentation. However, the comment is in no way guaranteed to describe the code (correctly). On the upside, Javadoc is not very invasive.
  • Literate programming is where the code and the documentation are truly mixed. In this programming style it is very hard to differentiate between code and documentation, which makes the code hard to read. On the upside, the documentation is almost guaranteed to describe the code. For more information on literate programming see:

We propose that documentation should be verifiably matched to the code and that the code and documentation should be mixed in the same way as aspect-oriented programming works. In effect, documentation should be an 'aspect' of the code in this sense (except that it doesn't compute anything).

Finally, what is wrong with many current documentation systems for code is that they aren't in any way 'smart'. Consider the following example:

/// computes the value of pi
double pi() { .... }
void foo(int N) {
    for (int i=0; i<N; i++) {
        pi();
    }
}

Here, the function 'pi()' has some documentation but 'foo()' does not. We could, by simple compiler analysis, generate documentation for 'foo()' that says "foo computes the value of pi N times". Another example:

/// result value
double res;
void bar() {
    for (int i=0; i<N; i++) {
        res += pi();
    }
}

Again, the function 'bar()' has no documentation, but we could generate documentation for it by compiler analysis. For example, we could generate the string "bar 'computes the value of pi' N times and adds it to 'result value'".
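To make the idea concrete, here is a minimal sketch of such upward propagation, written in Python rather than Tapir. The tiny statement classes (Call, AddTo, Loop, Function) and the functions describe() and generate_doc() are hypothetical names invented for this illustration; they are not part of the Tapir compiler, and the wording of the generated strings is only one possible choice.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Call:
    callee: str  # name of the called function


@dataclass
class AddTo:
    target: str  # variable that accumulates the result
    value: Call


@dataclass
class Loop:
    count: str  # symbolic trip count, e.g. "N"
    body: list


@dataclass
class Function:
    name: str
    body: list = field(default_factory=list)
    doc: Optional[str] = None


def describe(stmt, func_docs, var_docs):
    """Build a phrase for one statement from the docs of what it uses."""
    if isinstance(stmt, Call):
        doc = func_docs.get(stmt.callee)
        return f"'{doc}'" if doc else f"calls {stmt.callee}()"
    if isinstance(stmt, AddTo):
        doc = var_docs.get(stmt.target)
        target = f"'{doc}'" if doc else stmt.target
        value = describe(stmt.value, func_docs, var_docs)
        return f"{value} and adds it to {target}"
    if isinstance(stmt, Loop):
        inner = "; ".join(describe(s, func_docs, var_docs) for s in stmt.body)
        return f"{inner} {stmt.count} times"
    raise TypeError(f"unsupported statement: {stmt!r}")


def generate_doc(func, func_docs, var_docs):
    """Return the written doc if present, else one propagated from the body."""
    if func.doc is not None:
        return func.doc
    parts = [describe(s, func_docs, var_docs) for s in func.body]
    return f"{func.name} " + "; ".join(parts)


# Hand-written documentation, as in the two examples above.
func_docs = {"pi": "computes the value of pi"}
var_docs = {"res": "result value"}

foo = Function("foo", [Loop("N", [Call("pi")])])
bar = Function("bar", [Loop("N", [AddTo("res", Call("pi"))])])

print(generate_doc(foo, func_docs, var_docs))
print(generate_doc(bar, func_docs, var_docs))
```

In this sketch the documentation of 'pi()' and of 'res' is reused verbatim in quotes, so a reader can see which parts of the generated text were written by hand and which were derived.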

This sort of intelligence could then be extended to calls of recursive functions, to pulling documentation out of loops (in the same way as moving code out of loops works), etc., so that documentation can be generated for functions that are not documented at all by propagating documentation 'upward'.

Another kind of intelligent documentation that could be generated is an estimate of function complexity. For the function 'bar()' above, a string such as O(N O(pi)) could easily be generated.
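A complexity string of this form can be derived by walking the loop structure. The following Python sketch assumes a hypothetical, much simplified program representation (Call and Loop) and treats everything except calls and loops, such as the addition to 'res', as constant cost; it is an illustration, not the analysis used in the Tapir compiler.

```python
from dataclasses import dataclass


@dataclass
class Call:
    callee: str  # name of the called function


@dataclass
class Loop:
    count: str  # symbolic trip count, e.g. "N"
    body: list


def cost(stmt):
    """Return a symbolic cost term for one statement."""
    if isinstance(stmt, Call):
        # The cost of a call is left symbolic: O(<callee>).
        return f"O({stmt.callee})"
    if isinstance(stmt, Loop):
        terms = [cost(s) for s in stmt.body]
        if not terms:
            return stmt.count
        # Parenthesize only when the body contributes several terms.
        inner = terms[0] if len(terms) == 1 else "(" + " + ".join(terms) + ")"
        return f"{stmt.count} {inner}"
    raise TypeError(f"unsupported statement: {stmt!r}")


def complexity(body):
    """Combine the cost terms of a function body into one O(...) string."""
    terms = [cost(s) for s in body]
    return "O(" + (" + ".join(terms) if terms else "1") + ")"


# bar(): a loop over N whose only non-constant work is the call to pi().
bar_body = [Loop("N", [Call("pi")])]
print(complexity(bar_body))  # O(N O(pi))
```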


This SA/DA consists of four parts.

1) Invent a syntax and semantics that allow documentation to be written in an 'aspect-oriented' style.

2) Patch our (Tapir) compiler so that documentation can be propagated. This should allow loops and functions that are undocumented to have their documentation generated.

3) Adjust the syntax and semantics of your documentation text so that one can specify one or more tests to be run automatically to determine that code and documentation match. For example, for each piece of documentation for a function, also specify a test of that function that must succeed to ensure that the documentation matches the code.

4) Think of as many ways as possible to generate documentation to relieve the programmer from writing it (which most programmers abhor).
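Points 1 and 3 can be illustrated together with a small Python sketch in which documentation is attached to a function from the outside and carries a test that must pass for the documentation to count as verified. The decorator documented() and the function verify_docs() are hypothetical names chosen for this illustration; the actual syntax and semantics for Tapir are what the thesis is meant to design.

```python
import math

# Every documented function is recorded here together with its test.
_REGISTRY = []


def documented(text, test):
    """Attach documentation and a test that must hold for it to be trusted."""
    def wrap(func):
        func.__doc__ = text
        _REGISTRY.append((func, test))
        return func
    return wrap


def verify_docs():
    """Run every attached test; return the names of functions that failed."""
    failures = []
    for func, test in _REGISTRY:
        try:
            ok = bool(test(func))
        except Exception:
            ok = False  # a crashing test also means the doc is unverified
        if not ok:
            failures.append(func.__name__)
    return failures


@documented("computes the value of pi",
            test=lambda f: abs(f() - math.pi) < 1e-9)
def pi():
    return 4.0 * math.atan(1.0)


@documented("returns twice its argument",
            test=lambda f: f(3) == 6)
def double_it(x):
    return x * 3  # the code has drifted away from its documentation


print(verify_docs())  # ['double_it']
```

The second function shows the failure case the proposal is aimed at: the code was changed, the documentation was not, and the attached test exposes the mismatch.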

More information about our new language can be found here:

Note: this SA/DA can be done in either German or English. This work is turned from a DA into an SA by removing the above point '3' from the requirements.
