The first stage on the way from text to graphics is to parse the Python code and collect everything in it into data structures that can be used for generating graphics. I definitely wanted to reuse something that had already been developed. It turned out that the Python interpreter's dynamic library has a function that can help: a C function that produces a syntax tree.
While working on this part I wrote a simple utility that prints the data structure produced by the Python interpreter function. Here is an example of a very simple Python file and the produced data structure. Generally it looks nice: there are line numbers and column numbers, and the node types correspond to the formal Python grammar specification. However, there are some problems too.
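A rough idea of what such a printing utility does can be sketched in pure Python. This is not the utility described above: it uses the standard `ast` module as a stand-in for the interpreter's C function, so the node types are those of the abstract syntax tree rather than of the formal grammar. The `SOURCE` sample and the `dump_tree` helper are hypothetical names introduced only for illustration.

```python
import ast

# Hypothetical sample file content; line 1 is a comment, line 2 an import.
SOURCE = '''# a comment that the tree will not contain
import os

def f(x):
    return x + 1
'''


def dump_tree(node, depth=0, out=None):
    """Collect one text line per node: indentation, node type, [line:column]."""
    if out is None:
        out = []
    line = getattr(node, "lineno", None)
    col = getattr(node, "col_offset", None)
    # Some nodes (e.g. Module) carry no position information at all.
    pos = f" [{line}:{col}]" if line is not None else ""
    out.append("  " * depth + type(node).__name__ + pos)
    for child in ast.iter_child_nodes(node):
        dump_tree(child, depth + 1, out)
    return out


lines = dump_tree(ast.parse(SOURCE))
print("\n".join(lines))
```

In this sketch the `Import` node is reported at line 2, column 0 and the `FunctionDef` node at line 4, column 0, while the comment on line 1 does not show up anywhere in the output.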
The sample code has a few comments, and the syntax tree lost them. The line number and column number reported for the encoding are wrong. Even the encoding name is wrong: the file says it is latin-1 but the syntax tree reports iso-8859-1. It turned out that the Python interpreter code has a normalization procedure for the encoding spec. There are some problems with multiline string literals as well: their line numbers are not supplied. All these surprises had to be accounted for in the parser module. On the other hand, all the text parsing complexity is gone: all I had to do was walk the syntax tree and build data structures convenient for generating graphics.
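Two of these surprises, the lost comments and the normalized encoding name, can be reproduced with the standard library alone. The sketch below is only an illustration under assumptions: it uses the `ast` and `tokenize` modules instead of the interpreter's C function, and a separate tokenizer pass is just one possible way to recover comments, not necessarily what the parser module does. `SOURCE` is a hypothetical sample.

```python
import ast
import io
import tokenize

# Hypothetical sample: the file declares latin-1 and has two more comments.
SOURCE = (
    b"# -*- coding: latin-1 -*-\n"
    b"# explain the constant\n"
    b"x = 1  # trailing note\n"
)

# The syntax tree has no node type for comments, so they are simply gone.
tree = ast.parse(SOURCE)
node_types = {type(node).__name__ for node in ast.walk(tree)}

# The tokenizer reports a normalized encoding name,
# not the spelling used in the file.
encoding, _ = tokenize.detect_encoding(io.BytesIO(SOURCE).readline)

# A separate token pass keeps comments together with (line, column).
comments = [
    (tok.start, tok.string)
    for tok in tokenize.tokenize(io.BytesIO(SOURCE).readline)
    if tok.type == tokenize.COMMENT
]

print(sorted(node_types))
print(encoding)
for position, text in comments:
    print(position, text)
```

Here `encoding` comes back as `iso-8859-1` although the file says `latin-1`, and the comment positions have to be merged with the tree by line number afterwards.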