Dec 21st, 2005, 3:38 PM
Arevos
Hacking the Python compiler

I've been looking into generating Python bytecode from within a Python program, and this is what I've come up with:

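import compiler.pycodegen
import imp
import marshal
import struct
import time


def write_pyc(code, filename):
   "Write code to a PYC file"
   out = open(filename, 'wb')   # 'out' avoids shadowing the built-in 'file'

   # PYC header: magic number for this Python version, then a
   # 4-byte little-endian timestamp (an int, so truncate the float)
   out.write(imp.get_magic())
   out.write(struct.pack('<l', int(time.time())))

   # The marshalled code object follows the header
   marshal.dump(code, out)

   out.close()


def compile_module(parse_tree, filename):
   "Compile a raw AST module"
   # The code generator reads the filename from the tree itself
   parse_tree.filename = filename
   return compiler.pycodegen.ModuleCodeGenerator(parse_tree).getCode()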
The compile_module function compiles an AST (Abstract Syntax Tree) into a code object. The write_pyc function writes a compiled code object to a standard pyc file: the magic number, a timestamp, then the marshalled code.
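The pyc layout used by write_pyc (a 4-byte magic number, a 4-byte little-endian timestamp, then the marshalled code object) can be checked in memory without touching the compiler package. This is only a rough sketch: build_pyc_bytes, split_pyc_bytes and FAKE_MAGIC are names made up for illustration, and the magic value is a placeholder rather than a real one from imp.get_magic(). Newer Python versions use a longer header, so this mirrors only the format described in this post.

```python
import marshal
import struct

# Hypothetical placeholder; a real file would use imp.get_magic()
FAKE_MAGIC = b'\x00\x00\r\n'


def build_pyc_bytes(code, magic, timestamp):
    "Lay out magic, 4-byte little-endian timestamp, then marshalled code"
    return magic + struct.pack('<l', int(timestamp)) + marshal.dumps(code)


def split_pyc_bytes(data):
    "Reverse of build_pyc_bytes: return (magic, timestamp, code object)"
    magic = data[:4]
    (timestamp,) = struct.unpack('<l', data[4:8])
    code = marshal.loads(data[8:])
    return magic, timestamp, code


# Use the built-in compile() to get a code object to round-trip
code = compile('result = pow(2, 2)', '<demo>', 'exec')
data = build_pyc_bytes(code, FAKE_MAGIC, 1135179480.7)

magic, timestamp, loaded = split_pyc_bytes(data)
namespace = {}
exec(loaded, namespace)   # run the unmarshalled code object
```

After this runs, namespace['result'] holds the value computed by the round-tripped code, and timestamp is the float truncated to a whole number of seconds.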

How do you create an AST? A good way to get a feel for Python ASTs is to have Python generate them for you from existing source files:
print compiler.parseFile(py_filename)
You can also check out the table of AST node types in the Python library reference, under the compiler.ast module.
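The compiler package was removed in Python 3, so on a current interpreter the nearest equivalent I know of is the standard ast module. The sketch below swaps in ast.parse and ast.dump for compiler.parseFile; it parses a source string rather than a file, and the node names differ from the compiler.ast ones used in this post.

```python
import ast

# Parse a one-line program and look at the tree Python builds for it
source = "print(pow(2, 2))"
tree = ast.parse(source)

# ast.dump gives a printable form of the whole tree
print(ast.dump(tree))

# The module body is a list of statements; this one is an expression
# statement wrapping the call to print()
stmt = tree.body[0]
outer_call = stmt.value
inner_call = outer_call.args[0]   # the nested pow(2, 2) call
```

The exact text printed by ast.dump varies between Python versions, so it is more reliable to inspect node types and fields than to compare the dump as a string.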

Here's an AST I constructed myself:
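from compiler import ast

module = ast.Module(
   None,                       # module docstring
   ast.Stmt(
      [ast.Printnl(
         [ast.CallFunc(
            # pow(2, 2), with no *args or **kwargs
            ast.Name('pow'), [ast.Const(2), ast.Const(2)], None, None
         )],
         None                  # print destination (None means stdout)
      )]
   )
)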
This AST corresponds to the Python code: "print pow(2, 2)".
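For comparison, here is a rough Python 3 counterpart that builds a similar tree by hand with the standard ast module and compiles it with the built-in compile(). This is not the compiler.ast API used in this post: the node classes and their fields are different, and the tree assigns pow(2, 2) to a variable named result (my own choice) instead of printing it, so the outcome is easy to check.

```python
import ast

# pow(2, 2) as a Call node
call = ast.Call(
    func=ast.Name(id='pow', ctx=ast.Load()),
    args=[ast.Constant(value=2), ast.Constant(value=2)],
    keywords=[],
)

# result = pow(2, 2)
assign = ast.Assign(
    targets=[ast.Name(id='result', ctx=ast.Store())],
    value=call,
)

module = ast.Module(body=[assign], type_ignores=[])

# Hand-built nodes carry no line numbers; compile() needs them
ast.fix_missing_locations(module)

code = compile(module, '<hand-built>', 'exec')
namespace = {}
exec(code, namespace)
```

The call to ast.fix_missing_locations is the step that is easy to forget: without line and column information on every node, compile() rejects the tree.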

To compile this AST and write it to a file:
filename = "four.pyc"
write_pyc(compile_module(module, filename), filename)
The bytecode can then be executed using:
python four.pyc
If all goes well, the number 4 should be printed to the screen.