Tuesday, September 2, 2014

Use Python PDBparse to parse PDB files.

PDBparse is a Python package for parsing Microsoft PDB files. I like a Python solution for parsing PDB files because it is open source. By running its examples, I could easily understand the PDB format without first reading a PDB spec, if there really is a PDB spec.

This page just describes how to install the package and how to run its examples.

Below is the website of PDBparse.
https://code.google.com/p/pdbparse/

We can download the source code here.
http://pdbparse.googlecode.com/svn/trunk

So please check it out and put it in a directory, for example D:\pdbparse. Let's build and install it.
D:\pdbparse>python setup.py build
running build
running build_py
running build_ext
building 'pdbparse._undname' extension
error: Unable to find vcvarsall.bat

If this error occurs, please don't give up. When running setup.py, Python 2.7 searches for Visual Studio 2008. If we don't have that version, or only have another version of Visual Studio, the error occurs. We can trick Python by setting the VS90COMNTOOLS environment variable to the tools path of the installed version. For example, if the environment has Visual Studio 2005:
>set VS90COMNTOOLS=%VS80COMNTOOLS%
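
Before choosing which variable to copy, it can help to see which Visual Studio tool variables are actually defined. The following is a minimal sketch and not part of pdbparse; the helper name `find_vs_comntools` is made up for illustration, and it assumes the installed versions expose variables that follow the `VS<version>COMNTOOLS` naming pattern.

```python
import os
import re

def find_vs_comntools(environ=None):
    """Return {variable name: path} for every VS<nn>COMNTOOLS variable found.

    Hypothetical helper for illustration; `environ` defaults to os.environ.
    """
    if environ is None:
        environ = os.environ
    pattern = re.compile(r"^VS\d+COMNTOOLS$", re.IGNORECASE)
    return dict((name, value) for name, value in environ.items()
                if pattern.match(name))

if __name__ == "__main__":
    found = find_vs_comntools()
    if not found:
        print("No VS*COMNTOOLS variable is set")
    for name in sorted(found):
        print("%s=%s" % (name, found[name]))
```

Whichever variable it lists is the one to use on the right-hand side of the set command.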

Let's build the package again.
D:\pdbparse>python setup.py build
LINK : error LNK2001: unresolved external symbol init_undname
build\temp.win32-2.7\Release\src\_undname.lib : fatal error LNK1120: 1 unresolved externals
error: command '"C:\Program Files\Microsoft Visual Studio 8\VC\BIN\link.exe"' failed with exit status 1120

If this error occurs, we should check whether init_undname is defined in the C files. There is only one C file, undname.c, in the package's source files. The file doesn't define init_undname, but it has an undname() routine.
char *undname(char *buffer, char *mangled, int buflen, unsigned short int flags)
{
    return __unDName(buffer, mangled, buflen, malloc, free, flags);
}

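Therefore my solution is to create init_undname() to wrap undname(). Note that this only gives the linker the symbol it expects to export; it does not have the signature of a real Python module initialization function. Please add the following init_undname() to undname.c.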
char *init_undname(char *buffer, char *mangled, int buflen, unsigned short int flags)
{
    return undname(buffer, mangled, buflen, flags);
}

Let's try again. The build should now succeed.
D:\pdbparse>python setup.py build

Let's install the package into the Python environment.
D:\pdbparse>python setup.py install

Now we can run the package's examples. The first one should run successfully.
D:\pdbparse\examples>python pdb_dump.py Test.pdb

Please run another example.
python pdb_get_syscall_table.py Test.pdb
  File "C:\Python27\lib\site-packages\pdbparse\info.py", line 1, in <module>
    from construct import *
ImportError: No module named construct

If this ImportError occurs, please download the construct package here.
http://construct.readthedocs.org/en/latest/

Download the source code of the construct package and put it in a directory, for example D:\construct-2.5.2. Then build and install it.
D:\construct-2.5.2>python setup.py build
D:\construct-2.5.2>python setup.py install

Please run the example again.
python pdb_get_syscall_table.py Test.pdb
  File "C:\Python27\lib\site-packages\construct\lib\binary.py", line 1, in <module>
    import six
ImportError: No module named six

The error occurs because the construct package depends on the six package, which is missing. Please download, build, and install it.
https://pypi.python.org/pypi/six/1.7.3
D:\six-1.7.3>python setup.py build
D:\six-1.7.3>python setup.py install

Please run the example again.
python pdb_get_syscall_table.py Test.pdb
Traceback (most recent call last):
  File "pdb_get_syscall_table.py", line 9, in <module>
    from pefile import PE
ImportError: No module named pefile

So the example needs the pefile package to parse EXE files, and that package is missing too. Please download, build, and install it.
D:\pefile-1.2.10-139>python setup.py build
D:\pefile-1.2.10-139>python setup.py install
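
The three ImportError messages above can be caught in one pass before running an example. This is a minimal sketch rather than anything shipped with pdbparse; the helper name `missing_modules` is hypothetical, and the module list is simply the packages named in this post.

```python
import importlib

# Packages this post had to install before the examples would run.
REQUIRED = ["pdbparse", "construct", "six", "pefile"]

def missing_modules(names):
    """Return the names in `names` that raise ImportError when imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing

if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All required packages are importable")
```

A package whose own dependency is missing is also reported, because its import fails with ImportError too, so it is worth rerunning the check after each install.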

Now the example runs successfully. There are many other examples that help us understand the PDB file format. We can try them.

pdb_dump.py
pdb_get_syscall_table.py
pdb_lookup.py
pdb_print_ctypes.py
pdb_print_gvars.py
pdb_print_tpi.py
pdb_tpi_vtypes.py
symchk.py
tpi_closure.py
tpi_print_construct.py
tpi_size.py
