This project has a history of over 17 years, going back to Python 1.5.
A number of people have worked on it. I am awed by the amount of work, the number of people who have contributed, and the cleverness of the code.
Below is an annotated history based on my reading of the sources cited.
In 1998, John Aycock first wrote a grammar parser in Python, eventually called SPARK, that was usable inside a Python program. This code was described in a paper at the 7th International Python Conference. That paper doesn't talk about decompilation, nor did John have that in mind at the time. It does mention that a full parser for Python (rather than the simple languages in the paper) was being considered.
This contains a list of people acknowledged in developing SPARK. What's amazing about this code is that it is reasonably fast and has survived up to Python 3 with relatively little change. The work was done in conjunction with his Ph.D. thesis, which was finished around 2001. While working on his thesis, John realized SPARK could be used to deparse Python bytecode. In the fall of 1999, he started writing the Python program "decompyle" to do this.
This code introduced another clever idea: table-driven semantic routines that use format specifiers.
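To give a feel for what "table-driven with format specifiers" means, here is a minimal sketch. It is not the code from decompyle: the node kinds, the table entries, and the single `%c` specifier (taken here to mean "render the child at the next listed index") are all assumptions made up for illustration.

```python
# Hypothetical mini-version of table-driven output formatting.
# Node kinds, table entries, and the "%c" specifier are illustrative only.
from collections import namedtuple

Node = namedtuple("Node", ["kind", "children", "text"])


def leaf(kind, text):
    """Build a node with no children, such as a name or an operator."""
    return Node(kind, (), text)


# Each entry is a template followed by the child indexes that
# successive "%c" specifiers consume, in order.
TABLE = {
    # operands are children 0 and 1, the operator is child 2
    "binary_expr": ("%c %c %c", 0, 2, 1),
    # the value is child 0, the target is child 1
    "assign": ("%c = %c", 1, 0),
}


def render(node):
    """Turn a tree into source text by interpreting the table entry for each node."""
    if node.kind not in TABLE:
        return node.text  # leaves carry their own text
    template, *indexes = TABLE[node.kind]
    args = iter(indexes)
    out = []
    i = 0
    while i < len(template):
        if template.startswith("%c", i):
            out.append(render(node.children[next(args)]))
            i += 2
        else:
            out.append(template[i])
            i += 1
    return "".join(out)


tree = Node(
    "assign",
    (
        Node("binary_expr", (leaf("name", "a"), leaf("name", "b"), leaf("op", "+")), None),
        leaf("name", "x"),
    ),
    None,
)
print(render(tree))  # x = a + b
```

The appeal of this style is that supporting a new construct is mostly a matter of adding a table row rather than writing another traversal routine.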
The last mention of a release of SPARK from John is around 2002.
In the fall of 2000, Hartmut Goebel took over maintaining the code. The first subsequent public release announcement that I can find is "decompyle - A byte-code-decompiler version 2.2 beta 1".
From the CHANGES file found in the tarball for that release, it appears that Hartmut did most of the work to get this code to accept the full Python language. He added precedence to the table specifiers, support for multiple versions of Python, and pretty-printing of docstrings, lists, and hashes. He also wrote test and verification routines for deparsed bytecode, and used them in an extensive set of tests that he also wrote. With these he could verify against the entire Python library.
decompyle2.2 was packaged for Debian (sarge) by Ben Burton around 2002. Because it still supported only Python 2.2 long after Python 2.3 and 2.4 were in widespread use, it was removed.
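One way to picture that kind of verification is a round-trip check: compile the original source and the deparsed source, then compare the resulting bytecode. The sketch below is my own assumption of how such a check might look using only the standard library; the function names `code_equivalent` and `same_bytecode` are hypothetical and are not taken from Hartmut's routines.

```python
import types


def code_equivalent(a, b):
    """Compare two code objects by instructions, names, and constants.

    Line-number information is deliberately ignored so that layout
    differences in the source do not count as mismatches.
    """
    if a.co_code != b.co_code:
        return False
    if a.co_names != b.co_names or a.co_varnames != b.co_varnames:
        return False
    if len(a.co_consts) != len(b.co_consts):
        return False
    for x, y in zip(a.co_consts, b.co_consts):
        if isinstance(x, types.CodeType) and isinstance(y, types.CodeType):
            # Nested functions and classes: compare recursively.
            if not code_equivalent(x, y):
                return False
        elif type(x) is not type(y) or x != y:
            return False
    return True


def same_bytecode(original_source, deparsed_source):
    """Return True if both source strings compile to equivalent bytecode."""
    return code_equivalent(
        compile(original_source, "<original>", "exec"),
        compile(deparsed_source, "<deparsed>", "exec"),
    )
```

For example, `same_bytecode("y = (a+b)\n", "y = a + b\n")` holds even though the two texts differ, while `same_bytecode("x = 1\n", "x = 2\n")` does not.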
Crazy Compilers offers a byte-code decompiler service for versions of Python up to 2.6. As someone who has worked on compilers, I know it is tough to make a living working on them. (For example, based on John Aycock's recent papers, it doesn't look like he's done anything compiler-wise since SPARK.) So I hope people will use the crazy-compilers service. I wish them the success that his good work deserves.
Looking at the code, I also see that Dan Pascu did a bit of work around 2005 on the Python scanner, parser, and marshaling routines. For example, I see a bit of code that massages disassembly output to make it more amenable to deparsing. 2005 would put his work around the Python 2.4 releases.
Next we get to "uncompyle" and PyPI and the era of git repositories. In contrast to decompyle, this now runs only on Python 2.7 although it accepts bytecode back to Python 2.5. Thomas Grainger is the package owner of this, although Hartmut is listed as the author.
The project exists not only on GitHub but also on Bitbucket, where the git history goes back to 2009. Somewhere in there the name was changed from "decompyle" to "uncompyle".
The name Thomas Grainger isn't found in (m)any of the commits in the several years of active development. Guenther Starnberger, Keknehv, hamled, and Eike Siewertsen are the principal committers here.
This project, uncompyle6, however, owes its existence to uncompyle2 by Myst herie (Mysterie), whose first commit seems to go back to 2012; it is also based on Hartmut's code. I chose it because it seems to have been the one most actively worked on recently.
Over the many years, code styles and Python features have changed. However brilliant the code was and still is, it hasn't really had a single public active maintainer. And there have been many forks of the code.
That it has been in need of an overhaul was recognized by Hartmut a decade and a half ago:
decompyle/uncompile __init__.py
NB. This is not a masterpiece of software, but became more like a hack.
Probably a complete rewrite would be sensefull. hG/2000-12-27
Lastly, I should mention unpyc and most especially pycdc, largely by Michael Hansen and Darryl Pogue. If they supported getting source-code fragments and I could call it from Python, I'd probably ditch this and use that. From what I've seen, the code runs blindingly fast and spans all versions of Python.
Tests for the project have been, or are being, culled from all of the projects mentioned.
NB. If you find mistakes, want corrections, or want your name added (or removed), please contact me.