Update documentation

This commit is contained in:
rocky
2018-06-12 13:13:48 -04:00
parent c01eb554ed
commit 7e172b63d1

View File

@@ -29,11 +29,11 @@ CPython bytecode decompilers is the ability to deparse just
*fragments* of source code and give source-code information around a
given bytecode offset.
I use the tree fragments to deparse fragments of code inside my
trepan_ debuggers_. For that, bytecode offsets are recorded and
associated with fragments of the source code. This purpose, although
compatible with the original intention, is yet a little bit different.
See this_ for more information.
I use the tree fragments to deparse fragments of code *at run time*
inside my trepan_ debuggers_. For that, bytecode offsets are recorded
and associated with fragments of the source code. This purpose,
although compatible with the original intention, is yet a little bit
different. See this_ for more information.
Python fragment deparsing given an instruction offset is useful in
showing stack traces and can be encorporated into any program that
@@ -58,7 +58,7 @@ provides decompilation for subset of Python versions, we generally do
demonstrably better for those as well.
How can we tell? By taking Python bytecode that comes distributed with
that version of Python and decompiling these. Among htose that
that version of Python and decompiling these. Among those that
successfully decompile, we can then make sure the resulting programs
are syntactically correct by running the Python interpreter for that
bytecode version. Finally, in cases where the program has a test for
@@ -151,12 +151,12 @@ for that bytecode version. Having done this the bytecode produced
could be compared with the original bytecode. However as Python's code
generation got better, this is no longer feasible.
There is a kind of *weak verification* that we use that doesn't check
bytecode for equivalence but does check to see if the resulting
decompiled source is a valid Python program by running the Python
interpreter. Because the Python language has changed so much, for best
results you should use the same Python version in checking as was used
in creating the bytecode.
There verification that we use that doesn't check bytecode for
equivalence but does check to see if the resulting decompiled source
is a valid Python program by running the Python interpreter. Because
the Python language has changed so much, for best results you should
use the same Python version in checking as was used in creating the
bytecode.
There are however an interesting class of these programs that is
readily available give stronger verification: those programs that
@@ -174,19 +174,21 @@ that era was minimal)
There is some work to do on the lower end Python versions which is
more difficult for us to handle since we don't have a Python
interpreter for versions 1.5, 1.6, and 2.0.
interpreter for versions 1.6, and 2.0.
In the Python 3 series, Python support is is strongest around 3.4 or
3.3 and drops off as you move further away from those versions. Python
3.6 changes things drastically by using word codes rather than byte
codes. As a result, the jump offset field in a jump instruction
argument has been reduced. This makes the `EXTENDED_ARG` instructions
are now more prevalent in jump instruction; previously they had been
rare. Perhaps to compensate for the additional `EXTENDED_ARG`
instructions, additional jump optimization has been added. So in sum
handling control flow by ad hoc means as is currently done is worse.
3.0 is weird in that it in some ways resembles 2.6 more than it does
3.1 or 2.7. Python 3.6 changes things drastically by using word codes
rather than byte codes. As a result, the jump offset field in a jump
instruction argument has been reduced. This makes the `EXTENDED_ARG`
instructions are now more prevalent in jump instruction; previously
they had been rare. Perhaps to compensate for the additional
`EXTENDED_ARG` instructions, additional jump optimization has been
added. So in sum handling control flow by ad hoc means as is currently
done is worse.
Also, between Python 3.5, 3.6 and 3.7 there have been major changes to the
Between Python 3.5, 3.6 and 3.7 there have been major changes to the
`MAKE_FUNCTION` and `CALL_FUNCTION` instructions.
Currently not all Python magic numbers are supported. Specifically in
@@ -212,11 +214,12 @@ There is lots to do, so please dig in and help.
See Also
--------
* https://github.com/zrax/pycdc : supports all versions of Python and is written in C++. Support for Python 3 is a bit lacking though.
* https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique than what is used here.
* https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Includes some fixes like supporting function annotations
* The HISTORY_ file.
* https://github.com/zrax/pycdc : purports to support all versions of Python. It is written in C++ and is most accurate for Python versions around 2.7 and 3.3 when the code was more actively developed. Accuracy for more recent versions of Python 3 and early versions of Python are especially lacking. See its `issue tracker <https://github.com/zrax/pycdc/issues>`_ for details. Currently lightly maintained.
* https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique than what is used here. Currently unmaintained.
* https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Includes some fixes like supporting function annotations. Currently unmaintained.
* https://github.com/wibiti/uncompyle2 : supports Python 2.7 only, but does that fairly well. Because of it specificity it can sometimes to better than uncompyle6 which we can't do withouth breaking other 2.7 cases. Currently lightly maintained. See its issue `tracker <https://github.com/wibiti/uncompyle2/issues>`_ for more details
* `How to report a bug <https://github.com/rocky/python-uncompyle6/blob/master/HOW-TO-REPORT-A-BUG.md>`_
* The HISTORY_ file.
* https://github.com/rocky/python-xdis : Cross Python version disassembler
* https://github.com/rocky/python-xasm : Cross Python version assembler
* https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents which describe the code and aspects of it in more detail