mirror of
https://github.com/rocky/python-uncompyle6.git
synced 2025-08-03 00:45:53 +08:00
update history
75
HISTORY.md
@@ -64,14 +64,17 @@ success that his good work deserves.
|
|||||||
Dan Pascu did a bit of work from late 2004 to early 2006 to get this
|
Dan Pascu did a bit of work from late 2004 to early 2006 to get this
|
||||||
code to handle first Python 2.3 and then 2.4 bytecodes. Because of
|
code to handle first Python 2.3 and then 2.4 bytecodes. Because of
|
||||||
jump optimization introduced in the CPython bytecode compiler at that
|
jump optimization introduced in the CPython bytecode compiler at that
|
||||||
time, various JUMP instructions were classifed as going backwards, and
|
time, various JUMP instructions were classified to assist parsing. For
|
||||||
COME FROM instructions were reintroduced. See
|
example, due to the way that code generation and line number table
|
||||||
|
work, jump instructions to an earlier offset must be looping jumps,
|
||||||
|
such as those found in a "continue" statement; "COME FROM"
|
||||||
|
instructions were reintroduced. See
|
||||||
[RELEASE-2.4-CHANGELOG.txt](https://github.com/rocky/python-uncompyle6/blob/master/DECOMPYLE-2.4-CHANGELOG.txt)
|
[RELEASE-2.4-CHANGELOG.txt](https://github.com/rocky/python-uncompyle6/blob/master/DECOMPYLE-2.4-CHANGELOG.txt)
|
||||||
for more details here. There wasn't a public
|
for more details here. There wasn't a public release of RELEASE-2.4
|
||||||
release of RELEASE-2.4 and bytecodes other than Python 2.4 weren't
|
and bytecodes other than Python 2.4 weren't supported. Dan says the
|
||||||
supported. Dan says the Python 2.3 version could verify the entire
|
Python 2.3 version could verify the entire Python library. But given
|
||||||
Python library. But given subsequent bugs found like simply
|
subsequent bugs found like simply recognizing complex-number constants
|
||||||
recognizing complex-number constants in bytecode, decompilation wasn't perfect.
|
in bytecode, decompilation wasn't perfect.
|
||||||
|
|
||||||
Next we get to ["uncompyle" and
|
Next we get to ["uncompyle" and
|
||||||
PyPI](https://pypi.python.org/pypi/uncompyle/1.1) and the era of
|
PyPI](https://pypi.python.org/pypi/uncompyle/1.1) and the era of
|
||||||
@@ -109,18 +112,21 @@ Given this, perhaps it is not surprising that subsequent changes
|
|||||||
tended to shy away from using the built-in compiler technology
|
tended to shy away from using the built-in compiler technology
|
||||||
mechanisms and addressed problems and extensions by some other means.
|
mechanisms and addressed problems and extensions by some other means.
|
||||||
|
|
||||||
Specifically, in `uncompyle`, decompilation of python bytecode 2.5 & 2.6
|
Specifically, in `uncompyle`, decompilation of Python bytecode 2.5 &
|
||||||
is done by transforming the byte code into a pseudo-2.7 Python
|
2.6 is done by transforming the byte code into a pseudo-2.7 Python
|
||||||
bytecode and is based on code from Eloi Vanderbeken.
|
bytecode and is based on code from Eloi Vanderbeken. A bit of this
|
||||||
|
could have been easily added by modifying grammar rules.
|
||||||
|
|
||||||
This project, `uncompyle6`, abandons that approach for various
|
This project, `uncompyle6`, abandons that approach for various
|
||||||
reasons. Having a grammar per Python version is much cleaner and it
|
reasons. Having a grammar per Python version is much cleaner and it
|
||||||
scales indefinitely. That said, we don't have entire copies of the
|
scales indefinitely. That said, we don't have entire copies of the
|
||||||
grammar, but work off of differences from some neighboring version.
|
grammar, but work off of differences from some neighboring version.
|
||||||
And this too I find helpful. Should there be a desire to rebase or
|
|
||||||
start a new base version to work off of, say for some future Python
|
Should there be a desire to rebase or start a new base version to work
|
||||||
version, that can be done by dumping a grammar for a specific version
|
off of, say for some future Python version, that can be done by
|
||||||
after it has been loaded incrementally.
|
dumping a grammar for a specific version after it has been loaded
|
||||||
|
incrementally. You can get a full dump of the grammar by profiling the
|
||||||
|
grammar on a large body of Python source code.
|
||||||
|
|
||||||
Another problem with pseudo-2.7 bytecode is that we need offsets
|
Another problem with pseudo-2.7 bytecode is that we need offsets
|
||||||
in fragment deparsing to be exactly the same as the bytecode; the
|
in fragment deparsing to be exactly the same as the bytecode; the
|
||||||
@@ -163,24 +169,26 @@ Hartmut a decade an a half ago:
|
|||||||
This project deparses using an Earley-algorithm parse with lots of
|
This project deparses using an Earley-algorithm parse with lots of
|
||||||
massaging of tokens and the grammar in the scanner
|
massaging of tokens and the grammar in the scanner
|
||||||
phase. Earley-algorithm parsers are context free and tend to be linear
|
phase. Earley-algorithm parsers are context free and tend to be linear
|
||||||
if the grammar is LR or left recursive.
|
if the grammar is LR or left recursive. There is a technique for
|
||||||
|
improving LL right recursion, but our parser doesn't have that yet.
|
||||||
|
|
||||||
Another approach that doesn't use grammars is to do something like
|
Another approach to decompiling, and one that doesn't use grammars, is
|
||||||
simulate execution symbolically and build expression trees off of
|
to do something like simulate execution symbolically and build
|
||||||
stack results. Control flow in that approach still needs to be handled
|
expression trees off of stack results. Control flow in that approach
|
||||||
somewhat ad hoc. The two important projects that work this way are
|
still needs to be handled somewhat ad hoc. The two important projects
|
||||||
[unpyc3](https://code.google.com/p/unpyc3/) and most especially
|
that work this way are [unpyc3](https://code.google.com/p/unpyc3/) and
|
||||||
[pycdc](https://github.com/zrax/pycdc) The latter project is largely
|
most especially [pycdc](https://github.com/zrax/pycdc). The latter
|
||||||
by Michael Hansen and Darryl Pogue. If they supported getting
|
project is largely by Michael Hansen and Darryl Pogue. If they
|
||||||
source-code fragments, did a better job in supporting Python more
|
supported getting source-code fragments, did a better job in
|
||||||
fully, and had a way I could call it from Python, I'd probably would
|
supporting Python more fully, and had a way I could call it from
|
||||||
have ditched this and used that. The code runs blindingly fast and
|
Python, I probably would have ditched this and used that. The code
|
||||||
spans all versions of Python, although more recently Python 3 support
|
runs blindingly fast and spans all versions of Python, although more
|
||||||
has been lagging. The code is impressive for its smallness given that
|
recently Python 3 support has been lagging. The code is impressive for
|
||||||
it covers many versions of Python. However, I think it has reached a
|
its smallness given that it covers many versions of Python. However, I
|
||||||
scalability issue, same as all the other efforts. For it to handle
|
think it has reached a scalability issue, same as all the other
|
||||||
Python versions more accurately, I think it will need to have a lot
|
efforts. To handle Python versions more accurately, I think that code
|
||||||
more code specially which specialize for Python versions.
|
base will need to have a lot more code that specializes for
|
||||||
|
Python versions. And then it will run into a modularity problem.
|
||||||
|
|
||||||
Tests for the project have been, or are being, culled from all of the
|
Tests for the project have been, or are being, culled from all of the
|
||||||
projects mentioned. Quite a few have been added to improve grammar
|
projects mentioned. Quite a few have been added to improve grammar
|
||||||
@@ -190,11 +198,12 @@ If you think, as I am sure will happen in the future, "hey, I can just
|
|||||||
write a decompiler from scratch and not have to deal with all of
|
write a decompiler from scratch and not have to deal with all of
|
||||||
the complexity here", think again. What is likely to happen is that
|
the complexity here", think again. What is likely to happen is that
|
||||||
you'll get at best a 90% solution working for a single Python release
|
you'll get at best a 90% solution working for a single Python release
|
||||||
that will be obsolete in about a year, and more obsolute each
|
that will be obsolete in about a year, and more obsolete each
|
||||||
subsequent year. Writing a decompiler for Python gets harder as
|
subsequent year. Writing a decompiler for Python gets harder as
|
||||||
Python progresses, so writing one for Python 3.7 isn't as easy as it
|
Python progresses, so writing one for Python 3.7 isn't as easy as it
|
||||||
was for Python 2.2. That said, if you still feel you want to write a
|
was for Python 2.2. That said, if you still feel you want to write a
|
||||||
single version decompiler, talk to me. I may have some ideas.
|
single version decompiler, look at the test cases in this project and
|
||||||
|
talk to me. I may have some ideas.
|
||||||
|
|
||||||
|
|
||||||
For a little bit of the history of changes to the Earley-algorithm parser,
|
For a little bit of the history of changes to the Earley-algorithm parser,
|
||||||
|
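
As a rough illustration of the grammar-free approach mentioned above (simulate execution symbolically and build expression trees off of stack results), here is a minimal sketch. The instruction tuples and the `symbolic_execute` helper are hypothetical, loosely modeled on CPython 2.x opcode names. This is not how unpyc3 or pycdc are actually implemented, and it handles only straight-line code, which is exactly the part that is easy; control flow is where things get ad hoc.

```python
# Toy instruction stream: (opname, argument) pairs loosely modeled on
# CPython 2.x opcode names. This is an assumption for illustration and
# is NOT real `dis` output.
BINARY_OPS = {
    "BINARY_ADD": "+",
    "BINARY_SUBTRACT": "-",
    "BINARY_MULTIPLY": "*",
}


def symbolic_execute(instructions):
    """Return source-like statements for straight-line code.

    Instead of pushing run-time values, we push source-text fragments
    onto the stack, so each operator builds a larger expression.
    """
    stack = []
    statements = []
    for opname, arg in instructions:
        if opname == "LOAD_NAME":
            stack.append(arg)
        elif opname == "LOAD_CONST":
            stack.append(repr(arg))
        elif opname in BINARY_OPS:
            # The right operand was pushed last, so it is popped first.
            right = stack.pop()
            left = stack.pop()
            stack.append("(%s %s %s)" % (left, BINARY_OPS[opname], right))
        elif opname == "STORE_NAME":
            statements.append("%s = %s" % (arg, stack.pop()))
        else:
            raise ValueError("unhandled opcode: %s" % opname)
    return statements


program = [
    ("LOAD_NAME", "a"),
    ("LOAD_NAME", "b"),
    ("LOAD_CONST", 2),
    ("BINARY_MULTIPLY", None),
    ("BINARY_ADD", None),
    ("STORE_NAME", "x"),
]
print(symbolic_execute(program))  # ['x = (a + (b * 2))']
```

Every new opcode or version-specific quirk becomes another branch in that `if` chain, which hints at why this style runs into the scalability and modularity problems described above.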