mirror of
https://github.com/rocky/python-uncompyle6.git
synced 2025-08-03 00:45:53 +08:00
Bug in for loop with try. Add more of 2.7's COME_FROM statements.
spark.py: add tracing reduce rules. main: reduce cutsines. Start history
109	HISTORY.md	Normal file
@@ -0,0 +1,109 @@
This project has a history of over 17 years, spanning back to Python 1.5.

There have been a number of people who have worked on this. I am awed
by the amount of work, the number of people who have contributed to this,
and the cleverness in the code.

The below is an annotated history from my reading of the sources cited.
In 1998, John Aycock first wrote a grammar parser in Python,
eventually called SPARK, that was usable inside a Python program. This
code was described in a paper at the
[7th International Python Conference](http://legacy.python.org/workshops/1998-11/proceedings/papers/aycock-little/aycock-little.html). That
paper doesn't talk about decompilation, nor did John have that in mind
at the time. It does mention that a full parser for Python (rather
than the simple languages in the paper) was being considered.

[This page](http://pages.cpsc.ucalgary.ca/~aycock/spark/content.html#contributors)
contains a list of the people acknowledged in developing SPARK. What's amazing
about this code is that it is reasonably fast and has survived up to
Python 3 with relatively little change. This work was done in
conjunction with his Ph.D. thesis, which was finished around 2001. In
working on his thesis, John realized SPARK could be used to deparse
Python bytecode. In the fall of 1999, he started writing the Python
program "decompyle" to do this.

This code introduced another clever idea: table-driven
semantic routines, using format specifiers.
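The table-driven idea can be sketched as follows. This is a minimal illustration of the technique, not decompyle's actual tables: the node types and format strings below are hypothetical.

```python
# Sketch of table-driven semantic routines using format specifiers:
# instead of one hand-written routine per AST node type, a table maps
# node types to template strings, and a single engine fills them in.
# The node types and templates here are made up for illustration.

TABLE = {
    'binary_op': '(%s %s %s)',   # left, operator, right
    'assign':    '%s = %s',      # target, value
}

def to_source(node):
    """Render a (type, children...) tuple using the template table."""
    kind, *children = node
    rendered = [c if isinstance(c, str) else to_source(c) for c in children]
    return TABLE[kind] % tuple(rendered)

tree = ('assign', 'x', ('binary_op', 'a', '+', 'b'))
print(to_source(tree))   # x = (a + b)
```

The appeal of this design is that adding support for a new construct usually means adding one table entry rather than writing a new traversal routine.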
The last mention of a release of SPARK from John is around 2002.

In the fall of 2000, Hartmut Goebel
[took over maintaining the code](https://groups.google.com/forum/#!searchin/comp.lang.python/hartmut$20goebel/comp.lang.python/35s3mp4-nuY/UZALti6ujnQJ). The
first subsequent public release announcement that I can find is
["decompyle - A byte-code-decompiler version 2.2 beta 1"](https://mail.python.org/pipermail/python-announce-list/2002-February/001272.html).

From the CHANGES file found in
[the tarball for that release](http://old-releases.ubuntu.com/ubuntu/pool/universe/d/decompyle2.2/decompyle2.2_2.2beta1.orig.tar.gz),
it appears that Hartmut did most of the work to get this code to
accept the full Python language. He added precedence to the table
specifiers, support for multiple versions of Python, and the
pretty-printing of docstrings, lists, and hashes. He also wrote
extensive tests and routines for the testing and verification of
decompiled bytecode.

decompyle 2.2 was packaged for Debian (sarge) by
[Ben Burton around 2002](https://packages.qa.debian.org/d/decompyle.html). As
it worked on Python 2.2 only, long after Python 2.3 and 2.4 were in
widespread use, it was removed.

[Crazy Compilers](http://www.crazy-compilers.com/decompyle/) offers a
byte-code decompiler service for versions of Python up to 2.6. As
someone who has worked on compilers, I know it is tough to make a
living by working on them. (For example, based on
[John Aycock's recent papers](http://pages.cpsc.ucalgary.ca/~aycock/),
it doesn't look like he's done anything compiler-wise since SPARK.) So
I hope people will use the crazy-compilers service. I wish them the
success that this good work deserves.
Next we get to
["uncompyle" and PyPI](https://pypi.python.org/pypi/uncompyle/1.1) and
the era of git repositories. In contrast to decompyle, this now runs
only on Python 2.7, although it accepts bytecode back to Python
2.5. Thomas Grainger is the package owner, although Hartmut is
listed as the author.

The project exists not only on
[github](https://github.com/gstarnberger/uncompyle) but also on
[bitbucket](https://bitbucket.org/gstarnberger/uncompyle), where the
git history goes back to 2009. Somewhere in there the name was changed
from "decompyle" to "uncompyle".

The name Thomas Grainger isn't found in (m)any of the commits in the
several years of active development. Guenther Starnberger, Keknehv,
hamled, and Eike Siewertsen are the principal committers here.
This project, uncompyle6, however, owes its existence to uncompyle2 by
Mysterie, whose first commit seems to go back to 2012; it is also
based on Hartmut's code. I chose this as it seemed to be the fork
most actively worked on most recently.

Over the many years, code styles and Python features have
changed. However brilliant the code was and still is, it hasn't really
had a single public active maintainer. And there have been many forks
of the code.

That it has been in need of an overhaul was recognized by Hartmut
a decade and a half ago, in
[decompyle/uncompyle/__init__.py](https://github.com/gstarnberger/uncompyle/blob/master/uncompyle/__init__.py#L25-L26):

> NB. This is not a masterpiece of software, but became more like a hack.
> Probably a complete rewrite would be sensefull. hG/2000-12-27
One of the attempts to modernize it and make it available for Python 3
is [the one by Anton Vorobyov (DarkFenX)](https://github.com/DarkFenX/uncompyle3). I've
followed some of the ideas there in this project.

Lastly, I should mention [unpyc3](https://code.google.com/p/unpyc3/)
and most especially [pycdc](https://github.com/zrax/pycdc), largely by
Michael Hansen and Darryl Pogue. If they supported getting source-code
fragments and I could call it from Python, I'd probably ditch this and
use that. From what I've seen, the code runs blindingly fast and spans
all versions of Python.

Tests for the project have been, or are being, culled from all of the
projects mentioned.

NB. If you find mistakes, want corrections, or want your name added (or removed),
please contact me.
BIN	test/bytecode_3.4/10_for.pyc	Normal file (binary file not shown)
BIN	test/bytecode_3.4/20_try_except.pyc	Normal file (binary file not shown)
@@ -1,8 +1,11 @@
 Files in this directory contain very simnple constructs that work
 across all versions of Python.
 
-Their simnplicity is to try to make it easier to debug grammar
-and AST walking routines.
+Their simplicity is to try to make it easier to debug scanner, grammar
+and semantic-action routines.
 
+We also try to make the code here runnable by Python and when run should
+not produce an error.
+
 The numbers in the filenames are to assist running the programs from
 the simplest to more complex. For example, many tests have assignment
5	test/simple_source/exception/20_try_except.py	Normal file
@@ -0,0 +1,5 @@
+for i in (1,2):
+    try:
+        x = 1
+    except ValueError:
+        y = 2
@@ -1,5 +1,8 @@
 # Tests:
 # forstmt ::= SETUP_LOOP expr _for designator
 #             for_block POP_BLOCK COME_FROM
-for a in b:
-    c = d
+for a in [1]:
+    c = 2
+
+for a in range(2):
+    c = 2
@@ -117,7 +117,7 @@ def main(in_base, out_base, files, codes, outfile=None,
             outstream = _get_outstream(outfile)
             # print(outfile, file=sys.stderr)
 
-        # try to decomyple the input file
+        # Try to uncmpile the input file
        try:
            uncompyle_file(infile, outstream, showasm, showast)
            tot_files += 1
@@ -136,8 +136,8 @@ def main(in_base, out_base, files, codes, outfile=None,
                    outstream.close()
                    os.rename(outfile, outfile + '_failed')
                else:
-                    sys.stderr.write("\n# Can't uncompyle %s\n" % infile)
-            else: # uncompyle successfull
+                    sys.stderr.write("\n# Can't uncompile %s\n" % infile)
+            else: # uncompile successfull
                if outfile:
                    outstream.close()
                if do_verify:
@@ -145,7 +145,7 @@ def main(in_base, out_base, files, codes, outfile=None,
                        msg = verify.compare_code_with_srcfile(infile, outfile)
                        if not outfile:
                            if not msg:
-                                print('\n# okay decompyling %s' % infile)
+                                print('\n# okay decompiling %s' % infile)
                                okay_files += 1
                            else:
                                print('\n# %s\n\t%s', infile, msg)
@@ -158,7 +158,7 @@ def main(in_base, out_base, files, codes, outfile=None,
                else:
                    okay_files += 1
                    if not outfile:
-                        mess = '\n# okay decompyling'
+                        mess = '\n# okay decompiling'
                        # mem_usage = __memUsage()
                        print(mess, infile)
            if outfile:
@@ -43,7 +43,13 @@ def jabs_op(name, op):
     hasjabs.append(op)
 
 def updateGlobal():
-    # JUMP_OPs are used in verification
+    # JUMP_OPs are used in verification and in the scanner in resolving forward/backward
+    # jumps
+    globals().update({'PJIF': opmap['POP_JUMP_IF_FALSE']})
+    globals().update({'PJIT': opmap['POP_JUMP_IF_TRUE']})
+    globals().update({'JA': opmap['JUMP_ABSOLUTE']})
+    globals().update({'JF': opmap['JUMP_FORWARD']})
+    globals().update(dict([(k.replace('+','_'),v) for (k,v) in opmap.items()]))
     globals().update({'JUMP_OPs': map(lambda op: opname[op], hasjrel + hasjabs)})
 
 # Instruction opcodes for compiled code
@@ -44,8 +44,8 @@ class _State:
         self.T, self.complete, self.items = [], [], items
         self.stateno = stateno
 
-# DEFAULT_DEBUG = {'rules': True, 'transition': False}
-DEFAULT_DEBUG = {'rules': False, 'transition': False}
+# DEFAULT_DEBUG = {'rules': True, 'transition': True, 'reduce' : True}
+DEFAULT_DEBUG = {'rules': False, 'transition': False, 'reduce': False}
 class GenericParser:
     '''
     An Earley parser, as per J. Earley, "An Efficient Context-Free
@@ -450,6 +450,8 @@ class GenericParser:
 
         for rule in self.states[state].complete:
             lhs, rhs = rule
+            if self.debug['reduce']:
+                print("%s ::= %s" % (lhs, ' '.join(rhs)))
             for pitem in sets[parent]:
                 pstate, pparent = pitem
                 k = self.goto(pstate, lhs)
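The 'reduce' tracing enabled above prints each grammar rule as it is reduced, in the form `lhs ::= rhs ...`. A minimal standalone sketch of the idea, using hypothetical rule data rather than SPARK's actual parser state:

```python
# Sketch of rule-reduction tracing as controlled by the new
# DEFAULT_DEBUG['reduce'] flag: when the flag is set, each completed
# grammar rule (lhs, rhs) is printed as "lhs ::= rhs ...".
# The rules below are hypothetical examples, not uncompyle6's grammar.

debug = {'rules': False, 'transition': False, 'reduce': True}

completed_rules = [
    ('expr', ('LOAD_CONST',)),
    ('stmt', ('expr', 'STORE_NAME')),
]

traced = []
for lhs, rhs in completed_rules:
    if debug['reduce']:
        line = "%s ::= %s" % (lhs, ' '.join(rhs))
        traced.append(line)
        print(line)
```

Keeping the flag in a dict rather than hard-coding the print lets tracing be turned on per parser instance without touching the reduction loop.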
@@ -29,7 +29,6 @@ globals().update(dis.opmap)
 
 from uncompyle6.opcodes.opcode_34 import *
 
-
 import uncompyle6.scanner as scan
 
 
@@ -60,21 +59,22 @@ class Scanner34(scan.Scanner):
         bytecode = dis.Bytecode(co)
 
         # self.lines contains (block,addrLastInstr)
-        # if classname:
-        #     classname = '_' + classname.lstrip('_') + '__'
-
-        #     def unmangle(name):
-        #         if name.startswith(classname) and name[-2:] != '__':
-        #             return name[len(classname) - 2:]
-        #         return name
+        if classname:
+            classname = '_' + classname.lstrip('_') + '__'
+
+            def unmangle(name):
+                if name.startswith(classname) and name[-2:] != '__':
+                    return name[len(classname) - 2:]
+                return name
 
         #     free = [ unmangle(name) for name in (co.co_cellvars + co.co_freevars) ]
         #     names = [ unmangle(name) for name in co.co_names ]
         #     varnames = [ unmangle(name) for name in co.co_varnames ]
-        # else:
+        else:
         #     free = co.co_cellvars + co.co_freevars
         #     names = co.co_names
         #     varnames = co.co_varnames
+            pass
 
         # Scan for assertions. Later we will
         # turn 'LOAD_GLOBAL' to 'LOAD_ASSERT' for those
@@ -439,6 +439,33 @@ class Scanner34(scan.Scanner):
         target += offset + 3
         return target
 
+    def next_except_jump(self, start):
+        """
+        Return the next jump that was generated by an except SomeException:
+        construct in a try...except...else clause or None if not found.
+        """
+
+        if self.code[start] == DUP_TOP:
+            except_match = self.first_instr(start, len(self.code), POP_JUMP_IF_FALSE)
+            if except_match:
+                jmp = self.prev_op[self.get_target(except_match)]
+                self.ignore_if.add(except_match)
+                self.not_continue.add(jmp)
+                return jmp
+
+        count_END_FINALLY = 0
+        count_SETUP_ = 0
+        for i in self.op_range(start, len(self.code)):
+            op = self.code[i]
+            if op == END_FINALLY:
+                if count_END_FINALLY == count_SETUP_:
+                    assert self.code[self.prev_op[i]] in (JUMP_ABSOLUTE, JUMP_FORWARD, RETURN_VALUE)
+                    self.not_continue.add(self.prev_op[i])
+                    return self.prev_op[i]
+                count_END_FINALLY += 1
+            elif op in (SETUP_EXCEPT, SETUP_WITH, SETUP_FINALLY):
+                count_SETUP_ += 1
+
     def detect_structure(self, offset):
         """
         Detect structures and their boundaries to fix optimizied jumps
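The END_FINALLY/SETUP_ counting in next_except_jump above is a balanced-delimiter scan: SETUP_* opcodes open nested regions and END_FINALLY closes them, so the END_FINALLY that balances the count marks the boundary. A simplified standalone sketch, using symbolic opcode names rather than real bytecode:

```python
# Simplified sketch of the balanced counting used in next_except_jump:
# walk forward counting SETUP_* opcodes as "opens" and END_FINALLY as
# "closes"; the END_FINALLY seen when the counts are equal is the one
# that closes the region starting at `start`.
# The opcode stream is symbolic, for illustration only.

def find_matching_end(ops, start):
    """Index of the END_FINALLY that closes the region starting at start."""
    count_end = 0
    count_setup = 0
    for i in range(start, len(ops)):
        op = ops[i]
        if op == 'END_FINALLY':
            if count_end == count_setup:
                return i
            count_end += 1
        elif op in ('SETUP_EXCEPT', 'SETUP_WITH', 'SETUP_FINALLY'):
            count_setup += 1
    return None

ops = ['DUP_TOP', 'SETUP_FINALLY', 'POP_TOP',
       'END_FINALLY', 'POP_TOP', 'END_FINALLY']
print(find_matching_end(ops, 0))   # 5: the first END_FINALLY closes the nested SETUP_FINALLY
```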
@@ -459,8 +486,51 @@ class Scanner34(scan.Scanner):
                     start = curent_start
                     end = curent_end
                     parent = struct
+                    pass
 
-        if op in (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE):
+        if op == SETUP_EXCEPT:
+            start = offset + 3
+            target = self.get_target(offset)
+            end = self.restrict_to_parent(target, parent)
+            if target != end:
+                self.fixed_jumps[pos] = end
+                # print target, end, parent
+            # Add the try block
+            self.structs.append({'type': 'try',
+                                 'start': start,
+                                 'end': end-4})
+            # Now isolate the except and else blocks
+            end_else = start_else = self.get_target(self.prev_op[end])
+
+            # Add the except blocks
+            i = end
+            while self.code[i] != END_FINALLY:
+                jmp = self.next_except_jump(i)
+                if self.code[jmp] == RETURN_VALUE:
+                    self.structs.append({'type': 'except',
+                                         'start': i,
+                                         'end': jmp+1})
+                    i = jmp + 1
+                else:
+                    if self.get_target(jmp) != start_else:
+                        end_else = self.get_target(jmp)
+                    if self.code[jmp] == JUMP_FORWARD:
+                        self.fixed_jumps[jmp] = -1
+                    self.structs.append({'type': 'except',
+                                         'start': i,
+                                         'end': jmp})
+                    i = jmp + 3
+
+            # Add the try-else block
+            if end_else != start_else:
+                r_end_else = self.restrict_to_parent(end_else, parent)
+                self.structs.append({'type': 'try-else',
+                                     'start': i+1,
+                                     'end': r_end_else})
+                self.fixed_jumps[i] = r_end_else
+            else:
+                self.fixed_jumps[i] = i+1
+        elif op in (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE):
             start = offset + self.op_size(op)
             target = self.get_target(offset)
             rtarget = self.restrict_to_parent(target, parent)