mirror of
https://github.com/rocky/python-uncompyle6.git
Bug in for loop with try. Add more of 2.7's COME_FROM statements.
spark.py: add tracing of reduce rules. main: reduce cuteness. Start history.
109
HISTORY.md
Normal file
@@ -0,0 +1,109 @@
+This project has a history of over 17 years, going back to Python 1.5.
+
+There have been a number of people who have worked on this. I am awed
+by the amount of work, the number of people who have contributed to this,
+and the cleverness in the code.
+
+What follows is an annotated history from my reading of the sources cited.
+
+In 1998, John Aycock first wrote a grammar parser in Python,
+eventually called SPARK, that was usable inside a Python program. This
+code was described at the
+[7th International Python Conference](http://legacy.python.org/workshops/1998-11/proceedings/papers/aycock-little/aycock-little.html). That
+paper doesn't talk about decompilation, nor did John have that in mind
+at that time. It does mention that a full parser for Python (rather
+than the simple languages in the paper) was being considered.
+
+[This](http://pages.cpsc.ucalgary.ca/~aycock/spark/content.html#contributors)
+contains a list of people acknowledged in developing SPARK. What's amazing
+about this code is that it is reasonably fast and has survived up to
+Python 3 with relatively little change. This work was done in
+conjunction with his Ph.D. thesis, which was finished around 2001. In
+working on his thesis, John realized SPARK could be used to deparse
+Python bytecode. In the fall of 1999, he started writing the Python
+program, "decompyle", to do this.
+
+This code introduced another clever idea: table-driven
+semantic routines that use format specifiers.
+
+The last mention of a release of SPARK from John is around 2002.
+
+In the fall of 2000, Hartmut Goebel
+[took over maintaining the code](https://groups.google.com/forum/#!searchin/comp.lang.python/hartmut$20goebel/comp.lang.python/35s3mp4-nuY/UZALti6ujnQJ). The
+first subsequent public release announcement that I can find is
+["decompyle - A byte-code-decompiler version 2.2 beta 1"](https://mail.python.org/pipermail/python-announce-list/2002-February/001272.html).
+
+From the CHANGES file found in
+[the tarball for that release](http://old-releases.ubuntu.com/ubuntu/pool/universe/d/decompyle2.2/decompyle2.2_2.2beta1.orig.tar.gz),
+it appears that Hartmut did most of the work to get this code to
+accept the full Python language. He added precedence to the table
+specifiers, support for multiple versions of Python, the
+pretty-printing of docstrings, lists and hashes. He also wrote
+extensive tests and routines for the testing and verification of
+decompiled bytecode.
+
+decompyle2.2 was packaged for Debian (sarge) by
+[Ben Burton around 2002](https://packages.qa.debian.org/d/decompyle.html). As
+it still worked only on Python 2.2 long after Python 2.3 and 2.4 were in
+widespread use, it was removed.
+
+[Crazy Compilers](http://www.crazy-compilers.com/decompyle/) offers a
+byte-code decompiler service for versions of Python up to 2.6. As
+someone who has worked in compilers, I know it is tough to make a living by
+working on compilers. (For example, based on
+[John Aycock's recent papers](http://pages.cpsc.ucalgary.ca/~aycock/)
+it doesn't look like he's done anything compiler-wise since SPARK). So
+I hope people will use the crazy-compilers service. I wish them the
+success that his good work deserves.
+
+Next we get to
+["uncompyle" and PyPI](https://pypi.python.org/pypi/uncompyle/1.1) and
+the era of git repositories. In contrast to decompyle, this now runs
+only on Python 2.7 although it accepts bytecode back to Python
+2.5. Thomas Grainger is the package owner of this, although Hartmut is
+listed as the author.
+
+The project exists not only on
+[github](https://github.com/gstarnberger/uncompyle) but also on
+[bitbucket](https://bitbucket.org/gstarnberger/uncompyle) where the
+git history goes back to 2009. Somewhere in there the name was changed
+from "decompyle" to "uncompyle".
+
+The name Thomas Grainger isn't found in (m)any of the commits in the
+several years of active development. Guenther Starnberger, Keknehv,
+hamled, and Eike Siewertsen are principal committers here.
+
+This project, uncompyle6, however owes its existence to uncompyle2 by
+Myst herie (Mysterie) whose first commit seems to go back to 2012;
+it is also based on Hartmut's code. I chose this as it seems to have been
+the most actively worked on most recently.
+
+Over the many years, code styles and Python features have
+changed. However brilliant the code was and still is, it hasn't really
+had a single public active maintainer. And there have been many forks
+of the code.
+
+That it has been in need of an overhaul was recognized by
+Hartmut a decade and a half ago:
+
+[decompyle/uncompile__init__.py](https://github.com/gstarnberger/uncompyle/blob/master/uncompyle/__init__.py#L25-L26)
+
+    NB. This is not a masterpiece of software, but became more like a hack.
+    Probably a complete rewrite would be sensefull. hG/2000-12-27
+
+One of the attempts to modernize it and make it available for Python3
+is [the one by Anton Vorobyov (DarkFenX)](https://github.com/DarkFenX/uncompyle3). I've
+followed some of the ideas there in this project.
+
+Lastly, I should mention [unpyc](https://code.google.com/p/unpyc3/)
+and most especially [pycdc](https://github.com/zrax/pycdc), largely by
+Michael Hansen and Darryl Pogue. If they supported getting source-code
+fragments and I could call it from Python, I'd probably ditch this and
+use that. From what I've seen, the code runs blindingly fast and spans
+all versions of Python.
+
+Tests for the project have been, or are being, culled from all of the
+projects mentioned.
+
+NB. If you find mistakes, want corrections, or want your name added (or removed),
+please contact me.
Binary file not shown.
BIN
test/bytecode_3.4/10_for.pyc
Normal file
Binary file not shown.
BIN
test/bytecode_3.4/20_try_except.pyc
Normal file
Binary file not shown.
Binary file not shown.
@@ -1,8 +1,11 @@
 Files in this directory contain very simnple constructs that work
 across all versions of Python.

-Their simnplicity is to try to make it easier to debug grammar
-and AST walking routines.
+Their simplicity is to try to make it easier to debug scanner, grammar
+and semantic-action routines.
+
+We also try to make the code here runnable by Python and when run should
+not produce an error.

 The numbers in the filenames are to assist running the programs from
 the simplest to more complex. For example, many tests have assignment
5
test/simple_source/exception/20_try_except.py
Normal file
@@ -0,0 +1,5 @@
+for i in (1,2):
+    try:
+        x = 1
+    except ValueError:
+        y = 2
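The test source above nests a try/except inside a for loop. A minimal sketch for inspecting the bytecode a scanner would have to handle is to compile the same five lines and list the opcode names with the standard `dis` module. The helper name `opcode_names` is hypothetical and not part of this commit, and the opcodes printed depend on the interpreter running the sketch: the diff names `SETUP_LOOP` and `SETUP_EXCEPT` for the Python 3.4 bytecode it targets, while newer interpreters emit different instructions.

```python
import dis

# The five-line test source from the hunk above.
SOURCE = """\
for i in (1,2):
    try:
        x = 1
    except ValueError:
        y = 2
"""

def opcode_names(source):
    """Compile `source` as a module and return its opcode names in order.

    Hypothetical helper for illustration only.
    """
    code = compile(source, "<20_try_except>", "exec")
    return [instr.opname for instr in dis.get_instructions(code)]

names = opcode_names(SOURCE)
print("\n".join(names))
```

Running this under several interpreter versions is a quick way to see how much the loop and exception-handling opcodes differ between them.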
@@ -1,5 +1,8 @@
 # Tests:
 # forstmt ::= SETUP_LOOP expr _for designator
 # for_block POP_BLOCK COME_FROM
-for a in b:
-    c = d
+for a in [1]:
+    c = 2
+
+for a in range(2):
+    c = 2
@@ -117,7 +117,7 @@ def main(in_base, out_base, files, codes, outfile=None,
         outstream = _get_outstream(outfile)
         # print(outfile, file=sys.stderr)

-        # try to decomyple the input file
+        # Try to uncmpile the input file
         try:
             uncompyle_file(infile, outstream, showasm, showast)
             tot_files += 1
@@ -136,8 +136,8 @@ def main(in_base, out_base, files, codes, outfile=None,
                 outstream.close()
                 os.rename(outfile, outfile + '_failed')
             else:
-                sys.stderr.write("\n# Can't uncompyle %s\n" % infile)
-        else: # uncompyle successfull
+                sys.stderr.write("\n# Can't uncompile %s\n" % infile)
+        else: # uncompile successfull
             if outfile:
                 outstream.close()
             if do_verify:
@@ -145,7 +145,7 @@ def main(in_base, out_base, files, codes, outfile=None,
                     msg = verify.compare_code_with_srcfile(infile, outfile)
                     if not outfile:
                         if not msg:
-                            print('\n# okay decompyling %s' % infile)
+                            print('\n# okay decompiling %s' % infile)
                             okay_files += 1
                         else:
                             print('\n# %s\n\t%s', infile, msg)
@@ -158,7 +158,7 @@ def main(in_base, out_base, files, codes, outfile=None,
             else:
                 okay_files += 1
                 if not outfile:
-                    mess = '\n# okay decompyling'
+                    mess = '\n# okay decompiling'
                     # mem_usage = __memUsage()
                     print(mess, infile)
         if outfile:
@@ -43,7 +43,13 @@ def jabs_op(name, op):
     hasjabs.append(op)

 def updateGlobal():
-    # JUMP_OPs are used in verification
+    # JUMP_OPs are used in verification and in the scanner in resolving forward/backward
+    # jumps
+    globals().update({'PJIF': opmap['POP_JUMP_IF_FALSE']})
+    globals().update({'PJIT': opmap['POP_JUMP_IF_TRUE']})
+    globals().update({'JA': opmap['JUMP_ABSOLUTE']})
+    globals().update({'JF': opmap['JUMP_FORWARD']})
+    globals().update(dict([(k.replace('+','_'),v) for (k,v) in opmap.items()]))
     globals().update({'JUMP_OPs': map(lambda op: opname[op], hasjrel + hasjabs)})

 # Instruction opcodes for compiled code
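The `updateGlobal()` change above publishes short aliases (`PJIF`, `PJIT`, `JA`, `JF`), identifier-safe opcode names, and a `JUMP_OPs` list. The following is a minimal sketch of the same idea that returns a plain dictionary instead of mutating `globals()`. The opcode table and the function name `build_opcode_names` are assumptions made for illustration: the numbers are placeholders rather than real Python 3.4 values, and `SLICE+1` is a Python 2 style name included only to show the `+` to `_` rewrite.

```python
# Hypothetical, trimmed-down opcode table. The numbers are placeholders,
# not the real values from any Python version.
opmap = {
    'POP_JUMP_IF_FALSE': 1,
    'POP_JUMP_IF_TRUE': 2,
    'JUMP_ABSOLUTE': 3,
    'JUMP_FORWARD': 4,
    'SLICE+1': 5,  # Python 2 style name containing '+'
}
opname = {number: name for name, number in opmap.items()}
hasjrel = [opmap['JUMP_FORWARD']]
hasjabs = [opmap['POP_JUMP_IF_FALSE'],
           opmap['POP_JUMP_IF_TRUE'],
           opmap['JUMP_ABSOLUTE']]

def build_opcode_names(opmap, opname, hasjrel, hasjabs):
    """Return the names that updateGlobal() would add to the module namespace."""
    # '+' is not valid in a Python identifier, so rewrite it to '_'.
    names = {key.replace('+', '_'): value for key, value in opmap.items()}
    # Short aliases for the jump opcodes the scanner refers to most often.
    names['PJIF'] = opmap['POP_JUMP_IF_FALSE']
    names['PJIT'] = opmap['POP_JUMP_IF_TRUE']
    names['JA'] = opmap['JUMP_ABSOLUTE']
    names['JF'] = opmap['JUMP_FORWARD']
    # A list rather than map(): on Python 3, map() returns a one-shot
    # iterator that is empty after the first pass over it.
    names['JUMP_OPs'] = [opname[op] for op in hasjrel + hasjabs]
    return names

ns = build_opcode_names(opmap, opname, hasjrel, hasjabs)
print(ns['PJIF'], ns['SLICE_1'], ns['JUMP_OPs'])
```

Returning a dictionary keeps the sketch easy to test. Calling `globals().update(ns)` afterwards would reproduce the effect of the code in the diff.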
@@ -44,8 +44,8 @@ class _State:
         self.T, self.complete, self.items = [], [], items
         self.stateno = stateno

-# DEFAULT_DEBUG = {'rules': True, 'transition': False}
-DEFAULT_DEBUG = {'rules': False, 'transition': False}
+# DEFAULT_DEBUG = {'rules': True, 'transition': True, 'reduce' : True}
+DEFAULT_DEBUG = {'rules': False, 'transition': False, 'reduce': False}
 class GenericParser:
     '''
     An Earley parser, as per J. Earley, "An Efficient Context-Free
@@ -450,6 +450,8 @@ class GenericParser:

             for rule in self.states[state].complete:
                 lhs, rhs = rule
+                if self.debug['reduce']:
+                    print("%s ::= %s" % (lhs, ' '.join(rhs)))
                 for pitem in sets[parent]:
                     pstate, pparent = pitem
                     k = self.goto(pstate, lhs)
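The new `'reduce'` debug key makes the parser print each completed rule as `lhs ::= rhs` at the point where it is reduced. A standalone sketch of that formatting, outside the parser, might look like the following. The function names `format_rule` and `trace_reductions` are hypothetical, and the example rule is the `forstmt` rule quoted in the test comments earlier in this commit.

```python
DEFAULT_DEBUG = {'rules': False, 'transition': False, 'reduce': False}

def format_rule(rule):
    """Format an (lhs, rhs) pair the way the reduce trace prints it."""
    lhs, rhs = rule
    return "%s ::= %s" % (lhs, ' '.join(rhs))

def trace_reductions(complete_rules, debug=DEFAULT_DEBUG):
    """Return one trace line per completed rule, or nothing if tracing is off."""
    if not debug['reduce']:
        return []
    return [format_rule(rule) for rule in complete_rules]

rules = [('forstmt', ('SETUP_LOOP', 'expr', '_for', 'designator',
                      'for_block', 'POP_BLOCK', 'COME_FROM'))]

# Tracing is off by default; switch it on in a copy of the defaults.
debug = dict(DEFAULT_DEBUG, reduce=True)
for line in trace_reductions(rules, debug):
    print(line)
# prints: forstmt ::= SETUP_LOOP expr _for designator for_block POP_BLOCK COME_FROM
```

Keeping the flag in a dictionary with the other debug keys means a caller can enable rule, transition, and reduce tracing independently.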
@@ -29,7 +29,6 @@ globals().update(dis.opmap)

 from uncompyle6.opcodes.opcode_34 import *

-
 import uncompyle6.scanner as scan

@@ -60,21 +59,22 @@ class Scanner34(scan.Scanner):
         bytecode = dis.Bytecode(co)

         # self.lines contains (block,addrLastInstr)
-        # if classname:
-        # classname = '_' + classname.lstrip('_') + '__'
+        if classname:
+            classname = '_' + classname.lstrip('_') + '__'

-        # def unmangle(name):
-        # if name.startswith(classname) and name[-2:] != '__':
-        # return name[len(classname) - 2:]
-        # return name
+            def unmangle(name):
+                if name.startswith(classname) and name[-2:] != '__':
+                    return name[len(classname) - 2:]
+                return name

         # free = [ unmangle(name) for name in (co.co_cellvars + co.co_freevars) ]
         # names = [ unmangle(name) for name in co.co_names ]
         # varnames = [ unmangle(name) for name in co.co_varnames ]
-        # else:
+        else:
         # free = co.co_cellvars + co.co_freevars
         # names = co.co_names
         # varnames = co.co_varnames
+            pass

         # Scan for assertions. Later we will
         # turn 'LOAD_GLOBAL' to 'LOAD_ASSERT' for those
@@ -439,6 +439,33 @@ class Scanner34(scan.Scanner):
             target += offset + 3
         return target

+    def next_except_jump(self, start):
+        """
+        Return the next jump that was generated by an except SomeException:
+        construct in a try...except...else clause or None if not found.
+        """
+
+        if self.code[start] == DUP_TOP:
+            except_match = self.first_instr(start, len(self.code), POP_JUMP_IF_FALSE)
+            if except_match:
+                jmp = self.prev_op[self.get_target(except_match)]
+                self.ignore_if.add(except_match)
+                self.not_continue.add(jmp)
+                return jmp
+
+        count_END_FINALLY = 0
+        count_SETUP_ = 0
+        for i in self.op_range(start, len(self.code)):
+            op = self.code[i]
+            if op == END_FINALLY:
+                if count_END_FINALLY == count_SETUP_:
+                    assert self.code[self.prev_op[i]] in (JUMP_ABSOLUTE, JUMP_FORWARD, RETURN_VALUE)
+                    self.not_continue.add(self.prev_op[i])
+                    return self.prev_op[i]
+                count_END_FINALLY += 1
+            elif op in (SETUP_EXCEPT, SETUP_WITH, SETUP_FINALLY):
+                count_SETUP_ += 1
+
     def detect_structure(self, offset):
         """
         Detect structures and their boundaries to fix optimizied jumps
@@ -459,8 +486,51 @@ class Scanner34(scan.Scanner):
                 start = curent_start
                 end = curent_end
                 parent = struct
+                pass

-        if op in (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE):
+        if op == SETUP_EXCEPT:
+            start = offset + 3
+            target = self.get_target(offset)
+            end = self.restrict_to_parent(target, parent)
+            if target != end:
+                self.fixed_jumps[pos] = end
+                # print target, end, parent
+            # Add the try block
+            self.structs.append({'type': 'try',
+                                 'start': start,
+                                 'end': end-4})
+            # Now isolate the except and else blocks
+            end_else = start_else = self.get_target(self.prev_op[end])
+
+            # Add the except blocks
+            i = end
+            while self.code[i] != END_FINALLY:
+                jmp = self.next_except_jump(i)
+                if self.code[jmp] == RETURN_VALUE:
+                    self.structs.append({'type': 'except',
+                                         'start': i,
+                                         'end': jmp+1})
+                    i = jmp + 1
+                else:
+                    if self.get_target(jmp) != start_else:
+                        end_else = self.get_target(jmp)
+                    if self.code[jmp] == JUMP_FORWARD:
+                        self.fixed_jumps[jmp] = -1
+                    self.structs.append({'type': 'except',
+                                         'start': i,
+                                         'end': jmp})
+                    i = jmp + 3
+
+            # Add the try-else block
+            if end_else != start_else:
+                r_end_else = self.restrict_to_parent(end_else, parent)
+                self.structs.append({'type': 'try-else',
+                                     'start': i+1,
+                                     'end': r_end_else})
+                self.fixed_jumps[i] = r_end_else
+            else:
+                self.fixed_jumps[i] = i+1
+        elif op in (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE):
             start = offset + self.op_size(op)
             target = self.get_target(offset)
             rtarget = self.restrict_to_parent(target, parent)
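The second half of `next_except_jump` finds the `END_FINALLY` that closes the current handler by counting the nested `SETUP_*` blocks it passes on the way. The sketch below isolates that counting idea under simplifying assumptions: it walks a plain list of opcode names with one entry per instruction, whereas the scanner works on raw bytecode offsets through `self.op_range` and `self.prev_op`. The function name and the sample instruction sequence are hypothetical.

```python
SETUP_OPS = ('SETUP_EXCEPT', 'SETUP_WITH', 'SETUP_FINALLY')
JUMP_OR_RETURN = ('JUMP_ABSOLUTE', 'JUMP_FORWARD', 'RETURN_VALUE')

def last_op_before_handler_end(ops, start):
    """Return the index of the instruction just before the END_FINALLY that
    closes the handler beginning at `start`, or None if there is none.

    Every SETUP_* seen opens a nested block whose own END_FINALLY must be
    skipped, so the match is the first END_FINALLY reached when the two
    counters are equal.
    """
    end_finally_seen = 0
    setups_seen = 0
    for i in range(start, len(ops)):
        op = ops[i]
        if op == 'END_FINALLY':
            if end_finally_seen == setups_seen:
                # Same sanity check as in the diff: a handler body ends
                # with a jump or a return.
                assert i > 0 and ops[i - 1] in JUMP_OR_RETURN
                return i - 1
            end_finally_seen += 1
        elif op in SETUP_OPS:
            setups_seen += 1
    return None

# Hypothetical handler body containing one nested try/except.
ops = ['POP_TOP',
       'SETUP_EXCEPT', 'POP_BLOCK', 'JUMP_FORWARD', 'END_FINALLY',  # nested
       'POP_EXCEPT', 'JUMP_FORWARD', 'END_FINALLY']                 # outer
print(last_op_before_handler_end(ops, 0))  # prints: 6
```

The inner `END_FINALLY` at index 4 is skipped because one `SETUP_EXCEPT` has been seen by then, so the function returns the `JUMP_FORWARD` at index 6 that precedes the outer `END_FINALLY`.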