25 Commits

Author SHA1 Message Date
rocky
1aeb09cb8b Get ready for release 2.9.6 2016-11-20 21:38:43 -05:00
R. Bernstein
f575234fc8 Merge pull request #68 from rocky/line-mappings
Line mappings
2016-11-20 21:16:01 -05:00
rocky
abcd10628a Add --linemaps: shows line number correspondences 2016-11-20 21:11:38 -05:00
rocky
eb2b63ce9c Merge remote-tracking branch 'origin' into line-mappings 2016-11-20 18:41:19 -05:00
rocky
805e17988e Fix bug in docstring triple quotes
Problem was not escaping """ inside """.
Use ''' when possible; and when not, use: \"\"\".
2016-11-20 12:21:56 -05:00
rocky
80df5dcc95 Back off a test.
That means bugs in 2.7 still not fixed. Sigh.
2016-11-20 11:37:19 -05:00
rocky
2bc316d6f0 more 2.7 control flow bug fixing 2016-11-20 06:55:08 -05:00
rocky
195bbc746b Pass debug in scanner26 find_targets 2016-11-20 03:42:30 -05:00
rocky
0f56b4f476 Add debug option on Python 3 find_jump_targets() 2016-11-20 03:21:03 -05:00
rocky
94719918d4 A little closer in PyPy 2.7 list comprehensions
pysource.py: note need to handle line breaks in list comprehensions
2016-11-20 03:17:49 -05:00
rocky
f2a3721d7d Start to improve detect_structure for 2.7 and 2.x
Add debug flag to find_jump_targets to show the structure we found.
When there are control-flow bugs, it's often reflected here.

scanner3.py: make code more similar to 2.x code
2016-11-20 02:38:59 -05:00
rocky
79863ae122 Merge branch 'master' into line-mappings 2016-11-18 09:04:03 -05:00
rocky
d7f898b4fb New feature: show line number correspondences
Option --linemap on uncompile show how original source-code line numbers
map to uncompiled source lines
2016-11-18 09:02:00 -05:00
R. Bernstein
fe36c9e9f6 Merge pull request #67 from rocky/2.6-cf-ignore-if
2.6 cf ignore if
2016-11-17 03:53:10 -05:00
rocky
76ae1592d0 verify scanner2 vs scanner3 small changes...
verify.py: allow LOAD_CONST None to make LOAD_NAME 'None'
scanner{2,3}.py: make them look more alike
2016-11-17 03:43:39 -05:00
rocky
31d387749b More AST checking
Small fixes in output format
2016-11-16 07:28:19 -05:00
rocky
9e3026bd78 WIP Grammar changes - reinstating COME_FROMs around ignore_if's 2016-11-15 23:44:22 -05:00
rocky
bfe7e7777d Revise MANIFEST.in with what we have 2016-11-15 23:44:22 -05:00
rocky
81b4941fda Merge branch '2.6-cf-ignore-if' of github.com:rocky/python-uncompyle6 into 2.6-cf-ignore-if 2016-11-15 13:26:22 -05:00
rocky
0f719d41fd Revise MANIFEST.in with what we have 2016-11-14 20:20:07 -05:00
rocky
766451cbb9 WIP remove COME_FROMs around ignore_if's 2016-11-14 09:27:56 -05:00
rocky
1e4dc52197 WIP remove COME_FROMs around ignore_if's 2016-11-14 07:27:13 -05:00
rocky
6073c77921 Show line numbers in 2.6 "after" asm ..
start to understand some of the Python 2.6 bytecode parse failures.
2016-11-14 00:30:23 -05:00
rocky
b6e53205dd Handle verify syntax errors...
Update README.rst stats
2016-11-13 18:55:23 -05:00
rocky
ee6dddd25a Administrivia: Fixes #66 2016-11-13 14:20:36 -05:00
26 changed files with 467 additions and 131 deletions

View File

@@ -1,6 +1,103 @@
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/version.py: Get ready for release 2.9.6
2016-11-20 R. Bernstein <rocky@users.noreply.github.com>
* : Merge pull request #68 from rocky/line-mappings Line mappings
2016-11-20 rocky <rb@dustyfeet.com>
* : Merge remote-tracking branch 'origin' into line-mappings
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse27.py: Back off a test. That means bugs in 2.7 still not fixed. Sigh.
2016-11-20 rocky <rb@dustyfeet.com>
* pytest/test_fjt.py, uncompyle6/parsers/parse27.py,
uncompyle6/scanners/scanner2.py: more 2.7 control flow bug fixing
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/scanners/scanner26.py: Pass debug in scanner26
find_targets
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/scanners/scanner3.py: Add debug option on Python 3
find_jump_targets()
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/pysource.py: A little closer in PyPy 2.7
list comprehensions pysource.py: note need to handle line breaks in list comprehensions
2016-11-20 rocky <rb@dustyfeet.com>
* pytest/test_fjt.py, uncompyle6/scanners/scanner2.py,
uncompyle6/scanners/scanner26.py, uncompyle6/scanners/scanner3.py:
Start to improve detect_structure for 2.7 and 2.x Add debug flag to find_jump_targets to show the structure we found.
When there are control-flow bugs, it's often reflected here. scanner3.py: make code more similar to 2.x code
2016-11-18 rocky <rb@dustyfeet.com>
* : commit d7f898b4fbf79d1f66eabadb25f0f9f0f38730cb Author: rocky
<rb@dustyfeet.com> Date: Fri Nov 18 09:02:00 2016 -0500
2016-11-17 R. Bernstein <rocky@users.noreply.github.com>
* : Merge pull request #67 from rocky/2.6-cf-ignore-if 2.6 cf ignore if
2016-11-16 rocky <rb@dustyfeet.com>
* test/simple_source/bug26/03_if_vs_and.py, uncompyle6/main.py,
uncompyle6/semantics/check_ast.py, uncompyle6/semantics/pysource.py:
More AST checking Small fixes in output format
2016-11-15 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse23.py, uncompyle6/parsers/parse26.py,
uncompyle6/scanners/scanner2.py: WIP Grammar changes - reinstating
COME_FROMs around ignore_if's
2016-11-14 rocky <rb@dustyfeet.com>
* MANIFEST.in: Revise MANIFEST.in with what we have
2016-11-15 rocky <rb@dustyfeet.com>
* : commit 0f719d41fdf08d41de594abb1664ab42ff92bbdf Author: rocky
<rb@dustyfeet.com> Date: Mon Nov 14 20:20:07 2016 -0500
2016-11-14 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse26.py, uncompyle6/scanners/scanner2.py:
WIP remove COME_FROMs around ignore_if's
2016-11-14 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse26.py, uncompyle6/scanners/scanner2.py:
WIP remove COME_FROMs around ignore_if's
2016-11-14 rocky <rb@dustyfeet.com>
* uncompyle6/scanners/scanner2.py, uncompyle6/scanners/scanner26.py:
Show line numbers in 2.6 "after" asm .. start to understand some of the Python 2.6 bytecode parse failures.
2016-11-13 rocky <rb@dustyfeet.com>
* uncompyle6/version.py: Get ready for release 2.9.5
* README.rst, uncompyle6/verify.py: Handle verify syntax errors... Update README.rst stats
2016-11-13 rocky <rb@dustyfeet.com>
* setup.py: Administrivia: Fixes #66
2016-11-13 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, uncompyle6/version.py: Get ready for release
2.9.5
2016-11-13 rocky <rb@dustyfeet.com>

View File

@@ -2,10 +2,16 @@ include README.rst
include ChangeLog
include HISTORY.md
include LICENSE
include Makefile
include requirements.txt
include requirements-dev.txt
include DECOMPYLE-2.4-CHANGELOG.txt
include __pkginfo__.py
recursive-include uncompyle6 *.py
include bin/uncompyle6
include bin/pydisassemble
include pytest/Makefile
include test/Makefile
recursive-include test *.py *.pyc
recursive-include pytest *.py
recursive-include pytest/testdata *

NEWS

@@ -1,3 +1,20 @@
uncompyle6 2.9.6 2016-11-20
- Correct MANIFEST.in
- More AST grammar checking
- --linemaps option or linenumbers.line_number_mapping()
Shows correspondence of lines between source
and decompiled source
- Some control flow adjustments in code for 2.x.
This is probably an improvement in 2.6 and before.
For 2.7 things are just shuffled around a little. Sigh.
Overall I think we are getting more precise in
our analysis even if it is not always reflected
in the results.
- better control flow debugging output
- Python 2 and 3 detect structure code is more similar
- Handle docstrings with embedded triple quotes (""")
uncompyle6 2.9.5 2016-11-13
- Fix Python 3 bugs:

View File

@@ -100,11 +100,13 @@ The biggest known and possibly fixable (but hard) problem has to do
with handling control flow. In some cases we can detect an erroneous
decompilation and report that.
About 90% of the decompilation verifies from Python 2.3.7 to Python
3.4.2 on the standard library packages I have on my system.
About 90% of the decompilation of Python standard library packages in
Python 2.7.12 verifies correctly. Over 99% of Python 2.7 and 3.3-3.5
"weakly" verify. Python 2.6 drops down to 96% weakly verifying.
Other versions drop off in quality too.
*Verification* is the process of decompiling bytecode, compiling with
a Python for that byecode version, and then comparing the bytecode
a Python for that bytecode version, and then comparing the bytecode
produced by the decompiled/compiled program. Some allowance is made
for inessential differences. But other semantically equivalent
differences are not caught. For example ``if x: foo()`` is

View File

@@ -8,6 +8,18 @@ def bug(state, slotstate):
for key, value in slotstate.items():
setattr(state, key, 2)
# From 2.7 disassemble
# Problem is not getting while, because
# COME_FROM not added
def bug_loop(disassemble, tb=None):
if tb:
try:
tb = 5
except AttributeError:
raise RuntimeError
while tb: tb = tb.tb_next
disassemble(tb)
def test_if_in_for():
code = bug.__code__
scan = get_scanner(PYTHON_VERSION)
@@ -16,18 +28,35 @@ def test_if_in_for():
n = scan.setup_code(code)
scan.build_lines_data(code, n)
scan.build_prev_op(n)
fjt = scan.find_jump_targets()
fjt = scan.find_jump_targets(False)
assert {15: [3], 69: [66], 63: [18]} == fjt
assert scan.structs == \
[{'start': 0, 'end': 72, 'type': 'root'},
{'start': 18, 'end': 66, 'type': 'if-then'},
{'start': 15, 'end': 66, 'type': 'if-then'},
{'start': 31, 'end': 59, 'type': 'for-loop'},
{'start': 62, 'end': 63, 'type': 'for-else'}]
code = bug_loop.__code__
n = scan.setup_code(code)
scan.build_lines_data(code, n)
scan.build_prev_op(n)
fjt = scan.find_jump_targets(False)
assert {64: [42], 67: [42, 42], 42: [16, 41], 19: [6]} == fjt
assert scan.structs == [
{'start': 0, 'end': 80, 'type': 'root'},
{'start': 3, 'end': 64, 'type': 'if-then'},
{'start': 6, 'end': 15, 'type': 'try'},
{'start': 19, 'end': 38, 'type': 'except'},
{'start': 45, 'end': 67, 'type': 'while-loop'},
{'start': 70, 'end': 64, 'type': 'while-else'},
# previous bug was mistaking the while-loop for an if-then
{'start': 48, 'end': 67, 'type': 'while-loop'}]
elif 3.2 < PYTHON_VERSION <= 3.4:
scan.code = array('B', code.co_code)
scan.build_lines_data(code)
scan.build_prev_op()
fjt = scan.find_jump_targets()
fjt = scan.find_jump_targets(False)
assert {69: [66], 63: [18]} == fjt
assert scan.structs == \
[{'end': 72, 'type': 'root', 'start': 0},

View File

@@ -24,6 +24,6 @@ setup(
py_modules = py_modules,
test_suite = 'nose.collector',
url = web,
setup_requires = ['nose>=1.0'],
tests_require = ['nose>=1.0'],
version = VERSION,
zip_safe = zip_safe)

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -0,0 +1,22 @@
# From 2.6 decimal
# Bug was not recognizing scope of if and
# turning it into xc == 1 and xe *= yc
def _power_exact(y, xc, yc, xe):
yc, ye = y.int, y.exp
while yc % 10 == 0:
yc //= 10
ye += 1
if xc == 1:
xe *= yc
while xe % 10 == 0:
xe //= 10
ye += 1
if ye < 0:
return None
exponent = xe * 10**ye
if y and xe:
xc = exponent
else:
xc = 0
return 5

View File

@@ -0,0 +1,7 @@
# uncompyle2 bug was not escaping """ properly
r'''func placeholder - with ("""\nstring\n""")'''
def foo():
r'''func placeholder - ' and with ("""\nstring\n""")'''
def bar():
r"""func placeholder - ' and with ('''\nstring\n''') and \"\"\"\nstring\n\"\"\" """
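
The commit message for this fix says to use ''' when possible and to escape """ otherwise. A minimal sketch of that rule follows. The function name `quote_docstring` is hypothetical, the sketch assumes the text contains no backslashes, and it is not the code in pysource.py.

```python
def quote_docstring(text):
    """Wrap text in triple quotes that stay valid Python source.

    Hypothetical helper; assumes `text` contains no backslashes.
    """
    # Prefer ''' when the text cannot close it early.
    if "'''" not in text and not text.endswith("'"):
        return "'''" + text + "'''"
    # Next choice: """ when it is equally safe.
    if '"""' not in text and not text.endswith('"'):
        return '"""' + text + '"""'
    # Neither style is safe as-is: use """ and escape every double quote.
    return '"""' + text.replace('"', '\\"') + '"""'


samples = [
    'plain text',
    'with ("""\nstring\n""")',
    'with \'\'\' and """ together',
    'ends with a quote"',
]
for sample in samples:
    print(quote_docstring(sample))
```

Evaluating each quoted result as a Python literal gives back the original text, which is the property the decompiler output needs.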

View File

@@ -5,7 +5,7 @@
# Copyright (c) 2000-2002 by hartmut Goebel <h.goebel@crazy-compilers.com>
#
from __future__ import print_function
import sys, os, getopt, tempfile, time
import sys, os, getopt, time
program, ext = os.path.splitext(os.path.basename(__file__))
@@ -35,7 +35,8 @@ Options:
-p <integer> use <integer> number of processes
-r recurse directories looking for .pyc and .pyo files
--verify compare generated source with input byte-code
(requires -o)
--linemaps generate line number correspondences between byte-code
and generated source output
--help show this message
Debugging Options:
@@ -81,8 +82,8 @@ def main_bin():
try:
opts, files = getopt.getopt(sys.argv[1:], 'hagtdrVo:c:p:',
'help asm grammar recurse timestamp tree verify version '
'showgrammar'.split(' '))
'help asm grammar linemaps recurse timestamp tree '
'verify version showgrammar'.split(' '))
except getopt.GetoptError as e:
print('%s: %s' % (os.path.basename(sys.argv[0]), e), file=sys.stderr)
sys.exit(-1)
@@ -97,6 +98,8 @@ def main_bin():
sys.exit(0)
elif opt == '--verify':
options['do_verify'] = True
elif opt == '--linemaps':
options['do_linemaps'] = True
elif opt in ('--asm', '-a'):
options['showasm'] = 'after'
options['do_verify'] = False
@@ -146,11 +149,7 @@ def main_bin():
usage()
if outfile == '-':
if 'do_verify' in options and options['do_verify'] and len(files) == 1:
junk, outfile = tempfile.mkstemp(suffix=".pyc",
prefix=files[0][0:-4]+'-')
else:
outfile = None # use stdout
outfile = None # use stdout
elif outfile and os.path.isdir(outfile):
out_base = outfile; outfile = None
elif outfile and len(files) > 1:
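
The option handling in the hunks above can be sketched in isolation as follows. The option strings are taken from the diff; the wrapper function `parse_args` is a hypothetical name used only for illustration and covers just two of the flags.

```python
import getopt


def parse_args(argv):
    """Parse a subset of the flags the way main_bin() does (sketch only)."""
    opts, files = getopt.getopt(
        argv, 'hagtdrVo:c:p:',
        'help asm grammar linemaps recurse timestamp tree '
        'verify version showgrammar'.split(' '))
    options = {}
    for opt, _val in opts:
        if opt == '--verify':
            options['do_verify'] = True
        elif opt == '--linemaps':
            options['do_linemaps'] = True
    return options, files


print(parse_args(['--linemaps', 'foo.pyc']))
```

An unknown flag makes `getopt.getopt` raise `getopt.GetoptError`, which is what the `except` clause in `main_bin()` reports before exiting.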

uncompyle6/linenumbers.py

@@ -0,0 +1,61 @@
from collections import deque, namedtuple
from xdis.code import iscode
from xdis.load import load_file, load_module
from xdis.main import get_opcode
from xdis.bytecode import Bytecode, findlinestarts, offset2line
def line_number_mapping(pyc_filename, src_filename):
(version, timestamp, magic_int, code1, is_pypy,
source_size) = load_module(pyc_filename)
try:
code2 = load_file(src_filename)
except SyntaxError as e:
return str(e)
queue = deque([code1, code2])
mappings = []
opc = get_opcode(version, is_pypy)
number_loop(queue, mappings, opc)
return sorted(mappings, key=lambda x: x[1])
def number_loop(queue, mappings, opc):
while len(queue) > 0:
code1 = queue.popleft()
code2 = queue.popleft()
assert code1.co_name == code2.co_name
linestarts_orig = findlinestarts(code1)
linestarts_uncompiled = list(findlinestarts(code2))
mappings += [[line, offset2line(offset, linestarts_uncompiled)] for offset, line in linestarts_orig]
bytecode1 = Bytecode(code1, opc)
bytecode2 = Bytecode(code2, opc)
instr2s = bytecode2.get_instructions(code2)
seen = set([code1.co_name])
for instr in bytecode1.get_instructions(code1):
next_code1 = None
if iscode(instr.argval):
next_code1 = instr.argval
if next_code1:
next_code2 = None
while not next_code2:
try:
instr2 = next(instr2s)
if iscode(instr2.argval):
next_code2 = instr2.argval
pass
except StopIteration:
break
pass
if next_code2:
assert next_code1.co_name == next_code2.co_name
if next_code1.co_name not in seen:
seen.add(next_code1.co_name)
queue.append(next_code1)
queue.append(next_code2)
pass
pass
pass
pass
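
The core of number_loop() pairs each line start of the original code with the line that covers the same bytecode offset in the recompiled code. The sketch below shows that lookup with plain lists. `offset_to_line` is a hypothetical stand-in written for illustration; it is an assumption about what a helper like xdis's `offset2line` does, not its actual implementation.

```python
from bisect import bisect_right


def offset_to_line(offset, linestarts):
    """Return the line whose start offset is the closest one <= offset.

    `linestarts` is a list of (bytecode_offset, line_number) pairs sorted
    by offset, the shape that findlinestarts() yields.
    """
    offsets = [start for start, _line in linestarts]
    index = bisect_right(offsets, offset) - 1
    if index < 0:
        return None  # offset lies before the first recorded line start
    return linestarts[index][1]


def map_lines(linestarts_orig, linestarts_uncompiled):
    """Pair each original line with the decompiled line at the same offset."""
    return [[line, offset_to_line(offset, linestarts_uncompiled)]
            for offset, line in linestarts_orig]


# Made-up line tables: the decompiled source dropped one blank line.
orig = [(0, 1), (6, 3), (12, 4)]
uncompiled = [(0, 1), (6, 2), (12, 3)]
print(map_lines(orig, uncompiled))
```

The lookup only makes sense when both code objects share the same bytecode offsets, which is why the mapping is computed against a recompilation of the decompiled source.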

View File

@@ -1,5 +1,5 @@
from __future__ import print_function
import datetime, os, sys
import datetime, os, subprocess, sys, tempfile
from uncompyle6 import verify, IS_PYPY
from xdis.code import iscode
@@ -7,6 +7,7 @@ from uncompyle6.disas import check_object_path
from uncompyle6.semantics import pysource
from uncompyle6.parser import ParserError
from uncompyle6.version import VERSION
from uncompyle6.linenumbers import line_number_mapping
from xdis.load import load_module
@@ -76,7 +77,8 @@ def uncompyle_file(filename, outstream=None, showasm=None, showast=False,
# FIXME: combine into an options parameter
def main(in_base, out_base, files, codes, outfile=None,
showasm=None, showast=False, do_verify=False,
showgrammar=False, raise_on_error=False):
showgrammar=False, raise_on_error=False,
do_linemaps=False):
"""
in_base base directory for input files
out_base base directory for output files (ignored when
@@ -99,7 +101,6 @@ def main(in_base, out_base, files, codes, outfile=None,
pass
return open(outfile, 'w')
of = outfile
tot_files = okay_files = failed_files = verify_failed_files = 0
# for code in codes:
@@ -117,10 +118,21 @@ def main(in_base, out_base, files, codes, outfile=None,
# print (infile, file=sys.stderr)
if of: # outfile was given as parameter
if outfile: # outfile was given as parameter
outstream = _get_outstream(outfile)
elif out_base is None:
outstream = sys.stdout
if do_linemaps or do_verify:
prefix = os.path.basename(filename)
if prefix.endswith('.py'):
prefix = prefix[:-len('.py')]
junk, outfile = tempfile.mkstemp(suffix=".py",
prefix=prefix)
# Unbuffer output
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
tee = subprocess.Popen(["tee", outfile], stdin=subprocess.PIPE)
os.dup2(tee.stdin.fileno(), sys.stdout.fileno())
os.dup2(tee.stdin.fileno(), sys.stderr.fileno())
else:
if filename.endswith('.pyc'):
outfile = os.path.join(out_base, filename[0:-1])
@@ -134,12 +146,14 @@ def main(in_base, out_base, files, codes, outfile=None,
uncompyle_file(infile, outstream, showasm, showast, showgrammar)
tot_files += 1
except (ValueError, SyntaxError, ParserError, pysource.SourceWalkerError) as e:
sys.stderr.write("\n# file %s\n# %s" % (infile, e))
sys.stdout.write("\n")
sys.stderr.write("\n# file %s\n# %s\n" % (infile, e))
failed_files += 1
except KeyboardInterrupt:
if outfile:
outstream.close()
os.remove(outfile)
sys.stdout.write("\n")
sys.stderr.write("\nLast file: %s " % (infile))
raise
# except:
@@ -152,7 +166,15 @@ def main(in_base, out_base, files, codes, outfile=None,
# sys.stderr.write("\n# Can't uncompile %s\n" % infile)
else: # uncompile successful
if outfile:
if do_linemaps:
mapping = line_number_mapping(infile, outfile)
outstream.write("\n\n## Line number correspondences\n")
import pprint
s = pprint.pformat(mapping, indent=2, width=80)
s2 = '##' + '\n##'.join(s.split("\n")) + "\n"
outstream.write(s2)
outstream.close()
if do_verify:
weak_verify = do_verify == 'weak'
try:

View File

@@ -18,6 +18,9 @@ class Python23Parser(Python24Parser):
# of Python
_while1test ::= SETUP_LOOP JUMP_FORWARD JUMP_IF_FALSE POP_TOP COME_FROM
while1stmt ::= _while1test l_stmts_opt JUMP_BACK
POP_TOP POP_BLOCK COME_FROM
while1stmt ::= _while1test l_stmts_opt JUMP_BACK
COME_FROM POP_TOP POP_BLOCK COME_FROM

View File

@@ -29,9 +29,9 @@ class Python26Parser(Python2Parser):
POP_TOP END_FINALLY
try_middle ::= jmp_abs COME_FROM except_stmts
come_from_pop END_FINALLY
POP_TOP END_FINALLY
trystmt ::= SETUP_EXCEPT suite_stmts_opt come_from_pop
trystmt ::= SETUP_EXCEPT suite_stmts_opt POP_TOP
try_middle
# Sometimes we don't put in COME_FROM to the next statement
@@ -48,11 +48,17 @@ class Python26Parser(Python2Parser):
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM POP_TOP
except_suite ::= c_stmts_opt JUMP_FORWARD come_from_pop
except_suite ::= c_stmts_opt JUMP_FORWARD POP_TOP
except_suite ::= c_stmts_opt jmp_abs come_from_pop
# Python 3 also has this.
come_froms ::= come_froms COME_FROM
come_froms ::= COME_FROM
# This is what happens after a jump where
# we start a new block. For reasons I don't fully
# understand, there is also a value on the top of the stack
come_from_pop ::= COME_FROM POP_TOP
come_froms_pop ::= come_froms POP_TOP
"""
@@ -70,9 +76,9 @@ class Python26Parser(Python2Parser):
jmp_true ::= JUMP_IF_TRUE POP_TOP
jmp_false ::= JUMP_IF_FALSE POP_TOP
jf_pop ::= JUMP_FORWARD come_from_pop
jf_pop ::= JUMP_ABSOLUTE come_from_pop
jb_pop ::= JUMP_BACK come_from_pop
jf_pop ::= JUMP_FORWARD POP_TOP
jf_pop ::= JUMP_ABSOLUTE POP_TOP
jb_pop ::= JUMP_BACK POP_TOP
jb_cont ::= JUMP_BACK
jb_cont ::= CONTINUE
@@ -85,13 +91,12 @@ class Python26Parser(Python2Parser):
jb_bp_come_from ::= JUMP_BACK bp_come_from
_ifstmts_jump ::= c_stmts_opt jf_pop COME_FROM
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM come_from_pop
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM POP_TOP
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD come_froms POP_TOP COME_FROM
# This is what happens after a jump where
# we start a new block. For reasons I don't fully
# understand, there is also a value on the top of the stack
come_from_pop ::= COME_FROM POP_TOP
come_froms_pop ::= come_froms POP_TOP
"""
@@ -145,9 +150,9 @@ class Python26Parser(Python2Parser):
whileelsestmt ::= SETUP_LOOP testexpr l_stmts_opt jb_pop POP_BLOCK
else_suite COME_FROM
return_stmt ::= ret_expr RETURN_END_IF come_from_pop
return_stmt ::= ret_expr RETURN_VALUE come_from_pop
return_if_stmt ::= ret_expr RETURN_END_IF come_from_pop
return_stmt ::= ret_expr RETURN_END_IF POP_TOP
return_stmt ::= ret_expr RETURN_VALUE POP_TOP
return_if_stmt ::= ret_expr RETURN_END_IF POP_TOP
iflaststmtl ::= testexpr c_stmts_opt JUMP_BACK come_from_pop
iflaststmt ::= testexpr c_stmts_opt JUMP_ABSOLUTE come_from_pop
@@ -186,7 +191,8 @@ class Python26Parser(Python2Parser):
# Make sure we keep indices the same as 2.7
setup_loop_lf ::= SETUP_LOOP LOAD_FAST
genexpr_func ::= setup_loop_lf FOR_ITER designator comp_iter jb_bp_come_from
genexpr_func ::= setup_loop_lf FOR_ITER designator comp_iter JUMP_BACK come_from_pop jb_bp_come_from
genexpr_func ::= setup_loop_lf FOR_ITER designator comp_iter JUMP_BACK come_from_pop
jb_bp_come_from
genexpr ::= LOAD_GENEXPR MAKE_FUNCTION_0 expr GET_ITER CALL_FUNCTION_1 COME_FROM
'''
@@ -208,7 +214,7 @@ class Python26Parser(Python2Parser):
def p_except26(self, args):
'''
except_suite ::= c_stmts_opt jmp_abs come_from_pop
except_suite ::= c_stmts_opt jmp_abs POP_TOP
'''
def p_misc26(self, args):

View File

@@ -25,6 +25,7 @@ from __future__ import print_function
from collections import namedtuple
from array import array
from uncompyle6.scanner import op_has_argument
from xdis.code import iscode
import uncompyle6.scanner as scan
@@ -137,7 +138,7 @@ class Scanner2(scan.Scanner):
if names[self.get_argument(i+3)] == 'AssertionError':
self.load_asserts.add(i+3)
jump_targets = self.find_jump_targets()
jump_targets = self.find_jump_targets(show_asm)
# contains (code, [addrRefToCode])
last_stmt = self.next_stmt[0]
@@ -175,7 +176,7 @@ class Scanner2(scan.Scanner):
opname = self.opc.opname[op]
oparg = None; pattr = None
has_arg = (op >= self.opc.HAVE_ARGUMENT)
has_arg = op_has_argument(op, self.opc)
if has_arg:
oparg = self.get_argument(offset) + extended_arg
extended_arg = 0
@@ -352,7 +353,7 @@ class Scanner2(scan.Scanner):
j+=1
return
def build_stmt_indices(self):
def build_statement_indices(self):
code = self.code
start = 0
end = len(code)
@@ -429,10 +430,10 @@ class Scanner2(scan.Scanner):
slist += [end] * (end-len(slist))
def next_except_jump(self, start):
'''
"""
Return the next jump that was generated by an except SomeException:
construct in a try...except...else clause or None if not found.
'''
"""
if self.code[start] == self.opc.DUP_TOP:
except_match = self.first_instr(start, len(self.code), self.opc.PJIF)
@@ -466,11 +467,11 @@ class Scanner2(scan.Scanner):
elif op in self.setup_ops:
count_SETUP_ += 1
def detect_structure(self, pos, op):
'''
def detect_structure(self, offset, op):
"""
Detect type of block structures and their boundaries to fix optimized jumps
in python2.3+
'''
"""
# TODO: check the struct boundaries more precisely -Dan
@@ -483,7 +484,7 @@ class Scanner2(scan.Scanner):
for struct in self.structs:
_start = struct['start']
_end = struct['end']
if (_start <= pos < _end) and (_start >= start and _end <= end):
if (_start <= offset < _end) and (_start >= start and _end <= end):
start = _start
end = _end
parent = struct
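
The loop in this hunk narrows down to the innermost structure that contains the current offset. A self-contained sketch of that search follows, using the `structs` list asserted in pytest/test_fjt.py. The function name `innermost_parent` and the choice to start from `structs[0]` are assumptions, since the surrounding initialization is not part of the hunk.

```python
def innermost_parent(structs, offset):
    """Return the tightest struct whose [start, end) range holds offset."""
    parent = structs[0]  # assumed: the 'root' struct comes first
    start, end = parent['start'], parent['end']
    for struct in structs:
        _start, _end = struct['start'], struct['end']
        # Accept only candidates nested inside the best match so far.
        if (_start <= offset < _end) and (_start >= start and _end <= end):
            start, end, parent = _start, _end, struct
    return parent


structs = [{'start': 0, 'end': 72, 'type': 'root'},
           {'start': 15, 'end': 66, 'type': 'if-then'},
           {'start': 31, 'end': 59, 'type': 'for-loop'},
           {'start': 62, 'end': 63, 'type': 'for-else'}]

for offset in (40, 62, 70):
    print(offset, innermost_parent(structs, offset)['type'])
```

Offsets outside every nested structure fall back to the root entry, which is why the root is used as the starting candidate.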
@@ -495,14 +496,16 @@ class Scanner2(scan.Scanner):
# Try to find the jump_back instruction of the loop.
# It could be a return instruction.
start = pos+3
target = self.get_target(pos, op)
start = offset+3
target = self.get_target(offset, op)
end = self.restrict_to_parent(target, parent)
self.setup_loop_targets[offset] = target
self.setup_loops[target] = offset
if target != end:
self.fixed_jumps[pos] = end
self.fixed_jumps[offset] = end
(line_no, next_line_byte) = self.lines[pos]
(line_no, next_line_byte) = self.lines[offset]
jump_back = self.last_instr(start, end, self.opc.JUMP_ABSOLUTE,
next_line_byte, False)
@@ -566,10 +569,10 @@ class Scanner2(scan.Scanner):
if end > jump_back+4 and code[end] in self.jump_forward:
if code[jump_back+4] in self.jump_forward:
if self.get_target(jump_back+4) == self.get_target(end):
self.fixed_jumps[pos] = jump_back+4
self.fixed_jumps[offset] = jump_back+4
end = jump_back+4
elif target < pos:
self.fixed_jumps[pos] = jump_back+4
elif target < offset:
self.fixed_jumps[offset] = jump_back+4
end = jump_back+4
target = self.get_target(jump_back, self.opc.JUMP_ABSOLUTE)
@@ -585,7 +588,7 @@ class Scanner2(scan.Scanner):
else:
test = self.prev[next_line_byte]
if test == pos:
if test == offset:
loop_type = 'while 1'
elif self.code[test] in self.opc.hasjabs + self.opc.hasjrel:
self.ignore_if.add(test)
@@ -602,15 +605,15 @@ class Scanner2(scan.Scanner):
'start': jump_back+3,
'end': end})
elif op == self.opc.SETUP_EXCEPT:
start = pos+3
target = self.get_target(pos, op)
start = offset+3
target = self.get_target(offset, op)
end = self.restrict_to_parent(target, parent)
if target != end:
self.fixed_jumps[pos] = end
self.fixed_jumps[offset] = end
# print target, end, parent
# Add the try block
self.structs.append({'type': 'try',
'start': start,
'start': start-3,
'end': end-4})
# Now isolate the except and else blocks
end_else = start_else = self.get_target(self.prev[end])
@@ -654,15 +657,15 @@ class Scanner2(scan.Scanner):
self.fixed_jumps[i] = i+1
elif op in self.pop_jump_if:
target = self.get_target(pos, op)
target = self.get_target(offset, op)
rtarget = self.restrict_to_parent(target, parent)
# Do not let jump to go out of parent struct bounds
if target != rtarget and parent['type'] == 'and/or':
self.fixed_jumps[pos] = rtarget
self.fixed_jumps[offset] = rtarget
return
start = pos+3
start = offset+3
pre = self.prev
# Does this jump to right after another conditional jump that is
@@ -677,8 +680,8 @@ class Scanner2(scan.Scanner):
op_testset = self.pop_jump_if_or_pop | self.pop_jump_if
if ( code[pre[target]] in op_testset
and (target > pos) ):
self.fixed_jumps[pos] = pre[target]
and (target > offset) ):
self.fixed_jumps[offset] = pre[target]
self.structs.append({'type': 'and/or',
'start': start,
'end': pre[target]})
@@ -690,7 +693,7 @@ class Scanner2(scan.Scanner):
# Search for other POP_JUMP_IF_FALSE targetting the same op,
# in current statement, starting from current offset, and filter
# everything inside inner 'or' jumps and midline ifs
match = self.rem_or(start, self.next_stmt[pos], self.opc.PJIF, target)
match = self.rem_or(start, self.next_stmt[offset], self.opc.PJIF, target)
# If we still have any offsets in set, start working on it
if match:
@@ -698,13 +701,13 @@ class Scanner2(scan.Scanner):
and pre[rtarget] not in self.stmts \
and self.restrict_to_parent(self.get_target(pre[rtarget]), parent) == rtarget:
if code[pre[pre[rtarget]]] == self.opc.JUMP_ABSOLUTE \
and self.remove_mid_line_ifs([pos]) \
and self.remove_mid_line_ifs([offset]) \
and target == self.get_target(pre[pre[rtarget]]) \
and (pre[pre[rtarget]] not in self.stmts or self.get_target(pre[pre[rtarget]]) > pre[pre[rtarget]])\
and 1 == len(self.remove_mid_line_ifs(self.rem_or(start, pre[pre[rtarget]], self.pop_jump_if, target))):
pass
elif code[pre[pre[rtarget]]] == self.opc.RETURN_VALUE \
and self.remove_mid_line_ifs([pos]) \
and self.remove_mid_line_ifs([offset]) \
and 1 == (len(set(self.remove_mid_line_ifs(self.rem_or(start,
pre[pre[rtarget]],
self.pop_jump_if, target)))
@@ -713,7 +716,7 @@ class Scanner2(scan.Scanner):
pass
else:
fix = None
jump_ifs = self.all_instr(start, self.next_stmt[pos], self.opc.PJIF)
jump_ifs = self.all_instr(start, self.next_stmt[offset], self.opc.PJIF)
last_jump_good = True
for j in jump_ifs:
if target == self.get_target(j):
@@ -722,53 +725,53 @@ class Scanner2(scan.Scanner):
break
else:
last_jump_good = False
self.fixed_jumps[pos] = fix or match[-1]
self.fixed_jumps[offset] = fix or match[-1]
return
else:
if (self.version < 2.7
and parent['type'] in ('root', 'for-loop', 'if-then',
'if-else', 'try')):
self.fixed_jumps[pos] = rtarget
self.fixed_jumps[offset] = rtarget
else:
# note test for < 2.7 might be superfluous although informative
# for 2.7 a different branch is taken and the below code is handled
# under: elif op in self.pop_jump_if_or_pop
# below
self.fixed_jumps[pos] = match[-1]
self.fixed_jumps[offset] = match[-1]
return
else: # op != self.opc.PJIT
if self.version < 2.7 and code[pos+3] == self.opc.POP_TOP:
assert_pos = pos + 4
if self.version < 2.7 and code[offset+3] == self.opc.POP_TOP:
assert_offset = offset + 4
else:
assert_pos = pos + 3
if (assert_pos) in self.load_asserts:
assert_offset = offset + 3
if (assert_offset) in self.load_asserts:
if code[pre[rtarget]] == self.opc.RAISE_VARARGS:
return
self.load_asserts.remove(assert_pos)
self.load_asserts.remove(assert_offset)
next = self.next_stmt[pos]
if pre[next] == pos:
next = self.next_stmt[offset]
if pre[next] == offset:
pass
elif code[next] in self.jump_forward and target == self.get_target(next):
if code[pre[next]] == self.opc.PJIF:
if code[next] == self.opc.JUMP_FORWARD or target != rtarget or code[pre[pre[rtarget]]] not in (self.opc.JUMP_ABSOLUTE, self.opc.RETURN_VALUE):
self.fixed_jumps[pos] = pre[next]
self.fixed_jumps[offset] = pre[next]
return
elif code[next] == self.opc.JUMP_ABSOLUTE and code[target] in self.jump_forward:
next_target = self.get_target(next)
if self.get_target(target) == next_target:
self.fixed_jumps[pos] = pre[next]
self.fixed_jumps[offset] = pre[next]
return
elif code[next_target] in self.jump_forward and self.get_target(next_target) == self.get_target(target):
self.fixed_jumps[pos] = pre[next]
self.fixed_jumps[offset] = pre[next]
return
# don't add a struct for a while test, it's already taken care of
if pos in self.ignore_if:
if offset in self.ignore_if:
return
if code[pre[rtarget]] == self.opc.JUMP_ABSOLUTE and pre[rtarget] in self.stmts \
and pre[rtarget] != pos and pre[pre[rtarget]] != pos:
and pre[rtarget] != offset and pre[pre[rtarget]] != offset:
if code[rtarget] == self.opc.JUMP_ABSOLUTE and code[rtarget+3] == self.opc.POP_BLOCK:
if code[pre[pre[rtarget]]] != self.opc.JUMP_ABSOLUTE:
pass
@@ -786,14 +789,28 @@ class Scanner2(scan.Scanner):
if_end = self.get_target(pre_rtarget)
# Is this a loop and not an "if" statement?
if (if_end < pre_rtarget) and (code[pre[if_end]] == self.opc.SETUP_LOOP):
if(if_end > start):
if (if_end < pre_rtarget) and (pre[if_end] in self.setup_loop_targets):
if (if_end > start):
return
else:
# We still have the case in 2.7 that the next instruction
# is a jump to a SETUP_LOOP target.
next_offset = target + self.op_size(self.code[target])
next_op = self.code[next_offset]
if self.opc.opname[next_op] == 'JUMP_FORWARD':
jump_target = self.get_target(next_offset, next_op)
if jump_target in self.setup_loops:
self.structs.append({'type': 'while-loop',
'start': start - 3,
'end': jump_target})
self.fixed_jumps[start-3] = jump_target
return
end = self.restrict_to_parent(if_end, parent)
self.structs.append({'type': 'if-then',
'start': start,
'start': start-3,
'end': pre_rtarget})
self.not_continue.add(pre_rtarget)
@@ -810,46 +827,57 @@ class Scanner2(scan.Scanner):
self.return_end_ifs.add(pre_rtarget)
elif op in self.pop_jump_if_or_pop:
target = self.get_target(pos, op)
self.fixed_jumps[pos] = self.restrict_to_parent(target, parent)
target = self.get_target(offset, op)
self.fixed_jumps[offset] = self.restrict_to_parent(target, parent)
def find_jump_targets(self):
'''
def find_jump_targets(self, debug):
"""
Detect all offsets in a byte code which are jump targets
where we might insert a COME_FROM instruction.
where we might insert a pseudo "COME_FROM" instruction.
"COME_FROM" instructions are used in detecting overall
control flow. The more detailed information about the
control flow is captured in self.structs.
Since this stuff is tricky, consult self.structs when
something goes amiss.
Return the list of offsets. An instruction can be jumped
to in from multiple instructions.
'''
n = len(self.code)
"""
code = self.code
n = len(code)
self.structs = [{'type': 'root',
'start': 0,
'end': n-1}]
self.loops = [] # All loop entry points
self.fixed_jumps = {} # Map fixed jumps to their real destination
# All loop entry points
self.loops = []
# Map fixed jumps to their real destination
self.fixed_jumps = {}
self.ignore_if = set()
self.build_stmt_indices()
self.build_statement_indices()
# Containers filled by detect_structure()
self.not_continue = set()
self.return_end_ifs = set()
self.setup_loop_targets = {} # target given setup_loop offset
self.setup_loops = {} # setup_loop offset given target
targets = {}
for offset in self.op_range(0, n):
op = self.code[offset]
op = code[offset]
# Determine structures and fix jumps in Python versions
# since 2.3
self.detect_structure(offset, op)
if op >= self.opc.HAVE_ARGUMENT:
if op_has_argument(op, self.opc):
label = self.fixed_jumps.get(offset)
oparg = self.get_argument(offset)
if label is None:
if (op in self.opc.hasjrel and
(self.version < 2.0 or op != self.opc.FOR_ITER)):
if op in self.opc.hasjrel and self.opc.opname[op] != 'FOR_ITER':
# if (op in self.opc.hasjrel and
# (self.version < 2.0 or op != self.opc.FOR_ITER)):
label = offset + 3 + oparg
elif self.version == 2.7 and op in self.opc.hasjabs:
if op in (self.opc.JUMP_IF_FALSE_OR_POP,
@@ -867,23 +895,38 @@ class Scanner2(scan.Scanner):
# does now start a new statement
# Otherwise, we want to add a "COME_FROM"
if not (self.version < 2.7 and
self.code[label] == self.opc.POP_TOP and
self.code[self.prev[label]] == self.opc.RETURN_VALUE):
code[label] == self.opc.POP_TOP and
code[self.prev[label]] == self.opc.RETURN_VALUE):
# In Python < 2.7, don't add a COME_FROM, for:
# JUMP_FORWARD, END_FINALLY
# or:
# JUMP_FORWARD, POP_TOP, END_FINALLY
if not (self.version < 2.7 and op == self.opc.JUMP_FORWARD
and ((self.code[offset+3] == self.opc.END_FINALLY)
or (self.code[offset+3] == self.opc.POP_TOP
and self.code[offset+4] == self.opc.END_FINALLY))):
targets[label] = targets.get(label, []) + [offset]
and ((code[offset+3] == self.opc.END_FINALLY)
or (code[offset+3] == self.opc.POP_TOP
and code[offset+4] == self.opc.END_FINALLY))):
# FIXME: rocky: I think we need something like this...
if offset not in set(self.ignore_if) or self.version == 2.7:
source = (self.setup_loops[label]
if label in self.setup_loops else offset)
targets[label] = targets.get(label, []) + [source]
pass
pass
pass
elif op == self.opc.END_FINALLY and offset in self.fixed_jumps and self.version == 2.7:
label = self.fixed_jumps[offset]
targets[label] = targets.get(label, []) + [offset]
pass
pass
# DEBUG:
if debug in ('both', 'after'):
print(targets)
import pprint as pp
pp.pprint(self.structs)
return targets
# FIXME: combine with scanner3.py code and put into scanner.py
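
Note on the hunk above: the new `targets[label] = targets.get(label, []) + [source]` line records, for each jump target, the offsets that reach it, substituting the SETUP_LOOP offset when the target is a known loop target. A minimal standalone sketch of that bookkeeping follows. The function name `collect_jump_targets` and the `(source, target)` pair input are hypothetical and are not part of the scanner code; the real method walks bytecode and consults `self.fixed_jumps` first.

```python
def collect_jump_targets(jumps, setup_loops=None):
    """Map each jump target to the list of offsets that reach it.

    jumps: iterable of (source_offset, target_offset) pairs (hypothetical
    input; the scanner derives these from the bytecode itself).
    setup_loops: optional dict {loop target: SETUP_LOOP offset}. When a
    target appears in it, the SETUP_LOOP offset is recorded as the source
    instead of the jump's own offset, mirroring the diff above.
    """
    setup_loops = setup_loops or {}
    targets = {}
    for source, label in jumps:
        source = setup_loops.get(label, source)
        targets.setdefault(label, []).append(source)
    return targets


# Two jumps land on offset 30; the jump to 40 is attributed to the
# SETUP_LOOP at offset 0 because 40 is registered as a loop target.
jumps = [(6, 30), (12, 30), (18, 40)]
targets = collect_jump_targets(jumps, setup_loops={40: 0})
```

The result has the same shape as the comment in scanner3.py describes: `{target offset: [jump offsets]}`.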

View File

@@ -130,7 +130,7 @@ class Scanner26(scan.Scanner2):
if names[self.get_argument(i+4)] == 'AssertionError':
self.load_asserts.add(i+4)
jump_targets = self.find_jump_targets()
jump_targets = self.find_jump_targets(show_asm)
# contains (code, [addrRefToCode])
last_stmt = self.next_stmt[0]
@@ -279,9 +279,9 @@ class Scanner26(scan.Scanner2):
pass
pass
if show_asm:
if show_asm in ('both', 'after'):
for t in tokens:
print(t)
print(t.format(line_prefix='L.'))
print()
return tokens, customize

View File

@@ -199,7 +199,7 @@ class Scanner3(Scanner):
# Get jump targets
# Format: {target offset: [jump offsets]}
jump_targets = self.find_jump_targets()
jump_targets = self.find_jump_targets(show_asm)
for inst in bytecode:
@@ -401,14 +401,15 @@ class Scanner3(Scanner):
for _ in range(self.op_size(op)):
self.prev_op.append(offset)
def find_jump_targets(self):
def find_jump_targets(self, debug):
"""
Detect all offsets in a byte code which are jump targets.
Detect all offsets in a byte code which are jump targets
where we might insert a COME_FROM instruction.
Return the list of offsets.
This procedure is modelled after dis.findlabels(), but here
for each target the number of jumps is counted.
Return the list of offsets. An instruction can be jumped
to from multiple instructions.
"""
code = self.code
n = len(code)
@@ -427,6 +428,8 @@ class Scanner3(Scanner):
# Containers filled by detect_structure()
self.not_continue = set()
self.return_end_ifs = set()
self.setup_loop_targets = {} # target given setup_loop offset
self.setup_loops = {} # setup_loop offset given target
targets = {}
for offset in self.op_range(0, n):
@@ -454,6 +457,13 @@ class Scanner3(Scanner):
elif op == self.opc.END_FINALLY and offset in self.fixed_jumps:
label = self.fixed_jumps[offset]
targets[label] = targets.get(label, []) + [offset]
pass
pass
# DEBUG:
if debug in ('both', 'after'):
import pprint as pp
pp.pprint(self.structs)
return targets
def build_statement_indices(self):
@@ -584,6 +594,8 @@ class Scanner3(Scanner):
start = offset+3
target = self.get_target(offset)
end = self.restrict_to_parent(target, parent)
self.setup_loop_targets[offset] = target
self.setup_loops[target] = offset
if target != end:
self.fixed_jumps[offset] = end

View File

@@ -12,9 +12,9 @@ def checker(ast, in_loop, errors):
in_loop = in_loop or ast.type in ('while1stmt', 'whileTruestmt',
'whilestmt', 'whileelsestmt',
'for_block')
if ast.type == 'augassign1' and ast[0][0] == 'and':
text = str(ast[0])
error_text = '\n# improper augmented assignment:\n#\t' + '\n# '.join(text.split("\n"))
if ast.type in ('augassign1', 'augassign2') and ast[0][0] == 'and':
text = str(ast)
error_text = '\n# improper augmented assignment (e.g. +=, *=, ...):\n#\t' + '\n# '.join(text.split("\n")) + '\n'
errors.append(error_text)
for node in ast:

View File

@@ -741,7 +741,12 @@ class SourceWalker(GenericASTTraversal, object):
self.pending_newlines = max(self.pending_newlines, 1)
def print_docstring(self, indent, docstring):
quote = '"""'
## FIXME: put this into a testable function.
if docstring.find('"""') == -1:
quote = '"""'
else:
quote = "'''"
self.write(indent)
if not PYTHON3 and not isinstance(docstring, str):
# Must be unicode in Python2
@@ -774,10 +779,11 @@ class SourceWalker(GenericASTTraversal, object):
# ruin the ending triple quote
if len(docstring) and docstring[-1] == '"':
docstring = docstring[:-1] + '\\"'
# Escape triple quote anywhere
docstring = docstring.replace('"""', '\\"\\"\\"')
# Restore escaped backslashes
docstring = docstring.replace('\t', '\\\\')
# Escape triple quote when needed
if quote == '"""':
docstring = docstring.replace('"""', '\\"\\"\\"')
lines = docstring.split('\n')
calculate_indent = maxint
for line in lines[1:]:
@@ -1010,10 +1016,7 @@ class SourceWalker(GenericASTTraversal, object):
def n_ifelsestmt(self, node, preprocess=False):
else_suite = node[3]
try:
n = else_suite[0]
except:
from trepan.api import debug; debug()
n = else_suite[0]
if len(n) == 1 == len(n[0]) and n[0] == '_stmts':
n = n[0][0][0]
@@ -1202,6 +1205,8 @@ class SourceWalker(GenericASTTraversal, object):
assert expr == 'expr'
assert list_iter == 'list_iter'
# FIXME: use source line numbers for directing line breaks
self.preorder(expr)
self.preorder(list_iter)
self.write( ' ]')
@@ -1217,7 +1222,7 @@ class SourceWalker(GenericASTTraversal, object):
n = node[-1]
elif self.is_pypy and node[-1] == 'JUMP_BACK':
n = node[-2]
list_expr = node[0]
list_expr = node[1]
if len(node) >= 3:
designator = node[3]
@@ -1242,10 +1247,9 @@ class SourceWalker(GenericASTTraversal, object):
assert expr == 'expr'
assert list_iter == 'list_iter'
# FIXME: use source line numbers for directing line breaks
self.preorder(expr)
self.write( ' for ')
self.preorder(designator)
self.write( ' in ')
self.preorder(list_expr)
self.write( ' ]')
self.prec = p
@@ -2324,7 +2328,7 @@ def deparse_code(version, co, out=sys.stdout, showasm=None, showast=False,
deparsed.write('# global %s ## Warning: Unused global' % g)
if deparsed.ast_errors:
deparsed.write("# NOTE: have decompilation errors.\n")
deparsed.write("# NOTE: have internal decompilation grammar errors.\n")
deparsed.write("# Use -t option to show full context.")
for err in deparsed.ast_errors:
deparsed.write(err)
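
Note on the `print_docstring` hunk above: the FIXME asks for the quote selection to live in a testable function. One possible shape is sketched below, under assumptions. The name `quote_docstring` is hypothetical, and the fallback order (`"""` first, then `'''`, then escaping) is one reasonable choice, not necessarily the order the project settles on. It also ignores the indentation and unicode handling that `print_docstring` performs.

```python
import ast


def quote_docstring(docstring):
    """Return Python source for a triple-quoted literal equal to docstring.

    Hypothetical helper: picks a delimiter that does not occur in the
    text, and escapes only when both triple-quote forms occur.
    """
    # Escape backslashes first so later escapes are not doubled.
    body = docstring.replace('\\', '\\\\')
    if '"""' not in body:
        quote = '"""'
    elif "'''" not in body:
        quote = "'''"
    else:
        quote = '"""'
        body = body.replace('"""', '\\"\\"\\"')
    # A trailing quote character would merge with the closing delimiter.
    if body.endswith(quote[0]):
        body = body[:-1] + '\\' + quote[0]
    return quote + body + quote


# Round-trip check: evaluating the generated literal gives the text back.
samples = ['plain', 'say """hi"""', "both \"\"\" and '''", 'ends with "', 'a\\b']
round_trips = [ast.literal_eval(quote_docstring(s)) == s for s in samples]
```

Because the helper returns a plain string, it can be checked by evaluating the literal, as the last two lines do, without running the decompiler.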

View File

@@ -316,9 +316,12 @@ def cmp_code_objects(version, is_pypy, code_obj1, code_obj2,
i1 += 2
i2 += 2
continue
raise CmpErrorCode(name, tokens1[i1].offset, tokens1[i1],
tokens2[i2], tokens1, tokens2)
elif tokens1[i1].type == 'LOAD_NAME' and tokens2[i2].type == 'LOAD_CONST' \
and tokens1[i1].pattr == 'None' and tokens2[i2].pattr is None:
pass
else:
raise CmpErrorCode(name, tokens1[i1].offset, tokens1[i1],
tokens2[i2], tokens1, tokens2)
elif tokens1[i1].type in JUMP_OPs and tokens1[i1].pattr != tokens2[i2].pattr:
dest1 = int(tokens1[i1].pattr)
dest2 = int(tokens2[i2].pattr)
@@ -395,7 +398,10 @@ def compare_code_with_srcfile(pyc_filename, src_filename, weak_verify=False):
msg = ("Can't compare code - Python is running with magic %s, but code is magic %s "
% (PYTHON_MAGIC_INT, magic_int))
return msg
code_obj2 = load_file(src_filename)
try:
code_obj2 = load_file(src_filename)
except SyntaxError as e:
return str(e)
cmp_code_objects(version, is_pypy, code_obj1, code_obj2, ignore_code=weak_verify)
return None
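
Note on the `cmp_code_objects` hunk above: the added branch tolerates one specific mismatch, where the first token stream loads `None` by name and the second loads it as a constant. A minimal sketch of that check in isolation follows. The `Token` namedtuple and the function name `is_none_load_pair` are hypothetical stand-ins; the project's token class has more fields.

```python
from collections import namedtuple

# Hypothetical stand-in for the scanner's token class; only the two
# attributes used by the comparison are modelled.
Token = namedtuple('Token', ['type', 'pattr'])


def is_none_load_pair(tok1, tok2):
    """True when tok1 is LOAD_NAME 'None' and tok2 is LOAD_CONST None."""
    return (tok1.type == 'LOAD_NAME' and tok1.pattr == 'None'
            and tok2.type == 'LOAD_CONST' and tok2.pattr is None)


by_name = Token('LOAD_NAME', 'None')
by_const = Token('LOAD_CONST', None)
other = Token('LOAD_CONST', 1)

accepted = is_none_load_pair(by_name, by_const)
rejected = is_none_load_pair(by_name, other)
```

The check is deliberately one-directional, matching the diff: swapping the two tokens is not accepted.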

View File

@@ -1,3 +1,3 @@
# This file is suitable for sourcing inside bash as
# well as importing into Python
VERSION='2.9.5'
VERSION='2.9.6'