diff --git a/HISTORY.md b/HISTORY.md index 280d46f0..7a3611af 100644 --- a/HISTORY.md +++ b/HISTORY.md @@ -115,7 +115,7 @@ mechanisms and addressed problems and extensions by some other means. Specifically, in `uncompyle`, decompilation of python bytecode 2.5 & 2.6 is done by transforming the byte code into a pseudo-2.7 Python bytecode and is based on code from Eloi Vanderbeken. A bit of this -could have bene easily added by modifying grammar rules. +could have been easily added by modifying grammar rules. This project, `uncompyle6`, abandons that approach for various reasons. Having a grammar per Python version is much cleaner and it diff --git a/NEWS.md b/NEWS.md index 9de9b502..69914306 100644 --- a/NEWS.md +++ b/NEWS.md @@ -1,3 +1,31 @@ +3.3.3 2019-05-19 Henry and Lewis +================================ + +As before, decompilation bugs fixed. The focus has primarily been on +Python 3.7. But with this release, releases will be put on hold, as +better control-flow detection is worked on. This has been needed for a +while, and is long overdue. It will probably also take a while to get +it working as well as what we have now. + +However this work will be done in a new project +[decompyle3](https://github.com/rocky/python-decompile3). In contrast +to _uncompyle6_ the code will be written assuming a modern Python 3, +e.g. 3.7. It is originally intended to decompile Python version 3.7 +and greater. + +* A number of Python 3.7+ chained comparisons were fixed +* Revise Python 3.6ish format string handling +* Go over operator precedence, e.g.
for AST IfExp + +Reported Bug Fixes +------------------ + +* [#239: 3.7 handling of 4-level attribute import](https://github.com/rocky/python-uncompyle6/issues/239), +* [#229: Inconsistent if block in python3.6](https://github.com/rocky/python-uncompyle6/issues/229), +* [#227: Args not appearing in decompiled src when kwargs is specified explicitly (call_ex_kw)](https://github.com/rocky/python-uncompyle6/issues/227) +2.7 confusion around "and" versus comprehension "if" +* [#225: 2.7 confusion around "and" vs comprehension "if"](https://github.com/rocky/python-uncompyle6/issues/225) + 3.3.2 2019-05-03 Better Friday ============================== @@ -12,7 +40,6 @@ get addressed in future releases Pypy 3.6 support was started. Pypy 3.x detection fixed (via xdis) - 3.3.1 2019-04-19 Good Friday ========================== @@ -27,7 +54,7 @@ Lots of decomplation bugs, especially in the 3.x series fixed. Don't worry thoug * Fix some parser failures fixes in 3.4+ using test_pyenvlib * Add more run tests -3.3.0 2019-43-14 Holy Week +3.3.0 2019-04-14 Holy Week ========================== * First cut at Python 3.8 (many bug remain) @@ -41,11 +68,13 @@ Mostly more of the same: bug fixes and pull requests. Bug Fixes ----------- -* [#155: Python 3.x bytecode confusing "try/else" with "try" in a loop](https://github.com/rocky/python-uncompyle6/issues/155), -* [#200: Python 3 bug in not detecting end bounds of an "if" ... 
"elif"](https://github.com/rocky/python-uncompyle6/issues/200), -* [#208: Comma placement in 3.6 and 3.7 **kwargs](https://github.com/rocky/python-uncompyle6/issues/208), -* [#209: Fix "if" return boundary in 3.6+](https://github.com/rocky/python-uncompyle6/issues/209), +* [#221: Wrong grammar for nested ifelsestmt (in Python 3.7 at least)](https://github.com/rocky/python-uncompyle6/issues/221) * [#215: 2.7 can have two JUMP_BACKs at the end of a while loop](https://github.com/rocky/python-uncompyle6/issues/215) +* [#209: Fix "if" return boundary in 3.6+](https://github.com/rocky/python-uncompyle6/issues/209), +* [#208: Comma placement in 3.6 and 3.7 **kwargs](https://github.com/rocky/python-uncompyle6/issues/208), +* [#200: Python 3 bug in not detecting end bounds of an "if" ... "elif"](https://github.com/rocky/python-uncompyle6/issues/200), +* [#155: Python 3.x bytecode confusing "try/else" with "try" in a loop](https://github.com/rocky/python-uncompyle6/issues/155), + Pull Requests ---------------- diff --git a/README.rst b/README.rst index 99e4cfb6..0e3ae60e 100644 --- a/README.rst +++ b/README.rst @@ -152,7 +152,7 @@ for that bytecode version. Having done this the bytecode produced could be compared with the original bytecode. However as Python's code generation got better, this is no longer feasible. -There verification that we use that doesn't check bytecode for +The verification that we use that doesn't check bytecode for equivalence but does check to see if the resulting decompiled source is a valid Python program by running the Python interpreter. Because the Python language has changed so much, for best results you should @@ -194,8 +194,12 @@ Between Python 3.5, 3.6 and 3.7 there have been major changes to the Currently not all Python magic numbers are supported. Specifically in some versions of Python, notably Python 3.6, the magic number has -changes several times within a version. We support only the released -magic. 
There are also customized Python interpreters, notably Dropbox, +changes several times within a version. + +**We support only released versions, not candidate versions.** Note however +that the magic of a released version is usually the same as the *last* candidate version prior to release. + +There are also customized Python interpreters, notably Dropbox, which use their own magic and encrypt bytcode. With the exception of the Dropbox's old Python 2.5 interpreter this kind of thing is not handled. @@ -226,7 +230,7 @@ See Also * https://github.com/rocky/python-uncompyle6/wiki : Wiki Documents which describe the code and aspects of it in more detail -.. _trepan: https://pypi.python.org/pypi/trepan2 +.. _trepan: https://pypi.python.org/pypi/trepan2g .. _compiler: https://pypi.python.org/pypi/spark_parser .. _HISTORY: https://github.com/rocky/python-uncompyle6/blob/master/HISTORY.md .. _debuggers: https://pypi.python.org/pypi/trepan3k diff --git a/test/bytecode_2.6_run/01_ifelse_listcomp.pyc b/test/bytecode_2.6_run/01_ifelse_listcomp.pyc index efe324ab..7bba8d75 100644 Binary files a/test/bytecode_2.6_run/01_ifelse_listcomp.pyc and b/test/bytecode_2.6_run/01_ifelse_listcomp.pyc differ diff --git a/test/bytecode_2.7/00_import.pyc b/test/bytecode_2.7/00_import.pyc index 62d8bd77..2ec0c16b 100644 Binary files a/test/bytecode_2.7/00_import.pyc and b/test/bytecode_2.7/00_import.pyc differ diff --git a/test/bytecode_2.7_run/01_ifelse_listcomp.pyc b/test/bytecode_2.7_run/01_ifelse_listcomp.pyc index b110fca9..359f5046 100644 Binary files a/test/bytecode_2.7_run/01_ifelse_listcomp.pyc and b/test/bytecode_2.7_run/01_ifelse_listcomp.pyc differ diff --git a/test/bytecode_3.6_run/01_fstring.pyc b/test/bytecode_3.6_run/01_fstring.pyc index 8f4d459f..89d675e6 100644 Binary files a/test/bytecode_3.6_run/01_fstring.pyc and b/test/bytecode_3.6_run/01_fstring.pyc differ diff --git a/test/bytecode_3.6_run/02_call_ex_kw.pyc b/test/bytecode_3.6_run/02_call_ex_kw.pyc new file mode 100644 
index 00000000..58742f27 Binary files /dev/null and b/test/bytecode_3.6_run/02_call_ex_kw.pyc differ diff --git a/test/bytecode_3.7/00_import.pyc b/test/bytecode_3.7/00_import.pyc index a833913b..36d81c93 100644 Binary files a/test/bytecode_3.7/00_import.pyc and b/test/bytecode_3.7/00_import.pyc differ diff --git a/test/bytecode_3.7/01_chained_compare.pyc b/test/bytecode_3.7/01_chained_compare.pyc new file mode 100644 index 00000000..8a03e344 Binary files /dev/null and b/test/bytecode_3.7/01_chained_compare.pyc differ diff --git a/test/bytecode_3.7_run/01_fstring.pyc b/test/bytecode_3.7_run/01_fstring.pyc index d3d62075..9e300b53 100644 Binary files a/test/bytecode_3.7_run/01_fstring.pyc and b/test/bytecode_3.7_run/01_fstring.pyc differ diff --git a/test/bytecode_3.7_run/02_call_ex_kw.pyc b/test/bytecode_3.7_run/02_call_ex_kw.pyc new file mode 100644 index 00000000..1a44e98b Binary files /dev/null and b/test/bytecode_3.7_run/02_call_ex_kw.pyc differ diff --git a/test/simple_source/bug26/01_ifelse_listcomp.py b/test/simple_source/bug26/01_ifelse_listcomp.py index efedeb63..3973c451 100644 --- a/test/simple_source/bug26/01_ifelse_listcomp.py +++ b/test/simple_source/bug26/01_ifelse_listcomp.py @@ -2,3 +2,10 @@ # This is RUNNABLE! 
assert [False, True, True, True, True] == [False if not a else True for a in range(5)] assert [True, False, False, False, False] == [False if a else True for a in range(5)] + +# From bug #225 +m = ['hi', 'he', 'ih', 'who', 'ho'] +ms = {} +for f in (f for f in m if f.startswith('h')): + ms[f] = 5 +assert ms == {'hi': 5, 'he': 5, 'ho': 5} diff --git a/test/simple_source/bug36/01_fstring.py b/test/simple_source/bug36/01_fstring.py index b19a8d39..728add4b 100644 --- a/test/simple_source/bug36/01_fstring.py +++ b/test/simple_source/bug36/01_fstring.py @@ -39,6 +39,30 @@ source = 'foo' source = (f"__file__ = r'''{os.path.abspath(filename)}'''\n" + source + "\ndel __file__") -# From 3.7.3 datalasses.py +# Note how { and } are *not* escaped here +f = 'one' +name = 'two' +assert(f"{f}{'{{name}}'} {f}{'{name}'}") == 'one{{name}} one{name}' + +# From 3.7.3 dataclasses.py log_rounds = 5 assert "05$" == f'{log_rounds:02d}$' + + +def testit(a, b, l): + # print(l) + return l + +# The call below shows the need for BUILD_STRING to count expr arguments. +# Also note that we use {{ }} to escape braces in contrast to the example +# above. 
+def _repr_fn(fields): + return testit('__repr__', + ('self',), + ['return xx + f"(' + + ', '.join([f"{f}={{self.{f}!r}}" + for f in fields]) + + ')"']) + +fields = ['a', 'b', 'c'] +assert _repr_fn(fields) == ['return xx + f"(a={self.a!r}, b={self.b!r}, c={self.c!r})"'] diff --git a/test/simple_source/bug36/02_call_ex_kw.py b/test/simple_source/bug36/02_call_ex_kw.py new file mode 100644 index 00000000..bb4fc699 --- /dev/null +++ b/test/simple_source/bug36/02_call_ex_kw.py @@ -0,0 +1,47 @@ +# From #227 +# Bug was not handling call_ex_kw correctly +# This appears in +# showparams(c, test="A", **extra_args) +# below + +def showparams(c, test, **extra_args): + return {'c': c, **extra_args, 'test': test} + +def f(c, **extra_args): + return showparams(c, test="A", **extra_args) + +def f1(c, d, **extra_args): + return showparams(c, test="B", **extra_args) + +def f2(**extra_args): + return showparams(1, test="C", **extra_args) + +def f3(c, *args, **extra_args): + return showparams(c, *args, **extra_args) + +assert f(1, a=2, b=3) == {'c': 1, 'a': 2, 'b': 3, 'test': 'A'} + +a = {'param1': 2} +assert f1('2', '{\'test\': "4"}', test2='a', **a) \ + == {'c': '2', 'test2': 'a', 'param1': 2, 'test': 'B'} +assert f1(2, '"3"', test2='a', **a) \ + == {'c': 2, 'test2': 'a', 'param1': 2, 'test': 'B'} +assert f1(False, '"3"', test2='a', **a) \ + == {'c': False, 'test2': 'a', 'param1': 2, 'test': 'B'} +assert f(2, test2='A', **a) \ + == {'c': 2, 'test2': 'A', 'param1': 2, 'test': 'A'} +assert f(str(2) + str(1), test2='a', **a) \ + == {'c': '21', 'test2': 'a', 'param1': 2, 'test': 'A'} +assert f1((a.get('a'), a.get('b')), a, test3='A', **a) \ + == {'c': (None, None), 'test3': 'A', 'param1': 2, 'test': 'B'} + +b = {'b1': 1, 'b2': 2} +assert f2(**a, **b) == \ + {'c': 1, 'param1': 2, 'b1': 1, 'b2': 2, 'test': 'C'} + +c = (2,) +d = (2, 3) +assert f(2, **a) == {'c': 2, 'param1': 2, 'test': 'A'} +assert f3(2, *c, **a) == {'c': 2, 'param1': 2, 'test': 2} +assert f3(*d, **a) == {'c': 2, 
'param1': 2, 'test': 3} + diff --git a/test/simple_source/bug37/01_chained_compare.py b/test/simple_source/bug37/01_chained_compare.py index f79f299d..a2347802 100644 --- a/test/simple_source/bug37/01_chained_compare.py +++ b/test/simple_source/bug37/01_chained_compare.py @@ -11,9 +11,16 @@ def chained_compare_b(a, obj): if -0x80000000 <= obj <= 0x7fffffff: return 5 +def chained_compare_c(a, d): + for i in len(d): + if a == d[i] != 2: + return 5 + chained_compare_a(3) try: chained_compare_a(8) except ValueError: pass chained_compare_b(True, 0x0) + +chained_compare_c(3, [3]) diff --git a/test/simple_source/stmts/00_import.py b/test/simple_source/stmts/00_import.py index 33cbb1a3..fd80daf6 100644 --- a/test/simple_source/stmts/00_import.py +++ b/test/simple_source/stmts/00_import.py @@ -7,3 +7,4 @@ import http.client as httpclient if len(__file__) == 0: # a.b.c should force consecutive LOAD_ATTRs import a.b.c as d + import stuff0.stuff1.stuff2.stuff3 as stuff3 diff --git a/uncompyle6/parser.py b/uncompyle6/parser.py index ee3214e9..9b4498b4 100644 --- a/uncompyle6/parser.py +++ b/uncompyle6/parser.py @@ -59,7 +59,6 @@ class PythonParser(GenericASTBuilder): 'imports_cont', 'kvlist_n', # Python 3.6+ - 'joined_str', 'come_from_loops', ] self.collect = frozenset(nt_list) @@ -81,7 +80,7 @@ class PythonParser(GenericASTBuilder): # FIXME: would love to do expr, sstmts, stmts and # so on but that would require major changes to the # semantic actions - self.singleton = frozenset(('str', 'joined_str', 'store', '_stmts', 'suite_stmts_opt', + self.singleton = frozenset(('str', 'store', '_stmts', 'suite_stmts_opt', 'inplace_op')) # Instructions filled in from scanner self.insts = [] diff --git a/uncompyle6/parsers/parse27.py b/uncompyle6/parsers/parse27.py index 0600ed53..79ec3fe2 100644 --- a/uncompyle6/parsers/parse27.py +++ b/uncompyle6/parsers/parse27.py @@ -229,6 +229,12 @@ class Python27Parser(Python2Parser): return invalid if rule == ('and', ('expr', 'jmp_false', 'expr', 
'\\e_come_from_opt')): + # If the instruction after the instructions forming "and" is a "YIELD_VALUE" + # then this is probably an "if" inside a comprehension. + if tokens[last] == 'YIELD_VALUE': + # Note: We might also consider testing last+1 being "POP_TOP" + return True + # Test that jmp_false jumps to the end of "and" # or that it jumps to the same place as the end of "and" jmp_false = ast[1][0] diff --git a/uncompyle6/parsers/parse36.py b/uncompyle6/parsers/parse36.py index 5bc3e6eb..f5d1151c 100644 --- a/uncompyle6/parsers/parse36.py +++ b/uncompyle6/parsers/parse36.py @@ -187,21 +187,14 @@ class Python36Parser(Python35Parser): self.add_unique_doc_rules(rules_str, customize) elif opname == 'FORMAT_VALUE': rules_str = """ - expr ::= fstring_single - fstring_single ::= expr FORMAT_VALUE - expr ::= fstring_expr - fstring_expr ::= expr FORMAT_VALUE - - str ::= LOAD_CONST - formatted_value ::= fstring_expr - formatted_value ::= str - + expr ::= formatted_value1 + formatted_value1 ::= expr FORMAT_VALUE """ self.add_unique_doc_rules(rules_str, customize) elif opname == 'FORMAT_VALUE_ATTR': rules_str = """ - expr ::= fstring_single - fstring_single ::= expr expr FORMAT_VALUE_ATTR + expr ::= formatted_value2 + formatted_value2 ::= expr expr FORMAT_VALUE_ATTR """ self.add_unique_doc_rules(rules_str, customize) elif opname == 'MAKE_FUNCTION_8': @@ -245,17 +238,12 @@ class Python36Parser(Python35Parser): """ self.addRule(rules_str, nop_func) - elif opname == 'BUILD_STRING': + elif opname.startswith('BUILD_STRING'): v = token.attr - joined_str_n = "formatted_value_%s" % v rules_str = """ - expr ::= fstring_multi - fstring_multi ::= joined_str BUILD_STRING - fstr ::= expr - joined_str ::= fstr+ - fstring_multi ::= %s BUILD_STRING - %s ::= %sBUILD_STRING - """ % (joined_str_n, joined_str_n, "formatted_value " * v) + expr ::= joined_str + joined_str ::= %sBUILD_STRING_%d + """ % ("expr " * v, v) self.add_unique_doc_rules(rules_str, customize) if 'FORMAT_VALUE_ATTR' in
self.seen_ops: rules_str = """ diff --git a/uncompyle6/parsers/parse37.py b/uncompyle6/parsers/parse37.py index 83b2d508..8018df8c 100644 --- a/uncompyle6/parsers/parse37.py +++ b/uncompyle6/parsers/parse37.py @@ -72,8 +72,8 @@ class Python37Parser(Python36Parser): POP_TOP POP_TOP POP_TOP POP_EXCEPT POP_TOP POP_BLOCK else_suite COME_FROM_LOOP - # Is there a pattern here? attributes ::= IMPORT_FROM ROT_TWO POP_TOP IMPORT_FROM + attributes ::= attributes ROT_TWO POP_TOP IMPORT_FROM attribute37 ::= expr LOAD_METHOD expr ::= attribute37 @@ -87,26 +87,37 @@ class Python37Parser(Python36Parser): compare_chained37 ::= expr compare_chained1a_37 compare_chained37 ::= expr compare_chained1b_37 + compare_chained37 ::= expr compare_chained1c_37 + compare_chained37_false ::= expr compare_chained1_false_37 + compare_chained37_false ::= expr compare_chained2_false_37 compare_chained1a_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP POP_JUMP_IF_FALSE compare_chained1a_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP POP_JUMP_IF_FALSE compare_chained2a_37 ELSE POP_TOP COME_FROM compare_chained1b_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP POP_JUMP_IF_FALSE compare_chained2b_37 POP_TOP JUMP_FORWARD COME_FROM + compare_chained1c_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP POP_JUMP_IF_FALSE + compare_chained2a_37 POP_TOP compare_chained1_false_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP POP_JUMP_IF_FALSE compare_chained2c_37 POP_TOP JUMP_FORWARD COME_FROM + compare_chained2_false_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP POP_JUMP_IF_FALSE + compare_chained2a_false_37 ELSE POP_TOP JUMP_BACK COME_FROM compare_chained2a_37 ::= expr COMPARE_OP POP_JUMP_IF_TRUE JUMP_FORWARD - compare_chained2a_false_37 ::= expr COMPARE_OP POP_JUMP_IF_FALSE JUMP_FORWARD + compare_chained2a_37 ::= expr COMPARE_OP POP_JUMP_IF_TRUE JUMP_BACK + compare_chained2a_false_37 ::= expr COMPARE_OP POP_JUMP_IF_FALSE jf_cfs compare_chained2b_37 ::= expr COMPARE_OP come_from_opt POP_JUMP_IF_FALSE JUMP_FORWARD ELSE + compare_chained2b_37 ::= expr 
COMPARE_OP come_from_opt POP_JUMP_IF_FALSE JUMP_FORWARD compare_chained2c_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP come_from_opt POP_JUMP_IF_FALSE compare_chained2a_false_37 ELSE + compare_chained2c_37 ::= expr DUP_TOP ROT_THREE COMPARE_OP come_from_opt POP_JUMP_IF_FALSE + compare_chained2a_false_37 - jf_cfs ::= JUMP_FORWARD come_froms + jf_cfs ::= JUMP_FORWARD _come_froms ifelsestmt ::= testexpr c_stmts_opt jf_cfs else_suite opt_come_from_except jmp_false37 ::= POP_JUMP_IF_FALSE COME_FROM diff --git a/uncompyle6/scanners/scanner3.py b/uncompyle6/scanners/scanner3.py index 68014e6f..9496b906 100644 --- a/uncompyle6/scanners/scanner3.py +++ b/uncompyle6/scanners/scanner3.py @@ -819,7 +819,14 @@ class Scanner3(Scanner): self.fixed_jumps[offset] = fix or match[-1] return else: - self.fixed_jumps[offset] = match[-1] + if self.version < 3.6: + # FIXME: this is putting in COME_FROMs in the wrong place. + # Fix up grammar so we don't need to do this. + # See cf_for_iter use in parser36.py + self.fixed_jumps[offset] = match[-1] + elif target > offset: + # Right now we only add COME_FROMs in forward (not loop) jumps + self.fixed_jumps[offset] = target return # op == POP_JUMP_IF_TRUE else: @@ -924,7 +931,7 @@ class Scanner3(Scanner): # Python 3.5 may remove as dead code a JUMP # instruction after a RETURN_VALUE. So we check # based on seeing SETUP_EXCEPT various places. 
- if self.version < 3.8 and code[rtarget] == self.opc.SETUP_EXCEPT: + if self.version < 3.6 and code[rtarget] == self.opc.SETUP_EXCEPT: return # Check that next instruction after pops and jump is # not from SETUP_EXCEPT diff --git a/uncompyle6/scanners/scanner36.py b/uncompyle6/scanners/scanner36.py index 5e279a7f..3ff181a2 100644 --- a/uncompyle6/scanners/scanner36.py +++ b/uncompyle6/scanners/scanner36.py @@ -31,6 +31,8 @@ class Scanner36(Scanner3): t.op == self.opc.CALL_FUNCTION_EX and t.attr & 1): t.kind = 'CALL_FUNCTION_EX_KW' pass + elif t.op == self.opc.BUILD_STRING: + t.kind = 'BUILD_STRING_%s' % t.attr elif t.op == self.opc.CALL_FUNCTION_KW: t.kind = 'CALL_FUNCTION_KW_%s' % t.attr elif t.op == self.opc.FORMAT_VALUE: diff --git a/uncompyle6/semantics/consts.py b/uncompyle6/semantics/consts.py index a2a98097..10707eea 100644 --- a/uncompyle6/semantics/consts.py +++ b/uncompyle6/semantics/consts.py @@ -27,81 +27,85 @@ else: maxint = sys.maxint -# Operator precidence -# See https://docs.python.org/2/reference/expressions.html -# or https://docs.python.org/3/reference/expressions.html -# for a list. +# Operator precedence. See +# https://docs.python.org/2/reference/expressions.html#operator-precedence +# or +# https://docs.python.org/3/reference/expressions.html#operator-precedence +# for a list. We keep the same top-to-bottom order here as in the above links, +# so we start with low precedence (high values) and go down in value. -# Things at the top of this list below with low-value precidence will -# tend to have parenthesis around them. Things at the bottom +# Things at the bottom of this list below with high precedence (low value) will +# tend to have parenthesis around them. Things at the top # of the list will tend not to have parenthesis around them. -# Note: The values in this table tend to be even value. Inside -# various templates we use odd values. Avoiding equal-precident comparisons +# Note: The values in this table are even numbers.
Inside +# various templates we use odd values. Avoiding equal-precedent comparisons +# avoids ambiguity about what to do when the precedence is equal. -PRECEDENCE = { - 'list': 0, - 'dict': 0, - 'unary_convert': 0, - 'dict_comp': 0, - 'set_comp': 0, - 'set_comp_expr': 0, - 'list_comp': 0, - 'generator_exp': 0, - 'attribute': 2, - 'subscript': 2, - 'subscript2': 2, - 'store_subscript': 2, +PRECEDENCE = { + 'yield': 102, + 'yield_from': 102, + + '_mklambda': 30, + + 'conditional': 28, # Conditional expression + 'conditional_lamdba': 28, # Lambda expression + 'conditional_not_lamdba': 28, # Lambda expression + 'conditionalnot': 28, + 'if_expr_true': 28, + 'ret_cond': 28, + + 'or': 26, # Boolean OR + 'ret_or': 26, + + 'and': 24, # Boolean AND + 'compare': 20, # in, not in, is, is not, <, <=, >, >=, !=, == + 'ret_and': 24, + 'unary_not': 22, # Boolean NOT + + 'BINARY_AND': 14, # Bitwise AND + 'BINARY_OR': 18, # Bitwise OR + 'BINARY_XOR': 16, # Bitwise XOR + + 'BINARY_LSHIFT': 12, # Shifts << + 'BINARY_RSHIFT': 12, # Shifts >> + + 'BINARY_ADD': 10, # + + 'BINARY_SUBTRACT': 10, # - + + 'BINARY_DIVIDE': 8, # / + 'BINARY_FLOOR_DIVIDE': 8, # // + 'BINARY_MATRIX_MULTIPLY': 8, # @ + 'BINARY_MODULO': 8, # Remainder, % + 'BINARY_MULTIPLY': 8, # * + 'BINARY_TRUE_DIVIDE': 8, # Division / + + 'unary_expr': 6, # +x, -x, ~x + + 'BINARY_POWER': 4, # Exponentiation, ** + + 'attribute': 2, # x.attribute + 'buildslice2': 2, # x[index] + 'buildslice3': 2, # x[index:index] + 'call': 2, # x(arguments...)
'delete_subscript': 2, 'slice0': 2, 'slice1': 2, 'slice2': 2, 'slice3': 2, - 'buildslice2': 2, - 'buildslice3': 2, - 'call': 2, + 'store_subscript': 2, + 'subscript': 2, + 'subscript2': 2, - 'BINARY_POWER': 4, - - 'unary_expr': 6, - - 'BINARY_MULTIPLY': 8, - 'BINARY_DIVIDE': 8, - 'BINARY_TRUE_DIVIDE': 8, - 'BINARY_FLOOR_DIVIDE': 8, - 'BINARY_MODULO': 8, - - 'BINARY_ADD': 10, - 'BINARY_SUBTRACT': 10, - - 'BINARY_LSHIFT': 12, - 'BINARY_RSHIFT': 12, - - 'BINARY_AND': 14, - 'BINARY_XOR': 16, - 'BINARY_OR': 18, - - 'compare': 20, - 'unary_not': 22, - 'and': 24, - 'ret_and': 24, - - 'or': 26, - 'ret_or': 26, - - 'conditional': 28, - 'conditional_lamdba': 28, - 'conditional_not_lamdba': 28, - 'conditionalnot': 28, - 'if_expr_true': 28, - 'ret_cond': 28, - - '_mklambda': 30, - - 'yield': 101, - 'yield_from': 101 + 'dict': 0, # {expressions...} + 'dict_comp': 0, + 'generator_exp': 0, # (expressions...) + 'list': 0, # [expressions...] + 'list_comp': 0, + 'set_comp': 0, + 'set_comp_expr': 0, + 'unary_convert': 0, } LINE_LENGTH = 80 @@ -216,7 +220,7 @@ TABLE_DIRECT = { 'IMPORT_FROM': ( '%{pattr}', ), 'attribute': ( '%c.%[1]{pattr}', - (0, 'expr')), + (0, 'expr')), 'LOAD_FAST': ( '%{pattr}', ), 'LOAD_NAME': ( '%{pattr}', ), 'LOAD_CLASSNAME': ( '%{pattr}', ), diff --git a/uncompyle6/semantics/customize35.py b/uncompyle6/semantics/customize35.py index 859d68ea..05639d5c 100644 --- a/uncompyle6/semantics/customize35.py +++ b/uncompyle6/semantics/customize35.py @@ -119,6 +119,12 @@ def customize_for_version35(self, version): def n_function_def(node): if self.version >= 3.6: code_node = node[0][0] + for n in node[0]: + if hasattr(n, 'attr') and iscode(n.attr): + code_node = n + break + pass + pass else: code_node = node[0][1] diff --git a/uncompyle6/semantics/customize36.py b/uncompyle6/semantics/customize36.py index 764a87e8..2245e648 100644 --- a/uncompyle6/semantics/customize36.py +++ b/uncompyle6/semantics/customize36.py @@ -33,27 +33,19 @@ def escape_format(s): 
####################### def customize_for_version36(self, version): - # Value 100 is important; it is exactly - # module/function precidence. - PRECEDENCE['call_kw'] = 100 - PRECEDENCE['call_kw36'] = 100 - PRECEDENCE['call_ex'] = 100 - PRECEDENCE['call_ex_kw'] = 100 - PRECEDENCE['call_ex_kw2'] = 100 - PRECEDENCE['call_ex_kw3'] = 100 - PRECEDENCE['call_ex_kw4'] = 100 + PRECEDENCE['call_kw'] = 0 + PRECEDENCE['call_kw36'] = 1 + PRECEDENCE['call_ex'] = 1 + PRECEDENCE['call_ex_kw'] = 1 + PRECEDENCE['call_ex_kw2'] = 1 + PRECEDENCE['call_ex_kw3'] = 1 + PRECEDENCE['call_ex_kw4'] = 1 PRECEDENCE['unmap_dict'] = 0 + PRECEDENCE['formatted_value1'] = 100 TABLE_DIRECT.update({ 'tryfinally36': ( '%|try:\n%+%c%-%|finally:\n%+%c%-\n\n', (1, 'returns'), 3 ), - 'fstring_expr': ( "{%c%{conversion}}", - (0, 'expr') ), - # FIXME: the below assumes the format strings - # don't have ''' in them. Fix this properly - 'fstring_single': ( "f'''{%c%{conversion}}'''", 0), - 'formatted_value_attr': ( "f'''{%c%{conversion}}%{string}'''", - (0, 'expr')), 'func_args36': ( "%c(**", 0), 'try_except36': ( '%|try:\n%+%c%-%c\n\n', 1, -2 ), 'except_return': ( '%|except:\n%+%c%-', 3 ), @@ -68,9 +60,6 @@ def customize_for_version36(self, version): 'call_ex' : ( '%c(%p)', (0, 'expr'), (1, 100)), - 'call_ex_kw' : ( - '%c(%p)', - (0, 'expr'), (2, 100)), }) @@ -81,20 +70,28 @@ def customize_for_version36(self, version): }) def build_unpack_tuple_with_call(node): - - if node[0] == 'expr': - tup = node[0][0] + n = node[0] + if n == 'expr': + n = n[0] + if n == 'tuple': + self.call36_tuple(n) + first = 1 + sep = ', *' + elif n == 'LOAD_CONST': + value = self.format_pos_args(n) + self.f.write(value) + first = 1 + sep = ', *' else: - tup = node[0] - pass - assert tup == 'tuple' - self.call36_tuple(tup) + first = 0 + sep = '*' buwc = node[-1] assert buwc.kind.startswith('BUILD_TUPLE_UNPACK_WITH_CALL') - for n in node[1:-1]: - self.f.write(', *') + for n in node[first:-1]: + self.f.write(sep) self.preorder(n) + sep = 
', *' pass self.prune() return @@ -120,45 +117,41 @@ def customize_for_version36(self, version): return self.n_build_map_unpack_with_call = build_unpack_map_with_call + def call_ex_kw(node): + """Handle CALL_FUNCTION_EX 1 (have KW) but with + BUILD_MAP_UNPACK_WITH_CALL""" + + expr = node[1] + assert expr == 'expr' + + value = self.format_pos_args(expr) + if value == '': + fmt = "%c(%p)" + else: + fmt = "%%c(%s, %%p)" % value + + self.template_engine( + (fmt, + (0, 'expr'), (2, 'build_map_unpack_with_call', 100)), node) + + self.prune() + self.n_call_ex_kw = call_ex_kw + def call_ex_kw2(node): """Handle CALL_FUNCTION_EX 2 (have KW) but with BUILD_{MAP,TUPLE}_UNPACK_WITH_CALL""" - # This is weird shit. Thanks Python! - self.preorder(node[0]) - self.write('(') - assert node[1] == 'build_tuple_unpack_with_call' - btuwc = node[1] - tup = btuwc[0] - if tup == 'expr': - tup = tup[0] - - if tup == 'LOAD_CONST': - self.write(', '.join(['"%s"' % t.replace('"','\\"') for t in tup.attr])) + value = self.format_pos_args(node[1]) + if value == '': + fmt = "%c(%p)" else: - assert tup == 'tuple' - self.call36_tuple(tup) + fmt = "%%c(%s, %%p)" % value - assert node[2] == 'build_map_unpack_with_call' + self.template_engine( + (fmt, + (0, 'expr'), (2, 'build_map_unpack_with_call', 100)), node) - self.write(', ') - d = node[2][0] - if d == 'expr': - d = d[0] - assert d == 'dict' - self.call36_dict(d) - - args = btuwc[1] - self.write(', *') - self.preorder(args) - - self.write(', **') - star_star_args = node[2][1] - if star_star_args == 'expr': - star_star_args = star_star_args[0] - self.preorder(star_star_args) - self.write(')') self.prune() self.n_call_ex_kw2 = call_ex_kw2 @@ -167,14 +160,13 @@ def customize_for_version36(self, version): BUILD_MAP_UNPACK_WITH_CALL""" self.preorder(node[0]) self.write('(') - args = node[1][0] - if args == 'expr': - args = args[0] - if args == 'tuple': - if self.call36_tuple(args) > 0: - self.write(', ') - pass + + value = 
self.format_pos_args(node[1][0]) + if value == '': pass + else: + self.write(value) + self.write(', ') self.write('*') self.preorder(node[1][1]) @@ -227,6 +219,25 @@ def customize_for_version36(self, version): self.prune() self.n_call_ex_kw4 = call_ex_kw4 + def format_pos_args(node): + """ + Positional args should format to: + (*(2, ), ...) -> (2, ...) + We remove starting and trailing parenthesis and ', ' if + tuple has only one element. + """ + value = self.traverse(node, indent='') + if value.startswith('('): + assert value.endswith(')') + value = value[1:-1].rstrip(" ") # Remove starting '(' and trailing ')' and additional spaces + if value == '': + pass # args is empty + else: + if value.endswith(','): # if args has only one item + value = value[:-1] + return value + self.format_pos_args = format_pos_args + def call36_tuple(node): """ A tuple used in a call, these are like normal tuples but they @@ -333,7 +344,7 @@ def customize_for_version36(self, version): self.call36_dict = call36_dict def n_call_kw36(node): - self.template_engine(("%c(", 0), node) + self.template_engine(("%p(", (0, 100)), node) keys = node[-2].attr num_kwargs = len(keys) num_posargs = len(node) - (num_kwargs + 2) @@ -408,7 +419,6 @@ def customize_for_version36(self, version): node.string = escape_format(fmt_node[0].attr) else: node.string = fmt_node - self.default(node) self.n_formatted_value_attr = n_formatted_value_attr @@ -419,60 +429,72 @@ def customize_for_version36(self, version): else: data = fmt_node.attr node.conversion = FSTRING_CONVERSION_MAP.get(data, '') + return node.conversion - def n_fstring_expr(node): - f_conversion(node) - self.default(node) - self.n_fstring_expr = n_fstring_expr - - def n_fstr(node): - if node[0] == 'expr' and node[0][0] == 'fstring_expr': - f_conversion(node[0][0]) - self.default(node[0][0]) - else: - value = strip_quotes(self.traverse(node[0], indent='')) - pass - self.write(value) + def n_formatted_value1(node): + expr = node[0] + assert expr == 
'expr' + value = self.traverse(expr, indent='') + conversion = f_conversion(node) + f_str = "f%s" % escape_string("{%s%s}" % (value, conversion)) + self.write(f_str) self.prune() - self.n_fstr = n_fstr - def n_fstring_single(node): - attr4 = len(node) == 3 and node[-1] == 'FORMAT_VALUE_ATTR' and node[-1].attr == 4 - if attr4 and hasattr(node[0][0], 'attr'): - assert node[0] == 'expr' + self.n_formatted_value1 = n_formatted_value1 + + def n_formatted_value2(node): + p = self.prec + self.prec = 100 + + expr = node[0] + assert expr == 'expr' + value = self.traverse(expr, indent='') + format_value_attr = node[-1] + assert format_value_attr == 'FORMAT_VALUE_ATTR' + attr = format_value_attr.attr + if attr == 4: assert node[1] == 'expr' - self.write("{%s:%s}" % (node[0][0].attr, node[1][0].attr)) - self.prune() + fmt = strip_quotes(self.traverse(node[1], indent='')) + conversion = ":%s" % fmt else: - f_conversion(node) - self.default(node) - self.n_fstring_single = n_fstring_single + conversion = FSTRING_CONVERSION_MAP.get(attr, '') + + f_str = "f%s" % escape_string("{%s%s}" % (value, conversion)) + self.write(f_str) + + self.prec = p + self.prune() + self.n_formatted_value2 = n_formatted_value2 def n_joined_str(node): + p = self.prec + self.prec = 100 + result = '' - for fstr_node in node: - assert fstr_node == 'fstr' - assert fstr_node[0] == 'expr' - subnode = fstr_node[0][0] - if subnode.kind == 'fstring_expr': - # Don't include outer f'...' 
- f_conversion(subnode) - data = strip_quotes(self.traverse(subnode, indent='')) - result += data - elif subnode == 'LOAD_CONST': - result += strip_quotes(escape_string(subnode.attr)) - elif subnode == 'fstring_single': - f_conversion(subnode) - data = self.traverse(subnode, indent='') - if data[0:1] == 'f': - data = strip_quotes(data[1:]) - result += data + for expr in node[:-1]: + assert expr == 'expr' + value = self.traverse(expr, indent='') + if expr[0].kind.startswith('formatted_value'): + # remove leading 'f' + assert value.startswith('f') + value = value[1:] pass else: - result += strip_quotes(self.traverse(subnode, indent='')) - pass + # {{ and }} in Python source-code format strings mean + # { and } respectively. But only when *not* part of a + # formatted value. However in the LOAD_CONST + # bytecode, the escaping of the braces has been + # removed. So we need to put back the braces escaping in + # reconstructing the source. + assert expr[0] == 'LOAD_CONST' + value = value.replace("{", "{{").replace("}", "}}") + + # Remove leading quotes + result += strip_quotes(value) pass self.write('f%s' % escape_string(result)) + + self.prec = p self.prune() self.n_joined_str = n_joined_str diff --git a/uncompyle6/semantics/customize37.py b/uncompyle6/semantics/customize37.py index b07ea563..7d97d4a1 100644 --- a/uncompyle6/semantics/customize37.py +++ b/uncompyle6/semantics/customize37.py @@ -45,9 +45,15 @@ def customize_for_version37(self, version): 'compare_chained1_false_37': ( ' %[3]{pattr.replace("-", " ")} %p %p', (0, 19), (-4, 19)), + 'compare_chained2_false_37': ( + ' %[3]{pattr.replace("-", " ")} %p %p', + (0, 19), (-5, 19)), 'compare_chained1b_37': ( ' %[3]{pattr.replace("-", " ")} %p %p', (0, 19), (-4, 19)), + 'compare_chained1c_37': ( + ' %[3]{pattr.replace("-", " ")} %p %p', + (0, 19), (-2, 19)), 'compare_chained2a_37': ( '%[1]{pattr.replace("-", " ")} %p', (0, 19) ), diff --git a/uncompyle6/semantics/pysource.py b/uncompyle6/semantics/pysource.py index 
64e6140e..8cc0f40b 100644 --- a/uncompyle6/semantics/pysource.py +++ b/uncompyle6/semantics/pysource.py @@ -1433,13 +1433,13 @@ class SourceWalker(GenericASTTraversal, object): assert node[n].kind.startswith('CALL_FUNCTION') if node[n].kind.startswith('CALL_FUNCTION_KW'): - # 3.6+ starts does this + # 3.6+ starts doing this kwargs = node[n-1].attr assert isinstance(kwargs, tuple) i = n - (len(kwargs)+1) j = 1 + n - node[n].attr else: - start = n-2 + i = start = n-2 for i in range(start, 0, -1): if not node[i].kind in ['expr', 'call', 'LOAD_CLASSNAME']: break @@ -1837,11 +1837,7 @@ class SourceWalker(GenericASTTraversal, object): typ = m.group('type') or '{' node = startnode if m.group('child'): - try: - node = node[int(m.group('child'))] - except: - from trepan.api import debug; debug() - pass + node = node[int(m.group('child'))] if typ == '%': self.write('%') elif typ == '+': diff --git a/uncompyle6/version.py b/uncompyle6/version.py index 36971cbe..4271e0be 100644 --- a/uncompyle6/version.py +++ b/uncompyle6/version.py @@ -12,4 +12,4 @@ # along with this program. If not, see . # This file is suitable for sourcing inside bash as # well as importing into Python -VERSION='3.3.2' # noqa +VERSION='3.3.3' # noqa
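
Note on the 3.6 changes above: the string handling added in `format_pos_args` and `n_joined_str` can be tried in isolation. What follows is a minimal standalone sketch under stated assumptions, not the project's code. `strip_call_parens` and `rebuild_fstring` are hypothetical names, the `(kind, text)` pairs stand in for the parse-tree nodes the real methods walk, and `repr()` is swapped in for the patch's `escape_string()`.

```python
def strip_call_parens(value):
    """Hypothetical stand-in for the trimming done in format_pos_args:
    turn the text of a positional-args tuple into bare call arguments."""
    if value.startswith('('):
        assert value.endswith(')')
        # Drop the outer parentheses and any trailing spaces.
        value = value[1:-1].rstrip(' ')
        # A one-element tuple leaves a trailing comma behind.
        if value.endswith(','):
            value = value[:-1]
    return value


def rebuild_fstring(parts):
    """Hypothetical stand-in for the joining done in n_joined_str.

    parts is a list of (kind, text) pairs; kind is 'const' for literal
    text as stored in bytecode, or 'expr' for a formatted expression.
    """
    body = ''
    for kind, text in parts:
        if kind == 'const':
            # Bytecode stores literal braces unescaped; double them again.
            body += text.replace('{', '{{').replace('}', '}}')
        else:
            body += '{%s}' % text
    # Assumption: repr() does the quoting here; the patch uses escape_string().
    return 'f' + repr(body)


print(strip_call_parens('(2,)'))
print(strip_call_parens('(2, 3)'))
print(rebuild_fstring([('expr', 'f'), ('const', '={self.'),
                       ('expr', 'f'), ('const', '!r}')]))
```

The last call rebuilds the inner f-string used by `_repr_fn` in `01_fstring.py`: constant pieces get their braces doubled again, while formatted expressions keep single braces. Evaluating the rebuilt source with `f` bound to `'a'` yields `a={self.a!r}`.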