Compare commits

...

93 Commits

Author SHA1 Message Date
rocky
da50394841 Release 2.9.8 news 2016-12-16 22:56:48 -05:00
rocky
13d5cd1a58 Get ready for release 2.9.8 2016-12-16 22:42:46 -05:00
rocky
08dcc7d820 Start to handle 3.5 build_map_unpack_with_call
3.6 also started but needs even more work
2016-12-16 20:39:24 -05:00
rocky
7755563b65 Some Python 3.6 bytecode->wordcode fixes 2016-12-15 02:54:25 -05:00
rocky
b43cbc050d Was passing wrong type 2016-12-13 20:05:08 -05:00
rocky
db7a26d47d option -g: show start-end range when possible 2016-12-11 09:02:28 -05:00
rocky
92166452c1 two misc changes
- track print_docstring move to help (used in python 3.1)
- verify: allow RETURN_VALUE to match RETURN_END_IF
2016-12-11 08:22:26 -05:00
rocky
96fa3ef381 3.2 needs --weak-verify 2016-12-10 07:35:31 -05:00
rocky
755415c7d8 Try testing on 3.2 2016-12-10 07:32:56 -05:00
rocky
b168e1de55 Can run in Python 3.1 and Python 3.2 2016-12-10 07:30:27 -05:00
rocky
38eed14b41 Another python 3 ELSE fixes and ...
Makefile:
  - test python 3.0 bytecode
  - turn full --verify back on Python 3.x
2016-12-10 06:36:22 -05:00
rocky
2c993f8c32 Another faulty Python3 ELSE tag remove 2016-12-10 00:43:55 -05:00
rocky
65858a4c74 Grammar check: ELSE on RHS is ok. 2016-12-09 22:22:01 -05:00
rocky
263c63e009 Back of some of the verification changes 2016-12-09 21:43:22 -05:00
rocky
813bce4697 Merge branch 'master' of github.com:rocky/python-uncompyle6 2016-12-09 21:13:31 -05:00
rocky
a5d2237435 Python 3.x else clause detection and..
- Strengthen verify check.
- weak verification on Python 3.5 for now
2016-12-09 21:10:10 -05:00
rocky
d22931cb49 Get ready for release 2.9.7
Some of the many lint things. Linting is kind of stupid though.
2016-12-04 09:36:30 -05:00
rocky
9cc2700160 Shorten Python3 grammars with + and * 2016-11-28 23:49:43 -05:00
rocky
a5a0f45dde Try new spark 2.5.1 grammar syntax shortcuts
This package I now declare stable
2016-11-28 07:55:00 -05:00
R. Bernstein
3c02fa7e36 Update README.rst 2016-11-28 07:47:18 -05:00
rocky
0d0f836f76 Limitations of decompiling control structures. 2016-11-27 14:20:35 -05:00
R. Bernstein
69c93cc665 Merge pull request #69 from rocky/ast-reduce-checks
AST reduce checks
2016-11-27 14:12:08 -05:00
rocky
97576e473d Python 3 while/else bug 2016-11-27 07:06:20 -05:00
rocky
1e324e0e8d Misc changes
scanner26.py: make scanner2.py and scanner26.py more alike
scanner2.py: check that return stmt is last in list. (May change)
main.py: show filename on verify error
test/*: add more
2016-11-26 21:41:45 -05:00
rocky
7ab4e1fbdb Start grammar reduction checks 2016-11-26 15:38:00 -05:00
rocky
abecb21671 2.7 grammar bug workaround. Fix docstring bug 2016-11-24 21:57:39 -05:00
rocky
8be6369bdf Better line number tracking
Indent Python 2 list comprehensions, albeit badly.
DRY code a little via indent_if_source_nl
2016-11-24 10:31:38 -05:00
rocky
8941417a54 <2.7 "if" detection and dup Python 3 grammar rule 2016-11-24 05:33:08 -05:00
rocky
cbcfd53dae Python 2.6 grammary bug and..
__pkginfo.py__: Bump spark_parser version for parse_flags 'dups'
2016-11-23 21:44:53 -05:00
rocky
df2ca51f4a Note that we now work on 2.4 and 2.5 2016-11-23 08:28:10 -05:00
rocky
4f4069c6b5 Merge branch 'come-from-type' 2016-11-23 08:26:35 -05:00
rocky
6aa1531972 Circle CI uses 2.7.10
and 2.7.12 is not available
2016-11-23 00:48:38 -05:00
rocky
4fcb385dc0 DRY Python3 grammar 2016-11-22 19:59:19 -05:00
rocky
260ddedbfd More detailed COME_FROMs
For now we only add COME_FROM_FINALLY and COME_FROM_WITH
and even here only on 2.7
2016-11-22 19:42:26 -05:00
rocky
f8917aaf88 Remove redundant 2.7 (and 2.x) grammar rules 2016-11-22 17:31:36 -05:00
rocky
c8550d5c9e Split out print_docstring
move from pysource.py to new helper.py
2016-11-22 05:29:50 -05:00
rocky
1aeb09cb8b Get ready for release 2.9.6 2016-11-20 21:38:43 -05:00
R. Bernstein
f575234fc8 Merge pull request #68 from rocky/line-mappings
Line mappings
2016-11-20 21:16:01 -05:00
rocky
abcd10628a Add --linemaps: shows line number correspondences 2016-11-20 21:11:38 -05:00
rocky
eb2b63ce9c Merge remote-tracking branch 'origin' into line-mappings 2016-11-20 18:41:19 -05:00
rocky
805e17988e Fix bug in docstring triple quotes
Problem was not escaping """ inside """.
Use ''' when possible; and when not, use: \"\"\".
2016-11-20 12:21:56 -05:00
rocky
80df5dcc95 Back off a test.
That means bugs in 2.7 still not fixed. Sigh.
2016-11-20 11:37:19 -05:00
rocky
2bc316d6f0 more 2.7 control flow bug fixing 2016-11-20 06:55:08 -05:00
rocky
195bbc746b Pass debug in scanner26 find_targets 2016-11-20 03:42:30 -05:00
rocky
0f56b4f476 Add debug option on Python 3 find_jump_targets() 2016-11-20 03:21:03 -05:00
rocky
94719918d4 A little closesr in PyPy 2.7 list comprehensions
pysource.py: note need to handle line breaks in list comprehensions
2016-11-20 03:17:49 -05:00
rocky
f2a3721d7d Start to improve detect_structure for 2.7 and 2.x
Add debug flag to find_jump_targets to show the structure we found.
When there are control-flow bugs, it's often reflected here.

scanner3.py: make code make more similar to 2.x code
2016-11-20 02:38:59 -05:00
rocky
79863ae122 Merge branch 'master' into line-mappings 2016-11-18 09:04:03 -05:00
rocky
d7f898b4fb New feature: show line number correspondences
Option --linemap on uncompile show how original source-code line numbers
map to uncompiled source lines
2016-11-18 09:02:00 -05:00
R. Bernstein
fe36c9e9f6 Merge pull request #67 from rocky/2.6-cf-ignore-if
2.6 cf ignore if
2016-11-17 03:53:10 -05:00
rocky
76ae1592d0 verify scanner2 vs scanner3 small changes...
verify.py: allow LOAD_CONST None to make LOAD_NAME 'None'
scanner{2,3}.py: make them look more alike
2016-11-17 03:43:39 -05:00
rocky
31d387749b More AST checking
Small fixes in output format
2016-11-16 07:28:19 -05:00
rocky
9e3026bd78 WIP Grammar changes - reinstatng COME_FROMs around ignore_if's 2016-11-15 23:44:22 -05:00
rocky
bfe7e7777d Revise MANIFEST.in with what we have 2016-11-15 23:44:22 -05:00
rocky
81b4941fda Merge branch '2.6-cf-ignore-if' of github.com:rocky/python-uncompyle6 into 2.6-cf-ignore-if 2016-11-15 13:26:22 -05:00
rocky
0f719d41fd Revise MANIFEST.in with what we have 2016-11-14 20:20:07 -05:00
rocky
766451cbb9 WIP remove COME_FROMs around ignore_if's 2016-11-14 09:27:56 -05:00
rocky
1e4dc52197 WIP remove COME_FROMs around ignore_if's 2016-11-14 07:27:13 -05:00
rocky
6073c77921 Show line numbers in 2.6 "after" asm ..
start to understand some of the Python 2.6 bytecode parse failures.
2016-11-14 00:30:23 -05:00
rocky
b6e53205dd Handle verify syntax errors...
Update README.rst stats
2016-11-13 18:55:23 -05:00
rocky
ee6dddd25a Administrivia: Fixes #66 2016-11-13 14:20:36 -05:00
rocky
968a54512b Get ready for release 2.9.5 2016-11-13 10:37:51 -05:00
rocky
a81ffe8963 Python 3 bugs ...
- Was using "while 1 .. else" improperly
- docstring indent bug: was indenting docstring improperly
2016-11-13 10:08:41 -05:00
rocky
3b9e48a3b6 Revise what works and what doesn't 2016-11-13 09:07:53 -05:00
rocky
80a4ad4f1b Python 3.0 while1 if bug...
Is a workaround. We really need more tagging in of SETUP_LOOP and COME_FROM.
2016-11-13 01:28:36 -05:00
rocky
50c2e1bda9 Revert augassign change but..
Make note of what's going on and add grammar test for bad
situations we have in Python 2.6 (and perhaps others)
2016-11-11 09:08:02 -05:00
rocky
f4999f6300 augassign semantic action bug 2016-11-11 08:41:55 -05:00
rocky
0f536b18fa Bug in detecting 3.3 default value in lambda 2016-11-10 23:59:51 -05:00
rocky
6fb879d0d8 Detect some erroneous decompilations
Until we can actually prevent these in grammar rules, we will warn of
improper decompilations.

Also, we now stop when we hit a decompile error. Previously we were not.
2016-11-10 22:29:39 -05:00
rocky
411eaaeafb Remove unused imports 2016-11-10 20:10:56 -05:00
rocky
36874c72e2 Possiby tidy grammar 2016-11-07 22:06:37 -05:00
rocky
7343575e55 Bump xdis to get correct 3.0 bytecodes 2016-11-06 18:01:03 -05:00
rocky
fef0567746 Some Python 3.4 grammar rules apply to Python 3.3 2016-11-06 10:00:10 -05:00
rocky
41f360e3dc Start bytecode 3.0 decompiling 2016-11-06 09:20:46 -05:00
rocky
5d10f7a0b0 Python 3.0 doesn't have POP_JUMP ops...
In some ways Python 3.0 code generation is more like Python 2.6 (and
before) than it is Python 2.7 or 3.0.
2016-11-06 08:55:03 -05:00
R. Bernstein
2a5eda631a Merge pull request #63 from rocky/python-3.0
Python 3.0
2016-11-05 21:17:12 -04:00
rocky
a685c60606 Make parse 3.0 be its own thing 2016-11-05 21:02:49 -04:00
rocky
d2ac293cf6 Merge branch 'master' into python-3.0 2016-11-05 21:01:50 -04:00
rocky
cd3cf5ec29 Use L. for line number prefix in asm and AST 2016-11-03 21:26:12 -04:00
rocky
2eaea447eb Get ready for release 2.9.4 2016-11-02 22:44:23 -04:00
rocky
287e98b4b1 Update unpyc3 info. 2016-11-02 20:42:31 -04:00
rocky
63e4c9343f Clean up annotation grammar a little 2016-11-01 15:50:19 -04:00
rocky
eab653afdd Full Python 3 annotations 2016-11-01 12:21:27 -04:00
rocky
7700446bb1 Note github unpyc3 and..
- Add source to bytecode_2.2/03_class_method.pyc
- more ignore
2016-10-30 21:16:33 -04:00
rocky
bfd2f77fbc More source-code line indention in make_function..
and remove Python 3 situations from make_function2()
2016-10-30 10:39:11 -04:00
rocky
1574bf4e1e More annotation processing in to make_function
Move return-value annotation determination from n_mkfunc_annotate to
make_function_annotate which is where other kinds of annotation handling
will also need to be done.
2016-10-29 16:03:02 -04:00
rocky
2328ca7a55 Break out make_function() into its own file.
It is already too complex and will get worse in Python 3.6.

Note: make_function in fragments.py is still inside and
probably needs fixup.
2016-10-29 07:22:58 -04:00
rocky
ccdd37611c More complete annotate handling
Still have a bit of work to do though.
2016-10-28 19:55:17 -04:00
rocky
2e355b6245 Expand annotate return to Python 3.4 2016-10-28 11:33:54 -04:00
rocky
9849f06ff6 Expand annotate handling to 3.3
(and possibly 3.2)

- DRY Python 3.1-3.3 grammar a little
2016-10-28 09:01:41 -04:00
rocky
0e7da031b2 Split out 3.1-3.3 parsers from parser3.py
This is anticipation of extending annotation to Python 3.2+
2016-10-28 07:07:18 -04:00
rocky
25dd67a135 Clean and fix Python 3 annotate arg return 2016-10-27 13:52:07 -04:00
rocky
b54a19c6ff Start Python 3.0 decoding
Fix some Python 3.1 bugs
2016-10-24 02:11:26 -04:00
88 changed files with 2427 additions and 922 deletions

3
.gitignore vendored
View File

@@ -1,7 +1,7 @@
*.pyo
*.pyc
*_dis
*~
*.pyc
/.cache
/.eggs
/.python-version
@@ -13,5 +13,6 @@
/nose-*.egg
/tmp
/uncompyle6.egg-info
/unpyc
__pycache__
build

View File

@@ -8,6 +8,7 @@ python:
- '2.6'
- '3.3'
- '3.4'
- '3.2'
install:
- pip install -r requirements.txt

380
ChangeLog
View File

@@ -1,6 +1,377 @@
2016-12-04 rocky <rb@dustyfeet.com>
* uncompyle6/version.py: Get ready for release 2.9.7
2016-11-28 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse3.py, uncompyle6/parsers/parse36.py:
Shorten Python3 grammars with + and *
2016-11-28 rocky <rb@dustyfeet.com>
* __pkginfo__.py, uncompyle6/parser.py,
uncompyle6/parsers/parse2.py: Try new spark 2.5.1 grammar syntax
shortcuts This package I now declare stable
2016-11-28 R. Bernstein <rocky@users.noreply.github.com>
* README.rst: Update README.rst
2016-11-27 rocky <rb@dustyfeet.com>
* README.rst: Limitations of decompiling control structures.
2016-11-27 R. Bernstein <rocky@users.noreply.github.com>
* : Merge pull request #69 from rocky/ast-reduce-checks AST reduce checks
2016-11-26 rocky <rb@dustyfeet.com>
* test/simple_source/bug26/03_elif_vs_continue.py,
uncompyle6/main.py, uncompyle6/parser.py,
uncompyle6/parsers/parse2.py, uncompyle6/scanners/scanner2.py,
uncompyle6/scanners/scanner26.py: Misc changes scanner26.py: make scanner2.py and scanner26.py more alike
scanner2.py: check that return stmt is last in list. (May change)
main.py: show filename on verify error test/*: add more
2016-11-25 rocky <rb@dustyfeet.com>
* __pkginfo__.py, test/Makefile, uncompyle6/parser.py,
uncompyle6/parsers/parse2.py, uncompyle6/parsers/parse3.py: Start
grammar reduction checks
2016-11-24 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse27.py, uncompyle6/scanners/scanner2.py,
uncompyle6/semantics/helper.py, uncompyle6/semantics/pysource.py:
2.7 grammar bug workaround. Fix docstring bug
2016-11-24 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/pysource.py: Better line number tracking Indent Python 2 list comprehensions, albeit badly. DRY code a
little via indent_if_source_nl
2016-11-24 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse3.py, uncompyle6/scanners/scanner2.py:
<2.7 "if" detection and dup Python 3 grammar rule
2016-11-23 rocky <rb@dustyfeet.com>
* __pkginfo__.py, pytest/test_grammar.py, uncompyle6/parser.py,
uncompyle6/parsers/parse26.py: Python 2.6 grammary bug and.. __pkginfo.py__: Bump spark_parser version for parse_flags 'dups'
2016-11-23 rocky <rb@dustyfeet.com>
* __pkginfo__.py: Note that we now work on 2.4 and 2.5
2016-11-23 rocky <rb@dustyfeet.com>
* : commit 6aa1531972de83ecab15b4c96b89c873ea5a7458 Author: rocky
<rb@dustyfeet.com> Date: Wed Nov 23 00:48:38 2016 -0500
2016-11-22 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse3.py, uncompyle6/parsers/parse32.py,
uncompyle6/parsers/parse33.py, uncompyle6/parsers/parse34.py,
uncompyle6/parsers/parse35.py: DRY Python3 grammar
2016-11-22 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse2.py, uncompyle6/parsers/parse27.py,
uncompyle6/scanners/scanner2.py: More detailed COME_FROMs For now we only add COME_FROM_FINALLY and COME_FROM_WITH and even
here only on 2.7
2016-11-22 rocky <rb@dustyfeet.com>
* circle.yml, pytest/test_grammar.py, tox.ini,
uncompyle6/parser.py, uncompyle6/parsers/parse2.py,
uncompyle6/parsers/parse27.py: Remove redundant 2.7 (and 2.x)
grammar rules
2016-11-22 rocky <rb@dustyfeet.com>
* pytest/test_docstring.py, uncompyle6/linenumbers.py,
uncompyle6/semantics/fragments.py, uncompyle6/semantics/helper.py,
uncompyle6/semantics/make_function.py,
uncompyle6/semantics/pysource.py: Split out print_docstring move from pysource.py to new helper.py
2016-11-20 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, uncompyle6/version.py: Get ready for release
2.9.6
2016-11-20 R. Bernstein <rocky@users.noreply.github.com>
* : Merge pull request #68 from rocky/line-mappings Line mappings
2016-11-20 rocky <rb@dustyfeet.com>
* : Merge remote-tracking branch 'origin' into line-mappings
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse27.py: Back off a test. That means bugs in 2.7 still not fixed. Sigh.
2016-11-20 rocky <rb@dustyfeet.com>
* pytest/test_fjt.py, uncompyle6/parsers/parse27.py,
uncompyle6/scanners/scanner2.py: more 2.7 control flow bug fixing
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/scanners/scanner26.py: Pass debug in scanner26
find_targets
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/scanners/scanner3.py: Add debug option on Python 3
find_jump_targets()
2016-11-20 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/pysource.py: A little closesr in PyPy 2.7
list comprehensions pysource.py: note need to handle line breaks in list comprehensions
2016-11-20 rocky <rb@dustyfeet.com>
* pytest/test_fjt.py, uncompyle6/scanners/scanner2.py,
uncompyle6/scanners/scanner26.py, uncompyle6/scanners/scanner3.py:
Start to improve detect_structure for 2.7 and 2.x Add debug flag to find_jump_targets to show the structure we found.
When there are control-flow bugs, it's often reflected here. scanner3.py: make code make more similar to 2.x code
2016-11-18 rocky <rb@dustyfeet.com>
* : commit d7f898b4fbf79d1f66eabadb25f0f9f0f38730cb Author: rocky
<rb@dustyfeet.com> Date: Fri Nov 18 09:02:00 2016 -0500
2016-11-17 R. Bernstein <rocky@users.noreply.github.com>
* : Merge pull request #67 from rocky/2.6-cf-ignore-if 2.6 cf ignore if
2016-11-16 rocky <rb@dustyfeet.com>
* test/simple_source/bug26/03_if_vs_and.py, uncompyle6/main.py,
uncompyle6/semantics/check_ast.py, uncompyle6/semantics/pysource.py:
More AST checking Small fixes in output format
2016-11-15 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse23.py, uncompyle6/parsers/parse26.py,
uncompyle6/scanners/scanner2.py: WIP Grammar changes - reinstatng
COME_FROMs around ignore_if's
2016-11-14 rocky <rb@dustyfeet.com>
* MANIFEST.in: Revise MANIFEST.in with what we have
2016-11-15 rocky <rb@dustyfeet.com>
* : commit 0f719d41fdf08d41de594abb1664ab42ff92bbdf Author: rocky
<rb@dustyfeet.com> Date: Mon Nov 14 20:20:07 2016 -0500
2016-11-14 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse26.py, uncompyle6/scanners/scanner2.py:
WIP remove COME_FROMs around ignore_if's
2016-11-14 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse26.py, uncompyle6/scanners/scanner2.py:
WIP remove COME_FROMs around ignore_if's
2016-11-14 rocky <rb@dustyfeet.com>
* uncompyle6/scanners/scanner2.py, uncompyle6/scanners/scanner26.py:
Show line numbers in 2.6 "after" asm .. start to understand some of the Python 2.6 bytecode parse failures.
2016-11-13 rocky <rb@dustyfeet.com>
* README.rst, uncompyle6/verify.py: Handle verify syntax errors... Update README.rst stats
2016-11-13 rocky <rb@dustyfeet.com>
* setup.py: Administrivia: Fixes #66
2016-11-13 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, uncompyle6/version.py: Get ready for release
2.9.5
2016-11-13 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse3.py, uncompyle6/parsers/parse30.py,
uncompyle6/semantics/make_function.py: Python 3 bugs ... - Was using "while 1 .. else" improperly - docstring indent bug: was indenting docstring improperly
2016-11-13 rocky <rb@dustyfeet.com>
* README.rst: Revise what works and what doesn't
2016-11-13 rocky <rb@dustyfeet.com>
* test/simple_source/bug30/02_while1_if_while1.py,
uncompyle6/parsers/parse3.py, uncompyle6/parsers/parse30.py,
uncompyle6/scanners/scanner3.py, uncompyle6/semantics/fragments.py,
uncompyle6/semantics/pysource.py: Python 3.0 while1 if bug... Is a workaround. We really need more tagging in of SETUP_LOOP and
COME_FROM.
2016-11-11 rocky <rb@dustyfeet.com>
* uncompyle6/parser.py, uncompyle6/semantics/check_ast.py,
uncompyle6/semantics/fragments.py, uncompyle6/semantics/pysource.py:
Revert augassign change but.. Make note of what's going on and add grammar test for bad situations
we have in Python 2.6 (and perhaps others)
2016-11-11 rocky <rb@dustyfeet.com>
* test/test_pyenvlib.py, uncompyle6/parser.py,
uncompyle6/semantics/pysource.py: augassign semantic action bug
2016-11-10 rocky <rb@dustyfeet.com>
* test/simple_source/bug33/02_pos_args.py,
test/simple_source/bug33/03_func_params.py,
uncompyle6/semantics/fragments.py,
uncompyle6/semantics/make_function.py: Bug in detecting 3.3 default
value in lambda
2016-11-10 rocky <rb@dustyfeet.com>
* uncompyle6/main.py, uncompyle6/semantics/check_ast.py,
uncompyle6/semantics/fragments.py, uncompyle6/semantics/pysource.py:
Detect some erroneous decompilations Until we can actually prevent these in grammar rules, we will warn
of improper decompilations. Also, we now stop when we hit a decompile error. Previously we were
not.
2016-11-10 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/pysource.py: Remove unused imports
2016-11-07 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse3.py, uncompyle6/parsers/parse30.py,
uncompyle6/parsers/parse32.py: Possiby tidy grammar
2016-11-06 rocky <rb@dustyfeet.com>
* __pkginfo__.py: Bump xdis to get correct 3.0 bytecodes
2016-11-06 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse33.py, uncompyle6/parsers/parse34.py: Some
Python 3.4 grammar rules apply to Python 3.3
2016-11-06 rocky <rb@dustyfeet.com>
* test/Makefile, test/test_pythonlib.py,
uncompyle6/parsers/parse30.py: Start bytecode 3.0 decompiling
2016-11-06 rocky <rb@dustyfeet.com>
* uncompyle6/parsers/parse30.py, uncompyle6/scanners/scanner3.py,
uncompyle6/scanners/scanner30.py: Python 3.0 doesn't have POP_JUMP
ops... In some ways Python 3.0 code generation is more like Python 2.6 (and
before) than it is Python 2.7 or 3.0.
2016-11-05 R. Bernstein <rocky@users.noreply.github.com>
* : Merge pull request #63 from rocky/python-3.0 Python 3.0
2016-11-05 rocky <rb@dustyfeet.com>
* : commit cd3cf5ec2960a733e9fedca9c4549caf33c2d1d0 Author: rocky
<rb@dustyfeet.com> Date: Thu Nov 3 21:26:12 2016 -0400
2016-11-02 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, __pkginfo__.py, uncompyle6/version.py: Get ready
for release 2.9.4
2016-11-02 rocky <rb@dustyfeet.com>
* README.rst: Update unpyc3 info.
2016-11-01 rocky <rb@dustyfeet.com>
* pytest/test_grammar.py, uncompyle6/parsers/parse3.py,
uncompyle6/parsers/parse31.py, uncompyle6/parsers/parse32.py,
uncompyle6/semantics/make_function.py: Clean up annotation grammar a
little
2016-11-01 rocky <rb@dustyfeet.com>
* test/simple_source/bug31/04_def_annotate.py,
uncompyle6/semantics/make_function.py: Full Python 3 annotations
2016-10-30 rocky <rb@dustyfeet.com>
* .gitignore, README.rst, test/simple_source/def/03_class_method.py:
Note github unpyc3 and.. - Add source to bytecode_2.2/03_class_method.pyc - more ignore
2016-10-30 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/make_function.py: More source-code line
indention in make_function.. and remove Python 3 situations from make_function2()
2016-10-29 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/make_function.py,
uncompyle6/semantics/pysource.py: More annotation processing in to
make_function Move return-value annotation determination from n_mkfunc_annotate to
make_function_annotate which is where other kinds of annotation
handling will also need to be done.
2016-10-29 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/fragments.py,
uncompyle6/semantics/make_function.py,
uncompyle6/semantics/parser_error.py,
uncompyle6/semantics/pysource.py: Break out make_function() into its
own file. It is already too complex and will get worse in Python 3.6. Note: make_function in fragments.py is still inside and probably
needs fixup.
2016-10-28 rocky <rb@dustyfeet.com>
* pytest/test_grammar.py, uncompyle6/parsers/parse3.py,
uncompyle6/parsers/parse31.py, uncompyle6/parsers/parse32.py,
uncompyle6/parsers/parse35.py, uncompyle6/semantics/pysource.py:
More complete annotate handling Still have a bit of work to do though.
2016-10-28 rocky <rb@dustyfeet.com>
* pytest/test_grammar.py, uncompyle6/parsers/parse3.py,
uncompyle6/parsers/parse32.py, uncompyle6/parsers/parse33.py,
uncompyle6/parsers/parse34.py, uncompyle6/semantics/pysource.py:
Expand annotate return to Python 3.4
2016-10-28 rocky <rb@dustyfeet.com>
* pytest/test_grammar.py, uncompyle6/parsers/parse31.py,
uncompyle6/parsers/parse32.py, uncompyle6/semantics/pysource.py:
Expand annotate handling to 3.3 (and possibly 3.2) - DRY Python 3.1-3.3 grammar a little
2016-10-28 rocky <rb@dustyfeet.com>
* uncompyle6/parser.py, uncompyle6/parsers/parse3.py,
uncompyle6/parsers/parse31.py, uncompyle6/parsers/parse32.py,
uncompyle6/parsers/parse33.py, uncompyle6/parsers/parse35.py: Split
out 3.1-3.3 parsers from parser3.py This is anticipation of extending annotation to Python 3.2+
2016-10-27 rocky <rb@dustyfeet.com>
* test/simple_source/bug31/04_def_annotate.py,
test/simple_source/bug31/04_def_attr.py,
uncompyle6/parsers/parse31.py, uncompyle6/semantics/pysource.py:
Clean and fix Python 3 annotate arg return
2016-10-26 rocky <rb@dustyfeet.com>
* uncompyle6/version.py: Get ready for release 2.9.3
* __pkginfo__.py: Dependencies stay within 2nd semantic level
2016-10-26 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, uncompyle6/version.py: Get ready for release
2.9.3
2016-10-26 rocky <rb@dustyfeet.com>
@@ -40,6 +411,13 @@
* uncompyle6/parsers/parse3.py, uncompyle6/scanners/scanner3.py: Fix
some Python 3.1 bugs
2016-10-24 rocky <rb@dustyfeet.com>
* Makefile, test/Makefile, test/test_pyenvlib.py,
uncompyle6/bin/uncompile.py, uncompyle6/parser.py,
uncompyle6/parsers/parse3.py, uncompyle6/scanner.py,
uncompyle6/scanners/scanner3.py: Start Python 3.0 decoding Fix some Python 3.1 bugs
2016-10-22 Daniel Bradburn <moagstar@gmail.com>
* : Merge pull request #60 from rocky/buildstring Buildstring

View File

@@ -2,10 +2,16 @@ include README.rst
include ChangeLog
include HISTORY.md
include LICENSE
include Makefile
include requirements.txt
include requirements-dev.txt
include DECOMPYLE-2.4-CHANGELOG.txt
include __pkginfo__.py
recursive-include uncompyle6 *.py
include bin/uncompyle6
include bin/pydisassemble
include pytest/Makefile
include test/Makefile
recursive-include test *.py *.pyc
recursive-include pytest *.py
recursive-include pytest/testdata *

View File

@@ -33,7 +33,7 @@ check-2.7 check-3.3 check-3.4: pytest
#: Tests for Python 3.2 and 3.5 - pytest doesn't work here
# Or rather 3.5 doesn't work not on Travis
check-3.1 check-3.2 check-3.5 check-3.6:
check-3.0 check-3.1 check-3.2 check-3.5 check-3.6:
$(MAKE) -C test $@
#:Tests for Python 2.6 (doesn't have pytest)

51
NEWS
View File

@@ -1,3 +1,54 @@
uncompyle6 2.9.7 2016-12-16
- Start to handle 3.5/3.6 build_map_unpack_with_call
- Some Python 3.6 bytecode to wordcode conversion fixes
- option -g: show start-end range when possible
- track print_docstring move to help (used in python 3.1)
- verify: allow RETURN_VALUE to match RETURN_END_IF
- some 3.2 compatibility
- Better Python 3 control flow detection by adding Pseudo ELSE opcodes
uncompyle6 2.9.6 2016-12-04
- Shorten Python3 grammars with + and *
this requires spark parser 1.5.1
- Add some AST reduction checks to improve
decompile accuracy. This too requires
spark parser 1.5.1
uncompyle6 2.9.6 2016-11-20
- Correct MANIFEST.in
- More AST grammar checking
- --linemapping option or linenumbers.line_number_mapping()
Shows correspondence of lines between source
and decompiled source
- Some control flow adjustments in code for 2.x.
This is probably an improvement in 2.6 and before.
For 2.7 things are just shuffled around a little. Sigh.
Overall I think we are getting more precise in
or analysis even if it is not always reflected
in the results.
- better control flow debugging output
- Python 2 and 3 detect structure code is more similar
- Handle Docstrings with embedded tiple quotes (""")
uncompyle6 2.9.5 2016-11-13
- Fix Python 3 bugs:
* improprer while 1 else
* docstring indent
* 3.3 default values in lambda expressions
* start 3.0 decompilation (needs newer xdis)
- Start grammar misparse checking
uncompyle6 2.9.4 2016-11-02
- Handle Python 3.x function annotations
- track def keywoard-parameter line-splitting in source code better
- bump min xdis version to mask previous xdis bug
uncompyle6 2.9.3 2016-10-26
Release forced by incompatiblity change in xdis 3.2.0.

View File

@@ -1,4 +1,4 @@
|buildstatus|
|buildstatus| |Supported Python Versions|
uncompyle6
==========
@@ -20,9 +20,9 @@ Why this?
There were a number of decompyle, uncompile, uncompyle2, uncompyle3
forks around. All of them came basically from the same code base, and
almost all of them no were no longer actively maintained. Only one
handled Python 3, and even there, only 3.2. This code pulls these
together and moves forward. It also addresses a number of open issues
in the previous forks.
handled Python 3, and even there, only 3.2 or 3.3 depending on which
code is used. This code pulls these together and moves forward. It
also addresses a number of open issues in the previous forks.
What makes this different from other CPython bytecode decompilers?: its
ability to deparse just fragments and give source-code information
@@ -43,7 +43,8 @@ information.
Requirements
------------
This project requires Python 2.6 or later, PyPy 3-2.4, or PyPy-5.0.1.
This project requires Python 2.6 or later, PyPy 3-2.4, or PyPy-5.0.1.
Python versions 2.3-2.7 are supported in the python-2.4 branch.
The bytecode files it can read has been tested on Python bytecodes from
versions 2.1-2.7, and 3.2-3.6 and the above-mentioned PyPy versions.
@@ -96,26 +97,41 @@ For usage help:
Known Bugs/Restrictions
-----------------------
About 90% of the decompilation verifies from Python 2.3.7 to Python
3.4.2 on the standard library packages I have on my system.
The biggest known and possibly fixable (but hard) problem has to do
with handling control flow. All of the Python decompilers I have looked
at have the same problem. In some cases we can detect an erroneous
decompilation and report that.
(Verification is the process of decompiling bytecode, compiling with a
Python for that byecode version, and then comparing the byetcode
About 90% of the decompilation of Python standard library packages in
Python 2.7.12 verifies correctly. Over 99% of Python 2.7 and 3.3-3.5
"weakly" verify. Python 2.6 drops down to 96% weakly verifying.
Other versions drop off in quality too.
*Verification* is the process of decompiling bytecode, compiling with
a Python for that bytecode version, and then comparing the bytecode
produced by the decompiled/compiled program. Some allowance is made
for inessential differences, but other semantically equivalent
differences are not caught.)
for inessential differences. But other semantically equivalent
differences are not caught. For example ``1 and 0`` is decompiled to
the equivalent ``0``; remnants of the first true evaluation (1) is
lost when Python compiles this. When Python next compiles ``0`` the
resulting code is simpler.
Later distributions average about 200 files. At this point, 2.7
decompilation is definitely better than uncompyle2. A number of bugs
have been fixed. We now handle more Python bytecodes than the old
decompyle program that handled Python bytecodes ranging from 1.5 to
2.4. There is some work do do on the lower end which is more
difficult for us since we don't have a Python interpreter for versions
1.5, 1.6 or 2.0.
*Weak Verification*
on the other hand doesn't check bytecode for equivalence but does
check to see if the resulting decompiled source is a valid Python
program by running the Python interpreter. Because the Python language
has changed so much, for best results you should use the same Python
Version in checking as used in the bytecode.
Python 3.5 largely works, but still has some bugs in it.
Python 3.6 changes things drastically by using word codes rather than
byte codes, and that needs to be addressed.
Later distributions average about 200 files. There is some work to do
on the lower end Python versions which is more difficult for us to
handle since we don't have a Python interpreter for versions 1.5, 1.6,
and 2.0.
Python 3.0 support is weak; Python 3.5 largely works, but still has
some bugs in it. Python 3.6 changes things drastically by using word
codes rather than byte codes. That has been addressed, but then it also
changes function call opcodes and its semantics.
Currently not all Python magic numbers are supported. Specifically in
some versions of Python, notably Python 3.6, the magic number has
@@ -125,6 +141,11 @@ which use their own magic and encrypt bytcode. With the exception of
the Dropbox's old Python 2.5 interpreter this kind of thing is not
handled.
We also don't handle PJOrion_ obfuscated code. For that try: PJOrion
Deobfuscator_ to unscramble the bytecode to get valid bytecode before
trying this tool.
There is lots to do, so please dig in and help.
See Also
@@ -132,6 +153,7 @@ See Also
* https://github.com/zrax/pycdc : supports all versions of Python and is written in C++
* https://code.google.com/archive/p/unpyc3/ : supports Python 3.2 only. The above projects use a different decompiling technique what is used here.
* https://github.com/figment/unpyc3/ : fork of above, but supports Python 3.3 only. Include some fixes like supporting function annotations
* The HISTORY_ file.
.. |downloads| image:: https://img.shields.io/pypi/dd/uncompyle6.svg
@@ -143,3 +165,7 @@ See Also
.. _this: https://github.com/rocky/python-uncompyle6/wiki/Deparsing-technology-and-its-use-in-exact-location-reporting
.. |buildstatus| image:: https://travis-ci.org/rocky/python-uncompyle6.svg
:target: https://travis-ci.org/rocky/python-uncompyle6
.. |Supported Python Versions| image:: https://img.shields.io/pypi/pyversions/uncompyle6.svg
:target: https://pypi.python.org/pypi/uncompyle6/
.. _PJOrion: http://www.koreanrandom.com/forum/topic/15280-pjorion-%D1%80%D0%B5%D0%B4%D0%B0%D0%BA%D1%82%D0%B8%D1%80%D0%BE%D0%B2%D0%B0%D0%BD%D0%B8%D0%B5-%D0%BA%D0%BE%D0%BC%D0%BF%D0%B8%D0%BB%D1%8F%D1%86%D0%B8%D1%8F-%D0%B4%D0%B5%D0%BA%D0%BE%D0%BC%D0%BF%D0%B8%D0%BB%D1%8F%D1%86%D0%B8%D1%8F-%D0%BE%D0%B1%D1%84
.. _Deobfuscator: https://github.com/extremecoders-re/PjOrion-Deobfuscator

View File

@@ -12,14 +12,16 @@ copyright = """
Copyright (C) 2015, 2016 Rocky Bernstein <rb@dustyfeet.com>.
"""
classifiers = ['Development Status :: 4 - Beta',
classifiers = ['Development Status :: 5 - Production/Stable',
'Intended Audience :: Developers',
'Operating System :: OS Independent',
'Programming Language :: Python',
'Programming Language :: Python :: 2',
'Programming Language :: Python :: 2.4',
'Programming Language :: Python :: 2.5',
'Programming Language :: Python :: 2.6',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3',
'Programming Language :: Python :: 3.1',
'Programming Language :: Python :: 3.2',
'Programming Language :: Python :: 3.3',
'Programming Language :: Python :: 3.4',
'Programming Language :: Python :: 3.5',
@@ -37,8 +39,8 @@ entry_points={
'pydisassemble=uncompyle6.bin.pydisassemble:main',
]}
ftp_url = None
install_requires = ['spark-parser >= 1.4.0, < 1.5.0',
'xdis >= 3.2.0, < 3.3.0']
install_requires = ['spark-parser >= 1.5.1, < 1.6.0',
'xdis >= 3.2.4, < 3.3.0']
license = 'MIT'
mailing_list = 'python-debugger@googlegroups.com'
modname = 'uncompyle6'

View File

@@ -1,6 +1,6 @@
machine:
python:
version: 2.7.8
version: 2.7.10
environment:
COMPILE: --compile

78
pytest/test_docstring.py Normal file
View File

@@ -0,0 +1,78 @@
import sys
from uncompyle6 import PYTHON3
if PYTHON3:
from io import StringIO
minint = -sys.maxsize-1
maxint = sys.maxsize
else:
from StringIO import StringIO
minint = -sys.maxint-1
maxint = sys.maxint
from uncompyle6.semantics.helper import print_docstring
class PrintFake():
def __init__(self):
self.pending_newlines = 0
self.f = StringIO()
def write(self, *data):
if (len(data) == 0) or (len(data) == 1 and data[0] == ''):
return
out = ''.join((str(j) for j in data))
n = 0
for i in out:
if i == '\n':
n += 1
if n == len(out):
self.pending_newlines = max(self.pending_newlines, n)
return
elif n:
self.pending_newlines = max(self.pending_newlines, n)
out = out[n:]
break
else:
break
if self.pending_newlines > 0:
self.f.write('\n'*self.pending_newlines)
self.pending_newlines = 0
for i in out[::-1]:
if i == '\n':
self.pending_newlines += 1
else:
break
if self.pending_newlines:
out = out[:-self.pending_newlines]
self.f.write(out)
def println(self, *data):
if data and not(len(data) == 1 and data[0] ==''):
self.write(*data)
self.pending_newlines = max(self.pending_newlines, 1)
return
pass
def test_docstring():
for doc, expect in (
("Now is the time",
' """Now is the time"""'),
("""
Now is the time
""",
''' """
Now is the time
"""''')
# (r'''func placeholder - ' and with ("""\nstring\n """)''',
# """ r'''func placeholder - ' and with (\"\"\"\nstring\n\"\"\")'''"""),
# (r"""func placeholder - ' and with ('''\nstring\n''') and \"\"\"\nstring\n\"\"\" """,
# """ r\"\"\"func placeholder - ' and with ('''\nstring\n''') and \"\"\"\nstring\n\"\"\" \"\"\"""")
):
o = PrintFake()
# print(doc)
# print(expect)
print_docstring(o, ' ', doc)
assert expect == o.f.getvalue()

View File

@@ -8,6 +8,18 @@ def bug(state, slotstate):
for key, value in slotstate.items():
setattr(state, key, 2)
# From 2.7 disassemble
# Problem is not getting while, because
# COME_FROM not added
def bug_loop(disassemble, tb=None):
if tb:
try:
tb = 5
except AttributeError:
raise RuntimeError
while tb: tb = tb.tb_next
disassemble(tb)
def test_if_in_for():
code = bug.__code__
scan = get_scanner(PYTHON_VERSION)
@@ -16,18 +28,35 @@ def test_if_in_for():
n = scan.setup_code(code)
scan.build_lines_data(code, n)
scan.build_prev_op(n)
fjt = scan.find_jump_targets()
fjt = scan.find_jump_targets(False)
assert {15: [3], 69: [66], 63: [18]} == fjt
assert scan.structs == \
[{'start': 0, 'end': 72, 'type': 'root'},
{'start': 18, 'end': 66, 'type': 'if-then'},
{'start': 15, 'end': 66, 'type': 'if-then'},
{'start': 31, 'end': 59, 'type': 'for-loop'},
{'start': 62, 'end': 63, 'type': 'for-else'}]
code = bug_loop.__code__
n = scan.setup_code(code)
scan.build_lines_data(code, n)
scan.build_prev_op(n)
fjt = scan.find_jump_targets(False)
assert{64: [42], 67: [42, 42], 42: [16, 41], 19: [6]} == fjt
assert scan.structs == [
{'start': 0, 'end': 80, 'type': 'root'},
{'start': 3, 'end': 64, 'type': 'if-then'},
{'start': 6, 'end': 15, 'type': 'try'},
{'start': 19, 'end': 38, 'type': 'except'},
{'start': 45, 'end': 67, 'type': 'while-loop'},
{'start': 70, 'end': 64, 'type': 'while-else'},
# previous bug was not mistaking while-loop for if-then
{'start': 48, 'end': 67, 'type': 'while-loop'}]
elif 3.2 < PYTHON_VERSION <= 3.4:
scan.code = array('B', code.co_code)
scan.build_lines_data(code)
scan.build_prev_op()
fjt = scan.find_jump_targets()
fjt = scan.find_jump_targets(False)
assert {69: [66], 63: [18]} == fjt
assert scan.structs == \
[{'end': 72, 'type': 'root', 'start': 0},

View File

@@ -1,6 +1,6 @@
import pytest, re
import re
from uncompyle6 import PYTHON_VERSION, PYTHON3, IS_PYPY # , PYTHON_VERSION
from uncompyle6.parser import get_python_parser
from uncompyle6.parser import get_python_parser, python_parser
from uncompyle6.scanner import get_scanner
def test_grammar():
@@ -16,14 +16,21 @@ def test_grammar():
p = get_python_parser(PYTHON_VERSION, is_pypy=IS_PYPY)
lhs, rhs, tokens, right_recursive = p.checkSets()
expect_lhs = set(['expr1024', 'pos_arg'])
unused_rhs = set(['build_list', 'call_function', 'mkfunc', 'mklambda',
unused_rhs = set(['build_list', 'call_function', 'mkfunc',
'mklambda',
'unpack', 'unpack_list'])
expect_right_recursive = [['designList', ('designator', 'DUP_TOP', 'designList')]]
if PYTHON3:
expect_lhs.add('load_genexpr')
unused_rhs = unused_rhs.union(set("""
except_pop_except genexpr classdefdeco2 listcomp
""".split()))
if 3.0 <= PYTHON_VERSION:
expect_lhs.add("annotate_arg")
expect_lhs.add("annotate_tuple")
unused_rhs.add("mkfunc_annotate")
pass
else:
expect_lhs.add('kwarg')
assert expect_lhs == set(lhs)
@@ -34,7 +41,7 @@ def test_grammar():
"""
JUMP_BACK CONTINUE RETURN_END_IF
COME_FROM COME_FROM_EXCEPT COME_FROM_LOOP COME_FROM_WITH
COME_FROM_FINALLY
COME_FROM_FINALLY ELSE
LOAD_GENEXPR LOAD_ASSERT LOAD_SETCOMP LOAD_DICTCOMP
LAMBDA_MARKER RETURN_LAST
""".split())
@@ -43,5 +50,14 @@ def test_grammar():
check_tokens(tokens, opcode_set)
elif PYTHON_VERSION == 3.4:
ignore_set.add('LOAD_CLASSNAME')
ignore_set.add('STORE_LOCALS')
opcode_set = set(s.opc.opname).union(ignore_set)
check_tokens(tokens, opcode_set)
def test_dup_rule():
import inspect
python_parser(PYTHON_VERSION, inspect.currentframe().f_code,
is_pypy=IS_PYPY,
parser_debug={
'dups': True, 'transition': False, 'reduce': False,
'rules': False, 'errorstack': None, 'context': True})

View File

@@ -24,6 +24,6 @@ setup(
py_modules = py_modules,
test_suite = 'nose.collector',
url = web,
setup_requires = ['nose>=1.0'],
tests_require = ['nose>=1.0'],
version = VERSION,
zip_safe = zip_safe)

View File

@@ -22,6 +22,10 @@ check:
#: Run working tests from Python 2.6 or 2.7
check-2.6 check-2.7: check-bytecode-2 check-bytecode-3 check-bytecode-1 check-2.7-ok
#: Run working tests from Python 3.0
check-3.0: check-bytecode
$(PYTHON) test_pythonlib.py --bytecode-3.0 --weak-verify $(COMPILE)
#: Run working tests from Python 3.1
check-3.1: check-bytecode
$(PYTHON) test_pythonlib.py --bytecode-3.1 --weak-verify $(COMPILE)
@@ -32,11 +36,11 @@ check-3.2: check-bytecode
#: Run working tests from Python 3.3
check-3.3: check-bytecode
$(PYTHON) test_pythonlib.py --bytecode-3.3 --weak-verify $(COMPILE)
$(PYTHON) test_pythonlib.py --bytecode-3.3 --verify $(COMPILE)
#: Run working tests from Python 3.4
check-3.4: check-bytecode check-3.4-ok check-2.7-ok
$(PYTHON) test_pythonlib.py --bytecode-3.4 --weak-verify $(COMPILE)
$(PYTHON) test_pythonlib.py --bytecode-3.4 --verify $(COMPILE)
#: Run working tests from Python 3.5
check-3.5: check-bytecode
@@ -62,7 +66,8 @@ check-bytecode-2:
#: Check deparsing bytecode 3.x only
check-bytecode-3:
$(PYTHON) test_pythonlib.py --bytecode-3.1 --bytecode-3.2 --bytecode-3.3 \
$(PYTHON) test_pythonlib.py --bytecode-3.0 \
--bytecode-3.1 --bytecode-3.2 --bytecode-3.3 \
--bytecode-3.4 --bytecode-3.5 --bytecode-pypy3.2
#: Check deparsing bytecode that works running Python 2 and Python 3
@@ -99,7 +104,11 @@ check-bytecode-2.6:
#: Check deparsing Python 2.7
check-bytecode-2.7:
$(PYTHON) test_pythonlib.py --bytecode-2.7
$(PYTHON) test_pythonlib.py --bytecode-2.7 --verify
#: Check deparsing Python 3.0
check-bytecode-3.0:
$(PYTHON) test_pythonlib.py --bytecode-3.0
#: Check deparsing Python 3.1
check-bytecode-3.1:

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.

View File

@@ -0,0 +1,18 @@
# Bug was using continue fouling up 1st elif, by confusing
# the "pass" for "continue" by not recognizing the if jump
# around it. We fixed by ignoring what's done in Python 2.7
# Better is better detection of control structures
def _compile_charset(charset, flags, code, fixup=None):
# compile charset subprogram
emit = code.append
if fixup is None:
fixup = 1
for op, av in charset:
if op is flags:
pass
elif op is code:
emit(fixup(av))
else:
raise RuntimeError
emit(5)

View File

@@ -0,0 +1,22 @@
# From 2.6 decimal
# Bug was not recognizing scope of if and
# turning it into xc == 1 and xe *= yc
def _power_exact(y, xc, yc, xe):
yc, ye = y.int, y.exp
while yc % 10 == 0:
yc //= 10
ye += 1
if xc == 1:
xe *= yc
while xe % 10 == 0:
xe //= 10
ye += 1
if ye < 0:
return None
exponent = xe * 10**ye
if y and xe:
xc = exponent
else:
xc = 0
return 5

View File

@@ -0,0 +1,9 @@
# From python 3.4 sre.pyc
while 1:
if __file__:
while 1:
if __file__:
break
raise RuntimeError
else:
raise RuntimeError

View File

@@ -1,5 +1,9 @@
# Bug in 3.1 _pyio.py. The -> "IOBase" is problematic
# Python 3 annotations
def foo(a, b: 'annotating b', c: int) -> float:
print(a + b + c)
# Python 3.1 _pyio.py uses the -> "IOBase" annotation
def open(file, mode = "r", buffering = None,
encoding = None, errors = None,
newline = None, closefd = True) -> "IOBase":

View File

@@ -0,0 +1,3 @@
# From Python 3.3.6 hmac.py
# Problem was getting wrong placement of positional args
digest_cons = lambda d=b'': 5

View File

@@ -0,0 +1,6 @@
# Bug from Python 3.3 codecs.py
# Bug is in 3.3 handling of this complicated parameter list
def __new__(cls, encode, decode, streamreader=None, streamwriter=None,
incrementalencoder=None, incrementaldecoder=None, name=None,
*, _is_text_encoding=None):
return

View File

@@ -0,0 +1,8 @@
# Bug from 3.4 threading. Bug is handling while/else
def acquire(self):
with self._cond:
while self:
rc = False
else:
rc = True
return rc

View File

@@ -0,0 +1 @@
f(**a, **b)

View File

@@ -0,0 +1,13 @@
# From Decompyle++
# File: 22_class_method.pyc (Python 2.2)
# An old-style Python class.
class MyClass:
def method(self, i):
if i is 5:
print 'five'
elif not (i is 2):
print 'not two'
else:
print '2'

View File

@@ -0,0 +1,7 @@
# uncompyle2 bug was not escaping """ properly
r'''func placeholder - with ("""\nstring\n""")'''
def foo():
r'''func placeholder - ' and with ("""\nstring\n""")'''
def bar():
r"""func placeholder - ' and with ('''\nstring\n''') and \"\"\"\nstring\n\"\"\" """

View File

@@ -8,7 +8,7 @@ Usage-Examples:
test_pyenvlib.py --all # decompile all tests (suite + libs)
test_pyenvlib.py --all --verify # decomyile all tests and verify results
test_pyenvlib.py --test # decompile only the testsuite
test_pyenvlib.py --2.7.11 --verify # decompile and verify python lib 2.7.11
test_pyenvlib.py --2.7.12 --verify # decompile and verify python lib 2.7.11
Adding own test-trees:
@@ -31,7 +31,8 @@ TEST_VERSIONS=('2.3.7', '2.4.6', '2.5.6', '2.6.9',
'pypy-2.4.0', 'pypy-2.6.1',
'pypy-5.0.1', 'pypy-5.3.1',
'2.7.10', '2.7.11', '2.7.12',
'3.1.5', '3.2.6', '3.3.5', '3.3.6',
'3.0.1', '3.1.5', '3.2.6',
'3.3.5', '3.3.6',
'3.4.2', '3.5.1')
target_base = '/tmp/py-dis/'

View File

@@ -80,7 +80,7 @@ for vers in (2.7, 3.4, 3.5, 3.6):
for vers in (1.5,
2.1, 2.2, 2.3, 2.4, 2.5, 2.6, 2.7,
3.1, 3.2, 3.3,
3.0, 3.1, 3.2, 3.3,
3.4, 3.5, 3.6, 'pypy3.2', 'pypy2.7'):
bytecode = "bytecode_%s" % vers
key = "bytecode-%s" % vers

View File

@@ -6,13 +6,14 @@ filename = *.py
ignore = C901,E113,E121,E122,E123,E124,E125,E126,E127,E128,E129,E201,E202,E203,E221,E222,E225,E226,E241,E242,E251,E261,E271,E272,E302,E401,E501,F401,E701,E702
[tox]
envlist = py26, py27, pypy
envlist = py27, py34, pypy
[testenv]
deps =
requests>=0.8.8
mock>=1.0.1
commands = python -W always setup.py nosetests {posargs}
hypothesis
pytest
flake8
commands = python -W always make test {posargs}
[testenv:py27]
deps =

View File

@@ -5,7 +5,7 @@
# Copyright (c) 2000-2002 by hartmut Goebel <h.goebel@crazy-compilers.com>
#
from __future__ import print_function
import sys, os, getopt, tempfile, time
import sys, os, getopt, time
program, ext = os.path.splitext(os.path.basename(__file__))
@@ -35,7 +35,8 @@ Options:
-p <integer> use <integer> number of processes
-r recurse directories looking for .pyc and .pyo files
--verify compare generated source with input byte-code
(requires -o)
--linemaps generated line number correspondencies between byte-code
and generated source output
--help show this message
Debugging Options:
@@ -64,8 +65,10 @@ def usage():
def main_bin():
if not (sys.version_info[0:2] in ((2, 6), (2, 7), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6))):
print('Error: %s requires Python 2.6, 2.7, 3.2, 3.3, 3.4, 3.5, or 3.6' % program,
if not (sys.version_info[0:2] in ((2, 6), (2, 7),
(3, 1), (3, 2), (3, 3),
(3, 4), (3, 5), (3, 6))):
print('Error: %s requires Python 2.6-2.7, or 3.1-3.6' % program,
file=sys.stderr)
sys.exit(-1)
@@ -79,8 +82,8 @@ def main_bin():
try:
opts, files = getopt.getopt(sys.argv[1:], 'hagtdrVo:c:p:',
'help asm grammar recurse timestamp tree verify version '
'showgrammar'.split(' '))
'help asm grammar linemaps recurse timestamp tree '
'verify version showgrammar'.split(' '))
except getopt.GetoptError as e:
print('%s: %s' % (os.path.basename(sys.argv[0]), e), file=sys.stderr)
sys.exit(-1)
@@ -95,8 +98,10 @@ def main_bin():
sys.exit(0)
elif opt == '--verify':
options['do_verify'] = True
elif opt == '--linemaps':
options['do_linemaps'] = True
elif opt in ('--asm', '-a'):
options['showasm'] = True
options['showasm'] = 'after'
options['do_verify'] = False
elif opt in ('--tree', '-t'):
options['showast'] = True
@@ -137,18 +142,13 @@ def main_bin():
if src_base:
sb_len = len( os.path.join(src_base, '') )
files = [f[sb_len:] for f in files]
del sb_len
if not files:
print("No files given", file=sys.stderr)
usage()
if outfile == '-':
if 'do_verify' in options and options['do_verify'] and len(files) == 1:
junk, outfile = tempfile.mkstemp(suffix=".pyc",
prefix=files[0][0:-4]+'-')
else:
outfile = None # use stdout
outfile = None # use stdout
elif outfile and os.path.isdir(outfile):
out_base = outfile; outfile = None
elif outfile and len(files) > 1:

61
uncompyle6/linenumbers.py Normal file
View File

@@ -0,0 +1,61 @@
from collections import deque
from xdis.code import iscode
from xdis.load import load_file, load_module
from xdis.main import get_opcode
from xdis.bytecode import Bytecode, findlinestarts, offset2line
def line_number_mapping(pyc_filename, src_filename):
(version, timestamp, magic_int, code1, is_pypy,
source_size) = load_module(pyc_filename)
try:
code2 = load_file(src_filename)
except SyntaxError as e:
return str(e)
queue = deque([code1, code2])
mappings = []
opc = get_opcode(version, is_pypy)
number_loop(queue, mappings, opc)
return sorted(mappings, key=lambda x: x[1])
def number_loop(queue, mappings, opc):
while len(queue) > 0:
code1 = queue.popleft()
code2 = queue.popleft()
assert code1.co_name == code2.co_name
linestarts_orig = findlinestarts(code1)
linestarts_uncompiled = list(findlinestarts(code2))
mappings += [[line, offset2line(offset, linestarts_uncompiled)] for offset, line in linestarts_orig]
bytecode1 = Bytecode(code1, opc)
bytecode2 = Bytecode(code2, opc)
instr2s = bytecode2.get_instructions(code2)
seen = set([code1.co_name])
for instr in bytecode1.get_instructions(code1):
next_code1 = None
if iscode(instr.argval):
next_code1 = instr.argval
if next_code1:
next_code2 = None
while not next_code2:
try:
instr2 = next(instr2s)
if iscode(instr2.argval):
next_code2 = instr2.argval
pass
except StopIteration:
break
pass
if next_code2:
assert next_code1.co_name == next_code2.co_name
if next_code1.co_name not in seen:
seen.add(next_code1.co_name)
queue.append(next_code1)
queue.append(next_code2)
pass
pass
pass
pass

View File

@@ -1,5 +1,5 @@
from __future__ import print_function
import datetime, os, sys
import datetime, os, subprocess, sys, tempfile
from uncompyle6 import verify, IS_PYPY
from xdis.code import iscode
@@ -7,11 +7,12 @@ from uncompyle6.disas import check_object_path
from uncompyle6.semantics import pysource
from uncompyle6.parser import ParserError
from uncompyle6.version import VERSION
from uncompyle6.linenumbers import line_number_mapping
from xdis.load import load_module
def uncompyle(
bytecode_version, co, out=None, showasm=False, showast=False,
bytecode_version, co, out=None, showasm=None, showast=False,
timestamp=None, showgrammar=False, code_objects={},
source_size=None, is_pypy=False, magic_int=None):
"""
@@ -45,15 +46,10 @@ def uncompyle(
is_pypy=is_pypy)
except pysource.SourceWalkerError as e:
# deparsing failed
print("\n")
print(co.co_filename)
if real_out != out:
print("\n", file=real_out)
print(e, file=real_out)
raise pysource.SourceWalkerError(str(e))
def uncompyle_file(filename, outstream=None, showasm=False, showast=False,
def uncompyle_file(filename, outstream=None, showasm=None, showast=False,
showgrammar=False):
"""
decompile Python byte-code file (.pyc)
@@ -64,7 +60,6 @@ def uncompyle_file(filename, outstream=None, showasm=False, showast=False,
(version, timestamp, magic_int, co, is_pypy,
source_size) = load_module(filename, code_objects)
if type(co) == list:
for con in co:
uncompyle(version, con, outstream, showasm, showast,
@@ -79,8 +74,9 @@ def uncompyle_file(filename, outstream=None, showasm=False, showast=False,
# FIXME: combine into an options parameter
def main(in_base, out_base, files, codes, outfile=None,
showasm=False, showast=False, do_verify=False,
showgrammar=False, raise_on_error=False):
showasm=None, showast=False, do_verify=False,
showgrammar=False, raise_on_error=False,
do_linemaps=False):
"""
in_base base directory for input files
out_base base directory for output files (ignored when
@@ -103,7 +99,6 @@ def main(in_base, out_base, files, codes, outfile=None,
pass
return open(outfile, 'w')
of = outfile
tot_files = okay_files = failed_files = verify_failed_files = 0
# for code in codes:
@@ -121,10 +116,21 @@ def main(in_base, out_base, files, codes, outfile=None,
# print (infile, file=sys.stderr)
if of: # outfile was given as parameter
if outfile: # outfile was given as parameter
outstream = _get_outstream(outfile)
elif out_base is None:
outstream = sys.stdout
if do_linemaps or do_verify:
prefix = os.path.basename(filename)
if prefix.endswith('.py'):
prefix = prefix[:-len('.py')]
junk, outfile = tempfile.mkstemp(suffix=".py",
prefix=prefix)
# Unbuffer output
sys.stdout = os.fdopen(sys.stdout.fileno(), 'w', 0)
tee = subprocess.Popen(["tee", outfile], stdin=subprocess.PIPE)
os.dup2(tee.stdin.fileno(), sys.stdout.fileno())
os.dup2(tee.stdin.fileno(), sys.stderr.fileno())
else:
if filename.endswith('.pyc'):
outfile = os.path.join(out_base, filename[0:-1])
@@ -137,13 +143,15 @@ def main(in_base, out_base, files, codes, outfile=None,
try:
uncompyle_file(infile, outstream, showasm, showast, showgrammar)
tot_files += 1
except (ValueError, SyntaxError, ParserError) as e:
sys.stderr.write("\n# file %s\n# %s" % (infile, e))
except (ValueError, SyntaxError, ParserError, pysource.SourceWalkerError) as e:
sys.stdout.write("\n")
sys.stderr.write("\n# file %s\n# %s\n" % (infile, e))
failed_files += 1
except KeyboardInterrupt:
if outfile:
outstream.close()
os.remove(outfile)
sys.stdout.write("\n")
sys.stderr.write("\nLast file: %s " % (infile))
raise
# except:
@@ -156,7 +164,15 @@ def main(in_base, out_base, files, codes, outfile=None,
# sys.stderr.write("\n# Can't uncompile %s\n" % infile)
else: # uncompile successful
if outfile:
if do_linemaps:
mapping = line_number_mapping(infile, outfile)
outstream.write("\n\n## Line number correspondences\n")
import pprint
s = pprint.pformat(mapping, indent=2, width=80)
s2 = '##' + '\n##'.join(s.split("\n")) + "\n"
outstream.write(s2)
outstream.close()
if do_verify:
weak_verify = do_verify == 'weak'
try:
@@ -171,17 +187,16 @@ def main(in_base, out_base, files, codes, outfile=None,
print(e)
verify_failed_files += 1
os.rename(outfile, outfile + '_unverified')
sys.stderr.write("### Error Verifying %s\n" % filename)
sys.stderr.write(str(e) + "\n")
if not outfile:
print("### Error Verifiying %s" % filename, file=sys.stderr)
print(e, file=sys.stderr)
if raise_on_error:
raise
pass
pass
pass
elif do_verify:
print("\n### uncompile successful, but no file to compare against",
file=sys.stderr)
sys.stderr.write("\n### uncompile successful, but no file to compare against\n")
pass
else:
okay_files += 1

View File

@@ -69,6 +69,33 @@ class PythonParser(GenericASTBuilder):
for i in dir(self):
setattr(self, i, None)
def debug_reduce(self, rule, tokens, parent, i):
"""Customized format and print for our kind of tokens
which gets called in debugging grammar reduce rules
"""
def fix(c):
s = str(c)
i = s.find('_')
return s if i == -1 else s[:i]
prefix = ''
if parent and tokens:
p_token = tokens[parent]
if hasattr(p_token, 'linestart') and p_token.linestart:
prefix = 'L.%3d: ' % p_token.linestart
else:
prefix = ' '
if hasattr(p_token, 'offset'):
prefix += "%3s" % fix(p_token.offset)
if len(rule[1]) > 1:
prefix += '-%-3s ' % fix(tokens[i-1].offset)
else:
prefix += ' '
else:
prefix = ' '
print("%s%s ::= %s" % (prefix, rule[0], ' '.join(rule[1])))
def error(self, instructions, index):
# Find the last line boundary
for start in range(index, -1, -1):
@@ -117,9 +144,9 @@ class PythonParser(GenericASTBuilder):
# print >> sys.stderr, 'resolve', str(list)
return GenericASTBuilder.resolve(self, list)
##############################################
## Common Python 2 and Python 3 grammar rules
##############################################
###############################################
# Common Python 2 and Python 3 grammar rules #
###############################################
def p_start(self, args):
'''
# The start or goal symbol
@@ -138,8 +165,7 @@ class PythonParser(GenericASTBuilder):
"""
passstmt ::=
_stmts ::= _stmts stmt
_stmts ::= stmt
_stmts ::= stmt+
# statements with continue
c_stmts ::= _stmts
@@ -246,13 +272,11 @@ class PythonParser(GenericASTBuilder):
# Zero or more COME_FROMs
# loops can have this
_come_from ::= _come_from COME_FROM
_come_from ::=
_come_from ::= COME_FROM*
# Zero or one COME_FROM
# And/or expressions have this
come_from_opt ::= COME_FROM
come_from_opt ::=
come_from_opt ::= COME_FROM?
"""
def p_dictcomp(self, args):
@@ -267,7 +291,12 @@ class PythonParser(GenericASTBuilder):
'''
stmt ::= augassign1
stmt ::= augassign2
# This is odd in that other augassign1's have only 3 slots
# The designator isn't used as that's supposed to be also
# indicated in the first expr
augassign1 ::= expr expr inplace_op designator
augassign1 ::= expr expr inplace_op ROT_THREE STORE_SUBSCR
augassign2 ::= expr DUP_TOP LOAD_ATTR expr
inplace_op ROT_TWO STORE_ATTR
@@ -420,7 +449,6 @@ class PythonParser(GenericASTBuilder):
expr ::= unary_not
expr ::= binary_subscr
expr ::= binary_subscr2
expr ::= load_attr
expr ::= get_iter
expr ::= buildslice2
expr ::= buildslice3
@@ -462,6 +490,8 @@ class PythonParser(GenericASTBuilder):
_mklambda ::= load_closure mklambda
_mklambda ::= mklambda
# "and" where the first part of the and is true,
# so there is only the 2nd part to evaluate
and2 ::= _jump jmp_false COME_FROM expr COME_FROM
expr ::= conditional
@@ -551,7 +581,7 @@ def parse(p, tokens, customize):
def get_python_parser(
version, debug_parser={}, compile_mode='exec',
version, debug_parser=PARSER_DEFAULT_DEBUG, compile_mode='exec',
is_pypy = False):
"""Returns parser object for Python version 2 or 3, 3.2, 3.5on,
etc., depending on the parameters passed. *compile_mode* is either
@@ -621,22 +651,30 @@ def get_python_parser(
pass
else:
import uncompyle6.parsers.parse3 as parse3
if version == 3.1:
if version == 3.0:
import uncompyle6.parsers.parse30 as parse30
if compile_mode == 'exec':
p = parse30.Python30Parser(debug_parser)
else:
p = parse30.Python30ParserSingle(debug_parser)
elif version == 3.1:
import uncompyle6.parsers.parse31 as parse31
if compile_mode == 'exec':
import uncompyle6.parsers.parse31 as parse31
p = parse31.Python31Parser(debug_parser)
else:
p = parse3.Python31ParserSingle(debug_parser)
p = parse31.Python31ParserSingle(debug_parser)
elif version == 3.2:
import uncompyle6.parsers.parse32 as parse32
if compile_mode == 'exec':
p = parse3.Python32Parser(debug_parser)
p = parse32.Python32Parser(debug_parser)
else:
p = parse3.Python32ParserSingle(debug_parser)
p = parse32.Python32ParserSingle(debug_parser)
elif version == 3.3:
import uncompyle6.parsers.parse33 as parse33
if compile_mode == 'exec':
p = parse3.Python33Parser(debug_parser)
p = parse33.Python33Parser(debug_parser)
else:
p = parse3.Python33ParserSingle(debug_parser)
p = parse33.Python33ParserSingle(debug_parser)
elif version == 3.4:
import uncompyle6.parsers.parse34 as parse34
if compile_mode == 'exec':
@@ -697,8 +735,8 @@ def python_parser(version, co, out=sys.stdout, showasm=False,
maybe_show_asm(showasm, tokens)
# For heavy grammar debugging
parser_debug = {'rules': True, 'transition': True, 'reduce' : True,
'showstack': 'full'}
# parser_debug = {'rules': True, 'transition': True, 'reduce' : True,
# 'showstack': 'full'}
p = get_python_parser(version, parser_debug)
return parse(p, tokens, customize)

View File

@@ -33,7 +33,7 @@ class AST(spark_AST):
else:
child = node.__repr1__(indent, None)
else:
inst = str(node)
inst = node.format(line_prefix='L.')
if inst.startswith("\n"):
# Nuke leading \n
inst = inst[1:]

View File

@@ -9,7 +9,7 @@ e.g. 5, myvariable, "for", etc. they are CPython Bytecode tokens,
e.g. "LOAD_CONST 5", "STORE NAME myvariable", "SETUP_LOOP", etc.
If we succeed in creating a parse tree, then we have a Python program
that a later phase can tern into a sequence of ASCII text.
that a later phase can turn into a sequence of ASCII text.
"""
from __future__ import print_function
@@ -25,20 +25,18 @@ class Python2Parser(PythonParser):
self.new_rules = set()
def p_print2(self, args):
'''
"""
stmt ::= print_items_stmt
stmt ::= print_nl
stmt ::= print_items_nl_stmt
print_items_stmt ::= expr PRINT_ITEM print_items_opt
print_items_nl_stmt ::= expr PRINT_ITEM print_items_opt PRINT_NEWLINE_CONT
print_items_opt ::= print_items
print_items_opt ::=
print_items ::= print_items print_item
print_items ::= print_item
print_item ::= expr PRINT_ITEM_CONT
print_nl ::= PRINT_NEWLINE
'''
print_items_opt ::= print_items?
print_items ::= print_item+
print_item ::= expr PRINT_ITEM_CONT
print_nl ::= PRINT_NEWLINE
"""
def p_stmt2(self, args):
"""
@@ -76,8 +74,6 @@ class Python2Parser(PythonParser):
return_if_stmts ::= _stmts return_if_stmt
return_if_stmt ::= ret_expr RETURN_END_IF
stmt ::= importstmt
stmt ::= break_stmt
break_stmt ::= BREAK_LOOP
@@ -171,8 +167,7 @@ class Python2Parser(PythonParser):
try_middle ::= jmp_abs COME_FROM except_stmts
END_FINALLY
except_stmts ::= except_stmts except_stmt
except_stmts ::= except_stmt
except_stmts ::= except_stmt+
except_stmt ::= except_cond1 except_suite
except_stmt ::= except
@@ -210,14 +205,6 @@ class Python2Parser(PythonParser):
and ::= expr jmp_false expr come_from_opt
or ::= expr jmp_true expr come_from_opt
slice0 ::= expr SLICE+0
slice0 ::= expr DUP_TOP SLICE+0
slice1 ::= expr expr SLICE+1
slice1 ::= expr expr DUP_TOPX_2 SLICE+1
slice2 ::= expr expr SLICE+2
slice2 ::= expr expr DUP_TOPX_2 SLICE+2
slice3 ::= expr expr expr SLICE+3
slice3 ::= expr expr expr DUP_TOPX_3 SLICE+3
unary_convert ::= expr UNARY_CONVERT
# In Python 3, DUP_TOPX_2 is DUP_TOP_TWO
@@ -248,11 +235,10 @@ class Python2Parser(PythonParser):
"""
inplace_op ::= INPLACE_DIVIDE
binary_op ::= BINARY_DIVIDE
binary_subscr2 ::= expr expr DUP_TOPX_2 BINARY_SUBSCR
"""
def add_custom_rules(self, tokens, customize):
'''
"""
Special handling for opcodes such as those that take a variable number
of arguments -- we add a new rule for each:
@@ -271,7 +257,7 @@ class Python2Parser(PythonParser):
expr ::= expr {expr}^n CALL_FUNCTION_KW_n POP_TOP
PyPy adds custom rules here as well
'''
"""
for opname, v in list(customize.items()):
opname_base = opname[:opname.rfind('_')]
if opname == 'PyPy':
@@ -347,7 +333,7 @@ class Python2Parser(PythonParser):
# always be the case.
self.add_unique_rules([
"stmt ::= tryfinallystmt_pypy",
"tryfinallystmt_pypy ::= SETUP_FINALLY suite_stmts_opt COME_FROM "
"tryfinallystmt_pypy ::= SETUP_FINALLY suite_stmts_opt COME_FROM_FINALLY "
"suite_stmts_opt END_FINALLY"
], customize)
continue
@@ -400,6 +386,26 @@ class Python2Parser(PythonParser):
else:
raise Exception('unknown customize token %s' % opname)
self.add_unique_rule(rule, opname_base, v, customize)
pass
self.check_reduce['augassign1'] = 'AST'
self.check_reduce['augassign2'] = 'AST'
self.check_reduce['_stmts'] = 'AST'
return
def reduce_is_invalid(self, rule, ast, tokens, first, last):
lhs = rule[0]
if lhs in ('augassign1', 'augassign2') and ast[0][0] == 'and':
return True
elif lhs == '_stmts':
for i, stmt in enumerate(ast):
if stmt == '_stmts':
stmt = stmt[0]
assert stmt == 'stmt'
if stmt[0] == 'return_stmt':
return i+1 != len(ast)
pass
return False
return False
class Python2ParserSingle(Python2Parser, PythonParserSingle):
pass

View File

@@ -18,6 +18,9 @@ class Python23Parser(Python24Parser):
# of Python
_while1test ::= SETUP_LOOP JUMP_FORWARD JUMP_IF_FALSE POP_TOP COME_FROM
while1stmt ::= _while1test l_stmts_opt JUMP_BACK
POP_TOP POP_BLOCK COME_FROM
while1stmt ::= _while1test l_stmts_opt JUMP_BACK
COME_FROM POP_TOP POP_BLOCK COME_FROM

View File

@@ -13,7 +13,6 @@ class Python26Parser(Python2Parser):
super(Python26Parser, self).__init__(debug_parser)
self.customized = {}
def p_try_except26(self, args):
"""
except_stmt ::= except_cond3 except_suite
@@ -29,9 +28,9 @@ class Python26Parser(Python2Parser):
POP_TOP END_FINALLY
try_middle ::= jmp_abs COME_FROM except_stmts
come_from_pop END_FINALLY
POP_TOP END_FINALLY
trystmt ::= SETUP_EXCEPT suite_stmts_opt come_from_pop
trystmt ::= SETUP_EXCEPT suite_stmts_opt POP_TOP
try_middle
# Sometimes we don't put in COME_FROM to the next statement
@@ -48,11 +47,17 @@ class Python26Parser(Python2Parser):
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM POP_TOP
except_suite ::= c_stmts_opt JUMP_FORWARD come_from_pop
except_suite ::= c_stmts_opt JUMP_FORWARD POP_TOP
except_suite ::= c_stmts_opt jmp_abs come_from_pop
# Python 3 also has this.
come_froms ::= come_froms COME_FROM
come_froms ::= COME_FROM
# This is what happens after a jump where
# we start a new block. For reasons I don't fully
# understand, there is also a value on the top of the stack
come_from_pop ::= COME_FROM POP_TOP
come_froms_pop ::= come_froms POP_TOP
"""
@@ -70,14 +75,15 @@ class Python26Parser(Python2Parser):
jmp_true ::= JUMP_IF_TRUE POP_TOP
jmp_false ::= JUMP_IF_FALSE POP_TOP
jf_pop ::= JUMP_FORWARD come_from_pop
jf_pop ::= JUMP_ABSOLUTE come_from_pop
jb_pop ::= JUMP_BACK come_from_pop
jf_pop ::= JUMP_FORWARD POP_TOP
jf_pop ::= JUMP_ABSOLUTE POP_TOP
jb_pop ::= JUMP_BACK POP_TOP
jb_cont ::= JUMP_BACK
jb_cont ::= CONTINUE
jb_cf_pop ::= JUMP_BACK come_froms POP_TOP
jb_cf_pop ::= JUMP_BACK POP_TOP
ja_cf_pop ::= JUMP_ABSOLUTE come_froms POP_TOP
jf_cf_pop ::= JUMP_FORWARD come_froms POP_TOP
@@ -85,13 +91,12 @@ class Python26Parser(Python2Parser):
jb_bp_come_from ::= JUMP_BACK bp_come_from
_ifstmts_jump ::= c_stmts_opt jf_pop COME_FROM
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM come_from_pop
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM POP_TOP
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD come_froms POP_TOP COME_FROM
# This is what happens after a jump where
# we start a new block. For reasons I don't fully
# understand, there is also a value on the top of the stack
come_from_pop ::= COME_FROM POP_TOP
come_froms_pop ::= come_froms POP_TOP
"""
@@ -145,9 +150,9 @@ class Python26Parser(Python2Parser):
whileelsestmt ::= SETUP_LOOP testexpr l_stmts_opt jb_pop POP_BLOCK
else_suite COME_FROM
return_stmt ::= ret_expr RETURN_END_IF come_from_pop
return_stmt ::= ret_expr RETURN_VALUE come_from_pop
return_if_stmt ::= ret_expr RETURN_END_IF come_from_pop
return_stmt ::= ret_expr RETURN_END_IF POP_TOP
return_stmt ::= ret_expr RETURN_VALUE POP_TOP
return_if_stmt ::= ret_expr RETURN_END_IF POP_TOP
iflaststmtl ::= testexpr c_stmts_opt JUMP_BACK come_from_pop
iflaststmt ::= testexpr c_stmts_opt JUMP_ABSOLUTE come_from_pop
@@ -183,10 +188,13 @@ class Python26Parser(Python2Parser):
comp_body ::= gen_comp_body
for_block ::= l_stmts_opt _come_from POP_TOP JUMP_BACK
# Make sure we keep indices the same as 2.7
setup_loop_lf ::= SETUP_LOOP LOAD_FAST
genexpr_func ::= setup_loop_lf FOR_ITER designator comp_iter jb_bp_come_from
genexpr_func ::= setup_loop_lf FOR_ITER designator comp_iter JUMP_BACK come_from_pop jb_bp_come_from
genexpr_func ::= setup_loop_lf FOR_ITER designator comp_iter JUMP_BACK come_from_pop
jb_bp_come_from
genexpr ::= LOAD_GENEXPR MAKE_FUNCTION_0 expr GET_ITER CALL_FUNCTION_1 COME_FROM
'''
@@ -208,7 +216,7 @@ class Python26Parser(Python2Parser):
def p_except26(self, args):
'''
except_suite ::= c_stmts_opt jmp_abs come_from_pop
except_suite ::= c_stmts_opt jmp_abs POP_TOP
'''
def p_misc26(self, args):
@@ -237,8 +245,8 @@ if __name__ == '__main__':
""".split()))
remain_tokens = set(tokens) - opcode_set
import re
remain_tokens = set([re.sub('_\d+$','', t) for t in remain_tokens])
remain_tokens = set([re.sub('_CONT$','', t) for t in remain_tokens])
remain_tokens = set([re.sub('_\d+$', '', t) for t in remain_tokens])
remain_tokens = set([re.sub('_CONT$', '', t) for t in remain_tokens])
remain_tokens = set(remain_tokens) - opcode_set
print(remain_tokens)
# print(sorted(p.rule2name.items()))

View File

@@ -31,6 +31,10 @@ class Python27Parser(Python2Parser):
def p_try27(self, args):
"""
tryfinallystmt ::= SETUP_FINALLY suite_stmts_opt
POP_BLOCK LOAD_CONST
COME_FROM_FINALLY suite_stmts_opt END_FINALLY
tryelsestmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suite COME_FROM
@@ -45,7 +49,10 @@ class Python27Parser(Python2Parser):
def p_jump27(self, args):
"""
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM
come_froms ::= come_froms COME_FROM
come_froms ::= COME_FROM
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD come_froms
bp_come_from ::= POP_BLOCK COME_FROM
# FIXME: Common with 3.0+
@@ -77,15 +84,13 @@ class Python27Parser(Python2Parser):
assert2 ::= assert_expr jmp_true LOAD_ASSERT expr RAISE_VARARGS_2
withstmt ::= expr SETUP_WITH POP_TOP suite_stmts_opt
POP_BLOCK LOAD_CONST COME_FROM
POP_BLOCK LOAD_CONST COME_FROM_WITH
WITH_CLEANUP END_FINALLY
withasstmt ::= expr SETUP_WITH designator suite_stmts_opt
POP_BLOCK LOAD_CONST COME_FROM
POP_BLOCK LOAD_CONST COME_FROM_WITH
WITH_CLEANUP END_FINALLY
while1stmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK POP_BLOCK COME_FROM
# Common with 2.6
while1stmt ::= SETUP_LOOP return_stmts bp_come_from
while1stmt ::= SETUP_LOOP return_stmts COME_FROM

View File

@@ -100,8 +100,7 @@ class Python3Parser(PythonParser):
del_stmt ::= expr DELETE_ATTR
kwarg ::= LOAD_CONST expr
kwargs ::= kwargs kwarg
kwargs ::=
kwargs ::= kwarg*
classdef ::= build_class designator
@@ -139,16 +138,23 @@ class Python3Parser(PythonParser):
iflaststmtl ::= testexpr c_stmts_opt JUMP_BACK
iflaststmtl ::= testexpr c_stmts_opt JUMP_BACK COME_FROM_LOOP
# These are used to keep AST indices the same
jf_else ::= JUMP_FORWARD ELSE
ja_else ::= JUMP_ABSOLUTE ELSE
# Note: in if/else kinds of statements, we err on the side
# of missing "else" clauses. Therefore we include grammar
# rules with and without ELSE.
ifelsestmt ::= testexpr c_stmts_opt JUMP_FORWARD else_suite COME_FROM
ifelsestmt ::= testexpr c_stmts_opt jf_else else_suite _come_from
ifelsestmtc ::= testexpr c_stmts_opt JUMP_ABSOLUTE else_suitec
ifelsestmtc ::= testexpr c_stmts_opt ja_else else_suitec
ifelsestmtr ::= testexpr return_if_stmts return_stmts
ifelsestmtl ::= testexpr c_stmts_opt JUMP_BACK else_suitel
ifelsestmtl ::= testexpr c_stmts_opt JUMP_BACK else_suitel JUMP_BACK COME_FROM_LOOP
ifelsestmtl ::= testexpr c_stmts_opt JUMP_BACK else_suitel COME_FROM_LOOP
# FIXME: this feels like a hack. Is it just 1 or two
# COME_FROMs? the parsed tree for this and even with just the
@@ -246,6 +252,24 @@ class Python3Parser(PythonParser):
c_stmts_opt34 ::= JUMP_BACK JUMP_ABSOLUTE c_stmts_opt
"""
def p_def_annotations3(self, args):
"""
# Annotated functions
stmt ::= funcdef_annotate
funcdef_annotate ::= mkfunc_annotate designator
# This has the annotation value.
# LOAD_NAME is used in an annotation type like
# int, float, str
annotate_arg ::= LOAD_NAME
# LOAD_CONST is used in an annotation string
annotate_arg ::= LOAD_CONST
# This stores the tuple of parameter names
# that have been annotated
annotate_tuple ::= LOAD_CONST
"""
def p_come_from3(self, args):
"""
opt_come_from_except ::= COME_FROM_EXCEPT
@@ -295,6 +319,7 @@ class Python3Parser(PythonParser):
"""
forstmt ::= SETUP_LOOP expr _for designator for_block POP_BLOCK
opt_come_from_loop
forelsestmt ::= SETUP_LOOP expr _for designator for_block POP_BLOCK else_suite
COME_FROM_LOOP
@@ -315,31 +340,27 @@ class Python3Parser(PythonParser):
whilestmt ::= SETUP_LOOP testexpr return_stmts POP_BLOCK
COME_FROM_LOOP
while1elsestmt ::= SETUP_LOOP l_stmts JUMP_BACK
else_suite
whileelsestmt ::= SETUP_LOOP testexpr l_stmts_opt JUMP_BACK POP_BLOCK
else_suite COME_FROM_LOOP
while1elsestmt ::= SETUP_LOOP l_stmts JUMP_BACK
else_suite COME_FROM_LOOP
whileelselaststmt ::= SETUP_LOOP testexpr l_stmts_opt JUMP_BACK POP_BLOCK
else_suitec COME_FROM_LOOP
whileTruestmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK POP_BLOCK
COME_FROM_LOOP
# FIXME: Python 3.? starts adding branch optimization? Put this starting there.
while1stmt ::= SETUP_LOOP l_stmts
# Python < 3.5 no POP BLOCK
whileTruestmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK
COME_FROM_LOOP
whileTruestmt ::= SETUP_LOOP return_stmts
COME_FROM_LOOP
while1stmt ::= SETUP_LOOP l_stmts COME_FROM_LOOP
# FIXME: investigate - can code really produce a NOP?
whileTruestmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK NOP
COME_FROM_LOOP
whileTruestmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK POP_BLOCK NOP
COME_FROM_LOOP
whileTruestmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK POP_BLOCK NOP
COME_FROM_LOOP
forstmt ::= SETUP_LOOP expr _for designator for_block POP_BLOCK NOP
COME_FROM_LOOP
"""
@@ -354,16 +375,17 @@ class Python3Parser(PythonParser):
'''
def p_expr3(self, args):
'''
"""
conditional ::= expr jmp_false expr jf_else expr COME_FROM
conditionalnot ::= expr jmp_true expr jf_else expr COME_FROM
expr ::= LOAD_CLASSNAME
# Python 3.4+
expr ::= LOAD_CLASSDEREF
binary_subscr2 ::= expr expr DUP_TOP_TWO BINARY_SUBSCR
# Python3 drops slice0..slice3
'''
"""
@staticmethod
def call_fn_name(token):
@@ -426,10 +448,10 @@ class Python3Parser(PythonParser):
args_kw = (token.attr >> 8) & 0xff
nak = ( len(opname)-len('CALL_FUNCTION') ) // 3
token.type = self.call_fn_name(token)
rule = ('call_function ::= expr '
+ ('pos_arg ' * args_pos)
+ ('kwarg ' * args_kw)
+ 'expr ' * nak + token.type)
rule = ('call_function ::= expr ' +
('pos_arg ' * args_pos) +
('kwarg ' * args_kw) +
'expr ' * nak + token.type)
self.add_unique_rule(rule, token.type, args_pos, customize)
rule = ('classdefdeco2 ::= LOAD_BUILD_CLASS mkfunc %s%s_%d'
% (('expr ' * (args_pos-1)), opname, args_pos))
@@ -568,9 +590,10 @@ class Python3Parser(PythonParser):
self.add_unique_rule(rule, 'kvlist_n', 1, customize)
rule = "mapexpr ::= BUILD_MAP_n kvlist_n"
elif self.version >= 3.5:
rule = kvlist_n + ' ::= ' + 'expr ' * (token.attr*2)
self.add_unique_rule(rule, opname, token.attr, customize)
rule = "mapexpr ::= %s %s" % (kvlist_n, opname)
if opname != 'BUILD_MAP_WITH_CALL':
rule = kvlist_n + ' ::= ' + 'expr ' * (token.attr*2)
self.add_unique_rule(rule, opname, token.attr, customize)
rule = "mapexpr ::= %s %s" % (kvlist_n, opname)
else:
rule = kvlist_n + ' ::= ' + 'expr expr STORE_MAP ' * token.attr
self.add_unique_rule(rule, opname, token.attr, customize)
@@ -621,10 +644,10 @@ class Python3Parser(PythonParser):
# number of apply equiv arguments:
nak = ( len(opname_base)-len('CALL_METHOD') ) // 3
rule = ('call_function ::= expr '
+ ('pos_arg ' * args_pos)
+ ('kwarg ' * args_kw)
+ 'expr ' * nak + opname)
rule = ('call_function ::= expr ' +
('pos_arg ' * args_pos) +
('kwarg ' * args_kw) +
'expr ' * nak + opname)
self.add_unique_rule(rule, opname, token.attr, customize)
elif opname.startswith('MAKE_CLOSURE'):
# DRY with MAKE_FUNCTION
@@ -668,42 +691,45 @@ class Python3Parser(PythonParser):
rule = ('mkfunc ::= %sload_closure LOAD_CONST %s'
% ('expr ' * args_pos, opname))
self.add_unique_rule(rule, opname, token.attr, customize)
pass
self.check_reduce['augassign1'] = 'AST'
self.check_reduce['augassign2'] = 'AST'
self.check_reduce['while1stmt'] = 'noAST'
return
def reduce_is_invalid(self, rule, ast, tokens, first, last):
lhs = rule[0]
if lhs in ('augassign1', 'augassign2') and ast[0][0] == 'and':
return True
elif lhs == 'while1stmt':
if tokens[last] in ('COME_FROM_LOOP', 'JUMP_BACK'):
# jump_back should be right afer SETUP_LOOP. Test?
last += 1
while last < len(tokens) and isinstance(tokens[last].offset, str):
last += 1
if last < len(tokens):
offset = tokens[last].offset
assert tokens[first] == 'SETUP_LOOP'
if offset != tokens[first].attr:
return True
return False
return False
class Python33Parser(Python3Parser):
def p_33(self, args):
class Python30Parser(Python3Parser):
def p_30(self, args):
"""
# Store locals is only in Python 3.0 to 3.3
stmt ::= store_locals
store_locals ::= LOAD_FAST STORE_LOCALS
# Python 3.3 adds yield from.
expr ::= yield_from
yield_from ::= expr expr YIELD_FROM
jmp_true ::= JUMP_IF_TRUE_OR_POP POP_TOP
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD POP_TOP COME_FROM
"""
class Python32Parser(Python3Parser):
def p_32on(self, args):
"""
# In Python 3.2+, DUP_TOPX is DUP_TOP_TWO
binary_subscr2 ::= expr expr DUP_TOP_TWO BINARY_SUBSCR
stmt ::= store_locals
store_locals ::= LOAD_FAST STORE_LOCALS
"""
pass
class Python3ParserSingle(Python3Parser, PythonParserSingle):
pass
class Python32ParserSingle(Python32Parser, PythonParserSingle):
pass
class Python33ParserSingle(Python33Parser, PythonParserSingle):
pass
def info(args):
# Check grammar
p = Python3Parser()
@@ -713,9 +739,13 @@ def info(args):
from uncompyle6.parser.parse35 import Python35Parser
p = Python35Parser()
elif arg == '3.3':
from uncompyle6.parser.parse33 import Python33Parser
p = Python33Parser()
elif arg == '3.2':
from uncompyle6.parser.parse32 import Python32Parser
p = Python32Parser()
elif arg == '3.0':
p = Python30Parser()
p.checkGrammar()
if len(sys.argv) > 1 and sys.argv[1] == 'dump':
print('-' * 50)

View File

@@ -0,0 +1,43 @@
# Copyright (c) 2016 Rocky Bernstein
"""
spark grammar differences over Python 3.1 for Python 3.0.
"""
from __future__ import print_function
from uncompyle6.parser import PythonParserSingle
from uncompyle6.parsers.parse3 import Python3Parser
class Python30Parser(Python3Parser):
def p_30(self, args):
"""
# Store locals is only in Python 3.0 to 3.3
stmt ::= store_locals
store_locals ::= LOAD_FAST STORE_LOCALS
# In many ways Python 3.0 code generation is more like Python 2.6 than
# it is 2.7 or 3.1. So we have a number of 2.6ish (and before) rules below
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD come_froms POP_TOP COME_FROM
jmp_true ::= JUMP_IF_TRUE POP_TOP
jmp_false ::= JUMP_IF_FALSE POP_TOP
# Used to keep index order the same in semantic actions
jb_pop_top ::= JUMP_BACK POP_TOP
while1stmt ::= SETUP_LOOP l_stmts COME_FROM_LOOP
else_suitel ::= l_stmts COME_FROM_LOOP JUMP_BACK
ifelsestmtl ::= testexpr c_stmts_opt jb_pop_top else_suitel
withasstmt ::= expr setupwithas designator suite_stmts_opt
POP_BLOCK LOAD_CONST COME_FROM_FINALLY
LOAD_FAST DELETE_FAST WITH_CLEANUP END_FINALLY
setupwithas ::= DUP_TOP LOAD_ATTR STORE_FAST LOAD_ATTR CALL_FUNCTION_0 setup_finally
setup_finally ::= STORE_FAST SETUP_FINALLY LOAD_FAST DELETE_FAST
"""
class Python30ParserSingle(Python30Parser, PythonParserSingle):
pass

View File

@@ -5,7 +5,7 @@ spark grammar differences over Python 3.2 for Python 3.1.
from __future__ import print_function
from uncompyle6.parser import PythonParserSingle
from uncompyle6.parsers.parse3 import Python32Parser
from uncompyle6.parsers.parse32 import Python32Parser
class Python31Parser(Python32Parser):
@@ -32,9 +32,6 @@ class Python31Parser(Python32Parser):
store ::= STORE_NAME
load ::= LOAD_FAST
load ::= LOAD_NAME
stmt ::= funcdeftest
funcdeftest ::= mkfunctest designator
"""
def add_custom_rules(self, tokens, customize):
@@ -46,9 +43,9 @@ class Python31Parser(Python32Parser):
# Check that there are 2 annotated params?
# rule = ('mkfunc2 ::= %s%sEXTENDED_ARG %s' %
# ('pos_arg ' * (args_pos), 'kwargs ' * (annotate_args-1), opname))
rule = ('mkfunctest ::= %s%sLOAD_CONST EXTENDED_ARG %s' %
(('pos_arg ' * (args_pos)), 'kwargs ', opname))
rule = ('mkfunc_annotate ::= %s%sannotate_tuple LOAD_CONST EXTENDED_ARG %s' %
(('pos_arg ' * (args_pos)),
('annotate_arg ' * (annotate_args-1)), opname))
self.add_unique_rule(rule, opname, token.attr, customize)
class Python31ParserSingle(Python31Parser, PythonParserSingle):
pass

View File

@@ -0,0 +1,47 @@
# Copyright (c) 2016 Rocky Bernstein
"""
spark grammar differences over Python 3 for Python 3.2.
"""
from __future__ import print_function
from uncompyle6.parser import PythonParserSingle
from uncompyle6.parsers.parse3 import Python3Parser
class Python32Parser(Python3Parser):
def p_32to35(self, args):
"""
# Store locals is only in Python 3.0 to 3.3
stmt ::= store_locals
store_locals ::= LOAD_FAST STORE_LOCALS
# Python < 3.5 no POP BLOCK
whileTruestmt ::= SETUP_LOOP l_stmts_opt JUMP_BACK
COME_FROM_LOOP
whileTruestmt ::= SETUP_LOOP return_stmts
COME_FROM_LOOP
"""
pass
def p_32on(self, args):
"""
# In Python 3.2+, DUP_TOPX is DUP_TOP_TWO
binary_subscr2 ::= expr expr DUP_TOP_TWO BINARY_SUBSCR
"""
pass
def add_custom_rules(self, tokens, customize):
super(Python32Parser, self).add_custom_rules(tokens, customize)
for i, token in enumerate(tokens):
opname = token.type
if opname.startswith('MAKE_FUNCTION_A'):
args_pos, args_kw, annotate_args = token.attr
# Check that there are 2 annotated params?
rule = (('mkfunc_annotate ::= %s%sannotate_tuple '
'LOAD_CONST LOAD_CONST EXTENDED_ARG %s') %
(('pos_arg ' * (args_pos)),
('annotate_arg ' * (annotate_args-1)), opname))
self.add_unique_rule(rule, opname, token.attr, customize)
class Python32ParserSingle(Python32Parser, PythonParserSingle):
pass

View File

@@ -0,0 +1,32 @@
# Copyright (c) 2016 Rocky Bernstein
"""
spark grammar differences over Python 3.2 for Python 3.3.
"""
from __future__ import print_function
from uncompyle6.parser import PythonParserSingle
from uncompyle6.parsers.parse32 import Python32Parser
class Python33Parser(Python32Parser):
def p_33on(self, args):
"""
# Python 3.3+ adds yield from.
expr ::= yield_from
yield_from ::= expr expr YIELD_FROM
# We do the grammar hackery below for semantics
# actions that want c_stmts_opt at index 1
iflaststmt ::= testexpr c_stmts_opt33
c_stmts_opt33 ::= JUMP_BACK JUMP_ABSOLUTE c_stmts_opt
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD _come_from
# Python 3.3+ has more loop optimization that removes
# JUMP_FORWARD in some cases, and hence we also don't
# see COME_FROM
_ifstmts_jump ::= c_stmts_opt
"""
class Python33ParserSingle(Python33Parser, PythonParserSingle):
pass

View File

@@ -1,13 +1,13 @@
# Copyright (c) 2016 Rocky Bernstein
"""
spark grammar differences over Python 3.2 for Python 3.4
spark grammar differences over Python 3.3 for Python 3.4
"""
from uncompyle6.parser import PythonParserSingle
from spark_parser import DEFAULT_DEBUG as PARSER_DEFAULT_DEBUG
from uncompyle6.parsers.parse3 import Python3Parser
from uncompyle6.parsers.parse33 import Python33Parser
class Python34Parser(Python3Parser):
class Python34Parser(Python33Parser):
def __init__(self, debug_parser=PARSER_DEFAULT_DEBUG):
super(Python34Parser, self).__init__(debug_parser)
@@ -15,32 +15,10 @@ class Python34Parser(Python3Parser):
def p_misc34(self, args):
"""
# Python 3.5+ optimizes the trailing two JUMPS away
for_block ::= l_stmts
iflaststmtl ::= testexpr c_stmts_opt
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD _come_from
# We do the grammar hackery below for semantics
# actions that want c_stmts_opt at index 1
iflaststmt ::= testexpr c_stmts_opt34
c_stmts_opt34 ::= JUMP_BACK JUMP_ABSOLUTE c_stmts_opt
# Python 3.3 added "yield from." Do it the same way as in
# 3.3
expr ::= yield_from
yield_from ::= expr expr YIELD_FROM
# Python 3.4+ optimizes the trailing two JUMPS away
# Is this 3.4 only?
yield_from ::= expr GET_ITER LOAD_CONST YIELD_FROM
# Python 3.4+ has more loop optimization that removes
# JUMP_FORWARD in some cases, and hence we also don't
# see COME_FROM
_ifstmts_jump ::= c_stmts_opt
"""
class Python34ParserSingle(Python34Parser, PythonParserSingle):
pass
@@ -62,8 +40,8 @@ if __name__ == '__main__':
""".split()))
remain_tokens = set(tokens) - opcode_set
import re
remain_tokens = set([re.sub('_\d+$','', t) for t in remain_tokens])
remain_tokens = set([re.sub('_CONT$','', t) for t in remain_tokens])
remain_tokens = set([re.sub('_\d+$', '', t) for t in remain_tokens])
remain_tokens = set([re.sub('_CONT$', '', t) for t in remain_tokens])
remain_tokens = set(remain_tokens) - opcode_set
print(remain_tokens)
# print(sorted(p.rule2name.items()))

View File

@@ -1,14 +1,14 @@
# Copyright (c) 2016 Rocky Bernstein
"""
spark grammar differences over Python 3.2 for Python 3.5.
spark grammar differences over Python 3.4 for Python 3.5.
"""
from __future__ import print_function
from uncompyle6.parser import PythonParserSingle
from spark_parser import DEFAULT_DEBUG as PARSER_DEFAULT_DEBUG
from uncompyle6.parsers.parse3 import Python32Parser
from uncompyle6.parsers.parse34 import Python34Parser
class Python35Parser(Python32Parser):
class Python35Parser(Python34Parser):
def __init__(self, debug_parser=PARSER_DEFAULT_DEBUG):
super(Python35Parser, self).__init__(debug_parser)
@@ -45,14 +45,28 @@ class Python35Parser(Python32Parser):
# Python 3.3+ also has yield from. 3.5 does it
# differently than 3.3, 3.4
expr ::= yield_from
yield_from ::= expr GET_YIELD_FROM_ITER LOAD_CONST YIELD_FROM
# Python 3.4+ has more loop optimization that removes
# JUMP_FORWARD in some cases, and hence we also don't
# see COME_FROM
_ifstmts_jump ::= c_stmts_opt
"""
def add_custom_rules(self, tokens, customize):
super(Python35Parser, self).add_custom_rules(tokens, customize)
for i, token in enumerate(tokens):
opname = token.type
if opname == 'BUILD_MAP_UNPACK_WITH_CALL':
nargs = token.attr % 256
map_unpack_n = "map_unpack_%s" % nargs
rule = map_unpack_n + ' ::= ' + 'expr ' * (nargs)
self.add_unique_rule(rule, opname, token.attr, customize)
rule = "unmapexpr ::= %s %s" % (map_unpack_n, opname)
self.add_unique_rule(rule, opname, token.attr, customize)
call_token = tokens[i+1]
if self.version == 3.5:
rule = 'call_function ::= expr unmapexpr ' + call_token.type
self.add_unique_rule(rule, opname, token.attr, customize)
pass
pass
return
class Python35ParserSingle(Python35Parser, PythonParserSingle):
pass
@@ -72,8 +86,8 @@ if __name__ == '__main__':
""".split()))
remain_tokens = set(tokens) - opcode_set
import re
remain_tokens = set([re.sub('_\d+$','', t) for t in remain_tokens])
remain_tokens = set([re.sub('_CONT$','', t) for t in remain_tokens])
remain_tokens = set([re.sub('_\d+$', '', t) for t in remain_tokens])
remain_tokens = set([re.sub('_CONT$', '', t) for t in remain_tokens])
remain_tokens = set(remain_tokens) - opcode_set
print(remain_tokens)
# print(sorted(p.rule2name.items()))

View File

@@ -17,8 +17,10 @@ class Python36Parser(Python35Parser):
def p_36misc(self, args):
"""
fstring_multi ::= fstring_expr_or_strs BUILD_STRING
fstring_expr_or_strs ::= fstring_expr_or_strs fstring_expr_or_str
fstring_expr_or_strs ::= fstring_expr_or_str
fstring_expr_or_strs ::= fstring_expr_or_str+
func_args36 ::= expr BUILD_TUPLE_0
call_function ::= func_args36 unmapexpr CALL_FUNCTION_EX
"""
def add_custom_rules(self, tokens, customize):
@@ -47,6 +49,7 @@ class Python36Parser(Python35Parser):
""" % (fstring_expr_or_str_n, fstring_expr_or_str_n, "fstring_expr_or_str " * v)
self.add_unique_doc_rules(rules_str, customize)
class Python36ParserSingle(Python36Parser, PythonParserSingle):
pass

View File

@@ -27,7 +27,8 @@ if PYTHON3:
intern = sys.intern
L65536 = 65536
def long(l): l
def long(l):
return l
else:
L65536 = long(65536) # NOQA
@@ -227,7 +228,7 @@ class Scanner(object):
if op < self.opc.HAVE_ARGUMENT:
return 1
else:
return 3
return 2 if self.version >= 3.6 else 3
def remove_mid_line_ifs(self, ifs):
"""

View File

@@ -25,6 +25,7 @@ from __future__ import print_function
from collections import namedtuple
from array import array
from uncompyle6.scanner import op_has_argument
from xdis.code import iscode
import uncompyle6.scanner as scan
@@ -92,11 +93,6 @@ class Scanner2(scan.Scanner):
for instr in bytecode.get_instructions(co):
print(instr._disassemble())
# from xdis.bytecode import Bytecode
# bytecode = Bytecode(co, self.opc)
# for instr in bytecode.get_instructions(co):
# print(instr._disassemble())
# Container for tokens
tokens = []
@@ -137,7 +133,7 @@ class Scanner2(scan.Scanner):
if names[self.get_argument(i+3)] == 'AssertionError':
self.load_asserts.add(i+3)
jump_targets = self.find_jump_targets()
jump_targets = self.find_jump_targets(show_asm)
# contains (code, [addrRefToCode])
last_stmt = self.next_stmt[0]
@@ -164,18 +160,29 @@ class Scanner2(scan.Scanner):
# we sort them). That way, specific COME_FROM tags will match up
# properly. For example, a "loop" with an "if" nested in it should have the
# "loop" tag last so the grammar rule matches that properly.
# last_offset = -1
for jump_offset in sorted(jump_targets[offset], reverse=True):
# if jump_offset == last_offset:
# continue
# last_offset = jump_offset
come_from_name = 'COME_FROM'
op_name = self.opc.opname[self.code[jump_offset]]
if op_name.startswith('SETUP_') and self.version == 2.7:
come_from_type = op_name[len('SETUP_'):]
if come_from_type not in ('LOOP', 'EXCEPT'):
come_from_name = 'COME_FROM_%s' % come_from_type
pass
tokens.append(Token(
'COME_FROM', None, repr(jump_offset),
come_from_name, None, repr(jump_offset),
offset="%s_%d" % (offset, jump_idx),
has_arg = True))
jump_idx += 1
op = self.code[offset]
opname = self.opc.opname[op]
op_name = self.opc.opname[op]
oparg = None; pattr = None
has_arg = (op >= self.opc.HAVE_ARGUMENT)
has_arg = op_has_argument(op, self.opc)
if has_arg:
oparg = self.get_argument(offset) + extended_arg
extended_arg = 0
@@ -187,14 +194,14 @@ class Scanner2(scan.Scanner):
if iscode(const):
oparg = const
if const.co_name == '<lambda>':
assert opname == 'LOAD_CONST'
opname = 'LOAD_LAMBDA'
assert op_name == 'LOAD_CONST'
op_name = 'LOAD_LAMBDA'
elif const.co_name == '<genexpr>':
opname = 'LOAD_GENEXPR'
op_name = 'LOAD_GENEXPR'
elif const.co_name == '<dictcomp>':
opname = 'LOAD_DICTCOMP'
op_name = 'LOAD_DICTCOMP'
elif const.co_name == '<setcomp>':
opname = 'LOAD_SETCOMP'
op_name = 'LOAD_SETCOMP'
# verify() uses 'pattr' for comparison, since 'attr'
# now holds Code(const) and thus can not be used
# for comparison (todo: think about changing this)
@@ -230,20 +237,20 @@ class Scanner2(scan.Scanner):
self.code[self.prev[offset]] == self.opc.LOAD_CLOSURE:
continue
else:
if self.is_pypy and not oparg and opname == 'BUILD_MAP':
opname = 'BUILD_MAP_n'
if self.is_pypy and not oparg and op_name == 'BUILD_MAP':
op_name = 'BUILD_MAP_n'
else:
opname = '%s_%d' % (opname, oparg)
op_name = '%s_%d' % (op_name, oparg)
if op != self.opc.BUILD_SLICE:
customize[opname] = oparg
elif self.is_pypy and opname in ('LOOKUP_METHOD',
customize[op_name] = oparg
elif self.is_pypy and op_name in ('LOOKUP_METHOD',
'JUMP_IF_NOT_DEBUG',
'SETUP_EXCEPT',
'SETUP_FINALLY'):
# The value in the dict is in special cases in semantic actions, such
# as CALL_FUNCTION. The value is not used in these cases, so we put
# in arbitrary value 0.
customize[opname] = 0
customize[op_name] = 0
elif op == self.opc.JUMP_ABSOLUTE:
# Further classify JUMP_ABSOLUTE into backward jumps
# which are used in loops, and "CONTINUE" jumps which
@@ -262,16 +269,16 @@ class Scanner2(scan.Scanner):
and self.code[offset+3] not in (self.opc.END_FINALLY,
self.opc.POP_BLOCK)
and offset not in self.not_continue):
opname = 'CONTINUE'
op_name = 'CONTINUE'
else:
opname = 'JUMP_BACK'
op_name = 'JUMP_BACK'
elif op == self.opc.LOAD_GLOBAL:
if offset in self.load_asserts:
opname = 'LOAD_ASSERT'
op_name = 'LOAD_ASSERT'
elif op == self.opc.RETURN_VALUE:
if offset in self.return_end_ifs:
opname = 'RETURN_END_IF'
op_name = 'RETURN_END_IF'
if offset in self.linestartoffsets:
linestart = self.linestartoffsets[offset]
@@ -280,7 +287,7 @@ class Scanner2(scan.Scanner):
if offset not in replace:
tokens.append(Token(
opname, oparg, pattr, offset, linestart, op,
op_name, oparg, pattr, offset, linestart, op,
has_arg, self.opc))
else:
tokens.append(Token(
@@ -291,7 +298,7 @@ class Scanner2(scan.Scanner):
if show_asm in ('both', 'after'):
for t in tokens:
print(t)
print(t.format(line_prefix='L.'))
print()
return tokens, customize
@@ -352,7 +359,7 @@ class Scanner2(scan.Scanner):
j+=1
return
def build_stmt_indices(self):
def build_statement_indices(self):
code = self.code
start = 0
end = len(code)
@@ -429,10 +436,10 @@ class Scanner2(scan.Scanner):
slist += [end] * (end-len(slist))
def next_except_jump(self, start):
'''
"""
Return the next jump that was generated by an except SomeException:
construct in a try...except...else clause or None if not found.
'''
"""
if self.code[start] == self.opc.DUP_TOP:
except_match = self.first_instr(start, len(self.code), self.opc.PJIF)
@@ -443,10 +450,16 @@ class Scanner2(scan.Scanner):
if self.version < 2.7 and self.code[jmp] in self.jump_forward:
self.not_continue.add(jmp)
jmp = self.get_target(jmp)
prev_offset = self.prev[except_match]
# COMPARE_OP argument should be "exception match" or 10
if (self.code[prev_offset] == self.opc.COMPARE_OP and
self.code[prev_offset+1] != 10):
return None
if jmp not in self.pop_jump_if | self.jump_forward:
self.ignore_if.add(except_match)
return None
self.ignore_if.add(except_match)
self.not_continue.add(jmp)
return jmp
@@ -466,11 +479,11 @@ class Scanner2(scan.Scanner):
elif op in self.setup_ops:
count_SETUP_ += 1
def detect_structure(self, pos, op):
'''
def detect_structure(self, offset, op):
"""
Detect type of block structures and their boundaries to fix optimized jumps
in python2.3+
'''
"""
# TODO: check the struct boundaries more precisely -Dan
@@ -483,7 +496,7 @@ class Scanner2(scan.Scanner):
for struct in self.structs:
_start = struct['start']
_end = struct['end']
if (_start <= pos < _end) and (_start >= start and _end <= end):
if (_start <= offset < _end) and (_start >= start and _end <= end):
start = _start
end = _end
parent = struct
@@ -495,14 +508,16 @@ class Scanner2(scan.Scanner):
# Try to find the jump_back instruction of the loop.
# It could be a return instruction.
start = pos+3
target = self.get_target(pos, op)
start = offset+3
target = self.get_target(offset, op)
end = self.restrict_to_parent(target, parent)
self.setup_loop_targets[offset] = target
self.setup_loops[target] = offset
if target != end:
self.fixed_jumps[pos] = end
self.fixed_jumps[offset] = end
(line_no, next_line_byte) = self.lines[pos]
(line_no, next_line_byte) = self.lines[offset]
jump_back = self.last_instr(start, end, self.opc.JUMP_ABSOLUTE,
next_line_byte, False)
@@ -566,10 +581,10 @@ class Scanner2(scan.Scanner):
if end > jump_back+4 and code[end] in self.jump_forward:
if code[jump_back+4] in self.jump_forward:
if self.get_target(jump_back+4) == self.get_target(end):
self.fixed_jumps[pos] = jump_back+4
self.fixed_jumps[offset] = jump_back+4
end = jump_back+4
elif target < pos:
self.fixed_jumps[pos] = jump_back+4
elif target < offset:
self.fixed_jumps[offset] = jump_back+4
end = jump_back+4
target = self.get_target(jump_back, self.opc.JUMP_ABSOLUTE)
@@ -585,7 +600,7 @@ class Scanner2(scan.Scanner):
else:
test = self.prev[next_line_byte]
if test == pos:
if test == offset:
loop_type = 'while 1'
elif self.code[test] in self.opc.hasjabs + self.opc.hasjrel:
self.ignore_if.add(test)
@@ -602,15 +617,15 @@ class Scanner2(scan.Scanner):
'start': jump_back+3,
'end': end})
elif op == self.opc.SETUP_EXCEPT:
start = pos+3
target = self.get_target(pos, op)
start = offset+3
target = self.get_target(offset, op)
end = self.restrict_to_parent(target, parent)
if target != end:
self.fixed_jumps[pos] = end
self.fixed_jumps[offset] = end
# print target, end, parent
# Add the try block
self.structs.append({'type': 'try',
'start': start,
'start': start-3,
'end': end-4})
# Now isolate the except and else blocks
end_else = start_else = self.get_target(self.prev[end])
@@ -654,15 +669,15 @@ class Scanner2(scan.Scanner):
self.fixed_jumps[i] = i+1
elif op in self.pop_jump_if:
target = self.get_target(pos, op)
target = self.get_target(offset, op)
rtarget = self.restrict_to_parent(target, parent)
# Do not let jump to go out of parent struct bounds
if target != rtarget and parent['type'] == 'and/or':
self.fixed_jumps[pos] = rtarget
self.fixed_jumps[offset] = rtarget
return
start = pos+3
start = offset+3
pre = self.prev
# Does this jump to right after another conditional jump that is
@@ -677,8 +692,8 @@ class Scanner2(scan.Scanner):
op_testset = self.pop_jump_if_or_pop | self.pop_jump_if
if ( code[pre[target]] in op_testset
and (target > pos) ):
self.fixed_jumps[pos] = pre[target]
and (target > offset) ):
self.fixed_jumps[offset] = pre[target]
self.structs.append({'type': 'and/or',
'start': start,
'end': pre[target]})
@@ -690,7 +705,7 @@ class Scanner2(scan.Scanner):
# Search for other POP_JUMP_IF_FALSE targetting the same op,
# in current statement, starting from current offset, and filter
# everything inside inner 'or' jumps and midline ifs
match = self.rem_or(start, self.next_stmt[pos], self.opc.PJIF, target)
match = self.rem_or(start, self.next_stmt[offset], self.opc.PJIF, target)
# If we still have any offsets in set, start working on it
if match:
@@ -698,13 +713,13 @@ class Scanner2(scan.Scanner):
and pre[rtarget] not in self.stmts \
and self.restrict_to_parent(self.get_target(pre[rtarget]), parent) == rtarget:
if code[pre[pre[rtarget]]] == self.opc.JUMP_ABSOLUTE \
and self.remove_mid_line_ifs([pos]) \
and self.remove_mid_line_ifs([offset]) \
and target == self.get_target(pre[pre[rtarget]]) \
and (pre[pre[rtarget]] not in self.stmts or self.get_target(pre[pre[rtarget]]) > pre[pre[rtarget]])\
and 1 == len(self.remove_mid_line_ifs(self.rem_or(start, pre[pre[rtarget]], self.pop_jump_if, target))):
pass
elif code[pre[pre[rtarget]]] == self.opc.RETURN_VALUE \
and self.remove_mid_line_ifs([pos]) \
and self.remove_mid_line_ifs([offset]) \
and 1 == (len(set(self.remove_mid_line_ifs(self.rem_or(start,
pre[pre[rtarget]],
self.pop_jump_if, target)))
@@ -713,7 +728,7 @@ class Scanner2(scan.Scanner):
pass
else:
fix = None
jump_ifs = self.all_instr(start, self.next_stmt[pos], self.opc.PJIF)
jump_ifs = self.all_instr(start, self.next_stmt[offset], self.opc.PJIF)
last_jump_good = True
for j in jump_ifs:
if target == self.get_target(j):
@@ -722,79 +737,96 @@ class Scanner2(scan.Scanner):
break
else:
last_jump_good = False
self.fixed_jumps[pos] = fix or match[-1]
self.fixed_jumps[offset] = fix or match[-1]
return
else:
if (self.version < 2.7
and parent['type'] in ('root', 'for-loop', 'if-then',
'if-else', 'try')):
self.fixed_jumps[pos] = rtarget
self.fixed_jumps[offset] = rtarget
else:
# note test for < 2.7 might be superflous although informative
# for 2.7 a different branch is taken and the below code is handled
# under: elif op in self.pop_jump_if_or_pop
# below
self.fixed_jumps[pos] = match[-1]
self.fixed_jumps[offset] = match[-1]
return
else: # op != self.opc.PJIT
if self.version < 2.7 and code[pos+3] == self.opc.POP_TOP:
assert_pos = pos + 4
if self.version < 2.7 and code[offset+3] == self.opc.POP_TOP:
assert_offset = offset + 4
else:
assert_pos = pos + 3
if (assert_pos) in self.load_asserts:
assert_offset = offset + 3
if (assert_offset) in self.load_asserts:
if code[pre[rtarget]] == self.opc.RAISE_VARARGS:
return
self.load_asserts.remove(assert_pos)
self.load_asserts.remove(assert_offset)
next = self.next_stmt[pos]
if pre[next] == pos:
next = self.next_stmt[offset]
if pre[next] == offset:
pass
elif code[next] in self.jump_forward and target == self.get_target(next):
if code[pre[next]] == self.opc.PJIF:
if code[next] == self.opc.JUMP_FORWARD or target != rtarget or code[pre[pre[rtarget]]] not in (self.opc.JUMP_ABSOLUTE, self.opc.RETURN_VALUE):
self.fixed_jumps[pos] = pre[next]
self.fixed_jumps[offset] = pre[next]
return
elif code[next] == self.opc.JUMP_ABSOLUTE and code[target] in self.jump_forward:
next_target = self.get_target(next)
if self.get_target(target) == next_target:
self.fixed_jumps[pos] = pre[next]
self.fixed_jumps[offset] = pre[next]
return
elif code[next_target] in self.jump_forward and self.get_target(next_target) == self.get_target(target):
self.fixed_jumps[pos] = pre[next]
self.fixed_jumps[offset] = pre[next]
return
# don't add a struct for a while test, it's already taken care of
if pos in self.ignore_if:
if offset in self.ignore_if:
return
if code[pre[rtarget]] == self.opc.JUMP_ABSOLUTE and pre[rtarget] in self.stmts \
and pre[rtarget] != pos and pre[pre[rtarget]] != pos:
if code[rtarget] == self.opc.JUMP_ABSOLUTE and code[rtarget+3] == self.opc.POP_BLOCK:
if code[pre[pre[rtarget]]] != self.opc.JUMP_ABSOLUTE:
pass
elif self.get_target(pre[pre[rtarget]]) != target:
pass
if self.version == 2.7:
if code[pre[rtarget]] == self.opc.JUMP_ABSOLUTE and pre[rtarget] in self.stmts \
and pre[rtarget] != offset and pre[pre[rtarget]] != offset:
if code[rtarget] == self.opc.JUMP_ABSOLUTE and code[rtarget+3] == self.opc.POP_BLOCK:
if code[pre[pre[rtarget]]] != self.opc.JUMP_ABSOLUTE:
pass
elif self.get_target(pre[pre[rtarget]]) != target:
pass
else:
rtarget = pre[rtarget]
else:
rtarget = pre[rtarget]
else:
rtarget = pre[rtarget]
# Does the "if" jump just beyond a jump op, then this is probably an if statement
pre_rtarget = pre[rtarget]
code_pre_rtarget = code[pre_rtarget]
if code_pre_rtarget in self.jump_forward:
if_end = self.get_target(pre_rtarget)
# Is this a loop and not an "if" statment?
if (if_end < pre_rtarget) and (code[pre[if_end]] == self.opc.SETUP_LOOP):
if(if_end > start):
if (if_end < pre_rtarget) and (pre[if_end] in self.setup_loop_targets):
if (if_end > start):
return
else:
# We still have the case in 2.7 that the next instruction
# is a jump to a SETUP_LOOP target.
next_offset = target + self.op_size(self.code[target])
next_op = self.code[next_offset]
if self.opc.opname[next_op] == 'JUMP_FORWARD':
jump_target = self.get_target(next_offset, next_op)
if jump_target in self.setup_loops:
self.structs.append({'type': 'while-loop',
'start': start - 3,
'end': jump_target})
self.fixed_jumps[start-3] = jump_target
return
end = self.restrict_to_parent(if_end, parent)
self.structs.append({'type': 'if-then',
'start': start,
'start': start-3,
'end': pre_rtarget})
self.not_continue.add(pre_rtarget)
if rtarget < end:
@@ -810,46 +842,57 @@ class Scanner2(scan.Scanner):
self.return_end_ifs.add(pre_rtarget)
elif op in self.pop_jump_if_or_pop:
target = self.get_target(pos, op)
self.fixed_jumps[pos] = self.restrict_to_parent(target, parent)
target = self.get_target(offset, op)
self.fixed_jumps[offset] = self.restrict_to_parent(target, parent)
def find_jump_targets(self):
'''
def find_jump_targets(self, debug):
"""
Detect all offsets in a byte code which are jump targets
where we might insert a COME_FROM instruction.
where we might insert a pseudo "COME_FROM" instruction.
"COME_FROM" instructions are used in detecting overall
control flow. The more detailed information about the
control flow is captured in self.structs.
Since this stuff is tricky, consult self.structs when
something goes amiss.
Return the list of offsets. An instruction can be jumped
to in from multiple instructions.
'''
n = len(self.code)
"""
code = self.code
n = len(code)
self.structs = [{'type': 'root',
'start': 0,
'end': n-1}]
self.loops = [] # All loop entry points
self.fixed_jumps = {} # Map fixed jumps to their real destination
# All loop entry points
self.loops = []
# Map fixed jumps to their real destination
self.fixed_jumps = {}
self.ignore_if = set()
self.build_stmt_indices()
self.build_statement_indices()
# Containers filled by detect_structure()
self.not_continue = set()
self.return_end_ifs = set()
self.setup_loop_targets = {} # target given setup_loop offset
self.setup_loops = {} # setup_loop offset given target
targets = {}
for offset in self.op_range(0, n):
op = self.code[offset]
op = code[offset]
# Determine structures and fix jumps in Python versions
# since 2.3
self.detect_structure(offset, op)
if op >= self.opc.HAVE_ARGUMENT:
if op_has_argument(op, self.opc):
label = self.fixed_jumps.get(offset)
oparg = self.get_argument(offset)
if label is None:
if (op in self.opc.hasjrel and
(self.version < 2.0 or op != self.opc.FOR_ITER)):
if op in self.opc.hasjrel and self.opc.opname[op] != 'FOR_ITER':
# if (op in self.opc.hasjrel and
# (self.version < 2.0 or op != self.opc.FOR_ITER)):
label = offset + 3 + oparg
elif self.version == 2.7 and op in self.opc.hasjabs:
if op in (self.opc.JUMP_IF_FALSE_OR_POP,
@@ -859,7 +902,6 @@ class Scanner2(scan.Scanner):
pass
pass
# FIXME: All the < 2.7 conditions are is horrible. We need a better way.
if label is not None and label != -1:
# In Python < 2.7, the POP_TOP in:
@@ -867,23 +909,38 @@ class Scanner2(scan.Scanner):
# does now start a new statement
# Otherwise, we have want to add a "COME_FROM"
if not (self.version < 2.7 and
self.code[label] == self.opc.POP_TOP and
self.code[self.prev[label]] == self.opc.RETURN_VALUE):
code[label] == self.opc.POP_TOP and
code[self.prev[label]] == self.opc.RETURN_VALUE):
# In Python < 2.7, don't add a COME_FROM, for:
# JUMP_FORWARD, END_FINALLY
# or:
# JUMP_FORWARD, POP_TOP, END_FINALLY
if not (self.version < 2.7 and op == self.opc.JUMP_FORWARD
and ((self.code[offset+3] == self.opc.END_FINALLY)
or (self.code[offset+3] == self.opc.POP_TOP
and self.code[offset+4] == self.opc.END_FINALLY))):
targets[label] = targets.get(label, []) + [offset]
and ((code[offset+3] == self.opc.END_FINALLY)
or (code[offset+3] == self.opc.POP_TOP
and code[offset+4] == self.opc.END_FINALLY))):
# FIXME: rocky: I think we need something like this...
if offset not in set(self.ignore_if) or self.version == 2.7:
source = (self.setup_loops[label]
if label in self.setup_loops else offset)
targets[label] = targets.get(label, []) + [source]
pass
pass
pass
elif op == self.opc.END_FINALLY and offset in self.fixed_jumps and self.version == 2.7:
label = self.fixed_jumps[offset]
targets[label] = targets.get(label, []) + [offset]
pass
pass
# DEBUG:
if debug in ('both', 'after'):
print(targets)
import pprint as pp
pp.pprint(self.structs)
return targets
# FIXME: combine with scanner3.py code and put into scanner.py

View File

@@ -25,5 +25,5 @@ class Scanner23(scan.Scanner24):
# These are the only differences in initialization between
# 2.3-2.6
self.version = 2.3
self.genexpr_name = '<generator expression>';
self.genexpr_name = '<generator expression>'
return

View File

@@ -25,5 +25,5 @@ class Scanner24(scan.Scanner25):
self.opc = opcode_24
self.opname = opcode_24.opname
self.version = 2.4
self.genexpr_name = '<generator expression>';
self.genexpr_name = '<generator expression>'
return

View File

@@ -130,7 +130,7 @@ class Scanner26(scan.Scanner2):
if names[self.get_argument(i+4)] == 'AssertionError':
self.load_asserts.add(i+4)
jump_targets = self.find_jump_targets()
jump_targets = self.find_jump_targets(show_asm)
# contains (code, [addrRefToCode])
last_stmt = self.next_stmt[0]
@@ -233,7 +233,7 @@ class Scanner26(scan.Scanner2):
if op != self.opc.BUILD_SLICE:
customize[op_name] = oparg
elif op == self.opc.JUMP_ABSOLUTE:
# Further classifhy JUMP_ABSOLUTE into backward jumps
# Further classify JUMP_ABSOLUTE into backward jumps
# which are used in loops, and "CONTINUE" jumps which
# may appear in a "continue" statement. The loop-type
# and continue-type jumps will help us classify loop
@@ -254,6 +254,9 @@ class Scanner26(scan.Scanner2):
# if x: continue
# the "continue" is not on a new line.
if tokens[-1].type == 'JUMP_BACK':
# We need 'intern' since we have
# already have processed the previous
# token.
tokens[-1].type = intern('CONTINUE')
elif op == self.opc.LOAD_GLOBAL:
@@ -279,9 +282,9 @@ class Scanner26(scan.Scanner2):
pass
pass
if show_asm:
if show_asm in ('both', 'after'):
for t in tokens:
print(t)
print(t.format(line_prefix='L.'))
print()
return tokens, customize

View File

@@ -91,18 +91,35 @@ class Scanner3(Scanner):
self.opc.JUMP_ABSOLUTE, self.opc.UNPACK_EX
])
self.jump_if_pop = frozenset([self.opc.JUMP_IF_FALSE_OR_POP,
self.opc.JUMP_IF_TRUE_OR_POP])
if self.version > 3.0:
self.jump_if_pop = frozenset([self.opc.JUMP_IF_FALSE_OR_POP,
self.opc.JUMP_IF_TRUE_OR_POP])
self.pop_jump_if_pop = frozenset([self.opc.JUMP_IF_FALSE_OR_POP,
self.opc.JUMP_IF_TRUE_OR_POP,
self.opc.POP_JUMP_IF_TRUE,
self.opc.POP_JUMP_IF_FALSE])
self.pop_jump_if_pop = frozenset([self.opc.JUMP_IF_FALSE_OR_POP,
self.opc.JUMP_IF_TRUE_OR_POP,
self.opc.POP_JUMP_IF_TRUE,
self.opc.POP_JUMP_IF_FALSE])
# Not really a set, but still clasification-like
self.statement_opcode_sequences = [
(self.opc.POP_JUMP_IF_FALSE, self.opc.JUMP_FORWARD),
(self.opc.POP_JUMP_IF_FALSE, self.opc.JUMP_ABSOLUTE),
(self.opc.POP_JUMP_IF_TRUE, self.opc.JUMP_FORWARD),
(self.opc.POP_JUMP_IF_TRUE, self.opc.JUMP_ABSOLUTE)]
else:
self.jump_if_pop = frozenset([])
self.pop_jump_if_pop = frozenset([])
# Not really a set, but still clasification-like
self.statement_opcode_sequences = [
(self.opc.JUMP_FORWARD,),
(self.opc.JUMP_ABSOLUTE,),
(self.opc.JUMP_FORWARD,),
(self.opc.JUMP_ABSOLUTE,)]
# Opcodes that take a variable number of arguments
# (expr's)
varargs_ops = set([
self.opc.BUILD_LIST, self.opc.BUILD_TUPLE,
self.opc.BUILD_LIST, self.opc.BUILD_TUPLE,
self.opc.BUILD_SET, self.opc.BUILD_SLICE,
self.opc.BUILD_MAP, self.opc.UNPACK_SEQUENCE,
self.opc.RAISE_VARARGS])
@@ -111,14 +128,6 @@ class Scanner3(Scanner):
varargs_ops.add(self.opc.CALL_METHOD)
self.varargs_ops = frozenset(varargs_ops)
# Not really a set, but still clasification-like
self.statement_opcode_sequences = [
(self.opc.POP_JUMP_IF_FALSE, self.opc.JUMP_FORWARD),
(self.opc.POP_JUMP_IF_FALSE, self.opc.JUMP_ABSOLUTE),
(self.opc.POP_JUMP_IF_TRUE, self.opc.JUMP_FORWARD),
(self.opc.POP_JUMP_IF_TRUE, self.opc.JUMP_ABSOLUTE)]
def opName(self, offset):
return self.opc.opname[self.code[offset]]
@@ -189,7 +198,7 @@ class Scanner3(Scanner):
# Get jump targets
# Format: {target offset: [jump offsets]}
jump_targets = self.find_jump_targets()
jump_targets = self.find_jump_targets(show_asm)
for inst in bytecode:
@@ -216,6 +225,14 @@ class Scanner3(Scanner):
jump_idx += 1
pass
pass
elif inst.offset in self.else_start:
end_offset = self.else_start[inst.offset]
tokens.append(Token('ELSE',
None, repr(end_offset),
offset='%s' % (inst.offset),
has_arg = True, opc=self.opc))
pass
pattr = inst.argrepr
opname = inst.opname
@@ -304,7 +321,9 @@ class Scanner3(Scanner):
if target <= inst.offset:
next_opname = self.opname[self.code[inst.offset+3]]
if (inst.offset in self.stmts and
next_opname not in ('END_FINALLY', 'POP_BLOCK')
next_opname not in ('END_FINALLY', 'POP_BLOCK',
# Python 3.0 only uses POP_TOP
'POP_TOP')
and inst.offset not in self.not_continue):
opname = 'CONTINUE'
else:
@@ -312,9 +331,10 @@ class Scanner3(Scanner):
# FIXME: this is a hack to catch stuff like:
# if x: continue
# the "continue" is not on a new line.
# There are other situations were we don't catch
# There are other situations where we don't catch
# CONTINUE as well.
if tokens[-1].type == 'JUMP_BACK':
if tokens[-1].type == 'JUMP_BACK' and tokens[-1].attr <= argval:
# intern is used because we are changing the *previous* token
tokens[-1].type = intern('CONTINUE')
elif op == self.opc.RETURN_VALUE:
@@ -389,14 +409,15 @@ class Scanner3(Scanner):
for _ in range(self.op_size(op)):
self.prev_op.append(offset)
def find_jump_targets(self):
def find_jump_targets(self, debug):
"""
Detect all offsets in a byte code which are jump targets.
Detect all offsets in a byte code which are jump targets
where we might insert a COME_FROM instruction.
Return the list of offsets.
This procedure is modelled after dis.findlabels(), but here
for each target the number of jumps is counted.
Return the list of offsets. An instruction can be jumped
to in from multiple instructions.
"""
code = self.code
n = len(code)
@@ -411,10 +432,13 @@ class Scanner3(Scanner):
self.fixed_jumps = {}
self.ignore_if = set()
self.build_statement_indices()
self.else_start = {}
# Containers filled by detect_structure()
self.not_continue = set()
self.return_end_ifs = set()
self.setup_loop_targets = {} # target given setup_loop offset
self.setup_loops = {} # setup_loop offset given target
targets = {}
for offset in self.op_range(0, n):
@@ -442,6 +466,13 @@ class Scanner3(Scanner):
elif op == self.opc.END_FINALLY and offset in self.fixed_jumps:
label = self.fixed_jumps[offset]
targets[label] = targets.get(label, []) + [offset]
pass
pass
# DEBUG:
if debug in ('both', 'after'):
import pprint as pp
pp.pprint(self.structs)
return targets
def build_statement_indices(self):
@@ -531,9 +562,15 @@ class Scanner3(Scanner):
Get target offset for op located at given <offset>.
"""
op = self.code[offset]
target = self.code[offset+1] + self.code[offset+2] * 256
if op in op3.hasjrel:
target += offset + 3
if self.version >= 3.6:
target = self.code[offset+1]
if op in op3.hasjrel:
target += offset + 2
else:
target = self.code[offset+1] + self.code[offset+2] * 256
if op in op3.hasjrel:
target += offset + 3
return target
def detect_structure(self, offset, targets):
@@ -572,6 +609,8 @@ class Scanner3(Scanner):
start = offset+3
target = self.get_target(offset)
end = self.restrict_to_parent(target, parent)
self.setup_loop_targets[offset] = target
self.setup_loops[target] = offset
if target != end:
self.fixed_jumps[offset] = end
@@ -734,15 +773,28 @@ class Scanner3(Scanner):
code[prev_op[prev_op[rtarget]]] != self.opc.JUMP_ABSOLUTE)):
rtarget = prev_op[rtarget]
# Does the "if" jump just beyond a jump op, then this can be
# a block inside an "if" statement
# Does the "jump if" jump beyond a jump op?
# That is, we have something like:
# POP_JUMP_IF_FALSE HERE
# ...
# JUMP_FORWARD
# HERE:
#
# If so, this can be block inside an "if" statement
# or a conditional assignment like:
# x = 1 if x else 2
#
# There are other contexts we may need to consider
# like whether the target is "END_FINALLY"
# or if the condition jump is to a forward location
if self.is_jump_forward(prev_op[rtarget]):
if_end = self.get_target(prev_op[rtarget])
rrtarget = prev_op[rtarget]
if_end = self.get_target(rrtarget)
# Is this a loop and not an "if" statement?
if ((if_end < prev_op[rtarget]) and
# If the jump target is back, we are looping
if (if_end < rrtarget and
(code[prev_op[if_end]] == self.opc.SETUP_LOOP)):
if(if_end > start):
if (if_end > start):
return
end = self.restrict_to_parent(if_end, parent)
@@ -752,10 +804,15 @@ class Scanner3(Scanner):
'end': prev_op[rtarget]})
self.not_continue.add(prev_op[rtarget])
if rtarget < end:
self.structs.append({'type': 'if-else',
if rtarget < end and (
code[rtarget] not in (self.opc.END_FINALLY,
self.opc.JUMP_ABSOLUTE) and
code[prev_op[rrtarget]] not in (self.opc.POP_EXCEPT,
self.opc.END_FINALLY)):
self.structs.append({'type': 'else',
'start': rtarget,
'end': end})
self.else_start[rtarget] = end
elif code[prev_op[rtarget]] == self.opc.RETURN_VALUE:
self.structs.append({'type': 'if-then',
'start': start,
@@ -845,7 +902,9 @@ class Scanner3(Scanner):
op = self.code[i]
if op == self.opc.END_FINALLY:
if count_END_FINALLY == count_SETUP_:
assert self.code[self.prev_op[i]] in (JUMP_ABSOLUTE, JUMP_FORWARD, RETURN_VALUE)
assert self.code[self.prev_op[i]] in (JUMP_ABSOLUTE,
JUMP_FORWARD,
RETURN_VALUE)
self.not_continue.add(self.prev_op[i])
return self.prev_op[i]
count_END_FINALLY += 1

View File

@@ -0,0 +1,34 @@
# Copyright (c) 2016 by Rocky Bernstein
"""
Python 3.0 bytecode scanner/deparser
This sets up opcodes Python's 3.0 and calls a generalized
scanner routine for Python 3.
"""
from __future__ import print_function
# bytecode verification, verify(), uses JUMP_OPs from here
from xdis.opcodes import opcode_30 as opc
JUMP_OPs = map(lambda op: opc.opname[op], opc.hasjrel + opc.hasjabs)
from uncompyle6.scanners.scanner3 import Scanner3
class Scanner30(Scanner3):
def __init__(self, show_asm=None, is_pypy=False):
Scanner3.__init__(self, 3.0, show_asm, is_pypy)
return
pass
if __name__ == "__main__":
from uncompyle6 import PYTHON_VERSION
if PYTHON_VERSION == 3.0:
import inspect
co = inspect.currentframe().f_code
tokens, customize = Scanner30().ingest(co)
for t in tokens:
print(t)
pass
else:
print("Need to be Python 3.0 to demo; I am %s." %
PYTHON_VERSION)

View File

@@ -29,7 +29,7 @@ class Token:
self.pattr = pattr
self.offset = offset
self.linestart = linestart
if has_arg == False:
if has_arg is False:
self.attr = None
self.pattr = None
self.opc = opc
@@ -53,7 +53,11 @@ class Token:
# ('%9s %-18s %r' % (self.offset, self.type, pattr)))
def __str__(self):
prefix = '\n%4d ' % self.linestart if self.linestart else (' ' * 6)
return self.format(line_prefix='')
def format(self, line_prefix=''):
prefix = ('\n%s%4d ' % (line_prefix, self.linestart)
if self.linestart else (' ' * (6 + len(line_prefix))))
offset_opname = '%6s %-17s' % (self.offset, self.type)
if not self.has_arg:
return "%s%s" % (prefix, offset_opname)

View File

@@ -0,0 +1,26 @@
"""
Python AST grammar checker.
Our rules sometimes give erroneous results. Until we have perfect rules,
This checker will catch mistakes in decompilation we've made.
FIXME idea: extend parsing system to do same kinds of checks or nonterminal
before reduction and don't reduce when there is a problem.
"""
def checker(ast, in_loop, errors):
in_loop = in_loop or ast.type in ('while1stmt', 'whileTruestmt',
'whilestmt', 'whileelsestmt',
'for_block')
if ast.type in ('augassign1', 'augassign2') and ast[0][0] == 'and':
text = str(ast)
error_text = '\n# improper augmented assigment (e.g. +=, *=, ...):\n#\t' + '\n# '.join(text.split("\n")) + '\n'
errors.append(error_text)
for node in ast:
if not in_loop and node.type in ('continue_stmt', 'break_stmt'):
text = str(node)
error_text = '\n# not in loop:\n#\t' + '\n# '.join(text.split("\n"))
errors.append(error_text)
if hasattr(node, '__repr1__'):
checker(node, in_loop, errors)

View File

@@ -60,6 +60,9 @@ from xdis.code import iscode
from uncompyle6.semantics import pysource
from uncompyle6 import parser
from uncompyle6.scanner import Token, Code, get_scanner
from uncompyle6.semantics.check_ast import checker
from uncompyle6.semantics.helper import print_docstring
from uncompyle6.show import (
maybe_show_asm,
maybe_show_ast,
@@ -67,7 +70,9 @@ from uncompyle6.show import (
)
from uncompyle6.semantics.pysource import AST, INDENT_PER_LEVEL, NONE, PRECEDENCE, \
ParserError, TABLE_DIRECT, escape, find_all_globals, find_globals, find_none, minint, MAP
ParserError, TABLE_DIRECT, escape, find_globals, minint, MAP
from uncompyle6.semantics.make_function import find_all_globals, find_none
if PYTHON3:
from itertools import zip_longest
@@ -77,8 +82,7 @@ else:
from StringIO import StringIO
from spark_parser import GenericASTTraversalPruningException, \
DEFAULT_DEBUG as PARSER_DEFAULT_DEBUG
from spark_parser import DEFAULT_DEBUG as PARSER_DEFAULT_DEBUG
from collections import namedtuple
NodeInfo = namedtuple("NodeInfo", "node start finish")
@@ -579,7 +583,7 @@ class FragmentsWalker(pysource.SourceWalker, object):
self.set_pos_info(node[-3], start, len(self.f.getvalue()))
start = len(self.f.getvalue())
self.preorder(ast[iter_index])
self.set_pos_info(iter_index, start, len(self.f.getvalue()))
self.set_pos_info(ast[iter_index], start, len(self.f.getvalue()))
self.prec = p
def comprehension_walk3(self, node, iter_index, code_index=-5):
@@ -1006,6 +1010,8 @@ class FragmentsWalker(pysource.SourceWalker, object):
maybe_show_ast(self.showast, ast)
checker(ast, False, self.ast_errors)
return ast
# FIXME: we could provide another customized routine
@@ -1604,16 +1610,16 @@ class FragmentsWalker(pysource.SourceWalker, object):
args_node = node[-1]
if isinstance(args_node.attr, tuple):
if self.version == 3.3:
if self.version <= 3.3 and len(node) > 2 and node[-3] != 'LOAD_LAMBDA':
# positional args are after kwargs
defparams = node[1:args_node.attr[0]+1]
else:
# positional args are before kwargs
defparams = node[:args_node.attr[0]]
pos_aargs, kw_args, annotate_args = args_node.attr
pos_args, kw_args, annotate_argc = args_node.attr
else:
defparams = node[:args_node.attr]
kw_args, annotate_args = (0, 0)
kw_args, annotate_argc = (0, 0)
pass
if self.version > 3.0 and isLambda and iscode(node[-3].attr):
@@ -1682,7 +1688,7 @@ class FragmentsWalker(pysource.SourceWalker, object):
if len(code.co_consts)>0 and code.co_consts[0] is not None and not isLambda: # ugly
# docstring exists, dump it
self.print_docstring(indent, code.co_consts[0])
print_docstring(self, indent, code.co_consts[0])
code._tokens = None # save memory
assert ast == 'stmts'
@@ -1764,6 +1770,14 @@ def deparse_code(version, co, out=StringIO(), showasm=False, showast=False,
for g in deparsed.mod_globs:
deparsed.write('# global %s ## Warning: Unused global' % g)
if deparsed.ast_errors:
deparsed.write("# NOTE: have decompilation errors.\n")
deparsed.write("# Use -t option to show full context.")
for err in deparsed.ast_errors:
deparsed.write(err)
deparsed.ERROR = True
if deparsed.ERROR:
raise deparsed.ERROR

View File

@@ -0,0 +1,135 @@
import sys
from uncompyle6 import PYTHON3
if PYTHON3:
minint = -sys.maxsize-1
maxint = sys.maxsize
else:
minint = -sys.maxint-1
maxint = sys.maxint
def print_docstring(self, indent, docstring):
try:
if docstring.find('"""') == -1:
quote = '"""'
else:
quote = "'''"
except:
return False
self.write(indent)
if not PYTHON3 and not isinstance(docstring, str):
# Must be unicode in Python2
self.write('u')
docstring = repr(docstring.expandtabs())[2:-1]
else:
docstring = repr(docstring.expandtabs())[1:-1]
for (orig, replace) in (('\\\\', '\t'),
('\\r\\n', '\n'),
('\\n', '\n'),
('\\r', '\n'),
('\\"', '"'),
("\\'", "'")):
docstring = docstring.replace(orig, replace)
# Do a raw string if there are backslashes but no other escaped characters:
# also check some edge cases
if ('\t' in docstring
and '\\' not in docstring
and len(docstring) >= 2
and docstring[-1] != '\t'
and (docstring[-1] != '"'
or docstring[-2] == '\t')):
self.write('r') # raw string
# restore backslashes unescaped since raw
docstring = docstring.replace('\t', '\\')
else:
# Escape '"' if it's the last character, so it doesn't
# ruin the ending triple quote
if len(docstring) and docstring[-1] == '"':
docstring = docstring[:-1] + '\\"'
# Restore escaped backslashes
docstring = docstring.replace('\t', '\\\\')
# Escape triple quote when needed
if quote == '""""':
docstring = docstring.replace('"""', '\\"\\"\\"')
lines = docstring.split('\n')
calculate_indent = maxint
for line in lines[1:]:
stripped = line.lstrip()
if len(stripped) > 0:
calculate_indent = min(calculate_indent, len(line) - len(stripped))
calculate_indent = min(calculate_indent, len(lines[-1]) - len(lines[-1].lstrip()))
# Remove indentation (first line is special):
trimmed = [lines[0]]
if calculate_indent < maxint:
trimmed += [line[calculate_indent:] for line in lines[1:]]
self.write(quote)
if len(trimmed) == 0:
self.println(quote)
elif len(trimmed) == 1:
self.println(trimmed[0], quote)
else:
self.println(trimmed[0])
for line in trimmed[1:-1]:
self.println( indent, line )
self.println(indent, trimmed[-1], quote)
return True
# if __name__ == '__main__':
# if PYTHON3:
# from io import StringIO
# else:
# from StringIO import StringIO
# class PrintFake():
# def __init__(self):
# self.pending_newlines = 0
# self.f = StringIO()
# def write(self, *data):
# if (len(data) == 0) or (len(data) == 1 and data[0] == ''):
# return
# out = ''.join((str(j) for j in data))
# n = 0
# for i in out:
# if i == '\n':
# n += 1
# if n == len(out):
# self.pending_newlines = max(self.pending_newlines, n)
# return
# elif n:
# self.pending_newlines = max(self.pending_newlines, n)
# out = out[n:]
# break
# else:
# break
# if self.pending_newlines > 0:
# self.f.write('\n'*self.pending_newlines)
# self.pending_newlines = 0
# for i in out[::-1]:
# if i == '\n':
# self.pending_newlines += 1
# else:
# break
# if self.pending_newlines:
# out = out[:-self.pending_newlines]
# self.f.write(out)
# def println(self, *data):
# if data and not(len(data) == 1 and data[0] ==''):
# self.write(*data)
# self.pending_newlines = max(self.pending_newlines, 1)
# return
# pass
# for doc in (
# "Now is the time",
# r'''func placeholder - with ("""\nstring\n""")''',
# r'''func placeholder - ' and with ("""\nstring\n""")''',
# r"""func placeholder - ' and with ('''\nstring\n''') and \"\"\"\nstring\n\"\"\" """
# ):
# o = PrintFake()
# print_docstring(o, ' ', doc)
# print(o.f.getvalue())

View File

@@ -0,0 +1,559 @@
# Copyright (c) 2015, 2016 by Rocky Bernstein
# Copyright (c) 2000-2002 by hartmut Goebel <h.goebel@crazy-compilers.com>
"""
All the crazy things we have to do to handle Python functions
"""
from xdis.code import iscode
from uncompyle6.scanner import Code
from uncompyle6.parsers.astnode import AST
from uncompyle6 import PYTHON3
from uncompyle6.semantics.parser_error import ParserError
from uncompyle6.semantics.helper import print_docstring
if PYTHON3:
from itertools import zip_longest
else:
from itertools import izip_longest as zip_longest
from uncompyle6.show import maybe_show_ast_param_default
def find_all_globals(node, globs):
"""Find globals in this statement."""
for n in node:
if isinstance(n, AST):
globs = find_all_globals(n, globs)
elif n.type in ('STORE_GLOBAL', 'DELETE_GLOBAL', 'LOAD_GLOBAL'):
globs.add(n.pattr)
return globs
def find_globals(node, globs):
"""Find globals in this statement."""
for n in node:
if isinstance(n, AST):
globs = find_globals(n, globs)
elif n.type in ('STORE_GLOBAL', 'DELETE_GLOBAL'):
globs.add(n.pattr)
return globs
def find_none(node):
for n in node:
if isinstance(n, AST):
if n not in ('return_stmt', 'return_if_stmt'):
if find_none(n):
return True
elif n.type == 'LOAD_CONST' and n.pattr is None:
return True
return False
# FIXME: DRY the below code...
def make_function3_annotate(self, node, isLambda, nested=1,
codeNode=None, annotate_last=-1):
"""
Dump function defintion, doc string, and function
body. This code is specialized for Python 3"""
def build_param(ast, name, default):
"""build parameters:
- handle defaults
- handle format tuple parameters
"""
if default:
value = self.traverse(default, indent='')
maybe_show_ast_param_default(self.showast, name, value)
result = '%s=%s' % (name, value)
if result[-2:] == '= ': # default was 'LOAD_CONST None'
result += 'None'
return result
else:
return name
# MAKE_FUNCTION_... or MAKE_CLOSURE_...
assert node[-1].type.startswith('MAKE_')
annotate_tuple = node[annotate_last]
annotate_args = {}
if (annotate_tuple == 'annotate_tuple'
and annotate_tuple[0] in ('LOAD_CONST', 'LOAD_NAME')
and isinstance(annotate_tuple[0].attr, tuple)):
annotate_tup = annotate_tuple[0].attr
i = -1
j = annotate_last-1
l = -len(node)
while j >= l and node[j].type in ('annotate_arg' 'annotate_tuple'):
annotate_args[annotate_tup[i]] = (node[j][0].attr,
node[j][0] == 'LOAD_CONST')
i -= 1
j -= 1
args_node = node[-1]
if isinstance(args_node.attr, tuple):
# positional args are before kwargs
defparams = node[:args_node.attr[0]]
pos_args, kw_args, annotate_argc = args_node.attr
else:
defparams = node[:args_node.attr]
kw_args = 0
pass
if 3.0 <= self.version <= 3.2:
lambda_index = -2
elif 3.03 <= self.version:
lambda_index = -3
else:
lambda_index = None
if lambda_index and isLambda and iscode(node[lambda_index].attr):
assert node[lambda_index].type == 'LOAD_LAMBDA'
code = node[lambda_index].attr
else:
code = codeNode.attr
assert iscode(code)
code = Code(code, self.scanner, self.currentclass)
# add defaults values to parameter names
argc = code.co_argcount
paramnames = list(code.co_varnames[:argc])
try:
ast = self.build_ast(code._tokens,
code._customize,
isLambda = isLambda,
noneInNames = ('None' in code.co_names))
except ParserError as p:
self.write(str(p))
self.ERROR = p
return
kw_pairs = args_node.attr[1]
indent = self.indent
if isLambda:
self.write("lambda ")
else:
self.write("(")
last_line = self.f.getvalue().split("\n")[-1]
l = len(last_line)
indent = ' ' * l
line_number = self.line_number
if 4 & code.co_flags: # flag 2 -> variable number of args
self.write('*%s' % code.co_varnames[argc + kw_pairs])
argc += 1
i = len(paramnames) - len(defparams)
suffix = ''
for param in paramnames[:i]:
self.write(suffix, param)
if param in annotate_args:
value, string = annotate_args[param]
if string:
self.write(': "%s"' % value)
else:
self.write(': %s' % value)
suffix = ', '
suffix = ', ' if i > 0 else ''
for n in node:
if n == 'pos_arg':
self.write(suffix)
param = paramnames[i]
self.write(param)
if param in annotate_args:
self.write(':"%s' % annotate_args[param])
self.write('=')
i += 1
self.preorder(n)
if (line_number != self.line_number):
suffix = ",\n" + indent
line_number = self.line_number
else:
suffix = ', '
# self.println(indent, '#flags:\t', int(code.co_flags))
if kw_args > 0:
if not (4 & code.co_flags):
if argc > 0:
self.write(", *, ")
else:
self.write("*, ")
pass
else:
self.write(", ")
kwargs = node[0]
last = len(kwargs)-1
i = 0
for n in node[0]:
if n == 'kwarg':
self.write('%s=' % n[0].pattr)
self.preorder(n[1])
if i < last:
self.write(', ')
i += 1
pass
pass
pass
if 8 & code.co_flags: # flag 3 -> keyword args
if argc > 0:
self.write(', ')
self.write('**%s' % code.co_varnames[argc + kw_pairs])
if isLambda:
self.write(": ")
else:
self.write(')')
if 'return' in annotate_args:
value, string = annotate_args['return']
if string:
self.write(' -> "%s"' % value)
else:
self.write(' -> %s' % value)
self.println(":")
if (len(code.co_consts) > 0 and
code.co_consts[0] is not None and not isLambda): # ugly
# docstring exists, dump it
print_docstring(self, indent, code.co_consts[0])
code._tokens = None # save memory
assert ast == 'stmts'
all_globals = find_all_globals(ast, set())
for g in ((all_globals & self.mod_globs) | find_globals(ast, set())):
self.println(self.indent, 'global ', g)
self.mod_globs -= all_globals
has_none = 'None' in code.co_names
rn = has_none and not find_none(ast)
self.gen_source(ast, code.co_name, code._customize, isLambda=isLambda,
returnNone=rn)
code._tokens = code._customize = None # save memory
def make_function2(self, node, isLambda, nested=1, codeNode=None):
"""
Dump function defintion, doc string, and function body.
This code is specialied for Python 2.
"""
# FIXME: call make_function3 if we are self.version >= 3.0
# and then simplify the below.
def build_param(ast, name, default):
"""build parameters:
- handle defaults
- handle format tuple parameters
"""
# if formal parameter is a tuple, the paramater name
# starts with a dot (eg. '.1', '.2')
if name.startswith('.'):
# replace the name with the tuple-string
name = self.get_tuple_parameter(ast, name)
pass
if default:
value = self.traverse(default, indent='')
maybe_show_ast_param_default(self.showast, name, value)
result = '%s=%s' % (name, value)
if result[-2:] == '= ': # default was 'LOAD_CONST None'
result += 'None'
return result
else:
return name
# MAKE_FUNCTION_... or MAKE_CLOSURE_...
assert node[-1].type.startswith('MAKE_')
args_node = node[-1]
if isinstance(args_node.attr, tuple):
# positional args are after kwargs
defparams = node[1:args_node.attr[0]+1]
pos_args, kw_args, annotate_argc = args_node.attr
else:
defparams = node[:args_node.attr]
kw_args = 0
pass
lambda_index = None
if lambda_index and isLambda and iscode(node[lambda_index].attr):
assert node[lambda_index].type == 'LOAD_LAMBDA'
code = node[lambda_index].attr
else:
code = codeNode.attr
assert iscode(code)
code = Code(code, self.scanner, self.currentclass)
# add defaults values to parameter names
argc = code.co_argcount
paramnames = list(code.co_varnames[:argc])
# defaults are for last n parameters, thus reverse
paramnames.reverse(); defparams.reverse()
try:
ast = self.build_ast(code._tokens,
code._customize,
isLambda = isLambda,
noneInNames = ('None' in code.co_names))
except ParserError as p:
self.write(str(p))
self.ERROR = p
return
kw_pairs = args_node.attr[1] if self.version >= 3.0 else 0
indent = self.indent
# build parameters
params = [build_param(ast, name, default) for
name, default in zip_longest(paramnames, defparams, fillvalue=None)]
params.reverse() # back to correct order
if 4 & code.co_flags: # flag 2 -> variable number of args
params.append('*%s' % code.co_varnames[argc])
argc += 1
# dump parameter list (with default values)
if isLambda:
self.write("lambda ", ", ".join(params))
else:
self.write("(", ", ".join(params))
if kw_args > 0:
if not (4 & code.co_flags):
if argc > 0:
self.write(", *, ")
else:
self.write("*, ")
pass
else:
self.write(", ")
for n in node:
if n == 'pos_arg':
continue
else:
self.preorder(n)
break
pass
if 8 & code.co_flags: # flag 3 -> keyword args
if argc > 0:
self.write(', ')
self.write('**%s' % code.co_varnames[argc + kw_pairs])
if isLambda:
self.write(": ")
else:
self.println("):")
if len(code.co_consts) > 0 and code.co_consts[0] is not None and not isLambda: # ugly
# docstring exists, dump it
print_docstring(self, indent, code.co_consts[0])
code._tokens = None # save memory
assert ast == 'stmts'
all_globals = find_all_globals(ast, set())
for g in ((all_globals & self.mod_globs) | find_globals(ast, set())):
self.println(self.indent, 'global ', g)
self.mod_globs -= all_globals
has_none = 'None' in code.co_names
rn = has_none and not find_none(ast)
self.gen_source(ast, code.co_name, code._customize, isLambda=isLambda,
returnNone=rn)
code._tokens = None; code._customize = None # save memory
def make_function3(self, node, isLambda, nested=1, codeNode=None):
"""Dump function definition, doc string, and function body."""
# FIXME: call make_function3 if we are self.version >= 3.0
# and then simplify the below.
def build_param(ast, name, default):
"""build parameters:
- handle defaults
- handle format tuple parameters
"""
if default:
value = self.traverse(default, indent='')
maybe_show_ast_param_default(self.showast, name, value)
result = '%s=%s' % (name, value)
if result[-2:] == '= ': # default was 'LOAD_CONST None'
result += 'None'
return result
else:
return name
# MAKE_FUNCTION_... or MAKE_CLOSURE_...
assert node[-1].type.startswith('MAKE_')
args_node = node[-1]
if isinstance(args_node.attr, tuple):
if self.version <= 3.3 and len(node) > 2 and node[-3] != 'LOAD_LAMBDA':
# positional args are after kwargs
defparams = node[1:args_node.attr[0]+1]
else:
# positional args are before kwargs
defparams = node[:args_node.attr[0]]
pos_args, kw_args, annotate_argc = args_node.attr
else:
defparams = node[:args_node.attr]
kw_args = 0
pass
if 3.0 <= self.version <= 3.2:
lambda_index = -2
elif 3.03 <= self.version:
lambda_index = -3
else:
lambda_index = None
if lambda_index and isLambda and iscode(node[lambda_index].attr):
assert node[lambda_index].type == 'LOAD_LAMBDA'
code = node[lambda_index].attr
else:
code = codeNode.attr
assert iscode(code)
code = Code(code, self.scanner, self.currentclass)
# add defaults values to parameter names
argc = code.co_argcount
paramnames = list(code.co_varnames[:argc])
# defaults are for last n parameters, thus reverse
if not 3.0 <= self.version <= 3.2:
paramnames.reverse(); defparams.reverse()
try:
ast = self.build_ast(code._tokens,
code._customize,
isLambda = isLambda,
noneInNames = ('None' in code.co_names))
except ParserError as p:
self.write(str(p))
self.ERROR = p
return
kw_pairs = args_node.attr[1] if self.version >= 3.0 else 0
indent = self.indent
# build parameters
if self.version != 3.2:
params = [build_param(ast, name, default) for
name, default in zip_longest(paramnames, defparams, fillvalue=None)]
params.reverse() # back to correct order
if 4 & code.co_flags: # flag 2 -> variable number of args
if self.version > 3.0:
params.append('*%s' % code.co_varnames[argc + kw_pairs])
else:
params.append('*%s' % code.co_varnames[argc])
argc += 1
# dump parameter list (with default values)
if isLambda:
self.write("lambda ", ", ".join(params))
else:
self.write("(", ", ".join(params))
# self.println(indent, '#flags:\t', int(code.co_flags))
else:
if isLambda:
self.write("lambda ")
else:
self.write("(")
pass
last_line = self.f.getvalue().split("\n")[-1]
l = len(last_line)
indent = ' ' * l
line_number = self.line_number
if 4 & code.co_flags: # flag 2 -> variable number of args
self.write('*%s' % code.co_varnames[argc + kw_pairs])
argc += 1
i = len(paramnames) - len(defparams)
self.write(", ".join(paramnames[:i]))
suffix = ', ' if i > 0 else ''
for n in node:
if n == 'pos_arg':
self.write(suffix)
self.write(paramnames[i] + '=')
i += 1
self.preorder(n)
if (line_number != self.line_number):
suffix = ",\n" + indent
line_number = self.line_number
else:
suffix = ', '
if kw_args > 0:
if not (4 & code.co_flags):
if argc > 0:
self.write(", *, ")
else:
self.write("*, ")
pass
else:
self.write(", ")
if not 3.0 <= self.version <= 3.2:
for n in node:
if n == 'pos_arg':
continue
elif self.version >= 3.4 and n.type != 'kwargs':
continue
else:
self.preorder(n)
break
else:
kwargs = node[0]
last = len(kwargs)-1
i = 0
for n in node[0]:
if n == 'kwarg':
self.write('%s=' % n[0].pattr)
self.preorder(n[1])
if i < last:
self.write(', ')
i += 1
pass
pass
pass
pass
if 8 & code.co_flags: # flag 3 -> keyword args
if argc > 0:
self.write(', ')
self.write('**%s' % code.co_varnames[argc + kw_pairs])
if isLambda:
self.write(": ")
else:
self.println("):")
if len(code.co_consts) > 0 and code.co_consts[0] is not None and not isLambda: # ugly
# docstring exists, dump it
print_docstring(self, self.indent, code.co_consts[0])
code._tokens = None # save memory
assert ast == 'stmts'
all_globals = find_all_globals(ast, set())
for g in ((all_globals & self.mod_globs) | find_globals(ast, set())):
self.println(self.indent, 'global ', g)
self.mod_globs -= all_globals
has_none = 'None' in code.co_names
rn = has_none and not find_none(ast)
self.gen_source(ast, code.co_name, code._customize, isLambda=isLambda,
returnNone=rn)
code._tokens = None; code._customize = None # save memory

View File

@@ -0,0 +1,11 @@
import uncompyle6.parser as python_parser
class ParserError(python_parser.ParserError):
def __init__(self, error, tokens):
self.error = error # previous exception
self.tokens = tokens
def __str__(self):
lines = ['--- This code section failed: ---']
lines.extend([str(i) for i in self.tokens])
lines.extend( ['', str(self.error)] )
return '\n'.join(lines)

View File

@@ -79,19 +79,21 @@ from spark_parser import GenericASTTraversal, DEFAULT_DEBUG as PARSER_DEFAULT_DE
from uncompyle6.scanner import Code, get_scanner
from uncompyle6.scanners.tok import Token, NoneToken
import uncompyle6.parser as python_parser
from uncompyle6.semantics.make_function import (
make_function2, make_function3, make_function3_annotate, find_globals)
from uncompyle6.semantics.parser_error import ParserError
from uncompyle6.semantics.check_ast import checker
from uncompyle6.semantics.helper import print_docstring
from uncompyle6.show import (
maybe_show_asm,
maybe_show_ast,
maybe_show_ast_param_default,
)
if PYTHON3:
from itertools import zip_longest
from io import StringIO
minint = -sys.maxsize-1
maxint = sys.maxsize
else:
from itertools import izip_longest as zip_longest
from StringIO import StringIO
minint = -sys.maxint-1
maxint = sys.maxint
@@ -238,7 +240,12 @@ TABLE_DIRECT = {
'dict_comp_body': ( '%c:%c', 1, 0 ),
'assign': ( '%|%c = %p\n', -1, (0, 200) ),
# The 2nd parameter should have a = suffix.
# There is a rule with a 4th parameter "designator"
# which we don't use here.
'augassign1': ( '%|%c %c %c\n', 0, 2, 1),
'augassign2': ( '%|%c.%[2]{pattr} %c %c\n', 0, -3, -4),
'designList': ( '%c = %c', 0, -1 ),
'and': ( '%c and %c', 0, 2 ),
@@ -305,7 +312,7 @@ TABLE_DIRECT = {
'whileTruestmt': ( '%|while True:\n%+%c%-\n\n', 1 ),
'whilestmt': ( '%|while %c:\n%+%c%-\n\n', 1, 2 ),
'while1stmt': ( '%|while 1:\n%+%c%-\n\n', 1 ),
'while1elsestmt': ( '%|while 1:\n%+%c%-%|else:\n%+%c%-\n\n', 1, 3 ),
'while1elsestmt': ( '%|while 1:\n%+%c%-%|else:\n%+%c%-\n\n', 1, -2 ),
'whileelsestmt': ( '%|while %c:\n%+%c%-%|else:\n%+%c%-\n\n', 1, 2, -2 ),
'whileelselaststmt': ( '%|while %c:\n%+%c%-%|else:\n%+%c%-', 1, 2, -2 ),
'forstmt': ( '%|for %c in %c:\n%+%c%-\n\n', 3, 1, 4 ),
@@ -345,7 +352,7 @@ MAP_R = (TABLE_R, -1)
MAP = {
'stmt': MAP_R,
'call_function': MAP_R,
'call_function': MAP_R,
'del_stmt': MAP_R,
'designator': MAP_R,
'exprlist': MAP_R0,
@@ -430,45 +437,6 @@ def is_docstring(node):
except:
return False
class ParserError(python_parser.ParserError):
def __init__(self, error, tokens):
self.error = error # previous exception
self.tokens = tokens
def __str__(self):
lines = ['--- This code section failed: ---']
lines.extend([str(i) for i in self.tokens])
lines.extend( ['', str(self.error)] )
return '\n'.join(lines)
def find_globals(node, globs):
"""Find globals in this statement."""
for n in node:
if isinstance(n, AST):
globs = find_globals(n, globs)
elif n.type in ('STORE_GLOBAL', 'DELETE_GLOBAL'):
globs.add(n.pattr)
return globs
def find_all_globals(node, globs):
"""Find globals in this statement."""
for n in node:
if isinstance(n, AST):
globs = find_all_globals(n, globs)
elif n.type in ('STORE_GLOBAL', 'DELETE_GLOBAL', 'LOAD_GLOBAL'):
globs.add(n.pattr)
return globs
def find_none(node):
for n in node:
if isinstance(n, AST):
if not n in ('return_stmt', 'return_if_stmt'):
if find_none(n):
return True
elif n.type == 'LOAD_CONST' and n.pattr is None:
return True
return False
class SourceWalkerError(Exception):
def __init__(self, errmsg):
self.errmsg = errmsg
@@ -505,6 +473,7 @@ class SourceWalker(GenericASTTraversal, object):
self.pending_newlines = 0
self.linestarts = linestarts
self.line_number = 0
self.ast_errors = []
# hide_internal suppresses displaying the additional instructions that sometimes
# exist in code but but were not written in the source code.
@@ -518,6 +487,12 @@ class SourceWalker(GenericASTTraversal, object):
return
def indent_if_source_nl(self, line_number, indent):
if (line_number != self.line_number):
self.write("\n" + self.indent + INDENT_PER_LEVEL[:-1])
return self.line_number
def customize_for_version(self, is_pypy, version):
if is_pypy:
########################
@@ -611,158 +586,33 @@ class SourceWalker(GenericASTTraversal, object):
'comp_for': ( ' for %c in %c%c', 2, 0, 3 ),
})
if 3.1 == version:
##########################
# Python 3.1
##########################
if version >= 3.0:
TABLE_DIRECT.update({
'funcdeftest': ( '\n\n%|def %c%c\n', -1, 0),
'funcdef_annotate': ( '\n\n%|def %c%c\n', -1, 0),
'store_locals': ( '%|# inspect.currentframe().f_locals = __locals__\n', ),
})
def make_function31(node, isLambda, nested=1,
codeNode=None, annotate=None):
"""Dump function defintion, doc string, and function
body. This code is specialzed for Python 3.1"""
def n_mkfunc_annotate(node):
def build_param(ast, name, default):
"""build parameters:
- handle defaults
- handle format tuple parameters
"""
if default:
value = self.traverse(default, indent='')
maybe_show_ast_param_default(self.showast, name, value)
result = '%s=%s' % (name, value)
if result[-2:] == '= ': # default was 'LOAD_CONST None'
result += 'None'
return result
else:
return name
# MAKE_FUNCTION_... or MAKE_CLOSURE_...
assert node[-1].type.startswith('MAKE_')
args_node = node[-1]
if isinstance(args_node.attr, tuple):
defparams = node[1:args_node.attr[0]+1]
pos_args, kw_args, annotate_args = args_node.attr
if self.version >= 3.3 or node[-2] == 'kwargs':
# LOAD_CONST code object ..
# LOAD_CONST 'x0' if >= 3.3
# EXTENDED_ARG
# MAKE_FUNCTION ..
code = node[-4]
elif node[-3] == 'expr':
code = node[-3][0]
else:
defparams = node[:args_node.attr]
kw_args = 0
pass
lambda_index = -2
if lambda_index and isLambda and iscode(node[lambda_index].attr):
assert node[lambda_index].type == 'LOAD_LAMBDA'
code = node[lambda_index].attr
else:
code = codeNode.attr
assert iscode(code)
code = Code(code, self.scanner, self.currentclass)
# add defaults values to parameter names
argc = code.co_argcount
paramnames = list(code.co_varnames[:argc])
try:
ast = self.build_ast(code._tokens,
code._customize,
isLambda = isLambda,
noneInNames = ('None' in code.co_names))
except ParserError as p:
self.write(str(p))
self.ERROR = p
return
kw_pairs = args_node.attr[1] if self.version >= 3.0 else 0
indent = self.indent
params = [build_param(ast, name, default) for
name, default in zip_longest(paramnames, defparams, fillvalue=None)]
params.reverse() # back to correct order
if 4 & code.co_flags: # flag 2 -> variable number of args
if self.version > 3.0:
params.append('*%s' % code.co_varnames[argc + kw_pairs])
else:
params.append('*%s' % code.co_varnames[argc])
argc += 1
# dump parameter list (with default values)
if isLambda:
self.write("lambda ", ", ".join(params))
else:
self.write("(", ", ".join(params))
# self.println(indent, '#flags:\t', int(code.co_flags))
if kw_args > 0:
if not (4 & code.co_flags):
if argc > 0:
self.write(", *, ")
else:
self.write("*, ")
pass
else:
self.write(", ")
kwargs = node[0]
last = len(kwargs)-1
i = 0
for n in node[0]:
if n == 'kwarg':
self.write('%s=' % n[0].pattr)
self.preorder(n[1])
if i < last:
self.write(', ')
i += 1
pass
pass
pass
if 8 & code.co_flags: # flag 3 -> keyword args
if argc > 0:
self.write(', ')
self.write('**%s' % code.co_varnames[argc + kw_pairs])
if isLambda:
self.write(": ")
else:
self.write(')')
if annotate:
self.write(' -> %s' % annotate)
self.println(":")
if (len(code.co_consts) > 0 and
code.co_consts[0] is not None and not isLambda): # ugly
# docstring exists, dump it
self.print_docstring(indent, code.co_consts[0])
code._tokens = None # save memory
assert ast == 'stmts'
all_globals = find_all_globals(ast, set())
for g in ((all_globals & self.mod_globs) | find_globals(ast, set())):
self.println(self.indent, 'global ', g)
self.mod_globs -= all_globals
has_none = 'None' in code.co_names
rn = has_none and not find_none(ast)
self.gen_source(ast, code.co_name, code._customize, isLambda=isLambda,
returnNone=rn)
code._tokens = code._customize = None # save memory
self.make_function31 = make_function31
def n_mkfunctest(node):
# LOAD_CONST code object ..
# MAKE_FUNCTION ..
code = node[-3]
self.indentMore()
code = node[-3]
annotate = None
if node[-4][0][0] == 'LOAD_CONST':
annotate = node[-4][0][0].attr
self.make_function31(node, isLambda=False,
codeNode=code, annotate=annotate)
annotate_last = -4 if self.version == 3.1 else -5
# FIXME: handle and pass full annotate args
make_function3_annotate(self, node, isLambda=False,
codeNode=code, annotate_last=annotate_last)
if len(self.param_stack) > 1:
self.write('\n\n')
@@ -770,47 +620,53 @@ class SourceWalker(GenericASTTraversal, object):
self.write('\n\n\n')
self.indentLess()
self.prune() # stop recursing
self.n_mkfunctest = n_mkfunctest
self.n_mkfunc_annotate = n_mkfunc_annotate
elif 3.2 <= version <= 3.3:
##########################
# Python 3.2 and 3.3
##########################
TABLE_DIRECT.update({
'store_locals': ( '%|# inspect.currentframe().f_locals = __locals__\n', ),
})
elif version >= 3.4:
########################
# Python 3.4+ Additions
#######################
TABLE_DIRECT.update({
'LOAD_CLASSDEREF': ( '%{pattr}', ),
})
if version >= 3.6:
if version >= 3.4:
########################
# Python 3.6+ Additions
# Python 3.4+ Additions
#######################
TABLE_DIRECT.update({
'fstring_expr': ( "{%c%{conversion}}", 0),
'fstring_single': ( "f'{%c%{conversion}}'", 0),
'fstring_multi': ( "f'%c'", 0),
})
'LOAD_CLASSDEREF': ( '%{pattr}', ),
})
if version >= 3.5:
def n_unmapexpr(node):
last_n = node[0][-1]
for n in node[0]:
self.preorder(n)
if n != last_n:
self.f.write(', **')
pass
pass
self.prune()
pass
self.n_unmapexpr = n_unmapexpr
FSTRING_CONVERSION_MAP = {1: '!s', 2: '!r', 3: '!a'}
def f_conversion(node):
node.conversion = FSTRING_CONVERSION_MAP.get(node.data[1].attr, '')
if version >= 3.6:
########################
# Python 3.6+ Additions
#######################
TABLE_DIRECT.update({
'fstring_expr': ( "{%c%{conversion}}", 0),
'fstring_single': ( "f'{%c%{conversion}}'", 0),
'fstring_multi': ( "f'%c'", 0),
})
def n_fstring_expr(node):
f_conversion(node)
self.default(node)
self.n_fstring_expr = n_fstring_expr
FSTRING_CONVERSION_MAP = {1: '!s', 2: '!r', 3: '!a'}
def n_fstring_single(node):
f_conversion(node)
self.default(node)
def f_conversion(node):
node.conversion = FSTRING_CONVERSION_MAP.get(node.data[1].attr, '')
self.n_fstring_single = n_fstring_single
def n_fstring_expr(node):
f_conversion(node)
self.default(node)
self.n_fstring_expr = n_fstring_expr
def n_fstring_single(node):
f_conversion(node)
self.default(node)
self.n_fstring_single = n_fstring_single
return
@@ -835,9 +691,8 @@ class SourceWalker(GenericASTTraversal, object):
None)
def set_pos_info(self, node):
if hasattr(node, 'offset'):
if node.offset in self.linestarts:
self.line_number = self.linestarts[node.offset]
if hasattr(node, 'linestart') and node.linestart:
self.line_number = node.linestart
def preorder(self, node=None):
super(SourceWalker, self).preorder(node)
@@ -904,67 +759,6 @@ class SourceWalker(GenericASTTraversal, object):
self.write(*data)
self.pending_newlines = max(self.pending_newlines, 1)
def print_docstring(self, indent, docstring):
quote = '"""'
self.write(indent)
if not PYTHON3 and not isinstance(docstring, str):
# Must be unicode in Python2
self.write('u')
docstring = repr(docstring.expandtabs())[2:-1]
else:
docstring = repr(docstring.expandtabs())[1:-1]
for (orig, replace) in (('\\\\', '\t'),
('\\r\\n', '\n'),
('\\n', '\n'),
('\\r', '\n'),
('\\"', '"'),
("\\'", "'")):
docstring = docstring.replace(orig, replace)
# Do a raw string if there are backslashes but no other escaped characters:
# also check some edge cases
if ('\t' in docstring
and '\\' not in docstring
and len(docstring) >= 2
and docstring[-1] != '\t'
and (docstring[-1] != '"'
or docstring[-2] == '\t')):
self.write('r') # raw string
# restore backslashes unescaped since raw
docstring = docstring.replace('\t', '\\')
else:
# Escape '"' if it's the last character, so it doesn't
# ruin the ending triple quote
if len(docstring) and docstring[-1] == '"':
docstring = docstring[:-1] + '\\"'
# Escape triple quote anywhere
docstring = docstring.replace('"""', '\\"\\"\\"')
# Restore escaped backslashes
docstring = docstring.replace('\t', '\\\\')
lines = docstring.split('\n')
calculate_indent = maxint
for line in lines[1:]:
stripped = line.lstrip()
if len(stripped) > 0:
calculate_indent = min(calculate_indent, len(line) - len(stripped))
calculate_indent = min(calculate_indent, len(lines[-1]) - len(lines[-1].lstrip()))
# Remove indentation (first line is special):
trimmed = [lines[0]]
if calculate_indent < maxint:
trimmed += [line[calculate_indent:] for line in lines[1:]]
self.write(quote)
if len(trimmed) == 0:
self.println(quote)
elif len(trimmed) == 1:
self.println(trimmed[0], quote)
else:
self.println(trimmed[0])
for line in trimmed[1:-1]:
self.println( indent, line )
self.println(indent, trimmed[-1], quote)
def is_return_none(self, node):
# Is there a better way?
ret = (node[0] == 'ret_expr'
@@ -1116,7 +910,6 @@ class SourceWalker(GenericASTTraversal, object):
pass
self.write(')')
def n_LOAD_CONST(self, node):
data = node.pattr; datatype = type(data)
if isinstance(datatype, int) and data == minint:
@@ -1318,6 +1111,13 @@ class SourceWalker(GenericASTTraversal, object):
self.indentLess()
self.prune() # stop recursing
def make_function(self, node, isLambda, nested=1,
codeNode=None, annotate=None):
if self.version >= 3.0:
make_function3(self, node, isLambda, nested, codeNode)
else:
make_function2(self, node, isLambda, nested, codeNode)
def n_mklambda(self, node):
self.make_function(node, isLambda=True, codeNode=node[-2])
self.prune() # stop recursing
@@ -1346,6 +1146,7 @@ class SourceWalker(GenericASTTraversal, object):
assert n == 'lc_body'
self.write( '[ ')
if self.version >= 2.7:
expr = n[0]
list_iter = node[-1]
@@ -1356,9 +1157,21 @@ class SourceWalker(GenericASTTraversal, object):
assert expr == 'expr'
assert list_iter == 'list_iter'
# FIXME: use source line numbers for directing line breaks
line_number = self.line_number
last_line = self.f.getvalue().split("\n")[-1]
l = len(last_line)
indent = ' ' * (l-1)
self.preorder(expr)
line_number = self.indent_if_source_nl(line_number, indent)
self.preorder(list_iter)
self.write( ' ]')
l2 = self.indent_if_source_nl(line_number, indent)
if l2 != line_number:
self.write(' ' * (len(indent) - len(self.indent) - 1) + ']')
else:
self.write( ' ]')
self.prec = p
self.prune() # stop recursing
@@ -1371,7 +1184,7 @@ class SourceWalker(GenericASTTraversal, object):
n = node[-1]
elif self.is_pypy and node[-1] == 'JUMP_BACK':
n = node[-2]
list_expr = node[0]
list_expr = node[1]
if len(node) >= 3:
designator = node[3]
@@ -1396,10 +1209,9 @@ class SourceWalker(GenericASTTraversal, object):
assert expr == 'expr'
assert list_iter == 'list_iter'
# FIXME: use source line numbers for directing line breaks
self.preorder(expr)
self.write( ' for ')
self.preorder(designator)
self.write( ' in ')
self.preorder(list_expr)
self.write( ' ]')
self.prec = p
@@ -1847,9 +1659,8 @@ class SourceWalker(GenericASTTraversal, object):
self.write(sep)
name = self.traverse(l[i], indent='')
if i > 0:
if (line_number != self.line_number):
self.write("\n" + self.indent + INDENT_PER_LEVEL[:-1])
pass
line_number = self.indent_if_source_nl(line_number,
self.indent + INDENT_PER_LEVEL[:-1])
line_number = self.line_number
self.write(name, ': ')
value = self.traverse(l[i+1], indent=self.indent+(len(name)+2)*' ')
@@ -1874,9 +1685,8 @@ class SourceWalker(GenericASTTraversal, object):
self.write(sep)
name = self.traverse(l[i+1], indent='')
if i > 0:
if (line_number != self.line_number):
self.write("\n" + self.indent + INDENT_PER_LEVEL[:-1])
pass
line_number = self.indent_if_source_nl(line_number,
self.indent + INDENT_PER_LEVEL[:-1])
pass
line_number = self.line_number
self.write(name, ': ')
@@ -1905,13 +1715,12 @@ class SourceWalker(GenericASTTraversal, object):
# kv3 ::= expr expr STORE_MAP
# FIXME: DRY this and the above
indent = self.indent + " "
if kv == 'kv':
self.write(sep)
name = self.traverse(kv[-2], indent='')
if first_time:
if (line_number != self.line_number):
self.write("\n" + self.indent + " ")
pass
line_number = self.indent_if_source_nl(line_number, indent)
first_time = False
pass
line_number = self.line_number
@@ -1921,9 +1730,7 @@ class SourceWalker(GenericASTTraversal, object):
self.write(sep)
name = self.traverse(kv[1], indent='')
if first_time:
if (line_number != self.line_number):
self.write("\n" + self.indent + " ")
pass
line_number = self.indent_if_source_nl(line_number, indent)
first_time = False
pass
line_number = self.line_number
@@ -1933,9 +1740,7 @@ class SourceWalker(GenericASTTraversal, object):
self.write(sep)
name = self.traverse(kv[-2], indent='')
if first_time:
if (line_number != self.line_number):
self.write("\n" + self.indent + " ")
pass
line_number = self.indent_if_source_nl(line_number, indent)
first_time = False
pass
line_number = self.line_number
@@ -2106,18 +1911,9 @@ class SourceWalker(GenericASTTraversal, object):
node[0].attr == 1):
self.write(',')
elif typ == 'c':
# FIXME: In Python3 sometimes like from
# importfrom
# importlist2
# import_as
# designator
# STORE_NAME 'load_entry_point'
# POP_TOP '' (2, (0, 1))
# we get that weird POP_TOP tuple, e.g (2, (0,1)).
# Why? and
# Is there some sort of invalid bounds access going on?
if isinstance(entry[arg], int):
self.preorder(node[entry[arg]])
entry_node = node[entry[arg]]
self.preorder(entry_node)
arg += 1
elif typ == 'p':
p = self.prec
@@ -2264,190 +2060,6 @@ class SourceWalker(GenericASTTraversal, object):
# return self.traverse(node[1])
raise Exception("Can't find tuple parameter " + name)
def make_function(self, node, isLambda, nested=1, codeNode=None):
"""Dump function defintion, doc string, and function body."""
def build_param(ast, name, default):
"""build parameters:
- handle defaults
- handle format tuple parameters
"""
if self.version < 3.0:
# if formal parameter is a tuple, the paramater name
# starts with a dot (eg. '.1', '.2')
if name.startswith('.'):
# replace the name with the tuple-string
name = self.get_tuple_parameter(ast, name)
pass
pass
if default:
value = self.traverse(default, indent='')
maybe_show_ast_param_default(self.showast, name, value)
result = '%s=%s' % (name, value)
if result[-2:] == '= ': # default was 'LOAD_CONST None'
result += 'None'
return result
else:
return name
# MAKE_FUNCTION_... or MAKE_CLOSURE_...
assert node[-1].type.startswith('MAKE_')
args_node = node[-1]
if isinstance(args_node.attr, tuple):
if self.version <= 3.3:
# positional args are after kwargs
defparams = node[1:args_node.attr[0]+1]
else:
# positional args are before kwargs
defparams = node[:args_node.attr[0]]
pos_args, kw_args, annotate_args = args_node.attr
else:
defparams = node[:args_node.attr]
kw_args = 0
pass
if 3.0 <= self.version <= 3.2:
lambda_index = -2
elif 3.03 <= self.version:
lambda_index = -3
else:
lambda_index = None
if lambda_index and isLambda and iscode(node[lambda_index].attr):
assert node[lambda_index].type == 'LOAD_LAMBDA'
code = node[lambda_index].attr
else:
code = codeNode.attr
assert iscode(code)
code = Code(code, self.scanner, self.currentclass)
# add defaults values to parameter names
argc = code.co_argcount
paramnames = list(code.co_varnames[:argc])
# defaults are for last n parameters, thus reverse
if not 3.0 <= self.version <= 3.2:
paramnames.reverse(); defparams.reverse()
try:
ast = self.build_ast(code._tokens,
code._customize,
isLambda = isLambda,
noneInNames = ('None' in code.co_names))
except ParserError as p:
self.write(str(p))
self.ERROR = p
return
kw_pairs = args_node.attr[1] if self.version >= 3.0 else 0
indent = self.indent
# build parameters
if not 3.0 <= self.version <= 3.2:
params = [build_param(ast, name, default) for
name, default in zip_longest(paramnames, defparams, fillvalue=None)]
params.reverse() # back to correct order
if 4 & code.co_flags: # flag 2 -> variable number of args
if self.version > 3.0:
params.append('*%s' % code.co_varnames[argc + kw_pairs])
else:
params.append('*%s' % code.co_varnames[argc])
argc += 1
# dump parameter list (with default values)
if isLambda:
self.write("lambda ", ", ".join(params))
else:
self.write("(", ", ".join(params))
# self.println(indent, '#flags:\t', int(code.co_flags))
else:
if isLambda:
self.write("lambda ")
else:
self.write("(")
if 4 & code.co_flags: # flag 2 -> variable number of args
self.write('*%s' % code.co_varnames[argc + kw_pairs])
argc += 1
i = len(paramnames) - len(defparams)
self.write(",".join(paramnames[:i]))
suffix = ', ' if i > 0 else ''
for n in node:
if n == 'pos_arg':
self.write(suffix)
self.write(paramnames[i] + '=')
i += 1
self.preorder(n)
suffix = ', '
if kw_args > 0:
if not (4 & code.co_flags):
if argc > 0:
self.write(", *, ")
else:
self.write("*, ")
pass
else:
self.write(", ")
if not 3.0 <= self.version <= 3.2:
for n in node:
if n == 'pos_arg':
continue
elif self.version >= 3.4 and n.type != 'kwargs':
continue
else:
self.preorder(n)
break
else:
kwargs = node[0]
last = len(kwargs)-1
i = 0
for n in node[0]:
if n == 'kwarg':
self.write('%s=' % n[0].pattr)
self.preorder(n[1])
if i < last:
self.write(', ')
i += 1
pass
pass
pass
pass
if 8 & code.co_flags: # flag 3 -> keyword args
if argc > 0:
self.write(', ')
self.write('**%s' % code.co_varnames[argc + kw_pairs])
if isLambda:
self.write(": ")
else:
self.println("):")
if len(code.co_consts)>0 and code.co_consts[0] is not None and not isLambda: # ugly
# docstring exists, dump it
self.print_docstring(indent, code.co_consts[0])
code._tokens = None # save memory
assert ast == 'stmts'
all_globals = find_all_globals(ast, set())
for g in ((all_globals & self.mod_globs) | find_globals(ast, set())):
self.println(self.indent, 'global ', g)
self.mod_globs -= all_globals
has_none = 'None' in code.co_names
rn = has_none and not find_none(ast)
self.gen_source(ast, code.co_name, code._customize, isLambda=isLambda,
returnNone=rn)
code._tokens = None; code._customize = None # save memory
def build_class(self, code):
"""Dump class definition, doc string and class body."""
@@ -2520,9 +2132,9 @@ class SourceWalker(GenericASTTraversal, object):
docstring = ast[i][0][0][0][0].pattr
except:
docstring = code.co_consts[0]
self.print_docstring(indent, docstring)
self.println()
del ast[i]
if print_docstring(self, indent, docstring):
self.println()
del ast[i]
# the function defining a class normally returns locals(); we
@@ -2602,6 +2214,8 @@ class SourceWalker(GenericASTTraversal, object):
maybe_show_ast(self.showast, ast)
checker(ast, False, self.ast_errors)
return ast
@classmethod
@@ -2609,7 +2223,7 @@ class SourceWalker(GenericASTTraversal, object):
return MAP.get(node, MAP_DIRECT)
def deparse_code(version, co, out=sys.stdout, showasm=False, showast=False,
def deparse_code(version, co, out=sys.stdout, showasm=None, showast=False,
showgrammar=False, code_objects={}, compile_mode='exec', is_pypy=False):
"""
ingests and deparses a given code block 'co'
@@ -2619,8 +2233,7 @@ def deparse_code(version, co, out=sys.stdout, showasm=False, showast=False,
# store final output stream for case of error
scanner = get_scanner(version, is_pypy=is_pypy)
tokens, customize = scanner.ingest(co, code_objects=code_objects)
maybe_show_asm(showasm, tokens)
tokens, customize = scanner.ingest(co, code_objects=code_objects, show_asm=showasm)
debug_parser = dict(PARSER_DEFAULT_DEBUG)
if showgrammar:
@@ -2646,7 +2259,7 @@ def deparse_code(version, co, out=sys.stdout, showasm=False, showast=False,
# convert leading '__doc__ = "..." into doc string
try:
if deparsed.ast[0][0] == ASSIGN_DOC_STRING(co.co_consts[0]):
deparsed.print_docstring('', co.co_consts[0])
print_docstring(deparsed, '', co.co_consts[0])
del deparsed.ast[0]
if deparsed.ast[-1] == RETURN_NONE:
deparsed.ast.pop() # remove last node
@@ -2660,6 +2273,13 @@ def deparse_code(version, co, out=sys.stdout, showasm=False, showast=False,
for g in deparsed.mod_globs:
deparsed.write('# global %s ## Warning: Unused global' % g)
if deparsed.ast_errors:
deparsed.write("# NOTE: have internal decompilation grammar errors.\n")
deparsed.write("# Use -t option to show full context.")
for err in deparsed.ast_errors:
deparsed.write(err)
raise SourceWalkerError("Deparsing hit an internal grammar-rule bug")
if deparsed.ERROR:
raise SourceWalkerError("Deparsing stopped due to parse error")
return deparsed
@@ -2668,8 +2288,8 @@ if __name__ == '__main__':
def deparse_test(co):
"This is a docstring"
sys_version = sys.version_info.major + (sys.version_info.minor / 10.0)
deparsed = deparse_code(sys_version, co, showasm=True, showast=True)
# deparsed = deparse_code(sys_version, co, showasm=False, showast=False,
deparsed = deparse_code(sys_version, co, showasm='after', showast=True)
# deparsed = deparse_code(sys_version, co, showasm=None, showast=False,
# showgrammar=True)
print(deparsed.text)
return

View File

@@ -316,9 +316,15 @@ def cmp_code_objects(version, is_pypy, code_obj1, code_obj2,
i1 += 2
i2 += 2
continue
raise CmpErrorCode(name, tokens1[i1].offset, tokens1[i1],
tokens2[i2], tokens1, tokens2)
elif tokens1[i1].type == 'LOAD_NAME' and tokens2[i2].type == 'LOAD_CONST' \
and tokens1[i1].pattr == 'None' and tokens2[i2].pattr is None:
pass
elif tokens1[i1].type == 'RETURN_VALUE' and \
tokens2[i2].type == 'RETURN_END_IF':
pass
else:
raise CmpErrorCode(name, tokens1[i1].offset, tokens1[i1],
tokens2[i2], tokens1, tokens2)
elif tokens1[i1].type in JUMP_OPs and tokens1[i1].pattr != tokens2[i2].pattr:
dest1 = int(tokens1[i1].pattr)
dest2 = int(tokens2[i2].pattr)
@@ -350,6 +356,9 @@ def cmp_code_objects(version, is_pypy, code_obj1, code_obj2,
if is_pypy:
# For PYPY for now we don't care about PYPY_SOURCE_IS_UTF8:
flags2 &= ~0x0100 # PYPY_SOURCE_IS_UTF8
# We also don't care about COROUTINE or GENERATOR for now
flags1 &= ~0x000000a0
flags2 &= ~0x000000a0
if flags1 != flags2:
raise CmpErrorMember(name, 'co_flags',
pretty_flags(flags1),
@@ -395,7 +404,10 @@ def compare_code_with_srcfile(pyc_filename, src_filename, weak_verify=False):
msg = ("Can't compare code - Python is running with magic %s, but code is magic %s "
% (PYTHON_MAGIC_INT, magic_int))
return msg
code_obj2 = load_file(src_filename)
try:
code_obj2 = load_file(src_filename)
except SyntaxError as e:
return str(e)
cmp_code_objects(version, is_pypy, code_obj1, code_obj2, ignore_code=weak_verify)
return None

View File

@@ -1,3 +1,3 @@
# This file is suitable for sourcing inside bash as
# well as importing into Python
VERSION='2.9.3'
VERSION='2.9.8'