Compare commits


15 Commits

Author SHA1 Message Date
rocky
f859758aff Get ready for release 2.1.1 2015-12-27 12:55:52 -05:00
rocky
44cd349cc7 DRY Python3 scanner code. Some cross version handling fixed.
Some Python 3.2 and 3.3 deparse fixes.
2015-12-27 04:43:35 -05:00
rocky
4640e7dece Running native on Python 3.3 needs more work 2015-12-26 19:32:32 -05:00
rocky
ce8c7a4dc2 Add ok-2.7 tests for 3.4 full testing 2015-12-26 19:25:46 -05:00
rocky
6bd61deccc Add verify tests. Add Python 2.6 bytecode and use. 2015-12-26 19:14:53 -05:00
rocky
3ac3ef24ac Add node and template code to cleanup "for" handling 2015-12-26 10:42:57 -05:00
rocky
d6ac51d0a2 Try Python 2.6 testing on travis 2015-12-26 10:25:20 -05:00
rocky
69a8404edb For testing we can't use 3.3 bytecodes on 2.7 yet, so use 3.2 2015-12-26 10:24:02 -05:00
rocky
008bd79719 Fix up Python 3.2, 3.3, and 3.4 cross-version scanners
Try travis 2.6 and 3.3
2015-12-26 10:19:26 -05:00
rocky
e8ee3ac751 Travis: try checking 3.4 2015-12-26 07:41:58 -05:00
rocky
7a2703634f Fix up looping by reinstating JUMP_ABSOLUTE -> JUMP_BACK or CONTINUE
get jump offsets into jump attributes. Fix up 3.2 scanner partially
and use that in 3.4 for cross-version disassembly.
2015-12-26 03:06:03 -05:00
rocky
fe9c8d5734 Python3 try/except handling improvements. Add Walker exception and use
that: fixes erroneous uncompyle success message on parse error.
2015-12-26 00:12:02 -05:00
rocky
0409cee6a9 WIP redo try/except for Python3 2015-12-25 17:57:53 -05:00
rocky
39f0f7440b Fix bugs in using pysource from fragments. 2015-12-25 10:47:07 -05:00
rocky
1d533cbb23 Two modes of disassembly, one where we show hidden code and one where we don't. 2015-12-25 10:08:12 -05:00
105 changed files with 1475 additions and 1027 deletions

View File

@@ -3,7 +3,12 @@ language: python
sudo: false
python:
- '2.6'
- '2.7'
- '3.4'
install:
- pip install -r requirements-dev.txt
script:
- python ./setup.py develop && COMPILE='--compile' make check-2.7
- python ./setup.py develop && COMPILE='--compile' make check

ChangeLog
View File

@@ -1,3 +1,126 @@
2015-12-27 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, README.rst, __pkginfo__.py: Get ready for release
2.1.1
2015-12-27 rocky <rb@dustyfeet.com>
* ChangeLog, NEWS, README.rst, __pkginfo__.py,
test/bytecompile-tests, uncompyle6/opcodes/Makefile,
uncompyle6/opcodes/opcode_23.py, uncompyle6/opcodes/opcode_24.py,
uncompyle6/opcodes/opcode_25.py, uncompyle6/opcodes/opcode_26.py,
uncompyle6/opcodes/opcode_27.py, uncompyle6/opcodes/opcode_32.py,
uncompyle6/opcodes/opcode_33.py, uncompyle6/opcodes/opcode_34.py,
uncompyle6/parser.py, uncompyle6/parsers/parse3.py,
uncompyle6/scanner.py, uncompyle6/scanners/scanner25.py,
uncompyle6/scanners/scanner26.py, uncompyle6/scanners/scanner27.py,
uncompyle6/scanners/scanner3.py, uncompyle6/scanners/scanner32.py,
uncompyle6/scanners/scanner33.py, uncompyle6/scanners/scanner34.py,
uncompyle6/semantics/fragments.py, uncompyle6/semantics/pysource.py:
DRY Python3 scanner code. Some cross version handling fixed. Some
Python 3.2 and 3.3 deparse fixes.
2015-12-26 rocky <rb@dustyfeet.com>
* .travis.yml, test/Makefile, uncompyle6/verify.py: Running native
on Python 3.3 needs more work
2015-12-26 rocky <rb@dustyfeet.com>
* test/Makefile, test/test_pythonlib.py: Add ok-2.7 tests for 3.4
full testing
2015-12-26 rocky <rb@dustyfeet.com>
* test/Makefile, test/bytecompile-tests, test/test_pythonlib.py: Add
verify tests. Add Python 2.6 bytecode and use.
2015-12-26 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/fragments.py,
uncompyle6/semantics/pysource.py: Add node and template code to
cleanup "for" handling
2015-12-26 rocky <rb@dustyfeet.com>
* .travis.yml: Try Python 2.6 testing on travis
2015-12-26 rocky <rb@dustyfeet.com>
* test/Makefile: For testing we can't use 3.3 bytecodes on 2.7 yet, so
use 3.2
2015-12-26 rocky <rb@dustyfeet.com>
* .travis.yml, Makefile, requirements-dev.txt, test/Makefile,
test/bytecompile-tests, test/test_pythonlib.py,
uncompyle6/__init__.py, uncompyle6/opcodes/opcode_32.py,
uncompyle6/opcodes/opcode_33.py, uncompyle6/opcodes/opcode_34.py,
uncompyle6/scanner.py, uncompyle6/scanners/scanner32.py,
uncompyle6/scanners/scanner33.py, uncompyle6/scanners/scanner34.py,
uncompyle6/semantics/pysource.py: Fix up Python 3.2, 3.3, and 3.4
cross-version scanners Try travis 2.6 and 3.3
2015-12-26 rocky <rb@dustyfeet.com>
* .travis.yml: Travis: try checking 3.4
2015-12-26 rocky <rb@dustyfeet.com>
* test/simple_source/exception/05_try_except.py,
test/simple_source/looping/10_while.py,
test/simple_source/looping/while.py,
test/simple_source/simple_stmts/00_assign.py,
test/simple_source/simple_stmts/00_import.py,
test/simple_source/simple_stmts/00_pass.py,
test/simple_source/simple_stmts/15_assert.py,
test/simple_source/stmts/00_assign.py,
test/simple_source/stmts/00_import.py,
test/simple_source/stmts/00_pass.py,
test/simple_source/stmts/15_assert.py,
test/simple_source/stmts/15_for_if.py,
uncompyle6/parsers/parse2.py, uncompyle6/parsers/parse3.py,
uncompyle6/scanners/scanner32.py, uncompyle6/scanners/scanner34.py:
Fix up looping by reinstating JUMP_ABSOLUTE -> JUMP_BACK or CONTINUE
get jump offsets into jump attributes. Fix up 3.2 scanner partially
and use that in 3.4 for cross-version disassembly.
2015-12-26 rocky <rb@dustyfeet.com>
* test/simple_source/exception/01_try_except.py,
test/simple_source/exception/05_try_except.py, uncompyle6/main.py,
uncompyle6/opcodes/opcode_34.py, uncompyle6/parsers/parse3.py,
uncompyle6/semantics/pysource.py: Python3 try/except handling
improvements. Add Walker exception and use that: fixes erroneous
uncompyle success message on parse error.
2015-12-25 rocky <rb@dustyfeet.com>
* test/simple_source/exception/01_try_except.py,
uncompyle6/parsers/parse2.py, uncompyle6/parsers/parse3.py: WIP redo
try/except for Python3
2015-12-25 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/fragments.py,
uncompyle6/semantics/pysource.py: Fix bugs in using pysource from
fragments.
2015-12-25 rocky <rb@dustyfeet.com>
* uncompyle6/semantics/Makefile, uncompyle6/semantics/fragments.py,
uncompyle6/semantics/pysource.py: Two modes of disassembly, one
where we show hidden code and one where we don't.
2015-12-25 rocky <rb@dustyfeet.com>
* README.rst: README.rst typos
2015-12-25 rocky <rb@dustyfeet.com>
* .gitignore, ChangeLog, MANIFEST.in, NEWS, __pkginfo__.py,
test/Makefile: Get ready for release 2.0.0
2015-12-25 rocky <rb@dustyfeet.com>
* pytest/test_deparse.py: Port deparse test from python-deparse to

View File

@@ -18,8 +18,17 @@ TEST_TYPES=check-long check-short check-2.7 check-3.4
#: Default target - same as "check"
all: check
#: Run working tests
check check-3.4 check-2.7: pytest
# Run all tests
check:
@PYTHON_VERSION=`$(PYTHON) -V 2>&1 | cut -d ' ' -f 2 | cut -d'.' -f1,2`; \
$(MAKE) check-$$PYTHON_VERSION
#: Tests for Python 2.7, 3.3 and 3.4
check-2.7 check-3.3 check-3.4: pytest
$(MAKE) -C test $@
#:Tests for Python 2.6 (doesn't have pytest)
check-2.6:
$(MAKE) -C test $@
#: Run py.test tests

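The Makefile hunk above dispatches to a version-specific check-<version> target by parsing the output of $(PYTHON) -V with cut. As a side note, not part of this changeset, the same two-component version string can be derived in Python itself (compare the PYTHON_VERSION_STR line visible in a later hunk):

import sys

# Produces e.g. "2.7" or "3.4", matching the check-2.7 / check-3.4 target names above.
python_version_str = "%s.%s" % (sys.version_info[0], sys.version_info[1])
print(python_version_str)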
NEWS
View File

@@ -1,4 +1,17 @@
uncompyle6 1.0.0 2015-12-11
uncompyle6 2.1.1 2015-12-27
- packaging issues
uncompyle6 2.1.0 2015-12-27
- Python 3.x deparsing much more solid
- Better cross-version deparsing
Some bugs squashed while others run rampant. Some code cleanup while
much more is yet needed. More tests added, but many more are needed.
uncompyle6 2.0.0 2015-12-11
Changes from uncompyle2

View File

@@ -3,7 +3,8 @@
uncompyle6
==========
A native Python Byte-code Disassembler, Decompiler, and byte-code library
A native Python Byte-code Disassembler, Decompiler, Fragment Decompiler
and byte-code library
Introduction
@@ -11,7 +12,8 @@ Introduction
*uncompyle6* translates Python byte-code back into equivalent Python
source code. It accepts byte-codes from Python version 2.5 to 3.4 or
so and has been tested on Python 2.6, 2.7 and Python 3.4.
so and has been tested on Python running versions 2.6, 2.7, 3.3 and
3.4.
Why this?
---------
@@ -21,7 +23,7 @@ ability to deparse just fragments and give source-code information
around a given bytecode offset.
I am using this to deparse fragments of code inside my trepan_
debuggers_. For that, I need to record text fragements for all
debuggers_. For that, I need to record text fragments for all
byte-code offsets (of interest). This purpose although largely
compatible with the original intention is yet a little bit different.
See this_ for more information.
@@ -83,9 +85,12 @@ for usage help
Known Bugs/Restrictions
-----------------------
Python 3 deparsing is getting there, but not solid. Using Python 2 to
deparse Python 3 is problematic, especilly for versions 3.4 and
greater.
Python 2 deparsing is probably as solid as the various versions of
uncompyle2. Python 3 deparsing is not as solid. Using Python 2 to
deparse Python 3 has severe limitations, due to byte code format
differences and the current inability to retrieve code object fields across
different Python versions. (I envy the pycdc C++ code which doesn't have such
problems because they live totally outside of Python.)
See Also
--------

View File

@@ -27,7 +27,7 @@ ftp_url = None
# license = 'BSDish'
mailing_list = 'python-debugger@googlegroups.com'
modname = 'uncompyle6'
packages = ['uncompyle6', 'uncompyle6.opcodes']
packages = ['uncompyle6', 'uncompyle6.opcodes', 'uncompyle6.semantics', 'uncompyle6.scanners', 'uncompyle6.parsers']
py_modules = None
short_desc = 'Python byte-code disassembler and source-code converter'
scripts = ['bin/uncompyle6', 'bin/pydisassemble']
@@ -40,7 +40,7 @@ def get_srcdir():
return os.path.realpath(filename)
ns = {}
version = '2.0.0'
version = '2.1.1'
web = 'https://github.com/rocky/python-uncompyle6/'
# tracebacks in zip files are funky and not debuggable

requirements-dev.txt Normal file
View File

@@ -0,0 +1,2 @@
pytest
flake8

View File

@@ -20,14 +20,19 @@ check:
$(MAKE) check-$$PYTHON_VERSION
#: Run working tests from Python 2.6
check-2.6: check-bytecode-2.5 check-bytecode-2.7
check-2.6: check-bytecode-2.7 check-bytecode-2.5
$(PYTHON) test_pythonlib.py --bytecode-2.6 --verify $(COMPILE)
#: Run working tests from Python 2.7
check-2.7: check-bytecode check-2.7-ok
#: Run working tests from Python 3.3
check-3.3: check-bytecode-3.3 check-bytecode-2.7
$(PYTHON) test_pythonlib.py --bytecode-3.3 --verify $(COMPILE)
#: Run working tests from Python 3.4
check-3.4: check-bytecode check-bytecode-3.4
$(PYTHON) test_pythonlib.py --bytecode-3.4
check-3.4: check-bytecode-2.7 check-bytecode-3.2
$(PYTHON) test_pythonlib.py --bytecode-3.4 --ok-2.7 --verify $(COMPILE)
#: Check deparsing only, but from a different Python version
check-disasm:
@@ -41,6 +46,10 @@ check-bytecode:
check-bytecode-2.5:
$(PYTHON) test_pythonlib.py --bytecode-2.5
#: Check deparsing Python 2.6
check-bytecode-2.6:
$(PYTHON) test_pythonlib.py --bytecode-2.6
#: Check deparsing Python 2.7
check-bytecode-2.7:
$(PYTHON) test_pythonlib.py --bytecode-2.7
@@ -49,6 +58,10 @@ check-bytecode-2.7:
check-bytecode-3.2:
$(PYTHON) test_pythonlib.py --bytecode-3.2
#: Check deparsing Python 3.3
check-bytecode-3.3:
$(PYTHON) test_pythonlib.py --bytecode-3.3
#: Check deparsing Python 3.4
check-bytecode-3.4:
$(PYTHON) test_pythonlib.py --bytecode-3.4

(Many binary .pyc test files changed; contents not shown. Named new files include test/bytecode_2.6/05_if.pyc and test/bytecode_3.3/05_if.pyc.)

View File

@@ -54,15 +54,6 @@ tests['2.2'] = ['divide_future', 'divide_no_future', 'iterators',
tests['2.3'] = tests['2.2']
tests['2.5'] = tests['2.3']
# tests['2.7'] = ['mine'] + tests['2.6']
tests['2.7'] = [
# 'simple_source/branching/ifelse',
# 'simple_source/branching/if'
'simple_source/misc/assert',
'simple_source/misc/assign',
# 'simple_source/call_arguments/keyword',
# 'simple_source/call_arguments/positional'
]
tests['2.6'] = tests['2.7']
def file_matches(files, root, basenames, patterns):
files.extend(
@@ -79,7 +70,7 @@ for root, dirs, basenames in os.walk('simple_source'):
simple_source.append(os.path.join(root, basename)[0:-3])
pass
tests['3.4'] = simple_source
tests['2.6'] = tests['2.7'] = tests['3.2'] = tests['3.3'] = tests['3.4'] = simple_source
total_tests = len(tests['2.7'])
#tests['2.2'].sort(); print tests['2.2']

View File

@@ -25,3 +25,8 @@ except ImportError:
x = 3
finally:
x = 4
try:
x = 1
except ImportError as e:
x = 2

View File

@@ -1,6 +1,21 @@
def handle(module):
try:
module = 1
except ImportError as exc:
module = exc
return module
def handle2(module):
if module == 'foo':
try:
module = 1
except ImportError as exc:
module = exc
return module
try:
pass
except ImportError as exc:
pass
finally:
del exc
y = 1

View File

@@ -0,0 +1,2 @@
while __name__ != '__main__':
b = 4

View File

@@ -1,2 +0,0 @@
while a:
b = c

View File

@@ -0,0 +1,5 @@
if __name__:
for i in (1,2):
x = 3
pass
pass

View File

@@ -56,34 +56,27 @@ PYOC = ('*.pyc', '*.pyo')
test_options = {
# name: (src_basedir, pattern, output_base_suffix, python_version)
'test':
['test', PYC, 'test'],
'bytecode-2.5':
['bytecode_2.5', PYC, 'bytecode_2.5', 2.5],
'bytecode-2.7':
['bytecode_2.7', PYC, 'bytecode_2.7', 2.7],
'bytecode-3.2':
['bytecode_3.2', PYC, 'bytecode_3.2', 3.2],
'bytecode-3.4':
['bytecode_3.4', PYC, 'bytecode_3.4', 3.4],
('test', PYC, 'test'),
'2.7':
['python2.7', PYC, 'python2.7', 2.7],
('python2.7', PYC, 'python2.7', 2.7),
'ok-2.6':
[os.path.join(src_dir, 'ok_2.6'),
PYC, 'ok-2.6', 2.6],
(os.path.join(src_dir, 'ok_2.6'),
PYC, 'ok-2.6', 2.6),
'ok-2.7': [os.path.join(src_dir, 'ok_lib2.7'),
PYC, 'ok-2.7', 2.7],
'ok-2.7': (os.path.join(src_dir, 'ok_lib2.7'),
PYC, 'ok-2.7', 2.7),
'base-2.7': [os.path.join(src_dir, 'base_tests', 'python2.7'),
PYC, 'base_2.7', 2.7],
'base-2.7': (os.path.join(src_dir, 'base_tests', 'python2.7'),
PYC, 'base_2.7', 2.7),
}
for vers in (2.5, 2.6, 2.7, 3.2, 3.3, 3.4):
bytecode = "bytecode_%s" % vers
key = "bytecode-%s" % vers
test_options[key] = (bytecode, PYC, bytecode, vers)
#-----
def help():

View File

@@ -42,8 +42,8 @@ PYTHON_VERSION_STR = "%s.%s" % (sys.version_info[0], sys.version_info[1])
sys.setrecursionlimit(5000)
def check_python_version(program):
if not (sys.version_info[0:2] in ((2, 6), (2, 7), (3, 4))):
print('Error: %s requires %s Python 2.6, 2.7 or 3.4' % program,
if not (sys.version_info[0:2] in ((2, 6), (2, 7), (3, 2), (3, 3), (3, 4))):
print('Error: %s requires Python 2.6, 2.7, 3.2, 3.3, or 3.4' % program,
file=sys.stderr)
sys.exit(-1)
return

View File

@@ -27,8 +27,10 @@ def uncompyle(version, co, out=None, showasm=False, showast=False,
try:
pysource.deparse_code(version, co, out, showasm, showast, showgrammar)
except pysource.ParserError as e : # parser failed, dump disassembly
print(e, file=real_out)
except pysource.WalkerError as e:
# deparsing failed
if real_out != out:
print(e, file=real_out)
raise
def uncompyle_file(filename, outstream=None, showasm=False, showast=False,

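The hunk above (uncompyle6/main.py, going by the matching ChangeLog entry) tightens error handling: pysource.ParserError still falls through to a disassembly dump, while any other WalkerError is reported on the real output stream and re-raised rather than being passed off as a successful decompile. Below is a minimal, hedged usage sketch built only from the signatures visible in this diff; the module path and the compile() input are illustrative assumptions:

import sys
from uncompyle6.main import uncompyle   # path assumed from the ChangeLog entry

# Any code object works for illustration; normally it is read from a .pyc file.
co = compile("x = 1\nif x:\n    x = 2\n", "<example>", "exec")
version = float("%s.%s" % (sys.version_info[0], sys.version_info[1]))

# Deparse the code object back to source on stdout; raises on a WalkerError.
uncompyle(version, co, out=sys.stdout)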
View File

@@ -0,0 +1,7 @@
# Whatever it is you want to do, it should be forwarded to the
# top-level directories
PHONY=check all
all: check
%:
$(MAKE) -C ../.. $@

View File

@@ -1,6 +1,10 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
CPython 2.3 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 2.3's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
cmp_op = ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',

View File

@@ -1,6 +1,10 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
CPython 2.4 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 2.4's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
__all__ = ["cmp_op", "hasconst", "hasname", "hasjrel", "hasjabs",

View File

@@ -1,6 +1,10 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
CPython 2.5 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 2.5's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
cmp_op = ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',

View File

@@ -1,6 +1,10 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
CPython 2.6 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 2.6's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
cmp_op = ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',

View File

@@ -1,6 +1,10 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
CPython 2.7 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 2.7's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
cmp_op = ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',
@@ -198,3 +202,8 @@ def_op('MAP_ADD', 147)
updateGlobal()
del def_op, name_op, jrel_op, jabs_op
from uncompyle6 import PYTHON_VERSION
if PYTHON_VERSION == 2.7:
import dis
assert all(item in opmap.items() for item in dis.opmap.items())

View File

@@ -1,7 +1,10 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
CPython 3.2 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 3.2's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
__all__ = ["cmp_op", "hasconst", "hasname", "hasjrel", "hasjabs",
@@ -40,6 +43,16 @@ def jabs_op(name, op):
def_op(name, op)
hasjabs.append(op)
def updateGlobal():
# JUMP_OPs, which are used in verification, are set in the scanner
# and used in the parser grammar
globals().update({'PJIF': opmap['POP_JUMP_IF_FALSE']})
globals().update({'PJIT': opmap['POP_JUMP_IF_TRUE']})
globals().update({'JA': opmap['JUMP_ABSOLUTE']})
globals().update({'JF': opmap['JUMP_FORWARD']})
globals().update(dict([(k.replace('+', '_'), v) for (k, v) in opmap.items()]))
globals().update({'JUMP_OPs': map(lambda op: opname[op], hasjrel + hasjabs)})
# Instruction opcodes for compiled code
# Blank lines correspond to available opcodes
@@ -69,8 +82,10 @@ def_op('BINARY_TRUE_DIVIDE', 27)
def_op('INPLACE_FLOOR_DIVIDE', 28)
def_op('INPLACE_TRUE_DIVIDE', 29)
# Gone from Python 3 are
# Python 2's SLICE+0 .. SLICE+3
# Gone from Python 3 are Python2's
# SLICE+0 .. SLICE+3
# STORE_SLICE+0 .. STORE_SLICE+3
# DELETE_SLICE+0 .. DELETE_SLICE+3
def_op('STORE_MAP', 54)
def_op('INPLACE_ADD', 55)
@@ -125,6 +140,9 @@ name_op('STORE_ATTR', 95) # Index in name list
name_op('DELETE_ATTR', 96) # ""
name_op('STORE_GLOBAL', 97) # ""
name_op('DELETE_GLOBAL', 98) # ""
# Python 2's DUP_TOPX is gone
def_op('LOAD_CONST', 100) # Index in const list
hasconst.append(100)
name_op('LOAD_NAME', 101) # Index in name list
@@ -186,4 +204,10 @@ def_op('MAP_ADD', 147)
def_op('EXTENDED_ARG', 144)
EXTENDED_ARG = 144
updateGlobal()
del def_op, name_op, jrel_op, jabs_op
from uncompyle6 import PYTHON_VERSION
if PYTHON_VERSION == 3.2:
import dis
assert all(item in opmap.items() for item in dis.opmap.items())

View File

@@ -0,0 +1,214 @@
"""
CPython 3.3 bytecode opcodes
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 3.3's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
# Note: this should look exactly like Python 3.4's opcode.py
__all__ = ["cmp_op", "hasconst", "hasname", "hasjrel", "hasjabs",
"haslocal", "hascompare", "hasfree", "opname", "opmap",
"HAVE_ARGUMENT", "EXTENDED_ARG"]
cmp_op = ('<', '<=', '==', '!=', '>', '>=', 'in', 'not in', 'is',
'is not', 'exception match', 'BAD')
hasconst = []
hasname = []
hasjrel = []
hasjabs = []
haslocal = []
hascompare = []
hasfree = []
opmap = {}
opname = [''] * 256
for op in range(256): opname[op] = '<%r>' % (op,)
del op
def def_op(name, op):
opname[op] = name
opmap[name] = op
def name_op(name, op):
def_op(name, op)
hasname.append(op)
def jrel_op(name, op):
def_op(name, op)
hasjrel.append(op)
def jabs_op(name, op):
def_op(name, op)
hasjabs.append(op)
def updateGlobal():
# JUMP_OPs, which are used in verification, are set in the scanner
# and used in the parser grammar
globals().update({'PJIF': opmap['POP_JUMP_IF_FALSE']})
globals().update({'PJIT': opmap['POP_JUMP_IF_TRUE']})
globals().update({'JA': opmap['JUMP_ABSOLUTE']})
globals().update({'JF': opmap['JUMP_FORWARD']})
globals().update(dict([(k.replace('+', '_'), v) for (k, v) in opmap.items()]))
globals().update({'JUMP_OPs': map(lambda op: opname[op], hasjrel + hasjabs)})
# Instruction opcodes for compiled code
# Blank lines correspond to available opcodes
def_op('STOP_CODE', 0)
def_op('POP_TOP', 1)
def_op('ROT_TWO', 2)
def_op('ROT_THREE', 3)
def_op('DUP_TOP', 4)
def_op('DUP_TOP_TWO', 5)
def_op('NOP', 9)
def_op('UNARY_POSITIVE', 10)
def_op('UNARY_NEGATIVE', 11)
def_op('UNARY_NOT', 12)
def_op('UNARY_INVERT', 15)
def_op('BINARY_POWER', 19)
def_op('BINARY_MULTIPLY', 20)
def_op('BINARY_MODULO', 22)
def_op('BINARY_ADD', 23)
def_op('BINARY_SUBTRACT', 24)
def_op('BINARY_SUBSCR', 25)
def_op('BINARY_FLOOR_DIVIDE', 26)
def_op('BINARY_TRUE_DIVIDE', 27)
def_op('INPLACE_FLOOR_DIVIDE', 28)
def_op('INPLACE_TRUE_DIVIDE', 29)
# Gone from Python 3 are
# Python 2's SLICE+0 .. SLICE+3
def_op('STORE_MAP', 54)
def_op('INPLACE_ADD', 55)
def_op('INPLACE_SUBTRACT', 56)
def_op('INPLACE_MULTIPLY', 57)
def_op('INPLACE_MODULO', 59)
def_op('STORE_SUBSCR', 60)
def_op('DELETE_SUBSCR', 61)
def_op('BINARY_LSHIFT', 62)
def_op('BINARY_RSHIFT', 63)
def_op('BINARY_AND', 64)
def_op('BINARY_XOR', 65)
def_op('BINARY_OR', 66)
def_op('INPLACE_POWER', 67)
def_op('GET_ITER', 68)
def_op('STORE_LOCALS', 69)
def_op('PRINT_EXPR', 70)
def_op('LOAD_BUILD_CLASS', 71)
# Python3 drops/changes:
# def_op('PRINT_ITEM', 71)
# def_op('PRINT_NEWLINE', 72)
def_op('YIELD_FROM', 72)
# def_op('PRINT_ITEM_TO', 73)
# def_op('PRINT_NEWLINE_TO', 74)
def_op('INPLACE_LSHIFT', 75)
def_op('INPLACE_RSHIFT', 76)
def_op('INPLACE_AND', 77)
def_op('INPLACE_XOR', 78)
def_op('INPLACE_OR', 79)
def_op('BREAK_LOOP', 80)
def_op('WITH_CLEANUP', 81)
def_op('RETURN_VALUE', 83)
def_op('IMPORT_STAR', 84)
def_op('YIELD_VALUE', 86)
def_op('POP_BLOCK', 87)
def_op('END_FINALLY', 88)
def_op('POP_EXCEPT', 89)
HAVE_ARGUMENT = 90 # Opcodes from here have an argument:
name_op('STORE_NAME', 90) # Index in name list
name_op('DELETE_NAME', 91) # ""
def_op('UNPACK_SEQUENCE', 92) # Number of tuple items
jrel_op('FOR_ITER', 93)
def_op('UNPACK_EX', 94)
name_op('STORE_ATTR', 95) # Index in name list
name_op('DELETE_ATTR', 96) # ""
name_op('STORE_GLOBAL', 97) # ""
name_op('DELETE_GLOBAL', 98) # ""
def_op('LOAD_CONST', 100) # Index in const list
hasconst.append(100)
name_op('LOAD_NAME', 101) # Index in name list
def_op('BUILD_TUPLE', 102) # Number of tuple items
def_op('BUILD_LIST', 103) # Number of list items
def_op('BUILD_SET', 104) # Number of set items
def_op('BUILD_MAP', 105) # Number of dict entries (upto 255)
name_op('LOAD_ATTR', 106) # Index in name list
def_op('COMPARE_OP', 107) # Comparison operator
hascompare.append(107)
name_op('IMPORT_NAME', 108) # Index in name list
name_op('IMPORT_FROM', 109) # Index in name list
jrel_op('JUMP_FORWARD', 110) # Number of bytes to skip
jabs_op('JUMP_IF_FALSE_OR_POP', 111) # Target byte offset from beginning of code
jabs_op('JUMP_IF_TRUE_OR_POP', 112) # ""
jabs_op('JUMP_ABSOLUTE', 113) # ""
jabs_op('POP_JUMP_IF_FALSE', 114) # ""
jabs_op('POP_JUMP_IF_TRUE', 115) # ""
name_op('LOAD_GLOBAL', 116) # Index in name list
jabs_op('CONTINUE_LOOP', 119) # Target address
jrel_op('SETUP_LOOP', 120) # Distance to target address
jrel_op('SETUP_EXCEPT', 121) # ""
jrel_op('SETUP_FINALLY', 122) # ""
def_op('LOAD_FAST', 124) # Local variable number
haslocal.append(124)
def_op('STORE_FAST', 125) # Local variable number
haslocal.append(125)
def_op('DELETE_FAST', 126) # Local variable number
haslocal.append(126)
def_op('RAISE_VARARGS', 130) # Number of raise arguments (1, 2, or 3)
def_op('CALL_FUNCTION', 131) # #args + (#kwargs << 8)
def_op('MAKE_FUNCTION', 132) # Number of args with default values
def_op('BUILD_SLICE', 133) # Number of items
def_op('MAKE_CLOSURE', 134)
def_op('LOAD_CLOSURE', 135)
hasfree.append(135)
def_op('LOAD_DEREF', 136)
hasfree.append(136)
def_op('STORE_DEREF', 137)
hasfree.append(137)
def_op('DELETE_DEREF', 138)
hasfree.append(138)
def_op('CALL_FUNCTION_VAR', 140) # #args + (#kwargs << 8)
def_op('CALL_FUNCTION_KW', 141) # #args + (#kwargs << 8)
def_op('CALL_FUNCTION_VAR_KW', 142) # #args + (#kwargs << 8)
jrel_op('SETUP_WITH', 143)
def_op('LIST_APPEND', 145)
def_op('SET_ADD', 146)
def_op('MAP_ADD', 147)
def_op('EXTENDED_ARG', 144)
EXTENDED_ARG = 144
updateGlobal()
del def_op, name_op, jrel_op, jabs_op
from uncompyle6 import PYTHON_VERSION
if PYTHON_VERSION == 3.3:
import dis
# for item in dis.opmap.items():
# if item not in opmap.items():
# print(item)
assert all(item in opmap.items() for item in dis.opmap.items())

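The new opcode_33.py above follows the same layout as the other per-version tables: def_op()/name_op()/jrel_op()/jabs_op() fill opname, opmap and the hasjrel/hasjabs/haslocal/... lists, updateGlobal() exports shorthand names such as PJIF and JUMP_OPs, and the trailing assert cross-checks the table against the running interpreter's dis.opmap. A small hedged sketch of consuming that table (illustration only; the superset check is only meaningful when run under Python 3.3):

import dis
from uncompyle6.opcodes import opcode_33   # the module added in this changeset

# Every opcode the stdlib knows must appear, with the same number, in the
# superset table; extra uncompyle6-only opcodes are allowed on top.
missing = [name for name, op in dis.opmap.items()
           if opcode_33.opmap.get(name) != op]
print(missing or "opcode_33.opmap is a superset of dis.opmap")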
View File

@@ -1,9 +1,11 @@
"""
opcode module - potentially shared between dis and other modules which
operate on bytecodes (e.g. peephole optimizers).
"""
CPython 3.4 bytecode opcodes
# Note: this should look exactly like Python 3.4's opcode.py
This is used in scanner (bytecode disassembly) and parser (Python grammar).
This is a superset of Python 3.4's opcode.py with some opcodes that simplify
parsing and semantic interpretation.
"""
__all__ = ["cmp_op", "hasconst", "hasname", "hasjrel", "hasjabs",
"haslocal", "hascompare", "hasfree", "opname", "opmap",
@@ -43,8 +45,8 @@ def jabs_op(name, op):
hasjabs.append(op)
def updateGlobal():
# JUMP_OPs are used in verification and in the scanner in resolving forward/backward
# jumps
# JUMP_OPs, which are used in verification, are set in the scanner
# and used in the parser grammar
globals().update({'PJIF': opmap['POP_JUMP_IF_FALSE']})
globals().update({'PJIT': opmap['POP_JUMP_IF_TRUE']})
globals().update({'JA': opmap['JUMP_ABSOLUTE']})
@@ -81,8 +83,10 @@ def_op('BINARY_TRUE_DIVIDE', 27)
def_op('INPLACE_FLOOR_DIVIDE', 28)
def_op('INPLACE_TRUE_DIVIDE', 29)
# Gone from Python 3 are
# Python 2's SLICE+0 .. SLICE+3
# Gone from Python 3 are Python2's
# SLICE+0 .. SLICE+3
# STORE_SLICE+0 .. STORE_SLICE+3
# DELETE_SLICE+0 .. DELETE_SLICE+3
def_op('STORE_MAP', 54)
def_op('INPLACE_ADD', 55)
@@ -140,6 +144,9 @@ name_op('STORE_ATTR', 95) # Index in name list
name_op('DELETE_ATTR', 96) # ""
name_op('STORE_GLOBAL', 97) # ""
name_op('DELETE_GLOBAL', 98) # ""
# Python 2's DUP_TOPX is gone
def_op('LOAD_CONST', 100) # Index in const list
hasconst.append(100)
name_op('LOAD_NAME', 101) # Index in name list
@@ -210,3 +217,8 @@ EXTENDED_ARG = 144
updateGlobal()
del def_op, name_op, jrel_op, jabs_op
from uncompyle6 import PYTHON_VERSION
if PYTHON_VERSION == 3.4:
import dis
assert all(item in opmap.items() for item in dis.opmap.items())

View File

@@ -82,10 +82,12 @@ def get_python_parser(version, debug_parser):
"""
if version < 3.0:
import uncompyle6.parsers.parse2 as parse2
return parse2.Python2Parser(debug_parser)
p = parse2.Python2Parser(debug_parser)
else:
import uncompyle6.parsers.parse3 as parse3
return parse3.Python3Parser(debug_parser)
p = parse3.Python3Parser(debug_parser)
p.version = version
return p
def python_parser(version, co, out=sys.stdout, showasm=False,
parser_debug=PARSER_DEFAULT_DEBUG):
@@ -94,9 +96,9 @@ def python_parser(version, co, out=sys.stdout, showasm=False,
from uncompyle6.scanner import get_scanner
scanner = get_scanner(version)
tokens, customize = scanner.disassemble(co)
# if showasm:
# for t in tokens:
# print(t)
if showasm:
for t in tokens:
print(t)
p = get_python_parser(version, parser_debug)
return parse(p, tokens, customize)

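The parser.py hunk above records the target version on the parser object (p.version) so the Python 3 grammar can vary its custom rules by version, and it re-enables the showasm token dump. A hedged sketch of the entry point whose signature appears in this hunk; the module path is taken from the ChangeLog, and running under the same Python version that produced the code object is assumed:

import sys
from uncompyle6.parser import python_parser

co = compile("for i in (1, 2):\n    x = 3\n", "<example>", "exec")
version = float("%s.%s" % (sys.version_info[0], sys.version_info[1]))

# Disassembles co with the matching scanner, then parses the token stream
# into a syntax tree; showasm=True also prints the tokens, as re-enabled above.
tree = python_parser(version, co, out=sys.stdout, showasm=True)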
View File

@@ -405,21 +405,26 @@ class Python2Parser(PythonParser):
trystmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle COME_FROM
try_middle COME_FROM
tryelsestmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suite COME_FROM
# this is nested inside a trystmt
tryfinallystmt ::= SETUP_FINALLY suite_stmts
POP_BLOCK LOAD_CONST
COME_FROM suite_stmts_opt END_FINALLY
tryelsestmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suite COME_FROM
tryelsestmtc ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suitec COME_FROM
try_middle else_suitec COME_FROM
tryelsestmtl ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suitel COME_FROM
try_middle else_suitel COME_FROM
try_middle ::= jmp_abs COME_FROM except_stmts
END_FINALLY
END_FINALLY
try_middle ::= JUMP_FORWARD COME_FROM except_stmts
END_FINALLY COME_FROM
END_FINALLY COME_FROM
except_stmts ::= except_stmts except_stmt
except_stmts ::= except_stmt
@@ -445,10 +450,6 @@ class Python2Parser(PythonParser):
jmp_abs ::= JUMP_ABSOLUTE
jmp_abs ::= JUMP_BACK
tryfinallystmt ::= SETUP_FINALLY suite_stmts
POP_BLOCK LOAD_CONST
COME_FROM suite_stmts_opt END_FINALLY
withstmt ::= expr SETUP_WITH POP_TOP suite_stmts_opt
POP_BLOCK LOAD_CONST COME_FROM
WITH_CLEANUP END_FINALLY

View File

@@ -53,12 +53,11 @@ class Python3Parser(PythonParser):
def p_list_comprehension(self, args):
'''
# Python3 adds LOAD_LISTCOMP and does list comprehension like
# Python3 scanner adds LOAD_LISTCOMP. Python3 does list comprehension like
# other comprehensions (set, dictionary).
# listcomp is a custom rule
expr ::= listcomp
listcomp ::= LOAD_LISTCOMP LOAD_CONST MAKE_FUNCTION_0 expr GET_ITER CALL_FUNCTION_1
expr ::= list_compr
list_compr ::= BUILD_LIST_0 list_iter
@@ -71,7 +70,7 @@ class Python3Parser(PythonParser):
_come_from ::= COME_FROM
_come_from ::=
list_for ::= expr FOR_ITER designator list_iter JUMP_ABSOLUTE
list_for ::= expr FOR_ITER designator list_iter JUMP_BACK
list_if ::= expr jmp_false list_iter
list_if_not ::= expr jmp_true list_iter
@@ -407,7 +406,7 @@ class Python3Parser(PythonParser):
testtrue ::= expr jmp_true
_ifstmts_jump ::= return_if_stmts
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM
_ifstmts_jump ::= c_stmts_opt JUMP_FORWARD COME_FROM _come_from
iflaststmt ::= testexpr c_stmts_opt JUMP_ABSOLUTE
@@ -425,49 +424,52 @@ class Python3Parser(PythonParser):
trystmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle COME_FROM
tryfinallystmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle
POP_TOP SETUP_FINALLY POP_BLOCK POP_EXCEPT LOAD_CONST
COME_FROM suite_stmts END_FINALLY
JUMP_FORWARD END_FINALLY COME_FROM COME_FROM
# this is nested inside a trystmt
tryfinallystmt ::= SETUP_FINALLY suite_stmts
POP_BLOCK LOAD_CONST
COME_FROM suite_stmts END_FINALLY
COME_FROM suite_stmts_opt END_FINALLY
tryelsestmt ::= SETUP_EXCEPT suite_stmts_opt
tryelsestmt ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suite COME_FROM
tryelsestmtc ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suitec COME_FROM
try_middle else_suitec COME_FROM
tryelsestmtl ::= SETUP_EXCEPT suite_stmts_opt POP_BLOCK
try_middle else_suitel COME_FROM
try_middle else_suitel COME_FROM
try_middle ::= jmp_abs COME_FROM except_stmts END_FINALLY
try_middle ::= jmp_abs COME_FROM except_stmts
END_FINALLY
try_middle ::= JUMP_FORWARD COME_FROM except_stmts
END_FINALLY COME_FROM
try_middle ::= JUMP_FORWARD COME_FROM except_cond2
except_stmts ::= except_stmts except_stmt
except_stmts ::= except_stmt
except_stmt ::= except_cond1 except_suite
except_stmt ::= except_cond2 except_suite
except_stmt ::= except_cond2 except_suite
except_stmt ::= except_cond2 except_suite_finalize
except_stmt ::= except
# Python3 introduced POP_EXCEPT
except_suite ::= c_stmts_opt POP_EXCEPT JUMP_FORWARD
except_suite ::= c_stmts_opt POP_EXCEPT jmp_abs
# This is used in Python 3 in
# "except ... as e" to remove 'e' after the c_stmts_opt finishes
except_suite_finalize ::= SETUP_FINALLY c_stmts_opt except_var_finalize
END_FINALLY _jump
except_var_finalize ::= POP_BLOCK POP_EXCEPT LOAD_CONST COME_FROM LOAD_CONST
designator del_stmt
except_suite ::= return_stmts
except_cond1 ::= DUP_TOP expr COMPARE_OP
jmp_false POP_TOP POP_TOP POP_TOP
except_cond2 ::= DUP_TOP expr COMPARE_OP
jmp_false POP_TOP designator
jmp_false POP_TOP designator POP_TOP
except ::= POP_TOP POP_TOP POP_TOP POP_EXCEPT c_stmts_opt JUMP_FORWARD
except ::= POP_TOP POP_TOP POP_TOP c_stmts_opt JUMP_FORWARD
@@ -487,7 +489,7 @@ class Python3Parser(PythonParser):
whilestmt ::= SETUP_LOOP
testexpr
l_stmts_opt JUMP_ABSOLUTE
l_stmts_opt JUMP_BACK
POP_BLOCK COME_FROM
whilestmt ::= SETUP_LOOP
@@ -512,11 +514,11 @@ class Python3Parser(PythonParser):
_for ::= GET_ITER FOR_ITER
_for ::= LOAD_CONST FOR_LOOP
for_block ::= l_stmts_opt JUMP_ABSOLUTE
for_block ::= l_stmts_opt JUMP_BACK
for_block ::= return_stmts _come_from
forstmt ::= SETUP_LOOP expr _for designator
for_block POP_BLOCK COME_FROM
for_block POP_BLOCK _come_from
forelsestmt ::= SETUP_LOOP expr _for designator
for_block POP_BLOCK else_suite COME_FROM
@@ -670,6 +672,14 @@ class Python3Parser(PythonParser):
'''
def custom_buildclass_rule(self, opname, i, token, tokens, customize):
"""
Python >= 3.3:
buildclass ::= LOAD_BUILD_CLASS mkfunc LOAD_CONST LOAD_CLASSNAME CALL_FUNCTION_3
Python < 3.3
buildclass ::= LOAD_BUILD_CLASS LOAD_CONST MAKE_FUNCTION_0 LOAD_CONST
CALL_FUNCTION_n
"""
# look for next MAKE_FUNCTION
for i in range(i+1, len(tokens)):
@@ -677,11 +687,15 @@ class Python3Parser(PythonParser):
break
pass
assert i < len(tokens)
assert tokens[i+1].type == 'LOAD_CONST'
if self.version >= 3.3:
assert tokens[i+1].type == 'LOAD_CONST'
load_check = 'LOAD_NAME'
else:
load_check = 'LOAD_CONST'
# find load names
have_loadname = False
for i in range(i+1, len(tokens)):
if tokens[i].type == 'LOAD_NAME':
if tokens[i].type == load_check:
tokens[i].type = 'LOAD_CLASSNAME'
have_loadname = True
break
@@ -703,9 +717,14 @@ class Python3Parser(PythonParser):
j = 0
load_names = ''
# customize CALL_FUNCTION
call_function = 'CALL_FUNCTION_%d' % (j + 2)
rule = ("buildclass ::= LOAD_BUILD_CLASS mkfunc LOAD_CONST %s%s" %
(load_names, call_function))
if self.version >= 3.3:
call_function = 'CALL_FUNCTION_%d' % (j + 2)
rule = ("buildclass ::= LOAD_BUILD_CLASS mkfunc LOAD_CONST %s%s" %
(load_names, call_function))
else:
call_function = 'CALL_FUNCTION_%d' % (j + 1)
rule = ("buildclass ::= LOAD_BUILD_CLASS mkfunc %s%s" %
(load_names, call_function))
self.add_unique_rule(rule, opname, token.attr, customize)
return
@@ -714,6 +733,15 @@ class Python3Parser(PythonParser):
Special handling for opcodes that take a variable number
of arguments -- we add a new rule for each:
Python 3.4:
listcomp ::= LOAD_LISTCOMP LOAD_CONST MAKE_FUNCTION_0 expr
GET_ITER CALL_FUNCTION_1
Python < 3.4
listcomp ::= LOAD_LISTCOMP MAKE_FUNCTION_0 expr
GET_ITER CALL_FUNCTION_1
buildclass (see load_build_class)
build_list ::= {expr}^n BUILD_LIST_n
build_list ::= {expr}^n BUILD_TUPLE_n
unpack_list ::= UNPACK_LIST {expr}^n
@@ -747,6 +775,14 @@ class Python3Parser(PythonParser):
+ ('kwarg ' * args_kw)
+ 'expr ' * nak + token.type)
self.add_unique_rule(rule, token.type, args_pos, customize)
elif opname == 'LOAD_LISTCOMP':
if self.version >= 3.4:
rule = ("listcomp ::= LOAD_LISTCOMP LOAD_CONST MAKE_FUNCTION_0 expr "
"GET_ITER CALL_FUNCTION_1")
else:
rule = ("listcomp ::= LOAD_LISTCOMP MAKE_FUNCTION_0 expr "
"GET_ITER CALL_FUNCTION_1")
self.add_unique_rule(rule, opname, token.attr, customize)
elif opname == 'LOAD_BUILD_CLASS':
self.custom_buildclass_rule(opname, i, token, tokens, customize)
elif opname_base in ('BUILD_LIST', 'BUILD_TUPLE', 'BUILD_SET'):
@@ -760,7 +796,10 @@ class Python3Parser(PythonParser):
elif opname_base == ('MAKE_FUNCTION'):
self.addRule('mklambda ::= %s LOAD_LAMBDA %s' %
('expr ' * token.attr, opname), nop_func)
rule = 'mkfunc ::= %s LOAD_CONST LOAD_CONST %s' % ('expr ' * token.attr, opname)
if self.version >= 3.3:
rule = 'mkfunc ::= %s LOAD_CONST LOAD_CONST %s' % ('expr ' * token.attr, opname)
else:
rule = 'mkfunc ::= %s LOAD_CONST %s' % ('expr ' * token.attr, opname)
self.add_unique_rule(rule, opname, token.attr, customize)
pass
return

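To make the version split in custom_buildclass_rule() above concrete, here is the pair of grammar rules the two branches emit in the simplest case, reproducing the string formatting from the hunk with j = 0 and no extra LOAD_CLASSNAMEs collected (illustration only):

j, load_names = 0, ''
rule_33_and_up = ("buildclass ::= LOAD_BUILD_CLASS mkfunc LOAD_CONST %s%s" %
                  (load_names, 'CALL_FUNCTION_%d' % (j + 2)))
rule_pre_33 = ("buildclass ::= LOAD_BUILD_CLASS mkfunc %s%s" %
               (load_names, 'CALL_FUNCTION_%d' % (j + 1)))
print(rule_33_and_up)   # buildclass ::= LOAD_BUILD_CLASS mkfunc LOAD_CONST CALL_FUNCTION_2
print(rule_pre_33)      # buildclass ::= LOAD_BUILD_CLASS mkfunc CALL_FUNCTION_1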
View File

@@ -25,16 +25,21 @@ if PYTHON3:
intern = sys.intern
L65536 = 65536
def cmp(a, b):
return (a > b) - (a < b)
def long(l): return l
else:
L65536 = long(65536) # NOQA
from uncompyle6.opcodes import opcode_25, opcode_26, opcode_27, opcode_32, opcode_34
from uncompyle6.opcodes import opcode_25, opcode_26, opcode_27, opcode_32, opcode_33, opcode_34
class GenericPythonCode:
'''
Class for representing code-like objects across different versions of
Python.
'''
def __init__(self):
return
class Code:
'''
Class for representing code-objects.
@@ -49,9 +54,9 @@ class Code:
self._tokens, self._customize = scanner.disassemble(co, classname)
class Scanner(object):
opc = None # opcode module
def __init__(self, version):
# FIXME: DRY
if version == 2.7:
self.opc = opcode_27
elif version == 2.6:
@@ -60,16 +65,16 @@ class Scanner(object):
self.opc = opcode_25
elif version == 3.2:
self.opc = opcode_32
elif version == 3.3:
self.opc = opcode_33
elif version == 3.4:
self.opc = opcode_34
else:
raise TypeError("%s is not a Python version I know about" % version)
# FIXME: This weird Python2 behavior is not Python3
self.resetTokenClass()
def setShowAsm(self, showasm, out=None):
self.showasm = showasm
self.out = out
def setTokenClass(self, tokenClass):
# assert isinstance(tokenClass, types.ClassType)
self.Token = tokenClass
@@ -280,7 +285,7 @@ class Scanner(object):
return result
def restrict_to_parent(self, target, parent):
'''Restrict pos to parent boundaries.'''
"""Restrict target to parent structure boundaries."""
if not (parent['start'] < target < parent['end']):
target = parent['end']
return target
@@ -302,6 +307,9 @@ def get_scanner(version):
elif version == 3.2:
import uncompyle6.scanners.scanner32 as scan
scanner = scan.Scanner32()
elif version == 3.3:
import uncompyle6.scanners.scanner33 as scan
scanner = scan.Scanner33()
elif version == 3.4:
import uncompyle6.scanners.scanner34 as scan
scanner = scan.Scanner34()

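The scanner.py hunk above wires Python 3.3 into both dispatch points: Scanner.__init__ now selects opcode_33 as its opc table and get_scanner() can return a Scanner33. A minimal hedged usage sketch assembled from the calls visible in this diff; as with the parser, the code object is assumed to come from the running interpreter's own version:

import sys
from uncompyle6.scanner import get_scanner

co = compile("x = 1\n", "<example>", "exec")
version = float("%s.%s" % (sys.version_info[0], sys.version_info[1]))

scanner = get_scanner(version)
# Turn the code object into the token stream that the grammar consumes.
tokens, customize = scanner.disassemble(co)
for t in tokens:
    print(t)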
View File

@@ -12,7 +12,6 @@ Python 3 and other versions of Python. Also, we save token
information for later use in deparsing.
"""
import inspect
from collections import namedtuple
from array import array
@@ -151,7 +150,11 @@ class Scanner25(scan.Scanner):
continue
if op in hasconst:
const = co.co_consts[oparg]
if inspect.iscode(const):
# We can't use inspect.iscode() because we may be
# using a different version of Python than the
# one that this was byte-compiled on. So the code
# types may mismatch.
if hasattr(const, 'co_name'):
oparg = const
if const.co_name == '<lambda>':
assert op_name == 'LOAD_CONST'
@@ -912,6 +915,7 @@ class Scanner25(scan.Scanner):
return targets
if __name__ == "__main__":
import inspect
co = inspect.currentframe().f_code
tokens, customize = Scanner25().disassemble(co)
for t in tokens:

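The switch from inspect.iscode() to hasattr(const, 'co_name') in the hunk above (and in scanner26/scanner27 below) is the cross-version point: a constant unmarshalled from another Python version's bytecode need not be an instance of the running interpreter's code type, but anything code-like still exposes co_name. A toy illustration of that distinction (not from the changeset; ForeignCode is a hypothetical stand-in):

import inspect

class ForeignCode(object):
    # Stand-in for a code object read from a different Python version's .pyc.
    co_name = '<lambda>'

const = ForeignCode()
print(inspect.iscode(const))       # False: not this interpreter's code type
print(hasattr(const, 'co_name'))   # True: the duck-typed test used above still matches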
View File

@@ -11,7 +11,6 @@ other versions of Python. Also, we save token information for later
use in deparsing.
"""
import inspect
from collections import namedtuple
from array import array
@@ -145,7 +144,11 @@ class Scanner26(scan.Scanner):
continue
if op in hasconst:
const = co.co_consts[oparg]
if inspect.iscode(const):
# We can't use inspect.iscode() because we may be
# using a different version of Python than the
# one that this was byte-compiled on. So the code
# types may mismatch.
if hasattr(const, 'co_name'):
oparg = const
if const.co_name == '<lambda>':
assert op_name == 'LOAD_CONST'
@@ -901,6 +904,7 @@ class Scanner26(scan.Scanner):
return targets
if __name__ == "__main__":
import inspect
co = inspect.currentframe().f_code
tokens, customize = Scanner26().disassemble(co)
for t in tokens:

View File

@@ -138,7 +138,11 @@ class Scanner27(scan.Scanner):
continue
if op in hasconst:
const = co.co_consts[oparg]
if inspect.iscode(const):
# We can't use inspect.iscode() because we may be
# using a different version of Python than the
# one that this was byte-compiled on. So the code
# types may mismatch.
if hasattr(const, 'co_name'):
oparg = const
if const.co_name == '<lambda>':
assert op_name == 'LOAD_CONST'

View File

@@ -0,0 +1,607 @@
# Copyright (c) 2015 by Rocky Bernstein
"""
Python 3 generic bytecode scanner/deparser
This overlaps Python 3's dis module, but it can be run from
Python 2 and other versions of Python. Also, we save token information
for later use in deparsing.
"""
from __future__ import print_function
import dis, re
from collections import namedtuple
from array import array
from uncompyle6.scanner import Token
from uncompyle6 import PYTHON_VERSION, PYTHON3
# Get all the opcodes into globals
globals().update(dis.opmap)
from uncompyle6.opcodes.opcode_33 import *
import uncompyle6.scanner as scan
class Scanner3(scan.Scanner):
def __init__(self):
scan.Scanner.__init__(self, PYTHON_VERSION)
def disassemble_generic(self, co, classname=None):
"""
Convert code object <co> into a sequence of tokens.
The below is based on (an older version of?) Python's dis.disassemble_bytes().
"""
# Container for tokens
tokens = []
customize = {}
self.code = code = array('B', co.co_code)
codelen = len(code)
self.build_lines_data(co)
self.build_prev_op()
# self.lines contains (block,addrLastInstr)
if classname:
classname = '_' + classname.lstrip('_') + '__'
def unmangle(name):
if name.startswith(classname) and name[-2:] != '__':
return name[len(classname) - 2:]
return name
free = [ unmangle(name) for name in (co.co_cellvars + co.co_freevars) ]
names = [ unmangle(name) for name in co.co_names ]
varnames = [ unmangle(name) for name in co.co_varnames ]
else:
free = co.co_cellvars + co.co_freevars
names = co.co_names
varnames = co.co_varnames
pass
# Scan for assertions. Later we will
# turn 'LOAD_GLOBAL' to 'LOAD_ASSERT' for those
# assertions
self.load_asserts = set()
for i in self.op_range(0, codelen):
if self.code[i] == POP_JUMP_IF_TRUE and self.code[i+3] == LOAD_GLOBAL:
if names[self.get_argument(i+3)] == 'AssertionError':
self.load_asserts.add(i+3)
# Get jump targets
# Format: {target offset: [jump offsets]}
jump_targets = self.find_jump_targets()
# contains (code, [addrRefToCode])
last_stmt = self.next_stmt[0]
i = self.next_stmt[last_stmt]
replace = {}
imports = self.all_instr(0, codelen, (IMPORT_NAME, IMPORT_FROM, IMPORT_STAR))
if len(imports) > 1:
last_import = imports[0]
for i in imports[1:]:
if self.lines[last_import].next > i:
if self.code[last_import] == IMPORT_NAME == self.code[i]:
replace[i] = 'IMPORT_NAME_CONT'
last_import = i
# Initialize extended arg at 0. When extended arg op is encountered,
# variable preserved for next cycle and added as arg for next op
extended_arg = 0
for offset in self.op_range(0, codelen):
# Add jump target tokens
if offset in jump_targets:
jump_idx = 0
for jump_offset in jump_targets[offset]:
tokens.append(Token('COME_FROM', None, repr(jump_offset),
offset='{}_{}'.format(offset, jump_idx)))
jump_idx += 1
pass
pass
op = code[offset]
op_name = opname[op]
oparg = None; pattr = None
if op >= HAVE_ARGUMENT:
oparg = self.get_argument(offset) + extended_arg
extended_arg = 0
if op == EXTENDED_ARG:
extended_arg = oparg * scan.L65536
continue
if op in hasconst:
const = co.co_consts[oparg]
if not PYTHON3 and isinstance(const, str):
m = re.search('^<code object (.*) '
'at 0x(.*), file "(.*)", line (.*)>', const)
if m:
const = scan.GenericPythonCode()
const.co_name = m.group(1)
const.co_filename = m.group(3)
const.co_firstlineno = m.group(4)
pass
# We can't use inspect.iscode() because we may be
# using a different version of Python than the
# one that this was byte-compiled on. So the code
# types may mismatch.
if hasattr(const, 'co_name'):
oparg = const
if const.co_name == '<lambda>':
assert op_name == 'LOAD_CONST'
op_name = 'LOAD_LAMBDA'
elif const.co_name == '<genexpr>':
op_name = 'LOAD_GENEXPR'
elif const.co_name == '<dictcomp>':
op_name = 'LOAD_DICTCOMP'
elif const.co_name == '<setcomp>':
op_name = 'LOAD_SETCOMP'
elif const.co_name == '<listcomp>':
op_name = 'LOAD_LISTCOMP'
# verify() uses 'pattr' for comparison, since 'attr'
# now holds Code(const) and thus can not be used
# for comparison (todo: think about changing this)
# pattr = 'code_object @ 0x%x %s->%s' %\
# (id(const), const.co_filename, const.co_name)
pattr = '<code_object ' + const.co_name + '>'
else:
pattr = const
elif op in hasname:
pattr = names[oparg]
elif op in hasjrel:
pattr = repr(offset + 3 + oparg)
elif op in hasjabs:
pattr = repr(oparg)
elif op in haslocal:
pattr = varnames[oparg]
elif op in hascompare:
pattr = cmp_op[oparg]
elif op in hasfree:
pattr = free[oparg]
if op in (BUILD_LIST, BUILD_TUPLE, BUILD_SET, BUILD_SLICE,
UNPACK_SEQUENCE,
MAKE_FUNCTION, CALL_FUNCTION, MAKE_CLOSURE,
CALL_FUNCTION_VAR, CALL_FUNCTION_KW,
CALL_FUNCTION_VAR_KW, RAISE_VARARGS
):
# As of Python 2.5, values loaded via LOAD_CLOSURE are packed into
# a tuple before calling MAKE_CLOSURE.
if (op == BUILD_TUPLE and
self.code[self.prev_op[offset]] == LOAD_CLOSURE):
continue
else:
# CALL_FUNCTION OP renaming is done as a custom rule in parse3
if op_name not in ('CALL_FUNCTION', 'CALL_FUNCTION_VAR',
'CALL_FUNCTION_VAR_KW', 'CALL_FUNCTION_KW'):
op_name = '%s_%d' % (op_name, oparg)
if op != BUILD_SLICE:
customize[op_name] = oparg
elif op == JUMP_ABSOLUTE:
target = self.get_target(offset)
if target < offset:
if (offset in self.stmts
and self.code[offset+3] not in (END_FINALLY, POP_BLOCK)
and offset not in self.not_continue):
op_name = 'CONTINUE'
else:
op_name = 'JUMP_BACK'
elif op == LOAD_GLOBAL:
if offset in self.load_asserts:
op_name = 'LOAD_ASSERT'
elif op == RETURN_VALUE:
if offset in self.return_end_ifs:
op_name = 'RETURN_END_IF'
if offset in self.linestarts:
linestart = self.linestarts[offset]
else:
linestart = None
if offset not in replace:
tokens.append(Token(op_name, oparg, pattr, offset, linestart))
else:
tokens.append(Token(replace[offset], oparg, pattr, offset, linestart))
pass
return tokens, customize
def build_lines_data(self, code_obj):
"""
Generate various line-related helper data.
"""
# Offset: lineno pairs, only for offsets which start line.
# Locally we use list for more convenient iteration using indices
linestarts = list(dis.findlinestarts(code_obj))
self.linestarts = dict(linestarts)
# Plain set with offsets of first ops on line
self.linestart_offsets = {a for (a, _) in linestarts}
# 'List-map' which shows line number of current op and offset of
# first op on following line, given offset of op as index
self.lines = lines = []
LineTuple = namedtuple('LineTuple', ['l_no', 'next'])
# Iterate through available linestarts, and fill
# the data for all code offsets encountered until
# last linestart offset
_, prev_line_no = linestarts[0]
offset = 0
for start_offset, line_no in linestarts[1:]:
while offset < start_offset:
lines.append(LineTuple(prev_line_no, start_offset))
offset += 1
prev_line_no = line_no
# Fill remaining offsets with reference to last line number
# and code length as start offset of following non-existing line
codelen = len(self.code)
while offset < codelen:
lines.append(LineTuple(prev_line_no, codelen))
offset += 1
def build_prev_op(self):
"""
Compose 'list-map' which allows to jump to previous
op, given offset of current op as index.
"""
code = self.code
codelen = len(code)
self.prev_op = [0]
for offset in self.op_range(0, codelen):
op = code[offset]
for _ in range(self.op_size(op)):
self.prev_op.append(offset)
def op_size(self, op):
"""
Return size of operator with its arguments
for given opcode <op>.
"""
if op < dis.HAVE_ARGUMENT:
return 1
else:
return 3
def find_jump_targets(self):
"""
Detect all offsets in a byte code which are jump targets.
Return the list of offsets.
This procedure is modelled after dis.findlabels(), but here
for each target the number of jumps is counted.
"""
code = self.code
codelen = len(code)
self.structs = [{'type': 'root',
'start': 0,
'end': codelen-1}]
# All loop entry points
# self.loops = []
# Map fixed jumps to their real destination
self.fixed_jumps = {}
self.ignore_if = set()
self.build_statement_indices()
# Containers filled by detect_structure()
self.not_continue = set()
self.return_end_ifs = set()
targets = {}
for offset in self.op_range(0, codelen):
op = code[offset]
# Determine structures and fix jumps for 2.3+
self.detect_structure(offset)
if op >= dis.HAVE_ARGUMENT:
label = self.fixed_jumps.get(offset)
oparg = code[offset+1] + code[offset+2] * 256
if label is None:
if op in dis.hasjrel and op != FOR_ITER:
label = offset + 3 + oparg
elif op in dis.hasjabs:
if op in (JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP):
if oparg > offset:
label = oparg
if label is not None and label != -1:
targets[label] = targets.get(label, []) + [offset]
elif op == END_FINALLY and offset in self.fixed_jumps:
label = self.fixed_jumps[offset]
targets[label] = targets.get(label, []) + [offset]
return targets
def build_statement_indices(self):
code = self.code
start = 0
end = codelen = len(code)
statement_opcodes = {
SETUP_LOOP, BREAK_LOOP, CONTINUE_LOOP,
SETUP_FINALLY, END_FINALLY, SETUP_EXCEPT, SETUP_WITH,
POP_BLOCK, STORE_FAST, DELETE_FAST, STORE_DEREF,
STORE_GLOBAL, DELETE_GLOBAL, STORE_NAME, DELETE_NAME,
STORE_ATTR, DELETE_ATTR, STORE_SUBSCR, DELETE_SUBSCR,
RETURN_VALUE, RAISE_VARARGS, POP_TOP, PRINT_EXPR,
JUMP_ABSOLUTE
}
statement_opcode_sequences = [(POP_JUMP_IF_FALSE, JUMP_FORWARD), (POP_JUMP_IF_FALSE, JUMP_ABSOLUTE),
(POP_JUMP_IF_TRUE, JUMP_FORWARD), (POP_JUMP_IF_TRUE, JUMP_ABSOLUTE)]
designator_ops = {
STORE_FAST, STORE_NAME, STORE_GLOBAL, STORE_DEREF, STORE_ATTR,
STORE_SUBSCR, UNPACK_SEQUENCE, JUMP_ABSOLUTE
}
# Compose preliminary list of indices with statements,
# using plain statement opcodes
prelim = self.all_instr(start, end, statement_opcodes)
# Initialize final container with statements with
# preliminary data
stmts = self.stmts = set(prelim)
# Same for opcode sequences
pass_stmts = set()
for sequence in statement_opcode_sequences:
for i in self.op_range(start, end-(len(sequence)+1)):
match = True
for elem in sequence:
if elem != code[i]:
match = False
break
i += self.op_size(code[i])
if match is True:
i = self.prev_op[i]
stmts.add(i)
pass_stmts.add(i)
# Initialize statement list with the full data we've gathered so far
if pass_stmts:
stmt_offset_list = list(stmts)
stmt_offset_list.sort()
else:
stmt_offset_list = prelim
# 'List-map' which contains offset of start of
# next statement, when op offset is passed as index
self.next_stmt = slist = []
last_stmt_offset = -1
i = 0
# Go through all statement offsets
for stmt_offset in stmt_offset_list:
# Process absolute jumps, but do not remove 'pass' statements
# from the set
if code[stmt_offset] == JUMP_ABSOLUTE and stmt_offset not in pass_stmts:
# If absolute jump occurs in forward direction or it takes off from the
# same line as previous statement, this is not a statement
target = self.get_target(stmt_offset)
if target > stmt_offset or self.lines[last_stmt_offset].l_no == self.lines[stmt_offset].l_no:
stmts.remove(stmt_offset)
continue
# Rewind ops until we encounter a non-JUMP_ABSOLUTE one
j = self.prev_op[stmt_offset]
while code[j] == JUMP_ABSOLUTE:
j = self.prev_op[j]
# If we got here, then it's a list comprehension, which
# is not a statement either
if code[j] == LIST_APPEND:
stmts.remove(stmt_offset)
continue
# Exclude ROT_TWO + POP_TOP
elif code[stmt_offset] == POP_TOP and code[self.prev_op[stmt_offset]] == ROT_TWO:
stmts.remove(stmt_offset)
continue
# Exclude FOR_ITER + designators
elif code[stmt_offset] in designator_ops:
j = self.prev_op[stmt_offset]
while code[j] in designator_ops:
j = self.prev_op[j]
if code[j] == FOR_ITER:
stmts.remove(stmt_offset)
continue
# Add to list another list with offset of current statement,
# equal to length of previous statement
slist += [stmt_offset] * (stmt_offset-i)
last_stmt_offset = stmt_offset
i = stmt_offset
# Finish filling the list for last statement
slist += [codelen] * (codelen-len(slist))
def get_target(self, offset):
"""
Get target offset for op located at given <offset>.
"""
op = self.code[offset]
target = self.code[offset+1] + self.code[offset+2] * 256
if op in dis.hasjrel:
target += offset + 3
return target
def detect_structure(self, offset):
"""
Detect structures and their boundaries to fix optimized jumps
in Python 2.3+
"""
code = self.code
op = code[offset]
# Detect parent structure
parent = self.structs[0]
start = parent['start']
end = parent['end']
# Pick inner-most parent for our offset
for struct in self.structs:
current_start = struct['start']
current_end = struct['end']
if (current_start <= offset < current_end) and (current_start >= start and current_end <= end):
start = current_start
end = current_end
parent = struct
if op in (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE):
start = offset + self.op_size(op)
target = self.get_target(offset)
rtarget = self.restrict_to_parent(target, parent)
prev_op = self.prev_op
# Do not let jump to go out of parent struct bounds
if target != rtarget and parent['type'] == 'and/or':
self.fixed_jumps[offset] = rtarget
return
# Does this jump to right after another cond jump?
# If so, it's part of a larger conditional
if (code[prev_op[target]] in (JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP,
POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE)) and (target > offset):
self.fixed_jumps[offset] = prev_op[target]
self.structs.append({'type': 'and/or',
'start': start,
'end': prev_op[target]})
return
# Is it an and inside if block
if op == POP_JUMP_IF_FALSE:
# Search for other POP_JUMP_IF_FALSE targeting the same op,
# in current statement, starting from current offset, and filter
# everything inside inner 'or' jumps and midline ifs
match = self.rem_or(start, self.next_stmt[offset], POP_JUMP_IF_FALSE, target)
match = self.remove_mid_line_ifs(match)
# If we still have any offsets in set, start working on it
if match:
if (code[prev_op[rtarget]] in (JUMP_FORWARD, JUMP_ABSOLUTE) and prev_op[rtarget] not in self.stmts and
self.restrict_to_parent(self.get_target(prev_op[rtarget]), parent) == rtarget):
if (code[prev_op[prev_op[rtarget]]] == JUMP_ABSOLUTE and self.remove_mid_line_ifs([offset]) and
target == self.get_target(prev_op[prev_op[rtarget]]) and
(prev_op[prev_op[rtarget]] not in self.stmts or self.get_target(prev_op[prev_op[rtarget]]) > prev_op[prev_op[rtarget]]) and
1 == len(self.remove_mid_line_ifs(self.rem_or(start, prev_op[prev_op[rtarget]], (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE), target)))):
pass
elif (code[prev_op[prev_op[rtarget]]] == RETURN_VALUE and self.remove_mid_line_ifs([offset]) and
1 == (len(set(self.remove_mid_line_ifs(self.rem_or(start, prev_op[prev_op[rtarget]],
(POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE), target))) |
set(self.remove_mid_line_ifs(self.rem_or(start, prev_op[prev_op[rtarget]],
(POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE, JUMP_ABSOLUTE),
prev_op[rtarget], True)))))):
pass
else:
fix = None
jump_ifs = self.all_instr(start, self.next_stmt[offset], POP_JUMP_IF_FALSE)
last_jump_good = True
for j in jump_ifs:
if target == self.get_target(j):
if self.lines[j].next == j + 3 and last_jump_good:
fix = j
break
else:
last_jump_good = False
self.fixed_jumps[offset] = fix or match[-1]
return
else:
self.fixed_jumps[offset] = match[-1]
return
# op == POP_JUMP_IF_TRUE
else:
next = self.next_stmt[offset]
if prev_op[next] == offset:
pass
elif code[next] in (JUMP_FORWARD, JUMP_ABSOLUTE) and target == self.get_target(next):
if code[prev_op[next]] == POP_JUMP_IF_FALSE:
if code[next] == JUMP_FORWARD or target != rtarget or code[prev_op[prev_op[rtarget]]] not in (JUMP_ABSOLUTE, RETURN_VALUE):
self.fixed_jumps[offset] = prev_op[next]
return
elif (code[next] == JUMP_ABSOLUTE and code[target] in (JUMP_ABSOLUTE, JUMP_FORWARD) and
self.get_target(target) == self.get_target(next)):
self.fixed_jumps[offset] = prev_op[next]
return
# Don't add a struct for a 'while' test; it's already taken care of
if offset in self.ignore_if:
return
if (code[prev_op[rtarget]] == JUMP_ABSOLUTE and prev_op[rtarget] in self.stmts and
prev_op[rtarget] != offset and prev_op[prev_op[rtarget]] != offset and
not (code[rtarget] == JUMP_ABSOLUTE and code[rtarget+3] == POP_BLOCK and code[prev_op[prev_op[rtarget]]] != JUMP_ABSOLUTE)):
rtarget = prev_op[rtarget]
# If the conditional jump lands just beyond a jump op, this is probably an if statement
if code[prev_op[rtarget]] in (JUMP_ABSOLUTE, JUMP_FORWARD):
if_end = self.get_target(prev_op[rtarget])
# Is this a loop rather than an if?
if (if_end < prev_op[rtarget]) and (code[prev_op[if_end]] == SETUP_LOOP):
if if_end > start:
return
end = self.restrict_to_parent(if_end, parent)
self.structs.append({'type': 'if-then',
'start': start,
'end': prev_op[rtarget]})
self.not_continue.add(prev_op[rtarget])
if rtarget < end:
self.structs.append({'type': 'if-else',
'start': rtarget,
'end': end})
elif code[prev_op[rtarget]] == RETURN_VALUE:
self.structs.append({'type': 'if-then',
'start': start,
'end': rtarget})
self.return_end_ifs.add(prev_op[rtarget])
elif op in (JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP):
target = self.get_target(offset)
if target > offset:
unop_target = self.last_instr(offset, target, JUMP_FORWARD, target)
if unop_target and code[unop_target+3] != ROT_TWO:
self.fixed_jumps[offset] = unop_target
else:
self.fixed_jumps[offset] = self.restrict_to_parent(target, parent)
def rem_or(self, start, end, instr, target=None, include_beyond_target=False):
"""
Find offsets of all requested <instr> between <start> and <end>,
optionally <target>ing a specified offset, and return the list of found
<instr> offsets that are not within any POP_JUMP_IF_TRUE jumps.
"""
# Find all offsets of requested instructions
instr_offsets = self.all_instr(start, end, instr, target, include_beyond_target)
# Get all POP_JUMP_IF_TRUE (or) offsets
pjit_offsets = self.all_instr(start, end, POP_JUMP_IF_TRUE)
filtered = []
for pjit_offset in pjit_offsets:
pjit_tgt = self.get_target(pjit_offset) - 3
for instr_offset in instr_offsets:
if instr_offset <= pjit_offset or instr_offset >= pjit_tgt:
filtered.append(instr_offset)
instr_offsets = filtered
filtered = []
return instr_offsets
def remove_mid_line_ifs(self, ifs):
"""
Go through the passed offsets, filtering out ifs
located somewhere mid-line.
"""
filtered = []
for if_ in ifs:
# For each offset, check whether the current op and the next op
# are on the same line
if self.lines[if_].l_no == self.lines[if_+3].l_no:
# If the last op on the line is PJIT or PJIF, skip this offset
if self.code[self.prev_op[self.lines[if_].next]] in (POP_JUMP_IF_TRUE, POP_JUMP_IF_FALSE):
continue
filtered.append(if_)
return filtered
if __name__ == "__main__":
import inspect
co = inspect.currentframe().f_code
tokens, customize = Scanner3().disassemble_generic(co)
for t in tokens:
print(t)
pass
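
As an aside, the offset arithmetic in get_target() and op_size() above assumes the fixed-width instruction layout used before Python 3.6: an opcode byte, optionally followed by a two-byte little-endian argument. A minimal standalone sketch of the same calculation (the function names here are illustrative, not part of the scanner API):

import dis

def op_size(op):
    # Ops below HAVE_ARGUMENT occupy 1 byte; the rest carry a 2-byte argument.
    return 1 if op < dis.HAVE_ARGUMENT else 3

def jump_target(code, offset):
    # The argument is little-endian: low byte first, then the high byte * 256.
    op = code[offset]
    target = code[offset + 1] + code[offset + 2] * 256
    if op in dis.hasjrel:
        # Relative jumps are measured from the end of this 3-byte instruction.
        target += offset + 3
    return target

For example, a JUMP_FORWARD at offset 10 with argument bytes (4, 0) would land at 10 + 3 + 4 = 17, while an absolute jump with the same argument bytes would land at offset 4.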

View File

@@ -1,6 +1,3 @@
# Copyright (c) 1999 John Aycock
# Copyright (c) 2000-2002 by hartmut Goebel <h.goebel@crazy-compilers.com>
# Copyright (c) 2005 by Dan Pascu <dan@windowmaker.org>
# Copyright (c) 2015 by Rocky Bernstein
"""
Python 3.2 bytecode scanner/deparser
@@ -12,484 +9,21 @@ for later use in deparsing.
from __future__ import print_function
import dis, marshal
from collections import namedtuple
from array import array
import uncompyle6.scanners.scanner3 as scan3
from uncompyle6.scanner import Token, L65536
import uncompyle6.opcodes.opcode_34
# verify uses JUMP_OPs from here
JUMP_OPs = uncompyle6.opcodes.opcode_34.JUMP_OPs
class Scanner32(scan3.Scanner3):
# Get all the opcodes into globals
globals().update(dis.opmap)
from uncompyle6.opcodes.opcode_27 import *
import uncompyle6.scanner as scan
def disassemble(self, co, classname=None):
return self.disassemble_generic(co, classname)
class Scanner32(scan.Scanner):
def __init__(self):
scan.Scanner.__init__(self, 3.2) # check
def run(self, bytecode):
code_object = marshal.loads(bytecode)
tokens = self.tokenize(code_object)
return tokens
def disassemble(self, co):
"""
Convert code object <co> into a sequence of tokens.
Based on the dis.disassemble() function.
"""
# Container for tokens
tokens = []
customize = {}
self.code = code = array('B', co.co_code)
codelen = len(code)
self.build_lines_data(co)
self.build_prev_op()
# Get jump targets
# Format: {target offset: [jump offsets]}
jump_targets = self.find_jump_targets()
# Initialize extended arg at 0. When an EXTENDED_ARG op is encountered,
# its value is preserved for the next cycle and added to the next op's argument
extended_arg = 0
free = None
for offset in self.op_range(0, codelen):
# Add jump target tokens
if offset in jump_targets:
jump_idx = 0
for jump_offset in jump_targets[offset]:
tokens.append(Token('COME_FROM', None, repr(jump_offset),
offset='{}_{}'.format(offset, jump_idx)))
jump_idx += 1
op = code[offset]
# Create a token and fill in all the fields we can
# without touching the arguments
current_token = Token(dis.opname[op])
current_token.offset = offset
if offset in self.linestarts:
current_token.linestart = self.linestarts[offset]
else:
current_token.linestart = None
if op >= dis.HAVE_ARGUMENT:
# Calculate the op's argument value from its encoded argument and
# any preceding extended argument
oparg = code[offset+1] + code[offset+2]*256 + extended_arg
extended_arg = 0
if op == dis.EXTENDED_ARG:
extended_arg = oparg * L65536
# Fill token's attr/pattr fields
current_token.attr = oparg
if op in dis.hasconst:
current_token.pattr = repr(co.co_consts[oparg])
elif op in dis.hasname:
current_token.pattr = co.co_names[oparg]
elif op in dis.hasjrel:
current_token.pattr = repr(offset + 3 + oparg)
elif op in dis.haslocal:
current_token.pattr = co.co_varnames[oparg]
elif op in dis.hascompare:
current_token.pattr = dis.cmp_op[oparg]
elif op in dis.hasfree:
if free is None:
free = co.co_cellvars + co.co_freevars
current_token.pattr = free[oparg]
tokens.append(current_token)
return tokens, customize
def build_lines_data(self, code_obj):
"""
Generate various line-related helper data.
"""
# (offset, lineno) pairs, only for offsets which start a line.
# Locally we use a list for more convenient index-based iteration
linestarts = list(dis.findlinestarts(code_obj))
self.linestarts = dict(linestarts)
# Plain set with offsets of first ops on line
self.linestart_offsets = {a for (a, _) in linestarts}
# 'List-map' which gives the line number of the current op and the offset of
# the first op on the following line, given the offset of an op as index
self.lines = lines = []
LineTuple = namedtuple('LineTuple', ['l_no', 'next'])
# Iterate through available linestarts, and fill
# the data for all code offsets encountered until
# last linestart offset
_, prev_line_no = linestarts[0]
offset = 0
for start_offset, line_no in linestarts[1:]:
while offset < start_offset:
lines.append(LineTuple(prev_line_no, start_offset))
offset += 1
prev_line_no = line_no
# Fill the remaining offsets with the last line number, using the code
# length as the start offset of the (non-existent) following line
codelen = len(self.code)
while offset < codelen:
lines.append(LineTuple(prev_line_no, codelen))
offset += 1
def build_prev_op(self):
"""
Compose a 'list-map' which allows jumping to the previous
op, given the offset of the current op as index.
"""
code = self.code
codelen = len(code)
self.prev_op = [0]
for offset in self.op_range(0, codelen):
op = code[offset]
for _ in range(self.op_size(op)):
self.prev_op.append(offset)
def op_size(self, op):
"""
Return size of operator with its arguments
for given opcode <op>.
"""
if op < dis.HAVE_ARGUMENT:
return 1
else:
return 3
def find_jump_targets(self):
"""
Detect all offsets in the bytecode which are jump targets.
Return the list of offsets.
This procedure is modelled after dis.findlabels(), but here
the number of jumps to each target is counted.
"""
code = self.code
codelen = len(code)
self.structs = [{'type': 'root',
'start': 0,
'end': codelen-1}]
# All loop entry points
# self.loops = []
# Map fixed jumps to their real destination
self.fixed_jumps = {}
self.ignore_if = set()
self.build_statement_indices()
# Containers filled by detect_structure()
self.not_continue = set()
self.return_end_ifs = set()
targets = {}
for offset in self.op_range(0, codelen):
op = code[offset]
# Determine structures and fix jumps for 2.3+
self.detect_structure(offset)
if op >= dis.HAVE_ARGUMENT:
label = self.fixed_jumps.get(offset)
oparg = code[offset+1] + code[offset+2] * 256
if label is None:
if op in dis.hasjrel and op != FOR_ITER:
label = offset + 3 + oparg
elif op in dis.hasjabs:
if op in (JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP):
if oparg > offset:
label = oparg
if label is not None and label != -1:
targets[label] = targets.get(label, []) + [offset]
elif op == END_FINALLY and offset in self.fixed_jumps:
label = self.fixed_jumps[offset]
targets[label] = targets.get(label, []) + [offset]
return targets
def build_statement_indices(self):
code = self.code
start = 0
end = codelen = len(code)
statement_opcodes = {
SETUP_LOOP, BREAK_LOOP, CONTINUE_LOOP,
SETUP_FINALLY, END_FINALLY, SETUP_EXCEPT, SETUP_WITH,
POP_BLOCK, STORE_FAST, DELETE_FAST, STORE_DEREF,
STORE_GLOBAL, DELETE_GLOBAL, STORE_NAME, DELETE_NAME,
STORE_ATTR, DELETE_ATTR, STORE_SUBSCR, DELETE_SUBSCR,
RETURN_VALUE, RAISE_VARARGS, POP_TOP, PRINT_EXPR,
JUMP_ABSOLUTE
}
statement_opcode_sequences = [(POP_JUMP_IF_FALSE, JUMP_FORWARD), (POP_JUMP_IF_FALSE, JUMP_ABSOLUTE),
(POP_JUMP_IF_TRUE, JUMP_FORWARD), (POP_JUMP_IF_TRUE, JUMP_ABSOLUTE)]
designator_ops = {
STORE_FAST, STORE_NAME, STORE_GLOBAL, STORE_DEREF, STORE_ATTR,
STORE_SUBSCR, UNPACK_SEQUENCE, JUMP_ABSOLUTE
}
# Compose a preliminary list of statement offsets,
# using plain statement opcodes
prelim = self.all_instr(start, end, statement_opcodes)
# Initialize the final container of statements with
# the preliminary data
stmts = self.stmts = set(prelim)
# Same for opcode sequences
pass_stmts = set()
for sequence in statement_opcode_sequences:
for i in self.op_range(start, end-(len(sequence)+1)):
match = True
for elem in sequence:
if elem != code[i]:
match = False
break
i += self.op_size(code[i])
if match is True:
i = self.prev_op[i]
stmts.add(i)
pass_stmts.add(i)
# Initialize statement list with the full data we've gathered so far
if pass_stmts:
stmt_offset_list = list(stmts)
stmt_offset_list.sort()
else:
stmt_offset_list = prelim
# 'List-map' which contains the offset of the start of the
# next statement, when an op offset is used as index
self.next_stmt = slist = []
last_stmt_offset = -1
i = 0
# Go through all statement offsets
for stmt_offset in stmt_offset_list:
# Process absolute jumps, but do not remove 'pass' statements
# from the set
if code[stmt_offset] == JUMP_ABSOLUTE and stmt_offset not in pass_stmts:
# If the absolute jump goes forward, or it takes off from the
# same line as the previous statement, this is not a statement
target = self.get_target(stmt_offset)
if target > stmt_offset or self.lines[last_stmt_offset].l_no == self.lines[stmt_offset].l_no:
stmts.remove(stmt_offset)
continue
# Rewind over ops until we encounter a non-JUMP_ABSOLUTE one
j = self.prev_op[stmt_offset]
while code[j] == JUMP_ABSOLUTE:
j = self.prev_op[j]
# If we got here, then it's a list comprehension, which
# is not a statement either
if code[j] == LIST_APPEND:
stmts.remove(stmt_offset)
continue
# Exclude ROT_TWO + POP_TOP
elif code[stmt_offset] == POP_TOP and code[self.prev_op[stmt_offset]] == ROT_TWO:
stmts.remove(stmt_offset)
continue
# Exclude FOR_ITER + designators
elif code[stmt_offset] in designator_ops:
j = self.prev_op[stmt_offset]
while code[j] in designator_ops:
j = self.prev_op[j]
if code[j] == FOR_ITER:
stmts.remove(stmt_offset)
continue
# Extend the list with the current statement's offset, repeated
# once for each byte since the previous statement
slist += [stmt_offset] * (stmt_offset-i)
last_stmt_offset = stmt_offset
i = stmt_offset
# Finish filling the list for last statement
slist += [codelen] * (codelen-len(slist))
def get_target(self, offset):
"""
Get target offset for op located at given <offset>.
"""
op = self.code[offset]
target = self.code[offset+1] + self.code[offset+2] * 256
if op in dis.hasjrel:
target += offset + 3
return target
def detect_structure(self, offset):
"""
Detect structures and their boundaries to fix optimized jumps
in Python 2.3+
"""
code = self.code
op = code[offset]
# Detect parent structure
parent = self.structs[0]
start = parent['start']
end = parent['end']
# Pick inner-most parent for our offset
for struct in self.structs:
curent_start = struct['start']
curent_end = struct['end']
if (curent_start <= offset < curent_end) and (curent_start >= start and curent_end <= end):
start = curent_start
end = curent_end
parent = struct
if op in (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE):
start = offset + self.op_size(op)
target = self.get_target(offset)
rtarget = self.restrict_to_parent(target, parent)
prev_op = self.prev_op
# Do not let the jump go outside of the parent struct's bounds
if target != rtarget and parent['type'] == 'and/or':
self.fixed_jumps[offset] = rtarget
return
# Does this jump to a point right after another conditional jump?
# If so, it's part of a larger conditional
if (code[prev_op[target]] in (JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP,
POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE)) and (target > offset):
self.fixed_jumps[offset] = prev_op[target]
self.structs.append({'type': 'and/or',
'start': start,
'end': prev_op[target]})
return
# Is it an 'and' inside an 'if' block?
if op == POP_JUMP_IF_FALSE:
# Search for other POP_JUMP_IF_FALSE ops targeting the same op,
# in the current statement, starting from the current offset, and filter
# out everything inside inner 'or' jumps and mid-line ifs
match = self.rem_or(start, self.next_stmt[offset], POP_JUMP_IF_FALSE, target)
match = self.remove_mid_line_ifs(match)
# If we still have any offsets in the set, start working on them
if match:
if (code[prev_op[rtarget]] in (JUMP_FORWARD, JUMP_ABSOLUTE) and prev_op[rtarget] not in self.stmts and
self.restrict_to_parent(self.get_target(prev_op[rtarget]), parent) == rtarget):
if (code[prev_op[prev_op[rtarget]]] == JUMP_ABSOLUTE and self.remove_mid_line_ifs([offset]) and
target == self.get_target(prev_op[prev_op[rtarget]]) and
(prev_op[prev_op[rtarget]] not in self.stmts or self.get_target(prev_op[prev_op[rtarget]]) > prev_op[prev_op[rtarget]]) and
1 == len(self.remove_mid_line_ifs(self.rem_or(start, prev_op[prev_op[rtarget]], (POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE), target)))):
pass
elif (code[prev_op[prev_op[rtarget]]] == RETURN_VALUE and self.remove_mid_line_ifs([offset]) and
1 == (len(set(self.remove_mid_line_ifs(self.rem_or(start, prev_op[prev_op[rtarget]],
(POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE), target))) |
set(self.remove_mid_line_ifs(self.rem_or(start, prev_op[prev_op[rtarget]],
(POP_JUMP_IF_FALSE, POP_JUMP_IF_TRUE, JUMP_ABSOLUTE),
prev_op[rtarget], True)))))):
pass
else:
fix = None
jump_ifs = self.all_instr(start, self.next_stmt[offset], POP_JUMP_IF_FALSE)
last_jump_good = True
for j in jump_ifs:
if target == self.get_target(j):
if self.lines[j].next == j + 3 and last_jump_good:
fix = j
break
else:
last_jump_good = False
self.fixed_jumps[offset] = fix or match[-1]
return
else:
self.fixed_jumps[offset] = match[-1]
return
# op == POP_JUMP_IF_TRUE
else:
next = self.next_stmt[offset]
if prev_op[next] == offset:
pass
elif code[next] in (JUMP_FORWARD, JUMP_ABSOLUTE) and target == self.get_target(next):
if code[prev_op[next]] == POP_JUMP_IF_FALSE:
if code[next] == JUMP_FORWARD or target != rtarget or code[prev_op[prev_op[rtarget]]] not in (JUMP_ABSOLUTE, RETURN_VALUE):
self.fixed_jumps[offset] = prev_op[next]
return
elif (code[next] == JUMP_ABSOLUTE and code[target] in (JUMP_ABSOLUTE, JUMP_FORWARD) and
self.get_target(target) == self.get_target(next)):
self.fixed_jumps[offset] = prev_op[next]
return
# Don't add a struct for a 'while' test; it's already taken care of
if offset in self.ignore_if:
return
if (code[prev_op[rtarget]] == JUMP_ABSOLUTE and prev_op[rtarget] in self.stmts and
prev_op[rtarget] != offset and prev_op[prev_op[rtarget]] != offset and
not (code[rtarget] == JUMP_ABSOLUTE and code[rtarget+3] == POP_BLOCK and code[prev_op[prev_op[rtarget]]] != JUMP_ABSOLUTE)):
rtarget = prev_op[rtarget]
# If the conditional jump lands just beyond a jump op, this is probably an if statement
if code[prev_op[rtarget]] in (JUMP_ABSOLUTE, JUMP_FORWARD):
if_end = self.get_target(prev_op[rtarget])
# Is this a loop rather than an if?
if (if_end < prev_op[rtarget]) and (code[prev_op[if_end]] == SETUP_LOOP):
if if_end > start:
return
end = self.restrict_to_parent(if_end, parent)
self.structs.append({'type': 'if-then',
'start': start,
'end': prev_op[rtarget]})
self.not_continue.add(prev_op[rtarget])
if rtarget < end:
self.structs.append({'type': 'if-else',
'start': rtarget,
'end': end})
elif code[prev_op[rtarget]] == RETURN_VALUE:
self.structs.append({'type': 'if-then',
'start': start,
'end': rtarget})
self.return_end_ifs.add(prev_op[rtarget])
elif op in (JUMP_IF_FALSE_OR_POP, JUMP_IF_TRUE_OR_POP):
target = self.get_target(offset)
if target > offset:
unop_target = self.last_instr(offset, target, JUMP_FORWARD, target)
if unop_target and code[unop_target+3] != ROT_TWO:
self.fixed_jumps[offset] = unop_target
else:
self.fixed_jumps[offset] = self.restrict_to_parent(target, parent)
def restrict_to_parent(self, target, parent):
"""Restrict target to parent structure boundaries."""
if not (parent['start'] < target < parent['end']):
target = parent['end']
return target
def rem_or(self, start, end, instr, target=None, include_beyond_target=False):
"""
Find offsets of all requested <instr> between <start> and <end>,
optionally <target>ing a specified offset, and return the list of found
<instr> offsets that are not within any POP_JUMP_IF_TRUE jumps.
"""
# Find all offsets of requested instructions
instr_offsets = self.all_instr(start, end, instr, target, include_beyond_target)
# Get all POP_JUMP_IF_TRUE (or) offsets
pjit_offsets = self.all_instr(start, end, POP_JUMP_IF_TRUE)
filtered = []
for pjit_offset in pjit_offsets:
pjit_tgt = self.get_target(pjit_offset) - 3
for instr_offset in instr_offsets:
if instr_offset <= pjit_offset or instr_offset >= pjit_tgt:
filtered.append(instr_offset)
instr_offsets = filtered
filtered = []
return instr_offsets
def remove_mid_line_ifs(self, ifs):
"""
Go through the passed offsets, filtering out ifs
located somewhere mid-line.
"""
filtered = []
for if_ in ifs:
# For each offset, check whether the current op and the next op
# are on the same line
if self.lines[if_].l_no == self.lines[if_+3].l_no:
# If the last op on the line is PJIT or PJIF, skip this offset
if self.code[self.prev_op[self.lines[if_].next]] in (POP_JUMP_IF_TRUE, POP_JUMP_IF_FALSE):
continue
filtered.append(if_)
return filtered
if __name__ == "__main__":
import inspect
co = inspect.currentframe().f_code
tokens, customize = Scanner32().disassemble(co)
for t in tokens:
print(t)
pass
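
A note on the 'list-map' helpers in the old scanner32.py code above: build_prev_op() trades memory for constant-time backward steps by recording, for every byte offset, where the previous instruction starts, so self.prev_op[offset] is a plain list lookup. A rough standalone sketch of the same construction, assuming the pre-3.6 1-or-3-byte instruction sizes (the free-standing function name is illustrative):

import dis

def build_prev_op(code):
    # For an instruction starting at offset j, prev_op[j] is the offset
    # of the instruction that precedes it.
    prev_op = [0]
    offset = 0
    while offset < len(code):
        op = code[offset]
        size = 1 if op < dis.HAVE_ARGUMENT else 3
        prev_op.extend([offset] * size)
        offset += size
    return prev_op

build_lines_data() uses the same trick for line numbers: indexing lines[] by an op's offset gives the op's line number and the offset where the next source line starts.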

View File

@@ -0,0 +1,29 @@
# Copyright (c) 2015 by Rocky Bernstein
"""
Python 3 bytecode scanner/deparser
This overlaps Python 3.3's dis module, but it can be run from
Python 2 and other versions of Python. Also, we save token information
for later use in deparsing.
"""
from __future__ import print_function
import uncompyle6.scanners.scanner3 as scan3
import uncompyle6.opcodes.opcode_33
# verify uses JUMP_OPs from here
JUMP_OPs = uncompyle6.opcodes.opcode_33.JUMP_OPs
class Scanner33(scan3.Scanner3):
def disassemble(self, co, classname=None):
return self.disassemble_generic(co, classname)
if __name__ == "__main__":
import inspect
co = inspect.currentframe().f_code
tokens, customize = Scanner33().disassemble(co)
for t in tokens:
print(t)
pass
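
The new scanner33.py above shows the shape of the refactor in this comparison: each version-specific scanner now just picks its opcode table and delegates the real work to the shared Scanner3 code, which matches the large removal shown in the scanner32.py hunk header (@@ -12,484 +9,21 @@). A schematic sketch of that delegation pattern, not the actual module (GenericScanner stands in for the shared base class):

class GenericScanner:
    # Stand-in for the shared cross-version scanner (Scanner3 in the real code).
    def disassemble_generic(self, co, classname=None):
        # ...shared tokenizing work would happen here...
        return [], {}

class Scanner33(GenericScanner):
    # The per-version class only supplies the entry point and delegates
    # to the shared implementation.
    def disassemble(self, co, classname=None):
        return self.disassemble_generic(co, classname)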

Some files were not shown because too many files have changed in this diff.