Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Py36 support #31

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

Py36 support #31

wants to merge 13 commits into from

Conversation

vrthra
Copy link

@vrthra vrthra commented Jan 30, 2018

Adding python 3.6 support.

Includes the python 3.4 patch from @darius so that the tests pass. If that gets committed, I will remove the last patch from this PR.

@llllllllll
Copy link
Collaborator

Thanks for working on this! I will try to look at this either tonight or tomorrow after work.

@vrthra vrthra force-pushed the py36 branch 10 times, most recently from 8b42c5a to ebfdc5c Compare January 31, 2018 12:02
@vrthra
Copy link
Author

vrthra commented Feb 4, 2018

@llllllllll ping -- any thing I can do?

Copy link
Collaborator

@llllllllll llllllllll left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took so long, I left some comments. This is pretty awesome, thanks!

# D'oh! http://bugs.python.org/issue19611 Py2 doesn't know how to
# inspect set comprehensions, dict comprehensions, or generator
# expressions properly. They are always functions of one argument,
# so just do the right thing.
# so just do the right thing. Py3.4 also would fail without this
# hack, for list comprehensions too. (Haven't checked for other 3.x.)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we check this?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another from #20 PR by @darius

byterun/pyvm2.py Outdated
f = self.frame
opoffset = f.f_lasti
byteCode = byteint(f.f_code.co_code[opoffset])
if f.py36_opcodes:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need special processing here? the non-3.6 branch looks like it should handle this fine.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only until 3.4. I will change the special processing to check for 3.4 onward if that is required.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, using the opcodes = dis.get_instructions API is kind of hard in versions before Python 3.6 because on jumps, the VM sets the bytecode counter to the offset to be jumped to. In versions before Python 3.6, there is no direct correspondence to between the opcodes index and the offset to be jumped to -- and has to be computed for each jump. For Python 3.6, and above, there is a direct correspondence because each bytecode is now fixed width, and one can compute the jump index directly by dividing target by 2. Given the significant simplification Python 3.6 has provided, along with the API for bytecode iteration, perhaps we should special case from Python 3.6 onward?

Copy link

@rocky rocky Jun 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vrthra reports:

opcodes = dis.get_instructions API is kind of hard in versions before Python 3.6 because o[f] jumps

While there are more serious problems mentioned below, by using xdis in my fork, the newer and better APIs in later releases can be applied to bytecode on older versions of Python.

the VM sets the bytecode counter to the offset to be jumped to.

In fact, this is wrong at least as far as emulating CPython semantics because this value is stored in the frame's f_lasti.

While this may be tolerable for reading log traces, if you were to try to hook this up with a debugger it would be intolerable: you can't report a frame's f_lasti as the instruction that would be run if the current instruction had finished and not jumped, returned or raised an exception. Instead f_lasti needs to show the current instruction as it does in CPython.

perhaps we should special case from Python 3.6 onward?

Although the specific problems you mention are addressed above, something like this in x-python goes on with opcodes. Here the advantages are

  • pedagocial
  • cleaner separatation of code, which leads to
  • scalability

byterun/pyvm2.py Outdated
arg = None
arguments = []
if f.py36_opcodes and byteCode == dis.EXTENDED_ARG:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think extended_arg is new to 3.6, can we remove the first part of this check?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I mentioned previously, this approach only works for Python 3.6 and above, and only if we are using the get_instructions api.

byterun/pyvm2.py Outdated
intArg = byteint(arg[0]) + (byteint(arg[1]) << 8)
if f.py36_opcodes:
intArg = currentOp.arg
else:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also here, I am not sure why we are handling 3.6 in such a special way, the code below looks correct in either case.

Copy link
Author

@vrthra vrthra Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I misunderstood you earlier.) From 3.6 on, there is a subtle difference. All ops have arguments. Which means that we do not need to increment the f_lasti. Given that we have to special case py36 anyway, we might as well rely on the get_instructions API for versions from now on to get the actual argument rather than doing the bit manipulation ourselves.

byterun/pyvm2.py Outdated
elts = self.popn(count)
self.push(tuple(e for l in elts for e in l))

def byte_BUILD_TUPLE_UNPACK(self, count):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in the C source, BUILD_TUPLE_UNPACK, BUILD_TUPLE_UNPACK_WITH_CALL, and BUILD_LIST_UNPACK have the same target, should we reflect that by having them share the same target here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update to reflect that that.

byterun/pyvm2.py Outdated

def byte_BUILD_MAP(self, count):
# Pushes a new dictionary on to stack.
if not(six.PY3 and sys.version_info.minor >= 5):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this might be more clear as sys.version_info[:2] < (3, 5). at first I thought this was flipped

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update that

byterun/pyvm2.py Outdated
# dictionary holds count entries: {..., TOS3: TOS2, TOS1:TOS}
# updated in version 3.5
kvs = {}
for i in range(0, count):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range(n) is shorthand for range(0, n) and is a more common form

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I will change that.

globs = self.frame.f_globals
fn = Function(name, code, globs, defaults, None, self)
if PY3 and sys.version_info.minor >= 6:
closure = self.pop() if (argc & 0x8) else None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a reason to use hex-literals for values under 10?

Copy link
Author

@vrthra vrthra Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@llllllllll I took the comparisons from the Python bytecode documentation directly, which uses these hex literals.


def byte_STORE_LOCALS(self):
self.frame.f_locals = self.pop()

if 0: # Not in py2.7
def byte_SET_LINENO(self, lineno):
self.frame.f_lineno = lineno

if PY3:
def build_class(func, name, *bases, **kwds):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did we need to rewrite this function? I think the scope of this project is confined to the interpreter loop and function calling. I may have missed some new reason why this needs to be in Python now though.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment further down in the function, and my remarks on the pull request #20 -- it might be better to discuss this over there instead. I'm not sure if any of your other comments are about my old submission instead of @vrthra's, since I've forgotten what was in it by now.

I vaguely remember trying alternatives, but couldn't find any better way to get Py3.4 supported.

byterun/pyobj.py Outdated
self.f_code = f_code
self.py36_opcodes = list(dis.get_instructions(self.f_code)) \
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to prefer this over just reading the f_code? If anything, 3.6 makes this very simple with the fixed width instructions.

Copy link
Author

@vrthra vrthra Feb 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that Python is now exposing the API to iterate, and given that the bytecode format is not considered by Python project to be part of their public API, I think we should use their public API rather than read and interpret f_code.

Unfortunately the API is only available from 3.4 so we have to special case this for any code from 3.4 onwards.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants