
Python2 to 3 Compatibility Pitfalls

feeb edited this page May 2, 2018 · 8 revisions

This is a collection of surprising behavior changes that matter when writing python3-compatible code. Because the codebase uses libfuture, these behaviors also show up when it is run on python2.

Bytestrings

  • Don't pass an iterator to bytes(). Python3's built-in bytes() handles this properly, but libfuture's bytes() doesn't.
  • bytes(n) where n is a number returns a bytestring of length n, all zeroes.
  • Indexing into a bytestring returns an integer on python3 and a one-character string on python2. libfuture standardizes this to always return ints.
  • Use the hexlify/unhexlify functions from the binascii module to convert data to/from its hexadecimal representation.
  • More on this here.
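
The points above can be sketched as follows. This is a minimal illustration run under plain python3 with the built-in bytes type; the variable names are made up for the example, and the list() workaround assumes libfuture's bytes() accepts a list of ints.

```python
from binascii import hexlify, unhexlify

# bytes(n) with an integer builds n zero bytes; it does not stringify the number.
zeros = bytes(4)

# Indexing yields an int; slicing yields a bytestring of length one.
data = b"abc"
first_as_int = data[0]
first_as_bytes = data[0:1]

# Materialize an iterator into a list before handing it to bytes().
# Assumption: this sidesteps the libfuture iterator limitation noted above.
from_iterator = bytes(list(iter([104, 105])))

# Round-trip through hexadecimal with binascii.
encoded = hexlify(b"\xde\xad")
decoded = unhexlify(encoded)
```

When a one-byte bytestring is needed rather than an int, slicing (data[0:1]) gives the same result on either interpreter.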

Strings

  • Use the string methods in utils/helpers.py to determine types of strings being passed around. Prefer isbytestr/isunicode to isstring, unless it really doesn't matter.
  • Functions that read in data (e.g., readlines, popen) will return str. In python3 str is basically a unicode string, and in python2 it is basically a bytestring. Use the helper methods to make sure you're ready for either case.
  • Use the string literals u'foo' or b'foo' rather than 'foo' to avoid ambiguities, if you can.
  • io.StringIO similarly now handles unicode rather than bytestrings.
  • More on unicode, the base string type and encoding issues here.
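
As a rough sketch of being "ready for either case": the to_text function below is a hypothetical helper written for this example, not one of the functions in utils/helpers.py. It normalizes input to a unicode string, and the rest shows io.StringIO rejecting bytestrings.

```python
import io


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as a unicode string."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Explicit literals make the intended type obvious.
text = to_text(b"caf\xc3\xa9")   # bytestring in, unicode out
same = to_text(u"caf\xe9")       # unicode passes through unchanged

# io.StringIO stores unicode text.
buf = io.StringIO()
buf.write(u"line one\n")
contents = buf.getvalue()

# Writing a bytestring to io.StringIO raises TypeError.
try:
    io.StringIO().write(b"raw bytes")
    accepted_bytes = True
except TypeError:
    accepted_bytes = False
```

Decoding once at the boundary where data enters the program keeps the rest of the code working with a single string type.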

Other types

  • Use the is* methods in utils/helpers.py for checking types. Libfuture introduces python3 compatible replacement types for types like bytes, int, object etc. This is largely good but can be a weird headache if you are doing literal typechecking. E.g., a string could potentially be a newstring if it came from manticore, or str if it came from a different library. Most of the headaches have already been worked through.
  • Python3 doesn't have a concept of a long. If you need to check for ints or longs, use isint. There's no reason to use long literals.
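
For illustration only, here is one way such an integer check could be written. looks_like_int is a hypothetical stand-in invented for this example and is not the isint implementation from utils/helpers.py.

```python
import numbers


def looks_like_int(value):
    # Hypothetical stand-in for isint. numbers.Integral covers the built-in
    # int on python3 (and both int and long on python2).
    return isinstance(value, numbers.Integral)


# On python3 this is an ordinary int; no trailing L is needed.
big = 2 ** 64

checks = [looks_like_int(big), looks_like_int(7), looks_like_int(1.5)]
```

Checking against an abstract type rather than a concrete one avoids breaking when a value is a replacement type instead of the built-in.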

Python object model / metaclasses

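  • __hash__ needs to be re-implemented if your class overrides __eq__. In python3, a class that defines __eq__ without defining __hash__ has its __hash__ set to None, so its instances can't be used in sets or as dict keys. Reference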
  • Be careful with implementing custom __setattr__ and __getattr__ methods. The resolution behaviors of these vary between python2 and 3 in confusing ways.
  • Metaclass syntax and behavior is pretty different in python3. Manticore doesn't use this much, but be aware of it.
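
A minimal sketch of the __eq__/__hash__ rule and of one syntax-neutral way to apply a metaclass. The classes Point, NoHash, Registry, Base and Plugin are invented for this example and are not taken from Manticore.

```python
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    # Defined explicitly because overriding __eq__ alone would leave
    # the class unhashable on python3.
    def __hash__(self):
        return hash((self.x, self.y))


class NoHash(object):
    # Overrides __eq__ only; on python3 its __hash__ becomes None.
    def __eq__(self, other):
        return isinstance(other, NoHash)


class Registry(type):
    """Metaclass that records the name of every class created with it."""
    classes = []

    def __init__(cls, name, bases, namespace):
        super(Registry, cls).__init__(name, bases, namespace)
        Registry.classes.append(name)


# Calling the metaclass directly avoids both the python2-only
# `__metaclass__ = Registry` attribute and the python3-only
# `class Base(metaclass=Registry)` syntax.
Base = Registry("Base", (object,), {})


class Plugin(Base):
    pass


unique_points = {Point(1, 2), Point(1, 2)}
```

Subclasses of Base inherit the metaclass, so only the one base class needs the direct call.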