Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Returning objects to Python from Rust should use __del__ for cleanup rather than context manager #60

Open
scooter-dangle opened this issue Feb 15, 2018 · 6 comments · May be fixed by #61

Comments

@scooter-dangle
Copy link

scooter-dangle commented Feb 15, 2018

2018-05-02 UPDATE: Per discussion below, issue is that Python FFI object example only shows the deterministic, with ... as ...: approach whereas the Ruby (and others?) example only provides the non-deterministic approach.


I've seen blogs advise against using __del__ for general cleanup in Python, but for de-allocation, it more closely maps to what we want than context managers. For instance, with the current code's use of context management, the only way to make a collection of ZipCodeDatabases would be to explicitly nest with ... as statements. As well, the current example complicates persisting a Rust object that is expensive to initialize but cheap to use, whereas __del__ supports this while still freeing the resources when the reference count drops to zero.

Proposed Python code:

class ZipCodeDatabase:
    def __init__(self):
        self.obj = lib.zip_code_database_new()

    def __del__(self):
        lib.zip_code_database_free(self.obj)

    def populate(self):
        lib.zip_code_database_populate(self.obj)

    def population_of(self, zip):
        return lib.zip_code_database_population_of(self.obj, zip.encode('utf-8'))

database = ZipCodeDatabase()
database.populate()
pop1 = database.population_of("90210")
pop2 = database.population_of("20500")
print(pop1 - pop2)

With that, we can also do this (which—granted—isn't super useful in this particular case):

databases = [ZipCodeDatabase() for _ in range(3)]

I can think of two drawbacks to using __del__ over context management:

  1. Memory cleanup in the presence of reference cycles in Python is less predictable.
  • This is already an issue in Python that developers will need to be aware of. The fact that some of the objects involved will be cleaned up by calling a Rust function doesn't exacerbate the problem, in my opinion.
  1. The Rust cleanup function might not run immediately upon the variable going out of scope
  • Since the cleanup function is to avoid a memory leak, I don't see how this is any different than any other allocated object in Python. Additionally, we know in the Rust world to not rely on drop being called to guarantee properties like safety, so I think __del__ maps better to the concept of the drop function it will be calling.

Aside from __del__ reducing the restrictions on using Rust objects within Python, it's also conceptually simpler. And to an audience that's already learning Rust, it should be easier to teach in conjunction with teaching Drop.

scooter-dangle added a commit to scooter-dangle/rust-ffi-omnibus that referenced this issue Feb 15, 2018
`__del__` more closely maps to what we want than context managers. For
instance, with the current code's use of context management, the only way to
make a collection of `ZipCodeDatabase`s would be to explicitly nest
`with ... as` statements. As well, a context manager would complicate
persisting a Rust object that is expensive to initialize but cheap to use,
whereas `__del__` supports this while still freeing the resources when the
reference count drops to zero.

There are two drawbacks to using `__del__` over context management:
1. Memory cleanup in the presence of reference cycles in Python is less
   predictable.
  * This is already an issue in Python that developers will need to be aware
    of. The fact that some of the objects involved will be cleaned up by
    calling a Rust function doesn't exacerbate that problem.
2. The Rust cleanup function might not run immediately upon the variable going
   out of scope
  * Since the cleanup function is to avoid a memory leak, I don't see how this
    is any different than any other allocated object in Python. Additionally,
    we know in the Rust world to not rely on `drop` being called to guarantee
    properties like safety, so I think `__del__` maps better to the concept of
    the `drop` function it will be calling.

Aside from `__del__` reducing the restrictions on using Rust objects within
Python, it's also conceptually simpler. And to an audience that's already
learning Rust, it should be easier to teach in conjunction with teaching
`Drop`.

Fixes shepmaster#60
@scooter-dangle scooter-dangle linked a pull request Feb 15, 2018 that will close this issue
scooter-dangle added a commit to scooter-dangle/rust-ffi-omnibus that referenced this issue Feb 15, 2018
`__del__` more closely maps to what we want than context managers. For
instance, with the current code's use of context management, the only way to
make a collection of `ZipCodeDatabase`s would be to explicitly nest
`with ... as` statements. As well, a context manager would complicate
persisting a Rust object that is expensive to initialize but cheap to use,
whereas `__del__` supports this while still freeing the resources when the
reference count drops to zero.

There are two drawbacks to using `__del__` over context management:
1. Memory cleanup in the presence of reference cycles in Python is less
   predictable.
  * This is already an issue in Python that developers will need to be aware
    of. The fact that some of the objects involved will be cleaned up by
    calling a Rust function doesn't exacerbate that problem.
2. The Rust cleanup function might not run immediately upon the variable going
   out of scope
  * Since the cleanup function is to avoid a memory leak, I don't see how this
    is any different than any other allocated object in Python. Additionally,
    we know in the Rust world to not rely on `drop` being called to guarantee
    properties like safety, so I think `__del__` maps better to the concept of
    the `drop` function it will be calling.

Aside from `__del__` reducing the restrictions on using Rust objects within
Python, it's also conceptually simpler. And to an audience that's already
learning Rust, it should be easier to teach in conjunction with teaching
`Drop`.

Fixes shepmaster#60
@scooter-dangle
Copy link
Author

scooter-dangle commented Feb 19, 2018

By the way, found this Stack Overflow answer by huon that recommends __del__ over context managers for the general questioner's use case.

@shepmaster
Copy link
Owner

Thanks! I'm traveling for a week or so, but then I hope to circle back around to your PR!

@shepmaster
Copy link
Owner

One thing I vaguely remember is that __del__ has no guaranteed execution point, while a context does. Is that correct?

@scooter-dangle
Copy link
Author

scooter-dangle commented Feb 19, 2018

From what I've read, yes, that's correct.

huon mentions that in his Stack Overflow post:

[Using a context manager] does have the advantage of mak[ing] the region of code where the resource is valid/allocated clearer

It sounds like one caveat in particular that would make resource freeing less predictable is if the Python object is part of a reference cycle, which requires separate detection in the GC rather than just relying on reference counting. For any of the FFI use cases, though, that doesn't seem very relevant.

Also, per the docs exception propagation in __del__ is suppressed and instead printed to stderr, but again, that doesn't seem very relevant to the FFI example.

There's no guarantee that __del__ will be called when the interpreter exits, but that should be fine for memory cleanup.

I think another thing that initially confused me was just that the context manager isn't equivalent to the Ruby auto-pointer hook. If it's useful to use the context manager's cleanup-callback pattern in some cases, it would be instructive to show the equivalent pattern in the other language examples. The Ruby version (since it's the one I most immediately know how to do) of the Python context manager would roughly look like the following:

require 'ffi'

class ZipCodeDatabase < FFI::Pointer
  def self.run(&block)
    database = Binding.new
    block.yield(database)
  ensure
    Binding.free(database)
  end

  def populate
    Binding.populate(self)
  end

  def population_of(zip)
    Binding.population_of(self, zip)
  end

  module Binding
    extend FFI::Library
    ffi_lib 'objects'

    attach_function :new, :zip_code_database_new,
                    [], ZipCodeDatabase
    attach_function :free, :zip_code_database_free,
                    [ZipCodeDatabase], :void
    attach_function :populate, :zip_code_database_populate,
                    [ZipCodeDatabase], :void
    attach_function :population_of, :zip_code_database_population_of,
                    [ZipCodeDatabase, :string], :uint32
  end
end

ZipCodeDatabase.run do |database|
  database.populate
  pop1 = database.population_of("90210")
  pop2 = database.population_of("20500")

  puts pop1 - pop2
end

@shepmaster
Copy link
Owner

Woah, so I lied a little when I said I'd get right on this! 😰

One reason that I'm hesitant to accept this as-is is because Rust objects do rely on deterministic drop times, sometimes. For example, if the object returned were a database connection or a MutexGuard.

Is it possible/a good idea to implement both types of lifecycle management? That is, implement __enter__, __exit__, and __del__? If not, we may want to describe both methods because it ultimately depends on what kind of FFI object you are marshalling across...

isn't equivalent to the Ruby auto-pointer hook

A great point! If we decide to show both forms for Python, we should show both for Ruby.

@scooter-dangle
Copy link
Author

Sorry for the lag in response!

One reason that I'm hesitant to accept this as-is is because Rust objects do rely on deterministic drop times, sometimes. For example, if the object returned were a database connection or a MutexGuard.

Is it possible/a good idea to implement both types of lifecycle management? That is, implement __enter__, __exit__, and __del__? If not, we may want to describe both methods because it ultimately depends on what kind of FFI object you are marshalling across...

Yeah, the deterministic behavior can be desirable for some use cases. I had initially assumed you'd prefer to have one recommended starting point just to keep the material here simple, but if you think presenting both without confusing the user is do-able, I'll revise my submitted PR.

Personally, as long as I can be confident that the information is accurate and up-to-date, more examples is almost always better—especially if they contrast in ways that illustrate differences in how I can approach the problem I'm trying to solve. 👍

lovasoa added a commit to lovasoa/rust-ffi-omnibus that referenced this issue Mar 21, 2019
Implement both a context manager and a __del__ method for the python object example.

This allows the user to decide whether he wants the rust object to be de-allocated implicitly (and non-deterministically) when the python object is garbage-collected, or explicitly at the end of a python context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants