Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Possible naming improvements #73

Open
bluebird75 opened this issue Jun 27, 2018 · 4 comments
Open

Possible naming improvements #73

bluebird75 opened this issue Jun 27, 2018 · 4 comments

Comments

@bluebird75
Copy link
Contributor

I was a bit surprised when approaching the package by the complexity of the naming.

What surprised me :

  1. the tool is named pyannotate but you must import pyannotate_runtime to use it
  2. collect_types.init_types_collection() ? Does the name really need to be that long ? Do we really need to explicitly init by the way ? Why not initialize on first resume() ?
  3. dump_stats() is ok but dump() is good too

Suggestions :

  • allow to import directly from pyannotate
  • 3 functions only : start(), stop(), dump()
  • initialize on first start() or rename init_types_collection() to init()

By the way, I am looking into contributing to pyannotate.

@bluebird75
Copy link
Contributor Author

Even dump() could be left to an atexit handler.

@gvanrossum
Copy link
Contributor

The reason for the unusual module naming is that I want the runtime support to be separate from the command line tool that reads the JSON file and updates your source code.

The reason for the separate init_types_collection() call is that this should be called before starting any threads, at least if you're interested in getting results from other threads (at Dropbox we have a solid use case for this). I suppose we can rename this to init().

Doing the dump() in atexit() feels too late and too implicit.

Honestly I would rather (first) see a contribution that added good documentation, so we can stop referring to the blog post, instead of further changes to the API naming (which feels pretty shallow).

In terms of features I'd like to see a way of merging multiple type_info.json files, e.g. to combine results gathered by multiple tests run as separate processes (possibly on separate hosts), perhaps similar to the functionality in coverage.py for merging coverage results.

@bluebird75
Copy link
Contributor Author

Thanks for the clear feedback.

I don't get the point about the threads though. If you want to collect type information from other threads, you should call start() before any of these threads gets started, and start will implicitly call init() so all is fine.

And if your threads are active, even if you call start() (and so init()) late in their lifetime, you should still capture their live type information. Am I missing something ?

If you look at monkey type, their code instrumentation is much much simpler to use. Just do monkeytype run or just use one context manager.

I'll try to work on the missing aspects. Python 3 type annotations is top priority to me, beause after 10 years of lobbying, it looks like the world mostly has switched to Python 3 (eventhough Dropbox and a few other codebase are resisisting).

Improved documentation is also indeed important and should not be that difficult to get.

Then I agree that merging multiple type_info is a critical feature. This allows to collect type information in different environements and benefit from it. In my opinion, pyannotate should even propose some modes where existing function type information is enhanced with external type information, to get a more granularity for the incremental typing.

@gvanrossum
Copy link
Contributor

The use case for threads is the Dropbox Client. We have a debug version that lets a developer start and stop type collection using a menu entry. This is handy so you can collect types for a specific sequence of operations in the app (but not for the rather long app startup sequence). The init() method needs to be called when the app initializes itself, because the app creates long-running threads. If init() was implicitly called by start(), we would collect no types from threads that have already started.

As a compromise, I think it's fine to have start() call init() if it hasn't been called yet -- we must make sure that calling init() multiple times is okay.

Note that init() also has a optional argument used to filter filenames. This makes more sense here than for start(). Similarly, dump_stats() is separate because it takes an output filename. (You can use dumps_stats() if you want to do something else with the data -- though I think it's a shame there's no API to get the JSON as a dict instead of as a string.)

What we could also use is an API that resets all the global state -- similar to what collecting_types() in pyannotate_runtime/tests/test_collect_types.py does.

So, all in all I am open to API changes but I want certain functionality to be available.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants