Consecutive headless pyglet application runs as part of a pytest testing flow #953
Comments
This is my understanding as well. The arcade library goes further by re-using the same window throughout unit tests in addition to just the same pyglet app context. I haven't looked at many other projects for comparison, but they seem to either assume there will be one app run, or provide some capacity for nesting with the assumption there will be one root parent. I think FastAPI may fall in the latter category, but I may be wrong on the details. Regardless of what other applications do, the idea of having separable app contexts might imply each having their own clock.
I don't think that Python itself (or at least CPython) provides any stable mechanism for re-importing a library to get past this.
Oh, thanks for the quick reply, I'll try looking at how arcade makes that happen.
I forgot to mention this, but it's important: arcade also assumes there will only be one application window. |
I wonder whether enabling this would imply great changes to pyglet; it might make more sense for pytest (or an alternative testing framework) to be smart enough to allow running marked tests on separate Python interpreters, within the same test suite it manages.
That thread mentions
Can you be more specific about the use cases for this with pyglet? Is your goal testing the application abstraction itself, or something else? If it's the latter, then it might be fine as long as you don't try to share a single GL context between the application contexts. Someone with more GL experience will have to weigh in once we have specifics.
Thanks so much @pushfoo. I just have multiple headless tests for my single application, each running the application with different arguments and/or simulating user events differently while the application is headless. I wouldn't call these unit tests, as they simulate user interaction timelines against the application. After switching my application code to take the window objects as an input argument (a.k.a. dependency injection), and providing the very same list of pre-instantiated window objects to all tests via a pytest fixture at test-suite run time (still using pytest), everything seems to work as expected. Here's the pytest fixture code:

```python
# conftest.py
import pytest

print('pytest conftest making a fixture shared at the pytest session level, '
      'providing the same headless window objects to all tests during a test suite run')

@pytest.fixture(scope='session')
def windows():
    return [MyPygletWindow(resizable=True, visible=False, headless=True),
            MyPygletWindow(resizable=True, visible=False, headless=True)]
```

(I omit my class subclassing pyglet.window.Window from the code listing; basically, it introduces a config and some branching so that, when instructed to run headless, it doesn't try to use antialiasing through sample buffers.)

This is a bit like #911, though only circumstantially: there it had emerged, albeit in a platform-specific manner, that closing and then reopening windows brings in some challenges which are best avoided. This change has entailed switching the application to never close its windows, but rather exit by using

I have tested this solution diligently but will keep monitoring for a bit, as I add headless tests to my suite.
Are you testing non-drawing functionality only, or are you doing pixel comparison as well? There was a non-drawing branch I was working on to replace shaders and the GL context with mocks. If you are interested in this as well, I would appreciate your help and/or feedback.
These are probably better referred to as integration or end-to-end tests, but the wording isn't as relevant as what they tell us. Could you share more about them?
I agree for the time being, even though I'd like to better understand or solve these one day. If it's indeed related to X windowing as #911 suggests, it may be why Factorio's devs share sub-regions of a window across multiple test threads. Splitting the window may only make sense if you've already implemented, or plan to split, rendering, client input, and server logic from each other for networked gameplay. It also makes window management much easier.
Thank you for the update. I would appreciate it if you could share either code or more about what you learn. Maybe it will help with parallelizing arcade's tests one day, even if current design choices make it difficult. My understanding is that your test setup may be like one using Selenium Grid, except you can only add new runners instead of destroying existing ones.
Hi @pushfoo, I am not doing pixel-level verification in any of my tests; at least, I haven't figured that going to that level would be a worthwhile strategy for my application so far. My application is one for taking video and reviewing it with various props, features and modes, but if I can be of any utility for anything, please let me know what it might be! As to my "solution", it really boils down to the pytest fixture which I copied earlier above. No parallelization is involved, and I do not expect to reach a mass of headless tests that would merit going concurrent, so I have no contribution in that realm. If anything fancy or interesting comes up in my journey, I'll post an update, though, and I can share my private repo with your user if you'd like to take a look. You are absolutely right that I should have called them integration or end-to-end tests, although the title of this issue was meant to imply that in a sense. I hope this isn't too disappointing to read, and thank you again for your support!
That said, why would you replace shaders and the GL context with mocks, if that could be accomplished by using headless mode, which, as far as I understand, leverages EGL to accomplish the same?
Mutually Beneficial Items Which Need Help
Solving one or both of these benefits everyone:
I've been blocked by not having enough:
Prioritizing Mac support might be a good idea if you're doing any of the following:
Macs are already commonly used for media production, but their recent models may have additional benefits for you. These include:
Figuring out cmd-A for
Feedback or further research on the keyboard shortcut or drag-and-drop systems would also be helpful.
Why Mock Shaders?
It's for a very specific use case: isolating unit tests of Python code for init and properties in high-level abstractions. We use headless tests and interactive tests elsewhere. To be more specific, abstractions like shapes and sprites break a lot, which keeps wasting everyone's time. We need to solve this. Here's why it happens and how shader mocks / dummy classes help fix it:
There are also further benefits:
I can see the benefit of the mocking now, but the rest of the items are indeed a handful. In my project specifically, Mac support is not even at the bottom of the priority list for the foreseeable future. To be honest, my familiarity with pyglet's codebase and general scenarios, and, to be very open about it, my coding level, are definitely far from contributor level as of yet; I can possibly try smaller and safer contributions if I live long enough.
That's understandable. Some other contributors have even stronger opinions and would prefer we didn't support Mac at all. I understand their sentiment, because it takes a totally different approach to some tasks. The second text entry ticket I linked is actually cross-platform, but it may be worth thinking about further. I'm not sure it should also cover the up/down arrow keys.
For example, I had to throw out two precursors to my current shader mock branch.
Good bug reports and the patience to follow up on them are also important forms of contribution. I've been reading through the backlog of issues in search of easier ones not associated with glaring inconveniences (shape / etc API instability), and I appreciate the ones you've submitted so far. I need to step away for now, but I'll think over the original issue. If the pytest plugin I linked above doesn't cover it already, another ugly workaround might be to start other interpreters and use an RPC library to pass data between the tests and the wrapped interpreter. Your current approach may be less work, however.
You are too kind, but I think that reusing windows in testing is a simple, working approach, and it overlaps with the philosophy and practice of dependency injection, which is anyway a good practical approach to many testability needs. Pytest's fork plugin is more of a dead end than something to keep using for long; if you read the wording at that URL, I wouldn't recommend a new project pick it up right now. I imagine that executing interpreters and collecting their return codes and exceptions makes a natural approach, yet IMHO this should live outside a library like pyglet. I'm not sure why RPC should be involved, and I assume many people write their own subprocess wrappers to this end, at varying degrees of robustness and generality, and providing varying sets of guarantees (just as test frameworks have different feature sets). If my current approach breaks at any point, I'm sure to report back on it, but it seems like the "golden path" for headless testing suites until some long-running window-reuse breakage emerges in pyglet, or in the underlying OpenGL implementation, at which point launching tests as sub-processes is always doable; even though that means writing infrastructure code, the building blocks are quite simple. Thanks again for pyglet; I hope to become helpful as things make progress.
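As a hedged illustration of the sub-process idea discussed above (the helper name and the inline script are my own, not from this thread), a minimal wrapper that runs each test flow in a fresh interpreter and collects its return code might look like:

```python
import subprocess
import sys

def run_isolated(script: str, timeout: float = 60.0) -> subprocess.CompletedProcess:
    """Run a test flow in a fresh Python interpreter, so module-level
    library state (e.g. pyglet's app context) cannot leak between
    flows (hypothetical helper; no RPC, just return codes and output)."""
    return subprocess.run(
        [sys.executable, "-c", script],
        capture_output=True,
        text=True,
        timeout=timeout,
    )

# Each flow gets its own interpreter, hence its own library state.
result = run_isolated("print('app flow finished cleanly')")
assert result.returncode == 0
```

A real wrapper would add exception reporting and crash handling; as noted above, the robustness and guarantees one needs vary by project.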
Just a note to myself that a consequence of

Such a scheduled event will persist across the test cases, as pytest only loads pyglet once. So when a given test case goes

Of course, one can make sure to avoid test cases exiting with pending callbacks queued, by tightening up conditional logic over when callbacks are registered, give or take using

I could instantiate a new pyglet

Perhaps in future versions, the
I worked to prevent my test cases from exiting while any callbacks are left queued, but I think it could be a nice enhancement to either (i) provide a method to clear all queued callbacks, which a test runner fixture or app code can call at test boundaries, or (ii) to have

Obviously, one may rightfully contend that leaving behind queued callbacks is bad application implementation, but since they are bound to creep in, in a way that is very hard to debug in the mentioned test-running scenario, perhaps it is easy to provide some affordance via the framework/library itself that avoids it.

* Disclaiming that I'm not aware at this time whether other test runners (that is, other than pytest) carry the same library-loading behavior and possible consequence.
All that said, I'd like to share that running headless tests has proven immensely useful and successful for my application; I would not be able to develop and evolve my project without the extensive headless testing suite I am running. I guess that something careful, along the lines suggested, may save others from very hard-to-debug issues in the headless testing architecture of their pyglet applications, given the spill-over effect you get in the current state.
TL;DR of the state of tests since earlier comments:
Thanks for those comments. Would PlatformEventLoop.notify actually be an idiomatic way to synchronously clear all events between pytest tests?
The following code simulates a pytest test suite run where a first test runs a pyglet app and finalizes it, and another pyglet app is later run as part of the next test, yielding an error. The minimally reproducible flow below reproduces the same issue, within the boundaries of a single plain script: starting and finalizing a pyglet app, and then naively taking the first step into starting the next test of the same nature:
The exception text provided by pytest is:
I might imagine that there would be a different way to finalize and restart a pyglet application, or that the API was never designed to be used for more than one application within the same flow. However, when using pytest, to the best of my understanding, the library is only loaded once and then reused across all test cases run as part of a test suite, which bumps into this issue when instructing pyglet to run headless.
I find running multiple tests that start and stop the application to be well-motivated, when you really want to test your application with multiple test cases. Is there any way to work around this?
System Information
Ubuntu 20.04
python 3.10.13
pyglet 2.0.5