Attempt at un-flaky'ing the preview a very long message cucumber scenario #8428

Merged 1 commit on Jun 13, 2023


denschub
Member

... by making the extremely long status message shorter.

This wasn't fun to debug, and honestly, I still don't know if this is even a fix. Here's what I think happened:

In #8418, we switched from apparition to cuprite, primarily because apparition is unmaintained. Both remote-control Chrome via the Chrome DevTools Protocol (CDP). This is technically a bit of a hack, but a fine one (right now), because standardized cross-browser bi-directional remote control is still WIP at the moment (although there's good progress!). cuprite uses ferrum internally, which is a high-level API for CDP.

Recently, the "preview a very long message" scenario in features/desktop/post_preview.feature became flaky, and it's more red than green. The odd thing here was that the test timed out while entering text into the publisher:

2502:    When I insert an extremely long status message # features/step_definitions/publisher_steps.rb:44
2503:      Timed out waiting for response. It's possible that this happened because something took a very long time (for example a page load was slow). If so, setting the :timeout option to a higher value might help. (Ferrum::TimeoutError)

and this kinda makes no sense. I did convince Cucumber to take screenshots of the failure, and GitHub to store those for me, and ... you can clearly see that all the text is there at the time of failure, so that makes even less sense, because that means that a) it found the element to type into, and b) it was able to type into it.

I noticed, however, that the error we were getting was Ferrum::TimeoutError. This is odd, because for most things that can go wrong (like element search timeouts, etc.) there's a level of abstraction above that, and I'd expect a Capybara::* error. So what's happening here is that there was somehow a timeout sending or receiving a CDP message to or from the browser.
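To illustrate how the text can be fully present while the step still fails, here is a toy model in plain Ruby. It is not how Ferrum or Chrome work internally; `FakeBrowser`, `TransportTimeout`, and `send_command` are hypothetical names made up for this sketch. The fake browser applies the command right away but acknowledges it late, so the caller gives up waiting even though the side effect already happened.

```ruby
require "timeout"

# Hypothetical stand-in for a transport-level error like Ferrum::TimeoutError.
class TransportTimeout < StandardError; end

# Hypothetical fake browser: applies the command first, acknowledges it later.
class FakeBrowser
  attr_reader :typed_text

  def initialize(reply_delay:)
    @reply_delay = reply_delay
    @typed_text = +""
  end

  def handle(text, replies)
    @typed_text << text   # the side effect lands immediately
    sleep @reply_delay    # ...but the acknowledgement is slow
    replies << :ok
  end
end

# Sends a command and waits at most `timeout` seconds for the reply.
def send_command(browser, text, timeout:)
  replies = Queue.new
  worker = Thread.new { browser.handle(text, replies) }
  Timeout.timeout(timeout) { replies.pop }
rescue Timeout::Error
  raise TransportTimeout, "Timed out waiting for response"
ensure
  worker&.kill
end

browser = FakeBrowser.new(reply_delay: 1)
begin
  send_command(browser, "hello", timeout: 0.1)
rescue TransportTimeout => e
  puts "#{e.class}: #{e.message}"
  puts "text in the fake publisher: #{browser.typed_text.inspect}"
end
```

In this model the error comes from the transport layer rather than from an element lookup, which matches the observation that the failure surfaced as a Ferrum error instead of a Capybara one.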

Anyway, it looks like the old message, being 2048 chars long, apparently sometimes tripped up Ferrum or Chrome itself. This PR replaces this with "long post\n" * 15, which is only 150 chars. The new, shorter message breaks less (or not at all - I did a couple of runs and didn't see it break once). It's still long enough, though, as the way we determine if a status message is "too long" is by height only, so line-breaks work.
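To make the size difference concrete, here is a minimal sketch comparing the two inputs. The old message is only described here by its length, so `old_message` below is made-up filler of that length (an assumption, not the original text); `new_message` is the expression from this PR.

```ruby
# Stand-in for the old input: only its length (2048 chars) is known from
# the description above, so the content here is made-up filler.
old_message = "a" * 2048

# The replacement used by this PR: 15 short lines.
new_message = "long post\n" * 15

puts "old: #{old_message.length} chars, #{old_message.lines.count} line(s)"
puts "new: #{new_message.length} chars, #{new_message.lines.count} line(s)"
```

The new message is a much smaller payload, but it still spans 15 lines, which is what counts when "too long" is decided by rendered height.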

I'm still not convinced that this is actually a reliable fix, which is why I'm leaving this essay here. If we ever have to come back, at least we know what not to try. 🙃

The old message, being 2048 chars long, apparently sometimes tripped up
Ferrum or Chrome itself. The new, shorter, message does that less or not
at all. It's still long enough, though, as the way we determine if a
status message is "too long" is by height only, so line-breaks work.
Member

@SuperTux88 SuperTux88 left a comment

Thanks for debugging this 🍪

It's still weird that the message is only too long for Chrome sometimes and only on GitHub, but I also had a feeling that it could be something like that, so let's see if it works.

@denschub denschub merged commit 6430aa3 into diaspora:develop Jun 13, 2023
12 checks passed
@denschub denschub deleted the hackaround-flaky-cuke branch June 13, 2023 13:25
Member Author

denschub commented Jun 13, 2023

> It's still weird that the message is only too long for Chrome sometimes and only on GitHub

My current thought about this is something networking-related, btw. GitHub Actions is containerized, and we don't know much about the runtime. 2k isn't a big payload, but it might cause fragmentation or just some tiny latency internally. It could also be that there's a specific Chrome bug in the version the runner is using. There is now a way to deliberately install a specific version of Chrome for testing (published yesterday!), so this might be something to explore in the future if we have a similar issue again... or just hope that by then, we can run Cucumber via Selenium using WebDriver BiDi, and just skip this entire problem space.
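If this does come back, a rough sketch of the two knobs mentioned in this thread (the :timeout option from the error message, and a pinned Chrome binary) might look like the following. The values and the CHROME_FOR_TESTING_PATH variable name are assumptions for illustration, not settings from this repository, and none of this has been tested against the flaky scenario.

```ruby
# Sketch only: values and the CHROME_FOR_TESTING_PATH variable name are
# assumptions for illustration, not settings taken from this repository.
CUPRITE_OPTIONS = {
  timeout: 30,                                  # seconds to wait for a CDP response
  process_timeout: 30,                          # seconds to wait for Chrome to start
  browser_path: ENV["CHROME_FOR_TESTING_PATH"]  # pinned Chrome binary, if provided
}.compact

# Only register the driver when Capybara is already loaded,
# e.g. from a Cucumber support file.
if defined?(Capybara)
  require "capybara/cuprite"

  Capybara.register_driver(:cuprite) do |app|
    Capybara::Cuprite::Driver.new(app, CUPRITE_OPTIONS.dup)
  end
end
```

Raising the timeout would only paper over slow responses, so it would not explain why the shorter message behaves differently.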

@SuperTux88 SuperTux88 added this to the 1.0.0 milestone Jun 13, 2023