Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bot/Celluloid only connects 50% of the time #217

Open
boncey opened this issue Mar 27, 2019 · 4 comments
Open

Bot/Celluloid only connects 50% of the time #217

boncey opened this issue Mar 27, 2019 · 4 comments
Labels

Comments

@boncey
Copy link

boncey commented Mar 27, 2019

My bot doesn't seem to connect to Slack around 50% of the time when I start it up.
I get these lines every time...

request: POST https://slack.com/api/rtm.start
response: Status 200

But I only get this...

Successfully connected team XXXXXXX (XXXXXXX) to https://XXXXXXX.slack.com.

around half the time.

Here's some logging output.

Success...

I, [2019-03-27T19:05:32.850641 #22933]  INFO -- request: POST https://slack.com/api/rtm.start
D, [2019-03-27T19:05:32.850858 #22933] DEBUG -- request: Accept: "application/json; charset=utf-8"
User-Agent: "Slack Ruby Client/0.14.1"
Content-Type: "application/x-www-form-urlencoded"
I, [2019-03-27T19:05:33.080719 #22933]  INFO -- response: Status 200
D, [2019-03-27T19:05:33.081056 #22933] DEBUG -- response: content-type: "application/json; charset=utf-8"
content-length: "7379"
connection: "close"
date: "Wed, 27 Mar 2019 19:05:32 GMT"
server: "Apache"
expires: "Mon, 26 Jul 1997 05:00:00 GMT"
x-accepted-oauth-scopes: "rtm:stream,client"
x-content-type-options: "nosniff"
cache-control: "private, no-cache, no-store, must-revalidate"
x-oauth-scopes: "identify,read,post,client,apps"
access-control-expose-headers: "x-slack-req-id"
x-xss-protection: "0"
x-slack-req-id: "a7e8bef3-ed62-49c1-8495-d0abf386d5fd"
vary: "Accept-Encoding"
pragma: "no-cache"
access-control-allow-headers: "slack-route, x-slack-version-ts"
strict-transport-security: "max-age=31536000; includeSubDomains; preload"
referrer-policy: "no-referrer"
access-control-allow-origin: "*"
x-via: "haproxy-www-3o80"
x-cache: "Miss from cloudfront"
via: "1.1 08efbf25f39cc444a2f013fdf0a017cc.cloudfront.net (CloudFront)"
x-amz-cf-id: "2Vqtz-Dv9k6iK4fEUfXHxf-JuANv_Tgc6fzNGU5NFGxY3NXwQK-6SA=="
D, [2019-03-27T19:05:33.131482 #22933] DEBUG -- Slack::RealTime::Concurrency::Celluloid::Socket#connect!: WebSocket::Driver::Client
D, [2019-03-27T19:05:33.408505 #22933] DEBUG -- Slack::RealTime::Concurrency::Celluloid::Socket#write: GET /websocket/dAs6IhfWzsZRBuzI1Ech53ZYD6h8ZnvBeked_2dlhftUFtgqWr4aJqIEp-QT-72cX_tpArB05_bnCrhac5dl4diAhQ_-1K1bkcqgyhDvZUI= HTTP/1.1
Host: cerberus-xxxx.lb.slack-msgs.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: DyEfhIrF5WC6KgyfLcWk4Q==
Sec-WebSocket-Version: 13


D, [2019-03-27T19:05:33.493841 #22933] DEBUG -- Slack::RealTime::Concurrency::Celluloid::Socket#handle_read: HTTP/1.1 101 Switching Protocols
upgrade: websocket
connection: upgrade
sec-websocket-accept: 3ICQB3ks/+XMUGkBdMNtzNzi6EI=
X-Via: haproxy-lb-mp-lbcert-iad-q9r9


D, [2019-03-27T19:05:33.500763 #22933] DEBUG -- SlackRubyBot::Client#run_loop: WebSocket::Driver::OpenEvent
D, [2019-03-27T19:05:33.501570 #22933] DEBUG -- Slack::RealTime::Concurrency::Celluloid::Socket#handle_read: ��{"type": "hello"}
D, [2019-03-27T19:05:33.503971 #22933] DEBUG -- SlackRubyBot::Client#run_loop: WebSocket::Driver::MessageEvent, {"type": "hello"}
D, [2019-03-27T19:05:33.504300 #22933] DEBUG -- SlackRubyBot::Client#dispatch: type=hello
I, [2019-03-27T19:05:33.504531 #22933]  INFO -- : Successfully connected team XXXXXXX (XXXXXXX) to https://XXXXXXX.slack.com.

Failure...

I, [2019-03-27T19:05:52.011383 #23127]  INFO -- request: POST https://slack.com/api/rtm.start
D, [2019-03-27T19:05:52.011610 #23127] DEBUG -- request: Accept: "application/json; charset=utf-8"
User-Agent: "Slack Ruby Client/0.14.1"
Content-Type: "application/x-www-form-urlencoded"
I, [2019-03-27T19:05:53.000160 #23127]  INFO -- response: Status 200
D, [2019-03-27T19:05:53.000538 #23127] DEBUG -- response: content-type: "application/json; charset=utf-8"
content-length: "7379"
connection: "close"
date: "Wed, 27 Mar 2019 19:05:52 GMT"
server: "Apache"
expires: "Mon, 26 Jul 1997 05:00:00 GMT"
x-accepted-oauth-scopes: "rtm:stream,client"
x-content-type-options: "nosniff"
cache-control: "private, no-cache, no-store, must-revalidate"
x-oauth-scopes: "identify,read,post,client,apps"
access-control-expose-headers: "x-slack-req-id"
x-xss-protection: "0"
x-slack-req-id: "d7ece95e-c146-4c1f-bba5-eced67b53cdb"
vary: "Accept-Encoding"
pragma: "no-cache"
access-control-allow-headers: "slack-route, x-slack-version-ts"
strict-transport-security: "max-age=31536000; includeSubDomains; preload"
referrer-policy: "no-referrer"
access-control-allow-origin: "*"
x-via: "haproxy-www-26um"
x-cache: "Miss from cloudfront"
via: "1.1 eaeb02896a2d60cfeb39bf4d1e8e5932.cloudfront.net (CloudFront)"
x-amz-cf-id: "LGE0n_COojQdUerY5W5tdFSPoLogxFI0j_2YKulGKPqj1dJa_TA4Tg=="
D, [2019-03-27T19:05:53.056372 #23127] DEBUG -- Slack::RealTime::Concurrency::Celluloid::Socket#connect!: WebSocket::Driver::Client

My config.ru start up code looks like this...

Thread.abort_on_exception = false
Thread.report_on_exception = false
SlackRubyBot::Client.logger.level = Logger::DEBUG

Thread.new do
  SlackXXXXXbot::Bot.run
rescue => e
  STDERR.puts "ERROR: #{e}"
  STDERR.puts e.backtrace
  raise e
end

run SlackXXXXXbot::Web

I'm not going via a proxy and as far as I can tell I have no network issues (the box lives in a data centre with excellent connectivity).
I'm running the latest version.

My question is, is there a way to detect when this happens and try again?
Also, is there something I can add to my app's healthcheck that tests my connectivity so I can try to reconnect?

@dblock
Copy link
Collaborator

dblock commented Mar 28, 2019

This seems strange, and certainly not observed elsewhere.

Could you try switching to async-websocket from Celluloid? Would like to learn whether this is an asynchrony problem or a networking one.

@dblock dblock added the bug? label Mar 28, 2019
@boncey
Copy link
Author

boncey commented Mar 28, 2019

I did that (I just changed the entry in the Gemfile) and it seems to connect every time now.
I just did it 7 times in a row and it connected every time.

@dblock
Copy link
Collaborator

dblock commented Mar 28, 2019

Ok, which means we have a real bug with Celluloid in this case. I'll leave this open, if you have time would love it if you could spend some more time debugging/adding log code/try/rescue, maybe we're not seeing an exception after the websocket connection is opened or something of that kind...

@dblock dblock changed the title My bot only connects 50% of the time Celluloid only connects 50% of the time Mar 28, 2019
@dblock dblock changed the title Celluloid only connects 50% of the time Bot/Celluloid only connects 50% of the time Mar 28, 2019
@dblock
Copy link
Collaborator

dblock commented Mar 28, 2019

And if you are generous with your time, I would confirm whether this issue exists with EventMachine and with Celluloid on older versions of slack-ruby-client (especially in 0.13.x, 0.12.x).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants