Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unable to receive push messages after switching network type #6447

Closed
2 tasks done
timotk opened this issue Mar 27, 2017 · 8 comments · May be fixed by signalapp/libsignal-service-java#49
Closed
2 tasks done

Unable to receive push messages after switching network type #6447

timotk opened this issue Mar 27, 2017 · 8 comments · May be fixed by signalapp/libsignal-service-java#49

Comments

@timotk
Copy link

timotk commented Mar 27, 2017

I have:


Bug description

Whenever I switch network type (from WiFi to 3G/LTE or vice versa), Signal is unable to receive any push messages. This includes normal messages and also received confirmation (the double check mark). I am however able to send messages (single check mark) and they are also received. Switching back from mobile network to WiFi also does not solve the problem. After reinstalling Signal the bug still persists.

Important: This phone does not have GCM. It appears the websocket pipe gets broken.

Steps to reproduce

  • Switch from WiFi to 3G/LTE (or vice versa)

Actual result: Switching to mobile network causes Signal to not receive any push messages
Expected result: Signal is able to handle the network switch and is able to receive both messages and received confirmations.

Device info

Device: Samsung Galaxy S4
Android version: 7.1.1
Signal version: 4.1.0

Link to debug log

https://gist.github.com/timotk/bad33a0bff9129358938cd2c4f568ada

I believe this is the relevant part:

03-27 11:16:39.679 31337 31371 W MessageRetrievalService: Application level read timeout...
03-27 11:16:39.682 31337 31371 W MessageRetrievalService: Network requirement: true, active activities: 1, push pending: 0, gcm disabled: true
03-27 11:16:39.684 31337 31371 W MessageRetrievalService: Reading message...

@timotk
Copy link
Author

timotk commented Apr 1, 2017

A workaround is restarting the service whenever you switch network type, by using the following adb commands:
am stopservice org.thoughtcrime.securesms/.service.MessageRetrievalService
am startservice org.thoughtcrime.securesms/.service.MessageRetrievalService

I run these commands using tasker whenever mobile network is connected or disconnected. They need to be run as root.

@olliejm
Copy link

olliejm commented Apr 16, 2017

Just thought I would add that I had the same issue when trying to use signal without GCM. The messages would come through whenever I opened up Signal itself though, but never otherwise, unless I hadn't switched network since the device was last booted, or since signal was last quit and reopened.

That was on a OP3T running LineageOS first April nightly, rooted and with microG installed (but Signal not registered with GCM).

@cubagithub
Copy link

I am having the same problem on Websocket version 4.13.7. Switching back to the original network (WiFi, 4G) doesn't restore the connection, but I find that I can still send outbound messages - they only get 'one tick' next to them as in it isn't confirmed as received on the desitnation end, but they do receive it. I do not receive a response until killing the app and reopening, then I get this (excerpt before/during/after killing app):

12-21 22:37:17.963 27355 27429 W WebSocketConnection: Sending keep alive...
12-21 22:37:18.240 27355 27403 W WebSocketConnection: WSC onMessage()
12-21 22:37:18.242 27355 27403 W WebSocketConnection: Message Type: 2
12-21 22:37:30.677 27355 27400 W MessageRetrievalService: Application level read timeout...
12-21 22:37:30.681 27355 27400 W MessageRetrievalService: Network requirement: true, active activities: 0, push pending: 0, gcm disabled: true
12-21 22:37:30.686 27355 27400 W MessageRetrievalService: Reading message...
12-21 22:37:59.814 27355 27355 W PassphraseRequiredActionBarActivity: onResume()
12-21 22:37:59.886 27355 27355 W KeyCachingService: onStartCommand, org.thoughtcrime.securesms.service.action.ACTIVITY_START_EVENT
12-21 22:37:59.887 27355 27355 W KeyCachingService: Incrementing activity count...
12-21 22:37:59.888 27355 27355 W MessageRetrievalService: Active Count: 1
12-21 22:38:03.283 27869 27869 I MultiDex: VM with version 2.1.0 has multidex support
12-21 22:38:03.283 27869 27869 I MultiDex: Installing application
12-21 22:38:03.283 27869 27869 I MultiDex: VM has multidex support, MultiDex support library is disabled.
12-21 22:38:03.285 27869 27869 W PartProvider: onCreate()
12-21 22:38:03.351 27869 27869 W PersistentAlarmManagerListener: class org.thoughtcrime.securesms.service.RotateSignedPreKeyListener scheduling for: 1513941693116
12-21 22:38:03.353 27869 27869 W PersistentAlarmManagerListener: class org.thoughtcrime.securesms.service.DirectoryRefreshListener scheduling for: 1513898933241
12-21 22:38:03.355 27869 27869 W PersistentAlarmManagerListener: class org.thoughtcrime.securesms.service.UpdateApkRefreshListener scheduling for: 1513877373259
12-21 22:38:03.364 27869 27869 I org.webrtc.Logging: NativeLibrary: Loading native library.
12-21 22:38:03.364 27869 27869 I org.webrtc.Logging: NativeLibrary: Loading library: jingle_peerconnection_so
12-21 22:38:03.394 27869 27869 D JVM : JVM::Initialize@[tid=27869]
12-21 22:38:03.394 27869 27869 D JVM : JVM::JVM@[tid=27869]
12-21 22:38:03.394 27869 27869 D JVM : LoadClasses
12-21 22:38:03.394 27869 27869 D JVM : name: org/webrtc/voiceengine/BuildInfo
12-21 22:38:03.394 27869 27869 D JVM : name: org/webrtc/voiceengine/WebRtcAudioManager
12-21 22:38:03.394 27869 27869 D JVM : name: org/webrtc/voiceengine/WebRtcAudioRecord
12-21 22:38:03.395 27869 27869 D JVM : name: org/webrtc/voiceengine/WebRtcAudioTrack
12-21 22:38:03.415 27869 27869 W PassphraseRequiredActionBarActivity: onCreate(null)
12-21 22:38:03.550 27869 27869 W ExperienceUpgradeActivity: getExperienceUpgrade(308)
12-21 22:38:03.553 27869 27869 W PassphraseRequiredActionBarActivity: routeApplicationState(), state: 0
12-21 22:38:03.559 27869 27869 W art : Before Android 4.1, method android.graphics.PorterDuffColorFilter android.support.graphics.drawable.VectorDrawableCompat.updateTintFilter(android.graphics.PorterDuffColorFilter, android.content.res.ColorStateList, android.graphics.PorterDuff$Mode) would have incorrectly overridden the package-private method in android.graphics.drawable.Drawable
12-21 22:38:03.563 27869 27869 W PassphraseRequiredActionBarActivity: initializeClearKeyReceiver()
12-21 22:38:03.711 27869 27869 D NetworkSecurityConfig: No Network Security Config specified, using platform default
12-21 22:38:03.725 27869 27869 W PassphraseRequiredActionBarActivity: onResume()
12-21 22:38:03.764 27869 27869 W KeyCachingService: onCreate()
12-21 22:38:03.837 27869 27869 W service : Broadcasting new secret...
12-21 22:38:03.838 27869 27869 W KeyCachingService: onStartCommand, null
12-21 22:38:03.839 27869 27869 W KeyCachingService: onStartCommand, org.thoughtcrime.securesms.service.action.ACTIVITY_START_EVENT
12-21 22:38:03.839 27869 27869 W KeyCachingService: Incrementing activity count...
12-21 22:38:03.840 27869 27894 W MmsSmsDatabase: Executing query: SELECT _id, unique_row_id, body, type, thread_id, address, address_device_id, subject, date_sent, date_received, m_type, msg_box, status, part_count, ct_l, tr_id, m_size, exp, st, delivery_receipt_count, read_receipt_count, mismatched_identities, network_failures, subscription_id, expires_in, expire_started, notified, transport_type, attachment_id, unique_id, mid, data_size, file_name, _data, thumbnail, ct, cl, digest, fast_preflight_id, voice_note, cd, name, pending_push FROM (SELECT DISTINCT date_sent AS date_sent, date AS date_received, _id, 'SMS::' || _id || '::' || date_sent AS unique_row_id, NULL AS attachment_id, body, read, thread_id, type, address, address_device_id, subject, NULL AS m_type, NULL AS msg_box, status, NULL AS part_count, NULL AS ct_l, NULL AS tr_id, NULL AS m_size, NULL AS exp, NULL AS st, delivery_receipt_count, read_receipt_count, mismatched_identities, subscription_id, expires_in, expire_started, notified, NULL AS network_failures, 'sms' AS transport_type, NULL AS unique_id, NULL AS mid, NULL AS data_size, NULL AS file_name, NULL AS _data, NULL AS thumbnail, NULL AS ct, NULL AS cl, NULL AS digest, NULL AS fast_preflight_id, NULL AS voice_note, NULL AS cd, NULL AS name, NULL AS pending_push FROM sms WHERE (read = 0 AND notified = 0) UNION ALL SELECT DISTINCT date AS date_sent, date_received AS date_received, mms._id AS _id, 'MMS::' || mms._id || '::' || date AS unique_row_id, part._id AS attachment_id, body, read, thread_id, NULL AS type, address, address_device_id, NULL AS subject, m_type, msg_box, NULL AS status, part_count, ct_l, tr_id, m_size, exp, st, delivery_receipt_count, read_receipt_count, mismatched_identities, subscription_id, expires_in, expire_started, notified, network_failures, 'mms' AS transport_type, unique_id, mid, data_size, file_name, _data, thumbnail, ct, cl, digest, fast_preflight_id, voice_note, cd, name, pending_push FROM mms LEFT OUTER JOIN part ON part._id = (SELECT part._id FROM part WHERE part.mid = mms._id LIMIT 1) WHERE (read = 0 AND notified = 0) ORDER BY date_received ASC)
12-21 22:38:03.883 27869 27907 W MessageRetrievalService: Waiting for websocket state change....
12-21 22:38:03.884 27869 27907 W MessageRetrievalService: Network requirement: true, active activities: 0, push pending: 0, gcm disabled: true
12-21 22:38:03.892 27869 27907 W MessageRetrievalService: Making websocket connection....
12-21 22:38:03.893 27869 27907 W WebSocketConnection: WSC connect()...
12-21 22:38:03.893 27869 27869 W MessageRetrievalService: Active Count: 1
12-21 22:38:03.918 27869 27907 W SignalCommunicationModule: onConnecting()
12-21 22:38:03.921 27869 27907 W MessageRetrievalService: Network requirement: true, active activities: 1, push pending: 0, gcm disabled: true
12-21 22:38:03.922 27869 27907 W MessageRetrievalService: Reading message...

Full debug:
https://gist.github.com/cubagithub/e79b28e6836b9a895a14a48bfb904e15

@timotk
Copy link
Author

timotk commented Dec 22, 2017

@cubagithub

I believe @dpapavas was working on a fix, maybe he can give an update?
See his comment: #6644 (comment)
And his PR: signalapp/libsignal-service-java#45

Btw, I think this issue and #6644 are the same.

@dpapavas
Copy link

dpapavas commented Dec 22, 2017

I'm not sure if this bug is related to #6644, or not, but I do remember looking into it as part of the bug. I'm a bit hazy on the details though, but I can confirm that it can be fixed, if not fully, at least to a large degree.

One line of reasoning is that, just as the keep-alive thread blocks forever due to reasons explained in #6644, so does the loop that creates, tries to read from and then destroys the message pipe. The upshot is that, if the network changes, the current pipe's connection is broken, but it's stuck in blocking sleep, so a new one isn't created until it happens to wake up.

That wouldn't explain why the device is able to send messages though, so the issue must be a bit more complex. I see that my patched version (see 2c02f8f#diff-9c046c6a5dc41570be2a81403504b78d), keeps track of several pipes, so the problem is probably that, while a new connection is generated, the message retrieval thread is blocked and stuck listening to the old connection, where the server will never respond with a receipt. It still uses the new pipe to send messages though, so that works.

I think the issue is further compounded, at least on some devices, mine being one of them, by the fact that it can take a lot of time (I've measured 20 minutes, if memory serves) from the time the connection is broken, until the socket (or the library handling them) signals that fact, causing a new connection to be generated. I remember trying to detect a change in the network connection of the device, trying to force a reconnection, but I can't find any of that code, so I assume it didn't work out.

As to a fix, I'm afraid I'm not working on an official fix, for reasons I've explained in signalapp/libsignal-service-java#45. I have resolved to keep my private patched version, until further notice, where I can confirm that this problem is mostly solved. (I say mostly, because I still occasionally get a missed call notification, even though the device is just sitting on the table, so that it shouldn't have any reason to have connection problems).

The patched version has a further fix concerning #6258 and it also has the update nag notification disabled, for obvious reasons. I update it from time to time and anyone is, of course, welcome to use it. You'll need this and this.

@ChristianKreuzberger
Copy link

Same issue here, websockets won't time out when switching network connection, and signal doesnt receive any messages for a long time...

@dpapavas
Copy link

dpapavas commented Feb 8, 2018

I remember trying to detect a change in the network connection of the device, trying to force a reconnection, but I can't find any of that code, so I assume it didn't work out.

I forgot to update here, that my assumption was wrong. I did have a fix that, which I had included in my private version, and which I recently improved and commited (4359b4f). More details can be found in my post here.

@automated-signal
Copy link

GitHub Issue Cleanup:
See #7598 for more information.

@signalapp signalapp locked and limited conversation to collaborators Apr 3, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
6 participants