API feature to close/release all connections or clean the cached devices #688

Open
kvmw opened this issue May 12, 2020 · 22 comments
Labels
bug-android Bug that is caused by Android OS

Comments

@kvmw

kvmw commented May 12, 2020

Is your feature request related to a problem? Please describe.
I'm using the library to scan for other phones and read some characteristics from them.
Everything works perfectly on relatively new phones with Android 8 or higher,
but old phones with Android 7 fail to close their connections, and after a couple of hours they end up in status 133:

com.polidea.rxandroidble2.exceptions.BleDisconnectedException: Disconnected from MAC='XX:XX:XX:XX:XX:XX' with status 133 (GATT_ERROR)

and they are not able to scan until the device is rebooted.

This might be a bug in my code or in the library, but since new phones work flawlessly, I suspect it is related to the Android OS.

Describe the solution you'd like
Would it be possible to kill the connections and release them using an API call in the library?
Or, if this is a caching-related issue, to clear the cached devices via this library?

@dariuszseweryn
Owner

This is an OS-related problem. There is nothing that can be done, AFAIK. Their BLE stack is probably getting into a bad state, and there is no programmatic way to ensure proper functionality. You could try turning the Bluetooth adapter off and on, and Wi-Fi as well (on <=6.0).

@kvmw
Author

kvmw commented May 13, 2020

@dariuszseweryn, unfortunately turning it off and on doesn't help. I have to reboot the device or clear the Bluetooth cache and data. I was hoping for a solution without user interaction.

Is this a known issue? Has there been any investigation on your side to find its cause?

@dariuszseweryn
Owner

I have not encountered the issue you are describing. I successfully tested connections that spanned ~3 days (over a weekend).

There are known issues with the Android BLE stack, which tends to report a lot of different problems as "status 133". I have no idea what the problem may be in your case.

Only the BluetoothGatt cache can be cleared programmatically, but as far as I know that only clears the cache of discovered services, and I doubt it would help in your case.

In any case, you have not described what you are doing with the connection.

Feel free to provide more info, or investigate what may be going on in your case. If that leads to some general conclusions, I will be more than happy to incorporate them into the library.
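
For context, the BluetoothGatt cache clear mentioned above is commonly done by calling the non-public `refresh()` method on `BluetoothGatt` through reflection. The sketch below shows only the reflection step in plain Kotlin. `invokeNoArgBoolean` is a hypothetical helper, not part of RxAndroidBle, and hidden-API restrictions on newer Android versions may block the call.

```kotlin
/**
 * Hypothetical helper: looks up a public no-arg method by name and calls it.
 * Returns the method's Boolean result, or false if the method is missing,
 * cannot be invoked, throws, or does not return a Boolean.
 */
fun invokeNoArgBoolean(target: Any, methodName: String): Boolean =
    try {
        val method = target.javaClass.getMethod(methodName)
        method.isAccessible = true
        method.invoke(target) as? Boolean ?: false
    } catch (e: Exception) {
        // Covers NoSuchMethodException, IllegalAccessException, InvocationTargetException, etc.
        false
    }
```

On Android this would be called as `invokeNoArgBoolean(gatt, "refresh")` on a connected `BluetoothGatt`. As the comment above says, it only drops the cached services, so it may well not help with status 133.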

@kvmw
Author

kvmw commented May 13, 2020

@dariuszseweryn thanks for the reply.
Here is my scenario:

  • Multiple phones are advertising and scanning at the same time.
  • Advertisers expose only one characteristic, which the scanners can read.

For the scanners, here is what I'm doing, at least in theory:

  • I have a foreground Android service that does the following:
  • observes the client state until it is READY
  • scans for devices with a specific service ID
  • for each device in the scan result:
    - connects, reads the characteristic, and disconnects

Here is my sample code to give more context:

rxBle.observeStateChanges()
    .startWith(rxBle.state)
    .switchMap { state ->
        when (state) {
            READY -> rxBle.scanBleDevices(settings, filter)
            else -> Observable.empty()
        }
    }
    .filter { result -> isDisconnected(result) }
    .subscribe(
        { result ->
            val compositeDisposable = CompositeDisposable()
            result
                .bleDevice
                .establishConnection(false)
                .flatMapSingle { connection ->
                    Single.zip(
                        connection.readCharacteristic(UUID.fromString("sample-char-uuid")),
                        connection.readRssi(),
                        BiFunction { value, rssi -> Pair(value, rssi) }
                    )
                }
                .doFinally {
                    compositeDisposable.clear()
                }
                .take(1)
                .subscribe(
                    { pair -> save(pair) },
                    { error -> log(error) }
                )
                .let {
                    compositeDisposable.add(it)
                }
        },
        { error -> log(error) }
    )
    .let {
        scanDisposable = it
    }

@dariuszseweryn
Owner

old phones with android 7 are failing to close the connections

I haven't noticed this before — how can you tell that they are failing to close connections?

@kvmw
Author

kvmw commented May 15, 2020

old phones with android 7 are failing to close the connections

I haven't noticed this before — how can you tell that they are failing to close connections?

I'm not 100% sure, but everything I read about status 133 points to connection leaks and failing to close connections properly (the famous close and disconnect conversation).

Also, even though my code filters for disconnected devices, I see BleAlreadyConnectedException in the stack traces too.

@dariuszseweryn
Owner

dariuszseweryn commented May 15, 2020

The library always calls BluetoothGatt.close() when ending a connection.

BleAlreadyConnectedException Javadoc:

 /**
 * An exception being emitted from an {@link io.reactivex.Observable} returned by the function
 * {@link com.polidea.rxandroidble2.RxBleDevice#establishConnection(boolean)} or other establishConnection() overloads when this kind
 * of observable was already subscribed and {@link com.polidea.rxandroidble2.RxBleConnection} is currently being established or active.
 *
 * <p>
 *     To prevent this exception from being emitted one must either:<br>
 *     * always unsubscribe from the above mentioned Observable before subscribing again<br>
 *     * {@link io.reactivex.Observable#share()} or {@link io.reactivex.Observable#publish()} the above mentioned
 *     Observable so it will be subscribed only once
 * </p>
 */

It just means that you are already trying to connect to the given peripheral, probably due to a race condition.
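
One way to rule out overlapping attempts on the application side is to track which MAC addresses have a connection attempt in flight. This is a minimal sketch under that assumption. `ConnectionGuard` is a hypothetical class, not something the library provides, and it does not replace unsubscribing or sharing the Observable as the Javadoc recommends.

```kotlin
import java.util.Collections
import java.util.concurrent.ConcurrentHashMap

/** Hypothetical guard that admits at most one in-flight connection attempt per MAC address. */
class ConnectionGuard {
    // Thread-safe set, because scan results and connection callbacks may arrive on different threads.
    private val inFlight: MutableSet<String> =
        Collections.newSetFromMap(ConcurrentHashMap<String, Boolean>())

    /** Returns true if the caller may start connecting, false if an attempt is already running. */
    fun tryAcquire(macAddress: String): Boolean = inFlight.add(macAddress)

    /** Call when the attempt ends, whether by success, error, or dispose. */
    fun release(macAddress: String) {
        inFlight.remove(macAddress)
    }
}
```

In the sample code from this thread, `tryAcquire` could sit in the `filter` step and `release` in `doFinally`, so a second scan result for the same device is dropped while the first connection is still active.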

@lassebe

lassebe commented May 15, 2020

old phones with android 7 are failing to close the connections

I haven't noticed this before — how can you tell that they are failing to close connections?

The main observed behaviour is that these devices eventually, usually after 30 minutes or so, become unable to establish new connections.

We are inferring that it's related to the number of connections. Will try and keep a counter and see if the failures start around a specific number.

@dariuszseweryn
Owner

Android limits how many connections it can handle at any given time. Search for BluetoothGatt limitations on different API levels.
It was ~4 on API 18, 7 on API 21, and 15 on API 23, for the whole OS. You have to close connections once they are no longer needed.

@lassebe

lassebe commented May 16, 2020

Right, that's why we try to connect to one device at a time, and dispose of the connectionDisposable as soon as we either get a success or an error.

This doesn't seem to be an issue with too many concurrent connections, but rather that we are hitting some limitation in how many connections we can open (and close) before something goes terribly wrong. I don't think it's tied to the library really, it feels more likely that it's an OS level issue. So far, I don't think we've seen it on Android 9+, just 7 and in one case 8.1.

We're really just trying to figure out if there's anything else we can do to:

  1. Reliably detect that the device has entered this state
  2. Mitigate it if possible
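
For the detection part, one possible heuristic is to count consecutive connection attempts that end with status 133. The sketch below assumes that a run of such failures suggests the stuck state. `StuckStackDetector` and its default threshold of 3 are hypothetical, and nothing in this thread confirms that the heuristic is reliable.

```kotlin
/** Hypothetical detector: flags a possibly stuck BLE stack after N consecutive status-133 failures. */
class StuckStackDetector(private val threshold: Int = 3) {
    private var consecutiveFailures = 0

    /**
     * Records the outcome of one connection attempt.
     * Pass the GATT status of the failure, or null if the attempt succeeded.
     * Any outcome other than status 133 resets the counter (an assumption of this sketch).
     * Returns true once the number of consecutive 133 failures reaches the threshold.
     */
    fun record(gattStatus: Int?): Boolean {
        consecutiveFailures = if (gattStatus == GATT_ERROR) consecutiveFailures + 1 else 0
        return consecutiveFailures >= threshold
    }

    companion object {
        const val GATT_ERROR = 133 // the status reported in the exception at the top of this issue
    }
}
```

The status value could be taken from the `BleDisconnectedException` in the error callback. What to do once `record` returns true (restart the adapter, notify the user) is left to the application.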

@kvmw
Author

kvmw commented May 21, 2020

@dariuszseweryn I've set up a simpler scenario which still reproduces the issue in about 3 hours:

At 1-minute intervals:

  • start a scan for 10 seconds
  • pick one device from the result (if there is any)
  • stop the scan (dispose)
  • connect to the device and read a single characteristic
  • close the connection (dispose)

Since the default connection timeout is 30 seconds, the 1-minute window is more than enough to scan for 10 seconds, find a single device, connect, and read a single characteristic (or time out).

I am using a single device for advertising and another one for scanning, so there is no race condition and no connection leak in the code, as far as I can tell.

I'm almost convinced this is an issue with the Android BLE stack.
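
The timing argument above can be written out as a small worst-case calculation. This sketch assumes the scan and the connection attempt run back to back and ignores the time the characteristic read itself takes. The function names and the optional `settleDelayMillis` parameter are hypothetical.

```kotlin
/** Worst-case duration of one cycle in milliseconds, with scan and connect running back to back. */
fun worstCaseCycleMillis(
    scanMillis: Long,
    connectTimeoutMillis: Long,
    settleDelayMillis: Long = 0 // optional pause between scan and connect
): Long = scanMillis + settleDelayMillis + connectTimeoutMillis

/** True if the worst case still fits inside the repeat interval. */
fun fitsInterval(intervalMillis: Long, worstCaseMillis: Long): Boolean =
    worstCaseMillis <= intervalMillis
```

With the numbers from the scenario, `worstCaseCycleMillis(10_000, 30_000)` is 40,000 ms, which leaves 20 seconds of slack in a 60,000 ms interval.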

@dariuszseweryn
Owner

Ideally there should be a ~0.5 second delay between scanning and connecting to the device. This was observed to help. There is also a known issue (a race condition) on Android in which the timeout happens around the same time as the peripheral gets connected. It then appears to the application and the OS that the peripheral is not connected, yet the peripheral has a connection to the mobile.
I cannot find a reference right now. Unfortunately, this cannot be mitigated in code. The best way to avoid it is to have a peripheral that advertises often enough.

@kvmw
Author

kvmw commented May 22, 2020

Ideally there should be a ~0.5 second delay between scanning and connecting to the device. This was observed to help. There is also a known issue (a race condition) on Android in which the timeout happens around the same time as the peripheral gets connected. It then appears to the application and the OS that the peripheral is not connected, yet the peripheral has a connection to the mobile.
I cannot find a reference right now. Unfortunately, this cannot be mitigated in code. The best way to avoid it is to have a peripheral that advertises often enough.

Thanks for the suggestion, but even applying a 1-second delay between scanning and connecting doesn't help. I'm still running into the same problem.

@dariuszseweryn dariuszseweryn added bug-android Bug that is caused by Android OS and removed enhancement idea labels May 25, 2020
@dariuszseweryn
Owner

Unfortunately, I do not know how I could help you further with this issue. This looks like an Android OS BLE stack problem, and I do not know of any further mitigations. If you find any, feel free to add them here and I will think about how to incorporate them into the library.

@SmartShepherdUser

It is an Android OS problem - I hit it a lot in the past and the only way to mitigate it is to force Android to clean up as much as you can. My app deals with 500+ sequential bluetooth LE devices and the only reliable way to continue through that many is to turn the bluetooth adapter off, wait until you get the notification that the adapter switched off, then turn it back on again. Obviously you need permission to do this and doing so without informing the user is considered poor behaviour. However, in my application, asking them whether they want to turn it on and off all the time is more obnoxious than just doing it in the background. This is for an industrial application so I don't have to worry too much about re-pairing with headphones or whatever since my app is generally the only thing being run. It does, however, work.

If there was a neat way to automatically cycle the bluetooth adapter on and off using this library that would be super handy, but doing so is only really 10 lines of code. It's more something to be aware of on older versions of android.

In my experience, there is a limited number of connections the bluetooth subsystem will keep track of and that number is 8. Once you have 8 (clientIf 0-7) non-responsive devices in your list you are tanked and you get the 133 error and a lot more besides. A smarter programmer than me might keep track of the errors / timeouts / nonresponding devices and clean up only when necessary but I find it much safer in my application to simply turn the adapter on and off every 5 device interactions regardless of whether they were successful or not. Best day we had was talking to 1000+ devices in a row on the older version of Android without a reboot.

I just replaced my hideously complicated bluetooth code (caching, queueing, threading oh my!) with RxAndroidBle and I am very happy with it - so simple now I was finally able to put in a proper check for read/write characteristic time outs. Getting the adapter to cycle on a set schedule would be the cherry on top but it's easy enough to write that code outside of the library and embed it in your Activity instead (which is probably where that should be)
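
The "restart the adapter every 5 device interactions" rule described above can be sketched as a small counter. `AdapterCyclePolicy` is a hypothetical class and `cycleAdapter` is a placeholder callback. On Android the callback would typically call `BluetoothAdapter.disable()`, wait for the `ACTION_STATE_CHANGED` broadcast to report `STATE_OFF`, and then call `enable()`. That part is left out here so the sketch stays plain Kotlin.

```kotlin
/**
 * Hypothetical policy: triggers an adapter restart after every `interactionsPerCycle`
 * device interactions, regardless of whether they succeeded.
 */
class AdapterCyclePolicy(
    private val interactionsPerCycle: Int = 5,
    private val cycleAdapter: () -> Unit // placeholder for the platform-specific off/on sequence
) {
    private var sinceLastCycle = 0

    init {
        require(interactionsPerCycle > 0) { "interactionsPerCycle must be positive" }
    }

    /** Call once per finished device interaction. Returns true if a restart was triggered. */
    fun onInteractionFinished(): Boolean {
        sinceLastCycle++
        if (sinceLastCycle < interactionsPerCycle) return false
        sinceLastCycle = 0
        cycleAdapter()
        return true
    }
}
```

Keeping the counter separate from the Android calls makes it possible to tune the interval per device model, since the thread suggests the workaround does not behave the same everywhere.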

@kvmw
Author

kvmw commented Jun 8, 2020

@SmartShepherdUser yeah, you are right. It seems to be an Android issue.
The workaround you are using (switching Bluetooth off and on) doesn't work on all devices (for example, on most Samsung devices). On those devices you have to clear the Bluetooth cache/data or reset the device.

@dariuszseweryn
Owner

@SmartShepherdUser

It probably depends a lot on the device involved. The technique works on an ALPS-based system I use for a specific industrial purpose. It doesn't cause any harm on a Samsung Galaxy S8, but it may not do any good either. The only advice I have is that if you let errors accumulate, the older Android versions never properly release the connections, and that is what leads to the death of the whole Bluetooth stack. How that is dealt with under the hood in the drivers is probably specific to the device. Regardless, perhaps the feature should be error accumulation warnings, so that the application knows it's hitting the danger zone on earlier versions of Android. That would remove a lot of application logic, in the same way the Rx stuff removed all my queues/caches etc., without prescribing the behaviour, so you could handle the error at a higher level (i.e. make the error accumulation a warning, perhaps by looking at the clientIf number coming back in the devices, and throw a warning when the clientIf number reaches a certain threshold on certain versions of Android).

@SmartShepherdUser

There was an issue with how many Bluetooth Devices cache entries an Android device may handle. This could explain some of the issues @SmartShepherdUser has faced.

No, I think this is related to the issue but isn't the underlying issue (max 8 active BLE connections per application). The caching thing looks super bad but I never hit that. I had 1000 BLE devices turned on in a room last year and everything Bluetooth stopped completely, including headphones etc. I don't recommend that ha ha!

@dariuszseweryn
Owner

From what I know, it is currently max 13 connections per system (since 9.0; prior to that it was 7). Source. There is an additional restriction on the maximum number of BluetoothGatt objects per system as well, but AFAIR it is much higher than that.

@SmartShepherdUser

So to get back to the point, would it be possible to produce a warning before those version specific thresholds were reached? I can help testing and perhaps have some code to contribute. Also need to reiterate this is not an rxandroidble issue but could make it a killer feature.
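
A rough sketch of what such a warning check might look like, using the per-system limits quoted earlier in this thread (13 connections from Android 9.0, i.e. API 28, and 7 before that). Both function names and the default margin of 2 are hypothetical, and the limits are taken from the comment above rather than verified against Android sources.

```kotlin
/** Connection limit per system as quoted in this thread: 13 from API 28, 7 before. Not independently verified. */
fun quotedConnectionLimit(apiLevel: Int): Int = if (apiLevel >= 28) 13 else 7

/** Hypothetical check: true when the open connection count is within `margin` of the quoted limit. */
fun isNearConnectionLimit(openConnections: Int, apiLevel: Int, margin: Int = 2): Boolean =
    openConnections >= quotedConnectionLimit(apiLevel) - margin
```

The application would still need its own count of open connections (or of connections it believes were not released) to feed into this check, because the limit applies to the whole system and not just to one app.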

@dariuszseweryn
Owner

Though it seems to be quite far from the original topic, that is an interesting idea. I will extract it as a separate issue. Unfortunately, I have quite a lot on my stack now and definitely not enough time.

4 participants