Summary: | App does not reconnect on wifi connect | ||
---|---|---|---|
Product: | [Applications] kdeconnect | Reporter: | Axel <thaS5egh2chae9vi> |
Component: | android-application | Assignee: | Albert Vaca Cintora <albertvaka> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | andrew.g.r.holmes, CoelacanthusHex, p.r.worrall, voidpointertonull+bugskdeorg |
Priority: | NOR | ||
Version: | unspecified | ||
Target Milestone: | --- | ||
Platform: | Android | ||
OS: | Android 12.x | ||
Latest Commit: | Version Fixed In: |
Description
Axel
2023-05-21 14:05:22 UTC
I see the same symptoms, including the log message. Opening the KDE Connect Settings on my computer sometimes triggers a reconnect, but sometimes I also need to click the refresh button one or two times. My phone's a Pixel 6a running Android 13 Operating System: Arch Linux KDE Plasma Version: 5.27.6 KDE Frameworks Version: 5.107.0 Qt Version: 5.15.10 Graphics Platform: Wayland Either I tracked down another issue, or the bug description is not really correct even though it's mostly what users are expected to observe without paying a lot of attention to details. I don't really know much about Android development, but attempted to see if the network status change callback is working, and it seems to be more than just fine, there's even quite a bit of "overachieving": ``` KDE/BackgroundService: onNetworkChange KDE/LanLinkProvider: identity packet received from a TCP connection from [...] KDE/LanLinkProvider: Starting SSL handshake with [...] trusted:true LanLinkProvider: Starting handshake KDE/LanLinkProvider: identity packet received from a TCP connection from [...] KDE/LanLinkProvider: Starting SSL handshake with [...] trusted:true KDE/LanLinkProvider: Handshake as client successful with [...] secured with TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA KDE/LanLinkProvider: Creating a new link for device [...] updateDeviceInfo: Updating supported plugins according to new capabilities LanLinkProvider: Starting handshake ``` Then the sockets (yes, two!) just manage to get closed. The PC side also shows the same oddity: ``` kdeconnect.core: LanLinkProvider started kdeconnect.core: Daemon started kdeconnect.core: Broadcasting identity packet kdeconnect.core: TCP connection done (i'm the existing device) kdeconnect.core: Starting server ssl (I'm the client TCP socket) kdeconnect.core: TCP connection done (i'm the existing device) kdeconnect.core: Starting server ssl (I'm the client TCP socket) kdeconnect.core: Socket successfully established an SSL connection kdeconnect.core: It is a known device "[...]" kdeconnect.core: Not loading plugin "kdeconnect_lockdevice" because device doesn't support it kdeconnect.core: Not loading plugin "kdeconnect_remotesystemvolume" because device doesn't support it kdeconnect.core: Not loading plugin "kdeconnect_remotecommands" because device doesn't support it kdeconnect.core: Not loading plugin "kdeconnect_virtualmonitor" because device doesn't support it error activating kdeconnectd: QDBusError("org.freedesktop.DBus.Error.NoReply", "Did not receive a reply. Possible causes include: the remote application did not send a reply, the message bus security policy blocked the reply, the reply timeout expired, or the network connection was broken.") ``` The connection can be observed in both GUIs for a tiny window of opportunity which is quite easy to miss. Take my guess with a bucket of salt, but I think there's a bug exposing a race condition. I'm not sure why are there 2 connections established, but I believe the two handshakes are racing because link update/replacement seems to be for only already established (handshake done) links. Moved to Wireshark to see who's naughty. One broadcast (with way too much information, now I get the "trusted" network concept) from the phone, then it makes two connections, but not even seeing the desktop telling the phone where should it connect to, that can't be right... unless the communication is on another interface. Currently not even running a multi-cable setup, but WiFi is also connected, so disabled that, and boom, now I can't even reproduce the issue even after turning on WiFi again. So overall I'm not sure if mine is yet another issue, but I seem to have the apparently not uncommon issue of multiple networks links not being handled well, so maybe those with tethering issues are also suffering from this. Please make sure you have only 1 network link on both sides (phone is likely to have just WiFi already which is the only relevant link as cellular connection shouldn't matter), then report back so it can be judged if it's the same issue I'm seeing, or we are dealing with multiple problems here. I remember seeing WiFi getting turned off when Ethernet was connected, just not sure how long ago was that, and of course this behavior heavily depends on the setup, but it may be a recent-ish change both for faster disconnect recovery and for letting MPTCP use both connections as support for that is evolving. Sorry for the (lack of proper) formatting, but even without the code block format working I believe the log chunks are short enough not to be disturbing. That's a different but maybe not completely separate issue. Both my computer and the phone have only one link to the local network. If my phone does connect, only one TCP connection is established. Your comment got me to look at this closer again, and I might have been misled by the Background execution not allowed message. I still get that message, but a few seconds later I also get onNetworkChange from the KDE Connect service: 08-14 16:48:00.436 965 1031 W BroadcastQueue: Background execution not allowed: receiving Intent { act=android.net.wifi.WIFI_STATE_CHANGED flg=0x4000010 (has extras) } to org.kde.kdeconnect_tp/org.kde.kdeconnect.KdeConnectBroadcastReceiver [...] 08-14 16:48:05.194 30572 30623 D KDE/BackgroundService: onNetworkChange I checked with Wireshark and my computer does not receive any broadcasts from the phone (it does if I restart the KDE Connect service on the phone). There are no later log messages from the KDE Connect app. I don't see anywhere in LanLinkProvider.sendUdpIdentityPacket(List<InetAddress> ipList) that the socket for broadcasting the identity packet is bound to an interface resp. address. If I interpret Android's logs correctly at the time onNetworkChange is called the wifi network is not yet the default, a few hundred ms later I see messages like these: 08-14 16:48:05.416 965 1248 D SemHotspotMobileDataLimit: Default network changed to : 507 [...] 08-14 16:48:05.425 965 1213 D ConnectivityService: Sending DISCONNECTED broadcast for type 0 [502 CELLULAR] isDefaultNetwork=true [...] 08-14 16:48:05.427 965 1213 D ConnectivityService: Sending CONNECTED broadcast for type 1 [507 WIFI] isDefaultNetwork=true I guess the broadcast gets send over the wrong link, cellular instead of wifi. I'll try if I can validate if this is the cause for this issue. The identity packet is broadcast on the wrong network interface because onNetworkChange is called before the wifi interface becomes the default. I logged the current default network in sendUdpIdentityPacket: diff --git a/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java b/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java index e7983928..651b9025 100644 --- a/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java +++ b/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java @@ -368,6 +375,11 @@ public class LanLinkProvider extends BaseLinkProvider { return; } + ConnectivityManager connectivityManager = this.context.getSystemService(ConnectivityManager.class); + Network network = connectivityManager.getActiveNetwork(); + LinkProperties linkProperties = connectivityManager.getLinkProperties(network); + Log.i("LanLinkProvider", "default network interface: " + linkProperties.getInterfaceName()); + NetworkPacket identity = NetworkPacket.createIdentityPacket(context); identity.set("tcpPort", tcpServer.getLocalPort()); and got the cellular interface instead of wifi: 2023-08-15 16:25:14.906 21199-21239 KDE/BackgroundService org.kde.kdeconnect_tp D onNetworkChange 2023-08-15 16:25:14.933 21199-21294 LanLinkProvider org.kde.kdeconnect_tp I default network interface: rmnet0 to validate this I put a 500ms sleep before the call to sendUdpIdentityPacket in broadcastUdpIdentityPacket and got the correct interface and the phone connected to the computer: 2023-08-15 16:27:23.245 21637-21683 KDE/BackgroundService org.kde.kdeconnect_tp D onNetworkChange 2023-08-15 16:27:23.751 21637-21687 LanLinkProvider org.kde.kdeconnect_tp I default network interface: wlan0 2023-08-15 16:27:23.786 21637-21687 KDE/LanLinkProvider org.kde.kdeconnect_tp I identity packet received from a TCP connection from [...] I removed the sleep and hacked onNetworkChange to pass the new network from the callback on to sendUdpIdentityPacket and bind the socket to the right interface: diff --git a/src/org/kde/kdeconnect/BackgroundService.java b/src/org/kde/kdeconnect/BackgroundService.java index 10c73ab2..b37aa6c8 100644 --- a/src/org/kde/kdeconnect/BackgroundService.java +++ b/src/org/kde/kdeconnect/BackgroundService.java @@ -136,7 +136,7 @@ public class BackgroundService extends Service { @Override public void onAvailable(Network network) { connectedToNonCellularNetwork.postValue(true); - onNetworkChange(); + onNetworkChange(network); } @Override public void onLost(Network network) { diff --git a/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java b/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java index e7983928..dc6c391e 100644 --- a/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java +++ b/src/org/kde/kdeconnect/Backends/LanBackend/LanLinkProvider.java }); } @WorkerThread - public void sendUdpIdentityPacket(List<InetAddress> ipList) { + public void sendUdpIdentityPacket(Network network, List<InetAddress> ipList) { if (tcpServer == null || !tcpServer.isBound()) { Log.i("LanLinkProvider", "Won't broadcast UDP packet if TCP socket is not ready yet"); return; } @@ -382,11 +397,17 @@ public class LanLinkProvider extends BaseLinkProvider { DatagramSocket socket; try { socket = new DatagramSocket(); + if (network != null) { + network.bindSocket(socket); + } socket.setReuseAddress(true); socket.setBroadcast(true); } catch (SocketException e) { Log.e("KDE/LanLinkProvider", "Failed to create DatagramSocket", e); return; } Works fine but should probably be done a bit more careful for real: 2023-08-15 16:49:26.104 22943-22985 KDE/BackgroundService org.kde.kdeconnect_tp D onNetworkChange 2023-08-15 16:49:26.111 22943-22992 LanLinkProvider org.kde.kdeconnect_tp I default network interface: rmnet0 2023-08-15 16:49:26.115 22943-22992 LanLinkProvider org.kde.kdeconnect_tp I callback network interface: wlan0 2023-08-15 16:49:26.146 22943-22992 KDE/LanLinkProvider org.kde.kdeconnect_tp I identity packet received from a TCP connection from [...] The issue described by Pedro where we would create two sockets and that would cause a race condition where neither connection works should be fixed by this commit: https://invent.kde.org/network/kdeconnect-android/-/commit/5c99b1d32e697e842ca972c4fb7f0dabbad1e576 I will soon make a release including this fix. The issue where we broadcast too soon and the WiFi network isn't the default yet seems an also real but different problem. Axel, since you wrote a patch for this issue, maybe you can open a merge request on Gitlab? https://invent.kde.org/network/kdeconnect-android Maybe a cleaner alternative to passing the Network to the sendUdpIdentityPacket function would be to use bindProcessToNetwork in the networkChange callback, which should affect all the sockets that are created from that point: https://developer.android.com/reference/android/net/ConnectivityManager#bindProcessToNetwork(android.net.Network) This should also guarantee that we never broadcast over a cellular network, which seems like an improvement. And btw, thanks for catching this issue and investigating it! A possibly relevant merge request was started @ https://invent.kde.org/network/kdeconnect-android/-/merge_requests/391 I created a merge request using bindProcessToNetwork as you suggested, it works just as well and is a lot simpler. Version 1.28.0 has been released which should fix this issue. Please reopen if it doesn't. |