Version: (using Devel)
Installed from: Compiled sources
OS: Linux

Konqueror, Akregator and KTorrent nearly always fail to connect to the remote server when using an FQDN. When using an IP address instead, things work better. With "KDE_FORK_SLAVES=true" I always see a similar error message. Here is an example from Konqueror:

###########
kio_http(9761)/kio_http_debug HTTPProtocol::get: "http://heise.de/"
kio_http(9761)/kio_http_debug HTTPProtocol::checkRequestUrl: "http://heise.de/"
kio_http(9761)/kio_http_debug HTTPProtocol::resetSessionSettings: Using proxy: false URL: "" Realm: ""
kio_http(9761)/kio_http_debug HTTPProtocol::resetSessionSettings: Enable Persistent Proxy Connection: false
kio_http(9761)/kio_http_debug HTTPProtocol::resetSessionSettings: Window Id = "65011713"
kio_http(9761)/kio_http_debug HTTPProtocol::resetSessionSettings: ssl_was_in_use = ""
kio_http(9761)/kio_http_debug HTTPProtocol::retrieveContent:
kio_http(9761)/kio_http_debug HTTPProtocol::retrieveHeader:
kio_http(9761)/kio_http_debug HTTPProtocol::httpOpen:
kio_http(9761)/kio_http_debug HTTPProtocol::isOffline: networkstatus <unreachable>
kio_http(9761)/kio_http_debug HTTPProtocol::httpCheckConnection: Keep Alive: true First: false
kio_http(9761)/kio_http_debug HTTPProtocol::httpOpen: Calling checkCachedAuthentication
kio_http(9761)/kio (kioslave) KIO::SlaveBase::checkCachedAuthentication: window = 65011713 url = KUrl("http://heise.de/")
kio_http(9761) HTTPProtocol::httpOpen: ============ Sending Header:
kio_http(9761) HTTPProtocol::httpOpen: "GET / HTTP/1.1"
kio_http(9761) HTTPProtocol::httpOpen: "Connection: Keep-Alive"
kio_http(9761) HTTPProtocol::httpOpen: "User-Agent: Mozilla/5.0 (compatible; Konqueror/4.0; Linux) KHTML/4.0.74 (like Gecko)"
kio_http(9761) HTTPProtocol::httpOpen: "Accept: text/html, image/jpeg, image/png, text/*, image/*, */*"
kio_http(9761) HTTPProtocol::httpOpen: "Accept-Encoding: x-gzip, x-deflate, gzip, deflate"
kio_http(9761) HTTPProtocol::httpOpen: "Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5"
kio_http(9761) HTTPProtocol::httpOpen: "Accept-Language: en-US, en"
kio_http(9761) HTTPProtocol::httpOpen: "Host: heise.de"
kio_http(9761)/kio_http_debug HTTPProtocol::httpOpenConnection:
kio_http(9761)/kssl KIO::TCPSlaveBase::disconnectFromHost:
kio_http(9761)/kssl KIO::TCPSlaveBase::connectToHost: before connectToHost: Socket error is 0 , Socket state is 0
kio_http(9761)/kssl KIO::TCPSlaveBase::connectToHost: after connectToHost: Socket error is 0 , Socket state is 1
konqueror(9740)/kdeui (KMainWindow) KMainWindow::saveMainWindowSettings: KMainWindow::saveMainWindowSettings "Profile"
kio_http(9761)/kssl KIO::TCPSlaveBase::connectToHost: after waitForConnected: Socket error is 6 , Socket state is 0 , waitForConnected returned false
kio_http(9761)/kio_http_debug HTTPProtocol::httpOpen: Couldn't connect, oopsie!
kio_http(9761)/kio_http_debug HTTPProtocol::httpClose:
kio_http(9761)/kio_http_debug HTTPProtocol::httpClose: keep alive ( 60 )
###########

And here is one from KTorrent:

###########
kio_http(7590)/kio_http_debug HTTPProtocol::reparseConfiguration:
kio_http(7590)/kio_http_debug HTTPProtocol::setHost: Hostname is now: "tracker.opensuse.org" ( "tracker.opensuse.org" )
kio_http(7590)/kio_http_debug HTTPProtocol::get: "http://tracker.opensuse.org:6969/announce?peer_id=-KT31B2-aAQczbJuj2Za&port=6881&uploaded=0&downloaded=0&left=688128000&compact=1&numwant=100&key=398488384&event=started&info_hash=%96!%e9%f2%15%cdd%88%af%23%0fT%fa%1d%22H6%08%d2%03"
kio_http(7590)/kio_http_debug HTTPProtocol::checkRequestUrl: "http://tracker.opensuse.org:6969/announce?peer_id=-KT31B2-aAQczbJuj2Za&port=6881&uploaded=0&downloaded=0&left=688128000&compact=1&numwant=100&key=398488384&event=started&info_hash=%96!%e9%f2%15%cdd%88%af%23%0fT%fa%1d%22H6%08%d2%03"
kio_http(7590)/kio_http_debug HTTPProtocol::resetSessionSettings: Using proxy: false URL: "" Realm: ""
kio_http(7590)/kio_http_debug HTTPProtocol::resetSessionSettings: Enable Persistent Proxy Connection: false
kio_http(7590)/kio_http_debug HTTPProtocol::resetSessionSettings: Window Id = ""
kio_http(7590)/kio_http_debug HTTPProtocol::resetSessionSettings: ssl_was_in_use = ""
kio_http(7590)/kio_http_debug HTTPProtocol::retrieveContent:
kio_http(7590)/kio_http_debug HTTPProtocol::retrieveHeader:
kio_http(7590)/kio_http_debug HTTPProtocol::httpOpen:
kio_http(7590)/kio_http_debug HTTPProtocol::isOffline: networkstatus <unreachable>
kio_http(7590)/kio_http_debug HTTPProtocol::httpCheckConnection: Keep Alive: true First: false
kio_http(7590)/kio_http_debug HTTPProtocol::httpOpen: Calling checkCachedAuthentication
kio_http(7590)/kio (kioslave) KIO::SlaveBase::checkCachedAuthentication: window = 0 url = KUrl("http://tracker.opensuse.org:6969/announce?peer_id=-KT31B2-aAQczbJuj2Za&port=6881&uploaded=0&downloaded=0&left=688128000&compact=1&numwant=100&key=398488384&event=started&info_hash=%96!%e9%f2%15%cdd%88%af%23%0fT%fa%1d%22H6%08%d2%03")
kio_http(7590) HTTPProtocol::httpOpen: ============ Sending Header:
kio_http(7590) HTTPProtocol::httpOpen: "GET /announce?peer_id=-KT31B2-aAQczbJuj2Za&port=6881&uploaded=0&downloaded=0&left=688128000&compact=1&numwant=100&key=398488384&event=started&info_hash=%96!%e9%f2%15%cdd%88%af%23%0fT%fa%1d%22H6%08%d2%03 HTTP/1.1"
kio_http(7590) HTTPProtocol::httpOpen: "Connection: Keep-Alive"
kio_http(7590) HTTPProtocol::httpOpen: "User-Agent: KTorrent/3.1beta2"
kio_http(7590) HTTPProtocol::httpOpen: "Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"
kio_http(7590) HTTPProtocol::httpOpen: "Accept-Encoding: x-gzip, x-deflate, gzip, deflate"
kio_http(7590) HTTPProtocol::httpOpen: "Host: tracker.opensuse.org:6969"
kio_http(7590)/kio_http_debug HTTPProtocol::httpOpenConnection:
kio_http(7590)/kssl KIO::TCPSlaveBase::disconnectFromHost:
kio_http(7590)/kssl KIO::TCPSlaveBase::connectToHost: before connectToHost: Socket error is 6 , Socket state is 0
kio_http(7590)/kssl KIO::TCPSlaveBase::connectToHost: after connectToHost: Socket error is 6 , Socket state is 1
kio_http(7590)/kssl KIO::TCPSlaveBase::connectToHost: after waitForConnected: Socket error is 6 , Socket state is 0 , waitForConnected returned false
kio_http(7590)/kio_http_debug HTTPProtocol::httpOpen: Couldn't connect, oopsie!
###########

If you need more information or want me to debug something, just ask.
Is this related to the "Unknown error" that happens in Konqueror and KGet and was mentioned in bug report #154774? The conclusion in that bug report is that the problem isn't in Konqueror, but deeper.
If I understood #kde-devel correctly, this should be the same bug. Does KTorrent work for you? And what happens if you set "KDE_FORK_SLAVES=true"? Do you also find "kio_http(9761)/kssl KIO::TCPSlaveBase::connectToHost: after waitForConnected: Socket error is 6 , Socket state is 0 , waitForConnected returned false " in the debug output?
So, Modestas Vainius found the problem in my wireshark logs: KIO sends two DNS requests at nearly the same time. The router gets confused by this and answers with *one* reply, sent to the port of the first request but carrying the transaction ID of the second packet. The resulting packet is invalid and gets discarded by glibc. Now we know the reason; the question is just whether there's a way to work around kio (or something in Qt, or...) making two requests in a row. Another issue is that kio makes a new DNS request for each query and doesn't share the resolved IP between several kio instances. But that's another story...
So, here you can find the debug output of the Qt demo browser and Konqueror with QtNetwork debugging information enabled: http://alioth.debian.org/~trigger-guest/the_broken_dns_case/ It seems as if they are both doing the same thing, but the Qt browser succeeds while Konqueror fails. Who sees the important difference? I don't right now.
Oh, one thing: konqueror also can't load trolltech.com and the qt browser can load heise.de...
The difference between the KDE and Qt logs from comment #4 is that KDE uses waitForConnected(), while Qt/WebKit uses some hostFound() slot. A quick peek into Qt reveals that waitForConnected() aborts the current DNS lookup and starts another one (cf. file qabstractsocket.cpp, line 1464 from qt-4.4.0 release)
*** This bug has been confirmed by popular vote. ***
Created attachment 24956 [details] Separated host lookup and connectToHost in TCPSlaveBase.

waitForConnected() indeed aborts a running asynchronous DNS lookup initiated by connectToHost(QString host,...), looks up the host itself and eventually runs into a Qt bug. The single QHostInfoAgent instance is apparently not thread-safe due to improper locking of QHostInfoAgent::queries, but is nevertheless accessed from two different threads: the one initiated by connectToHost(..), and the main thread, when waitForConnected() tries to abort the lookup.

Attached is a patch for TCPSlaveBase::connectToHost. The host lookup is done via QHostInfo::fromName(), and then KTcpSocket::connectToHost(QHostAddress,..) is called instead of KTcpSocket::connectToHost(QString host,..).
Please give me the details of the QHostInfo issues. Or, better yet, report them to qt-bugs@trolltech.com. I'll take care of fixing it there.
There may be a small problem with the patch: it simply uses the first resolved IP and never tries the others (if any). Maybe just do a foreach over the list of returned IPs until it finds an IP for which waitForConnected() does not return failure? Maybe try them all at once and pick the first one that works? Maybe the lookup returns both the IPv6 and IPv4 addresses, and selecting the wrong one causes a further delay. I cannot test the patch, as I do not run KDE from compiled sources. Does it work around the Fritz router problem? Armin? Daniel?
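The "foreach over the list of returned IPs" idea could look roughly like the sketch below. It is not taken from the attached patch: connectToFirstWorking is a made-up name, addresses are plain strings instead of QHostAddress, and tryConnect is a hypothetical stand-in for "connectToHost(address) followed by waitForConnected()".

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>
#include <vector>

// Try each resolved address in order and return the first one for which
// tryConnect reports success. tryConnect is a hypothetical stand-in for
// "connectToHost(address) followed by waitForConnected()".
std::optional<std::string> connectToFirstWorking(
    const std::vector<std::string> &addresses,
    const std::function<bool(const std::string &)> &tryConnect)
{
    for (const std::string &address : addresses) {
        if (tryConnect(address)) {
            return address;  // stop at the first address that connects
        }
    }
    return std::nullopt;     // none of the addresses worked
}
```

Trying the addresses one after another keeps the logic simple, but a dead first address (for example an unreachable IPv6 one) still costs a full connect timeout before the next one is tried.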
I will test the patch in a minute. Isn't it the job of the glibc to give you a random IP? I don't remember the details, but I thought this is the case. The only problem is the ipv6 lookup btw. If this doesn't succeed an ipv4 lookup will happen afterwards.
Created attachment 24959 [details] Separated host lookup and connectToHost in TCPSlaveBase.
Created attachment 24960 [details] Separated host lookup and connectToHost in TCPSlaveBase.
Regarding the getaddrinfo() usage in qhostinfo_unix.cpp:
Recommendation to always set AI_ADDRCONFIG: http://people.redhat.com/drepper/linux-rfc3484.html
Patch applied to APR: http://marc.info/?l=apr-dev&m=105836879006735&w=2
But also note that this has not always been effective: http://sources.redhat.com/ml/glibc-bugs/2007-06/msg00014.html
So people, we got news from #kde-devel. It would be cool if someone could try the following workaround for crappy crappy routers: Find 'hints.ai_family = PF_UNSPEC;' in src/network/kernel/qhostinfo_unix.cpp of your Qt copy and add the following on the next line: 'hints.ai_flags = AI_ADDRCONFIG;'. Afterwards rebuild libQtNetwork. Then disable IPv6 for your box (e.g. add 'alias net-pf-10 off' to /etc/modprobe.d/aliases on Debian systems) and restart your computer. Now Qt shouldn't emit any IPv6 DNS lookups and your router shouldn't go crazy anymore. thiago wanted me to try this, but hasn't commented so far on whether this will go upstream (but I guess there are no reasons not to add it). But anyway kio shouldn't make two lookups at once, and it should cache DNS requests... ;)
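For anyone who wants to see the effect of the flag outside of Qt, here is a small standalone sketch of the same hints setup. makeLookupHints and resolveWithHints are made-up helper names and this is not the code from qhostinfo_unix.cpp; only the two hints fields come from the workaround above.

```cpp
#include <cassert>
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>

// Build getaddrinfo() hints the way the workaround describes:
// any address family, restricted by AI_ADDRCONFIG.
addrinfo makeLookupHints()
{
    addrinfo hints;
    std::memset(&hints, 0, sizeof(hints));
    hints.ai_family = PF_UNSPEC;
    // Only ask for address families that are configured on this machine,
    // so a box without an IPv6 address should not need AAAA answers.
    hints.ai_flags = AI_ADDRCONFIG;
    return hints;
}

// Illustrative helper: resolve a host name with those hints, free the
// result list, and return getaddrinfo()'s result code (0 on success).
int resolveWithHints(const char *hostName)
{
    addrinfo hints = makeLookupHints();
    addrinfo *results = nullptr;
    int rc = getaddrinfo(hostName, nullptr, &hints, &results);
    if (rc == 0) {
        freeaddrinfo(results);
    }
    return rc;
}
```

Calling resolveWithHints("heise.de") while capturing port 53 should show whether AAAA queries still leave the machine; whether they do may depend on the glibc version and on IPv6 really being disabled.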
I'm applying that fix to Qt 4.4.1.
After rereading run() and fromName() in QHostInfoAgent, the supposed Qt bug dissolves and is reduced to the fact that a call to waitForConnected() almost always fails to abort a host lookup query triggered by connectToHost(), so that fromName() and eventually getaddrinfo() are called at the same time from different threads and block under certain circumstances. The proposed hints.ai_flags = AI_ADDRCONFIG (or even hints.ai_family = AF_INET) solves the problem only partially, because parallel calls to getaddrinfo() with a nonexistent name like "mumbel.grumbel" still block for a long time. So I think it is still better if TCPSlaveBase "serializes" calls to getaddrinfo(), either by waiting for the connected() signal or by separating host lookup and connectToHost. After all, the Qt Demo Browser works fine with hints.ai_family = AF_UNSPEC. The weak spot is its habit to post duplicates on bugs.kde.org ...
SVN commit 830140 by thiago:

Make IOSlaves based on TCPSlaveBase request DNS resolution via the application. And make the application cache results for 5 minutes.

This should avoid the DNS request storm that happens when loading webpages. Whereas this is completely normal and has been done for years, apparently we're doing something different now that causes some cheap routers to lock up or fail to respond. Those defective routers should be replaced, but while they aren't, we introduce a cache.

Patch by Roland Harneau <truthandprogress@googlemail.com>

BUG:162600
CCMAIL:<truthandprogress@googlemail.com>

 M +1 -0 CMakeLists.txt
 M +2 -1 kio/global.h
 A kio/hostinfo.cpp [License: LGPL]
 A kio/hostinfo_p.h [License: LGPL]
 M +33 -0 kio/slavebase.cpp
 M +11 -0 kio/slavebase.h
 M +15 -0 kio/slaveinterface.cpp
 M +5 -2 kio/slaveinterface.h
 M +3 -1 kio/slaveinterface_p.h
 M +29 -17 kio/tcpslavebase.cpp

WebSVN link: http://websvn.kde.org/?view=rev&revision=830140
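As a rough illustration of what "cache results for 5 minutes" means, here is a minimal sketch of a host-name cache with a fixed time-to-live. HostCache is a hypothetical class and is not the code from kio/hostinfo.cpp; addresses are plain strings, and the current time is passed in explicitly so the expiry logic is easy to follow.

```cpp
#include <cassert>
#include <chrono>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal host-name cache with a fixed time-to-live (hypothetical class,
// not the implementation committed to kdelibs).
class HostCache {
public:
    using Clock = std::chrono::steady_clock;

    explicit HostCache(std::chrono::seconds ttl = std::chrono::minutes(5)) : m_ttl(ttl) {}

    // Remember the addresses for a host together with the time they were stored.
    void insert(const std::string &host, std::vector<std::string> addresses, Clock::time_point now)
    {
        m_entries[host] = Entry{std::move(addresses), now};
    }

    // Return the cached addresses, or nothing if the entry is missing or expired.
    std::optional<std::vector<std::string>> lookup(const std::string &host, Clock::time_point now)
    {
        auto it = m_entries.find(host);
        if (it == m_entries.end()) {
            return std::nullopt;
        }
        if (now - it->second.storedAt >= m_ttl) {
            m_entries.erase(it);  // expired: drop it so the caller resolves again
            return std::nullopt;
        }
        return it->second.addresses;
    }

private:
    struct Entry {
        std::vector<std::string> addresses;
        Clock::time_point storedAt;
    };

    std::chrono::seconds m_ttl;
    std::unordered_map<std::string, Entry> m_entries;
};
```

A cache like this only helps for names that were resolved recently; the first visit to a new host still needs a real lookup.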
I'd like to reopen this bug. It's unacceptable. Every other browser works with the Fritz boxes. Only Konqueror 4 is .. forget it.
Reopen? Why? Have you tested the patch I posted, from Roland? If it doesn't work, let us know! Mind you that there's a big, recent DNS vulnerability found this week. This will require *most* DNS servers in the world to be replaced. You may want to start doing it now.
> Reopen? Why? Have you tested the patch I posted, from Roland? If it doesn't work, let us know! Well, it didn't seem to work for me. I updated kdelibs, but still Konqueror was unusable for browsing because hostname resolving took so much time. In the meantime, I've configured the DNS of my system to bypass my router and go directly to the DNS servers of my ISP, which fixes the problem. If you want to, I can provide network traffic dumps with my old configuration.
If you have traffic dumps with the patch applied (KDE trunk, to-be-4.2, it's not in the 4.1 line *yet*), it will probably be useful.
*** Bug 155157 has been marked as a duplicate of this bug. ***
Reopening, as I can reproduce it on trunk in r831382.
Created attachment 26063 [details] Traffic dump This is a traffic dump of Konqueror trying to connect to www.test.de. I only captured UDP port 53. My kdelibs is trunk r831382, so it has the "DNS caching" already applied (although I don't really see how caching helps, since 80% of the time you visit a new webpage anyway)
*** Bug 166366 has been marked as a duplicate of this bug. ***
I have a Fritz box too, and I'm suffering from this issue. I just did a tcpdump and am wondering whether KDE or the router behaves incorrectly. I'll attach the whole dump; here's a short partial summary:

Time       Src     Dst     Src P.  Dst P.  IP ID   DNS Trans ID
(
15.013162  client  router  27138   53      0x5dcb  0x8f09
15.014707  client  router  6241    53      0x5dcc  0xf83d
15.054842  router  client  53      27138   0x0284  0xf83d
)
20.013119  client  router  24440   53      0x7154  0x0279
20.016952  client  router  13523   53      0x7155  0xdbe4
20.051192  router  client  53      24440   0x0285  0xdbe4
25.009072  client  router  24440   53      0x7155  0x0279
25.020018  client  router  13523   53      0x7156  0xdbe4
25.057100  router  client  53      24440   0x0286  0xdbe4

So you can see that the client requests twice on port 24440 with a different IP Identifier, but it also sends requests from port 13523 (maybe the other kio slave running in parallel?) with the same IP Identifier. Looking at the reply, the router for example sends it to source port 24440 but uses the DNS transaction ID of the request which came from port 13523. Maybe the router gets confused because both source ports use the same IP Identifier? I'm not a networking expert, but something is wrong here...
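The check done by eye on the summary above can also be written down as a few lines of code. This is only a sketch: Packet and findMisroutedReplies are made-up names, and it assumes that a reply is valid only if its DNS transaction ID was previously sent from the client port the reply is delivered to.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// One row of the summary, reduced to the fields that matter here
// (hypothetical struct for illustration).
struct Packet {
    bool fromClient;           // true: client -> router, false: router -> client
    std::uint16_t clientPort;  // the client-side UDP port of the exchange
    std::uint16_t dnsId;       // DNS transaction ID
};

// Return the replies whose DNS ID was never sent from the port they were delivered to.
std::vector<Packet> findMisroutedReplies(const std::vector<Packet> &trace)
{
    std::multimap<std::uint16_t, std::uint16_t> sentIdsByPort;
    std::vector<Packet> misrouted;
    for (const Packet &p : trace) {
        if (p.fromClient) {
            sentIdsByPort.emplace(p.clientPort, p.dnsId);
            continue;
        }
        bool matched = false;
        auto range = sentIdsByPort.equal_range(p.clientPort);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == p.dnsId) {
                matched = true;
                break;
            }
        }
        if (!matched) {
            misrouted.push_back(p);  // wrong port/ID pairing
        }
    }
    return misrouted;
}
```

Fed with the three packets around second 20 (ports 24440 and 13523, IDs 0x0279 and 0xdbe4), it flags the router's reply to port 24440 carrying ID 0xdbe4.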
Created attachment 26086 [details] complete verbose dump
Created attachment 26087 [details] dump in pcap format
Me again ;) I could reproduce the long delays without KDE as follows:
1. Get the example getaddrinfo C program from http://www.logix.cz/michal/devel/various/getaddrinfo.c (I found this via Google).
2. Compile it: gcc -o getaddrinfo getaddrinfo.c
3. Run it twice in parallel: ./getaddrinfo google.com & ./getaddrinfo google.com
It takes 20 seconds to perform this lookup. Using two different hostnames returns immediately. So it's definitely a bug in the router. It sends the reply to the UDP port of process 1 but uses the DNS transaction ID of process 2. That surely doesn't work. I've contacted AVM support with a detailed description. Let's see what they'll tell me (I hope they come up with a solution and not a stupid standard answer). I have an older model (7050) which may not be supported any longer, so they might just say "buy a newer model". Thanks for the DNS cache, which will make surfing faster and reduce DNS traffic, but this really seems to be a bug in the AVM Fritz boxes.
Indeed, Robin, your router is replying to port 24440 the request of ID 0xdbe4 (56292). But that request came from port 13523. That is definitely a router bug. Also note that the entirety of the DNS handling is done by glibc. We have no control over source ports or request IDs. I can't see anything in Thomas's log.
I also own a Fritz!Box (7270) and have these issues but I can reproduce the problem in the vpn at the university so I don't think it depends especially on the fritz!box... I will try to capture a log like Thomas' one.
A capture like Thomas's is much more difficult to read. Please attach a pcap file. (tcpdump -w /tmp/capture.pcap port 53)
I have got the same issue with a 7170. But I use pdnsd as a local DNS cache, so I would assume that a built-in cache wouldn't help that much here.
Uhm, am I wrong, or did you all just rediscover what was already mentioned in comment #3? The DNS storm should be gone with this patch, but the two DNS lookups, one of which gets terminated (see comment #8), are still there. This double DNS lookup at the same time is what kills these freaking routers.
WRT comment #34: Pdnsd can't help here, because there is nothing it could cache. The ipv6 lookup fails, so it has nothing in the cache and will forward all and every request directly to the router.
The DNS cache patch makes the double lookup also disappear. As you can see in http://websvn.kde.org/trunk/KDE/kdelibs/kio/kio/tcpslavebase.cpp?r1=830140&r2=830139&pathrev=830140, we no longer call connectToHost with a hostname, but with each of the IPs looked up.
@#35, well I tried to find out if KDE might do something wrong which may confuse the router. I found the duplicate IP Identification and thought this might have caused the confusion. But the getaddrinfo procedure I described does not produce this duplicate IP Identification but still triggers the bug in the router. So it's verified that the router really is buggy. I'd like to test the dns cache patch but don't have time to compile KDE from svn. I'll have to wait for some updated opensuse RPMs...
The patch prevents parallel resolutions of the same name (with glibc's getaddrinfo() a "lookup" in the typical case (existing A record but no AAAA record) consists of two AAAA requests and one A request), but allows concurrent resolutions of different names. My router (AVM fritzbox WLAN 3030) can handle the latter case. To prevent concurrent lookups in general, try the following: in kdelibs/kio/kio/slaveinterface.cpp insert the line
    #include <QtNetwork/QHostInfo>
and change line 343 from
    HostInfo::lookupHost(hostName, this, SLOT(slotHostInfo(QHostInfo)));
to
    QHostInfo::lookupHost(hostName, this, SLOT(slotHostInfo(QHostInfo)));
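For readers who want to see the "no parallel resolutions of the same name" idea in isolation, here is a small sketch. LookupCoalescer is a hypothetical class and not the code from the patch; it assumes the start function only kicks off an asynchronous lookup and that finished() is called later with the result.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: ensures at most one resolution per host name is in
// flight, while lookups of different names may still run concurrently.
class LookupCoalescer {
public:
    using Callback = std::function<void(const std::string &address)>;
    using StartLookup = std::function<void(const std::string &host)>;

    // start is assumed to begin an asynchronous lookup and return immediately.
    explicit LookupCoalescer(StartLookup start) : m_start(std::move(start)) {}

    // Queue a callback for host; only the first request for a name starts a lookup.
    void request(const std::string &host, Callback callback)
    {
        std::vector<Callback> &waiters = m_pending[host];
        const bool alreadyRunning = !waiters.empty();
        waiters.push_back(std::move(callback));
        if (!alreadyRunning) {
            m_start(host);
        }
    }

    // Deliver the result to everyone who asked for host while the lookup ran.
    void finished(const std::string &host, const std::string &address)
    {
        auto it = m_pending.find(host);
        if (it == m_pending.end()) {
            return;
        }
        std::vector<Callback> waiters = std::move(it->second);
        m_pending.erase(it);
        for (const Callback &cb : waiters) {
            cb(address);
        }
    }

private:
    StartLookup m_start;
    std::map<std::string, std::vector<Callback>> m_pending;
};
```

Two requests for www.test.de made back to back then produce a single lookup, while a request for a different name starts its own.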
fritz boxes are working. No browser shows any problems with them. The problem is konqueror, kio.. kde. Btw, the Fritz boxes are possibly the most popular routers in Germany.
Created attachment 26256 [details] Traffic dump with tcpdump while trying to connect to www.test.de with Konqueror4
Created attachment 26257 [details] Traffic dump with tcpdump while trying to connect to www.test.de with Konqueror3
> To prevent concurrent lookups in general try the following: > In kdelibs/kio/kio/slaveinterface.cpp insert the line > [SNIP] This didn't make a difference for me. Hostname lookups are still slow with current kdelibs from trunk, with or without that change. For the record, my router is a D-Link DSL-G664T. Note that the webkit demo browser from qt-copy also has the same problem, and KDE3 doesn't have this problem at all. Seems to be a change in Qt which triggered this new behavior of host lookups.
Regarding router brands: I am using a Netgear WGR614v5. Given the variety of models being used, I am careful not to blame the problem on a router bug. It must rather be the software on my system, or maybe my ISP's DNS, which is said to be bad. But things have changed for me suddenly: since yesterday my connection problems are gone! What did I do? Unfortunately too many things at the same time. I did the first "svn up" of kdelibs in days and got lots of new packages through "apt-get upgrade" from Debian testing. Previously I had already applied the qhostinfo_unix.cpp patch and turned on IPv6 support, which seemed to improve things a little bit. And I believe I had already been using the DNS cache patch but didn't notice a difference (could be wrong).
@Thomas: your system is still sending IPv6 requests. Either you have IPv6 addresses in your machine or you haven't patched Qt to not send the requests. Aside from that, there's nothing wrong with your trace. The entire exchange of your Konqueror paste is less than 30 seconds. There's absolutely nothing wrong with it, neither from Konqueror's side nor from your router. It does lookup twice, however. @comment 40 (usa): please see comment 30.
@Thomas: the suggested change from KIO::HostInfo::lookupHost to QHostInfo::lookupHost should exactly mimic the behavior of Qt's WebKit showcase regarding DNS requests. But if the latter fails, this is of course pointless. Your Konqui4 traffic dump is somewhat puzzling, e.g. Nos. 1 and 3 come from the same port (1116) but request the resolution of different names (www.test.de and www.test.de.site), which is not typical for getaddrinfo on glibc-based Linux. Are you behind a firewall? Anyway, you should follow Thiago's suggestion, i.e. patch Qt according to #15 and disable IPv6 in your system. @Harri: The AVM Fritzbox routers are definitely buggy: they can't handle parallel requests to resolve the same name, and Thomas' D-Link seems to have problems with IPv6 queries in general. Apparently every vendor has a unique way to implement crappiness.
Patching qt-copy as described in comment #15 fixed this issue for me. Thiago and Roland: Sorry for stealing your time. I thought the KIO patch alone would fix the problem and didn't see comment #15 before.
Just for your information: The patch referenced by Comment #18 From Thiago Macieira will introduce problems with ssl authentication. See Bug #167166
Any chance this (and this: http://websvn.kde.org/?view=rev&revision=832072 which fixes the regression) can be backported for 4.1.1?
Not yet. It's not without issues, so we have to work on it a little bit more.
*** Bug 168921 has been marked as a duplicate of this bug. ***
The bug still exists in kde 4.1.1 in the normal arch-packages (not kdemod).
The bug is not closed. You don't have to tell us it still exists: we're not claiming otherwise.
I have installed privoxy on my home PC (default settings) and changed Konqueror's settings to use it. It seems to make it connect to websites faster. Still slower than ff.
The bug is still present in KDE-4.1.1 (amd64, precompiled debian experimental packages). I am using a fritzbox. No program besides KDE4 is affected.
Comment 55: this bug is still open, so yeah we are aware that it's still present.
This just occurred to me: Has anyone contacted the manufacturer of the defective hardware and reported the issue? If no one does, they'll never know they have a bug to fix.
I sent a problem description to ASUS Technical Support, because I have this issue with my ASUS 6020. I got an answer in my native language, so I'll try to reproduce it here in English: "Hello! Thank you for contacting ASUS Technical Support Service. The ASUS Company doesn't support OS Linux. There is no such problem with computers where OS Windows is installed." A very strange answer, because the ASUS 6020 is an embedded Linux device, AFAIK. I think the ASUS Russian support team is not competent in these questions, so they gave me such a stupid answer. Who can get more?
Maybe try to politely remind them they're selling GNU/Linux machines themselves, in particular some editions of the eeePC...
I wrote a 2nd request to ASUS, this time in English. The answer came from the Russian office again, in Russian of course. I pointed out to them that the ASUS eeePC 7xx/9xx ships with Linux. The answer was (translated from Russian again): "Due to the large number of Linux distros, the ASUS Company supports only those users who use a Linux distro developed by ASUS and installed on a computer manufactured by ASUS." The sheriff doesn't care about the Indians' problems, does he? I shall not buy any ASUS notebook in the near or far future. The eeePC Linux doesn't include KDE4, so I don't know how I could call ASUS support again. Has anybody installed KDE4 on Windows? Is this bug present on that OS?
*** Bug 171230 has been marked as a duplicate of this bug. ***
Come on guys. Be realistic! It can't be a problem of the hardware. The fact that EVERYTHING, except some KDE apps, is working shows that it has to be a problem of KDE or Qt.
Windows: Everything works
Firefox: No problems
Konqueror in KDE3.x: No problems
By the way: Konqueror in KDE4 was always very very slow on my PC. But for a couple of weeks now it doesn't even connect to any website. No idea what it could be. Tell me what additional info you need. KMail is only connecting sporadically to the POP server, too. The same goes for KHotNewStuff.
KDE 4 with other hardware works fine too. It's only the combination which doesn't work, and all evidence points to the hardware being at fault.
Ok I see. It could be the combination. But i think it is rather the software than the hardware. But ok... You are the real experts. But even if it really is a hardware problem... We can sit and wait for centuries, waiting for avm to fix that bug (I bet they will never do). Or change the way the kde4 apps are accessing the internet. I'm sure there is a proper way, because all the other apps can do it, also with the bug in the hardware.
If you know what we should change, tell us. We don't know yet (or we'd have fixed this bug a long time ago).
Ok, cranky me. Please ignore comment #65. Read instead as follows: We understand that the hardware getting fixed is a long-shot. We are willing to change the KDE code to make it not trigger the bug on the faulty hardware. However, we don't know yet what we're doing that is making the router go nuts. So we also don't know what we should change to resolve the issue. That's why this bug is still open. Once we do know what we should change, we will change, and close the report.
> If you know what we should change, tell us. We don't know yet (or we'd have fixed this bug a long time ago). Does modifying Qt like described in comment #15 not help everyone here? Have those who still comment here tried that? It at least works like a charm for me.
The change from comment #15 was made permanent to Qt 4.4.1 and 4.4.2. Since people are still complaining, I believe the bug hasn't been fixed.
I'm sorry. I got you wrong. I thought you don't want to change the code since it's a bug of the hardware, because the bug is there for such a long time. Do you think the other things mentioned in comment #62 belong to the same bug?
> Do you think the other things mentioned in comment #62 belong to the same bug? Yes, seems to be exactly the same problem.
*** Bug 168619 has been marked as a duplicate of this bug. ***
1) I have read the RFC in question, RFC 1035 (http://tools.ietf.org/html/rfc1035), section 4.1.1 "Header section format":
    ID    A 16 bit identifier assigned by the program that generates any kind of query. This identifier is copied the corresponding reply and can be used by the requester to match up replies to outstanding queries.
I think the router problem is that it uses the ID (maybe together with the IP) to track where the future answer should be forwarded.
2) glibc/kernel or... fills in this field, but not randomly enough: two programs/threads that open a port and send this request at the same time often get the same IDs! (Seeded in the same way, with the time?) On my dual core this happens almost 100% of the time. [But I do not get the timeouts, as I use my internet provider's DNS directly.]
If this analysis is correct, fixing all routers is not possible. Then it will be difficult to fix in KDE alone. (But letting one program do all queries should help, caching or not.) Standard tools like bind might be configurable to do this, adding one working process in front of the non-working router.
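To make the header field concrete, here is a short sketch that pulls the ID out of a raw DNS message. The function names are made up for illustration; the only facts assumed from RFC 1035 section 4.1.1 are that the header is 12 bytes long, that the ID is its first 16 bits in network byte order, and that the QR bit (the top bit of the third byte) marks a response.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>

// Size of the fixed DNS header in bytes (RFC 1035, section 4.1.1).
constexpr std::size_t kDnsHeaderSize = 12;

// Extract the 16-bit transaction ID, stored big-endian in the first two bytes.
// Returns nothing if the buffer is too short to hold a DNS header.
std::optional<std::uint16_t> dnsTransactionId(const std::uint8_t *message, std::size_t length)
{
    if (message == nullptr || length < kDnsHeaderSize) {
        return std::nullopt;
    }
    return static_cast<std::uint16_t>((message[0] << 8) | message[1]);
}

// The QR bit is the most significant bit of the third header byte:
// 0 for a query, 1 for a response.
bool isDnsResponse(const std::uint8_t *message, std::size_t length)
{
    return message != nullptr && length >= kDnsHeaderSize && (message[2] & 0x80) != 0;
}
```

A requester compares this ID, together with the socket the message arrived on, against its outstanding queries before trusting a reply.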
Hi, I just stumbled on this bug because I was reading userbase. I do not have the problem, since KDE4 is not (yet) running on my system. But I would really recommend that someone who can explain the problem in detail contacts AVM, the company behind the Fritz!Box routers. They have very good and competent phone and email support. The company is known for supporting Linux, and they provide frequent updates and even test builds for router improvements. So if the problem is on their side, I guess they will help to fix it soon.
Okay, I did it, I contacted the AVM-support and talked to them about this problem. If I get any news from them I will inform you
I need your help! The AVM-support asks if the problem exists with actual fritzboxes (7170, 7270) and the actual firmware (xx.04.59) Could somebody test this and post the result? Thanks!
Just to clarify: you mean "current" ("aktuell" in German). "actual" in English means "eigentlich" or "tatsächlich", that's not what was meant here.
To Thomas Schulz: There is a related error, "Unable to check multiple pop3 accounts at the s.." (http://bugs.kde.org/show_bug.cgi?id=166366), which definitely occurs on a FRITZ!Box Fon WLAN 7141 (UI) with firmware version 40.04.59, while a local "bind" DNS server works fine. So the answer is: Yes. The error occurs on .04.59.
Hi :-) I have read this after having installed KDE 4.1.2 on Gentoo. Some time ago, I also answered Bug #154774, which depends on this issue (and the problem still exists, as already mentioned above). I think it's okay that you guys don't want to produce some crappy code to make things work with the Fritz boxes when it's a Fritz box bug. But a lot of people (including me) do use those routers (I have a FRITZ!Box WLAN 3030). And even if the AVM guys fix the problem, I bet this will only happen for recent models. There has been no firmware update for my router for two years or so, and not everyone wants to buy a new router because Konqueror won't work, especially because every single other browser does work fine. And if this also affects KMail, Akregator, Kopete, etc. (I haven't tested it), this _will_ be a reason simply not to use KDE 4 for me and many others, as it's just not usable. I hope there will be some solution for this, as KDE 4 is really great work.
Answer to Comment #75: Yes, this problem exists on my fritz box 7170 and firmware version 29.04.59. If required, I can provide tcpdumps that shows how the fritz box mixes the DNS transaction IDs and the ports, which my provider's DNS server doesn't.
The problem exists on FritzBox 7270 with firmware 54.04.63-12365, too. The FritzBoxes are Linux-powered and I have full ssh access to mine, so is there anything I can do on the machine itself? I'm no network-expert so could someone explain to me the meaning of "bug" in this case? I mean is it a misconfigured DNS-daemon on the box or a buggy kernel or something completely different?
I exchanged a lot of mail with AVM, the company behind the Fritz!Boxes. They told me that the problem is with IPv6 DNS requests. The Fritz!Boxes can't handle them, so they just pass them through to the external DNS server. If the external server has no IPv6 address or can't handle IPv6 requests, it only replies that it got the request. The Fritz!Boxes pass those replies through too; you can see them with wireshark. You get those replies both with the Fritz!Box as DNS server and with a direct DNS server on the internet, but konqueror doesn't react if they come from the Fritz!Box and runs into a timeout instead of doing an IPv4 request. AVM said it must be KDE, because they give the correct DNS answer; they just pass it through. I don't know enough about those things, so: are they correct, or is there something else I could tell them? (Thiago: I tried to talk to you about it, but it was difficult to be at the computer at the same time and I had connectivity problems)
Do you have that wireshark trace? If so, please attach the packet capture file to this bug report. Also, please try turning IPv6 off in your machine. If you don't have an IPv6 connection, you probably don't want it on at all. We have code to avoid sending IPv6 requests if IPv6 isn't active. See if that solves the problem.
Created attachment 28093 [details] pcap trace of the FB 7170, fw 29.04.59, AAAA query I attached the trace from my Fritz!Box 7170 with firmware 29.04.59. This is konqueror doing a request to www.heise.de, where I only kept the first few requests that show the problem. The following requests had the same issue from time to time. Note that my NIC does Checksum offloading, since the packets do get into the internet... ;-) Packets 7, 8, and 9 never get answered correctly. The response in packet 11 however goes to port 49059 (which was used in packet 7) with transaction ID 0xd9d5, which is the ID from packet 9 (port 50730).
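A note for readers following the trace: a stub resolver normally accepts a reply only if it arrives on the port the query was sent from and carries that query's transaction ID. The sketch below is only an illustration of that rule, not code from glibc or Qt. The ports and the ID 0xd9d5 are taken from the trace description above; the ID 0x1234 for packet 7 is a made-up placeholder, since the comment does not state it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One outstanding DNS query as a stub resolver would remember it.
struct PendingQuery {
    std::uint16_t localPort;      // UDP source port the query was sent from
    std::uint16_t transactionId;  // 16-bit ID from the DNS header
};

// Returns the index of the pending query a reply belongs to,
// or -1 if the reply matches no outstanding query and must be dropped.
int matchReply(const std::vector<PendingQuery>& pending,
               std::uint16_t replyPort, std::uint16_t replyId)
{
    for (std::size_t i = 0; i < pending.size(); ++i) {
        if (pending[i].localPort == replyPort &&
            pending[i].transactionId == replyId) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

With two queries outstanding on ports 49059 and 50730, a reply sent to port 49059 but carrying the ID of the query from port 50730 matches neither entry, so both lookups keep waiting until they time out.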
I can confirm that disabling IPv6 in the kernel completely solves the problem (at least for me) :)
Created attachment 28100 [details] DNS with FritzBox as DNS-Proxy
Created attachment 28101 [details] Direct DNS-request to an DNS-Server in the internet
Created attachment 28102 [details] DNS-request with IPv6 disabled
Okay, I added three wireshark-files, the normal behavior with fritzbox as a DNS-Server, direct request to an DNS-server in the internet and a request with disabled ipv6. Disabling IPv6 seems to solve the problem.
I can confirm that disabling IPv6 fixes the problem, which was likely to be discovered :)
Okay, I got another answer from AVM. The devs there think the same as what is mentioned in comment #4. The two requests follow each other too quickly, and the Fritz!Boxes answer only one of them because the other one is treated as a "retransmit". They are checking whether the behaviour of the Fritz!Boxes can be optimized.
What's really bad is that the Fritz!Boxes send one mixed answer for both packets. So neither of them gets a proper answer. It was comment #3 FWIW ;-)
So how about providing a) a GUI option for deactivating IPv6 and b) an error message leading to this setting? Then this temporary "fix" could be part of KDE 4.2 and normal users could deal with it easily.
Maybe I'm mistaken, but it's not that easy to deactivate IPv6 from within KDE. You have to tell the kernel not to load the respective module. And another problem: how can KDE detect that it is running into exactly this problem? There are various other reasons why DNS lookups could take some time. But maybe this issue should be documented in some prominent place.
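To make the "retransmit" argument easier to judge, here is a minimal sketch of what the two lookups look like on the wire, assuming the standard RFC 1035 query layout. The function name and the IDs used in the usage note are illustrative, and how the Fritz!Box actually detects retransmits is not known from this thread. The A and AAAA queries for the same name carry the same encoded name but differ in the transaction ID and in the QTYPE field, so they are two distinct questions.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::uint16_t kTypeA = 1;      // IPv4 address record
constexpr std::uint16_t kTypeAAAA = 28;  // IPv6 address record

// Encodes a minimal DNS query for one name and record type.
// Label lengths are not validated (each must be at most 63 bytes).
std::vector<std::uint8_t> buildQuery(std::uint16_t id, const std::string& name,
                                     std::uint16_t qtype)
{
    std::vector<std::uint8_t> out;
    auto put16 = [&out](std::uint16_t v) {  // big-endian 16-bit field
        out.push_back(static_cast<std::uint8_t>(v >> 8));
        out.push_back(static_cast<std::uint8_t>(v & 0xff));
    };
    put16(id);      // transaction ID
    put16(0x0100);  // flags: standard query, recursion desired
    put16(1);       // QDCOUNT: one question
    put16(0);       // ANCOUNT
    put16(0);       // NSCOUNT
    put16(0);       // ARCOUNT
    // QNAME: every dot-separated label is prefixed with its length.
    std::size_t start = 0;
    while (start <= name.size()) {
        std::size_t dot = name.find('.', start);
        if (dot == std::string::npos) dot = name.size();
        if (dot > start) {
            out.push_back(static_cast<std::uint8_t>(dot - start));
            out.insert(out.end(), name.begin() + start, name.begin() + dot);
        }
        start = dot + 1;
    }
    out.push_back(0);  // root label terminates the name
    put16(qtype);      // QTYPE
    put16(1);          // QCLASS: IN
    return out;
}
```

Calling buildQuery(0x1111, "www.heise.de", kTypeA) and buildQuery(0x2222, "www.heise.de", kTypeAAAA) yields two packets of equal length whose bytes differ only in the ID and the QTYPE, which is why a forwarder has to track them separately rather than treat the second as a repeat of the first.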
*** Bug 176576 has been marked as a duplicate of this bug. ***
I got a mail from AVM: they closed my bug report there. They will not change the behaviour of their FritzBoxes. They say it is a retransmit if a second DNS request is sent while the answer to the first one is arriving. Firefox can handle getting only one answer for two requests, they say, and in their opinion that is enough, because the DNS request is answered. Original mail text (translated from German): In conclusion: there will be no change in this regard for the FRITZ!Box Fon WLAN 7050. The FRITZ!Box treats it as a retransmit when a second DNS request arrives from the LAN while a DNS answer is coming in from the Internet. Firefox copes with getting an answer to only one of two requests. In our opinion that should be sufficient, since the DNS information is then available.
(In reply to comment #95) > I got a mail from AVM they closed my bugreport there. They will not change the > behaviour of their FritzBoxes. Of course, it's a KDE-only bug. Fritz boxes are solid workers.
(In reply to comment #96) > Of course, it's a KDE-only bug. > Fritz boxes are solid workers. No, it also happens outside KDE, for example running: "getent kde.org & getent kde.org" is enough to trigger this bug. (In reply to comment #95) > Firefox could handle it > that there is only one answer for two requests, they say, that is enough in > their opinion, because the DNS-request is answered. They don't seem to understand that the fritzbox mixes data from both requests; see comment #3, which is definitely a bug (imho).
(In reply to comment #97) > (In reply to comment #96) > > Of course, it's a KDE-only bug. > > Fritz boxes are solid workers. > > No, it also happens outside KDE, for example running: "getent kde.org & getent > kde.org" is enough to trigger this bug. > Sorry for the traffic, meant "getent hosts kde.org & getent hosts kde.org"
Well, it seems that some Mac OS X update triggers the same problem in Safari and others. AVM was informed of a DNS problem in their boxes by the German computer magazine c't, and AVM fixed the DNS issue. To me it seems very likely that it is the same bug in those boxes that is causing this bug (although AVM always said, when asked about this bug, that it is not a bug). Now that Macs are hit, they fixed it. A firmware update should be out soon. Here is the (German) article about it: http://www.heise.de/newsticker/Fritz-Box-bremst-Mac--/meldung/121555 If someone can test the new firmware and see if it solves the issue, please report.
(In reply to comment #99) > There the (german) article about it: > > http://www.heise.de/newsticker/Fritz-Box-bremst-Mac--/meldung/121555 > > If someone can test the new firmeware and see if it solves the issue, please > report. Thanks for this information, I overlooked it (I'm a regular reader of heise.de) Unfortunately, I have a FritzBox 7050 and doubt that there will be an update for it...
According to forum posts this is fixed in the Fritz!Box beta firmware release: http://www.avm.de/de/Service/Service-Portale/Service-Portal/Labor/labor.php (Only for 7270) Could someone with 7270 please test status with the beta firmware?
@Robin #100: I would recommend to write to support of AVM. They are very friendly people. They don't support older boxes with new features. But if there is a known bug, I believe they provide an update. Just be friendly, describe the problem and refer to the heise report and this bug.
AVM support wrote to me that they are working on a bug fix firmware. This will be published on their site. So it would be nice, if anyone who notices a new firmware there puts a note here. Seems like this bug can be closed, since it is not a problem of KDE anymore.
(In reply to comment #103) > Seems like this bug can be closed, since it is not a problem of KDE anymore. From my point of view KDE should try anyway not to fire an DNS storm at the servers.
(In reply to comment #104) > (In reply to comment #103) > > Seems like this bug can be closed, since it is not a problem of KDE anymore. > > From my point of view KDE should try anyway not to fire an DNS storm at the > servers. It doesn't since 4.2. This bug is not even a problem for KDE anymore and can really be closed. Thiago?
As stated by Roland Harnau, this one should be fixed in 4.2 and trunk.
There is a new firmware at www.avm.de/labor . It does not say whether the bugfix is in it, but I assume this is the case. I cannot try it myself at the moment, but maybe others can, and if the bug still exists, they can give this feedback to AVM.
I have installed the new firmware and it seems like the bug is fixed there, but I am not 100% sure. Just when I wrote everything is fine, it took half a minute to display this bug page. But this may be a server problem. Now people with older versions of the fritz box should ask AVM to release bug fixed firmwares for them too.
You can test it with: 'host test.de & host test.de &' If it is not fixed it will print ';; Warning: ID mismatch: expected ID Y, got X' (X != Y obviously). I would be happy if it is fixed, and I will ask AVM for a backport to the 7050, as the bug was reported while the 7050 was still supported. And by the way, there is also a new 7170 beta firmware, so you guys can be happy now too, I hope =)
It works. BTW tested with the 7170 firmware.
Why are there two DNS queries for the same hostname in parallel? Which process is responsible for this? Is there no local DNS cache? You can blame AVM for this bug in their routers, but this situation is really unwanted. You should avoid this by using a local DNS cache. Even f$%&%/() Windows has such a cache!
Local caching is not always enabled. Don't blame us.
I am re-opening this bug because the fix committed to TCPSlaveBase in comment #18 as a "workaround for broken routers" makes no sense for several reasons. #1. The single most important problem with the "workaround fix" is that it causes TCPSlaveBase to do a DNS lookup of request host names even when using a proxy server. This causes bug reports such as https://bugs.kde.org/show_bug.cgi?id=207550. It is also why tunneled proxy connections (HTTPS over an HTTP proxy using CONNECT) that originate from KDE-based applications always use the IP address instead of the host name in the CONNECT request, which is not desired behavior either. #2. There are many DNS queries that take place before a request even makes it to the TCPSlaveBase level. See my response to the bug report mentioned above to understand what else might cause DNS queries. Hint: It is not KIO. #3. How is it that TCPSlaveBase gets a "workaround fix" when QAbstractSocket aborts a name lookup that was started by another one of its own member functions, connectToHost? What TCPSlaveBase::connectToHost used to do is exactly the same thing KSocketFactory::synchronousConnectToHost does now! I just do not see why the issue is worked around in one location and not the other. That means if this bug was truly caused by what was described in comment #8, then kio_ftp should be a victim of this bug today, since it uses KSocketFactory::synchronousConnectToHost. Anyhow, this needs to be fixed another way. If waitForConnected should not be called before connectToHost has completed its host lookup, then we need to find a workaround for that specific issue, but not by making TCPSlaveBase perform a name lookup.
This bug can no longer be tested, since the broken routers that were the source of the report have since got firmware upgrades (when Safari started suffering from the same problem). Also, I believe that the sending-of-IP-addresses-in-proxy problem was fixed several years ago too.
(In reply to comment #115) > This bug can no longer be tested, since the broken routers that were the source > of the report have since got firmware upgrades (when Safari started suffering > from the same problem). Well, that may be true, but unfortunately the ramifications of the workaround that was committed to TCPSlaveBase are still around, wreaking havoc today. > Also, I believe that the sending-of-IP-addresses-in-proxy problem was fixed > several years ago too. It might be fixed in Qt's networking classes, but because of the aforementioned workaround commit, the sending-of-IP-address-in-proxy problem is alive and well in KDE. But don't take my word for it: set up an HTTPS proxy in KDE and browse to an SSL site. Look at the log file of the proxy server and you will clearly see that the resulting CONNECT message contains an IP address and not a host name. And then there is the matter of using QAbstractSocket::waitForConnected. I still do not comprehend why it aborts a host name lookup in progress just to turn around and do the same lookup in a blocking mode. Was that done to make the function a synchronous function? Anyhow, that is not its only problem. Unlike the QAbstractSocket::connectToHostImplementation function, waitForConnected does not even seem to bother optimizing for the case where the supplied host name is actually an IP address. Instead it seems to blindly perform a blocking lookup. Of course that results in a completely unnecessary reverse lookup.
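To illustrate what the proxy log shows, here is a minimal sketch of the request head a client sends to open a tunnel. This is not the kio_http code; buildConnectRequest is a hypothetical helper, and 192.0.2.10 in the usage note is a documentation address standing in for whatever the lookup returned. The point is only that whatever string the socket layer was handed ends up in the CONNECT line.

```cpp
#include <string>

// Builds the request head an HTTP client sends to a proxy to open a tunnel.
// `target` is whatever the caller passed down: a host name, or an IP address
// if the name was already resolved locally. IPv6 literals would additionally
// need square brackets, which this sketch does not handle.
std::string buildConnectRequest(const std::string& target, unsigned short port)
{
    const std::string authority = target + ":" + std::to_string(port);
    return "CONNECT " + authority + " HTTP/1.1\r\n"
           "Host: " + authority + "\r\n"
           "\r\n";
}
```

buildConnectRequest("bugs.kde.org", 443) starts with "CONNECT bugs.kde.org:443", while buildConnectRequest("192.0.2.10", 443) starts with "CONNECT 192.0.2.10:443". In the second case the proxy never sees the host name, and a DNS lookup has already been made on the client side.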
(In reply to comment #117) > > Also, I believe that the sending-of-IP-addresses-in-proxy problem was fixed > > several years ago too. > > It might be fixed in Qt's networking classes, but because the aforementioned > workaround commit the sending-of-IP-address-in-proxy is alive and well in KDE. If that's the case, then the issue has regressed. I am 100% sure that I tried this with kio_http at one point and it worked. If that's the case, then it's also very likely that the very fix for this bug is the cause. > And then there is the matter of using QAbstractSocket::waitForConnected. I > still do not comprehend why it aborts a host name lookup in progress just to > turn around and do the same lookup in a blocking mode. Was that done to make > the function a synchronous function ? Yes. It needs to be fully synchronous and there's no QHostInfo::waitForFinished. So the only way of ensuring that the results get in without starting a nested event loop is to cancel the lookup and restart it. With *any* sane caching DNS server, this makes absolutely no difference. The problem is when you get insane and braindead servers, like the Fritzboxes had.
> This bug can no longer be tested, since the broken routers that were the source > of the report have since got firmware upgrades (when Safari started suffering > from the same problem). Then I suggest we just drop the workaround.
(In reply to comment #119) > > This bug can no longer be tested, since the broken routers that were the source > > of the report have since got firmware upgrades (when Safari started suffering > > from the same problem). > > Then I suggest we just drop the workaround. Note that the workaround does introduce some interesting functionality. It provides some level of DNS pinning.
(In reply to comment #120) > (In reply to comment #119) > > > This bug can no longer be tested, since the broken routers that were the source > > > of the report have since got firmware upgrades (when Safari started suffering > > > from the same problem). > > > > Then I suggest we just drop the workaround. > > Note that the workaround does introduce some interesting functionality. It > provides some level of DNS pinning. But that is happening at the wrong location. If such functionality is interesting, then it should happen at the socket level, be it KTcpSocket or Q*Socket. TCPSlaveBase is too high level for performing any sort of name lookups for such purposes IMO.
(In reply to comment #121) > (In reply to comment #120) > > Note that the workaround does introduce some interesting functionality. It > > provides some level of DNS pinning. > > But that is happening at the wrong location. If such functionality is > interesting, then it should happen at the socket level, be it KTcpSocket or > Q*Socket. TCPSlaveBase is too high level for performing any sort of name > lookups for such purposes IMO. I disagree. In fact, I would even say that TCPSlaveBase is still not high enough. DNS pinning should happen from the application/use layer. That is, from the HTML engine: all loads from a given address in the same page should come from the same RRset, even if the DNS result would have changed. The socket level doesn't know what other sockets are in use, so it doesn't know how it should apply the pinning. That said, QHostInfo does implement a 5-minute cache these days, so the same level of pinning that this workaround afforded will be kept. What it won't keep is the pinning across slaves: two kio_http launched for the same address will still do two DNS queries and could end up with different results.
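A rough sketch of the caching point made above, under the assumption that each kio_http slave is a separate process holding its own in-memory cache. This is not QHostInfo's implementation; the class name, the injected clock parameter and the address in the usage note are illustrative only.

```cpp
#include <chrono>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Per-process cache that pins the addresses of a host for a fixed time.
class PinnedDnsCache {
public:
    explicit PinnedDnsCache(std::chrono::seconds maxAge) : m_maxAge(maxAge) {}

    // Stores the address set obtained for `host` at time `now`.
    void insert(const std::string& host, std::vector<std::string> addrs,
                Clock::time_point now)
    {
        m_entries[host] = Entry{std::move(addrs), now};
    }

    // Returns the pinned addresses, or nullptr if absent or expired.
    const std::vector<std::string>* lookup(const std::string& host,
                                           Clock::time_point now) const
    {
        auto it = m_entries.find(host);
        if (it == m_entries.end() || now - it->second.storedAt >= m_maxAge)
            return nullptr;
        return &it->second.addrs;
    }

private:
    struct Entry {
        std::vector<std::string> addrs;
        Clock::time_point storedAt;
    };
    std::chrono::seconds m_maxAge;
    std::unordered_map<std::string, Entry> m_entries;
};
```

Two instances constructed with a five-minute lifetime behave like two slaves: an entry inserted into the first is returned by it until five minutes have passed, but the second instance knows nothing about it and would have to issue its own query, possibly getting a different result.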
(In reply to comment #122) > (In reply to comment #121) > > (In reply to comment #120) > > > Note that the workaround does introduce some interesting functionality. It > > > provides some level of DNS pinning. > > > > But that is happening at the wrong location. If such functionality is > > interesting, then it should happen at the socket level, be it KTcpSocket or > > Q*Socket. TCPSlaveBase is too high level for performing any sort of name > > lookups for such purposes IMO. > > I disagree. In fact, I would even say that TCPSlaveBase is still not high > enough. > > DNS pinning should happen from the application/use layer. That is, from the > HTML engine: all loads from a given address in the same page should come from > the same RRset, even if the DNS result would have changed. Well, there is already such a feature in both KHTML and KWebkitPart under the auspices of DNS prefetching. Granted, that information is not shared with or used by the socket class, and I am unsure whether or not the engines automatically use the prefetched IP address if that functionality is enabled. However, to me the whole idea of DNS pinning and/or prefetching will only work as intended when the entire stack shares the same DNS caching mechanism, much like the third-party DNS caches available on Linux. That way everything, not just KDE applications, gets to benefit from using the cached > The socket level doesn't know what other sockets are in use, so it doesn't know > how it should apply the pinning. > > That said, QHostInfo does implement a 5-minute cache these days, so the same > level of pinning that this workaround afforded will be kept. What it won't keep > is the pinning across slaves: two kio_http launched for the same address will > still do two DNS queries and could end up with different results. But that is currently happening at the cost of the user's privacy and/or security when using proxies. And I say that because, unlike other places where lookups occur, this lookup cannot be disabled.
(In reply to comment #118) > (In reply to comment #117) > > > Also, I believe that the sending-of-IP-addresses-in-proxy problem was fixed > > > several years ago too. > > > > It might be fixed in Qt's networking classes, but because the aforementioned > > workaround commit the sending-of-IP-address-in-proxy is alive and well in KDE. > > If that's the case, then the issue has regressed. I am 100% sure that I tried > this with kio_http at one point and it worked. If that's the case, then it's > also very likely that the very fix for this bug is the cause. It is hard for me to see where the regression could have occurred. The code that was committed as a workaround does exactly what it was intended to do: resolve the host name and use the IP address when connecting to the server. That is very obvious from looking at TCPSlaveBase::connectToHost. As a result, in an HTTPS-over-HTTP-proxy connection (aka CONNECT), the IP address will be used when constructing the CONNECT header, because that is the only thing the Q*Socket classes have. So as far as I can tell, there is no regression there, only the side effect of the workaround. > > And then there is the matter of using QAbstractSocket::waitForConnected. I > > still do not comprehend why it aborts a host name lookup in progress just to > > turn around and do the same lookup in a blocking mode. Was that done to make > > the function a synchronous function ? > > Yes. It needs to be fully synchronous and there's no > QHostInfo::waitForFinished. So the only way of ensuring that the results get in > without starting a nested event loop is to cancel the lookup and restart it. I figured as much. However, the question is what the side effect or negative impact of using a local event loop would be in the case of TCPSlaveBase.
IOW, what potential problems would be encountered if one were to add the code below in between the calling d->socket.connectToHost and d->socket.waitForConnected:

    if (d->socket.state() == KTcpSocket::HostLookupState) {
        QEventLoop loop;
        QTimer timer;
        int elapsedTime = 0;
        timer.setInterval(500);
        timer.setSingleShot(true);
        QObject::connect (&timer, SIGNAL(timeout()), &loop, SLOT(quit()));
        Q_FOREVER {
            timer.start();
            loop.exec();
            if (d->socket.state() != KTcpSocket::HostLookupState || elapsedTime >= timeout)
                break;
            elapsedTime += timer.interval();
        }
    }

> With *any* sane caching DNS server, this makes absolutely no difference. The > problem is when you get insane and braindead servers, like the Fritzboxes had.
Though I agree that what the Fritzboxes did was "insane", that very same argument could be leveled against what waitForConnected does. I do not think any developer that uses this API expects waitForConnected to do what it currently does. Perhaps a single note or some kind of heads up in the API documentation would have informed developers about such unexpected behavior. I can literally give you an example of where unexpected behavior of a function is the cause of password caching bug in KDE. Anyhow, it would be nice if waitForConnected gets fixed so that it does not do a reverse lookup when the supplied host name is actually an IP address already.
(In reply to comment #124) > (In reply to comment #118) > > Yes. It needs to be fully synchronous and there's no > > QHostInfo::waitForFinished. So the only way of ensuring that the results get in > > without starting a nested event loop is to cancel the lookup and restart it. > > I figured as much. However, the question is what would the side effect or > negative impact of using a local event loop be in case of TCPSlaveBase ? IOW, > what potential problems would be encountered if one were to add the code below > in between the calling d->socket.connectToHost and d->socket.waitForConnected: > > if (d->socket.state() == KTcpSocket::HostLookupState) { > QEventLoop loop; Let's stop here. I said "without starting a nested event loop" and the line above exists only to do exactly what I said mustn't be done. Ask any proficient Qt developer and they'll tell you that nested event loops are evil and must be avoided. Having code like socket connections spin the event loop are really unexpected. Moreover, introducing an event loop where there was none is also potentially catastrophic. > > With *any* sane caching DNS server, this makes absolutely no difference. The > > problem is when you get insane and braindead servers, like the Fritzboxes had. > > Though I agree that what the Fritzboxes did was "insane", that very same > argument could be leveled against what waitForConnected does. I do not think > any developer that uses this API expects waitForConnected to do what it > currently does. Perhaps a single note or some kind of heads up in the API > documentation would have informed developers about such unexpected behavior. I > can literally give you an example of where unexpected behavior of a function is > the cause of password caching bug in KDE. I'm sorry, which behaviour? The behaviour of attempting a name lookup again? That's an implementation detail and completely irrelevant for the discussion, except for that it triggered a bug in the fritzboxes. 
The bug was clearly in the fritzboxes, not in Qt code. That is not up for discussion. Implementation details are just that. Application developers don't have to know them and they should never rely on them, for they may change. In fact, I think that the DNS caching functionality present in Qt 4.7 has changed this behaviour in many ways, including the fact that it may not execute a second query at all if the first one is running in a different thread. > Anyhow, it would be nice if waitForConnected gets fixed so that it does not do > a reverse lookup when the supplied host name is actually an IP address already. Why? What's the consequence?
On Wed, Apr 20, 2011 at 6:43 PM, Thiago Macieira <thiago@kde.org> wrote: > https://bugs.kde.org/show_bug.cgi?id=162600 > > > > > > --- Comment #125 from Thiago Macieira <thiago kde org> 2011-04-21 00:43:09 --- > (In reply to comment #124) >> (In reply to comment #118) >> > Yes. It needs to be fully synchronous and there's no >> > QHostInfo::waitForFinished. So the only way of ensuring that the results get in >> > without starting a nested event loop is to cancel the lookup and restart it. >> >> I figured as much. However, the question is what would the side effect or >> negative impact of using a local event loop be in case of TCPSlaveBase ? IOW, >> what potential problems would be encountered if one were to add the code below >> in between the calling d->socket.connectToHost and d->socket.waitForConnected: >> >> if (d->socket.state() == KTcpSocket::HostLookupState) { >> QEventLoop loop; > > Let's stop here. I said "without starting a nested event loop" and the line > above exists only to do exactly what I said mustn't be done. I know what you said. That is exactly why I asked whether doing this would be detrimental to TCPSlaveBase. I was simply curious how adding such a local loop would impact the use case of TCPSlaveBase which is neither thread safe nor re-entrant. Neither does it connect to any signals or emit signals itself. IOW it is completely isolated unto itself. Regardless, it was a hypothetical question that required a specific answer as to why it would be bad. I know nested event loops in general are dangerous. > Ask any proficient Qt developer and they'll tell you that nested event loops > are evil and must be avoided. Having code like socket connections spin the > event loop are really unexpected. Moreover, introducing an event loop where > there was none is also potentially catastrophic. >> > With *any* sane caching DNS server, this makes absolutely no difference. The >> > problem is when you get insane and braindead servers, like the Fritzboxes had. 
>> >> Though I agree that what the Fritzboxes did was "insane", that very same >> argument could be leveled against what waitForConnected does. I do not think >> any developer that uses this API expects waitForConnected to do what it >> currently does. Perhaps a single note or some kind of heads up in the API >> documentation would have informed developers about such unexpected behavior. I >> can literally give you an example of where unexpected behavior of a function is >> the cause of password caching bug in KDE. > > I'm sorry, which behaviour? The behaviour of attempting a name lookup again? > That's an implementation detail and completely irrelevant for the discussion, > except for that it triggered a bug in the fritzboxes. > > The bug was clearly in the fritzboxes, not in Qt code. That is not up for > discussion. > > Implementation details are just that. Application developers don't have to know > them and they should never rely on them, for they may change. In fact, I think > that the DNS caching functionality present in Qt 4.7 has changed this behaviour > in many ways, including the fact that it may not execute a second query at all > if the first one is running in a different thread. Let me give you an example why I have issues about implementation details. SlaveBase::openPasswordDialog in KDE, which I originally wrote in KDE 2.x or early KDE 3.x days, was changed in KDE 3.1 to automatically cache the password if the user checks the "Remember password" checkbox and clicks OK in the dialog. Unfortunately, that was not the purpose of the openPasswordDialog. It was designed to simply prompt the user and return the result. Then any ioslave that wanted to save the result would manually call another function called SlaveBase::cacheAuthentication. There was a reason for this madness. You do not want to cache an incorrect password ; so until the ioslave can successfully login the password should not be cached at all. 
Voilà, the unexpected behavior change in the implementation caused countless bug reports that have yet to be addressed. In my many years of software development, I have seen unexpected behaviors that were chalked up to "implementation details" cause many such hidden headaches and bugs. As a result I personally frown upon functions that purport to do something, but behind the scenes do something unexpected simply because that is an implementation detail, no matter how valid the reason behind it might be. Anyhow, this really does not matter since that is not the issue at hand. >> Anyhow, it would be nice if waitForConnected gets fixed so that it does not do >> a reverse lookup when the supplied host name is actually an IP address already. > > Why? What's the consequence? Hmm... let me turn around and ask you the same question. What is the point of doing a reverse name lookup at this point? Especially since the QAbstractSocket::connectToHostImplementation function in that same class seems to specifically protect against this by using QHostAddress to avoid looking up an IP address. Does QAbstractSocket::waitForConnected need to look up the host name associated with a given IP address? Regards, Dawit A.
(In reply to comment #126) > Hmm... let me turn around and ask you the same question. What is the > point of doing a reverse name lookup at this point ? Specially since > the QAbstractSocket::connectToHostImplementation function in that same > class seems to specifically protect against this by using QHostAddress > to avoid looking up an ip address. Does > QAbstractSocket::waitForConnected need to look up the host name > associated with a given ip address ? Please file a Qt bug report about this. I thought you meant the behaviour that it might execute two name lookups.
(In reply to comment #127) > (In reply to comment #126) > > Hmm... let me turn around and ask you the same question. What is the > > point of doing a reverse name lookup at this point ? Specially since > > the QAbstractSocket::connectToHostImplementation function in that same > > class seems to specifically protect against this by using QHostAddress > > to avoid looking up an ip address. Does > > QAbstractSocket::waitForConnected need to look up the host name > > associated with a given ip address ? > > Please file a Qt bug report about this. I thought you meant the behaviour that > it might execute two name lookups. http://bugreports.qt.nokia.com/browse/QTBUG-18881
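For illustration, the guard being asked for can be sketched as below. The thread says connectToHostImplementation uses QHostAddress for this; the sketch uses POSIX inet_pton as a stand-in so that it needs no Qt, and isIpLiteral is a hypothetical name. It is not the fix that went into Qt.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <string>

// Returns true if `host` is already an IPv4 or IPv6 address literal,
// in which case no DNS lookup (forward or reverse) is needed to connect.
bool isIpLiteral(const std::string& host)
{
    unsigned char buf[sizeof(in6_addr)];  // large enough for either family
    return inet_pton(AF_INET, host.c_str(), buf) == 1 ||
           inet_pton(AF_INET6, host.c_str(), buf) == 1;
}
```

A blocking connect path could call such a check first and go to the resolver only when it returns false, so "192.168.178.1" or "::1" would skip the lookup while "kde.org" would not.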
I am going to commit a patch that will revert back the commit in comment #18. You can view the revert patch at https://git.reviewboard.kde.org/r/101338/.
Git commit 65aabc8c6df6d25fc35d06ad880ecdc9a2e43291 by Dawit Alemayehu. Committed on 01/05/2011 at 17:46. Pushed by adawit into branch 'master'. Avoid resolving host names in TCPSlaveBase::connectToHost. This basically reverts commit 79c4ed8a7c7fe18f4c1d02d5faba5e7a412f57ae which was a workaround for bugs in hardware that was caused by QAbstractSocket's potential propensity to perform multiple look ups when connectToHost and waitForConnected are called successively. BUG: 207550 BUG: 162600 REVIEW: 101338 M +13 -29 kio/kio/tcpslavebase.cpp http://commits.kde.org/kdelibs/65aabc8c6df6d25fc35d06ad880ecdc9a2e43291