Bug 171878 - konqueror fails to request all images from page
Status: RESOLVED WORKSFORME
Alias: None
Product: kio
Classification: Frameworks and Libraries
Component: http
Version: unspecified
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdelibs bugs
Reported: 2008-09-29 23:46 UTC by Michal Witkowski
Modified: 2010-12-26 23:52 UTC


Attachments
konqueror konsole log (18.03 KB, text/x-log)
2008-12-25 16:11 UTC, Michal Witkowski
wykop.pl web html as seen 2009.04.12 10:50 (83.38 KB, text/html)
2009-04-12 10:53 UTC, Michal Witkowski
libpcap capture file of konqueror wykop.pl visit (102.63 KB, application/octet-stream)
2009-04-12 10:54 UTC, Michal Witkowski
screenshot of wykop.pl as seen on 2009.04.12 with a clean KDE profile (185.03 KB, image/png)
2009-04-12 10:55 UTC, Michal Witkowski
log of konqueror's wykop.pl visit on 2009.04.12 (29.41 KB, text/x-log)
2009-04-12 10:55 UTC, Michal Witkowski

Description Michal Witkowski 2008-09-29 23:46:41 UTC
Version:           4.1.1 (KDE 4.1.1) (using 4.1.1 (KDE 4.1.1), Arch Linux)
Compiler:          gcc
OS:                Linux (i686) release 2.6.26-ARCH

I have a personal (http auth needed) gallery script set up using http://scry.org/ and apache2 on debian lenny.

The problem is that with large listings of images, konqueror fails to retrieve them all. After reloading the page, some of the missing images are retrieved, after multiple reloads (circa 10) almost all of them get retrieved.

Each image requires two requests: the first request is handled by the script, which generates a thumbnail if necessary and then responds with a 302 (temporary redirect) pointing to the location of the cached thumbnail. The image itself is then retrieved via a separate request (normal GET and HTTP 200 response).
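For anyone who wants to reproduce the two-request flow without installing the gallery script, here is a minimal sketch of a stand-in server. It is not the scry script: the `/thumb/` and `/cache/` paths, the handler name, and the dummy image body are all assumptions made for illustration.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class GalleryHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in: /thumb/ plays the script, /cache/ the thumbnail dir."""

    def do_GET(self):
        if self.path.startswith("/thumb/"):
            # First request: the "script" answers with a 302 to the cached file.
            name = self.path[len("/thumb/"):]
            self.send_response(302)
            self.send_header("Location", "/cache/" + name)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path.startswith("/cache/"):
            # Second request: plain GET answered with 200 and a dummy body.
            body = b"fake-image-bytes"
            self.send_response(200)
            self.send_header("Content-Type", "image/png")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # keep the console quiet


def start_server():
    """Start the server on a free local port and return it."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), GalleryHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch(server, path):
    """Fetch a path, following redirects; return (status, final_url, body)."""
    url = "http://127.0.0.1:%d%s" % (server.server_address[1], path)
    # An empty ProxyHandler ignores proxy settings from the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=10) as resp:
        return resp.status, resp.geturl(), resp.read()
```

Pointing a page with a hundred or so `<img src="/thumb/N.png">` elements at such a server should exercise the same request pattern, although it will not reproduce any timing that depends on the real script or on a proxy in between.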

I've been playing around with the problem as reported here (http://bugs.kde.org/show_bug.cgi?id=61235), and by watching the Apache server log and Wireshark I've established the following:
1. Almost all of the first 1/3 of the images (out of a total of about 100 thumbnails) get fully retrieved.
2. For the next 1/3, the first request (the one to the script) is issued, but the second request to the actual image is never made.
3. There are usually no requests sent at all for the last 1/3 of the images; however, some of them are displayed.
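The three groups above can be pulled out of an access log mechanically. The following is only a sketch: it assumes Apache's common or combined log format, and it assumes you can supply the list of (script path, thumbnail path) pairs the page is expected to request. The function name and the bucket names are made up for this example.

```python
import re

# Matches the request line and status code in a common/combined log entry.
REQUEST_RE = re.compile(r'"(?:GET|HEAD) (\S+) [^"]*" (\d{3})')


def classify(log_lines, expected):
    """Sort expected images into three buckets.

    expected: list of (script_path, thumbnail_path) pairs in page order.
    """
    seen = {}  # path -> set of status codes logged for it
    for line in log_lines:
        match = REQUEST_RE.search(line)
        if match:
            seen.setdefault(match.group(1), set()).add(int(match.group(2)))

    result = {"complete": [], "redirect_only": [], "never_requested": []}
    for script_path, thumb_path in expected:
        if 200 in seen.get(thumb_path, ()):
            result["complete"].append(script_path)
        elif 302 in seen.get(script_path, ()):
            result["redirect_only"].append(script_path)
        else:
            result["never_requested"].append(script_path)
    return result
```

Images that are displayed from the browser cache will land in `never_requested`, which matches the observation in item 3; a thumbnail answered with 304 is not counted as complete by this sketch.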

My guess is that konqueror spawns a set number of kio_http processes. It seems, however, that after the first requests are served, konqueror doesn't try to get the rest, as if they had somehow timed out.

I get the same behaviour with and without adblock. Also, I don't use any proxy. Tried with proxies, got the same thing.

I also get this behaviour on pages with a large number of images, like http://www.gazeta.pl and http://www.onet.pl. Some of their images are missing (spacers, parts of frames, even main images), but they appear after the first or second refresh (once konqueror has cached them). Refreshing doesn't fix the problem when the cache is off.
Comment 1 Maksim Orlovich 2008-10-03 01:18:06 UTC
Thanks for filing this, it's quite interesting, and very different from what I was guessing. With respect to the number of requests, it limits the number of connections, though potentially not enough. Are you perhaps seeing any connection refusals? The # of images shouldn't matter per se, since additional requests should just be queued. It seems possible that the communication between kio_http and the app is somehow messed up. I've fixed bugs like that before (they are tricky), but maybe there is an additional one with redirects.
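To illustrate the expected behaviour (a connection limit with the excess requests queued rather than dropped), here is a toy model. It is not how the KIO scheduler is implemented; the function name, the fake fetch, and the sleep standing in for network latency are all assumptions for the sake of the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_limited(job_count, max_connections):
    """Run job_count fake fetches with at most max_connections in flight."""
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_fetch(i):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.001)  # stand-in for network latency
        with lock:
            state["active"] -= 1
        return i

    # The pool queues every job beyond max_connections until a worker is free.
    with ThreadPoolExecutor(max_workers=max_connections) as pool:
        done = list(pool.map(fake_fetch, range(job_count)))
    return done, state["peak"]
```

With a correct queue, `run_limited(100, 4)` completes all 100 jobs while never having more than 4 in flight, which is why the number of images alone should not cause any of them to be skipped.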

Also, are you connecting over loopback, LAN, or internet?

And, well, how difficult is this thing to set up? If it takes a long time, I'll probably try to trick the kio_http guy into doing it instead of looking at it myself :)
Comment 2 Michal Witkowski 2008-10-03 10:51:38 UTC
You mean connection refusals in the server log? Nope. None at all. Other browsers (Opera, Firefox and even IE) work OK with the site regardless of connection speed.

I've checked a Wi-Fi connection, a LAN connection and connecting from outside (3 Mbit up). Konqueror still kept losing the images in similar proportions.

Actually this is pretty easy to set up. You just need Apache with PHP5 and the script. Then just set the paths to a directory containing lots of images and fire away.



Comment 3 Michal Witkowski 2008-10-04 16:14:31 UTC
Alright, I've set up the following example gallery on my website for all of you to see and test against.

http://continuity.intelink.pl/bugs/index.php?v=list&i=0&p=kde

It contains screenshots of websites I've encountered affected by this problem. These are:
http://www.gazeta.pl (major Polish website)
http://www.onet.pl (major Polish website)
http://www.digart.pl/przegladaj/dd.html?p=6&s=&k=
http://www.cnn.com
etc... See the location bar of the images to see the URLs.

I've duped each image 7 times to fill the gallery. The funny thing is, this time, when the script generated the thumbnails, each image loaded perfectly (each thumbnail had to be generated which means that each request took about 0.5s). This suggests that the problem occurs when the requests are handled too fast.

The above screenshots were made with KDE 4.1.2, so the problem is still around.
Comment 4 Michal Witkowski 2008-10-08 21:06:13 UTC
It appears that this bug in kio_http is caused by the existence of a transparent web proxy.

I'm running squid (2.6.STABLE20-1) on my home router as a transparent proxy. I get the above behaviour (with various sites) in konqueror only when the port 80 redirection to squid is turned on. The problems ceased to appear when I turned the transparent proxying off.

However, this is not a solution. My parents' ISP also uses a transparent proxy (squid again) and I get the same corruption when I visit their place. I can't get around the issue by removing the redirection of port 80 because, obviously, I don't control the ISP's servers.

The thing is: it's a KDE thing. Both Opera and Firefox work well with the sites with transparent proxy on.



Comment 5 Maksim Orlovich 2008-10-08 21:18:42 UTC
Thanks for the analysis, great detective work.
Comment 6 Michal Witkowski 2008-12-25 16:11:50 UTC
Created attachment 29620 [details]
konqueror konsole log

I'm now using KDE 4.2 beta2 and although the issue is limited (most of the sites are fine), I still get missing images on some pages. For example, on http://www.wykop.pl (a Polish variant of digg) I get a lot of missing item pictures, associated with an error message in the console log:
konqueror(29531)/kio (KIOJob) KIO::SlaveInterface::dispatch: error  123   "www.wykop.pl: Unknown error"   

Has anyone tracked what this issue may be related to?
Comment 7 Michal Witkowski 2009-04-12 10:52:46 UTC
Currently using KDE 4.2.2 and the bug is still there. 

wykop.pl is a good test site for this bug. I constantly get dropped images there, regardless of proxy usage. This may be related to the fact that wykop.pl requires a lot of HTTP requests to retrieve all page elements, because link thumbnails are retrieved as follows:
1. The image element points to i.wykop.pl
2. The request returns a 302 redirect with a new location pointing to Amazon S3
3. A new request to that location needs to be made
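The redirect chain in the steps above can be checked from outside the browser. This is a minimal sketch using only the Python standard library; the `trace` function and `NoRedirect` class are names invented for this example, and the sketch disables environment proxy settings so that each hop is requested directly.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so each 3xx surfaces as an HTTPError."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None


def trace(url, max_hops=5):
    """Return a list of (status, url) pairs, one per hop in the chain."""
    opener = urllib.request.build_opener(
        NoRedirect, urllib.request.ProxyHandler({})
    )
    hops = []
    for _ in range(max_hops):
        try:
            with opener.open(url, timeout=10) as resp:
                hops.append((resp.status, url))
                return hops
        except urllib.error.HTTPError as err:
            hops.append((err.code, url))
            location = err.headers.get("Location")
            if err.code not in (301, 302, 303, 307, 308) or not location:
                return hops
            # Location may be relative; resolve it against the current URL.
            url = urljoin(url, location)
    return hops
```

For a thumbnail that behaves as described, the result would be a 302 entry for the first URL followed by a 200 entry for the redirect target. Running it over every thumbnail URL on the page gives a baseline to compare against what konqueror actually requests.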

It seems that konqueror somehow runs out of khtml IO handlers and doesn't bother to retry retrieving elements later. By viewing a wireshark log, it is apparent that some web page elements aren't requested at all (no HTTP requests sent for them to i.wykop.pl).
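One way to list the elements that were never requested is to compare the image URLs in the saved HTML with the URLs seen in the capture. The sketch below assumes the requested URLs have already been exported from the capture as a plain list by some other tool; the class and function names are made up for this example, and it only looks at `<img src>` attributes, not CSS backgrounds or images inserted by scripts.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImgCollector(HTMLParser):
    """Collect absolute URLs of all <img src=...> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.urls.append(urljoin(self.base_url, src))


def unrequested_images(html_text, base_url, requested_urls):
    """Return image URLs from the page that are absent from requested_urls."""
    collector = ImgCollector(base_url)
    collector.feed(html_text)
    requested = set(requested_urls)
    missing = []
    seen = set()
    for url in collector.urls:
        # Keep page order and report each missing URL only once.
        if url not in requested and url not in seen:
            seen.add(url)
            missing.append(url)
    return missing
```

Images served from the browser cache will also show up as unrequested, so this comparison is most meaningful with a clean profile and the cache turned off.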

I attach the current wykop.pl website html (for reference), a screenshot, a wireshark libpcap capture file and the log of error messages from konqueror's console.

This bug is really annoying and renders konqueror unusable on some web pages.
Comment 8 Michal Witkowski 2009-04-12 10:53:42 UTC
Created attachment 32774 [details]
wykop.pl web html as seen 2009.04.12 10:50
Comment 9 Michal Witkowski 2009-04-12 10:54:29 UTC
Created attachment 32775 [details]
libpcap capture file of konqueror wykop.pl visit
Comment 10 Michal Witkowski 2009-04-12 10:55:05 UTC
Created attachment 32776 [details]
screenshot of wykop.pl as seen on 2009.04.12 with a clean KDE profile
Comment 11 Michal Witkowski 2009-04-12 10:55:47 UTC
Created attachment 32777 [details]
log of konqueror's wykop.pl visit on 2009.04.12
Comment 12 A. Spehr 2009-06-02 09:36:52 UTC
Any progress on this? When looking at the current http://wykop.pl page, I don't see any dropped images...

 4.2.86 here
Comment 13 Dawit Alemayehu 2010-12-26 23:52:54 UTC
The http://wykop.pl issue cannot be reproduced with recent versions of KDE. There might be other dropped-image issues, but those are probably caused by the caching issues in kio_http. See bug 256104.

Anyhow, feel free to reopen this report if you still see the problem in KDE 4.5.x and up. Thanks for the report.