Bug 59965 - [testcase] Should not refetch page for saving or changing view modes
Summary: [testcase] Should not refetch page for saving or changing view modes
Status: RESOLVED FIXED
Alias: None
Product: konqueror
Classification: Applications
Component: general
Version: 3.2.1
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Konqueror Developers
URL:
Keywords:
Duplicates: 59534 59966 61940 72938 78793 79136 80513 119241 122373 122759 128985 132267
Depends on:
Blocks:
 
Reported: 2003-06-17 21:42 UTC by András Manţia
Modified: 2008-06-02 16:54 UTC
CC List: 21 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
testcase (969 bytes, application/octet-stream)
2008-05-04 17:57 UTC, Michael Leupold
Print from a techbase page I used for analysis (52.52 KB, application/octet-stream)
2008-05-04 18:19 UTC, Michael Leupold

Description András Manţia 2003-06-17 21:42:06 UTC
Version:           3.1.2 (using KDE KDE 3.1.2)
Installed from:    Compiled From Sources

When you try to save a web page or an image from a page in Konqueror while your internet connection is down (e.g. because you closed it to read the pages offline), Konqueror tries to fetch the page/image from the server once more, and fails. I believe it should save exactly the page that you see, not whatever is currently at the same address on the server. The two pages may differ (for example if you save the page several hours after loading it), but even in this case I believe the user is interested in the page that he/she sees and not in the up-to-date/changed version which is on the server.
I don't know if it's the same problem when you try to print, but I guess so. 
And for sure it's the same issue when you try to view the document source.

Andras
Comment 1 Simon Huerlimann 2003-06-17 21:47:26 UTC
*** Bug 59966 has been marked as a duplicate of this bug. ***
Comment 2 Simon Huerlimann 2003-06-17 21:50:37 UTC
I just filed #59966 :-) and closed it already; you were just faster... 
I changed the summary to be more general, as the offline stuff is just a special case. 
Change back if you want 
 
Simon 
Comment 3 András Manţia 2003-06-18 07:41:47 UTC
Subject: Re:  Should not refetch page for saving

Hi,

 Yes, you were a little bit faster. :-) I'm OK with the new summary as well.

Andras

On Tuesday 17 June 2003 22:50, Simon Huerlimann wrote:
> simon.huerlimann@access.unizh.ch changed:
>
>            What    |Removed                     |Added
> ---------------------------------------------------------------------------
>            Summary |Saving in offline mode      |Should not refetch page for
>                    |                            |saving
>
> ------- Additional Comments From simon.huerlimann@access.unizh.ch 
> 2003-06-17 21:50 ------- I just filed #59966 :-) and closed already, you've
> been just faster... I changed the summary to be more general, as the
> offline stuff is just a special case. Change back if you want
>
> Simon

Comment 4 esigra 2003-06-18 07:52:28 UTC
The same problem seems to occur sometimes when going back/forward. 
Comment 5 esigra 2003-06-18 07:56:08 UTC
See also the related bug report 14553 (and vote for it). 
Comment 6 András Manţia 2003-06-18 08:15:57 UTC
Subject: Re:  Saving in offline mode

Ok, it works as expected if the cache is set to "Use cache if possible". But I 
still think it should work the same way on save/print/view source/save image 
even if the cache is set to "Keep cache in synch". The user (at least I) 
expects that the page will be saved in its current form, and I would be 
surprised if the local copy is not the one that I thought I saved 
(e.g. because the file on the server has changed in the meantime).

Andras

Comment 7 Tim Middleton 2003-06-18 14:02:58 UTC
I absolutely agree with the previous comment. This used to drive me nuts about
Mozilla; it would always reload to save/print also. It just seems like broken
logic to me: a person virtually always wants to save/print what they *see*... not
what gets reloaded. It also caused a lot of problems trying to print "post"
forms in Mozilla... I haven't tried this with Konqueror.
Comment 8 Waldo Bastian 2003-06-24 13:57:30 UTC
KHTML now tries harder to load the page from the cache. However, there is no 
guarantee that the page is stored in the cache. 
 
Can you provide some URLs of pages where this is still a problem with recent CVS 
HEAD? Also mention the exact action(s) that cause server-traffic. 
Comment 9 András Manţia 2003-06-25 08:26:28 UTC
Subject: Re:  Should not refetch page for saving

It seems to work OK. Can this be backported to BRANCH?

Andras
Comment 10 Vedran Ljubovic 2003-06-28 17:37:20 UTC
The same problem applies to "View Source". Is that fixed as well? 
Comment 11 esigra 2003-07-12 23:24:12 UTC
> KHTML now tries harder to load the page from the cache. However, there is no 
> guarantee that the page is stored in the cache.  
  
Now the obvious question is: Why can't there be a guarantee that whatever is 
displayed in the browser is also cached? Just don't delete cache entries as long as they 
are being displayed. 
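
A rough sketch of that idea in Python, purely for illustration. This is not how KHTML or kio_http_cache_cleaner actually work; the class and method names (`PinnedCache`, `pin`, `unpin`, `clean`) are made up. The point is only that a cleaner can skip entries that some view currently has on display, at the cost of the cache temporarily exceeding its size limit:

```python
from collections import OrderedDict


class PinnedCache:
    """Toy size-limited cache whose cleaner skips entries that are on display."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.entries = OrderedDict()  # url -> bytes, oldest first
        self.pins = {}                # url -> number of views showing it

    def store(self, url, data):
        self.entries[url] = data
        self.entries.move_to_end(url)
        self.clean()

    def pin(self, url):
        # Called when a view starts displaying the entry.
        self.pins[url] = self.pins.get(url, 0) + 1

    def unpin(self, url):
        # Called when the view navigates away or is closed.
        self.pins[url] -= 1
        if self.pins[url] == 0:
            del self.pins[url]
        self.clean()

    def size(self):
        return sum(len(data) for data in self.entries.values())

    def clean(self):
        # Evict the oldest unpinned entries until the cache fits again.
        for url in list(self.entries):
            if self.size() <= self.max_bytes:
                break
            if url not in self.pins:
                del self.entries[url]


cache = PinnedCache(max_bytes=10)
cache.store("http://example.org/a.png", b"12345678")
cache.pin("http://example.org/a.png")   # a tab is displaying it
cache.store("http://example.org/b.png", b"12345678")
```

In this toy run the displayed entry survives the cleanup even though it is the oldest one; the entry nobody is looking at is evicted instead.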
Comment 12 cb-kde 2003-07-27 22:46:04 UTC
Shouldn't it save the displayed item, even if the cache is disabled? 
Comment 13 Stephan Binner 2003-08-02 11:33:52 UTC
*** Bug 61940 has been marked as a duplicate of this bug. ***
Comment 14 Wilco Greven 2003-10-24 14:25:42 UTC
*** Bug 59534 has been marked as a duplicate of this bug. ***
Comment 15 Vedran Ljubovic 2003-11-12 18:20:15 UTC
Using 3.1.93. It appears to be working fine now, both for save and for view source; at least I didn't find any situation where it refetched the page. I suggest that this bug be closed now.
Comment 16 Mikolaj Machowski 2003-11-13 02:56:08 UTC
Subject: Re:  Should not refetch page for saving

> ------- Using 3.1.93. It appears to be working fine now, both for save
> and for view source, at least I didn't find any situation where it
> refetched the page. I suggest that this bug be closed now.

Yes, I can confirm that. But still only the main content (text) is saved. No
images, stylesheets etc.

Comment 17 Mark Szentes-Wanner 2003-11-14 20:04:31 UTC
If I save images or reopen them in a new window via the "View Image" context menu, Konqueror reloads them, even if the local cache is switched on in Konqueror's Preferences.
Comment 18 Jesper Juhl 2004-02-27 17:40:10 UTC
> > KHTML now tries harder to load the page from the cache. However, there is no 
> > guarantee that the page is stored in the cache. 
> 
> Now the obvious question is: Why can't there be a guarantee that whatever is 
> displayed in the browser is also cached? Just don't delete cache entries as long as they 
> are being displayed. 

I very much agree with this comment. As long as a page is displayed it (and all elements it contains) should be cached locally and be guaranteed to be cached locally. As soon as the user leaves the page it may be purged from the cache, but as long as it is visible the sensible thing is to *always* guarantee that a cached local copy is available (yes, even if caching is turned off). I honestly don't see any reason to do anything else - is it that hard to do?
Comment 19 Mikolaj Machowski 2004-02-27 22:38:21 UTC
> purged from the cache, but as long as it is visible the sensible thing is
> to *always* guarantee that a cached local copy is available (yes, even if
> caching is turned off). I honestly don't see any reason to do anything
> else - is it that hard to do?

I don't know, but this is one of the most infuriating things in
Konqueror. I often open a page in a tab and don't have time to inspect
it properly for a few hours. When I finally look at it, what do I see?
E.g. no images, because http_cache_cleaner removed them. Also, how can I
see CSS files? Of course I can go to /var/user-kde/http/w and find them,
but that is extremely unfriendly. With Mozilla I can save the "whole page" and
browse images in an image viewer, files in an editor, etc. A partial solution
would be Web Archiver, but it cannot properly use the cache...

Comment 20 jamethknorth 2004-03-27 03:33:36 UTC
Due to the fact that changing your current viewer reloads the document, the 'View Document Source' option cannot be removed from the RMB menu, causing clutter in an already cluttered menu. This needs to be resolved.
Comment 21 Matt Rogers 2004-04-03 18:55:33 UTC
*** Bug 78793 has been marked as a duplicate of this bug. ***
Comment 22 Stephan Binner 2004-04-08 11:01:11 UTC
*** Bug 79136 has been marked as a duplicate of this bug. ***
Comment 23 Caoilte O'Connor 2004-04-13 10:39:58 UTC
I think I now understand why I found Konqueror's view source so annoying. In the Debian packages (and I assume upstream as well) "Use cache" is not turned on by default (and the cache size is also set to something like 1MB).

It didn't even occur to me that I need to _configure_ a web browser to use its cache. It took me several weeks to stumble across the configuration tab. Perhaps when you view source a pop-up could inform you whether you are about to refetch the page from the internet, and also how you can configure the browser to have the page available in the cache in future.
Comment 24 Leo Savernik 2004-04-13 14:24:45 UTC
No, that must be a Debian peculiarity. Khtml's defaults, as preset by KDE are:
[x] Use cache
[x] Keep cache in sync

Disk cache size: [   5Mb]
Comment 25 Lubos Lunak 2004-04-28 10:34:49 UTC
*** Bug 80513 has been marked as a duplicate of this bug. ***
Comment 26 Stefan Mueller 2004-05-20 10:03:13 UTC
I often save images from a web page to my disk. It would be great if this is done instantly without a second download (esp. annoying when browsing offline).
(KDE 3.2.1, Suse 9.1 pro)
Comment 27 Oded Arbel 2004-05-20 10:08:18 UTC

    
Comment 28 Mikolaj Machowski 2004-05-20 17:11:42 UTC
> This you actually can do easily - when you see an image, right click it
> and choose "copy to". It will copy it to your target folder immediately w/o
> reloading. Another bonus is that it remembers the last 5 destinations so
> you don't have to browse to the target folder again (and it's better than
> "Save as" which only remembers the last target folder and sometimes not
> even that)

Not entirely true. I often open links in separate tabs and 
return to them after hours of other activity. I try to save an 
image... and BOOM! kio_http_cache_cleaner wipes out image.

m.

Comment 29 David Faure 2004-09-10 20:02:29 UTC
> BOOM! kio_http_cache_cleaner wipes out image
Well, increase the cache size then...

So there's still a bug here, that is mostly apparent when going offline?
With both saving images and 'view' source? Any specific URL where it happens?
Comment 30 Mikolaj Machowski 2004-09-11 01:25:09 UTC
> > BOOM! kio_http_cache_cleaner wipes out image
> Well, increase the cache size then...
I wrote previously: cache size is set to 200M but real
size is kept at 40M.
> So there's still a bug here, that is mostly apparent when going offline?
> With both saving images and 'view' source? Any specific URL where it
> happens?
Anywhere.

Comment 31 Gioele Barabucci 2004-09-11 16:54:46 UTC
In reply to #29

"Save as" should save without redownload even without cache enabled (as I set it right now).
The page is loaded somewhere in memory, this must be enough.

(Memory == anyplace that can store it: RAM, separate disk cache for "currently loaded pages" or anything else)
Comment 32 Gioele Barabucci 2004-09-11 17:02:14 UTC
It is also dangerous to save pages that have been generated after a POST/GET form.

Think of www.mystore.com/buy.php?cc=0192939483&article=xxx

If you try to save that page, which displays "well, the article xxx will be billed to 0192939483", you're going to redownload it, and that means two articles paid for instead of one.

I know this example is a bit stupid, but there are more stupid sites around.
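
To make the difference concrete, here is a toy sketch in Python. Nothing in it is Konqueror code: `ToyShop`, `save_by_refetch` and `save_from_memory` are hypothetical names, and the "server" is just an object with a counter standing in for a site whose order page has a side effect.

```python
import os
import tempfile


class ToyShop:
    """Stands in for a server whose order page has a side effect."""

    def __init__(self):
        self.orders = 0

    def buy(self, article):
        # Every request places a new order.
        self.orders += 1
        return f"<p>Article {article} ordered (order #{self.orders})</p>".encode()


def save_by_refetch(shop, article, path):
    # The behaviour complained about here: saving repeats the request.
    with open(path, "wb") as f:
        f.write(shop.buy(article))


def save_from_memory(displayed, path):
    # The behaviour asked for: write the bytes that are already on screen.
    with open(path, "wb") as f:
        f.write(displayed)


shop = ToyShop()
displayed = shop.buy("xxx")                  # the user submits the form once
tmpdir = tempfile.mkdtemp()

seen_path = os.path.join(tmpdir, "seen.html")
save_from_memory(displayed, seen_path)
orders_after_memory_save = shop.orders       # no new request was made

refetched_path = os.path.join(tmpdir, "refetched.html")
save_by_refetch(shop, "xxx", refetched_path)
orders_after_refetch = shop.orders           # the request was repeated
```

Saving from memory leaves the order count untouched and writes exactly what was displayed; saving by refetching places a second order and writes a different page.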
Comment 33 Matt Rogers 2004-12-20 06:45:36 UTC
*** Bug 72938 has been marked as a duplicate of this bug. ***
Comment 34 raditzman 2005-06-01 18:53:18 UTC
Someone should change this bug's title to "Should not refetch page and images when web archiving or saving image using 'save as'", i think...
Comment 35 raditzman 2005-06-01 19:38:37 UTC
Oops, I've just checked on KDE 3.4.1 and here are the results...

save image as --> fetches from cache

Save as (at File menu) --> fetches from cache

View Source-Code --> fetches from cache 

web archiving --> doesn't fetch from cache

copy to       --> doesn't fetch from cache


So this bug should be closed (saving doesn't refetch the page) and two new ones opened (which I just did):
[106615]"Should get elements in cache for web archiving (not refetch from network)" and 
[106616]"Should get image from cache when using 'save image as' to save an image"
Comment 36 Matt Rogers 2005-06-02 01:30:44 UTC
closing according to the observations made in the last comment.
Comment 37 cb-kde 2005-06-02 12:53:38 UTC
Above comments are not a comprehensive summary. In particular, changing View Mode  has not been tested. I'm not currently in a position to test it myself.

Raditzman@yahoo.com: can you test this for HTML pages (change to text editor) and images (change between image viewer and embedded image viewer)?

Also, was the Konqueror cache enabled or disabled when you did your test? Konqueror should not re-download even if the cache is disabled.

Thanks
Charlie

Comment 38 Leo Savernik 2005-06-02 16:02:43 UTC
Sorry, resolving bugs on vague observations is not the way to go.

Reopening. I miss the tested behaviour when the cache is switched off, as mentioned by cb-kde@fish.zetnet.co.uk, and when a page has been idle for some hours (namely, the khttpcacheclear has run in the meantime).

If neither the page itself, nor any of its embedded elements (images, objects, frames, styles, scripts) are refetched, then and only then this bug can be resolved.
Comment 39 András Manţia 2005-06-02 17:59:47 UTC
Exactly. I remember (cannot check right now) that it also happens with
current Konqueror that, after a while, if you move the mouse over an image,
the image disappears.
Comment 40 Thiago Macieira 2005-06-10 06:28:43 UTC
Two new bugs were opened when this one got closed, to list issues still found. Does anyone know the ids?
Comment 41 raditzman 2005-06-11 03:50:29 UTC
This one has been reopened because I didn't test it as properly as I thought (see discussion here at bugs.kde.org). The 2 I opened are:

106615
I closed the other one, because it was the same as this one. 

What I think needs to be done is:
1 - test all the cases in the entire KDE system, to evaluate when there is a refetch from the network (bug) and when there isn't one (thus working right) where applicable. Also one should see how enabling/disabling the cache, and long time periods after first access, affect these cases.
2 - evaluate where the fault for each bug is (some in Konqueror, others perhaps in some KPart or utility)
3 - open bugs for each specific bug in each specific component, and address each issue separately, as they are in fact separate from each other.
4 - Follow the development of all of those bugs until they are successfully closed (hopefully before 3.5). 

p.s. - I'm a newbie developer, never tried to program for kde and so can't fix these bugs.
Comment 42 m.wege 2006-02-17 13:49:38 UTC
The problem also appears when you view a PDF in KPDF, if you choose to save the file, the download starts all over again. Very annoying, if you only have a slow modem connection.
Comment 43 Tommi Tervo 2006-02-21 13:56:24 UTC
*** Bug 119241 has been marked as a duplicate of this bug. ***
Comment 44 Tommi Tervo 2006-02-21 13:56:33 UTC
*** Bug 122373 has been marked as a duplicate of this bug. ***
Comment 45 Tommi Tervo 2006-02-27 08:49:59 UTC
*** Bug 122759 has been marked as a duplicate of this bug. ***
Comment 46 András Manţia 2006-03-10 07:45:03 UTC
This is really annoying. I just looked at a PDF inside Konqueror. It is 
2.8 MB long, which took some time to download. Now when I want to save 
it downloads again. On dial up this is no fun.
Comment 47 Maciej Pilichowski 2006-03-28 21:16:21 UTC
My only comment to this report is such that I agree with most /all? :-)/ of you -- Konqueror should work exactly in WYSIWYG mode. "I save what I see". Cache should be used for jumping through the history, not for saving or changing view mode. I already see something, I want to have this on my disk! Konqueror here falls in a trap "I know better (what the user wants)".

However small exception -- if object /image, page/ I want to save is incomplete, Konqueror should display question dialog if I want to save it incomplete or should Konqueror reload it for me before saving /or easier -- "object incomplete -- save/cancel ?"/.
Comment 48 Oded Arbel 2006-03-28 23:33:12 UTC
I think there is some confusion here: the object of this ticket is for Konqueror to always save from the cache, instead of reloading - i.e. cache is used for history and ALSO for saving and changing modes.

I also didn't understand the comments about "incomplete" - do you mean, when trying to save an object before it was fully loaded ? I would think that Konqueror should then wait for the object to completely load before finishing to save (it might start to save before the object was completely loaded). If you hit ESC or in some other way abort the download, then when saving you would then save the incomplete object.
Comment 49 Maciej Pilichowski 2006-05-10 22:40:59 UTC
> I also didn't understand the comments about "incomplete" - do you mean, when
> trying to save an object before it was fully loaded ?

Incomplete -- the connection was lost, Konqueror didn't load the full image, I accidentally hit Esc, etc. etc. Btw. it is so common for Konqueror to save corrupted data that I posted an explicit request for fixing this bug:

http://bugs.kde.org/show_bug.cgi?id=126478
Comment 50 Tommi Tervo 2006-08-12 13:44:48 UTC
*** Bug 128985 has been marked as a duplicate of this bug. ***
Comment 51 Tommi Tervo 2006-08-12 13:45:01 UTC
*** Bug 132267 has been marked as a duplicate of this bug. ***
Comment 52 Māris Nartišs 2006-08-16 14:17:08 UTC
It's quite annoying to change the view mode from KHTML to the text editor and see a different page, not the same one as in KHTML. As AJAX-based web pages get more popular, more and more users will be disappointed by the lack of a way to save the current web page.
Also, redownloading huge PDFs really sucks (I once opened a PDF that took half an hour to download. Yet another half hour to refetch it just to save...)

BTW, it works for printing - I can print javascript modified pages with Konqueror 3.5.2.
Comment 53 Stephan Sokolow 2007-01-15 23:34:26 UTC
I just experienced this problem with a TigerDirect.ca POST order summary on Konqueror 3.5.5. Thankfully, the only problem was my accidentally saving a blank invoice to disk.

It also exacerbates another long-standing bug which wreaks havoc between Konqueror and Fanfiction.net. "Save As..." of http://www.fanfiction.net/s/960637/30/ will always fail with "Error 404: Could not retrieve http://www.fanfiction.net/s/960637/30/index.html"
Comment 54 Stephan Sokolow 2007-10-06 12:28:29 UTC
Update: Fanfiction.net now lets you stick anything onto the end of URLs (they use it to stick a copy of the story title into the URL) so a new example is necessary.

Here's one: http://lwn.net/Articles/246381/
Comment 55 Michael Leupold 2008-05-04 17:57:54 UTC
Created attachment 24628
testcase

This is a little testcase I constructed. It uses PHP to delay loading so you can
actually see if some content is reloaded (needs fpdf.php).

You can find it online at:
http://test.confuego.org/59965/59965-html.php - for html content
http://test.confuego.org/59965/59965-gif.php - for image content
http://test.confuego.org/59965/59965-pdf.php - for pdf content
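
For anyone who wants to reproduce this locally without PHP, the same idea can be sketched in Python. This is not the attached testcase, just an approximation of it under my own assumptions: the names `SlowHandler`, `start_server`, `hits` and `DELAY_SECONDS` are made up, and it only serves HTML. Each request is delayed and counted per path, so a reload is both visible (the page takes a while to appear) and measurable (the counter goes up).

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer

hits = {}            # path -> number of times it was requested
DELAY_SECONDS = 0.2  # raise this for manual tests in a browser


class SlowHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Count the request, then stall so a refetch is noticeable.
        hits[self.path] = hits.get(self.path, 0) + 1
        time.sleep(DELAY_SECONDS)
        body = f"<html><body>request #{hits[self.path]}</body></html>".encode()
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the console quiet


def start_server(port=0):
    """Serve in a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), SlowHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Start it with `server = start_server(8000)`, open `http://127.0.0.1:8000/page` in the browser, perform the action under test (save, view source, switch view mode) and then look at `hits["/page"]`: anything above 1 means the page was fetched again.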
Comment 56 Michael Leupold 2008-05-04 18:18:14 UTC
I analyzed the matters on trunk r803905 and 3.5.9 alike.

Things that reload while they should not, 3.5.9:
- saving pdf (kpdf, uncached)
- saving image (gvpart, uncached)
- switching view on images (uncached)
- switching view on html (uncached)
In all of those cases the version already downloaded could be used.

Things that reload while they should not, 4.1 trunk:
- saving image (gvpart, uncached)
- switching view on images (uncached)
- switching view on html (uncached)

So at least the pdf reloading problem is solved - maybe attributable to okular. I agree that an already loaded document should not necessarily have to reload on switching the view in uncached mode - especially as it's clearly visible the file has to lurk around somewhere (e.g. viewable using "show document source").

I posted tables with some extra information on: http://techbase.kde.org/User:Lemma/KonquerorCaching - I'll also attach it as a PDF so it stays up.
Comment 57 Michael Leupold 2008-05-04 18:19:59 UTC
Created attachment 24630
Print from a techbase page I used for analysis

This is a pdf print from the techbase page I used to test the caching
behaviour. I also attached some notes to some of the items.
Comment 58 Michael Pyne 2008-05-28 01:24:31 UTC
Saving an image using gvpart is bug 162141, now fixed.
Comment 59 David Faure 2008-06-02 14:50:57 UTC
On Sunday 04 May 2008, Michael Leupold wrote:
> Things that reload while they should not, 4.1 trunk:
> - saving image (gvpart, uncached)


That one is a gvpart bug, it should save its own data from memory directly.

> - switching view on images (uncached)
> - switching view on html (uncached)


But this is no bug. If you switch to another component, and if you have disabled caching, then where should this other component get the data from?
It has to download it again, by design... otherwise you need a cache, and that's exactly what you disabled ;-)

> So at least the pdf reloading problem is solved - maybe attributable to okular. I agree that an already loaded document should not necessarily have to reload on switching the view in uncached mode - especially as it's clearly visible the file has to lurk around somewhere (e.g. viewable using "show document source").


Well "show document source" saves the data from memory into a temporary file. But that solution isn't applicable
to switching to another component: you would see /tmp/kde-dfaure/konquerorfooxrh.tmp instead of the real remote
URL, which would break any relative paths to other things from HTML files, etc. (ok this is not loaded by the plain text
editor, but there's no way for konqueror to know what kind of component is being selected... it could very well
be webkitpart, or any other component that can really render the HTML). Switching components still means
using the same URL, not switching to a tmp URL.
For the contents of that URL to come without a reload, you need the cache to be enabled.
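
The relative-path problem can be illustrated with a few lines of Python (standard `urllib.parse` only). The URLs are invented for the example, including the temp file name; the point is just that a relative link resolves against whatever URL the component was given.

```python
from urllib.parse import urljoin

remote_url = "http://example.org/docs/page.html"
temp_url = "file:///tmp/kde-user/konquerorXXXX.tmp"  # made-up temp file name
relative_link = "images/logo.png"

# Resolved against the real URL, the link points at the server as intended.
resolved_remote = urljoin(remote_url, relative_link)

# Resolved against the temp file, it points into /tmp, where nothing exists.
resolved_temp = urljoin(temp_url, relative_link)

print(resolved_remote)  # http://example.org/docs/images/logo.png
print(resolved_temp)    # file:///tmp/kde-user/images/logo.png
```

So a component handed the temp URL would look for images, stylesheets and linked pages next to the temp file instead of on the original server.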
Comment 60 David Faure 2008-06-02 16:54:30 UTC
> That one is a gvpart bug, it should save its own data from memory directly.

Aurélien Gateau tells me that he fixed this recently.
So AFAICS this closes this report - it all behaves as expected now.