Bug 150054

Summary: KGet fails to download directory-style HTTP URLs which break when index.html is appended
Product: [Applications] kget
Reporter: Stephan Sokolow <kde_bugzilla_2>
Component: general
Assignee: KGet authors <kget>
Status: RESOLVED FIXED    
Severity: normal    
Priority: NOR    
Version: 0.8.5   
Target Milestone: ---   
Platform: Compiled Sources   
OS: Linux   
Latest Commit:
Version Fixed In:

Description Stephan Sokolow 2007-09-21 09:55:57 UTC
Version:           0.8.5 (using KDE 3.5.7)
Installed from:    Compiled From Sources
OS:                Linux

When trying to download a URL which ends in a / (and therefore appears to be a directory), KGet silently appends index.html to the request. This causes 404 errors on the many sites which use non-standard DirectoryIndex files, mod_rewrite and friends, or their non-Apache equivalents.

As of this posting, the problem appears with http://lwn.net/Articles/246381/

It used to happen on Fanfiction.net (requiring me to either use wget and {1..whatever} in the shell to save chapters, or read my fanfiction in Firefox), but they've now altered their configuration so that arbitrary text can follow the final slash without affecting the URL's meaning. (They use it to embed a sanitized version of the story or author name in the URL.)
Comment 1 Urs Wolfer 2007-09-22 16:51:14 UTC
I'm not sure I understood correctly. Would you like to download websites? Or files that have no filenames?
Comment 2 Stephan Sokolow 2007-09-23 03:07:31 UTC
Neither is truly correct, but "the files have no filenames" is reasonably close. 

Basically, if I feed http://www.foo.com/bar/ into KGet, it'll try to download http://www.foo.com/bar/index.html and, if my index file is named index.php, index.shtml, main.asp, or whatever.foo, it won't work.

Same problem if the URL isn't actually a file path. (For example, Ruby on Rails routes, Pylons routes, many mod_rewrite rules, and PATH_INFO tricks like http://www.foo.com/bar.php/queryID/)
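
To make the request concrete, here is a minimal Python sketch of the behavior I'd expect. It is not KGet's code, and the helper names (local_filename, plan_download) and the index.html fallback are my own assumptions for illustration. The point is that the URL sent to the server stays exactly as entered, and a default name is used only for the file saved on disk.

```python
import posixpath
from typing import Tuple
from urllib.parse import unquote, urlsplit


def local_filename(url: str, fallback: str = "index.html") -> str:
    """Pick a name for the saved file without touching the requested URL."""
    path = unquote(urlsplit(url).path)
    # basename() is empty when the path ends in "/", i.e. no filename is given.
    name = posixpath.basename(path)
    return name or fallback


def plan_download(url: str) -> Tuple[str, str]:
    """Return (URL to request, local filename); the URL is passed through as-is."""
    return url, local_filename(url)


if __name__ == "__main__":
    for example in (
        "http://www.foo.com/bar/",
        "http://www.foo.com/bar.php/queryID/",
        "http://www.foo.com/files/archive.tar.gz",
    ):
        print(plan_download(example))
```

With this approach the server decides what a trailing-slash URL means (index.php, a rewrite rule, a PATH_INFO handler), and the client only has to invent a name for the local copy.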
Comment 3 Lukas Appelhans 2007-11-11 00:27:10 UTC
*** Bug has been marked as fixed ***.
Comment 4 Lukas Appelhans 2007-11-11 00:27:29 UTC
Fixed in KDE4-trunk