Version: (using KDE 3.4.0)
Installed from: Compiled From Sources
OS: Linux

In Konqueror, when clicking on an archive on a local drive, it is opened right there using the relevant kioslave, which is very useful. However, when clicking on an archive on a remote drive, e.g. sftp or smb, it is opened using "ark", which looks and feels very different to the Konqueror icon and tree views. That's confusing and inconsistent.

The problem is that kioslaves can't be nested. When clicking on a local archive, the resulting URL is something like:

tar:/dir/archive.tgz

This means that the 'tar' slave directly accesses local files, which duplicates functionality of the 'file' slave and stops it from working on remote files (unless they're mounted locally of course).

Ideally, ioslaves for archives would be wrappers around other slaves, thereby allowing for remote access, e.g.:

tar:(file:/dir/archive.tgz)
tar:(sftp://user@server/archive.tgz)

The parentheses (or some other syntax) would be necessary to disambiguate directories when browsing into archives, e.g.:

tar:(sftp://user@server/archive.zip)/dir

In this way, nested archives would be possible too, e.g.:

zip:(tar:(file:/dir/archive.tgz)/file.zip)

I think this should be considered for KDE4.0.
Problems with that:

1) That address would no longer be a valid URL. We could come up with a "hostname" that encodes the real address, like zip://smb%3a%2f%2fserver%2fsharename%2ffile.zip/ but I am afraid that the hostname strictness we enforce (STD 3 compliance) would also reject that hostname.

2) IOSlaves don't provide random-access functions. So, in order to do that, we would have to download the whole file first. That's what happens with ark, but not what happens with local files using the ioslaves.
I accept this isn't a trivial request, but do you agree that nested ioslaves would be good for interface consistency and network transparency? KDE4 is a great opportunity to make these kinds of substantial changes.

Regarding 1) True, these addresses wouldn't be URLs as we know them, but should that be a showstopper? KURL could be extended to support nested protocols, whereby standard one-protocol URLs would just become a special case. I don't know what STD3 compliance entails, but I don't think the hostname encoding scheme would be a good idea anyway because it's not really human-readable and because it wouldn't allow for multiple nesting levels.

Regarding 2) I don't think that would be a problem from the user's perspective, as long as the download is performed transparently, e.g. the user doesn't get to see the temp file address, and longer downloads are indicated by a progress bar, which is of course what already happens when applications access ioslaves. But perhaps it's a good time anyway to look at random-access functions in ioslaves for protocols that do support it?
Should the fact that you can't represent the address as a URL be a problem? Yes, a very grave one indeed. If it isn't a URL, it cannot be passed to other programs.

Extending KURL to "accept" the broken URIs would be unwise, as we are trying to clean it up. Maybe we could conjure up a new URI scheme that would allow us to do that, but remember that URIs are a superset of URLs and not all programs can accept them.

As for random-access functions, I fully agree with you.
If nested URLs were implemented in KURL and kio, wouldn't all KDE programs automatically be able to use them? In the case of non-KDE programs, they have to be tricked using temporary files anyway if they're supposed to work with ioslaves. So I'm not sure I understand the problem.

As for cleaning up or splitting up KURL, I don't think that necessarily rules out extending its functionality, as long as it's done in a coherent and well thought-out way.

Thinking about it, things like tar and zip aren't really protocols, but filters, so perhaps a syntax along the following lines might be worth considering:

sftp://user@server/dir/archive.tgz|tar/file

This would allow for multiple filters without requiring parentheses and correspond nicely with shell syntax. (But I suspect the bar character isn't available to be used like that?)
There's no such thing as a nested URL. It's either a proper URL, or it's not.

However, if we get decent URI support (which I intend to provide), we could invent a URI scheme of our own and do things like:

accessing a file inside a remote tarball:
multi:fish://me@server.domain.com/home/me/file.tar.gz,tar:/directory/file.c

automatically uncompressing a gzip's content:
multi:file:///home/thiago/text.ps.gz,gzip:/

It would probably be a good idea to try and standardise that as a FreeDesktop spec, so that we are allowed to pass them in %U/%u.

It would also be possible to craft such a solution with the current implementation, using kioslaves:

multi://me@server.domain.com/?p1=fish,home/me/file.tar.gz&p2=tar,directory/file.c
multi:///?p1=file,home/thiago/text.ps.gz&p2=gzip

What do you think? I prefer the URI way, since it's cleaner, but I'm not sure if it is supported by KIO currently. What it would lack is proper hostname-encoding, since KURL doesn't support URIs very well now. The second one, though, is fully compliant and would certainly be doable now.

I am assuming the second and further ioslaves down the chain don't need hostnames, ports, usernames or passwords. Can you think of a situation where they would be required? (I can imagine people wanting to nest fish inside fish for proxying, but I don't think we should support that)
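To make the query-style form concrete, here is a minimal Python sketch of how such a URL could be taken apart into its stages. The function name parse_multi_query is hypothetical and the p1, p2, ... parameter names simply follow the examples in the comment above; none of this is an existing KIO or KURL interface.

```python
from urllib.parse import urlsplit, parse_qsl


def parse_multi_query(url):
    """Split a query-style multi: URL into its host part and stages.

    Hypothetical helper: p1, p2, ... are read in order until the first
    missing index. Each value is "protocol" or "protocol,path".
    """
    parts = urlsplit(url)
    if parts.scheme != "multi":
        raise ValueError("not a multi: URL")
    params = dict(parse_qsl(parts.query))
    stages = []
    index = 1
    while f"p{index}" in params:
        protocol, _, path = params[f"p{index}"].partition(",")
        stages.append((protocol, path))
        index += 1
    # Only the first stage may use the user/host part, as assumed above.
    return parts.netloc, stages
```

The user, host and port stay in the normal authority position, so only the first stage can use them, which matches the assumption that later stages never need credentials.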
When you say the second way is doable now, do you mean for 3.5? That would be great, in whatever form, especially if it gets discussions about ioslave architecture for 4.0 going. I also prefer the first scheme though, because it's cleaner and more readable.

The fish-inside-fish scenario would actually be quite useful, e.g. it would allow me to browse my computer at work while having to go through my department's ssh gateway, but it doesn't seem to fit the kind of URI nesting we're discussing here. But extending the fish URI scheme to support host chains might do the job, e.g.:

fish://user1@host1//user2@host2/dir/file

(Should I file a separate wish for that?)

Apart from that, one could imagine ioslaves that support encrypted files or archives, thus requiring username or password.

If something like the multi: scheme could be done with URIs, then how about the scheme I had originally suggested? It doesn't necessarily have to use parentheses; perhaps square brackets or curly braces would be preferable, e.g.:

tar:(file:/dir/archive.tar)/file
gzip:[sftp://user@host/dir/doc.ps.gz]
bzip2:{tar:{fish://host/archive.tar}/file.bz2}

I think any of those would be cleaner and more readable than the multi: scheme. (Things like gzip and bzip2 wouldn't really require the bracketing, but I think it's needed for consistency. The current gzip:/dir/file style URLs could still be supported for compatibility.)

The difference in syntax of course also suggests a difference in implementation. The multi: scheme would only require one new ioslave that would be able to glue existing slaves together, probably using temporary files. The bracketing scheme on the other hand would require changes to all the relevant ioslaves, replacing code for accessing the local file system with code for emulating random access to the nested ioslave. Much of that though could be factored out into a common base class (KIO::FilterSlave?).
This would also prepare it nicely for ioslaves that directly support random access, whereas the multi scheme couldn't do that without making substantial changes to the filter-style ioslaves too.
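As a rough illustration of what a parser for the bracketing scheme might look like, the following Python sketch unwraps the nesting recursively. The name parse_bracketed is hypothetical, and it assumes that bracket characters never appear unescaped inside paths, which a real design would have to settle.

```python
import re

# Opening bracket -> matching closing bracket for the three styles above.
_PAIRS = {"(": ")", "[": "]", "{": "}"}


def parse_bracketed(uri):
    """Return [base_uri, (filter, path), ...], innermost stage first.

    Hypothetical sketch: assumes brackets never occur literally in paths.
    """
    match = re.match(r"([A-Za-z][A-Za-z0-9+.-]*):([(\[{])", uri)
    if not match:
        return [uri]  # plain URI with no nesting: this is the data source
    name, opener = match.group(1), match.group(2)
    closer = _PAIRS[opener]
    start = pos = match.end()
    depth = 1
    while pos < len(uri) and depth:
        if uri[pos] == opener:
            depth += 1
        elif uri[pos] == closer:
            depth -= 1
        pos += 1
    if depth:
        raise ValueError("unbalanced brackets in " + uri)
    inner, path = uri[start:pos - 1], uri[pos:]
    return parse_bracketed(inner) + [(name, path)]
```

The result lists the data source first and the filters in the order they would be applied, which is the order a common base class would need in order to chain reads.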
Adding nestability to kioslaves is a wonderful idea. When extending the kioslave implementation like this, please consider arbitrary nesting depth. A common example where this is highly useful is nested archives, e.g. a text file in a tar.bz2 in a tar.gz in a zip file. Catting such a file is relatively easy (unzip -p file.zip file.tar.gz | tar xzO file.tar.bz2 | tar xjO file.txt). Trying to fit access to such a file in a URI is, as the above discussion illustrates, difficult, especially if it is to be done in a user-friendly manner. The advantages of nested kioslaves are, however, huge. A good example would be a file indexer that would automatically be able to read all files on disc, regardless of whether they are in some sort of archive. A program that would greatly benefit from this is Kat (http://kat.sourceforge.net/).
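For comparison with the shell pipeline, here is a minimal Python sketch of the same three-level read using only the standard library. The function name cat_nested is hypothetical. Each layer is unpacked completely into memory before the next one is opened, so it shows the "no random access" behaviour rather than avoiding it.

```python
import io
import tarfile
import zipfile


def cat_nested(zip_bytes, tgz_name, tbz_name, txt_name):
    """Read txt_name from a tar.bz2 inside a tar.gz inside a zip.

    Hypothetical sketch: every intermediate archive is held fully in
    memory, much like the temporary files a nested ioslave would need.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as outer:
        tgz_bytes = outer.read(tgz_name)
    with tarfile.open(fileobj=io.BytesIO(tgz_bytes), mode="r:gz") as middle:
        tbz_bytes = middle.extractfile(tbz_name).read()
    with tarfile.open(fileobj=io.BytesIO(tbz_bytes), mode="r:bz2") as inner:
        return inner.extractfile(txt_name).read()
```

An indexer could call something like this for every archive member it finds, at the cost of unpacking each level in full.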
> Catting such a file is relatively easy
> (unzip -p file.zip file.tar.gz | tar xzO file.tar.bz2 | tar xjO file.txt)

The URL form I proposed (KDE3):
multi:///?p1=file,/home/me/file.zip&p2=zip,/file.tar.gz&p3=gzip&p4=tar,/file.tar.bz2&p5=bzip2&p6=tar,/file.txt

The URI for KDE4 that I proposed:
multi:file:///home/me/file.zip,zip:/file.tar.gz,gzip,tar:/file.tar.bz2,bzip2,tar:/file.txt

The URI Andy proposed:
tar:{bzip2:{tar:{gzip:{zip:{file:///home/me/file.zip}/file.tar.gz}}/file.tar.bz2}}/file.txt
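The three spellings describe the same chain of stages, which a short Python sketch can make explicit. The helper render_all is hypothetical and exists only for comparison; the stage list assumes the file names used in the shell pipeline quoted above.

```python
def render_all(stages):
    """Render (protocol, path) stages in the three proposed spellings.

    Hypothetical helper: the first stage is the data source, later
    stages are filters with an optional path inside the filtered data.
    """
    (src_proto, src_path), filters = stages[0], stages[1:]

    # Query form: p1=protocol,path&p2=protocol,path&...
    params = [f"p{i}={proto},{path}" if path else f"p{i}={proto}"
              for i, (proto, path) in enumerate(stages, start=1)]
    query_form = "multi:///?" + "&".join(params)

    # Comma form: full source URL, then protocol:path for each filter.
    source = f"{src_proto}://{src_path}"
    parts = [source] + [f"{proto}:{path}" if path else proto
                        for proto, path in filters]
    comma_form = "multi:" + ",".join(parts)

    # Bracket form: each filter wraps everything rendered so far.
    bracket_form = source
    for proto, path in filters:
        bracket_form = f"{proto}:{{{bracket_form}}}{path}"

    return query_form, comma_form, bracket_form
```

Seen this way, the proposals differ only in surface syntax: the two multi: forms list the stages left to right, while the bracket form puts the last filter outermost.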
Just to check the proposals, I ran the URIs through QUrl and java.net.URI. Only numbers 1 and 2 are valid according to java.net.URI, and only numbers 2 and 3 are valid according to QUrl (v3.2).
Created attachment 10462: C++ code to check URIs
Created attachment 10463: Java code to check URIs
Looking at these examples, the bracketing scheme appears rather awkward to parse for a human, because it's difficult to match the curly braces without explicitly counting them. Of course in the usual cases with only one or two nesting levels this wouldn't be a problem, but still ...

The multi scheme deals better with deep nesting, but it has other problems. First, 'multi:' itself is an implementation detail that shouldn't be visible to the user, who is only interested in where the data is, not how to glue components for accessing it together. Second, it raises the question of how to deal with nested multis. Third, the syntax does not reflect the fact that data filters like 'tar' or 'gzip' are conceptually different from data sources like 'http' or 'file'. Therefore, it permits things like these:

multi:file:///home/me/file.zip,http://host
multi:gzip:,zip:/file

Because of the problems with both the bracketing and the multi schemes, I would like to throw the pipe syntax I had mentioned earlier back into the mix:

file:///home/me/file.zip|zip/file.tar.gz|gzip|tar/file.tar.bz2|bzip2|tar/file.txt

This copes well with deep nesting and clearly distinguishes between data sources and filters.
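A minimal Python sketch of how the pipe syntax separates the data source from its filters. The name parse_pipe_uri is hypothetical, and the sketch assumes that a literal '|' never occurs in the source URL or in any path, which would otherwise need an escaping rule.

```python
def parse_pipe_uri(text):
    """Split 'source|filter/path|filter|...' into (source, stages).

    Hypothetical sketch: assumes '|' is reserved as the separator and
    never appears literally in the source URL or in a path.
    """
    source, *filters = text.split("|")
    stages = []
    for item in filters:
        # 'tar/file.txt' -> ('tar', '/file.txt'); 'gzip' -> ('gzip', '')
        name, slash, inner = item.partition("/")
        stages.append((name, slash + inner))
    return source, stages
```

Only the first element is ever treated as a URL, so a second data source such as http://host cannot appear further down the chain by construction.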
> I ran the URIs through QUrl

Well, you can't do that. They are URIs, not URLs. Only the first one is supposed to be a URL and should validate against all. That indicates a problem in QUrl somewhere.

> First, 'multi:' itself is an implementation detail that shouldn't be visible
> to the user, who is only interested in where the data is, not how to glue
> components for accessing it together.

The idea is that Konqueror creates those automatically. If you click a zip file in file:///home/me, it'll automatically generate the URL multi:///?p1=file,/home/me/file.zip&p2=zip,/

The user will see a mess in the Location, but it'll work. Hence the idea of a cleaner URI for KDE4.

> Therefore, it permits things like these:
>
> multi:file:///home/me/file.zip,http://host
> multi:gzip:,zip:/file

We would enforce that only the first protocol is allowed a username-password-hostname-port part. That is, everything after the first one has just the first slash.

As for the second option you gave, that would be an error because no file was given.

> I would like to throw the pipe syntax I had mentioned earlier back into the
> mix:
>
> file:///home/me/file.zip|zip/file.tar.gz|gzip|tar/file.tar.bz2|bzip2|tar/file.txt

The big problem here is that that's a URL and we can't change its meaning. That URL means file "file.zip|zip/file.tar.gz|gzip|tar/file.tar.bz2|bzip2|tar/file.txt" in /home/me on your local filesystem. That can't be changed.

We could add in a multi: prefix, which would allow us to change the meaning of the special characters. But in that case, we would stumble upon what I had proposed as URI, only with pipes instead of commas.
On 4/1/2005, "Thiago Macieira" <thiago@kde.org> wrote:
>------- Additional Comments From thiago kde org 2005-04-01 13:01 -------
>> I ran the URIs through QUrl
>
>Well, you can't do that. They are URIs, not URLs. Only the first one is supposed to be a URL and should validate against all. That indicates a problem in QUrl somewhere.

You're right, in Qt4 you can, but in Qt3 you can't. In Qt4 QUrl is really a URI!
http://doc.trolltech.com/4.0/qurl.html#isValid
When using Qt4, all 3 URIs are seen as valid. It is beta software though.

A nice example of nested URIs is the Active URI specification:
http://www.1060research-server-1.co.uk/docs/2.0.2/book/advdev/doc_guide_compoundURI.html
http://www.1060research-server-1.co.uk/docs/2.0.2/book/introduction/doc_intro_concepts_requests.html

The goal there is not quite the same, because the URIs are named and typed, which isn't necessary for the kioslaves. The choice of separators there is '@' and '+'. Important remark: "In order to be valid a URI a compound URI must be carefully escaped."

This format is similar to
multi:file:///home/me/file.zip,zip:/file.tar.gz,gzip,tar:/file.tar.bz2,bzip2,tar:/file.txt
which, more generally formulated, is
multi:{escaped_uri},{escaped_uri},...

I think this is the most elegant solution. One could argue about what delimiter to use. I'm in favor of the pipe.

Would it be possible to make a piping kioslave for KDE3? Instead of starting from file:/home/user, you could then start from 'multi:file:/home/user'. In KDE4 one could then extend the behavior such that the linked URI can be a multi: automatically, when needed.
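To illustrate the escaping remark, here is a minimal Python sketch of joining and splitting a compound URI so that a comma inside a component cannot be mistaken for the delimiter. The names join_multi and split_multi are hypothetical, and the choice of which characters to leave unescaped (':', '/' and '@') is an assumption made for readability, not part of any proposal in this report.

```python
from urllib.parse import quote, unquote


def join_multi(uris):
    """Build 'multi:' + escaped components separated by commas.

    Hypothetical sketch: ':', '/' and '@' stay readable; the delimiter
    ',' and the escape character '%' are always percent-encoded.
    """
    return "multi:" + ",".join(quote(uri, safe=":/@") for uri in uris)


def split_multi(compound):
    """Inverse of join_multi: return the original component URIs."""
    prefix, _, body = compound.partition(":")
    if prefix != "multi":
        raise ValueError("not a multi: URI")
    return [unquote(part) for part in body.split(",")]
```

The same two functions would work with '|' as the delimiter by changing the join and split character, since quote() percent-encodes '|' as well.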
*** This bug has been marked as a duplicate of 73821 ***