Bug 238534 - man page HTML invalid
Summary: man page HTML invalid
Status: RESOLVED FIXED
Alias: None
Product: frameworks-kio
Classification: Frameworks and Libraries
Component: general
Version: 5.47.0
Platform: unspecified Linux
Importance: NOR normal
Target Milestone: ---
Assignee: David Faure
Reported: 2010-05-22 23:16 UTC by Christopher Yeleighton
Modified: 2018-08-25 18:55 UTC

Description Christopher Yeleighton 2010-05-22 23:16:52 UTC
Version:           4.3.5 (KDE 4.3.5) "release 0" (using 4.3.5 (KDE 4.3.5) "release 0", openSUSE 11.2)
Compiler:          gcc
OS:                Linux (x86_64) release 2.6.31.12-0.2-desktop

== Steps to reproduce ==

1. Open page man:ftp in Konqueror.
2. View source.
3. Upload the source to the HTML validator service [1].

== Actual results ==
3. 35 Errors, 2 warning(s)

== Expected results ==
3. Valid HTML

== References ==
[1] <URL:http://validator.w3.org/check>
Comment 1 Christopher Yeleighton 2010-05-22 23:24:09 UTC
The man page is generated by kio_man, part of kdebase4-runtime.  It should have a GENERATOR META tag but it has none.
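A minimal sketch of how one might check a saved page source for that tag, assuming Python's standard `html.parser` module. The helper name `find_generators` is made up for illustration and is not part of kio_man or of any validator.

```python
from html.parser import HTMLParser


class GeneratorFinder(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names before calling us,
        # so <META NAME="GENERATOR"> and <meta name="generator"> look alike.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "generator":
            self.generators.append(attr_map.get("content") or "")


def find_generators(html_text):
    """Return the list of generator strings declared in html_text."""
    finder = GeneratorFinder()
    finder.feed(html_text)
    finder.close()
    return finder.generators
```

An empty list from `find_generators(page_source)` would mean the page declares no generator, which is the situation described above.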
Comment 2 Christopher Yeleighton 2010-05-23 00:47:09 UTC
The kio_man ioslave actually uses code from man2html [1], which is a horrible piece of ad-hoc code that tries to replace groff while not being one.
The output of `man -Thtml` is actually much better, and it is a standard GNU tool.  kio_man should rather use just that.

== References ==
[1] See for yourself: <URL:http://websvn.kde.org/trunk/KDE/kdebase/runtime/kioslave/man/man2html.cpp?view=markup>.  And watch your dreams so that it does not come to you at night :-)
Comment 3 Christopher Yeleighton 2010-05-23 14:07:27 UTC
Workaround: use man '-Hdesktop-launch'.  (I think there should be a shorter name for desktop-launch, but that is another story.)
Comment 4 Christopher Yeleighton 2010-05-25 21:35:51 UTC
(In reply to comment #2)
> The kio_man ioslave actually uses code from man2html [1], which is a horrible
> piece of ad-hoc code that tries to replace groff while not being one.
> The output of `man -Thtml` is actually much better, and it is a standard GNU
> tool.  kio_man should rather use just that.
> 


I have to take that back since it turns out that groff output in HTML mode is not semantic; it uses HTML tags to place characters on the page here and there according to its internal measurements driven by font metric information.  However, there is no way to actually know the font metric information for the displaying browser, and these assumptions are likely to fail.
High-level information contained in document structuring macros is converted to low-level information on font sizes and character positions.  It is not possible to generate decent HTML from this intermediate result, and the high-level information has already been lost.  So I guess man2html must stay and it must be fixed.
Comment 5 Christopher Yeleighton 2010-11-09 21:13:59 UTC
(In reply to comment #4)
> high-level information has already been lost.  So I guess man2html must stay
> and it must be fixed.

OTOH, many packages come with Texinfo sources that produce quite decent documentation in HTML, and the help system should use these when available.  The same goes for kio_info.

So the strategy would be: if HTML documentation corresponding to the manual page is known to exist, use that instead.  Otherwise, it is possible to use a modified groff macro package to emit structural information as processing instructions, and pass them through SAX to generate HTML on the fly.

(Note that running groff in HTML mode will not work either because of known bugs; these bugs cannot be fixed because the standard HTML generator works on unmodified ditroff microcode, and it is not possible to recover structural information from that.)
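To illustrate the processing-instruction idea, here is a rough sketch using Python's standard `xml.sax` module. Everything about the input format is an assumption made for the example: the `<page>` wrapper element, the `man` instruction target, and the convention that the instruction data is a macro name followed by its argument. No such groff macro package exists in this report; only the `SH` and `PP` macro names are borrowed from the man macros.

```python
import xml.sax
from xml.sax.saxutils import escape


class ManPIHandler(xml.sax.ContentHandler):
    """Turns hypothetical <?man MACRO args?> instructions into HTML."""

    def __init__(self):
        super().__init__()
        self.out = []    # finished HTML fragments
        self.text = []   # character data collected since the last macro

    def _flush_text(self):
        # Join the buffered chunks, collapse whitespace, emit one paragraph.
        text = " ".join("".join(self.text).split())
        self.text = []
        if text:
            self.out.append("<p>%s</p>" % escape(text))

    def processingInstruction(self, target, data):
        if target != "man":
            return  # ignore instructions that are not ours
        self._flush_text()
        macro, _, argument = data.partition(" ")
        if macro == "SH":
            # Section heading: the structure survives as a real heading tag.
            self.out.append("<h2>%s</h2>" % escape(argument))
        # "PP" only ends the current paragraph, which the flush already did.

    def characters(self, content):
        # The parser may deliver text in several chunks, so just buffer it.
        self.text.append(content)

    def endDocument(self):
        self._flush_text()


def convert(stream_text):
    """Convert the hypothetical annotated stream to an HTML fragment."""
    handler = ManPIHandler()
    xml.sax.parseString(stream_text.encode("utf-8"), handler)
    return "".join(handler.out)
```

The point of the sketch is that headings and paragraph breaks reach the HTML writer as structure rather than as font sizes and positions, so the writer never has to guess them back.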
Comment 6 Martin Koller 2011-01-07 18:50:30 UTC
I've now fixed a lot of bugs, but there's still some work to do ...
especially with the <div> tags
Comment 7 Martin Koller 2011-01-07 19:14:11 UTC
SVN commit 1212624 by mkoller:

CCBUG: 238534
CCBUG: 105765
BUG: 106067
BUG: 195241
BUG: 247012

Fix a lot of HTML-generation bugs by implementing some additional macros,
escape sequences, etc, etc.
Also start to make the code more readable and introduce more Qt/C++
while eliminating old C code, which is horrible to maintain.
The request keywords are now looked up by a gperf hash function.
The generated gperf file is committed, so that no one has to rely on the
existence of gperf on the build machine.

If I broke the rendering of some man pages, please inform me.


 M  +1 -1      CMakeLists.txt  
 M  +586 -614  man2html.cpp  
 A             request_gperf.c   [License: GENERATED FILE]
 A             request_hash.cpp   [License: UNKNOWN]
 A             request_hash.h   [License: UNKNOWN]
 A             requests.gperf  
 M  +4 -1      tests/CMakeLists.txt  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1212624
Comment 8 Martin Koller 2011-01-07 22:05:37 UTC
SVN commit 1212666 by mkoller:

Backport from trunk:

CCBUG: 238534
CCBUG: 105765
CCBUG: 106067
CCBUG: 195241
CCBUG: 247012

Fix a lot of HTML-generation bugs by implementing some additional macros,
escape sequences, etc, etc.
Also start to make the code more readable and introduce more Qt/C++
while eliminating old C code, which is horrible to maintain.
The request keywords are now looked up by a gperf hash function.
The generated gperf file is committed, so that no one has to rely on the
existence of gperf on the build machine.

If I broke the rendering of some man pages, please inform me.



 M  +1 -1      CMakeLists.txt  
 M  +586 -614  man2html.cpp  
 A             request_gperf.c   [License: GENERATED FILE]
 A             request_hash.cpp   [License: LGPL (v2+)]
 A             request_hash.h   [License: LGPL (v2+)]
 A             requests.gperf  
 M  +4 -1      tests/CMakeLists.txt  


WebSVN link: http://websvn.kde.org/?view=rev&revision=1212666
Comment 9 Christopher Yeleighton 2011-01-25 19:02:22 UTC
kio/man has the following to say about sshd:

sshd -words [-46DdeiqTt ] [-b -file ... -bits ] [-C -file ... -connection_spec ] [-c -file ... -host_certificate_file ] [-f -file ... -config_file ] [-g -file ... -login_grace_time ] [-h -file ... -host_key_file ] [-k -file ... -key_gen_time ] [-o -file ... -option ] [-p -file ... -port ] [-u -file ... -len ]

Workaround: xditview "|man -TX100 -Z sshd"
Comment 10 Martin Koller 2011-01-25 19:49:40 UTC
I assume you did not test with the current 4.6 RC2?
Because with that version, it looks like this:

sshd [-46DdeiqTt ] [-b bits ] [-C connection_spec ] [-c host_certificate_file ] [-f config_file ] [-g login_grace_time ] [-h host_key_file ] [-k key_gen_time ] [-o option ] [-p port ] [-u len ]

and this is also what I get in a terminal.
Comment 11 Julian Steinmann 2018-06-24 13:32:55 UTC
The situation has improved a lot since this bug was opened, but I still get two errors when validating man:ftp. I'll reassign this to frameworks-kio, where it belongs.
Comment 12 Martin Koller 2018-08-25 18:55:25 UTC
I checked today with man:ftp (after forward-porting my last 6 commits from 2015 in the old kde-runtime repo) and uploaded the result to https://validator.w3.org/check

I got no errors.
If you still get the 2 errors mentioned, please tell us which ones and reopen the bug.