(*** This bug was imported into bugs.kde.org ***) Package: khtml Version: 4.0 (using KDE 3.0.3 ) Severity: normal Installed from: Mandrake Linux Cooker i586 - Cooker Compiler: gcc version 3.2 (Mandrake Linux 9.0 3.2-1mdk) OS: Linux (i686) release 2.4.19-5mdk OS/Compiler notes: Many of the HTML 4 SGML Entities don't display at all. See my page at http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html (Submitted via bugs.kde.org) (Called from KBugReport dialog)
This depends on the fonts used in the document. If I don't specify them (my browser is set to use a serif font by default), the arrow characters ← ↑ → ↓ ↔ ↵ ⇐ ⇒ ⇓ ⇔ are not displayed. If I specify the style body { font-family: times, serif; } then the above arrow symbols are displayed. This also fixes many other SGML entities, but not all: ◊ ♠ ♣ ♥ ♦ ⌈ ⌉ ⌊ ⌋ ⟨ ⟩ are still not displayed. See also my page <http://www.zipcon.net/~swhite/docs/math/math.html>
Created attachment 396 [details] Table of all HTML 4 SGML Character Entities
If they are displayed as the entity, it's a bug. If they are displayed as a block (to show "font has nothing for this"), it isn't. As I don't see any entities, this bug is WFM.
I disagree. The standard HTML 4 SGML Entities should be displayed, regardless of fonts. If the current font doesn't contain a glyph to represent the entity, the browser should try to find one that does. Failing that, the browser should find some other way to display the entity (either construct a nice representation, or fall back to displaying the entity name.) You will find that various browsers take different strategies on this point. The best browsers bend over backward to display the entities as the standard specifies that they're to be displayed.
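The strategy described in the comment above can be sketched roughly as follows. This is only an illustration, not how khtml or Qt actually selects fonts: the function name `render_char` and the `fonts` coverage table are hypothetical, and a real renderer would query the font system instead of a dictionary.

```python
def render_char(ch, entity_name, preferred_font, fonts):
    """Pick a font for ch, or fall back to the entity name.

    fonts is a hypothetical mapping: font name -> set of characters it covers.
    Returns a (font name, text to draw) pair.
    """
    # 1. Use the requested font if it has a glyph for the character.
    if ch in fonts.get(preferred_font, set()):
        return preferred_font, ch
    # 2. Otherwise try every other known font, in table order.
    for name, coverage in fonts.items():
        if ch in coverage:
            return name, ch
    # 3. Last resort: show the entity name instead of an empty box.
    return preferred_font, f"&{entity_name};"


# Hypothetical coverage data, for illustration only.
fonts = {"sans": {"a"}, "serif": {"a", "\u2192"}}
print(render_char("\u2192", "rarr", "sans", fonts))    # glyph borrowed from "serif"
print(render_char("\u2660", "spades", "sans", fonts))  # no font covers it
```

With this made-up table, the right arrow is drawn from the serif font even though sans was requested, and the spade falls back to the text &spades;.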
*** Bug 39618 has been marked as a duplicate of this bug. ***
*** Bug 33332 has been marked as a duplicate of this bug. ***
*** Bug 32095 has been marked as a duplicate of this bug. ***
More information: First, here's a table with all the entities in one page: http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html Second, I've run numerous experiments using that page, with Konqueror, Opera, Netscape, and Explorer on Linux and Windows. I altered the charset in the meta tag, and the font-family specified in CSS. My reading of the HTML standard is that these entities should be displayed intelligibly regardless of the specified font-family or encoding. Konqueror is independent of encoding, but dependent on font (it displays perfectly if the font-family is "arial unicode ms"). Opera 7.11 on Linux displays the entities correctly, independently of both font and encoding. On Windows, it is missing a few characters. NS 7.2 on Linux is independent of encoding, but depends on fonts (it displays correctly using font "times", but many entities are messed up if I use "arial unicode ms"; note this is the *opposite* of Konqueror). MSIE 6.0.2800.1106 is bad in both ways: a change of font or a change in page encoding will cause most of the SGML Character Entities to be displayed incorrectly.
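For anyone who wants to repeat this kind of experiment, a test page can be generated with a short script. The sketch below is not the script behind the page linked above. It uses Python's standard `html.entities.name2codepoint` table of HTML 4 entity names, and the function name `entity_test_page` and the page layout are made up for illustration.

```python
from html.entities import name2codepoint  # HTML 4 entity names -> code points


def entity_test_page(font_family="serif", charset="utf-8"):
    """Return an HTML page with one table row per HTML 4 named entity.

    Each row shows the entity name as text, the named reference, and the
    equivalent numeric reference, so a missing glyph is easy to spot.
    """
    rows = "\n".join(
        f"<tr><td>&amp;{name};</td><td>&{name};</td><td>&#{code};</td></tr>"
        for name, code in sorted(name2codepoint.items())
    )
    return (
        "<html>\n<head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}">\n'
        f"<style>body {{ font-family: {font_family}; }}</style>\n"
        "</head>\n<body>\n<table>\n"
        f"{rows}\n"
        "</table>\n</body>\n</html>\n"
    )


# Write one variant to disk; vary font_family and charset to compare browsers.
with open("entities_test.html", "w", encoding="ascii") as f:
    f.write(entity_test_page(font_family="times, serif"))
```

The generated file contains only ASCII, so the charset declared in the meta tag can be varied without re-encoding the file.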
that bug is in qt somehow.
*** Bug 66532 has been marked as a duplicate of this bug. ***
*** Bug 63751 has been marked as a duplicate of this bug. ***
*** Bug 70167 has been marked as a duplicate of this bug. ***
The page http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html works for me! The font has not all characters but then the box is displayed. (So I suppose that Qt has been fixed in the meantime.) Have a nice day!
Dear Nicolas Goutte, you obviously missed the point:
> My reading of the HTML standard is that these entities should be displayed
> intelligibly regardless of the specified font-family or encoding.
This bug makes Konqueror useless for scientific text, which is a pity IMHO ;-(
Sorry, re-opening!
Subject: Re: HTML 4 SGML Entities don't all work
> http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
> works for me! The font has not all characters but then the box is
> displayed. (So I suppose that Qt has been fixed in the meantime.)
The displaying of the box _is_ the bug.
Greetings, Stephan
Created attachment 4185 [details] Mozilla vs. Konqueror CVS showing "rarr; / larr;" arrows I can confirm this bug in CVS from 2-jan-2004. (see screenshot)
Still not there with KDE3.2RC1
*** Bug 75569 has been marked as a duplicate of this bug. ***
This bug is a major show stopper (for me). As it stands, the Wikipedia math pages cannot be read with Konqueror. We have long given up on IE, usually recommend that people use Mozilla, but Konqueror would be a good alternative!
*** Bug 85523 has been marked as a duplicate of this bug. ***
*** Bug 85709 has been marked as a duplicate of this bug. ***
*** Bug 86822 has been marked as a duplicate of this bug. ***
*** Bug 91495 has been marked as a duplicate of this bug. ***
*** Bug 77348 has been marked as a duplicate of this bug. ***
Marking as "should be fixed by Qt"
*** Bug 44290 has been marked as a duplicate of this bug. ***
Allan, Have the Qt folks been notified that they are going to fix this bug? (Or are you saying that some recent improvement in Qt should already have fixed it?)
Yes, they know. They even fixed it once in Qt 3.1.0 I think, but reverted it because it created new problems. I can only hope they have fixed it for real in Qt 4.
*** Bug 98394 has been marked as a duplicate of this bug. ***
*** Bug 101659 has been marked as a duplicate of this bug. ***
Allan, can you provide me with a report number for this Qt issue? If it's their problem, I would like to pester them.
*** Bug 108820 has been marked as a duplicate of this bug. ***
The HTML character entity rarr displays as a box for me if the font is sans-serif, and displays correctly if it is serif. I also see frequent boxes in text webpages in places where I would expect dashes or quotes. Konqueror should either display the literal '&' 'rarr' ';' or find a substitute font in which the character is provided. My test case:
<html>
<head> <title>right arrow test</title> </head>
<body>
Here it is ... &rarr;<br>
<font size="+1">1&rarr;</font> <font size="+2">2&rarr;</font> <font size="+3">3&rarr;</font> <font size="+4">4&rarr;</font> <font size="+5">5&rarr;</font><br>
<font face="serif">serif&rarr;</font><br>
<font face="sans">sans&rarr;</font>
</body>
</html>
*** Bug 116356 has been marked as a duplicate of this bug. ***
*** Bug 75314 has been marked as a duplicate of this bug. ***
I see only boxes (8 of them) for #34's test case. Increasing the font size manually doesn't change it. Using Konqueror 3.5.1.
Also, for some reason &notin; is displayed as ¬in; I've opened Bug 122047: &notin; is displayed as ¬in;
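One plausible explanation for the ¬in; output, offered here only as a guess and not confirmed anywhere in this report, is that the parser accepts the legacy entity &not without a semicolon and stops at the first name that matches, instead of taking the longest one. The sketch below imitates that with a two-entry table. The name `shortest_prefix_decode` and the table are hypothetical. Python's standard `html.unescape` is shown alongside as a decoder that resolves the full name.

```python
import html

# Hypothetical two-entry table: "not" is also accepted without a semicolon.
TABLE = {"not": "\u00ac", "notin": "\u2209"}


def shortest_prefix_decode(ref, table):
    """Decode ref (the text after '&') using the FIRST, shortest matching name.

    Whatever follows the matched name is copied through unchanged, which is
    how a stray 'in;' can end up after the decoded character.
    """
    for end in range(1, len(ref) + 1):
        name = ref[:end]
        if name in table:
            return table[name] + ref[end:]
    return "&" + ref  # nothing matched: leave the reference as typed


print(shortest_prefix_decode("notin;", TABLE))  # stops at "not"
print(html.unescape("&notin;"))                 # resolves the full name
```

The first call stops at "not" and leaves "in;" behind, which matches the symptom described above. The second returns the single character U+2209.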
Marijn, the &notin; entity is curious, but the display is intelligible, which is all the standard requires. But sure, in light of everything else, call it a bug. At this time, Konqueror is still failing to display, or displaying unintelligibly, about 20% of all the HTML 4 character entities. This page still works: http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
Still not solved in 3.5.1. Isn't this bug more than 3 years old? Wikipedia is sometimes unreadable (math pages) and cannot be used correctly with Konqueror. What about Qt solving it?
*** Bug 124603 has been marked as a duplicate of this bug. ***
*** Bug 124689 has been marked as a duplicate of this bug. ***
Qt 4 solves it.
I want to confirm that it works beautifully with KDE 4. Thanks, Thiago & others!
I have nothing to do with this :-)
I just installed KDE 5.3.2, with Konqueror of the same version, as distributed in Kubuntu Linux 6.6 Things are vastly improved. There remain just a few glitches. The following entities are not correctly displayed: ∉ <- this one is rather amusing. ∨ ⌈ ⌉ ⌊ ⌋ ⟨ ⟩ Let's go for perfection! Once again, have a look at my table at http://www.zipcon.net/~swhite/docs/computers/browsers/entities_page.html
In Konqueror, at any rate, the display of SGML entities is very font-dependent. This is not a good sign: according to the standard, SGML characters must be intelligibly displayed, regardless of such considerations. My table is missing only the above-mentioned entities if I set the font (in CSS) to the distro's "sans-serif" font. But in the serif font, things are much worse. Considering this, I now have to say things are not much improved since I first reported the problem. Notice that this does not happen in other browsers: Firefox in particular finds a font in which the required glyph is available, and uses that glyph. I'll work on my page so that you can easily switch the font it uses.
Apologies: I read 5.3.2, when it is 3.5.2. This is of course a lower number than the one everybody says fixes the trouble. So, to redeem myself now I have to go on and install KDE 4, I guess.
Sorry, it is CANTFIX/WONTFIX for 3.5 and fixed in 4.0.
*** Bug 130957 has been marked as a duplicate of this bug. ***
*** Bug 123133 has been marked as a duplicate of this bug. ***
*** Bug 138496 has been marked as a duplicate of this bug. ***
*** Bug 130171 has been marked as a duplicate of this bug. ***
*** Bug 146922 has been marked as a duplicate of this bug. ***
*** Bug 150228 has been marked as a duplicate of this bug. ***
*** Bug 151673 has been marked as a duplicate of this bug. ***
*** Bug 152462 has been marked as a duplicate of this bug. ***
*** Bug 166604 has been marked as a duplicate of this bug. ***
Hi again! Over six years after I reported this ugly bug, and two years after it was proclaimed fixed in development versions not easily available to the public or to me (and the bug accordingly closed), I can finally, gratefully report that in Ubuntu 8.05, with KDE/Konqueror 4.0.3, all the HTML 4 entities are displayed correctly, under some conditions. So I will finally agree that this problem is fixed. Whoo Hoo! Just in case somebody in the inner circle is listening: (not to dampen the festive mood, but) this is an AWFUL development cycle. Look at it as a challenge to find a way to streamline the development system so that the product does not continue to look stupid for the better part of a decade after a problem has been reported. (I mean the development environments of both KDE and Qt.)
&shy; is still not correctly handled (in 4.2.2): if no break is needed, both parts of the word separated by &shy; are displayed joined together, without a hyphen (as they should be). However, if a break is needed, they are displayed on two lines (as expected), but no hyphen is displayed after the first part! All other characters from http://www.zipcon.net/~swhite/docs/computers/browsers/entities.html are handled correctly. Test case: <html> <body> long long long sentence longlonglonglonglong&shy;wordwordword test¡ </body> </html> Then make the window narrow enough that the long word is broken in two: the hyphen is missing!
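The expected soft-hyphen behaviour described above can be sketched as a small line breaker. It is a simplified illustration that counts characters instead of measuring glyph widths, and the function name `break_word` is hypothetical. It is not Konqueror's layout code.

```python
SHY = "\u00ad"  # the character that the soft hyphen entity stands for


def break_word(word, width):
    """Split a word at soft hyphens so that lines fit in width characters.

    A soft hyphen where no break is taken is dropped silently; where a break
    is taken, a visible "-" ends the line. A single part longer than width is
    not split any further.
    """
    parts = word.split(SHY)
    lines, current = [], parts[0]
    for i, part in enumerate(parts[1:], start=1):
        is_last = i == len(parts) - 1
        reserve = 0 if is_last else 1  # keep room for a hyphen at a later break
        if len(current) + len(part) + reserve <= width:
            current += part                # no break needed: nothing is shown
        else:
            lines.append(current + "-")    # break taken: show the hyphen
            current = part
    lines.append(current)
    return lines


word = "longlonglonglonglong" + SHY + "wordwordword"
print(break_word(word, 40))  # fits on one line
print(break_word(word, 25))  # must break at the soft hyphen
```

In the wide case the two halves are joined with nothing between them. In the narrow case the first line ends with a hyphen, which is the part reported as missing.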
Reopening this due to the issue with &shy;. All other entities seem to be handled all right.
&shy; is fixed in either 4.2.3 or 4.2.4. And please don't touch a bug report with 20 or so people CC'd for an unrelated issue!