Bug 123133

Summary: Some unicode chars shown incorrectly
Product: [Applications] konqueror
Reporter: Juuso Alasuutari <juuso.alasuutari>
Component: general
Assignee: Konqueror Developers <konq-bugs>
Status: RESOLVED DUPLICATE
Severity: normal
CC: kdebugs
Priority: NOR
Version: unspecified
Target Milestone: ---
Platform: Compiled Sources
OS: Linux

Description Juuso Alasuutari 2006-03-05 20:05:11 UTC
Version:            (using KDE KDE 3.5.1)
Installed from:    Compiled From Sources
Compiler:          gcc 3.4.5 
OS:                Linux

Some Unicode characters are not shown properly in Konqueror. These include (among others) the trademark character (&#8482), the euro character (&euro), and the three-dots (ellipsis) character (&#8230).
The trademark char is shown as a square, euro as two horizontal lines (like a square missing its vertical lines), and three dots as "ΓΏ".
Other programs, like Kate and Konsole, show most chars properly. Of the three given examples, only the trademark char is still a square when copy&pasted to Kate.
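
For reference, a minimal Python sketch that resolves the three entities named above to their Unicode code points. It is only an illustration, not part of the original report: it assumes the standard-library `html` and `unicodedata` modules, writes the entities with their terminating semicolons, and `describe` is a hypothetical helper name.

```python
import html
import unicodedata

# Entities named in the report, written here with terminating semicolons.
ENTITIES = ["&#8482;", "&euro;", "&#8230;"]

def describe(entity):
    """Return (character, code point string, Unicode name) for an HTML entity."""
    char = html.unescape(entity)
    return char, "U+%04X" % ord(char), unicodedata.name(char)

for entity in ENTITIES:
    char, codepoint, name = describe(entity)
    # One line per entity: the entity, its code point, and its Unicode name.
    print(entity, codepoint, name)
```

All three code points lie outside ISO 8859-1, so a font or fallback font has to supply the corresponding glyphs.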

For font I use Bitstream Vera Sans both in Konqueror and system-wide. Encoding in Konqueror is set to Semi-Automatic. I have also tried at least utf8, iso 8859-1, and iso 8859-15, but that made no difference.
In Control Center, country is set to Finland and language to US English. Keyboard layout is Finland (fi) with variant "basic", keyboard model is generic 105-key.

My locale variables look like this:
LANG=fi_FI.utf8
LC_TIME=en_US.utf8
LC_MESSAGES=en_US.utf8
...and those locales do also exist.

To set up i18n in textmode (outside of X) I use kbd, and load settings with the following commands:
  loadkeymap fi-latin9
  setfont lat9u-16 -m 8859-15
  unicode_start

I use Source Mage GNU/Linux which is a source based distro. Compilation of kdebase is done normally, with these configure flags being used:
--prefix=/usr --sysconfdir=/etc --localstatedir=/var --mandir=/usr/share/man --infodir=/usr/share/info --disable-debug --disable-final --enable-dnotify --enable-new-ldflags --with-distribution --disable-dependency-tracking --with-arts --with-gl --with-shadow --with-pam=no --without-ldap --with-ssl --with-hal --with-kdm-xconsole --with-dpms --without-xdmcp --build=i686-pc-linux-gnu
Comment 1 Thiago Macieira 2006-03-06 16:32:06 UTC
Works fine here, using Bitstream Vera Sans.
Comment 2 Juuso Alasuutari 2006-04-04 23:38:54 UTC
This is a strange prob. It still persists, but there was a period of time after I filed this bug during which it didn't appear.

I've compiled the KDE packages quite a few times recently, and after one time Konqueror displayed most unicode characters correctly. Then, as magically as it had been cured, the "bug" again appeared after the next rebuild. (I'm a part-time developer of a source-based distro, so I compile a lot.)

Most certainly this has to be related to my build environment. But that also means that a solution to this, if one is found, should possibly be taken into account and/or documented by the KDE team.

I'm trying to figure out what I need to change. Could system locale/font/unicode settings affect compilation? I have unicode support enabled in tty, where I also do KDE package building.
Comment 3 Timothy Stotts 2006-09-15 23:40:30 UTC
I have the same issue, regardless of default KDE or Konq. font.

LC_ALL=en_US.utf8.  Gentoo Linux.  KDE and Konq. 3.5.2 .
Default font is Times. Same default font renders correctly in all Gecko/GTK browsers.

A few examples are HTML entities, such as mdash and ndash.  They render correctly in Akregator's KHTML display, but not in Konqueror's.
Comment 4 Juuso Alasuutari 2006-09-16 14:49:06 UTC
What compiler optimizations do you use? Maybe us build-it-yourself'ers with our custom CFLAGS are to blame here. :)
Comment 5 Timothy Stotts 2006-09-16 18:28:04 UTC
# This bug is consistent over 2 of my machines, of two different architectures.  The only configure similarity that strikes me is: (1) debugging off, (2) --enable-final


# MACHINE #1:  Apple PowerBook G4
CFLAGS="-O2 -pipe -mcpu=7450 -mtune=7450 -maltivec -mabi=altivec"
CXXFLAGS="${CFLAGS}"
LC_ALL=en_US.utf8
gcc-4.1.1

# kdelibs, The following are regenerated with automake 1.9.6, etc.
Makefile.am
configure.files
configure.in
aclocal.m4
configure
config.h
Makefile ...

./configure --prefix=/usr --host=powerpc-unknown-linux-gnu
--mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share
--sysconfdir=/etc --localstatedir=/var/lib --with-distribution=Gentoo
--enable-libfam --enable-dnotify --with-libart --with-libidn --with-utempter
--without-acl --with-ssl --with-alsa --with-arts --without-gssapi
--with-tiff --with-jasper --with-openexr --enable-cups --enable-dnssd
--without-hspell --with-aspell --with-rgbfile=/usr/share/X11/rgb.txt
--disable-fast-malloc --with-x --enable-mitshm --with-xinerama
--with-qt-dir=/usr/qt/3 --enable-mt --with-qt-libraries=/usr/qt/3/lib
--disable-dependency-tracking --disable-debug --without-debug
--enable-final --with-arts --prefix=/usr/kde/3.5
--mandir=/usr/kde/3.5/share/man --infodir=/usr/kde/3.5/share/info
--datadir=/usr/kde/3.5/share --sysconfdir=/usr/kde/3.5/etc
--build=powerpc-unknown-linux-gnu

# Minor patch is applied to support xorg 7.1 (Modular X) headers
# Minor patch is applied to modify support for a few Kate languages (Ada, etc.)

kdebase is compiled in a modular/item-by-item fashion
Example, when building only libkonq:

./configure --prefix=/usr --host=powerpc-unknown-linux-gnu
--mandir=/usr/share/man --infodir=/usr/share/info --datadir=/usr/share
--sysconfdir=/etc --localstatedir=/var/lib --without-java --with-x
--enable-mitshm --with-xinerama --with-qt-dir=/usr/qt/3 --enable-mt
--with-qt-libraries=/usr/qt/3/lib --disable-dependency-tracking
--disable-debug --without-debug --enable-final --with-arts
--prefix=/usr/kde/3.5 --mandir=/usr/kde/3.5/share/man
--infodir=/usr/kde/3.5/share/info --datadir=/usr/kde/3.5/share
--sysconfdir=/usr/kde/3.5/etc --build=powerpc-unknown-linux-gnu


# MACHINE #2: Dell Pentium 4
CFLAGS="-O2 -pipe -march=pentium4 -fomit-frame-pointer"
CXXFLAGS="${CFLAGS}"
LC_ALL=POSIX
gcc-4.1.1

# kdelibs similar compile-fashion as MACHINE #1, but different with/without
./configure --prefix=/usr --host=i686-pc-linux-gnu --mandir=/usr/share/man
--infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
--localstatedir=/var/lib --with-distribution=Gentoo --enable-libfam
--enable-dnotify --with-libart --with-libidn --with-utempter --with-acl
--with-ssl --without-alsa --with-arts --without-gssapi --with-tiff
--with-jasper --without-openexr --enable-cups --disable-dnssd
--without-hspell --with-aspell --with-rgbfile=/usr/share/X11/rgb.txt
--disable-fast-malloc --with-x --enable-mitshm --with-xinerama
--with-qt-dir=/usr/qt/3 --enable-mt --with-qt-libraries=/usr/qt/3/lib
--disable-debug --without-debug --enable-final --with-arts
--prefix=/usr/kde/3.5 --mandir=/usr/kde/3.5/share/man
--infodir=/usr/kde/3.5/share/info --datadir=/usr/kde/3.5/share
--sysconfdir=/usr/kde/3.5/etc --build=i686-pc-linux-gnu

#libkonq in similar fashion to MACHINE #1, but different with/without
./configure --prefix=/usr --host=i686-pc-linux-gnu --mandir=/usr/share/man
--infodir=/usr/share/info --datadir=/usr/share --sysconfdir=/etc
--localstatedir=/var/lib --without-java --with-x --enable-mitshm
--with-xinerama --with-qt-dir=/usr/qt/3 --enable-mt
--with-qt-libraries=/usr/qt/3/lib --disable-debug --without-debug
--enable-final --with-arts --prefix=/usr/kde/3.5
--mandir=/usr/kde/3.5/share/man --infodir=/usr/kde/3.5/share/info
--datadir=/usr/kde/3.5/share --sysconfdir=/usr/kde/3.5/etc
--build=i686-pc-linux-gnu
Comment 6 Timothy Stotts 2006-09-16 18:30:24 UTC
Could this be an issue with QT (qt-3.3.6) rather than KDE?
Comment 7 Timothy Stotts 2006-09-16 19:27:55 UTC
A few other notes:
  - This issue persists across users, including starting with an empty home folder.

Comment 8 Timothy Stotts 2006-09-16 19:44:14 UTC
Wow.  This is strange.  The trademark and copyright characters always display fine, but not mdash, ndash, curly quotes, etc., with Times, Arial, the two most common Internet fonts.  But the following page displays everything fine:
    http://www.w3schools.com/tags/ref_entities.asp

If I switch to Verdana, Helvetica, Times New Roman, then the characters display okay. So this is definitely a font issue.

The question is, why does GTK have no problem with these fonts, and *only* KDE/QT applications do?

@Juuso Alasuutari: can you see the special characters of the above URL?
Comment 9 Timothy Stotts 2006-09-16 20:43:27 UTC
Here we are. Seems to be a QT3 issue.  Even if the font supplies the higher-number glyph, QT still performs substitution.  Sheesh...

http://www.alweb.dk/blog/anders/free_font_containing_more_unicode_glyphs

@Thiago Macieira:  can you show us your /etc/fonts/*.conf and configuration for the Xorg.org fonts (in either /etc/X11/xorg.conf or /etc/X11/fs/config)?
Comment 10 Juuso Alasuutari 2006-09-17 12:51:30 UTC
Yes, the page you linked to displays correctly. Perhaps there's something in the HTML code that doesn't anger Qt. I tested Konqueror's unicode abilities with the smallest possible html file. Try this:

  <html>
  &euro;
  </html>

The euro character is displayed correctly on my box. It's strange that the more complex pages (most websites) are screwed up.

Speaking of &euro, by the way: That's about the only character that isn't correctly displayed on the page you linked to. It's a four-spiked star.
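
To make this kind of test easier to repeat, here is a rough Python sketch that generates a slightly larger test page: one paragraph per font, each containing the entities discussed in this report. It is only a sketch under assumptions: the output file name `entity_test.html` and the helper `build_test_page` are hypothetical, and the font list is simply taken from the fonts named in this thread.

```python
# Entities discussed in this report, written with terminating semicolons.
ENTITIES = ["&trade;", "&copy;", "&euro;", "&hellip;",
            "&mdash;", "&ndash;", "&ldquo;", "&rdquo;"]

# Fonts named in this thread; whether they are installed depends on the system.
FONTS = ["Times", "Arial", "Helvetica", "Times New Roman", "Bitstream Vera Sans"]

def build_test_page(entities=ENTITIES, fonts=FONTS):
    """Return an HTML page with one paragraph per font, each listing all entities."""
    rows = []
    for font in fonts:
        cells = " ".join(entities)
        rows.append('<p style="font-family: \'%s\'">%s: %s</p>' % (font, font, cells))
    return "<html>\n<body>\n%s\n</body>\n</html>\n" % "\n".join(rows)

if __name__ == "__main__":
    # Write the page to a hypothetical file name, then open it in the browser.
    with open("entity_test.html", "w", encoding="utf-8") as handle:
        handle.write(build_test_page())
```

Opening the generated file in Konqueror and in a Gecko-based browser side by side should show which font and entity combinations come out wrong.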
Comment 11 Allan Sandfeld 2006-09-17 13:42:18 UTC
Well-known Qt3 bug.

*** This bug has been marked as a duplicate of 47682 ***