Bug 82227 - Knode reads wrong all encodings except utf-8
Summary: Knode reads wrong all encodings except utf-8
Status: RESOLVED WORKSFORME
Alias: None
Product: knode
Classification: Miscellaneous
Component: general
Version: 0.7.7
Platform: Compiled Sources Linux
Importance: NOR normal
Target Milestone: ---
Assignee: kdepim bugs
 
Reported: 2004-05-26 09:42 UTC by Federico Zenith
Modified: 2009-08-09 18:06 UTC

Description Federico Zenith 2004-05-26 09:42:51 UTC
Version:           0.7.7 (using KDE 3.2.2)
Installed from:    Compiled From Sources
Compiler:          g++ (GCC) 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7) 
OS:                Linux

Knode assumes that every post is encoded in utf-8, even when the post was written in another encoding and declares that encoding in its header.

The reason I alone am seeing this may be that I am one of the few who use utf-8 everywhere on the machine, from the file system up. Many people use Latin-1 or Latin-9 as their default encoding, and since 95% of posts are written in those encodings they do not experience problems.

I would suggest that Knode use the post's declared encoding instead of the system default, which is what it appears to be using now. Knode is presumably already meant to do this, and it does recognize ISO-8859-1 correctly when the name is written in all caps; but most posts in this encoding (over 99%, I would say) declare it in lower case, as "iso-8859-1".

I guess the solution is to make the charset name matching case-insensitive.
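
A minimal sketch of the case-insensitive matching suggested above, written in Python rather than in Knode's own code. The helper name `charset_from_content_type` and the utf-8 default are assumptions made for illustration; this is not how Knode is known to implement it.

```python
import codecs
from email.message import Message


def charset_from_content_type(header_value, default="utf-8"):
    """Return a normalized codec name for a Content-Type header value.

    Hypothetical helper for illustration only; not Knode's code.
    """
    msg = Message()
    msg["Content-Type"] = header_value
    declared = msg.get_param("charset")  # None if no charset is declared
    if declared is None:
        return default
    try:
        # Lower-casing is the whole fix: "ISO-8859-1" and "iso-8859-1"
        # must resolve to the same codec.
        return codecs.lookup(declared.strip().lower()).name
    except LookupError:
        # Unknown charset name: fall back to the default.
        return default
```

With this normalization, a header declaring `charset="ISO-8859-1"` and one declaring `charset=iso-8859-1` select the same codec.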

Steps to reproduce:
1- configure a Unicode machine with the following instructions:
   if "locale -a | grep en_US" does not return at least one utf-8 locale, define:
       localedef -i en_US -f UTF-8 en_US.UTF-8
   and set the environment variable LC_CTYPE="en_US.UTF-8" before starting X and KDE;
2- Start Knode;
3- Look in any newsgroup with a predominance of Western European encodings.
Comment 1 Thomas Moschny 2004-07-02 18:15:19 UTC
Knode 0.7.7 from KDE 3.2.3 seems to recognize the charset, even when not given in caps.

But there is a related problem: the menu 'view -> charset' seems to override the charset for *all* articles, not only for those without a charset specification (as one would expect).

In a utf-8 environment, to properly read all those articles without charset specification, I tried to choose latin1 from this menu, but then I have garbage for all umlauts in the utf-8 articles, *although* they are correctly marked as being utf-8. So there is no way of reading all articles without umlaut garbage.

[In a latin1 environment this problem does not arise, because one can leave the charset setting set to 'automatic', i.e. use the charset from the article, or the system default (latin1 in this case), if the article doesn't specify one.]
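
The umlaut garbage described above can be reproduced outside Knode. The following is only an illustration in Python, with a made-up sample word; it shows what happens when the bytes of a correctly labelled utf-8 article are decoded as latin1 instead.

```python
# Sample text as it would appear in a correctly labelled utf-8 article.
original = "Müller"

# Bytes of that text when encoded as utf-8.
wire_bytes = original.encode("utf-8")

# Forcing latin1 on those bytes: the umlaut becomes two wrong characters.
forced_latin1 = wire_bytes.decode("latin-1")

# Honouring the declared charset restores the text.
honoured = wire_bytes.decode("utf-8")

print(forced_latin1)  # prints "MÃ¼ller"
print(honoured)       # prints "Müller"
```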

IMHO there are two solutions: Change the meaning of the 'view->charset' menu to not override a spec in the article, or add a means of specifying a charset as a fallback for articles without a spec (see Bug 44367).
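
Both proposals amount to the same precedence rule: the charset declared in the article wins, and a user-chosen charset is only a fallback. A rough Python sketch of that rule follows; the function names, the parameter names, and the final utf-8 default are hypothetical and do not describe Knode's actual code.

```python
import codecs


def pick_charset(declared, fallback, system_default="utf-8"):
    """Pick the charset used to decode one article.

    Assumed order: the article's own declaration, then the user-chosen
    fallback, then the system default. Illustrative only.
    """
    for candidate in (declared, fallback, system_default):
        if not candidate:
            continue  # nothing specified at this level
        try:
            return codecs.lookup(candidate).name
        except LookupError:
            continue  # unknown name: try the next candidate
    return "utf-8"


def decode_article(body, declared=None, fallback=None):
    """Decode raw article bytes using the precedence rule above."""
    charset = pick_charset(declared, fallback)
    return body.decode(charset, errors="replace")
```

Under this rule a latin1 fallback repairs articles that carry no charset specification, while articles marked as utf-8 are left alone.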
Comment 2 Olivier Trichet 2009-08-09 18:06:15 UTC
- Charset names are no longer case-sensitive (and have not been for a long time), and messages that declare their encoding correctly are decoded correctly.
- A fallback encoding can be set per group (in the group properties) or globally (in the "posting" section of the configuration; there is already a bug requesting distinct default encodings for posting and reading).