Version: (using KDE Devel)
Installed from: Compiled sources
OS: Linux

If I use accented characters (as in conteúdo) in JavaScript identifiers, I get JavaScript errors and some things don't work.
Please provide a simple test case.
<body>
Simple test case.
<script language="JavaScript" type="text/JavaScript">
var conteúdo = "Accentuated identifier.";
alert(conteúdo);
</script>
</body>
Although the parser code looks Unicode-clean at first sight, I suspect that Lexer::isIdentLetter() might be too strict. I have to check the spec. Assigning to the kjs component.
CVS commit by porten:

allow umlauts, accents as well as greek, cyrillic, thai etc. letters in identifier names. Not 100% precise but better than before.

BUGS:102793

  M +3 -0 ChangeLog 1.65
  M +14 -1 lexer.cpp 1.62

--- kdelibs/kjs/lexer.cpp #1.61:1.62
@@ -577,8 +577,21 @@ bool Lexer::isWhiteSpace(unsigned short
 bool Lexer::isIdentLetter(unsigned short c)
 {
-  /* TODO: allow other legitimate unicode chars */
+  // Allow any character in the Unicode categories
+  // Uppercase letter (Lu), Lowercase letter (Ll),
+  // Titlecase letter (Lt)", Modifier letter (Lm),
+  // Other letter (Lo), or Letter number (Nl).
+  // Also see: http://www.unicode.org/Public/UNIDATA/UnicodeData.txt */
   return (c >= 'a' && c <= 'z' ||
           c >= 'A' && c <= 'Z' ||
+          // A with grave - O with diaeresis
+          c >= 0x00c0 && c <= 0x00d6 ||
+          // O with stroke - o with diaeresis
+          c >= 0x00d8 && c <= 0x00f6 ||
+          // o with stroke - turned h with fishook and tail
+          c >= 0x00f8 && c <= 0x02af ||
+          // Greek etc. TODO: not precise
+          c >= 0x0388 && c <= 0x1ffc ||
           c == '$' || c == '_');
+  /* TODO: use complete category table */
 }

--- kdelibs/kjs/ChangeLog #1.64:1.65
@@ -1,4 +1,7 @@
 2005-04-24  Harri Porten  <porten@kde.org>

+	* lexer.cpp (isIdentLetter): allow umlauts, accents as well as
+	greek, cyrillic, thai etc. letters in identifier names.
+
 	* date_object.cpp (KRFCDate_parseDate): correctly handle large
 	year numbers in "MM/DD/YYYY" formats
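For readers who want to try the new ranges outside of kjs, here is a minimal standalone sketch of the check from the commit above. The free function name isIdentLetterSketch is hypothetical and is not part of kdelibs. The ranges are copied from the patch, with parentheses added only to make the grouping explicit. Like the patch, it is a coarse approximation and not a full Unicode category lookup.

```cpp
// Standalone sketch of the range check from the commit above.
// isIdentLetterSketch is a hypothetical name; the real code is
// Lexer::isIdentLetter() in kdelibs/kjs/lexer.cpp.
constexpr bool isIdentLetterSketch(unsigned short c)
{
  return (c >= 'a' && c <= 'z') ||
         (c >= 'A' && c <= 'Z') ||
         (c >= 0x00c0 && c <= 0x00d6) ||  // A with grave .. O with diaeresis
         (c >= 0x00d8 && c <= 0x00f6) ||  // O with stroke .. o with diaeresis
         (c >= 0x00f8 && c <= 0x02af) ||  // o with stroke .. turned h with fishhook and tail
         (c >= 0x0388 && c <= 0x1ffc) ||  // Greek etc., coarse as in the patch
         c == '$' || c == '_';
}

// Compile-time checks: the accented letter from the test case is accepted,
// while the two symbols that the patch skips inside Latin-1 are rejected.
static_assert(isIdentLetterSketch(0x00fa), "u with acute, as in conteúdo");
static_assert(!isIdentLetterSketch(0x00d7), "multiplication sign");
static_assert(!isIdentLetterSketch(0x00f7), "division sign");
```

The gaps at 0x00d7 and 0x00f7 are why the patch uses three Latin ranges instead of one: those code points are the multiplication and division signs, which are not letters.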