Version: 3.3.90 (using 4.3.90 (KDE 4.3.90 (KDE 4.4 RC1)), Kubuntu packages)
Compiler: cc
OS: Linux (x86_64) release 2.6.31-17-generic

After upgrading from KDE 4.3.2 to KDE 4.4 beta1, I found that encoding autodetection in Kate is broken. In 4.3 it detected the correct encoding of a file, but now it always uses the default encoding.

How to reproduce:
1. Open Kate.
2. Go to the options, set "Encoding" to "Unicode ( UTF-8 )" and "Encoding autodetection" to "Cyrillic", then press OK.
3. Open the file "test.txt", which contains the text "Тестовый текст" in CP1251 encoding.

The file is opened as UTF-8 in KDE 4.4, but as CP1251 in KDE 4.3!
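For anyone who wants to recreate the test file without the attachment, here is a minimal Python sketch. It is not part of Kate; the helper names `write_cp1251_sample` and `is_valid_utf8` are made up for illustration. It writes the sample text as CP1251 and shows that the resulting bytes are not valid UTF-8, which is why a UTF-8 view of the file shows replacement characters.

```python
import os
import tempfile

TEXT = "Тестовый текст"  # sample text from the reproduction steps


def write_cp1251_sample(path: str) -> bytes:
    """Write TEXT to `path` encoded as CP1251 and return the raw bytes."""
    data = TEXT.encode("cp1251")
    with open(path, "wb") as handle:
        handle.write(data)
    return data


def is_valid_utf8(data: bytes) -> bool:
    """Return True if `data` decodes as UTF-8 without errors."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    sample_path = os.path.join(tempfile.mkdtemp(), "test.txt")
    raw = write_cp1251_sample(sample_path)
    # One byte per character in CP1251, and the bytes are not valid UTF-8.
    print(len(raw), is_valid_utf8(raw))
```

Decoding the same bytes with `raw.decode("utf-8", errors="replace")` yields U+FFFD replacement characters in place of the Cyrillic letters, while `raw.decode("cp1251")` gives back the original text.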
Created attachment 39762
Test file with text "Тестовый текст" in cp1251 encoding.
In kate application output when opening this file I see: kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (App) KateDocManager::slotDocumentNameChanged: docname changed: "Untitled" -----> "Untitled" kate(4256)/Kate (Document) KateFileLoader::open: PROBER TYPE: "Cyrillic" kate(4256)/Kate (Document) KateFileLoader::open: OPEN USES ENCODING: "windows-1251" kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Document) KateBuffer::openFile: Broken UTF-8: false kate(4256)/Kate (Document) KateBuffer::openFile: LOADING DONE 1 kate(4256)/Kate (Document) KateModeManager::fileType: kate(4256)/kdecore (services) KMimeTypeFactory::parseMagic: Now parsing "/usr/local/share/mime/magic" kate(4256)/kdecore (services) KMimeTypeFactory::parseMagic: Now parsing "/usr/share/mime/magic" kate(4256)/kdecore (services) 
KMimeTypeFactory::parseMagic: Now parsing "/home/murz/.local/share/mime/magic" kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Code Completion) KateCompletionWidget::abortCompletion: kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL 
UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (App) KateDocManager::slotDocumentNameChanged: docname changed: "Untitled" -----> "test.txt" kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (Document) KateBuffer::doHighlight: HIGHLIGHTED END --- NEED HL, LINESTART: 0 LINEEND: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL UNTIL LINE: 0 MAX: 0 kate(4256)/Kate (Document) KateBuffer::doHighlight: HL DYN COUNT: 0 MAX: 512 kate(4256)/Kate (Document) KateView::updateView: KateView::updateView kate(4256)/Kate (App) KateViewDocumentProxyModel::opened: QModelIndex(0,0,0x0,KateViewDocumentProxyModel(0x13d1020) ) kate(4256)/Kate (App) KateMainWindow::slotUpdateHorizontalViewBar: slotUpdateHorizontalViewBar() kate(4256)/Kate (App) KateMainWindow::slotUpdateHorizontalViewBar: KateViewBar(0x1366f30) hiding container kate(4256)/kio (KDirListerCache) KDirListerCache::slotFileDirty: "/home/murz/Documents/checkpoint_in_progress" kate(4256)/kio (KDirListerCache) KDirListerCache::slotFileDirty: "/home/murz/Documents" kate(4256)/kio (KDirListerCache) KDirListerCache::updateDirectory: KUrl("file:///home/murz/Documents") kate(4256)/Kate (Document) KateView::slotLostFocus: KateView::slotLostFocus kate(4256)/Kate (Code Completion) 
KateCompletionWidget::abortCompletion:
The key line is 'OPEN USES ENCODING: "windows-1251"', yet the file is opened as UTF-8 and I see "�������� ����" instead of the text!
*** Bug 222180 has been marked as a duplicate of this bug. ***
The problem is still present in KDE 4.4 RC2!
The bug still exists in the KDE 4.4 release too!
4.4.1 Bug still exists.
Removed auto-detection for KDE 4.5, too buggy :(

The new basic idea is:
1. Try the standard encoding.
2. If that does not work out, try to detect the encoding by BOM or use the fallback encoding (default is latin-15; it can be changed in the config dialog, for example to your preferred encoding).
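The two steps above can be sketched roughly as follows. This is only an illustration in Python, not Kate's actual C++ code: the function names `encoding_from_bom` and `load` are hypothetical, and `iso8859-15` is assumed to be what "latin-15" refers to.

```python
import codecs

# BOM signatures. UTF-32 LE is checked before UTF-16 LE because both start
# with FF FE. The generic "utf-16"/"utf-32" codecs consume the BOM themselves.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def encoding_from_bom(data: bytes):
    """Return a codec name if `data` starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def load(data: bytes, standard: str = "utf-8", fallback: str = "iso8859-15"):
    """Return (text, encoding_used) following the two-step idea."""
    # Step 1: try the standard encoding strictly.
    try:
        return data.decode(standard), standard
    except UnicodeDecodeError:
        pass
    # Step 2: use the BOM if there is one, otherwise the fallback encoding.
    chosen = encoding_from_bom(data) or fallback
    return data.decode(chosen, errors="replace"), chosen
```

With the defaults, a CP1251 file falls through to the fallback and is decoded as ISO 8859-15 without any error, so the text comes out wrong unless the fallback is set to `cp1251`.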
It isn't very buggy, it works very well for me! It successfully detects Unicode, CP1251, KOI8-R and so on. The basic idea doesn't help me, because I use three encodings, but "default" and "fallback" give me only two. Can I get the autodetection functionality via some separate package or patch in KDE 4.5?
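To illustrate why two encoding slots are not enough here, consider this small Python sketch (an illustration only, with hypothetical helper names). Single-byte encodings such as CP1251 and KOI8-R accept almost any byte sequence, so a file in one of them decodes "successfully" as the other and simply shows the wrong letters. A loader that only checks for decoding errors cannot tell them apart.

```python
TEXT = "Тестовый текст"


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes in `encoding` without raising an error."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


def cross_decode(text: str, actual: str, assumed: str) -> str:
    """Encode `text` as `actual`, then decode it as if it were `assumed`."""
    return text.encode(actual).decode(assumed)


if __name__ == "__main__":
    koi8_bytes = TEXT.encode("koi8-r")
    # No error is raised, but the result is not the original text.
    print(decodes_cleanly(koi8_bytes, "cp1251"))
    print(cross_decode(TEXT, "koi8-r", "cp1251") == TEXT)
```

UTF-8 is the exception: both the KOI8-R and the CP1251 bytes of this text fail a strict UTF-8 decode, so "try UTF-8 first" is safe, but choosing between the two Cyrillic code pages needs some kind of content-based detection.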
Could you provide me with 2-3 test files? I will look into the issue then once more, perhaps introducing the auto-detection as an interim step before using fallback encoding.
Assigned to me ;)
Created attachment 41530: test_cp1251.txt
Created attachment 41531: test_koi8-r.txt
Created attachment 41532: test_utf8.txt
I have attached 3 files with text in the different Cyrillic encodings that I use very often. In KDE 4.3 I set encoding autodetection to "Cyrillic" and Kate successfully detected the encoding of all of these files. In KDE 4.4 I have lost this functionality!
My changes are post KDE 4.4.x, therefore they didn't cause this. But I will have a look and try to get this stuff back for KDE 4.5, in a more reliable way. Thanks a lot for attaching the examples.
SVN commit 1102076 by cullmann:

reintroduce encoding prober, now loading is a four step thingy documented in code atm

CCBUG: 222195

already works for tests provided in bug, but yes, there will be again global config option to alter prober type

 M +17 -8 katetextbuffer.cpp
 M +28 -7 katetextloader.h

WebSVN link: http://websvn.kde.org/?view=rev&revision=1102076
SVN commit 1102099 by cullmann:

introduce encoding detection again, loading now works this way, first working phase will be last one :)

1. standard encoding or the one from filedialog/command line taken
2. encoding detection runs: BOM check, if that fails the selected prober runs, default "universal"
3. fallback encoding is used
4. again encoding from 1. is used, the file is loaded read-only, as encoding errors occured

BUG: 222195

fixes above bug, given standard encoding is utf-8 (fallback encoding doesn't matter), all attached test cases are opened with right encoding (even if the detection is default == universal, but ok with "cyrillic" too)

 M +2 -1 buffer/katetextbuffer.cpp
 M +20 -0 buffer/katetextbuffer.h
 M +0 -3 buffer/katetextloader.h
 M +21 -2 dialogs/katedialogs.cpp
 M +21 -4 dialogs/opensaveconfigwidget.ui
 M +4 -3 document/katebuffer.cpp
 M +74 -49 utils/kateconfig.cpp
 M +52 -6 utils/kateconfig.h
 M +9 -0 utils/kateglobal.cpp
 M +7 -1 utils/kateglobal.h

WebSVN link: http://websvn.kde.org/?view=rev&revision=1102099
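The four loading phases from the commit message above can be sketched like this. This is a hedged Python illustration, not the code from the commit: `LoadResult`, `load`, `try_decode` and `bom_encoding` are hypothetical names, and `toy_cyrillic_prober` is a deliberately simple stand-in for a real prober. It only picks whichever of CP1251 and KOI8-R yields more lowercase Cyrillic letters, relying on the fact that the two code pages place upper and lower case in swapped byte ranges.

```python
import codecs
from typing import Callable, NamedTuple, Optional


class LoadResult(NamedTuple):
    text: str
    encoding: str
    read_only: bool


def bom_encoding(data: bytes) -> Optional[str]:
    """Return a codec name for a leading BOM, or None (UTF-32 checked first)."""
    for bom, name in ((codecs.BOM_UTF32_LE, "utf-32"), (codecs.BOM_UTF32_BE, "utf-32"),
                      (codecs.BOM_UTF8, "utf-8-sig"),
                      (codecs.BOM_UTF16_LE, "utf-16"), (codecs.BOM_UTF16_BE, "utf-16")):
        if data.startswith(bom):
            return name
    return None


def toy_cyrillic_prober(data: bytes) -> Optional[str]:
    """Toy heuristic (an assumption, not a real prober): prefer the candidate
    whose decoding contains the most lowercase Cyrillic letters."""
    best, best_score = None, 0
    for name in ("cp1251", "koi8-r"):
        try:
            text = data.decode(name)
        except UnicodeDecodeError:
            continue
        score = sum(1 for ch in text if "а" <= ch <= "я")
        if score > best_score:
            best, best_score = name, score
    return best


def try_decode(data: bytes, encoding: Optional[str]) -> Optional[str]:
    """Strictly decode `data`; return None on any failure."""
    if encoding is None:
        return None
    try:
        return data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        return None


def load(data: bytes, standard: str = "utf-8", fallback: str = "iso8859-15",
         prober: Callable[[bytes], Optional[str]] = toy_cyrillic_prober) -> LoadResult:
    # Phase 1: the standard encoding (or the one chosen by the user).
    text = try_decode(data, standard)
    if text is not None:
        return LoadResult(text, standard, False)
    # Phase 2: detection, BOM first, then the prober.
    detected = bom_encoding(data) or prober(data)
    text = try_decode(data, detected)
    if text is not None:
        return LoadResult(text, detected, False)
    # Phase 3: the fallback encoding.
    text = try_decode(data, fallback)
    if text is not None:
        return LoadResult(text, fallback, False)
    # Phase 4: the standard encoding again, tolerating errors, read-only.
    return LoadResult(data.decode(standard, errors="replace"), standard, True)
```

Under these assumptions the sample text is loaded correctly from its UTF-8, CP1251 and KOI8-R encodings with the standard encoding set to UTF-8, regardless of the fallback, which matches the behaviour the commit message describes for the attached test cases.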
SVN commit 1102106 by cullmann:

add unittests for cyrillic encoding probing

BUG: 222195

 M +10 -0 CMakeLists.txt
 A cp1251.txt
 A cyrillic_utf8.txt
 A koi8-r.txt

WebSVN link: http://websvn.kde.org/?view=rev&revision=1102106
Created attachment 41648: bani_text_utf-8.txt
Created attachment 41649: cooler_utf-8.html
Created attachment 41650: joomla_cp1251.php
Created attachment 41651: joomla_frontend_cp1251.php
Created attachment 41652: joomla_template_cp1251.php
Created attachment 41653: kde.ru_index_utf-8.html
Created attachment 41654: lug_ivanovo_koi8-r.html
Created attachment 41655: page.tpl_utf-8.php
Created attachment 41656: qs_index_utf-8.html
Created attachment 41657: ruskde_koi8-r.htm
Created attachment 41658: sensi_koi8-r.html
Created attachment 41659: ubuntuclub.ru_cp1251.html
I searched for and attached some more files in UTF-8, CP1251 and KOI8-R Cyrillic encodings for testing. I hope it helps.