Bug 99199 - Amarok crashes with Hyper Threading (HT)
Summary: Amarok crashes with Hyper Threading (HT)
Status: RESOLVED FIXED
Alias: None
Product: amarok
Classification: Applications
Component: general
Version: 1.2-beta4
Platform: unspecified Linux
Importance: NOR crash with 40 votes
Target Milestone: ---
Assignee: Amarok Developers
URL:
Keywords:
Duplicates: 102157 106480 109028 113206
Depends on:
Blocks:
 
Reported: 2005-02-12 16:39 UTC by Manuel Zamora-Morschhäuser
Modified: 2006-08-31 21:40 UTC
CC List: 6 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
backtrace after play/pause crash with xine and HT enabled (15.98 KB, text/plain)
2005-08-24 20:31 UTC, Frank Roscher
Check lib version not kernel version (1.53 KB, patch)
2005-10-22 01:48 UTC, Jaison Lee

Description Manuel Zamora-Morschhäuser 2005-02-12 16:40:19 UTC
Version:           1.2-beta4 (using KDE 3.3.2, compiled sources)
Compiler:          gcc version 3.3.4
OS:                Linux (i686) release 2.6.10

When amarok is playing and I do other things alongside it, like surfing the net or coding (nothing too CPU-intensive), amarok sometimes crashes when it gets the focus. It apparently happens randomly, but prior to each crash I had minimized, restored or just clicked the amarok window. I am using amarok 1.2 beta 4, self-compiled, on a Slackware 10.1 system with a custom-compiled 2.6.10 kernel. Here is a full backtrace (amarok compiled with enable-debug=full):

[...]
amarok: [Scrobbler] Submit successful
QString::arg(): Argument missing: 'Home' and one other track submitted, 1
[New Thread 327690 (LWP 13010)]
[Thread 327690 (LWP 13010) exited]
amarok: [ThreadWeaver] Job aborted: CollectionReader. Jobs pending: 0
[Thread 294921 (LWP 13000) exited]
amarok: BEGIN: void EngineSubject::stateChangedNotify(Engine::State)
amarok: END__: void EngineSubject::stateChangedNotify(Engine::State) - Took 0.08s

Program received signal SIG43, Real-time event 43.
[Switching to Thread 16384 (LWP 12936)]
0x41a3b361 in select () from /lib/libc.so.6
(gdb) bt
#0  0x41a3b361 in select () from /lib/libc.so.6
#1  0x414d8170 in ?? () from /usr/lib/qt/lib/libqt-mt.so.3
#2  0x00000023 in ?? ()
#3  0x08251108 in ?? ()
#4  0x00000001 in ?? ()
#5  0x40fde7da in QEventLoop::processEvents () from /usr/lib/qt/lib/libqt-mt.so.3
#6  0x41046ba8 in QEventLoop::enterLoop () from /usr/lib/qt/lib/libqt-mt.so.3
#7  0x41046a58 in QEventLoop::exec () from /usr/lib/qt/lib/libqt-mt.so.3
#8  0x41034aa1 in QApplication::exec () from /usr/lib/qt/lib/libqt-mt.so.3
#9  0x081615ab in QWizard::setFinish ()
#10 0x4199d469 in __libc_start_main () from /lib/libc.so.6
#11 0x080811d1 in ?? ()

I am currently using the xine engine, but this has happened to me with every other engine as well.
The system is a Pentium IV 3GHz with HT enabled, Asus P4P800 board (i865 chipset), 512 MB of Infineon RAM. Nothing too exotic. The rest of the system is rock stable; amarok is the only application with random crashes like this one.

I hope I could help! Thanks for all your work on this great music player.
Comment 1 Max Howell 2005-02-28 16:16:41 UTC
It looks like you have stripped amaroK; there is no amaroK-specific debug information in this backtrace.
Comment 2 Greg Meyer 2005-03-19 06:57:08 UTC
Can you please try and reproduce with current cvs or maybe 1.2.2? Thanks.
Comment 3 Michal K 2005-04-05 16:43:21 UTC
Funny thing: I was searching for some random amarok crashes when I found this report. The funny part is the similarity between our PC configurations.

Mine is: P4 HT, Asus P4P800 board, 512 MB, with Slack 10.0!

I also have some random crashes.
I will work on debugging next week.

Is it possible that this is the problem?

sorry for my English :]
Comment 4 illogic-al 2005-05-07 06:25:40 UTC
Michal what version of amaroK are you using?
Comment 5 Michal K 2005-05-09 11:29:35 UTC
1.3-CVS
Comment 6 Michal K 2005-05-09 11:31:12 UTC
Sorry, I'm using 1.2.3.
Comment 7 Greg Tourte 2005-07-12 16:50:48 UTC
I get exactly the same problem with Slackware 10.1, a custom-compiled 2.6.11.9 kernel on a Pentium IV with HT enabled (SMP) and amarok 1.3beta2.

It only seems to happen on SMP-configured kernels, whether from the 2.4 or the 2.6 branch. Reverting to a non-SMP kernel makes amarok much more stable.

Hope that helps.
Comment 8 Mark Kretschmann 2005-07-12 17:34:15 UTC
On Tuesday 12 July 2005 16:50, Greg Tourte wrote:
> ------- Additional Comments From artourter gmail com  2005-07-12 16:50
> ------- I get exactly the same problem with slackware 10.1, custom compiled
> kernel 2.6.11.9 on a pentium IV with HT enabled (smp) and amarok 1.3beta2.
>
> it only seem to happen on smp configured kernel whether it is 2.4 or 2.6
> branch. reverting to a non smp kernel makes amarok much more stable.


Which audio engine are you using? I've heard that GStreamer has trouble with 
HT.
Comment 9 Greg Tourte 2005-07-12 19:41:06 UTC
I am using the xine engine, but I have tried gstreamer and arts as well, and all of them give the same result.
Comment 10 Manuel Zamora-Morschhäuser 2005-07-12 21:05:03 UTC
I can second that; the engine type doesn't matter when it comes to these random crashes. Most of the crashes I have these days happen either at a song switch or, as just now, somewhere in the middle of a track. This time the program didn't crash but just froze. I had to send it SIGINT, and the backtrace showed this:

#0  0xb6a98309 in __lll_mutex_lock_wait () from /lib/tls/libpthread.so.0
#1  0xb6a9533e in _L_mutex_lock_29 () from /lib/tls/libpthread.so.0
#2  0x00000001 in ?? ()
#3  0xb46b9c94 in ?? () from /usr/lib/./libxine.so.1

Funnily, all crashes I get have their origin in libpthread (including the song-switching ones). This could be a Slackware problem with its libpthread or some strange hardware problem concerning a P4 with HT.

When I get my next song-switching crash I'll post another backtrace... 
Comment 11 Greg Tourte 2005-07-12 21:14:32 UTC
Manuel, you seem to be using the lib from the current tree (/lib/tls), but I am not. So we have at least two different versions of the libraries here.

So unless it is something in the way Patrick compiles the libraries in general, I doubt it is a Slackware problem, but who knows.
Comment 12 Manuel Zamora-Morschhäuser 2005-07-12 21:43:18 UTC
You are right, I am using -current. This does at least reduce the chances of the problem being Slackware-specific.
Comment 13 Greg Tourte 2005-07-13 15:38:25 UTC
just one silly question:

out of you guys who have these relugar crashes, how many of you run setiat home or something similar? and what happen if setiathome (or equivalent) is truned off?
Comment 14 Greg Tourte 2005-07-13 16:02:40 UTC
woops sorry about the typos. my dyslexia is at it again! (I wish one could edit comments in here once they have been posted:/

What I really wanted to type was:
Out of you guys who have these regular crashes, how many of you run setiathome or something similar? and what happen if setiathome (or equivalent) is turned off? 

Anyway to add something to my own post, the reason I am asking is that I used to have setiathome running all the time until now when I have switched it off (to cool down the machine). And just to try amarok on a file today, I started amarok and it has been running now without crashing for more than an hour (which is very unusual as it usually crashes after 5 minutes when I am lucky).

well it did actually freeze at some point when I changed the analyser view and claimed that the GL driver didn't support something, but this is another issue entirely.
Comment 15 Manuel Zamora-Morschhäuser 2005-07-13 21:53:37 UTC
I don't run anything like seti@home. 

But if programs get unstable when you use 100% CPU time, this could possibly be some sort of hardware instability. Of course, that would affect all applications, not only amarok.

The random crashes I get are totally unrelated to the work I do. Amarok crashes when I just listen to music (i.e. CPU idle) or when I do heavy work (for example, photo editing).
Comment 16 Greg Tourte 2005-07-13 23:34:58 UTC
Yes, you are right. If it was a hardware instability, it would affect the entire system. But my system is rock solid apart from amarok...

by the way if you use musepack files (.mpc) can you have a look at bug #109028. I seem to only be able to reproduce it on the smp/HT machine so it might be linked to this although that one is not random. I'd be interested to see if you get the same thing.
Comment 17 Matthieu Bedouet 2005-08-07 03:16:41 UTC
I can confirm this issue, when disabling HT in the BIOS amarok becomes very stable.

I get the same kind of backtraces, which involve libc, libqt-mt and libpthread.
They are quite random, but often seem related to image manipulation code.

I can't tell if it's a problem with glibc, qt or amarok, but the other KDE apps are very stable.

I'm running Debian sid with:
kde 3.4.1
qt 3.3.4
glibc 2.3.5
linux 2.6.12.3 (with CONFIG_SMP, CONFIG_SCHED_SMT, CONFIG_PREEMPT)

amarok 1.3svn compiled with gcc 3.3 (same as kde and qt)

ldd /usr/bin/amarokapp
        ...
        libqt-mt.so.3 => /usr/lib/libqt-mt.so.3 (0xb6907000)
        ...
        libstdc++.so.5 => /usr/lib/libstdc++.so.5 (0xb6307000)
        libm.so.6 => /lib/tls/libm.so.6 (0xb62e1000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0xb62d6000)
        libc.so.6 => /lib/tls/libc.so.6 (0xb619e000)
        ...
Comment 18 Ian Monroe 2005-08-07 05:52:04 UTC
Not many KDE apps use multithreading as extensively as amaroK. If there was an issue with multithreading and HT, amaroK would be the app to show it. Though it could be the opposite, some bugs only show themselves when HT is on.
Comment 19 Frank Roscher 2005-08-23 19:46:04 UTC
Same issue here - Pentium IV 3GHz running Gentoo. When HT is enabled amaroK crashes about once each hour, the rest of the system remains quite stable.
Has anybody got an idea who we have to talk to to get this bug fixed? It simply can't stay like this, one shouldn't be forced to fiddle with BIOS settings to get a stable desktop - and most people won't even know about the cause of the problem.
Comment 20 Frank Roscher 2005-08-24 17:01:53 UTC
I tried reverting to an older version of pth, namely 1.4.0 instead of 2.0.3, but it didn't make any difference (I recompiled amaroK after the switch if that matters at all)

BTW, I can provoke a crash with HT enabled by hitting the play/pause button several times in a row - amaroK doesn't survive more than 4 or 5 of these commands. It still can create the mail with the backtrace in this case, but as other people already pointed out not all HT-related crashes are like this: Sometimes amaroK just freezes, sometimes it spawns two or even three mail windows with slightly different content (broken backtraces, missing information).

Probably this really isn't the fault of the amaroK developers, but for now you're the ones that at least are in a position to gather more information about this problem so we eventually can bug the people who can fix this issue. Please, help us :) 
Comment 21 Mark Kretschmann 2005-08-24 17:20:55 UTC
On Wednesday 24 August 2005 17:01, Frank Roscher wrote:
> Probably this really isn't the fault of the amaroK developers, but for now
> you're the ones that at least are in a position to gather more information
> about this problem so we eventually can bug the people who can fix this
> issue. Please, help us :)


The strange thing is that amaroK works just fine in a SMP environment, say 
with two processors. To my knowledge, HT should be entirely transparent to 
the application and behave just like SMP. So I'm wondering how one could get 
different results.

Does this not point to a bug in the kernel or perhaps one of the low-level 
libraries, like glibc?
Comment 22 Frank Roscher 2005-08-24 20:29:54 UTC
I noticed that the reproducible play/pause bug I described above is xine-specific (tested arts, gstreamer and helix). Maybe this can give a further clue about the nature of the problem? FWIW, I'll attach one of the backtraces I get - they look rather odd to me but as I'm no developer that doesn't say too much...

Before I forget: CONFIG_SCHED_SMT has no influence on the problem (I just tell you so other people don't waste their time trying this).
Comment 23 Frank Roscher 2005-08-24 20:31:36 UTC
Created attachment 12361
backtrace after play/pause crash with xine and HT enabled
Comment 24 Frank Roscher 2005-08-26 21:23:27 UTC
Well, looks like we're stuck here.
Maybe it would be best to simply file a bug on the kernel.org bugtracker and see if the people there jump at us for thinking it could be their fault ;)

BTW, is there an obvious reason why Kaffeine doesn't crash when I repeatedly hit play/pause although it uses xine, too? Could this mean the problematic code (or at least part of it) is in amaroK's xine engine?
Comment 25 Alexandre Oliveira 2005-09-10 05:22:33 UTC
Better summary.
Comment 26 Alexandre Oliveira 2005-09-10 05:29:15 UTC
*** Bug 106480 has been marked as a duplicate of this bug. ***
Comment 27 Mark Kretschmann 2005-09-22 09:30:49 UTC
We have found out that passing the kernel parameter "NOHT" at boot time fixes these issues.

The good thing is that HyperThreading will still be used. NOHT only deactivates a presumably buggy codepath in the kernel.
Comment 28 Frank Roscher 2005-09-22 12:57:11 UTC
Great! Works for me, too :)

Is there any information on what the kernel devs are going to do about this?
Comment 29 Ian Monroe 2005-09-22 15:49:01 UTC
You could make a bug on http://bugzilla.kernel.org
Comment 30 Mark Kretschmann 2005-09-24 14:52:43 UTC
*** Bug 113206 has been marked as a duplicate of this bug. ***
Comment 31 Mark Kretschmann 2005-09-25 11:20:11 UTC
SVN commit 463759 by markey:

Added a check for HyperThreading, using the proc file system. If HyperThreading is enabled, a warning dialog is shown at first startup.

CCBUG: 99199
CCMAIL: amarok-devel@lists.sf.net



 M  +1 -0      ChangeLog  
 M  +30 -1     src/app.cpp  
 M  +3 -0      src/app.h  


--- trunk/extragear/multimedia/amarok/ChangeLog #463758:463759
@@ -9,6 +9,7 @@
     * Added a mouse-over effect for the volume slider.
 
   CHANGES:
+    * Added a warning dialog when HyperThreading is enabled. (BR 99199)
     * Added a context menu to the volume slider.
     * When viewing covers in fullsize, the window has a maximum size, and
       scrollbars are shown if necessary. The user can also scroll the cover
--- trunk/extragear/multimedia/amarok/src/app.cpp #463758:463759
@@ -23,7 +23,6 @@
 #include "configdialog.h"
 #include "debug.h"
 #include "collectionbrowser.h"
-
 #include "effectwidget.h"
 #include "enginebase.h"
 #include "enginecontroller.h"
@@ -64,6 +63,7 @@
 #include <qpalette.h>            //applyColorScheme()
 #include <qpixmap.h>             //QPixmap::setDefaultOptimization()
 #include <qpopupmenu.h>          //genericEventHandler
+#include <qtimer.h>              //showHyperThreadingWarning()
 #include <qtooltip.h>            //default tooltip for trayicon
 
 
@@ -146,6 +146,20 @@
     pruneCoverImages();
     #endif
 
+    // Check for HyperThreading, see BUG 99199
+    QString line;
+    QFile cpuinfo( "/proc/cpuinfo" );
+    if ( cpuinfo.open( IO_ReadOnly ) ) {
+        while ( cpuinfo.readLine( line, 20000 ) != -1 ) {
+            if ( line.startsWith( "flags" ) ) {
+                const QString flagsLine = line.section( ":", 1 );
+                const QStringList flags = QStringList::split( " ", flagsLine );
+                if ( flags.contains( "ht" ) )
+                    QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
+            }
+        }
+    }
+
     // Trigger collection scan if folder setup was changed by wizard
     if ( oldCollectionFolders != AmarokConfig::collectionFolders() )
         CollectionDB::instance()->startScan();
@@ -877,6 +891,21 @@
 }
 
 
+void App::showHyperThreadingWarning() const //SLOT
+{
+    const QString text =
+        i18n( "<p>You are using a processor with the <i>HyperThreading</i> "
+              "feature enabled. Please note that amaroK may be unstable with this "
+              "configuration.</p>"
+              "<p>If you are experiencing problems, use the Linux kernel option 'NOHT', "
+              "or disable <i>HyperThreading</i> in your BIOS setup.</p>"
+              "<p>More information can be found in the README file. For further assistance "
+              "join us at #amarok on irc.freenode.net.</p>" );
+
+    KMessageBox::information( 0, text, i18n( "Warning" ), "showHyperThreadingWarning" );
+}
+
+
 void App::pruneCoverImages()
 {
 #ifdef AMAZON_SUPPORT
--- trunk/extragear/multimedia/amarok/src/app.h #463758:463759
@@ -65,6 +65,9 @@
         void slotConfigEqualizer();
         void firstRunWizard();
 
+    private slots:
+        void showHyperThreadingWarning() const;
+
     private:
         void initGlobalShortcuts();
         void applyColorScheme();
Comment 32 Frank Roscher 2005-09-25 13:11:33 UTC
With HT, there are two CPUs listed in the file, resulting in 2 of the new dialogs showing up. Doesn't look too good :)

On a sidenote: The crashes returned for me the next day after I reported success with the kernel parameter. I checked everything that could have changed in that time, but they wouldn't disappear again. It could be useful if others reported how it worked for them.
Comment 33 Mark Kretschmann 2005-09-25 13:26:20 UTC
SVN commit 463789 by markey:

Don't show HyperThreading warning twice.

CCBUG: 99199
CCMAIL: yannick.torres@keliglia.com



 M  +3 -1      app.cpp  


--- trunk/extragear/multimedia/amarok/src/app.cpp #463788:463789
@@ -154,8 +154,10 @@
             if ( line.startsWith( "flags" ) ) {
                 const QString flagsLine = line.section( ":", 1 );
                 const QStringList flags = QStringList::split( " ", flagsLine );
-                if ( flags.contains( "ht" ) )
+                if ( flags.contains( "ht" ) ) {
                     QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
+                    break;
+                }
             }
         }
     }
Comment 34 Mark Kretschmann 2005-09-25 20:16:44 UTC
SVN commit 463916 by markey:

Reduced the likeliness of false positives for the HyperThreading check. Now we count the number of (virtual) CPUs with the HT flag. If the number is greater than one, we assume that HT is enabled.

CCBUG: 99199



 M  +7 -5      app.cpp  


--- trunk/extragear/multimedia/amarok/src/app.cpp #463915:463916
@@ -146,21 +146,23 @@
     pruneCoverImages();
     #endif
 
-    // Check for HyperThreading, see BUG 99199
+    // BEGIN Check for HyperThreading, see BUG 99199
     QString line;
+    uint cpuCount = 0;
     QFile cpuinfo( "/proc/cpuinfo" );
     if ( cpuinfo.open( IO_ReadOnly ) ) {
         while ( cpuinfo.readLine( line, 20000 ) != -1 ) {
             if ( line.startsWith( "flags" ) ) {
                 const QString flagsLine = line.section( ":", 1 );
                 const QStringList flags = QStringList::split( " ", flagsLine );
-                if ( flags.contains( "ht" ) ) {
-                    QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
-                    break;
-                }
+                if ( flags.contains( "ht" ) ) ++cpuCount;
             }
         }
     }
+    // If multiple CPUs are listed with the HT flag, we got HyperThreading enabled
+    if ( cpuCount > 1 )
+        QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
+    // END
 
     // Trigger collection scan if folder setup was changed by wizard
     if ( oldCollectionFolders != AmarokConfig::collectionFolders() )
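
The counting heuristic described in this commit can be tried outside of amaroK. The following is only a rough sketch in plain standard C++ rather than the Qt 3 code from the commit; the function names `countHtCpus` and `looksLikeHyperThreading` are made up for illustration. It matches "ht" as a whole token, so flags that merely contain those letters are not counted. As the commit message says, this reduces false positives but cannot rule them out, because the flag reports a CPU capability, not whether HyperThreading is actually switched on.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Count the CPU entries whose "flags" line lists the "ht" flag.
// `in` is any stream holding /proc/cpuinfo-style text (hypothetical helper).
unsigned countHtCpus( std::istream &in )
{
    unsigned count = 0;
    std::string line;
    while ( std::getline( in, line ) ) {
        // Only lines that start with "flags" are of interest.
        if ( line.compare( 0, 5, "flags" ) != 0 )
            continue;
        const std::string::size_type colon = line.find( ':' );
        if ( colon == std::string::npos )
            continue;
        // Split the part after the colon on whitespace and look for "ht".
        std::istringstream flags( line.substr( colon + 1 ) );
        std::string flag;
        while ( flags >> flag ) {
            if ( flag == "ht" ) {
                ++count;
                break;
            }
        }
    }
    return count;
}

// Same rule of thumb as the commit: more than one entry with "ht".
bool looksLikeHyperThreading( std::istream &in )
{
    return countHtCpus( in ) > 1;
}
```

To run it against the real file, one would open `/proc/cpuinfo` with a `std::ifstream` and pass that stream to `looksLikeHyperThreading()`.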
Comment 35 Mark Kretschmann 2005-09-26 17:43:19 UTC
*** Bug 109028 has been marked as a duplicate of this bug. ***
Comment 36 Mark Kretschmann 2005-09-26 19:05:39 UTC
*** Bug 102157 has been marked as a duplicate of this bug. ***
Comment 37 Ian Monroe 2005-10-12 07:48:43 UTC
Just asked kernel hacker Con Kolivas what his opinion was; he suggested that we ask what -march options people are using (and other CFLAGS for that matter).

So folks having this problem... what are your CFLAGS? You might have to do a little research if you're using prepackaged binaries.
Comment 38 Matthieu Bedouet 2005-10-12 08:44:59 UTC
Hello,

I compile with unsermake, without modifying CFLAGS and march.
gcc is 4.0.2 from debian sid.
from Makefile:

CFLAGS = -std=iso9899:1990 -W -Wall -Wchar-subscripts -Wshadow -Wpointer-arith -Wmissing-prototypes -Wwrite-strings -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -g3 -fno-inline   -Wformat-security -Wmissing-format-attribute

CPPFLAGS =  -DQT_THREAD_SUPPORT  -D_REENTRANT

CXXFLAGS = -Wno-long-long -Wundef -ansi -D_XOPEN_SOURCE=500 -D_BSD_SOURCE -Wcast-align -Wconversion -Wchar-subscripts -Wall -W -Wpointer-arith -g3 -fno-inline -Wformat-security -Wmissing-format-attribute -Wno-non-virtual-dtor -fno-exceptions -fno-check-new -fno-common -DQT_CLEAN_NAMESPACE -DQT_NO_ASCII_CAST -DQT_NO_STL -DQT_NO_COMPAT -DQT_NO_TRANSLATION

build = i686-pc-linux-gnu
build_alias =
build_cpu = i686
build_os = linux-gnu
build_vendor = pc
Comment 39 Bastian Venthur 2005-10-14 09:12:35 UTC
Same problem here on a Debian/Sid box and my system is:

Board: ASUS P4P800, CPU: INTEL 2,4GHZ HT, RAM: 512MB

Amarok crashes randomly no matter which Audioengine I use.
Comment 40 Frank Roscher 2005-10-14 10:33:58 UTC
Nothing too exotic here, from /etc/make.conf:

CFLAGS="-O2 -march=pentium4 -pipe"
Comment 41 Ian Monroe 2005-10-14 11:23:53 UTC
-march=pentium4 does have known issues according to Con. But given that other people don't have -march=pentium4 but crash anyway...

And there are some who have HT and it works just fine. Yummy.
Comment 42 Frank Roscher 2005-10-14 12:02:02 UTC
Could it maybe be related to the mobo? When I bought this computer and still used Windows as my primary OS, it used to go to its knees (can't remember exactly, I think it froze just as amaroK does) from time to time due to a problem with the mobo chipset. After I installed a current VIA driver the problems disappeared.
Comment 43 Elad Lahav 2005-10-17 16:26:01 UTC
I'm having the same problems with Amarok on Archlinux (built with -march=i686). Same mobo as the other people on this list (ASUS P4P800).
Comment 44 Mark Kretschmann 2005-10-18 09:32:03 UTC
Here is a workaround to get it running with HT:

http://amarok.kde.org/component/option,com_simpleboard/Itemid,57/func,view/id,8716/catid,9/
Comment 45 Mark Kretschmann 2005-10-19 12:28:52 UTC
SVN commit 471930 by markey:

    // Workaround for instability issues with HyperThreading CPU's, see BUG 99199.
    // First we detect the presence of HyperThreading. If active, we bind amarokapp
    // to the first CPU only (hard affinity).

PLEASE TEST!

BUG: 99199



 M  +1 -0      ChangeLog  
 M  +60 -39    src/app.cpp  
 M  +3 -3      src/app.h  


--- trunk/extragear/multimedia/amarok/ChangeLog #471929:471930
@@ -31,6 +31,7 @@
       icon, respectively.
 
   BUGFIXES:
+    * Workaround for stability issues with HyperThreading on Linux. (BR 99199)
     * xine-engine: Equalizer became inactive on trackchange when crossfading
       was enabled. (BR 114492)
     * Pausing a track would abort lyrics and wiki fetch jobs. (BR 114576)
--- trunk/extragear/multimedia/amarok/src/app.cpp #471929:471930
@@ -143,36 +143,14 @@
         EngineController::instance()->restoreSession();
     }
 
+    fixHyperThreading();
+
     // Refetch covers every 80 days or delete every 90 days to comply with Amazon license
     #ifdef AMAZON_SUPPORT
     new RefreshImages();
     pruneCoverImages();
     #endif
 
-    // BEGIN Check for HyperThreading, see BUG 99199
-    QString line;
-    uint cpuCount = 0;
-    QFile cpuinfo( "/proc/cpuinfo" );
-    if ( cpuinfo.open( IO_ReadOnly ) ) {
-        while ( cpuinfo.readLine( line, 20000 ) != -1 ) {
-            if ( line.startsWith( "vendor_id" ) && line.contains( "AuthenticAMD" ) ) {
-                // Special case for AMD CPU's like the Athlon 64 X2, which reports a bogus
-                // HT flag. @see BUG 114190
-                cpuCount = 1;
-                break;
-            }
-            if ( line.startsWith( "flags" ) ) {
-                const QString flagsLine = line.section( ":", 1 );
-                const QStringList flags = QStringList::split( " ", flagsLine );
-                if ( flags.contains( "ht" ) ) ++cpuCount;
-            }
-        }
-    }
-    // If multiple CPUs are listed with the HT flag, we got HyperThreading enabled
-    if ( cpuCount > 1 )
-        QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
-    // END
-
     // Trigger collection scan if folder setup was changed by wizard
     if ( oldCollectionFolders != AmarokConfig::collectionFolders() )
         CollectionDB::instance()->startScan();
@@ -382,6 +360,64 @@
     }
 }
 
+
+void App::fixHyperThreading()
+{
+    // Workaround for instability issues with HyperThreading CPU's, see BUG 99199.
+    // First we detect the presence of HyperThreading. If active, we bind amarokapp
+    // to the first CPU only (hard affinity).
+
+    DEBUG_BLOCK
+
+    #ifdef __linux__
+    QString line;
+    uint cpuCount = 0;
+    QFile cpuinfo( "/proc/cpuinfo" );
+    if ( cpuinfo.open( IO_ReadOnly ) ) {
+        while ( cpuinfo.readLine( line, 20000 ) != -1 ) {
+            if ( line.startsWith( "vendor_id" ) && !line.contains( "GenuineIntel" ) ) {
+                // Ignore non-Intel CPU's, because some AMD CPU's (like Athlon 64 X2) report
+                // a bogus HT flag. @see BUG 114190
+                cpuCount = 1;
+                break;
+            }
+            if ( line.startsWith( "flags" ) ) {
+                const QString flagsLine = line.section( ":", 1 );
+                const QStringList flags = QStringList::split( " ", flagsLine );
+                if ( flags.contains( "ht" ) ) ++cpuCount;
+            }
+        }
+    }
+    // If multiple CPUs are listed with the HT flag, we got HyperThreading enabled
+    if ( cpuCount > 1 ) {
+        debug() << "CPU with active HyperThreading detected. Enabling WORKAROUND.\n";
+
+        #include <linux/version.h>
+        #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+        #include <sched.h>
+        cpu_set_t mask;
+        CPU_ZERO( &mask ); // Initializes all the bits in the mask to zero
+        CPU_SET( 0, &mask ); // Sets only the bit corresponding to cpu
+        if ( sched_setaffinity( 0, sizeof(mask), &mask ) == -1 )
+        #endif // LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+        {
+            warning() << "sched_setaffinity() call failed or unavailable." << endl;
+            const QString text =
+                i18n( "<p>You are using a processor with the <i>HyperThreading</i> "
+                      "feature enabled. Please note that amaroK may be unstable with this "
+                      "configuration.</p>"
+                      "<p>If you are experiencing problems, use the Linux kernel option 'NOHT', "
+                      "or disable <i>HyperThreading</i> in your BIOS setup.</p>"
+                      "<p>More information can be found in the README file. For further assistance "
+                      "join us at #amarok on irc.freenode.net.</p>" );
+
+            KMessageBox::information( 0, text, i18n( "Warning" ), "showHyperThreadingWarning" );
+        }
+    }
+    #endif //__linux__
+}
+
+
 /////////////////////////////////////////////////////////////////////////////////////
 // METHODS
 /////////////////////////////////////////////////////////////////////////////////////
@@ -915,21 +951,6 @@
 }
 
 
-void App::showHyperThreadingWarning() const //SLOT
-{
-    const QString text =
-        i18n( "<p>You are using a processor with the <i>HyperThreading</i> "
-              "feature enabled. Please note that amaroK may be unstable with this "
-              "configuration.</p>"
-              "<p>If you are experiencing problems, use the Linux kernel option 'NOHT', "
-              "or disable <i>HyperThreading</i> in your BIOS setup.</p>"
-              "<p>More information can be found in the README file. For further assistance "
-              "join us at #amarok on irc.freenode.net.</p>" );
-
-    KMessageBox::information( 0, text, i18n( "Warning" ), "showHyperThreadingWarning" );
-}
-
-
 void App::pruneCoverImages()
 {
 #ifdef AMAZON_SUPPORT
--- trunk/extragear/multimedia/amarok/src/app.h #471929:471930
@@ -65,10 +65,10 @@
         void slotConfigEqualizer();
         void firstRunWizard();
 
-    private slots:
-        void showHyperThreadingWarning() const;
+    private:
+        /** Workaround for HyperThreading CPU's, @see BUG 99199 */
+        void fixHyperThreading();
 
-    private:
         void initGlobalShortcuts();
         void applyColorScheme();
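
The hard-affinity idea from this commit can be tested in isolation. Below is a small sketch, assuming Linux with a glibc recent enough to provide the three-argument `sched_setaffinity()` and `CPU_COUNT`. It differs from the commit in one respect that is my own assumption: instead of always choosing CPU 0, it picks the lowest-numbered CPU the thread is currently allowed to use, so it also works when CPU 0 is not in the allowed set. The helper names are hypothetical. Note that with a pid of 0 the call applies to the calling thread; threads created afterwards inherit the mask, but threads that already exist keep their own.

```cpp
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   // exposes CPU_* macros and sched_setaffinity() in glibc
#endif
#include <sched.h>

// Restrict the calling thread to one CPU: the lowest-numbered CPU it may
// currently run on. Returns that CPU index, or -1 if either call fails.
int pinToFirstAllowedCpu()
{
    cpu_set_t allowed;
    CPU_ZERO( &allowed );
    if ( sched_getaffinity( 0, sizeof( allowed ), &allowed ) == -1 )
        return -1;

    for ( int cpu = 0; cpu < CPU_SETSIZE; ++cpu ) {
        if ( !CPU_ISSET( cpu, &allowed ) )
            continue;
        cpu_set_t mask;
        CPU_ZERO( &mask );       // start with an empty mask
        CPU_SET( cpu, &mask );   // allow exactly one CPU
        if ( sched_setaffinity( 0, sizeof( mask ), &mask ) == -1 )
            return -1;
        return cpu;
    }
    return -1;
}

// Number of CPUs in the calling thread's affinity mask, or -1 on failure.
int allowedCpuCount()
{
    cpu_set_t mask;
    CPU_ZERO( &mask );
    if ( sched_getaffinity( 0, sizeof( mask ), &mask ) == -1 )
        return -1;
    return CPU_COUNT( &mask );
}
```

Calling `pinToFirstAllowedCpu()` early in `main()`, before any worker threads are started, is the closest equivalent to what the commit does at application startup.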
 
Comment 46 Matthieu Bedouet 2005-10-19 12:47:41 UTC
I can confirm that works very well.

thank you very much!
Comment 47 Jaison Lee 2005-10-20 20:23:40 UTC
I'm sorry I'm late to the party, but I think it might be better to check /proc/version instead of hardcoding the Linux version number. It's very common for the headers on people's systems not to match the kernel they are running, and relying on the headers can produce weird library bugs.
If no one has objections I can submit a patch myself, but I won't be able to get to it for at least a few days... A more active developer can snipe me if he/she wishes. :)
Comment 48 Ian Monroe 2005-10-20 21:47:14 UTC
Jaison, it's a compile-time option. We can't check things like /proc/version in a macro.
Comment 49 Mark Kretschmann 2005-10-21 01:12:09 UTC
SVN commit 472502 by markey:

Removed the check for GenuineIntel again. We got reports that the Athlon 64 X2 is also affected from the stability problems. Since this CPU does report the HT flag (without actually having HT), this is further proof for my theory that the HT scheduler is to blame.

CCBUG: 99199



 M  +1 -2      ChangeLog  
 M  +0 -6      src/app.cpp  


--- trunk/extragear/multimedia/amarok/ChangeLog #472501:472502
@@ -35,7 +35,7 @@
       icon, respectively.
 
   BUGFIXES:
-    * File browser now correctly checks for the availibity of remote 
+    * File browser now correctly checks for the availibity of remote
       directories. (BR 114498)
     * Podcast settings would not add a trailing slash to podcast save
       locations. (BR 114712)
@@ -62,7 +62,6 @@
     * Changing the podcast purge count could sometimes cause amaroK to hang.
     * NMM-engine: Fixed crash after playing a song to the end, the trackEnd
       signal was not emitted from the GUI thread.
-    * Don't show a HyperThreading warning for Athlon 64 X2 CPU. (BR 114190)
     * With Random Mode enabled and Repeat Playlist disabled, when it got to
       the last track, it would play it a second time and then keep on playing
       other tracks, instead of just stopping.
--- trunk/extragear/multimedia/amarok/src/app.cpp #472501:472502
@@ -378,12 +378,6 @@
     QFile cpuinfo( "/proc/cpuinfo" );
     if ( cpuinfo.open( IO_ReadOnly ) ) {
         while ( cpuinfo.readLine( line, 20000 ) != -1 ) {
-            if ( line.startsWith( "vendor_id" ) && !line.contains( "GenuineIntel" ) ) {
-                // Ignore non-Intel CPU's, because some AMD CPU's (like Athlon 64 X2) report
-                // a bogus HT flag. @see BUG 114190
-                cpuCount = 1;
-                break;
-            }
             if ( line.startsWith( "flags" ) ) {
                 const QString flagsLine = line.section( ":", 1 );
                 const QStringList flags = QStringList::split( " ", flagsLine );
Comment 50 Mark Kretschmann 2005-10-21 10:00:51 UTC
Well guys, I was hoping to get some feedback from you on my patch, before 1.3.4 is released. Does it actually work?
Comment 51 Thanos Kyritsis 2005-10-21 15:16:02 UTC
I need some help. I'm trying to test the patch, but I get:
amarok:     CPU with active HyperThreading detected. Enabling WORKAROUND.
amarok:     [WARNING!] sched_setaffinity() call failed or unavailable.

Do I need some specific options in the kernel for sched_setaffinity() to work ?!
Comment 52 Jaison Lee 2005-10-21 15:50:10 UTC
>Jaison, its a compile time option. We can't check on stuff like /proc/version in macro.

Yes, I know that. :) What I wasn't aware of was that the parameters to sched_setaffinity changed between 2.4 and 2.6 so we can't link to both (at least not that I know of). (You do understand why checking LINUX_VERSION_CODE at compile time is not a good test of whether or not the user is running that kernel version, right?)

In its current form, the patch only does what it was meant to on systems where the system libraries were compiled against kernel headers of the same version as the running kernel, which in my experience are few and far between. Namely, every single Slackware user is compiling against 2.4 headers even if they are running the 2.6 kernel. This is probably the problem that Thanos Kyritsis is having.

I'm still planning to look at this over the weekend, but I'm open to suggestions on how to handle it. Perhaps check for the existence of the shell program "taskset" and use that?
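
To illustrate the compile-time versus run-time distinction made above: LINUX_VERSION_CODE only describes the headers the program was built against, while the running kernel can be queried with uname(2). The following is a minimal sketch, not part of any patch in this report; the helper names parseKernelRelease and runningKernelIsAtLeast26 are made up for illustration.

```cpp
#include <cstdio>
#include <string>
#include <sys/utsname.h>

// Hypothetical helper: extracts "major.minor" from a kernel release string
// such as "2.6.11.12". Returns false if the string does not start with two
// dot-separated integers.
bool parseKernelRelease(const std::string& release, int& major, int& minor)
{
    return std::sscanf(release.c_str(), "%d.%d", &major, &minor) == 2;
}

// Hypothetical helper: asks the *running* kernel for its release via uname(2)
// and reports whether it is 2.6 or newer. This is independent of the kernel
// headers that were installed at build time.
bool runningKernelIsAtLeast26()
{
    struct utsname info;
    if (uname(&info) != 0)
        return false;

    int major = 0;
    int minor = 0;
    if (!parseKernelRelease(info.release, major, minor))
        return false;

    return major > 2 || (major == 2 && minor >= 6);
}
```

A run-time check like this tells you what kernel is booted, but it does not solve the linking problem described above: the call still has to be available in the C library the binary was built against.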
Comment 53 Mark Kretschmann 2005-10-21 16:05:06 UTC
On Friday 21 October 2005 15:50, Jaison Lee wrote:
> I'm still planning to look at this over the weekend, but I'm open for
> suggestions on how to handle it. Perhaps check for the existence of the
> shell program "taskset" and use that?


Unfortunately, taskset is from the package schedutils. Few people have this 
installed.
Comment 54 Thanos Kyritsis 2005-10-21 16:10:04 UTC
Yes, you are right, I am using Slackware (compiled with 2.4.x kernel headers) running on 2.6.11.12.

I've tried taskset, but I don't think it worked either. I am monitoring CPU usage with gkrellm. Whatever I've done (noht, taskset, new patch), I never saw amarok's CPU usage confined to one CPU; it always shows usage on both CPUs.

And in the meantime, 1.3.3 is much more stable than 1.3.2 and 1.3.1. Even with HT on, amarok 1.3.3 can sometimes run for hours without crashing, and sometimes it crashes at the start while loading my huge playlist. Even with taskset, sometimes it crashes, sometimes not, so I believe taskset doesn't work either.

But apart from all this, this patch should only be temporary imho. We (or, better, the kernel devs) should find what's going on with HT in the kernel :(

I'm running the kernel with CONFIG_SCHED_SMT on, do you think that this might be "helping" the crashes? (mark mentioned HT scheduler is to blame).

CONFIG_SCHED_SMT:

SMT scheduler support improves the CPU scheduler's decision making
when dealing with Intel Pentium 4 chips with HyperThreading at a
cost of slightly increased overhead in some places. If unsure say
N here.
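
Rather than judging by a CPU usage monitor, the affinity mask can be read back directly. The sketch below assumes glibc's sched_getaffinity() wrapper and the CPU_* macros from <sched.h>; the function name allowedCpus is made up for illustration and is not part of amaroK.

```cpp
#include <sched.h>
#include <vector>

// Hypothetical helper: returns the indices of the CPUs the calling thread is
// allowed to run on, or an empty vector if sched_getaffinity() fails.
std::vector<int> allowedCpus()
{
    std::vector<int> cpus;

    cpu_set_t mask;
    CPU_ZERO(&mask);
    // A pid of 0 refers to the calling thread.
    if (sched_getaffinity(0, sizeof(mask), &mask) == -1)
        return cpus;

    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &mask))
            cpus.push_back(cpu);
    }
    return cpus;
}
```

If the workaround took effect, a call made from the pinned thread should return a vector containing only 0. Note that the mask is per thread, so threads that already existed before the affinity was changed keep their own mask.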
Comment 55 Mark Kretschmann 2005-10-21 16:11:58 UTC
On Friday 21 October 2005 15:50, Jaison Lee wrote:
> Yes, I know that. :) What I wasn't aware of was that the parameters to
> sched_setaffinity changed between 2.4 and 2.6 so we can't link to both (at
> least not that I know of).


sched_setaffinity() doesn't even exist on 2.4.
Comment 56 Ian Monroe 2005-10-21 16:39:09 UTC
People with 2.4 headers won't use this fix. Simple. There's no way around it.
Comment 57 Mark Kretschmann 2005-10-21 17:05:15 UTC
SVN commit 472669 by markey:

Improved debug output of HT fix routine. Now you get detailed information about the kind of failure.

CCBUG: 99199



 M  +26 -15    app.cpp  
 M  +3 -0      app.h  


--- trunk/extragear/multimedia/amarok/src/app.cpp #472668:472669
@@ -368,7 +368,8 @@
     // to the first CPU only (hard affinity).
     //
     // @see http://www-128.ibm.com/developerworks/linux/library/l-affinity.html
-    // (article on processor affinity with the linux kernel)
+    // @see http://www.linuxjournal.com/article/6799
+    // (articles on processor affinity with the linux kernel)
 
     DEBUG_BLOCK
 
@@ -391,30 +392,40 @@
 
         #include <linux/version.h>
         #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+        #include <errno.h>
         #include <sched.h>
         cpu_set_t mask;
         CPU_ZERO( &mask ); // Initializes all the bits in the mask to zero
         CPU_SET( 0, &mask ); // Sets only the bit corresponding to cpu
-        if ( sched_setaffinity( 0, sizeof(mask), &mask ) == -1 )
+        if ( sched_setaffinity( 0, sizeof(mask), &mask ) == -1 ) {
+            warning() << "sched_setaffinity() call failed with error code: " << errno << endl;
+            QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
+            return;
+        }
+        #else
+        warning() << "Linux 2.6 kernel headers not found. sched_setaffinity() is unvailable." << endl;
+        QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
         #endif // LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
-        {
-            warning() << "sched_setaffinity() call failed or unavailable." << endl;
-            const QString text =
-                i18n( "<p>You are using a processor with the <i>HyperThreading</i> "
-                      "feature enabled. Please note that amaroK may be unstable with this "
-                      "configuration.</p>"
-                      "<p>If you are experiencing problems, use the Linux kernel option 'NOHT', "
-                      "or disable <i>HyperThreading</i> in your BIOS setup.</p>"
-                      "<p>More information can be found in the README file. For further assistance "
-                      "join us at #amarok on irc.freenode.net.</p>" );
-
-            KMessageBox::information( 0, text, i18n( "Warning" ), "showHyperThreadingWarning" );
-        }
     }
     #endif //__linux__
 }
 
 
+void App::showHyperThreadingWarning() // SLOT
+{
+    const QString text =
+        i18n( "<p>You are using a processor with the <i>HyperThreading</i> "
+              "feature enabled. Please note that amaroK may be unstable with this "
+              "configuration.</p>"
+              "<p>If you are experiencing problems, use the Linux kernel option 'NOHT', "
+              "or disable <i>HyperThreading</i> in your BIOS setup.</p>"
+              "<p>More information can be found in the README file. For further assistance "
+              "join us at #amarok on irc.freenode.net.</p>" );
+
+    KMessageBox::information( 0, text, i18n( "Warning" ), "showHyperThreadingWarning" );
+}
+
+
 /////////////////////////////////////////////////////////////////////////////////////
 // METHODS
 /////////////////////////////////////////////////////////////////////////////////////
--- trunk/extragear/multimedia/amarok/src/app.h #472668:472669
@@ -56,6 +56,9 @@
         void engineTrackPositionChanged( long position, bool /*userSeek*/ );
         void engineVolumeChanged( int );
 
+    private slots:
+        void showHyperThreadingWarning();
+
     public slots:
         void applySettings( bool firstTime = false );
         void slotConfigAmarok( const QCString& page = QCString() );
Comment 58 Jaison Lee 2005-10-21 18:45:06 UTC
> sched_setaffinity() doesn't even exist on 2.4

No, but the library call has been around for a while, and the sched* calls are apparently a popular backport for the kernel. Would it be better perhaps to just check the library version and call sched_setaffinity() if it is there and let the library decide what to do from there? Worst case scenario it just returns -1 and then we display the Message of Doom. :)

By compiling a program of my own it seems that sched_setaffinity() works as advertised on a Slackware 10.2 system running a 2.6 kernel. (Amarok has been running for an hour so far...we'll see how that goes...) I think if we ignore the supposed linux version and let the library sort things out things might be better.

Thoughts?
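
A standalone test of the kind described above might look like the following. This is a guess at the shape of such a program, not the one actually used; the function name pinToFirstCpu is made up for illustration. It makes the same sched_setaffinity() call as the patch and hands back errno on failure.

```cpp
#include <cerrno>
#include <sched.h>

// Hypothetical helper: restricts the calling thread to CPU 0.
// Returns 0 on success, otherwise the errno value set by sched_setaffinity().
int pinToFirstCpu()
{
    cpu_set_t mask;
    CPU_ZERO(&mask);   // clear every CPU bit
    CPU_SET(0, &mask); // allow CPU 0 only

    // A pid of 0 refers to the calling thread.
    if (sched_setaffinity(0, sizeof(mask), &mask) == -1)
        return errno;
    return 0;
}
```

Threads created after a successful call inherit the restricted mask, which is why the call needs to happen early during startup to cover the whole application.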
Comment 59 Jaison Lee 2005-10-22 01:48:44 UTC
Created attachment 13108 [details]
Check lib version not kernel version

I'm putting the following patch up for review; it's also available at:
http://www.jaisonlee.net/amarok.patch

Instead of checking the linux version in the headers it simply checks the
library version, and will make the sched calls if the library supports them.
I've verified that this will compile and run on a Slackware 10.2 system with
stock headers and a 2.6 kernel, but that's all it's been tested on as yet. If
no one has any complaints over the weekend I can check it in myself next week,
or if a more experienced Amarok dev wants it in sooner you can of course check
it in whenever you like.
Comment 60 Mark Kretschmann 2005-10-22 08:41:57 UTC
SVN commit 472932 by markey:

Detect GLIBC version, instead of Linux kernel version. Patch by Jaison Lee <lee.jaison@gmail.com>.

CCBUG: 99199



 M  +9 -8      app.cpp  


--- trunk/extragear/multimedia/amarok/src/app.cpp #472931:472932
@@ -66,13 +66,13 @@
 #include <qtimer.h>              //showHyperThreadingWarning()
 #include <qtooltip.h>            //default tooltip for trayicon
 
-//for the HT fix
+// For the HyperThreading fix
 #ifdef __linux__
-    #include <linux/version.h>
-    #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+    #include <features.h>
+    #if defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2,3)
         #include <errno.h>
         #include <sched.h>
-    #endif //LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+    #endif //__GLIBC_PREREQ && __GLIBC_PREREQ(2,3)
 #endif //__linux__
 
 App::App()
@@ -398,7 +398,8 @@
     if ( cpuCount > 1 ) {
         debug() << "CPU with active HyperThreading detected. Enabling WORKAROUND.\n";
 
-        #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+        // If the library is new enough try and call sched_setaffinity.
+        #if defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2,3)
         cpu_set_t mask;
         CPU_ZERO( &mask ); // Initializes all the bits in the mask to zero
         CPU_SET( 0, &mask ); // Sets only the bit corresponding to cpu
@@ -408,9 +409,9 @@
             return;
         }
         #else
-        warning() << "Linux 2.6 kernel headers not found. sched_setaffinity() is unvailable." << endl;
+        warning() << "GLIBC version < 2.3: sched_setaffinity() is unvailable." << endl;
         QTimer::singleShot( 0, this, SLOT( showHyperThreadingWarning() ) );
-        #endif // LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
+        #endif // __GLIBC_PREREQ && __GLIBC_PREREQ(2,3)
     }
     #endif //__linux__
 }
@@ -529,8 +530,8 @@
             m_pPlaylistWindow->setCaption( "amaroK - " + EngineController::instance()->bundle().veryNiceTitle() );
         else
             m_pPlaylistWindow->setCaption( "amaroK" );
-    
 
+
         //m_pPlaylistWindow->show(); //must be shown //we do below now
 
         //ensure that at least one Menu is plugged into an accessible UI element
Comment 61 Ilya Konstantinov 2005-10-25 11:43:47 UTC
Can we really call setting CPU affinity a fix and not just a workaround? It's clear that synchronization issues are lurking in Amarok.
Comment 62 caulier.gilles 2006-08-31 20:13:20 UTC
Hi Amarok team,

I am a developer from the digiKam project. We use multithreading in digiKam wherever possible (image correction in the editor, file loading, batch processing). We use QThread.

We have exactly the same problem with HyperThreading. See this B.K.O entry:

http://bugs.kde.org/show_bug.cgi?id=133026

Any feedback from you would be much appreciated.

Thanks in advance

Gilles Caulier
Comment 63 Martin Aumueller 2006-08-31 21:40:27 UTC
I'm pretty sure that such issues (in this bug report just as in Digikam's case - http://bugs.kde.org/show_bug.cgi?id=133026) are not caused by hyperthreading. Instead, hyperthreading exposes other issues in the code (unsafe use of QStrings - the reference counter for its implicit sharing might be accessed simultaneously from multiple threads, ...).

One possible explanation why mere SMP systems (and hyperthreading should appear to the application just as such a system) don't trigger these bugs so often could be the following:
- in an SMP system, when the same memory address is accessed (for writing) from multiple CPUs, the corresponding cache line has to be committed from the CPU's cache back to main memory and ownership has to be transferred to the other CPU wanting to write - a relatively slow process
- in a hyperthreading system, this does not have to happen, as there is only one CPU with only one cache; it's just that this CPU is used by two threads almost simultaneously, as control shifts from one virtual CPU to the other very often and without OS intervention
I think this can increase the probability of two threads accessing the same memory address at the same time quite a lot.

I hope that Qt4 will make these kinds of problems mostly go away, as the implicit sharing will be really invisible to the user. In Qt3 you have to use QDeepCopy when passing QStrings between threads - which is really painful to get right in all cases.
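
The reference-counter hazard described above can be sketched in plain C++11, without Qt. SharedData and hammerRefCount are made-up names standing in for the shared block behind an implicitly shared string; this is not Qt's actual implementation. With a plain int counter the concurrent increments and decrements below would be a data race, and updates could be lost. Making the counter atomic removes that particular race.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for the data block shared by copies of an
// implicitly shared string. The owner holds one reference initially.
struct SharedData {
    std::atomic<int> refCount{1};
};

// Starts `threads` threads; each one takes and releases a reference
// `iterations` times, like copying and destroying a shared string.
// Returns the reference count once all threads have finished.
int hammerRefCount(SharedData& data, int threads, int iterations)
{
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&data, iterations] {
            for (int i = 0; i < iterations; ++i) {
                data.refCount.fetch_add(1); // a copy is made
                data.refCount.fetch_sub(1); // the copy is destroyed
            }
        });
    }
    for (std::thread& th : pool)
        th.join();
    return data.refCount.load();
}
```

Because every update is atomic, the count is back at 1 after all threads finish, regardless of how the two logical CPUs interleave. An atomic counter only protects the count itself; the Qt3 advice above about making deep copies before handing strings to another thread still applies to that code base.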