Bug 95494 - artsd hangs "busy" during playback
Summary: artsd hangs "busy" during playback
Status: RESOLVED WORKSFORME
Alias: None
Product: arts
Classification: Miscellaneous
Component: artsd
Version: 1.3.1
Platform: Debian testing Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Stefan Westerfeld
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2004-12-20 01:21 UTC by Andreas Feldner
Modified: 2008-01-03 22:18 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
Patch to cure this bug (2.90 KB, patch)
2004-12-20 01:39 UTC, Andreas Feldner
Restart pcm channels with PIPE error condition also from getParam method (4.96 KB, patch)
2004-12-20 10:17 UTC, Andreas Feldner

Description Andreas Feldner 2004-12-20 01:21:47 UTC
Version:           1.3.1 (using KDE 3.2.3)
Installed from:    Debian testing/unstable Packages
Compiler:          gcc 3.3.4 
OS:                Linux

On apparently random occasions, artsd hangs during or immediately after
playing a sound. After a minute or so, the hanging process is terminated
and a warning appears in a message box ("cpu overload"). If artsd is
configured to run with the highest priority, the whole system hangs for
that minute.

Below is as much information as I could think of that might be useful,
in no particular order ;-)

All the files I could reproduce the problem with were desktop sounds, i.e.
they should be wav files, I assume.

The machine is an Alpha 21164A.

The sound system is ALSA.

artsd is running with the following command line (according to ps):
/usr/bin/artsd -F 18 -S 4096 -s 2 -m artsmessage -c drkonqi -l 3 -f

A snippet from the strace output of a hanging artsd is appended at the end.

Even when artsd "sleeps" and frees the sound device, I have two artsd
processes running. Only one of them hangs; perhaps only one process is
left by then, but I'm not sure about that.

No relevant output in .xsession-errors. No relevant output in
/var/log/messages.

The previous release of artsd in Debian/testing didn't show these
problems.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (990, 'testing')
Architecture: alpha
Kernel: Linux 2.4.26
Locale: LANG=de_DE@euro, LC_CTYPE=de_DE@euro

Versions of packages libarts1 depends on:
ii  libartsc0                1.3.0-1         aRts Sound system C support librar
ii  libasound2               1.0.6-2         Advanced Linux Sound Architecture 
ii  libaudio2                1.6d-2          The Network Audio System (NAS). (s
ii  libaudiofile0            0.2.6-4         Open-source version of SGI's audio
ii  libc6.1                  2.3.2.ds1-16    GNU C Library: Shared libraries an
ii  libesd0                  0.2.29-1        Enlightened Sound Daemon - Shared 
ii  libgcc1                  1:3.4.1-4sarge1 GCC support library
ii  libglib2.0-0             2.4.6-3         The GLib library of C routines
ii  libice6                  4.3.0.dfsg.1-8  Inter-Client Exchange library
ii  libjack0.80.0-0          0.98.1-5        JACK Audio Connection Kit (librari
ii  libmad0                  0.15.1b-1       MPEG audio decoder library
ii  libogg0                  1.1.0-1         Ogg Bitstream Library
ii  libpng12-0               1.2.5.0-7       PNG library - runtime
ii  libqt3c102-mt            3:3.3.3-4.1     Qt GUI Library (Threaded runtime v
ii  libsm6                   4.3.0.dfsg.1-8  X Window System Session Management
ii  libstdc++5               1:3.3.4-13      The GNU Standard C++ Library v3
ii  libvorbis0a              1.0.1-1         The Vorbis General Audio Compressi
ii  libvorbisenc2            1.0.1-1         The Vorbis General Audio Compressi
ii  libvorbisfile3           1.0.1-1         The Vorbis General Audio Compressi
ii  libx11-6                 4.3.0.dfsg.1-8  X Window System protocol client li
ii  libxext6                 4.3.0.dfsg.1-8  X Window System miscellaneous exte
ii  libxt6                   4.3.0.dfsg.1-8  X Toolkit Intrinsics
ii  xlibs                    4.3.0.dfsg.1-8  X Window System client libraries m
ii  zlib1g                   1:1.2.1.1-7     compression library - runtime
Comment 1 Andreas Feldner 2004-12-20 01:39:00 UTC
Created attachment 8728
Patch to cure this bug

I found out that the bug is caused by a broken pipe error on the ALSA output
channel. This error condition is tested for in several methods of the class
AudioIOALSA, and there are also methods available to recover the channel from
this condition. I therefore took the broken state of the output channel as a
"given" and didn't try to find out why it occurs.

Such a test is missing in at least one method, and, unfortunately, that is
exactly the place where a broken pipe would be detected after the change to
event dispatching between versions 1.2 and 1.3:
After artsd wakes up from a select on the active file descriptors, the first
alsa snd_* function called is snd_pcm_avail_update, via AudioIOALSA::getParam.
Its return value was not checked at all but passed directly to
snd_pcm_frames_to_bytes. In the error case, this returns a negative number (a
corresponding multiple of the true error code). The event processor only
checks whether the returned value is > 0 and otherwise returns from the event
handling method immediately. So none of the methods in AudioIOALSA that handle
the broken pipe condition ever gets called.
As a result, the select call in the event dispatcher returns immediately
because bytes could be written to the output, the event dispatcher calls the
corresponding event handler, and the handler returns immediately because it
assumes that it can't write any bytes to the output. This results in the
described busy hang of artsd, or even a hang of the whole machine if artsd is
run with "real time priority" settings.
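
The loop described above can be illustrated with a small simulation. This is
only a sketch, not the aRts code: frames_to_bytes, can_write_unchecked and
wasted_wakeups are hypothetical names, no ALSA calls are made, and select is
assumed to report the descriptor as writable on every pass.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for snd_pcm_frames_to_bytes: it only scales by the
// frame size, so a negative error code stays negative.
long frames_to_bytes(long frames, long bytes_per_frame) {
    assert(bytes_per_frame > 0);
    return frames * bytes_per_frame;
}

// Unchecked path as described in this comment: whatever the availability
// query returned (a frame count or a negative error code) is converted
// straight to bytes.
long can_write_unchecked(long avail_result, long bytes_per_frame) {
    return frames_to_bytes(avail_result, bytes_per_frame);
}

// Simulated dispatcher loop. Every pass models one wakeup from select; the
// handler gives up whenever can_write is not > 0. Returns the number of
// wakeups that did no work at all.
int wasted_wakeups(long avail_result, long bytes_per_frame, int max_passes) {
    int wasted = 0;
    for (int pass = 0; pass < max_passes; ++pass) {
        long can_write = can_write_unchecked(avail_result, bytes_per_frame);
        if (can_write > 0) {
            break;  // the handler would write data here and make progress
        }
        ++wasted;   // handler returned at once; nothing cleared the error
    }
    return wasted;
}
```

With a persistent -EPIPE result, every pass is wasted, which corresponds to
the busy hang; with a positive frame count the loop leaves on the first pass.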

The patch I attached copies the usual recovery from the broken pipe
condition, found e.g. in the playback methods of AudioIOALSA, into the
getParam method. In practice, this works well, i.e. the audio channel has
always been usable after returning from the method in all my tests.

Additionally, I put a test into audiosubsys that aborts artsd if the event
handling function for the "ready to write" event finds that no bytes are
free on the output channel. This appears to be an obvious inconsistency in
the sound system, no matter what the underlying implementation is. However, I
did not test this with implementations other than AudioIOALSA.
Comment 2 Allan Sandfeld 2004-12-20 02:22:12 UTC
Nice, but you cannot assume you can write just because ALSA has restarted you. Far too many drivers are buggy and just restart the client every now and then.
Comment 3 Andreas Feldner 2004-12-20 10:17:06 UTC
Created attachment 8734
Restart pcm channels with PIPE error condition also from getParam method

I have now put the error checking/restart/xrun code into a separate method to
be used from all methods where it may be relevant. The idea is that the
handleError method checks a supplied return code for error conditions it can
handle. If it finds one, it will restart or xrun the relevant pcm channel and
return -EAGAIN to the caller. So getParam wraps handleError around the called
function and, if the result is -EAGAIN, calls that function again.
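
The handleError idea might look roughly like the following. This is a sketch
under assumptions, not the attached patch: FakePcm, fake_avail_update,
fake_restart, handle_error and writable_frames are hypothetical names, and the
fake pcm stands in for the real ALSA handle so the example runs without a
sound card.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for an ALSA pcm handle.
struct FakePcm {
    bool broken_pipe;  // true while the channel is in the broken pipe state
    long free_frames;  // frames that could be written once the channel works
    int restarts;      // how often the channel has been restarted
};

// Stand-in for the availability query: fails with -EPIPE while broken.
long fake_avail_update(FakePcm& pcm) {
    assert(pcm.free_frames >= 0);
    return pcm.broken_pipe ? -EPIPE : pcm.free_frames;
}

// Stand-in for the restart step that the real code performs on the channel.
void fake_restart(FakePcm& pcm) {
    pcm.broken_pipe = false;
    ++pcm.restarts;
}

// Checks a return code; if it is an error this helper can handle, restarts
// the channel and tells the caller to try again via -EAGAIN.
long handle_error(FakePcm& pcm, long rc) {
    if (rc == -EPIPE) {
        fake_restart(pcm);
        return -EAGAIN;
    }
    return rc;
}

// Caller side, in the style described for getParam: wrap the call, and
// repeat it once if handle_error asks for that.
long writable_frames(FakePcm& pcm) {
    long rc = handle_error(pcm, fake_avail_update(pcm));
    if (rc == -EAGAIN) {
        // The repeated call may still fail or report 0, so the caller must
        // keep checking the result instead of assuming it can write.
        rc = fake_avail_update(pcm);
    }
    return rc;
}
```

The repeated call is deliberately not assumed to succeed, in line with the
objection in comment 2 that a restart alone does not guarantee that writing
is possible.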

This already saves a lot of code duplication, though the solution is still not
too elegant, as the caller still has to check for a certain error condition and
re-call the failed function. To cure this design flaw, I can only think of a
hierarchy of sound action objects that each describe a specific alsa call and
are supplied to an issueCall method or similar, which handles error checking
and re-calling itself. But this again seems to be stupidly over-designed.
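
For comparison, an issueCall-style helper does not strictly need an object
hierarchy. The sketch below swaps in a function template that accepts any
callable; it is an illustration only, and MockPcm, mock_avail_update,
issue_call and checked_writable_frames are hypothetical names that do not
appear in the patch.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for an ALSA pcm handle.
struct MockPcm {
    bool broken_pipe;  // true while the channel is in the broken pipe state
    long free_frames;  // frames reported once the channel works
    int restarts;      // how often issue_call restarted the channel
};

// Stand-in for one specific sound call: fails with -EPIPE while broken.
long mock_avail_update(MockPcm& pcm) {
    return pcm.broken_pipe ? -EPIPE : pcm.free_frames;
}

// Runs the supplied call; on -EPIPE it restarts the channel and repeats the
// call, at most max_retries times. The caller never sees the retry logic.
template <typename Call>
long issue_call(MockPcm& pcm, Call call, int max_retries = 1) {
    assert(max_retries >= 0);
    long rc = call(pcm);
    for (int attempt = 0; attempt < max_retries && rc == -EPIPE; ++attempt) {
        pcm.broken_pipe = false;  // stand-in for restarting the channel
        ++pcm.restarts;
        rc = call(pcm);
    }
    return rc;  // may still be an error if the retries did not help
}

// A getParam-style caller reduces to a single line.
long checked_writable_frames(MockPcm& pcm) {
    return issue_call(pcm, mock_avail_update);
}
```

Bounding the number of retries keeps a channel that breaks again and again
from turning the helper itself into a busy loop.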
Comment 4 pelzi 2006-01-08 00:03:45 UTC
The problem is still there in version 1.4.3.
Comment 5 Andreas Feldner 2008-01-03 22:18:02 UTC
It's been gone for a while. Probably since version 1.5.