Summary: | Powerdevil crashes a lot in ddc_close_display() after update to Plasma 6.1.0 | ||
---|---|---|---|
Product: | [Plasma] Powerdevil | Reporter: | Pavel Urusov <pavel.urusov> |
Component: | general | Assignee: | Plasma Bugs List <plasma-bugs> |
Status: | RESOLVED FIXED | ||
Severity: | crash | CC: | auxsvr, ben.schroeder, eugene.savitsky, jai.mu, jpetso, kde.h2vne, madLyfe, me, me, natalie_clarius, nate, sam, smowtenshi |
Priority: | HI | Keywords: | regression |
Version: | 6.1.0 | ||
Target Milestone: | --- | ||
Platform: | openSUSE | ||
OS: | Linux | ||
Latest Commit: | https://invent.kde.org/plasma/powerdevil/-/commit/209782d50796d8aa2bfe36753c69ed6c77ec5c99 | Version Fixed In: | 6.1.2 |
Sentry Crash Report: |
Description
Pavel Urusov
2024-06-25 11:04:39 UTC
Let's take it one thing at a time. Powerdevil crashing is clearly not good, and may be the root cause of the other issues. Let's debug that. Can you get a backtrace with debug symbols of the crashing thread? The one you pasted is obfuscated, shows everything, and is missing debug symbols. See https://community.kde.org/Guidelines_and_HOWTOs/Debugging/How_to_create_useful_crash_reports#Retrieving_a_backtrace_using_coredumpctl to learn how to do this. [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib64/libthread_db.so.1". Core was generated by `/usr/libexec/org_kde_powerdevil'. Program terminated with signal SIGABRT, Aborted. #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 44 return INTERNAL_SYSCALL_ERROR_P (ret) ? INTERNAL_SYSCALL_ERRNO (ret) : 0; [Current thread is 1 (Thread 0x7f75d56006c0 (LWP 2544))] Missing separate debuginfos, use: zypper install libdrm_amdgpu1-debuginfo-2.4.121-1.1.x86_64 libpciaccess0-debuginfo-0.18.1-1.2.x86_64 libpolkit-gobject-1-0-debuginfo-123-2.3.x86_64 libsasl2-3-debuginfo-2.1.28-8.1.x86_64 libwayland-server0-debuginfo-1.23.0-1.1.x86_64 libxcb-dri2-0-debuginfo-1.17.0-1.1.x86_64 libxcb-dri3-0-debuginfo-1.17.0-1.1.x86_64 libxshmfence1-debuginfo-1.3.2-1.4.x86_64 (gdb) bt #0 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #1 0x00007f75fd294a63 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78 #2 0x00007f75fd241176 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26 #3 0x00007f75fee54c03 in KCrash::defaultCrashHandler (sig=6) at /usr/src/debug/kcrash-6.3.0/src/kcrash.cpp:597 #4 <signal handler called> #5 __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44 #6 0x00007f75fd294a63 in __pthread_kill_internal (signo=6, threadid=<optimized out>) at pthread_kill.c:78 #7 0x00007f75fd241176 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26 #8 0x00007f75fd228917 in __GI_abort () at abort.c:79 #9 0x00007f75fc0d8025 in g_assertion_message (domain=domain@entry=0x7f75fc17000e "GLib", file=file@entry=0x7f75fc1712f9 "../glib/ghash.c", line=line@entry=493, func=func@entry=0x7f75fc181870 <__func__.25> "g_hash_table_remove_node", message=message@entry=0x7f75a00018b0 "assertion failed: (hash_table->nnodes > 0)") at ../glib/gtestutils.c:3331 #10 0x00007f75fc13f77a in g_assertion_message_expr (domain=domain@entry=0x7f75fc17000e "GLib", file=file@entry=0x7f75fc1712f9 "../glib/ghash.c", line=line@entry=493, func=func@entry=0x7f75fc181870 <__func__.25> "g_hash_table_remove_node", expr=expr@entry=0x7f75fc1712e2 "hash_table->nnodes > 0") at ../glib/gtestutils.c:3357 #11 0x00007f75fc0fecff in g_hash_table_remove_node (hash_table=hash_table@entry=0x56028d8b8360, i=i@entry=0, notify=notify@entry=1) at ../glib/ghash.c:493 #12 0x00007f75fc100a27 in g_hash_table_remove_internal (notify=1, key=0x7f75a0001880, key@entry=0x0, hash_table=0x56028d8b8360) at ../glib/ghash.c:1717 #13 g_hash_table_remove (hash_table=0x56028d8b8360, key=key@entry=0x7f75a0001880) at ../glib/ghash.c:1745 #14 0x00007f75fd151702 in ddc_close_display (dh=0x7f75a0001880) at ddc/ddc_packet_io.c:374 #15 0x00007f75fd192bcc in ddca_close_display (ddca_dh=0x7f75a0001880) at libmain/api_displays.c:587 #16 0x00007f75fee161e6 in BrightnessWorker::ddcSetBrightness (this=0x56028dadb890, value=<optimized out>, display=<optimized out>) at /usr/src/debug/powerdevil-6.1.0/daemon/controllers/ddcutildisplay.cpp:159 #17 0x00007f75fdbd1fbb in QObject::event (this=0x56028dadb890, e=0x56028dbd6970) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/kernel/qobject.cpp:1452 #18 0x00007f75fdb8ee20 in QCoreApplication::notifyInternal2 (receiver=0x56028dadb890, event=0x56028dbd6970) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/kernel/qcoreapplication.cpp:1134 #19 0x00007f75fdb8ee59 in QCoreApplication::sendEvent (receiver=<optimized out>, event=<optimized out>) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/kernel/qcoreapplication.cpp:1575 #20 0x00007f75fdb8f180 in QCoreApplicationPrivate::sendPostedEvents (receiver=0x0, event_type=0, data=0x56028dada510) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/kernel/qcoreapplication.cpp:1932 #21 0x00007f75fddc2973 in postEventSourceDispatch (s=0x7f75a0000f20) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/kernel/qeventdispatcher_glib.cpp:244 #22 0x00007f75fc112740 in g_main_dispatch (context=0x7f75a0000c60) at ../glib/gmain.c:3344 #23 g_main_context_dispatch_unlocked (context=context@entry=0x7f75a0000c60) at ../glib/gmain.c:4152 #24 0x00007f75fc114388 in g_main_context_iterate_unlocked (context=context@entry=0x7f75a0000c60, block=block@entry=1, dispatch=dispatch@entry=1, self=<optimized out>) at ../glib/gmain.c:4217 #25 0x00007f75fc114a3c in g_main_context_iteration (context=0x7f75a0000c60, may_block=1) at ../glib/gmain.c:4282 #26 0x00007f75fddc147c in QEventDispatcherGlib::processEvents (this=0x7f75a0000b70, flags=...) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/kernel/qeventdispatcher_glib.cpp:394 #27 0x00007f75fdb9959b in QEventLoop::exec (this=0x7f75d55ffaa0, flags=...) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/global/qflags.h:34 #28 0x00007f75fdc74ad4 in QThread::exec (this=<optimized out>) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/global/qflags.h:74 #29 0x00007f75fdced139 in operator() (__closure=<optimized out>) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/thread/qthread_unix.cpp:326 #30 (anonymous namespace)::terminate_on_exception<QThreadPrivate::start(void*)::<lambda()> > (t=...) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/thread/qthread_unix.cpp:262 #31 QThreadPrivate::start (arg=0x56028dafe960) at /usr/src/debug/qtbase-everywhere-src-6.7.1/src/corelib/thread/qthread_unix.cpp:285 #32 0x00007f75fd292ba2 in start_thread (arg=<optimized out>) at pthread_create.c:447 #33 0x00007f75fd31400c in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78 Thanks. Looks like it's crashing in DDC code. *** Bug 489163 has been marked as a duplicate of this bug. *** *** Bug 489222 has been marked as a duplicate of this bug. *** I don't see it in this backtrace, but the one from Bug 489163 has two different threads (Thread 1 and Thread 3) both in BrightnessWorker::ddcSetBrightness(). This made me think - the DDCutilDisplay class's m_brightnessWorkerThread is created per instance, which means that for more than one monitor we'd have more than one brightness setter thread. I think this could be the root of this issue, and would explain why I've never seen the crash in my single-DDC-monitor setup despite months of use. I'll try tomorrow to refactor it so that the same thread is used across all different monitors, to prevent simultaneous function execution messing up ddcutil's internal (global) state. A possibly relevant merge request was started @ https://invent.kde.org/plasma/powerdevil/-/merge_requests/390 I can confirm that unplugging the 2nd monitor stops the powerdevil from crashing Git commit 7e1130ffcf7ad667272dabf8345152236a38bf2a by Jakob Petsovits. Committed on 02/07/2024 at 12:03. Pushed by jpetso into branch 'master'. daemon/controllers: Reintroduce a mutex around ddca_open_display2() ddcutil keeps some global state, and in particular we've seen a crash in ddc_close_display() related to its open_displays hashmap. One likely culprit for such a crash is that brightness workers run on separate threads, one per DDCutilDisplay object, so they could get executed simultaneously when more than one external monitor is involved. A less likely, but still possible culprit is when a DDCutilDisplay constructor opens a display handle at the same time that a brightness worker sets the brightness of a different display on a different thread. We could look into combining all brightness setter calls into a single BrightnessWorker that gets shared across DDCutilDisplay objects, but this would introduce extra complexity and still wouldn't address the constructor vs. brightness setter issue. So this commit reintroduces a mutex to isolate all of these usages from each other. We had one in 5.x, which was removed in 6.0 at my (in hindsight incorrect) request: https://invent.kde.org/plasma/powerdevil/-/merge_requests/312#note_865976, https://invent.kde.org/plasma/powerdevil/-/merge_requests/312#note_867942 M +5 -1 daemon/controllers/ddcutildetector.cpp M +41 -30 daemon/controllers/ddcutildisplay.cpp M +3 -1 daemon/controllers/ddcutildisplay.h https://invent.kde.org/plasma/powerdevil/-/commit/7e1130ffcf7ad667272dabf8345152236a38bf2a Git commit 209782d50796d8aa2bfe36753c69ed6c77ec5c99 by Jakob Petsovits. Committed on 02/07/2024 at 12:06. Pushed by jpetso into branch 'Plasma/6.1'. daemon/controllers: Reintroduce a mutex around ddca_open_display2() ddcutil keeps some global state, and in particular we've seen a crash in ddc_close_display() related to its open_displays hashmap. One likely culprit for such a crash is that brightness workers run on separate threads, one per DDCutilDisplay object, so they could get executed simultaneously when more than one external monitor is involved. A less likely, but still possible culprit is when a DDCutilDisplay constructor opens a display handle at the same time that a brightness worker sets the brightness of a different display on a different thread. We could look into combining all brightness setter calls into a single BrightnessWorker that gets shared across DDCutilDisplay objects, but this would introduce extra complexity and still wouldn't address the constructor vs. brightness setter issue. So this commit reintroduces a mutex to isolate all of these usages from each other. We had one in 5.x, which was removed in 6.0 at my (in hindsight incorrect) request: https://invent.kde.org/plasma/powerdevil/-/merge_requests/312#note_865976, https://invent.kde.org/plasma/powerdevil/-/merge_requests/312#note_867942 (cherry picked from commit 7e1130ffcf7ad667272dabf8345152236a38bf2a) Co-authored-by: Jakob Petsovits <jpetso@petsovits.com> M +5 -1 daemon/controllers/ddcutildetector.cpp M +41 -30 daemon/controllers/ddcutildisplay.cpp M +3 -1 daemon/controllers/ddcutildisplay.h https://invent.kde.org/plasma/powerdevil/-/commit/209782d50796d8aa2bfe36753c69ed6c77ec5c99 The merge narrowly missed 6.1.2, so it looks like 6.1.3 in two weeks will be the first release containing this fix. *** Bug 489630 has been marked as a duplicate of this bug. *** So, did it made into 6.1.2? Nate changed the Version Fixed In field to 6.1.2. (In reply to ezh from comment #13) > So, did it made into 6.1.2? Nate changed the Version Fixed In field to 6.1.2. The release ended up recreating the powerdevil 6.1.2 release tarball before it got released to the public. So you'll get it anywhere that Plasma 6.1.2 is available. I can confirm that after upgrade to 6.1.2, powerdevil no longer crashes when I try to adjust the brightness of my monitors. The brightness also changes correctly. Thank you *** Bug 489863 has been marked as a duplicate of this bug. *** *** Bug 489721 has been marked as a duplicate of this bug. *** *** Bug 489715 has been marked as a duplicate of this bug. *** After updating to 6.1.2 I still see many ddcutil crashes when attaching my laptop to an external docking station (WD19TBS). Output of journalctl -p err: Jul 08 09:17:48 bespin systemd-coredump[9774]: [🡕] Process 9422 (ddcutil) of user 0 dumped core. Stack trace of thread 9422: #0 0x000078ee746d41dd n/a (libc.so.6 + 0x16a1dd) #1 0x000057dd78414031 strsplit (ddcutil + 0x8b031) #2 0x000057dd784264ba n/a (ddcutil + 0x9d4ba) #3 0x000057dd78426778 is_possible_monitor_by_hiddev_name (ddcutil + 0x9d778) #4 0x000057dd784268a9 check_usb_monitor (ddcutil + 0x9d8a9) #5 0x000057dd783a4752 main (ddcutil + 0x1b752) #6 0x000078ee7458fc88 n/a (libc.so.6 + 0x25c88) #7 0x000078ee7458fd4c __libc_start_main (libc.so.6 + 0x25d4c) #8 0x000057dd783a4895 _start (ddcutil + 0x1b895) ELF object binary architecture: AMD x86-64 Jul 08 09:17:48 bespin (udev-worker)[9420]: hiddev3: Process '/usr/bin/ddcutil chkusbmon /dev/usb/hiddev3 -v' terminated by signal SEGV. Jul 08 09:17:48 bespin (udev-worker)[9420]: hiddev3: Failed to wait for spawned command '/usr/bin/ddcutil chkusbmon /dev/usb/hiddev3 -v': Input/output error Jul 08 09:17:48 bespin systemd-coredump[9775]: [🡕] Process 9421 (ddcutil) of user 0 dumped core. Stack trace of thread 9421: #0 0x00007c6fd0b941dd n/a (libc.so.6 + 0x16a1dd) #1 0x00005ee53c609031 strsplit (ddcutil + 0x8b031) #2 0x00005ee53c61b4ba n/a (ddcutil + 0x9d4ba) #3 0x00005ee53c61b778 is_possible_monitor_by_hiddev_name (ddcutil + 0x9d778) #4 0x00005ee53c61b8a9 check_usb_monitor (ddcutil + 0x9d8a9) #5 0x00005ee53c599752 main (ddcutil + 0x1b752) #6 0x00007c6fd0a4fc88 n/a (libc.so.6 + 0x25c88) #7 0x00007c6fd0a4fd4c __libc_start_main (libc.so.6 + 0x25d4c) #8 0x00005ee53c599895 _start (ddcutil + 0x1b895) ELF object binary architecture: AMD x86-64 Jul 08 09:17:48 bespin (udev-worker)[9294]: hiddev1: Process '/usr/bin/ddcutil chkusbmon /dev/usb/hiddev1 -v' terminated by signal SEGV. Jul 08 09:17:48 bespin (udev-worker)[9294]: hiddev1: Failed to wait for spawned command '/usr/bin/ddcutil chkusbmon /dev/usb/hiddev1 -v': Input/output error Jul 08 09:17:49 bespin systemd-coredump[9828]: [🡕] Process 9802 (ddcutil) of user 0 dumped core. Stack trace of thread 9802: #0 0x000070bcab6a41dd n/a (libc.so.6 + 0x16a1dd) #1 0x000064e2032bd031 strsplit (ddcutil + 0x8b031) #2 0x000064e2032cf4ba n/a (ddcutil + 0x9d4ba) #3 0x000064e2032cf778 is_possible_monitor_by_hiddev_name (ddcutil + 0x9d778) #4 0x000064e2032cf8a9 check_usb_monitor (ddcutil + 0x9d8a9) #5 0x000064e20324d752 main (ddcutil + 0x1b752) #6 0x000070bcab55fc88 n/a (libc.so.6 + 0x25c88) #7 0x000070bcab55fd4c __libc_start_main (libc.so.6 + 0x25d4c) #8 0x000064e20324d895 _start (ddcutil + 0x1b895) ELF object binary architecture: AMD x86-64 Jul 08 09:17:49 bespin (udev-worker)[9427]: hiddev1: Process '/usr/bin/ddcutil chkusbmon /dev/usb/hiddev1 -v' terminated by signal SEGV. Jul 08 09:17:49 bespin (udev-worker)[9427]: hiddev1: Failed to wait for spawned command '/usr/bin/ddcutil chkusbmon /dev/usb/hiddev1 -v': Input/output error Jul 08 09:17:49 bespin systemd-coredump[9829]: [🡕] Process 9804 (ddcutil) of user 0 dumped core. coredumpctl debug 9422: (gdb) thread apply all backtrace full Thread 1 (Thread 0x78ee744bbb80 (LWP 9422)): #0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:76 No locals. #1 0x000057dd78414031 in strsplit (str_to_split=0x0, delims=0x57dd7842faea ":") at util/string_util.c:455 debug = false max_pieces = <optimized out> workstruct = <optimized out> piecect = <optimized out> str_to_split_dup = <optimized out> rest = 0x57dd9f453b90 "/dev/usb/hiddev3" token = <optimized out> result = <optimized out> __func__ = <optimized out> ndx = <optimized out> curpiece = <optimized out> #2 0x000057dd784264ba in avoid_device_by_usb_interfaces_property_string (interfaces=0x0) at usb/usb_displays.c:346 debug = false pieces = <optimized out> avoid = <optimized out> ndx = <optimized out> __func__ = "avoid_device_by_usb_interfaces_property_string" #3 0x000057dd78426778 in is_possible_monitor_by_hiddev_name (hiddev_name=hiddev_name@entry=0x57dd9f453b90 "/dev/usb/hiddev3") at usb/usb_displays.c:405 interfaces = <optimized out> debug = false devsum = 0x57dd9f466a60 avoid = false result = <optimized out> __func__ = "is_possible_monitor_by_hiddev_name" #4 0x000057dd784268a9 in check_usb_monitor (device_name=0x57dd9f453b90 "/dev/usb/hiddev3") at usb/usb_displays.c:903 debug = false ol = DDCA_OL_VERBOSE result = false __PRETTY_FUNCTION__ = "check_usb_monitor" __func__ = <optimized out> #5 0x000057dd783a4752 in main (argc=<optimized out>, argv=<optimized out>) at app_ddcutil/main.c:1034 is_monitor = <optimized out> main_debug = false s = <optimized out> main_rc = 0 start_time_reported = <optimized out> explicit_syslog_level = <optimized out> syslog_opened = <optimized out> preparse_verbose = <optimized out> skip_config = <optimized out> parsed_cmd = 0x57dd9f4536c0 program_start_time = 1720423054 program_start_time_s = <optimized out> new_argv = 0x57dd9f453650 new_argc = 4 untokenized_cmd_prefix = 0x0 configure_fn = 0x0 preparsed_level = <optimized out> bye = <optimized out> --Type <RET> for more, q to quit, c to continue without paging--c errs = <optimized out> callopts = 0 '\000' values = 0x0 end_time = 140727784448032 end_time_s = <optimized out> __func__ = "main" __PRETTY_FUNCTION__ = "main" Looks like a different crash. This bug report is about ddc_close_display() |