This behavior is 100% consistent. When I remove power from my router (odroid archlinux server), the CPU usage on my archlinux desktop runs high and every button in the system taskbar/tray becomes unresponsive. Even though the buttons become unresponsive, I can still ALT-TAB switch, move, or click on windows. Once the network connection is restored (server/router is powered back on), everything goes back to normal.

Reproducible: Always

Steps to Reproduce:
1. Power off the router.
2. Try to switch windows by clicking a taskbar button.

Actual Results:
No action from the taskbar button click. CPU usage is high.

Expected Results:
Everything should work as expected except for network access.

I don't know whether it matters that NFS is being used.
Adding that the KDE system stat widgets also freeze with that network loss. Also, "kworker/2:1H" (whatever that is) is the one item making one of the CPU cores run at 100%. Now I'm thinking that NFS may be a big part of the issue. After taking my router offline and connecting my desktop directly to the modem, CPU usage has come back to normal levels; however, the taskbar and widgets are still unresponsive.
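To check whether NFS is in play on the affected desktop, one option is to list the NFS entries in /proc/mounts and see whether they are hard mounts (the NFS default, where requests are retried indefinitely while the server is unreachable). The following is a minimal Python sketch, not something taken from this report: the function names are made up for illustration, and the sample mount line in the comment is hypothetical.

```python
from typing import List, Tuple


def nfs_mounts(mounts_text: str) -> List[Tuple[str, str, str]]:
    """Return (source, mount_point, options) for each NFS entry.

    Expects text in /proc/mounts format, e.g. the hypothetical line
    "odroid:/export /mnt/share nfs4 rw,hard,proto=tcp 0 0".
    """
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        source, mount_point, fstype, options = fields[:4]
        if fstype in ("nfs", "nfs4"):
            found.append((source, mount_point, options))
    return found


def is_hard_mount(options: str) -> bool:
    """Treat a mount as hard unless 'soft' appears in its option list."""
    return "soft" not in options.split(",")


if __name__ == "__main__":
    # /proc/mounts exists on Linux only.
    with open("/proc/mounts") as handle:
        for source, mount_point, options in nfs_mounts(handle.read()):
            kind = "hard" if is_hard_mount(options) else "soft"
            print(f"{source} on {mount_point}: {kind} mount")
```

A hard mount listed here would be consistent with processes that touch the mount blocking until the server comes back, though it does not prove that this is what freezes the taskbar.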
Can you perhaps debug the kworker eating all CPU? Maybe this [1] helps. [1] https://askubuntu.com/questions/33640/kworker-what-is-it-and-why-is-it-hogging-so-much-cpu
Kai, can you be more specific regarding "debugging the kworker"? Does that involve using gdb, "syscall profiling tools" or simply trial-and-error by unplugging devices, shutting down services...?
Whoa, looking further down that thread, it gets a lot more detailed. I was simply looking at what was marked as "Answer". I think I'll be able to use stuff mentioned in the other answers.
At long last, I've got something to show. It's been a problem for a long time, but either I didn't have the instructions in my notes at the time or I just didn't have time or want to deal with it.

When I ran "perf report", perf ended up hanging on "processing events". More time passed and I finally decided to look into that problem. Seems perf may have a bug now: https://bugs.freedesktop.org/show_bug.cgi?id=97879#c26

So after learning some stuff (like getting the questionable fd from strace), I closed the fd which was "/dev/dri/renderD128" and "perf report" finished processing the events and showed the results. Does anyone know where I might want to submit a bug for perf (unless it's a video issue)?

*****
Samples: 48K of event 'cycles:ppp', Event count (approx.): 36686319551
  Children      Self  Command       Shared Object     Symbol
-   29.77%     0.00%  kworker/0:1H  [kernel.vmlinux]  [k] kthread
   - kthread
      - 29.69% worker_thread
         - 26.62% process_one_work
            - 15.50% xs_tcp_setup_socket
               - 6.51% xs_create_sock
                  - 3.35% __sock_create
                     - 2.23% inet_create
                          0.92% sk_alloc
                        - 0.71% tcp_v4_init_sock
                             0.63% tcp_init_sock
                     - 0.82% sock_alloc
                        - 0.69% new_inode_pseudo
                             0.65% alloc_inode
                  - 2.24% xs_bind
                     - kernel_bind
                        - 2.03% inet_bind
                             0.86% inet_csk_get_port
                  - 0.62% kernel_setsockopt
                       0.55% sock_setsockopt
               - 3.42% kernel_setsockopt
                  - 2.49% sock_common_setsockopt
                     - 2.41% tcp_setsockopt
                        - 2.21% do_tcp_setsockopt.isra.6
                             0.75% release_sock
                    0.58% sock_setsockopt
               - 1.40% xprt_force_disconnect
                  - 1.07% rpc_wake_up_status
                       0.66% rpc_wake_up_task_on_wq_queue_locked.part.19
               - 1.38% xprt_unlock_connect
                  - 0.92% xs_tcp_release_xprt
                     - 0.89% xprt_release_xprt
                        - 0.83% xprt_clear_locked
                             queue_work_on
               - 1.38% kernel_connect
                  - 1.33% inet_stream_connect
                     - 0.70% __inet_stream_connect
                          0.52% tcp_v4_connect
            - 10.00% xprt_autoclose
               - 8.85% xs_tcp_shutdown
                  - 8.62% xs_reset_transport
                     - 5.59% sock_release
                        - 3.49% inet_release
                           - 3.19% tcp_close
                              - 1.22% inet_csk_destroy_sock
                                   0.84% tcp_v4_destroy_sock
                              - 0.63% sk_free
                                   0.60% __sk_free
                        - 1.93% iput
                           - 1.61% evict
                              - 0.89% destroy_inode
                                   0.59% sock_destroy_inode
                     - 2.20% kernel_sock_shutdown
                        - 2.09% inet_shutdown
                           - 1.35% xs_tcp_state_change
                                0.69% xprt_disconnect_done
                     - 0.63% xs_tcp_release_xprt
                          0.54% xprt_release_xprt
         - 2.41% schedule
            - 2.26% __schedule
               - 1.08% deactivate_task
                  - 0.83% dequeue_task_fair
                       0.67% dequeue_entity
                 0.57% pick_next_task_fair
+   29.77%     0.00%  kworker/0:1H  [kernel.vmlinux]  [k] ret_from_fork
+   29.69%     0.33%  kworker/0:1H  [kernel.vmlinux]  [k] worker_thread
+   26.63%     0.48%  kworker/0:1H  [kernel.vmlinux]  [k] process_one_work
+   25.78%     0.00%  kworker/1:1H  [kernel.vmlinux]  [k] kthread
+   25.78%     0.00%  kworker/1:1H  [kernel.vmlinux]  [k] ret_from_fork
+   25.72%     0.27%  kworker/1:1H  [kernel.vmlinux]  [k] worker_thread
+   23.13%     0.49%  kworker/1:1H  [kernel.vmlinux]  [k] process_one_work
+   15.52%     0.72%  kworker/0:1H  [sunrpc]          [k] xs_tcp_setup_socket
+   13.55%     0.55%  kworker/1:1H  [sunrpc]          [k] xs_tcp_setup_socket
+   10.01%     0.17%  kworker/0:1H  [sunrpc]          [k] xprt_autoclose
+    9.37%     0.00%  kworker/2:1H  [kernel.vmlinux]  [k] kthread
+    9.37%     0.00%  kworker/2:1H  [kernel.vmlinux]  [k] ret_from_fork
+    9.34%     0.11%  kworker/2:1H  [kernel.vmlinux]  [k] worker_thread
+    8.87%     0.07%  kworker/0:1H  [sunrpc]          [k] xs_tcp_shutdown
+    8.62%     0.28%  kworker/0:1H  [sunrpc]          [k] xs_reset_transport
+    8.55%     0.15%  kworker/1:1H  [sunrpc]          [k] xprt_autoclose
+    8.35%     0.18%  kworker/2:1H  [kernel.vmlinux]  [k] process_one_work
+    7.70%     0.09%  kworker/1:1H  [sunrpc]          [k] xs_tcp_shutdown
+    7.44%     0.21%  kworker/1:1H  [sunrpc]          [k] xs_reset_transport
+    7.13%     0.00%  plasmashell   libc-2.24.so      [.] __xstat64
*****
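Since the same sunrpc functions show up once per kworker thread, it can help to add up the "Children" column per symbol across threads. Below is a rough Python sketch for doing that on the summary rows of a perf report saved as text, assuming one row per line in the "Children Self Command Shared Object Symbol" layout shown above. The function name and the regular expression are mine, not part of perf.

```python
import re
from collections import defaultdict
from typing import Dict, Tuple

# Matches summary rows such as:
#   + 15.52% 0.72% kworker/0:1H [sunrpc] [k] xs_tcp_setup_socket
# Call-tree rows (a single percentage followed by a symbol) do not match.
ROW = re.compile(
    r"^\s*[+-]?\s*(?P<children>\d+\.\d+)%\s+(?P<self>\d+\.\d+)%\s+"
    r"(?P<command>\S+)\s+(?P<dso>\S+)\s+\[.\]\s+(?P<symbol>\S+)"
)


def children_by_symbol(report_text: str) -> Dict[Tuple[str, str], float]:
    """Sum the Children percentage per (shared object, symbol) over all commands."""
    totals: Dict[Tuple[str, str], float] = defaultdict(float)
    for line in report_text.splitlines():
        match = ROW.match(line)
        if match:
            key = (match.group("dso"), match.group("symbol"))
            totals[key] += float(match.group("children"))
    return dict(totals)


def print_top(report_text: str, limit: int = 10) -> None:
    """Print the symbols with the largest combined Children share."""
    totals = children_by_symbol(report_text)
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    for (dso, symbol), percent in ranked[:limit]:
        print(f"{percent:6.2f}%  {dso}  {symbol}")
```

Each summary row belongs to a single command, so rows for the same symbol in different kworker threads cover separate samples and can be added. Rows for different symbols overlap (a caller's Children value includes its callees), so the totals should be compared with each other, not summed.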
If we block on kernel-level device calls, the process will freeze. There is not much we can or will do about it, especially as KDE provides its own userspace NFS/SMB clients for exactly this reason. There's a similar bug in plasmashell which lists some workaround options, and some patches that can suppress the issue from plasmashell.
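To illustrate why a blocking call such as the __xstat64 seen in the trace freezes the caller, and one general way an application can avoid waiting on it, here is a small Python sketch. It is not how plasmashell or any KDE patch handles this; the function name and the timeout value are assumptions for illustration only. The stat call runs in a daemon thread and the caller gives up after a timeout.

```python
import os
import threading
from typing import Callable, Optional


def stat_with_timeout(
    path: str,
    timeout_s: float = 2.0,
    stat_fn: Callable[[str], os.stat_result] = os.stat,
) -> Optional[os.stat_result]:
    """Run stat_fn(path) in a helper thread; return None if it takes too long.

    On a hard NFS mount whose server is unreachable, os.stat can block
    indefinitely. The helper thread stays blocked in that case, but the
    calling thread gets control back after timeout_s seconds.
    """
    result = {}

    def worker() -> None:
        try:
            result["value"] = stat_fn(path)
        except OSError as exc:
            result["error"] = exc

    # daemon=True so a permanently stuck thread does not prevent exit.
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        return None  # still blocked, treat the path as unavailable for now
    if "error" in result:
        raise result["error"]
    return result["value"]
```

This only keeps the caller responsive. The blocked thread itself cannot be cancelled and stays stuck until the kernel call returns, which is the limitation described above.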