Bug 429125

Summary: gcompris 0.97.1 crashes on Raspberry Pi OS with latest updates
Product: [Applications] gcompris Reporter: kevkearney
Component: generalAssignee: Bruno Coudoin <bruno.coudoin>
Status: RESOLVED FIXED    
Severity: crash CC: animtim, daniele+kde, gcompris, jazeix, kevkearney, pierre
Priority: NOR    
Version First Reported In: git master   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed/Implemented In:
Sentry Crash Report:
Attachments: dmesg

Description kevkearney 2020-11-15 00:38:32 UTC
SUMMARY
Gcompris crashes with the latest raspberry pi os updates, but not the latest release. I am trying to track down what the difference is. The application UI starts up, and the icons start to wiggle, but within seconds the mouse, keyboard, and UI freeze and won't respond to any other input. The cout and cerr streams don't give any clue.

STEPS TO REPRODUCE
1. Install Raspberry Pi OS from August 20th 2020 but don't run "apt full-upgrade".
2. Install the 0.76.1 raspberry pi version of gcompris
3. Install libpulse-dev, because gcompris requires it
4. Start gcompris, it runs properly without a crash
5. Run a "sudo apt update" followed by "sudo apt full-upgrade".
6. Run gcompris

OBSERVED RESULT
gcompris freezes with the latest updates and kernel available for rasperry pi os installed, but works fine on vanilla version from August 20th 2020.

EXPECTED RESULT
gcompris doesn't freeze with the latest updates

SOFTWARE/OS VERSIONS
Windows: 
macOS: 
Linux/KDE Plasma: 
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION
Tested and verified with both Desktop and Full version of 08/20/2020 raspberry pi OS
Tested on several different Raspberry PI 4 boards, and several SD cards.
Tested and verified with both raspberry pi compiled script, and building from scratch
Comment 1 kevkearney 2020-11-15 00:43:17 UTC
Going to continue trying to track this down by updating packages incrementally, my hunch right now is that it's the kernel.
Comment 2 kevkearney 2020-11-15 01:45:26 UTC
After some testing and updating, I think it's the latest bootloader. Need more evidence.
Comment 3 animtim 2020-11-15 02:01:21 UTC
The 0.76 version is quite old, can you please try to use our latest standalone package 0.97 and see if it runs better?

Take a look at https://gcompris.net/downloads-en.html and read carefully the "Raspberry Pi" part to learn how to download, install and use it.
Comment 4 kevkearney 2020-11-15 02:12:10 UTC
I'm sorry. Total mistype. Am using 0.97.1
Comment 5 kevkearney 2020-11-15 02:19:40 UTC
Latest Bootloader is the problem

Updated Steps to Reproduce:
STEPS TO REPRODUCE
1. Install Raspberry Pi OS from August 20th 2020 but don't install any updates
2. Install the 0.97.1 raspberry pi version of gcompris
3. Install libpulse-dev, because gcompris requires it
4. Start gcompris, it runs properly without a crash
5. Run a "sudo apt update" followed by "sudo apt install raspberrypi-bootloader"
   This will install the following other deps:
   libraspberrypi-bin libraspberrypi-dev libraspberrypi-doc libraspberrypi0
   raspberrypi-bootloader
   Specifically: Unpacking raspberrypi-bootloader (1.20201022-1) over (1.20200819-
6. Run gcompris
7. Freeze
Comment 6 animtim 2020-11-15 02:22:48 UTC
Can you check in raspi-config if you have OpenGL support activated ?
Comment 7 kevkearney 2020-11-15 02:31:34 UTC
I believe that the experimental OpenGL FakeKMS option is default in this version of raspberry pi OS. I did however try both options, and got roughly the same result. Open GL ends up being a hard hang that can only be recovered with a REISUB, or hard poweroff.

Legacy Driver still required a REISUB, but at least it unmounted the sdcard and stuff.
Comment 8 animtim 2020-11-15 02:49:22 UTC
OK. Just for reference, on my test system with Raspbian 9, I have it set to 
"G1 GL (Full KMS) OpenGL desktop driver with full KMS" to get it running properly.

We will need to investigate this issue with the latest bootloader. Thanks for the bug report.
Comment 9 kevkearney 2020-11-15 02:54:46 UTC
(In reply to animtim from comment #8)
> OK. Just for reference, on my test system with Raspbian 9, I have it set to 
> "G1 GL (Full KMS) OpenGL desktop driver with full KMS" to get it running
> properly.
> 
> We will need to investigate this issue with the latest bootloader. Thanks
> for the bug report.

Ah interesting. That doesn't appear to be an option anymore in the latest raspi-config. I only have 2 options, legacy and Fake KMS.
Thanks, let me know if I can be of any help.
Comment 10 kevkearney 2020-11-15 02:55:37 UTC
I can pull new source right from github and build right on my machine if nesscessary.
Comment 11 kevkearney 2020-11-15 16:54:47 UTC
Nov 15 11:52:40 test8 kernel: [  158.596761] v3d fec00000.v3d: MMU error from client L2T (0) at 0x981000, pte invalid
Nov 15 11:52:43 test8 kernel: [  160.796992] ------------[ cut here ]------------
Nov 15 11:52:43 test8 kernel: [  160.797064] WARNING: CPU: 0 PID: 89 at drivers/firmware/raspberrypi.c:63 rpi_firmware_transaction+0xfc/0x138
Nov 15 11:52:43 test8 kernel: [  160.797071] Firmware transaction timeout
Nov 15 11:52:43 test8 kernel: [  160.797078] Modules linked in: cmac rfcomm bnep hci_uart btbcm bluetooth ecdh_generic ecc fuse 8021q garp stp llc brcmfmac brcmutil sha256_generic libsha256 cfg80211 rfkill v3d bcm2835_codec(C) gpu_sched bcm2835_isp(C) v4l2_mem2mem bcm2835_v4l2(C) raspberrypi_hwmon bcm2835_mmal_vchiq(C) videobuf2_dma_contig videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common vc4 vc_sm_cma(C) videodev cec mc snd_bcm2835(C) drm_kms_helper drm drm_panel_orientation_quirks snd_soc_core snd_compress snd_pcm_dmaengine snd_pcm snd_timer rpivid_mem snd syscopyarea sysfillrect sysimgblt fb_sys_fops uio_pdrv_genirq uio i2c_dev ip_tables x_tables ipv6
Nov 15 11:52:43 test8 kernel: [  160.797346] CPU: 0 PID: 89 Comm: kworker/0:2 Tainted: G         C        5.4.72-v7l+ #1356
Nov 15 11:52:43 test8 kernel: [  160.797353] Hardware name: BCM2711
Nov 15 11:52:43 test8 kernel: [  160.797412] Workqueue: events v3d_clock_down_work [v3d]
Nov 15 11:52:43 test8 kernel: [  160.797418] Backtrace: 
Nov 15 11:52:43 test8 kernel: [  160.797435] [<c020e1c8>] (dump_backtrace) from [<c020e544>] (show_stack+0x20/0x24)
Nov 15 11:52:43 test8 kernel: [  160.797445]  r7:ffffffff r6:00000000 r5:60000013 r4:c12a1130
Nov 15 11:52:43 test8 kernel: [  160.797459] [<c020e524>] (show_stack) from [<c0a84cb0>] (dump_stack+0xd4/0x118)
Nov 15 11:52:43 test8 kernel: [  160.797468] [<c0a84bdc>] (dump_stack) from [<c0222424>] (__warn+0xe0/0x108)
Nov 15 11:52:43 test8 kernel: [  160.797476]  r10:d59eef00 r9:00000000 r8:00000009 r7:0000003f r6:c08eb020 r5:00000009
Nov 15 11:52:43 test8 kernel: [  160.797479]  r4:c0e3e5d8 r3:00000000
Nov 15 11:52:43 test8 kernel: [  160.797484] [<c0222344>] (__warn) from [<c0222824>] (warn_slowpath_fmt+0xa8/0xd0)
Nov 15 11:52:43 test8 kernel: [  160.797487]  r7:c08eb020 r6:0000003f r5:c0e3e5d8 r4:c0e3e5f8
Nov 15 11:52:43 test8 kernel: [  160.797493] [<c0222780>] (warn_slowpath_fmt) from [<c08eb020>] (rpi_firmware_transaction+0xfc/0x138)
Nov 15 11:52:43 test8 kernel: [  160.797497]  r8:00001000 r7:efa10440 r6:00000000 r5:ffffff92 r4:efa10440
Nov 15 11:52:43 test8 kernel: [  160.797505] [<c08eaf24>] (rpi_firmware_transaction) from [<c08eb11c>] (rpi_firmware_property_list+0xc0/0x184)
Nov 15 11:52:43 test8 kernel: [  160.797512]  r6:dec04024 r5:00000018 r4:dec04000
Nov 15 11:52:43 test8 kernel: [  160.797520] [<c08eb05c>] (rpi_firmware_property_list) from [<c08eb250>] (rpi_firmware_property+0x70/0x118)
Nov 15 11:52:43 test8 kernel: [  160.797528]  r10:d59eef0c r9:00038002 r8:efa10440 r7:d7653e18 r6:0000000c r5:00000018
Nov 15 11:52:43 test8 kernel: [  160.797536]  r4:d59eef00
Nov 15 11:52:43 test8 kernel: [  160.797553] [<c08eb1e0>] (rpi_firmware_property) from [<c073c33c>] (raspberrypi_clock_property+0x58/0x8c)
Nov 15 11:52:43 test8 kernel: [  160.797559]  r10:0ee6b280 r9:1dcd6500 r8:00000000 r7:efb2ca80 r6:ef050540 r5:00000000
Nov 15 11:52:43 test8 kernel: [  160.797562]  r4:d7653e40 r3:0000000c
Nov 15 11:52:43 test8 kernel: [  160.797567] [<c073c2e4>] (raspberrypi_clock_property) from [<c073c4cc>] (raspberrypi_fw_set_rate+0x54/0xcc)
Nov 15 11:52:43 test8 kernel: [  160.797570]  r5:00000000 r4:ef0509c0
Nov 15 11:52:43 test8 kernel: [  160.797579] [<c073c478>] (raspberrypi_fw_set_rate) from [<c07347f8>] (clk_change_rate+0xf4/0x478)
Nov 15 11:52:43 test8 kernel: [  160.797582]  r6:00000000 r5:00000000 r4:efb2ca80
Nov 15 11:52:43 test8 kernel: [  160.797588] [<c0734704>] (clk_change_rate) from [<c0734d08>] (clk_core_set_rate_nolock+0x18c/0x1b4)
Nov 15 11:52:43 test8 kernel: [  160.797592]  r10:00000000 r9:00000000 r8:00000000 r7:efb2ca80 r6:00002710 r5:00000000
Nov 15 11:52:43 test8 kernel: [  160.797594]  r4:efb2ca80
Nov 15 11:52:43 test8 kernel: [  160.797600] [<c0734b7c>] (clk_core_set_rate_nolock) from [<c0734d70>] (clk_set_rate+0x40/0x98)
Nov 15 11:52:43 test8 kernel: [  160.797607]  r7:eff16600 r6:eff13300 r5:00002710 r4:d7378440
Nov 15 11:52:43 test8 kernel: [  160.797622] [<c0734d30>] (clk_set_rate) from [<bf2bef54>] (v3d_clock_down_work+0x24/0x68 [v3d])
Nov 15 11:52:43 test8 kernel: [  160.797630]  r5:d7634080 r4:d76643ac
Nov 15 11:52:43 test8 kernel: [  160.797648] [<bf2bef30>] (v3d_clock_down_work [v3d]) from [<c023f09c>] (process_one_work+0x214/0x538)
Nov 15 11:52:43 test8 kernel: [  160.797653]  r5:d7634080 r4:d76643ac
Nov 15 11:52:43 test8 kernel: [  160.797661] [<c023ee88>] (process_one_work) from [<c023f40c>] (worker_thread+0x4c/0x580)
Nov 15 11:52:43 test8 kernel: [  160.797668]  r10:eff13300 r9:c1203d00 r8:eff13318 r7:00000008 r6:eff13300 r5:d7634094
Nov 15 11:52:43 test8 kernel: [  160.797671]  r4:d7634080
Nov 15 11:52:43 test8 kernel: [  160.797678] [<c023f3c0>] (worker_thread) from [<c02468bc>] (kthread+0x144/0x170)
Nov 15 11:52:43 test8 kernel: [  160.797684]  r10:ef8ffe7c r9:ef334f9c r8:d7634080 r7:d7652000 r6:00000000 r5:d7638200
Nov 15 11:52:43 test8 kernel: [  160.797686]  r4:ef334f80
Nov 15 11:52:43 test8 kernel: [  160.797693] [<c0246778>] (kthread) from [<c02010ac>] (ret_from_fork+0x14/0x28)
Nov 15 11:52:43 test8 kernel: [  160.797699] Exception stack(0xd7653fb0 to 0xd7653ff8)
Nov 15 11:52:43 test8 kernel: [  160.797707] 3fa0:                                     00000000 00000000 00000000 00000000
Nov 15 11:52:43 test8 kernel: [  160.797715] 3fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Nov 15 11:52:43 test8 kernel: [  160.797722] 3fe0: 00000000 00000000 00000000 00000000 00000013 00000000
Nov 15 11:52:43 test8 kernel: [  160.797730]  r10:00000000 r9:00000000 r8:00000000 r7:00000000 r6:00000000 r5:c0246778
Nov 15 11:52:43 test8 kernel: [  160.797733]  r4:d7638200
Nov 15 11:52:43 test8 kernel: [  160.797739] ---[ end trace 54ea3c43a183ec13 ]---
Nov 15 11:52:43 test8 kernel: [  160.797778] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-v3d frequency: -110
Nov 15 11:52:45 test8 kernel: [  162.877100] hwmon hwmon1: Failed to get throttled (-110)
Nov 15 11:52:46 test8 kernel: [  163.917116] cpu cpu0: dev_pm_opp_set_rate: failed to find current OPP for freq 4294967186 (-34)
Nov 15 11:52:47 test8 kernel: [  164.957103] raspberrypi-clk soc:firmware:clocks: Failed to change fw-clk-arm frequency: -110
Comment 12 animtim 2020-11-17 20:07:48 UTC
Can you please test with this installer if it works on your system?

https://gcompris.net/download/qt/raspberrypi/gcompris-qt-1.0-Raspberry-beta.sh
Comment 13 kevkearney 2020-11-17 20:46:15 UTC
Installed this next to version 0.97.1. I was able to get through the background music download process, but then on the next execution it froze again.
Comment 14 walrii 2020-12-14 01:10:54 UTC
My experience duplicates the description, but with different versions.  I have observed that if I add -m (muted), it runs fine.  Perhaps this will provide a hint to where the problem is.  The freeze is so bad that my only option is to power down.

Raspberry Pi 400.  Using original SD (believed to be Raspian NOOBS) ran apt update/upgrade before I tried to get gcompris (and after).  Installed using apt install gcompris.

--version reports:
qt5ct: using qt5ct plugin
gcompris-qt 0.95
Comment 15 Pierre C 2020-12-21 00:51:53 UTC
Hi,

Exact same setup as #14. Raspberry Pi 400, original SD, upgraded today, `gcompris` freezes, `gcompris -m` working.

Not sure why gcompris triggers this only with audio.

This seems related to reclocking the v3d core. If I understand correctly the linux kernel changes the frequency of the v3d core (gpu) down, which is done by talking to a firmware. This fails and the system freezes, maybe due to not comprehensive error handling in the kernel side (mutex taken?) or the firmware getting stuck.

The firmware communication layer seems to be used all the time (rpi_firmware_*) so not surprising that if something gets stuck the system freezes.

It's possible that the reclock fails because a previous audio related message to the firmware crashed it, but there's no logs about this.

v3d_clock_down_work: https://github.com/raspberrypi/linux/blob/31d364af258ff9754a1a9c7d8ea532da962797bd/drivers/gpu/drm/v3d/v3d_gem.c#L23-L32

v3d_clock_up_put: reclocks down after 100ms of no one holding the "clock up" https://github.com/raspberrypi/linux/blob/31d364af258ff9754a1a9c7d8ea532da962797bd/drivers/gpu/drm/v3d/v3d_gem.c#L51-L60
Comment 16 Pierre C 2020-12-21 01:00:34 UTC
Created attachment 134242 [details]
dmesg

My kernel log
Comment 17 Jazeix Johnny 2020-12-21 10:21:44 UTC
Hi,

thank you for all your analysis.
Can you try running with --software-renderer (but without muting)?
We can see if it comes from openGL or if it is related to sounds.

We had issues a 2-3 years ago with freezing on Linux due to audio playing and we had to disable sounds at that time.
Comment 18 Pierre C 2020-12-21 11:29:40 UTC
OK I will try that.

In the meantime I've created an issue on raspberrypi/firmware, I believe it's the right place for this issue
https://github.com/raspberrypi/firmware/issues/1516
Comment 19 Pierre C 2020-12-21 13:41:12 UTC
Does not work with: --software-renderer -s
Works with: --software-renderer -s (as expected)
Comment 20 Pierre C 2020-12-21 13:43:24 UTC
(In reply to Pierre C from comment #19)
> Does not work with: --software-renderer -s
> Works with: --software-renderer -s (as expected)
Correction:
Does not work with: --software-renderer -s
Works with: --software-renderer -m (as expected)

(there's no "edit comment" in bugzilla?)
Comment 21 walrii 2021-01-03 23:14:22 UTC
Can confirm that in my case, running with --software-renderer -s did not work.

However, installing installing gstreamer1.0-pulseaudio did solve the problem.  See  Bug 424242.
Comment 22 Jazeix Johnny 2021-01-04 17:58:28 UTC
@kevkearney@gmail.com, @Pierre C, can you confirm if it works on your side after installing gstreamer1.0-pulseaudio package?
Comment 23 kevkearney 2021-01-05 16:45:32 UTC
Just tried installing gstreamer1.0.

Verified that it crashed before install, and works properly after install of the package.
Comment 24 Pierre C 2021-01-05 17:23:45 UTC
@Jazeix Johnny yes, installing gstreamer1.0-pulseaudio fixes it for me.

So qt5 does audio through gstreamer -> pulseaudio once this plugin is installed I guess. Do you know which audio backend it uses if gstreamer1.0-pulseaudio is not installed? (which is the faulty one)
Comment 25 Jazeix Johnny 2021-01-05 17:57:20 UTC
(In reply to Pierre C from comment #24)
> @Jazeix Johnny yes, installing gstreamer1.0-pulseaudio fixes it for me.
> 
> So qt5 does audio through gstreamer -> pulseaudio once this plugin is
> installed I guess. Do you know which audio backend it uses if
> gstreamer1.0-pulseaudio is not installed? (which is the faulty one)

I don't know, but you can add additional trace from Qt by setting the environment variable QT_DEBUG_PLUGINS to 1 that may show which plugins it loads (but I'm not sure if it will display the gstreamer ones)
Comment 26 Pierre C 2021-01-05 18:26:08 UTC
I also found QT_DEBUG_PLUGINS while searching but it only says it uses gstreamer if I understood correctly.

I found the info using `GST_DEBUG=4 gcompris-qt |& grep actual-sink`
Or more easily with: `GST_DEBUG=4 gst-launch-1.0 audiotestsrc ! autoaudiosink |& grep actual-sink`

When gstreamer1.0-pulseaudio is installed pulsesink is used which works.
When it's not, omxhdmiaudiosink is used (at least in my setup) which generates the firmware bug.