Bug 462268

Summary: Error storing podcast episode description in external MySQL database
Product: [Applications] amarok Reporter: Stephan Schöttl <stschoettl>
Component: PodcastAssignee: Amarok Bugs <amarok-bugs-null>
Status: RESOLVED FIXED    
Severity: normal CC: bart.cerneels, stschoettl, tuomas
Priority: NOR    
Version First Reported In: unspecified   
Target Milestone: kf5   
Platform: openSUSE   
OS: Linux   
Latest Commit: Version Fixed In:
Sentry Crash Report:

Description Stephan Schöttl 2022-11-26 13:36:22 UTC
Certain podcast episodes are not saved to the database. A dialog with a MySQL appears with an error message. The error hast been seen with the podcast https://zoo-podcast.podigee.io/feed/mp3 and an external mysql database

STEPS TO REPRODUCE
1. Subscribe podcast https://zoo-podcast.podigee.io/feed/mp3
2. Update podcast episodes
3. wait

OBSERVED RESULT
***
MySQL-server query failed! (1366) Incorrect string value: '\xF0\x9F\x8E\xA4 T...' for column `amarok26db`.`podcastepisodes`.`description` at row 1 on INSERT INTO podcastepisodes (url,channel,localurl,guid,title,subtitle,sequencenumber,description,mimetype,pubdate,duration,filesize,isnew,iskeep) VALUES ( 'https://cdn.podigee.com/media/podcast_18387_mia_san_tier_der_zoo_pdcst_aus_hellabrunn_episode_417392_laut_und_verspielt_mahnenrobben_in_hellabrunn.mp3?v=1617282197&source=feed', 10, '', 'ed19282cc9dad42e0b89167f1263e9f2', 'Laut und verspielt - Mähnenrobben in Hellabrunn', 'Folge 32', 0, 'Mit verbundenen Augen würde Tierpfleger Helmut jede seiner Robben am "Röhren" erkennen. MiaSanTier, der Zoo-Podcast aus Hellabrunn, ist zu Gast bei den Mähnenrobben - die haben erstaunlich kleine Ohren. Ihr lernt die Robben-Gruppe näher kennen und erfahrt, wer gerade in wen verliebt ist. Diesmal wird es auf jeden Fall laut und verspielt! 
\n
\nPodcast-Redakteurin 🎤 Tina Gentner hat diesmal Zoologin Lena Bockreiß befragt und war mit Tierpfleger Helmut bei der Robben-Fütterung dabei.', 'audio/mpeg', '2021-04-02T05:00:00Z', 0, 19784629, 0, 0);
***

EXPECTED RESULT
no error dialog

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: OpenSuse 15.4
Amarok version: amarok-2.9.75git.20220614T014846~dc55a00143-bp154.2.3.1.x86_64 from Opensuse repository

ADDITIONAL INFORMATION
using an external mysql database (not sure if that makes any difference)
Comment 1 Tuomas Nurmi 2025-01-05 11:03:43 UTC
Also encountering this myself with some podcast.
The problem seems to be that databases are created with utf8mb3 character set, but full UTF8 support would require them to be utf8mb4, detailed e.g. at https://stackoverflow.com/questions/1168036/how-to-fix-incorrect-string-value-errors , so some UTF8 characters fail to get saved to database. Looking through existing and new mariadb databases, the column type seems to be utf8mb3 everywhere.

Comment at mariadb documentation  https://mariadb.com/kb/en/supported-character-sets-and-collations/
"The intention I believe is at some point to switch off the default old mode setting, so that utf8 will then be an alias for utf8mb4."

MySQL seems to be further already, documentation at https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html :
"The recommended character set for MySQL is utf8mb4. All new applications should use utf8mb4. The utf8mb3 character set is deprecated. utf8mb3 remains supported for the lifetimes of the MySQL 8.0.x and following LTS release series, as well as in MySQL 8.0.
Expect utf8mb3 to be removed in a future major release of MySQL. "

Will need some database schema migration. Should be pretty straightforward, but needs a lot of care. Last one seems to have been done in 2012.
Comment 2 Bug Janitor Service 2025-01-05 12:35:25 UTC
A possibly relevant merge request was started @ https://invent.kde.org/multimedia/amarok/-/merge_requests/134
Comment 3 Tuomas Nurmi 2025-03-15 11:52:39 UTC
Git commit 0e7509d0a8fc6e70ee3b60f1ec5db9080c3c524c by Tuomas Nurmi.
Committed on 15/03/2025 at 11:38.
Pushed by nurmi into branch 'master'.

Add some database bug related tests that fail for now

Database should be utf8mb4 for complete UTF8 support.
Directory changedates shouldn't hit year 2038 problem.
Related: bug 426807

M  +14   -0    tests/core-impl/collections/db/sql/TestSqlCollectionLocation.cpp
M  +2    -0    tests/core-impl/collections/db/sql/TestSqlCollectionLocation.h
M  +12   -0    tests/core-impl/collections/db/sql/TestSqlTrack.cpp
M  +2    -0    tests/core-impl/collections/db/sql/TestSqlTrack.h

https://invent.kde.org/multimedia/amarok/-/commit/0e7509d0a8fc6e70ee3b60f1ec5db9080c3c524c
Comment 4 Tuomas Nurmi 2025-03-15 11:52:41 UTC
Git commit 371c8ba323a159d8b889ab341cf9af2aa6cb0ec3 by Tuomas Nurmi.
Committed on 15/03/2025 at 11:38.
Pushed by nurmi into branch 'master'.

Change database to use utf8mb4 & convert existing databases

Requires a database version upgrade, which includes column length changes.
Basic tests done and seems to work, but would definitely benefit from extra
reviews and testing.

M  +1    -0    ChangeLog
M  +64   -20   src/core-impl/collections/db/sql/DatabaseUpdater.cpp
M  +1    -0    src/core-impl/collections/db/sql/DatabaseUpdater.h
M  +2    -2    src/core-impl/collections/db/sql/SqlQueryMaker.cpp
M  +4    -4    src/core-impl/storage/sql/mysql-shared/MySqlStorage.cpp
M  +2    -2    src/core-impl/storage/sql/mysqlestorage/MySqlEmbeddedStorage.cpp
M  +2    -2    src/core-impl/storage/sql/mysqlserverstorage/MySqlServerStorage.cpp
M  +2    -2    src/importers/amarok/AmarokEmbeddedSqlConnection.cpp
M  +1    -1    src/services/ServiceSqlQueryMaker.cpp
M  +2    -2    tests/importers/files/amarok2_mysqle/amarok/db.opt

https://invent.kde.org/multimedia/amarok/-/commit/371c8ba323a159d8b889ab341cf9af2aa6cb0ec3