Bug 418814 - Add option to remove some special characters from filenames when extracting audio
Status: REPORTED
Alias: None
Product: k3b
Classification: Applications
Component: Copying
Version: 19.12
Platform: Other Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: k3b developers
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-13 12:24 UTC by forenkram
Modified: 2020-12-29 15:40 UTC
CC List: 4 users

See Also:
Latest Commit:
Version Fixed In:


Attachments
debugging output (566 bytes, text/plain)
2020-03-13 12:24 UTC, forenkram

Description forenkram 2020-03-13 12:24:47 UTC
Created attachment 126760
debugging output

SUMMARY
Ripping CD fails on tracks containing question marks. 

STEPS TO REPRODUCE
1. Rip a track whose title contains a question mark.

OBSERVED RESULT

Ripping fails with an error on tracks whose titles contain question marks.
The status output says:


Reading CD table of contents
Start digital audio extraction
Command failed: flac -V -o [...]
Error while encoding track 4

Debugging output is 

Devices
-----------------------
HL-DT-ST DVD+-RW GU90N A1C3 (/dev/sr0, CD-R, CD-RW, CD-ROM, DVD-ROM, DVD-R, DVD-RW, DVD-R DL, DVD+R, DVD+RW, DVD+R DL) [DVD-ROM, DVD-R Sequential, DVD-R Dual Layer Sequential, DVD-R Dual Layer Jump, DVD-RAM, DVD-RW Restricted Overwrite, DVD-RW Sequential, DVD+RW, DVD+R, DVD+R Dual Layer, CD-ROM, CD-R, CD-RW] [SAO, TAO, RAW, SAO/R96P, SAO/R96R, RAW/R16, RAW/R96P, RAW/R96R, Restricted Overwrite, Layer Jump] [%7]

System
-----------------------
K3b Version: 18.8.1
KDE Version: 5.49.0
Qt Version:  5.11.3
Kernel:      4.19.0-8-amd64


This happens both on Debian Buster with version 18.08
and on Arch Linux with version 19.12.
Note that asunder handles these tracks fine, because it replaces the offending
characters. Some encoding error handling seems to be missing, or something.
It is really annoying.

EXPECTED RESULT
The encoding should not stop; it should just encode the damn track.

SOFTWARE/OS VERSIONS
Debian 10 
KDE Plasma Version: 5.14.5
KDE Frameworks Version: 5.54.0
Qt Version: 5.11.3
K3b Version: 18.8.1
Kernel:      4.19.0-8-amd64

Archlinux
KDE Plasma Version: 5.18.3
KDE Frameworks Version: 5.67.0
Qt Version: 5.14.1
K3b Version: 19.12
Kernel:      5.5.8 (x86_64)

Comment 1 Albert Astals Cid 2020-03-18 22:41:04 UTC
Can you please paste the full
"flac -V -o"
command line?

I just tried adding a ? and it worked fine here
Comment 2 forenkram 2020-03-19 09:11:11 UTC
(In reply to Albert Astals Cid from comment #1)
> Can you please paste the full
> "flac -V -o"
> command line?
> 
> I just tried adding a ? and it worked fine here

OK, your comment got me thinking: due to space restrictions I had been encoding to a USB hard drive. I retested as follows (flac output below):
Encode a track with a question mark in its title, but save it to my internal hard disk (ext4) instead of the external drive.
Everything works fine. That is with the latest version of k3b etc., as I'm on my home computer running Arch.

If I tell k3b to save the ripped file to the external hard drive (NTFS), however,
the error occurs with the following console output (excerpt):

"flac -V -o /run/media/david/Speicherkiste/Take_Five_(CD04)/04-Why_Do_I_Love_You?.flac --force-raw-format --endian=little --channels=2 --sample-rate=44100 --sign=signed --bps=16 -T ARTIST=The Dave Brubeck Quartet -T TITLE=Why Do I Love You? -T TRACKNUMBER=04 -T DATE=1954 -T ALBUM=Take Five (CD04) -"
K3bQProcess::QProcess(0x0)
started
(K3b::CdparanoiaLib) initReading(  67639 ,  93335  )
(K3b::CdparanoiaLib) need to seek before read. Looks as if we are reusing the paranoia instance.
"Ripping track 4 (The Dave Brubeck Quartet - Why Do I Love You?)"
error while encoding.
( "Flac" )  "flac 1.3.3"
( "Flac" )  "Copyright (C) 2000-2009  Josh Coalson, 2011-2016  Xiph.Org Foundation"
( "Flac" )  "flac comes with ABSOLUTELY NO WARRANTY.  This is free software, and you are"
( "Flac" )  "welcome to redistribute it under certain conditions.  Type `flac' for details."
( "Flac" )  "(No runtime statistics possible; please wait for encoding to finish...)"
( "Flac" )  "-: ERROR initializing encoder"
( "Flac" )  "   init_status = FLAC__STREAM_ENCODER_INIT_STATUS_ENCODER_ERROR"
( "Flac" )  "   state = FLAC__STREAM_ENCODER_IO_ERROR"
( "Flac" )  "An error occurred opening the output file; it is likely that the output"
( "Flac" )  "directory does not exist or is not writable, the output file already exists and"
( "Flac" )  "is not writable, or the disk is full."

Mind you, the hard drive is not full. It is NTFS-formatted, however.

Next test: rip the same track to a USB stick formatted as FAT32.
Success.
Note that when ripping to FAT32 the question mark in the track name gets replaced by an underscore, which doesn't happen when the file is ripped to ext4.

So the issue seems to lie in the ntfs-3g driver?
Do these character replacements happen on the driver side of the filesystem? Or is this still a bug in k3b?

I wanted to test whether all this is also true on Debian Buster (I have my work laptop here), but when I open the CD in k3b, the track list is shown without the question marks to begin with. The CD info seems to be loaded differently or from another source.
Weird. That didn't happen on my main work machine, which also runs Buster. The laptop is up to date; the work computer might be a week behind, since the admins update it weekly. I cannot test this further at the moment since I'm working from home.

Hope that helps.
Comment 3 Albert Astals Cid 2020-03-19 21:23:53 UTC
Honestly, I don't think it's k3b's job to circumvent issues with filesystems.

Can you double-check whether you can manually create a file whose name contains a ? on your NTFS drive?
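
The manual check asked for above can also be scripted. The following is only a minimal sketch: `can_create` is a hypothetical helper (not part of k3b or flac) that tries to create a file with the given name in a directory and removes it again.

```python
import os


def can_create(directory: str, filename: str) -> bool:
    """Return True if `filename` can be created inside `directory`.

    The file is created exclusively and removed again, so an existing
    file of the same name is never touched (it counts as a failure).
    """
    path = os.path.join(directory, filename)
    try:
        # O_EXCL makes the call fail instead of reusing an existing file.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except (OSError, ValueError):
        # OSError: the filesystem (or its driver) rejected the name,
        # or the directory is missing or not writable.
        # ValueError: the name contains an embedded NUL byte.
        return False
    os.close(fd)
    os.remove(path)
    return True
```

Calling it with the mount point of the drive in question and a name such as `"probe_?.tmp"` shows whether that filesystem accepts a question mark; a `False` result can also mean the directory is not writable, so it is worth probing a plain name first.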
Comment 4 forenkram 2020-03-20 08:49:29 UTC
You are right: I cannot create a file with question marks in its name on the NTFS drive, so it is definitely a filesystem driver issue. Damn, that's a bummer.

I guess you can close this as it is unrelated to k3b. Or do I need to mark this as closed?

Would it be an idea to add a checkbox to remove problematic characters like that? Of course, I don't have a clue which other characters might be problematic =(.

Anyway, thank you for your time and effort. I will file a bug report with the ntfs-3g guys.

Cheers, 
David
Comment 5 forenkram 2020-03-20 17:17:25 UTC
OK, it turns out that this is not a bug but an option of the ntfs-3g driver that restricts file names to those that are valid on Windows (makes sense).
As documented here,

https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file#naming-conventions

the question mark is not allowed in file names on Windows.

I understand that it is not the duty of the k3b devs to worry about (notably) Windows file systems. However, since it is desirable to use music files that were ripped on one platform on a possibly entirely different platform, I would suggest removing the characters described in the above link to make the files more platform independent, i.e. Windows compatible.
If you don't want to implement this by default, then it would be desirable to have it as an option.
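
To make the suggestion concrete, here is a rough sketch of such a replacement step. It is not k3b code: `sanitize_filename` and its `replacement` parameter are hypothetical names, and the rules are my reading of the page linked above (reserved characters, control characters, trailing spaces and periods, reserved device names).

```python
import re

# Characters the linked Microsoft page lists as reserved, plus control
# characters 0-31, which it also disallows in file names.
_RESERVED_CHARS = re.compile(r'[<>:"/\\|?*\x00-\x1f]')

# Device names the same page reserves, with or without an extension.
_RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{n}" for prefix in ("COM", "LPT") for n in range(1, 10)
}


def sanitize_filename(name: str, replacement: str = "_") -> str:
    """Return `name` with characters that Windows rejects replaced.

    `name` is a single path component (no directories).
    """
    cleaned = _RESERVED_CHARS.sub(replacement, name)
    # Windows does not allow a name to end in a space or a period.
    cleaned = cleaned.rstrip(" .")
    # Avoid reserved device names such as "NUL" or "COM1.flac".
    stem = cleaned.split(".", 1)[0]
    if stem.upper() in _RESERVED_NAMES:
        cleaned = replacement + cleaned
    # Never return an empty name.
    return cleaned or replacement
```

With the default replacement, the file name from comment 2, `04-Why_Do_I_Love_You?.flac`, becomes `04-Why_Do_I_Love_You_.flac`, which matches what happened on the FAT32 stick.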

Cheers
Comment 6 Albert Astals Cid 2020-03-20 18:30:21 UTC
I honestly don't think it makes any sense, but it's a wish, so we can leave it open as such.
Comment 7 Jonathan Wakely 2020-12-29 15:38:23 UTC
This isn't only a problem on Windows, it also applies when ripping files to any device formatted as FAT32. This is very common, because SD cards and USB sticks are often formatted as FAT32 for compatibility with devices like car stereo systems.

If I try to rip a track with a colon or question mark in its name, it fails (after ripping any earlier tracks) because the file cannot be created on the FAT32 partition. The only workaround is to rip to a different partition (e.g. an ext4 drive on the computer), then rename the files, then copy them over.

There is already an option to replace blanks with another character. I would also like to be able to specify that ':' and '?' characters should be replaced. Ideally it would be fully customisable so that any characters can be replaced, maybe like the UNIX tr utility (e.g. replace all uppercase with lowercase). But that is less important than just being able to rip to FAT32 destinations.
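
As an illustration of the tr-like idea, a minimal sketch follows. `make_translator` is a hypothetical helper, not an existing k3b option; it borrows the GNU tr convention of repeating the last character of the second set when it is shorter than the first.

```python
import string


def make_translator(source: str, target: str) -> dict:
    """Build a character map in the spirit of `tr SOURCE TARGET`.

    As with GNU tr, a shorter `target` is padded by repeating its last
    character. (Hypothetical helper, not part of k3b.)
    """
    if not target:
        raise ValueError("target must contain at least one character")
    padded = target + target[-1] * (len(source) - len(target))
    # str.maketrans needs both strings to have the same length.
    return str.maketrans(source, padded[: len(source)])


# Replace ':' and '?' with an underscore.
fat32_safe = make_translator(":?", "_")
title = "Why Do I Love You?".translate(fat32_safe)  # "Why Do I Love You_"

# The lowercase example mentioned above.
to_lower = make_translator(string.ascii_uppercase, string.ascii_lowercase)
album = "Take Five".translate(to_lower)  # "take five"
```

A single user-supplied pair of character sets would cover both the FAT32 case and more general renaming, at the cost of a slightly less obvious setting than a plain checkbox.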
Comment 8 Jonathan Wakely 2020-12-29 15:40:17 UTC
(In reply to Albert Astals Cid from comment #3)
> Honestly, I don't think it's k3b's job to circumvent issues with filesystems.

It's not an "issue" with the filesystem, it's a property of the filesystem. Just like not being able to use '/' in filenames on unix-like systems.

k3b assumes it is always writing to a unix-like filesystem, which isn't true if you're ripping to a removable device like a portable hdd, sdcard, memory stick etc.
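
For what it's worth, on Linux a program can find out what kind of filesystem it is about to write to. The sketch below is an assumption about how that could look, not something k3b does: `filesystem_type` is a hypothetical helper that takes text in the format of /proc/mounts and returns the type of the longest matching mount point. Note that ntfs-3g mounts typically show up there as `fuseblk`, not `ntfs`.

```python
import os
from typing import Optional


def _unescape(field: str) -> str:
    # /proc/mounts writes space, tab, newline and backslash as octal escapes.
    for code, char in (("\\040", " "), ("\\011", "\t"),
                       ("\\012", "\n"), ("\\134", "\\")):
        field = field.replace(code, char)
    return field


def filesystem_type(path: str, mounts_text: str) -> Optional[str]:
    """Return the type of the filesystem containing `path`, or None.

    `mounts_text` is expected in /proc/mounts format:
    device, mount point, type, options, dump, pass.
    """
    path = os.path.abspath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = _unescape(fields[1]), fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Prefer the longest mount point; on ties the later entry wins,
        # because a later mount shadows an earlier one.
        if (path + "/").startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

It could be called as `filesystem_type(target_dir, open("/proc/mounts").read())`; a result such as `vfat` or `fuseblk` would be a hint that the stricter Windows naming rules may apply to the destination.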