Created attachment 151708
source html file

SUMMARY
An epub file whose html content filenames contain spaces (inside the epub zip file) causes the content to be doubled, with the TOC pointing to the second copy.

STEPS TO REPRODUCE
1. Create an epub file with spaces in component filenames
1.1. Create html file "te st.html" (with space)
    nano "te st.html"
    <html><body>
    <h1>Chapter 1</h1>
    <h1>Chapter 2</h1>
    <h1>Chapter 3</h1>
    <h1>Chapter 4</h1>
    <h1>Chapter 5</h1>
    </body></html>
1.2. Convert to epub file (with spaces in component filenames)
    ebook-convert "te st".{html,epub}
1.3. Review contents
    unzip -l "te st.epub"
2. View with okular
    okular "te st.epub"

OBSERVED RESULT
The reader will show Title Page, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5. The table of contents will point to the second occurrence, so Chapter 1 will be on page 7.

EXPECTED RESULT
Reader should show Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5. The TOC should place Chapter 1 on page 2.

SOFTWARE/OS VERSIONS
Linux/KDE Plasma: 5.19.4-arch1-1 x86_64 GNU/Linux
Window Manager: jwm 2.3.7-3

ADDITIONAL INFORMATION
Manually removing the spaces in the component file names (te st_split_000.html, etc.) and editing content.opf and toc.ncx to remove the space and %20 in references corrects the problem.
ebook-viewer does not share this problem.
There is no doubling of content references in either content.opf or toc.ncx.
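The manual workaround described under ADDITIONAL INFORMATION can be scripted. The following is only a minimal sketch, not a fix for Okular: the function name `strip_spaces_from_epub` is made up for illustration, and it assumes that spaces occur only in file names (not folder names), that the .opf and .ncx members are UTF-8, and that removing the space does not collide with an existing member. Links between the html files themselves are not rewritten.

```python
import posixpath
import zipfile
from urllib.parse import quote


def strip_spaces_from_epub(src_path, dst_path):
    """Copy an EPUB, dropping spaces from member file names and from
    the references to them in the .opf and .ncx members."""
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(dst_path, "w") as dst:
        names = src.namelist()
        # Map each member whose file name contains a space to a space-free name.
        renames = {}
        for name in names:
            folder, base = posixpath.split(name)
            if " " in base:
                renames[name] = posixpath.join(folder, base.replace(" ", ""))
        for name in names:
            data = src.read(name)
            if name.endswith((".opf", ".ncx")):
                text = data.decode("utf-8")  # assumption: UTF-8 metadata
                for old, new in renames.items():
                    old_base = posixpath.basename(old)
                    new_base = posixpath.basename(new)
                    # A reference may use the %20 escape or a raw space.
                    text = text.replace(quote(old_base), new_base)
                    text = text.replace(old_base, new_base)
                data = text.encode("utf-8")
            # The "mimetype" member of an EPUB is expected to be stored uncompressed.
            method = zipfile.ZIP_STORED if name == "mimetype" else zipfile.ZIP_DEFLATED
            dst.writestr(renames.get(name, name), data, compress_type=method)
    return renames
```

Calling `strip_spaces_from_epub("te st.epub", "test.epub")` writes a second file and leaves the original untouched, so both can be opened side by side in okular for comparison.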
Playing around with it:

If I have:
    test_split_000.html
    te st_split_001.html
    te st_split_002.html
    test_split_003.html
    te st_split_004.html

and edit content.opf to have lines:
    <item id="html5" href="test_split_000.html" media-type="application/xhtml+xml"/>
    <item id="html4" href="te st_split_001.html" media-type="application/xhtml+xml"/>
    <item id="html3" href="te st_split_002.html" media-type="application/xhtml+xml"/>
    <item id="html2" href="test_split_003.html" media-type="application/xhtml+xml"/>
    <item id="html1" href="te st_split_004.html" media-type="application/xhtml+xml"/>

and edit toc.ncx, changing lines:
    <content src="te%20st_split_000.html"/>
    ...
    <content src="te%20st_split_003.html"/>
to:
    <content src="test_split_000.html"/>
    ...
    <content src="test_split_003.html"/>

The book shows Title Page, Chapter 1, Chapter 2, Chapter 3, Chapter 4, Chapter 5, Chapter 2, Chapter 3, Chapter 5.
The second copies of Chapters 1 and 4 (the two files without a space) are missing.
The TOC shows Chapters 1-5 pointing to pages 2, 7, 8, 4, 9.
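In the fragments above, content.opf spells a file name with a raw space while toc.ncx spells the same name with %20. Whether that difference is what triggers the doubling in Okular is not established in this report. The sketch below only lists every reference in both files next to the archive member it resolves to, so the two spellings can be compared. The helper names `_attr_values` and `reference_report` are hypothetical, and locating the files by their .opf/.ncx suffix is an assumption.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import unquote, urldefrag


def _attr_values(xml_bytes, tag, attr):
    """Return the `attr` value of every element whose local name is `tag`."""
    root = ET.fromstring(xml_bytes)
    return [
        el.get(attr)
        for el in root.iter()
        if el.tag.rsplit("}", 1)[-1] == tag and el.get(attr)
    ]


def reference_report(epub_path):
    """Map each .opf/.ncx member to rows of
    (raw reference, resolved member name, member exists in the archive)."""
    report = {}
    with zipfile.ZipFile(epub_path) as epub:
        members = set(epub.namelist())
        for name in epub.namelist():
            if name.endswith(".opf"):
                refs = _attr_values(epub.read(name), "item", "href")
            elif name.endswith(".ncx"):
                refs = _attr_values(epub.read(name), "content", "src")
            else:
                continue
            base = posixpath.dirname(name)
            rows = []
            for ref in refs:
                # Drop any #fragment, undo %20-style escapes, then resolve
                # the path relative to the folder of the referring file.
                path = unquote(urldefrag(ref)[0])
                resolved = posixpath.normpath(posixpath.join(base, path))
                rows.append((ref, resolved, resolved in members))
            report[name] = rows
    return report
```

If the raw-space and %20 spellings resolve to the same member name and each member is listed once, the epub itself contains no duplicate references, which matches the observation that neither content.opf nor toc.ncx doubles the content.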
Can you attach such an epub file?
Created attachment 151774
an epub file illustrating the bug

This html source differs from the previous one in that it only has chapters 1-3, a necessary change so that the epub fits within the 4k file size limit for submissions. It still illustrates the repeated content.
I don't see any problem with the attached file. What is wrong with it?
Created attachment 152038
screenshot of okular mistakenly displaying a second copy of the epub content

Note that the TOC shows chapters 1-3 pointing to pages 4-6. This is the second copy of the content. The first copy of chapters 1-3 is on pages 1-3. Page through the document and you'll see it displayed as:
    page 1 Chapter 1
    page 2 Chapter 2
    page 3 Chapter 3
    page 4 Chapter 1 (pointed to by TOC)
    page 5 Chapter 2 (pointed to by TOC)
    page 6 Chapter 3 (pointed to by TOC)
    page 7 (blank)

Yet, the epub only contains the (chapter) files:
    te st_split_000.html
    te st_split_001.html
    te st_split_002.html

Rename the files to "test_split_000.html", "test_split_001.html", and "test_split_002.html", edit "content.opf" and "toc.ncx" to contain the new names, and okular stops displaying the first copy of the chapters.
On 2022-09-13 15:57, Albert Astals Cid wrote:
> https://bugs.kde.org/show_bug.cgi?id=458516
>
> --- Comment #3 from Albert Astals Cid <aacid@kde.org> ---
> I don't see any problem with the attached file.
>
> What is wrong with it?

The file itself is OK. The problem is in Okular's displaying of the file. I've added a screenshot of Okular illustrating the problem to the bug report [ https://bugsfiles.kde.org/attachment.cgi?id=152038 ].

Note that the TOC shows chapters 1-3 pointing to pages 4-6. This is the second copy of the content. The first copy of chapters 1-3 is on pages 1-3. Page through the document and you'll see it displayed as:
    page 1 Chapter 1
    page 2 Chapter 2
    page 3 Chapter 3
    page 4 Chapter 1 (pointed to by TOC)
    page 5 Chapter 2 (pointed to by TOC)
    page 6 Chapter 3 (pointed to by TOC)
    page 7 (blank)

Yet, the epub contains only the (chapter) files:
    te st_split_000.html
    te st_split_001.html
    te st_split_002.html

Rename the epub files to "test_split_000.html", "test_split_001.html", and "test_split_002.html", edit "content.opf" and "toc.ncx" to contain the new names, and Okular stops displaying the first copy of the chapters. Pages in the TOC are corrected as well.
You mean you get 7 pages when opening https://bugs.kde.org/attachment.cgi?id=151774 ?
Comment on attachment 152038
screenshot of okular mistakenly displaying a second copy of the epub content

Yes, Okular inserts the chapters a first time, then adds them again with the TOC pointing to the second copy. So, when the chapter filenames contain spaces, the output shown for just a 3-chapter document is:
    page 1 Chapter 1
    page 2 Chapter 2
    page 3 Chapter 3
    page 4 Chapter 1 (pointed to by TOC)
    page 5 Chapter 2 (pointed to by TOC)
    page 6 Chapter 3 (pointed to by TOC)
    page 7 (blank)
Dear Bug Submitter, This bug has been in NEEDSINFO status with no change for at least 15 days. Please provide the requested information as soon as possible and set the bug status as REPORTED. Due to regular bug tracker maintenance, if the bug is still in NEEDSINFO status with no change in 30 days the bug will be closed as RESOLVED > WORKSFORME due to lack of needed information. For more information about our bug triaging procedures please read the wiki located here: https://community.kde.org/Guidelines_and_HOWTOs/Bug_triaging If you have already provided the requested information, please mark the bug as REPORTED so that the KDE team knows that the bug is ready to be confirmed. Thank you for helping us make KDE software even better for everyone!
This needs to be re-checked; I can't reproduce what the reporter says.
(In reply to Albert Astals Cid from comment #9)
> This needs to be re-checked i can't reproduce what the reporter says

Please post a screenshot of what you do get when you display the EPUB file.