Bug 198443

Summary: JJ: fix podcast episodes that have invalid pubdate
Product: [Applications] amarok Reporter: andreaswuest
Component: general Assignee: Amarok Developers <amarok-bugs-dist>
Status: RESOLVED FIXED    
Severity: normal CC: bart.cerneels, rasasi78, saigkill
Priority: NOR    
Version: 2.2.0   
Target Milestone: ---   
Platform: unspecified   
OS: Linux   
Attachments: screenshot of the problem
image showing the problem

Description andreaswuest 2009-06-30 20:25:36 UTC
Version:           2.1.1 (using 4.2.4 (KDE 4.2.4), Kubuntu packages)
Compiler:          cc
OS:                Linux (x86_64) release 2.6.28-11-generic

hi,

some podcasts are (again) not ordered by date. the latest podcast (so the first in 
the list) should be "Hoffnung für die Zukunft", however, it is displayed in the middle
of the other podcasts. see the attached screenshot.

tell me, if you need more information.

podcast feed is: http://www.swr3.de/rss/Das_20SWR3-Topthema/-/id=477150/did=447000/b4xq4r/index.xml

cheers,
andy
Comment 1 andreaswuest 2009-06-30 20:26:16 UTC
Created attachment 34947 [details]
screenshot of the problem
Comment 2 Myriam Schweingruber 2009-07-01 18:08:25 UTC
Did you check the stream date? You can look at the stream details in the playlist: right click, edit track details.

I checked all my podcasts and those are in the correct order, using 2.2-SVN
Comment 3 andreaswuest 2009-07-06 18:31:23 UTC
here is the required information:

"Hoffnung für die Zukunft" :
http://www.swr3.de/mpg/themen/topthemen/20090630/284115.6444m.mp3

"Das Vorbild ist Münchhausen" :
http://www.swr3.de/mpg/themen/topthemen/20090629/283810.6444m.mp3

so the ordering is obviously not ok, just like i reported. i just updated
the podcasts; the newest podcast should be 
"Blitz, Hagel und Ungewitter - heute mehr als früher?"

however it is inserted in the middle as well (see the new screenshot)!
Comment 4 andreaswuest 2009-07-06 18:32:12 UTC
Created attachment 35098 [details]
image showing the problem
Comment 5 andreaswuest 2009-07-06 18:44:05 UTC
> I checked all my podcasts and those are in the correct order
yes, when i add a podcast that is generally the case. however, after updating
the podcasts a few times (over several days), the order gets broken for some
reason.
Comment 6 andreaswuest 2009-07-06 21:00:23 UTC
so i finally checked out the content of the database (sql is from the
bug i reported earlier):

i executed:
Amarok.Collection.query("SELECT e.title, e.pubdate FROM podcastepisodes AS e LEFT JOIN podcastchannels AS c ON e.channel=c.id WHERE c.url='http://www.swr3.de/rss/Das_20SWR3-Topthema/-/id=477150/did=447000/b4xq4r/index.xml';");

and this is the result i got. the e.pubdate contains very strange dates!!
just have a look

Der Bildungsstreik - Studentenprotest gegen verkürzte Studienzeiten,2019-02-02T13:03:04,
Google Street View - Straßenkarten oder peinliche Bilder im Internet,2019-02-02T13:03:04,
Piraten im Cockpit - Tarifstreit in der Formel I,2054-12-08T11:59:04,
Der Widerstand gegen Klonfleisch wächst,2071-04-27T04:54:48,
Drei Millionen gefälschte Wahlzettel - Irans Wächterrat stellt Unregelmäßigkeit fest,2071-04-27T04:54:48,
Das Kabinett beschließt Schulden wie noch nie - die Rechnung werden wir bezahlen müssen,1973-01-16T01:23:12,
Deutsche Geiseln im Jemen sollen tot sein,2019-02-02T13:03:04,
Unfassbar viel Geld - schwindelerregende Transfersummen im internationalen Fußball,2019-02-02T13:03:04,
Vernetzt auf Teherans Strassen - wie das Internet die Macht der Zensur bricht,2019-02-02T13:03:04,
"Das Vorbild ist Münchhausen",2077-12-31T15:13:12,
Die Welt trauert um Michael Jackson,2077-12-31T15:13:12,
Die sensationelle Eiszeitflöte und das moderne Marketing,2077-12-31T15:13:12,
"Hoffnung für die Zukunft",2022-01-10T06:06:56,
Blitz, Hagel und Ungewitter - heute mehr als früher?,2032-05-23T11:01:36,
Ein paar Fakten und Fragezeichen",2032-05-23T11:01:36,
"Silberstreifen und optische Täuschungen",2032-05-23T11:01:36,
Topthema "Das Ende der Abzocke",2032-05-23T11:01:36
Comment 7 Bart Cerneels 2009-07-11 15:58:42 UTC
I think you used an amarok 2.1 beta before that had a bug related to the pubDate parsing. The bogus pubdate values are probably because of that.

Someone could write a simple script that will clear the pubdate values where they are bogus. If they are empty the episodes will be in the order they are in the feed (reverse chronological).
Here is a hint for the SQL code: "UPDATE podcastepisodes SET pubdate='' WHERE (pubdate > today || pubdate < 2003);"

Make sure the user gives his OK (by way of a dialog) before doing this, perhaps write an undo .sql script automatically.

I'm marking this a junior job.
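
The cleanup condition in the SQL hint above can be sketched as a small predicate. This is only an illustration, not Amarok code: `isBogusPubDate` is a hypothetical name, and it assumes pubdate is stored as "YYYY-MM-DDTHH:MM:SS" text, as the query output in comment 6 suggests. Timestamps in that fixed-width form sort lexicographically, so plain string comparison is enough.

```cpp
#include <string>

// Hypothetical helper, not part of Amarok.
// Assumption: both arguments use the fixed-width "YYYY-MM-DDTHH:MM:SS" form,
// so comparing the strings compares the dates.
bool isBogusPubDate(const std::string &pubDate, const std::string &now)
{
    if (pubDate.empty())
        return false;  // already cleared: the feed order applies

    // Lower bound taken from the hint ("pubdate < 2003").
    const std::string earliest = "2003-01-01T00:00:00";
    return pubDate < earliest || pubDate > now;
}
```

With the date of this comment as `now`, values such as 1973-01-16T01:23:12 and 2077-12-31T15:13:12 from comment 6 would be flagged, while an empty pubdate is left alone.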
Comment 8 andreaswuest 2009-07-13 20:23:50 UTC
Hi,

i am currently using the version provided by kubuntu, which says it is a 2.1.1.
i removed the swr3 podcast and added it again. then i executed the sql query
from two posts above and i got the following result:

Der Trümmergipfel von L'Aquila - Berlusconis bizarre Betteltour,1983-11-01T14:06:56,
Urlaub in der Wirtschaftskrise - Deutschland profitiert,1983-11-01T14:06:56,
Der Strom aus der Wüste - welches Potential steckt in der Solarenergi,1983-11-01T14:06:56,
SWR3-Topthema 09.07.2009,1983-11-01T14:06:56

pubdate is now some date in 1983! so to me it looks like the date parser is
broken. so imho this is not a junior job, but a real issue.

is there a testcase to test the parser? i'd really appreciate a
testcase since i have reported this kind of problem quite a few times, and it never
really got fixed (sorry, i am slightly frustrated atm).
unfortunately i am a java guy, otherwise i would provide a testcase myself.
Comment 9 Raúl 2009-10-06 17:26:47 UTC
Hello:

Looks like this problem is still happening in 2.2.0. I'm attaching the current XML for the podcast whose URL is: http://www.rtve.es/podcast/radio-3/el-vuelo-del-fenix/SELVUEL.xml

Also take a look at the screenshot, where you can see that the newest episodes are last in the list. Sometimes new episodes appear at the beginning of the list, but this is not what usually happens.

Regards,
Comment 10 andreaswuest 2009-10-31 10:04:21 UTC
i just tried version 2.2.0 and the issue still persists.
i added the podcast 
http://www.networkworld.com/podcasts/openmic/index.xml

and the following query returns the following result:

query:
Amarok.Collection.query("SELECT e.title, e.pubdate FROM podcastepisodes AS e LEFT JOIN podcastchannels AS c ON e.channel=c.id WHERE c.url='http://www.networkworld.com/podcasts/openmic/index.xml';");

result:
In the Linux Driver Seat with Kernel Developer Greg Kroah-Hartman,2106-02-07T07:28:15,
From the iPhone to the Cloud: What's new with the Mono Project,2106-02-07T07:28:15

the pubdate is definitely not 2106-02-07T07:28:15 for either episode!
Comment 11 Mark Kretschmann 2009-10-31 12:37:23 UTC
Got a patch for us? :)
Comment 12 andreaswuest 2009-10-31 14:14:05 UTC
unfortunately my cpp-days are history :-( 
but i have some more information. i had a look at the source and the problem
seems to be the parsePubDate Method in 
http://gitorious.org/amarok/amarok/blobs/master/src/podcasts/PodcastReader.cpp

what strptime method is used here ?
strptime( datestring.toAscii().data(), "%a, %d %b %Y %H:%M:%S %z", &tmp );

i could only find:
http://www.opengroup.org/onlinepubs/009695399/functions/strptime.html
which however does not define a conversion specification for "%z", maybe
this is the problem?!

isn't it possible to write a testcase for the parsePubDate method with
an input of 'Fri, 16 Oct 2009 00:00:00 -0400', since the result is known!
should be rather easy, or am i mistaken?
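
A minimal sketch of what such a testcase could exercise, assuming a parser that depends on neither the locale nor the "%z" extension. `parseRfc822Date` and `daysFromCivil` are hypothetical names, not Amarok's parsePubDate, and the sketch only handles numeric zone offsets such as "-0400" (not names like "GMT").

```cpp
#include <cstdio>
#include <cstring>

// Days between 1970-01-01 and the given Gregorian date (month is 1..12).
static long long daysFromCivil(long long year, int month, int day)
{
    year -= (month <= 2);  // treat Jan/Feb as the end of the previous year
    const long long era = (year >= 0 ? year : year - 399) / 400;
    const long long yearOfEra = year - era * 400;
    const long long dayOfYear =
        (153 * (month > 2 ? month - 3 : month + 9) + 2) / 5 + day - 1;
    const long long dayOfEra =
        yearOfEra * 365 + yearOfEra / 4 - yearOfEra / 100 + dayOfYear;
    return era * 146097 + dayOfEra - 719468;
}

// Hypothetical locale-independent parser for dates like
// "Fri, 16 Oct 2009 00:00:00 -0400". On success, stores seconds since the
// Unix epoch (UTC) in *epochSeconds and returns true.
bool parseRfc822Date(const char *text, long long *epochSeconds)
{
    static const char *const months[12] = {
        "Jan", "Feb", "Mar", "Apr", "May", "Jun",
        "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};

    // The weekday is redundant, so skip everything up to the comma if present.
    const char *comma = std::strchr(text, ',');
    if (comma)
        text = comma + 1;

    int day, year, hour, minute, second, zone;
    char monthName[4] = "";
    char sign = 0;
    if (std::sscanf(text, " %d %3s %d %d:%d:%d %c%4d", &day, monthName, &year,
                    &hour, &minute, &second, &sign, &zone) != 8)
        return false;
    if (sign != '+' && sign != '-')
        return false;

    // English month abbreviations are matched explicitly, never via the locale.
    int month = 0;
    for (int i = 0; i < 12; ++i)
        if (std::strcmp(monthName, months[i]) == 0)
            month = i + 1;
    if (month == 0)
        return false;

    const long long local = daysFromCivil(year, month, day) * 86400LL +
                            hour * 3600LL + minute * 60LL + second;
    const long long offset = (zone / 100) * 3600LL + (zone % 100) * 60LL;
    // A zone of -0400 means local time is 4 hours behind UTC.
    *epochSeconds = (sign == '-') ? local + offset : local - offset;
    return true;
}
```

For the input above, the known result is 2009-10-16T04:00:00 UTC, i.e. 1255665600 seconds since the epoch, so a testcase only has to compare against that constant and check that unparseable input is rejected.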
Comment 13 Bart Cerneels 2009-10-31 16:16:02 UTC
%z in strptime is a gnu extension. Since you are using kubuntu, the same as me, and I can not reproduce this I have no idea what is going on.

Perhaps this is a database issue? Might also be a locale problem? I'm using locale "en-US" I guess yours is "de-DE"?

Mark: another of those hard to track down issues. Can you please help.
Comment 14 andreaswuest 2009-10-31 16:37:06 UTC
is it possible to add some debug messages for the next version ?
i guess it should be enough to add the following line 
in the method parsePubDate before the return: 
 
debug() << "date string : " << datestring << ", parsed date: " << pubDate;

maybe the used locale should also be added!

this would at least show me if the date was really parsed correctly.
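
A rough sketch of such a diagnostic, assuming a POSIX system where `strptime` is available. `describePubDateParse` is a hypothetical name and plain string output stands in for Amarok's `debug()` stream. It also checks the return value of `strptime`, since the struct contents are not reliable after a failed parse. As a side note, 2106-02-07T06:28:15 UTC is exactly 0xFFFFFFFF seconds after the epoch, so the 2106-02-07T07:28:15 values in comment 10 would be consistent with a failed conversion returning -1, shown in a UTC+1 zone; that is a guess, not a confirmed diagnosis.

```cpp
#include <clocale>
#include <ctime>
#include <sstream>
#include <string>
#include <time.h>  // strptime (POSIX, not standard C++)

// Hypothetical helper: parse the date part of an RFC 822 string and report
// the input, the active LC_TIME locale and the outcome in one line.
std::string describePubDateParse(const std::string &dateString)
{
    std::tm parsed = std::tm();  // zero-initialised, so no stack garbage
    const char *rest = strptime(dateString.c_str(),
                                "%a, %d %b %Y %H:%M:%S", &parsed);

    std::ostringstream out;
    out << "date string: \"" << dateString << "\""
        << ", LC_TIME: " << std::setlocale(LC_TIME, NULL) << ", ";

    if (rest == NULL) {
        // strptime could not match the format; do not trust 'parsed'.
        out << "parse FAILED";
    } else {
        char buffer[32];
        std::strftime(buffer, sizeof buffer, "%Y-%m-%dT%H:%M:%S", &parsed);
        out << "parsed date: " << buffer
            << ", unparsed rest: \"" << rest << "\"";
    }
    return out.str();
}
```

The zone offset is deliberately left in the "unparsed rest" so the sketch does not rely on "%z"; a real fix would still have to apply it.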
Comment 15 Myriam Schweingruber 2009-11-06 14:40:27 UTC
Changing version.
Comment 16 Bart Cerneels 2009-11-10 11:14:47 UTC
v2.2.0-695-gf4b6bf5 is using KDateTime, so 2.2.1 might fix your problem.
Comment 17 Myriam Schweingruber 2009-11-16 14:13:29 UTC
Andreas, can you reproduce this with Amarok 2.2.1?
Comment 18 andreaswuest 2009-11-16 18:46:50 UTC
correct me if i am wrong, but 2.2.1 has been tagged but not yet officially released. i cannot test unless there is a version 2.2.1 for kubuntu, sorry. 
i guess you have to be a little more patient.
Comment 19 Myriam Schweingruber 2009-11-16 19:07:04 UTC
Andreas, since it's due for release today, I don't think I will have to wait very long :)
Comment 20 andreaswuest 2009-11-17 19:03:46 UTC
finally this issue has been fixed. thanks.
Comment 21 Bart Cerneels 2009-11-22 15:22:26 UTC
*** Bug 215560 has been marked as a duplicate of this bug. ***