Bug 236567 - amarok wikipedia applet blocked no user agent
Status: RESOLVED NOT A BUG
Alias: None
Product: amarok
Classification: Applications
Component: Context View/Wikipedia
Version: 2.3.0.90
Platform: openSUSE Linux
Importance: NOR normal
Target Milestone: 2.3.1
Assignee: Amarok Developers
 
Reported: 2010-05-06 13:37 UTC by Paolo Crosetto
Modified: 2010-05-07 16:36 UTC

Description Paolo Crosetto 2010-05-06 13:37:15 UTC
Version:           2.3.1 beta (using KDE 4.4.2)
OS:                Linux
Installed from:    openSUSE RPMs

In Amarok 2.3.beta1 the Wikipedia applet does not load any information. If 'reload' is chosen, the applet displays the following text:

'Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.'

The same behaviour occurs with Amarok 2.3.0 and earlier versions.

In short: the Wikimedia project has applied some restrictions on queries, and it looks as if my Amarok does not comply. Note that all other applets (last.fm, Flickr, YouTube, etc.) are working flawlessly.
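
As an illustration of what the Wikimedia message asks for, here is a minimal Python sketch that builds (without sending) a request carrying a descriptive User-Agent header. The application name and contact address are hypothetical placeholders, and the helper name is made up for this example; it is not how Amarok itself issues the request. The URL pattern is the one seen in the capture later in this report.

```python
import urllib.parse
import urllib.request

# Hypothetical identification string; the name and address are placeholders.
EXAMPLE_USER_AGENT = "ExampleApplet/0.1 (contact: admin@example.org)"


def build_wikipedia_request(title, user_agent=EXAMPLE_USER_AGENT):
    """Build (but do not send) a GET request with an explicit User-Agent header."""
    url = (
        "http://en.wikipedia.org/w/index.php?title="
        + urllib.parse.quote(title)
        + "&useskin=monobook"
    )
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_wikipedia_request("Manu Chao (band)")
print(request.full_url)
# urllib stores header names capitalized, hence "User-agent" here.
print(request.get_header("User-agent"))
```

Passing the result to `urllib.request.urlopen` would actually send it; that step is left out so the sketch runs without network access.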

How to reproduce: play any song and look up its Wikipedia content. Nothing is shown.

Backtraces: for all details, have a look at this discussion:
http://forum.kde.org/viewtopic.php?f=115&t=87550

Thanks
Comment 1 Sven Krohlas 2010-05-07 10:24:51 UTC
I just looked at it using Wireshark: a user agent string is being sent:

Mozilla/5.0 (compatible; Konqueror/4.4; Linux) KHTML/4.4.3 (like Gecko) SUSE

So if Wikipedia doesn't get a user agent string from you, something along the way between Amarok and Wikipedia is very likely modifying your HTTP request. The proxy, maybe? A router? Your government?

Please use Wireshark to check whether your Amarok is also sending out a user agent string.

/me strongly tends to mark this as INVALID
Comment 2 Paolo Crosetto 2010-05-07 15:18:42 UTC
I tend to agree that this has to be my issue and not general. But...

Sven, I did as you asked. It's my first time with Wireshark, so I do not know if I got the right string, but this is the request sent to Wikipedia (it seems not to have any user-agent):

----------
No.     Time        Source                Destination           Protocol Info
   4768 39.244167   172.20.8.58           172.20.5.25           HTTP     GET http://en.wikipedia.org/w/index.php?title=Manu%20Chao%20%28band%29&useskin=monobook HTTP/1.1 

Frame 4768 (481 bytes on wire, 481 bytes captured)
Ethernet II, Src: Dell_82:ae:26 (00:15:c5:82:ae:26), Dst: Cisco_e9:52:80 (00:1b:2a:e9:52:80)
Internet Protocol, Src: 172.20.8.58 (172.20.8.58), Dst: 172.20.5.25 (172.20.5.25)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
    Total Length: 467
    Identification: 0x2b8b (11147)
    Flags: 0x04 (Don't Fragment)
    Fragment offset: 0
    Time to live: 64
    Protocol: TCP (0x06)
    Header checksum: 0xa81e [correct]
    Source: 172.20.8.58 (172.20.8.58)
    Destination: 172.20.5.25 (172.20.5.25)
Transmission Control Protocol, Src Port: 39799 (39799), Dst Port: ndl-aas (3128), Seq: 1, Ack: 1, Len: 415
    Source port: 39799 (39799)
    Destination port: ndl-aas (3128)
    Sequence number: 1    (relative sequence number)
    [Next sequence number: 416    (relative sequence number)]
    Acknowledgement number: 1    (relative ack number)
    Header length: 32 bytes
    Flags: 0x18 (PSH, ACK)
    Window size: 5888 (scaled)
    Checksum: 0x6741 [incorrect, should be 0x0978 (maybe caused by "TCP checksum offload"?)]
        [Good Checksum: False]
        [Bad Checksum: True]
    Options: (12 bytes)
        NOP
        NOP
        Timestamps: TSval 16889212, TSecr 3183302784
Hypertext Transfer Protocol
    GET http://en.wikipedia.org/w/index.php?title=Manu%20Chao%20%28band%29&useskin=monobook HTTP/1.1\r\n
        Request Method: GET
        Request URI: http://en.wikipedia.org/w/index.php?title=Manu%20Chao%20%28band%29&useskin=monobook
        Request Version: HTTP/1.1
    Host: en.wikipedia.org\r\n
    Proxy-Connection: Keep-Alive\r\n
    Pragma: no-cache\r\n
    Cache-control: no-cache\r\n
    Accept: text/html, image/jpeg;q=0.9, image/png;q=0.9, text/*;q=0.9, image/*;q=0.9, */*;q=0.8\r\n
    Accept-Encoding: x-gzip, x-deflate, gzip, deflate\r\n
    Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5\r\n
    Accept-Language: en-US, en\r\n
    \r\n
----------

Wireshark marks it black; and the checksum is, as you see, bad (and red-highlighted). I see no user-agent sent. Or should I look somewhere else?
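
The same check can be done mechanically on the request text. The following is a rough Python sketch, assuming the HTTP header lines from the capture above are pasted into a string; the function name `has_header` is made up for this example.

```python
# Header block copied from the captured request above.
CAPTURED_REQUEST = (
    "GET http://en.wikipedia.org/w/index.php"
    "?title=Manu%20Chao%20%28band%29&useskin=monobook HTTP/1.1\r\n"
    "Host: en.wikipedia.org\r\n"
    "Proxy-Connection: Keep-Alive\r\n"
    "Pragma: no-cache\r\n"
    "Cache-control: no-cache\r\n"
    "Accept: text/html, image/jpeg;q=0.9, image/png;q=0.9, "
    "text/*;q=0.9, image/*;q=0.9, */*;q=0.8\r\n"
    "Accept-Encoding: x-gzip, x-deflate, gzip, deflate\r\n"
    "Accept-Charset: utf-8, utf-8;q=0.5, *;q=0.5\r\n"
    "Accept-Language: en-US, en\r\n"
    "\r\n"
)


def has_header(raw_request, name):
    """Return True if the raw HTTP request contains the named header."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:          # skip the request line
        if not line:                # an empty line ends the header block
            break
        field, sep, _value = line.partition(":")
        # Header field names are case-insensitive.
        if sep and field.strip().lower() == name.lower():
            return True
    return False


print(has_header(CAPTURED_REQUEST, "User-Agent"))
print(has_header(CAPTURED_REQUEST, "Host"))
```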


The answer to this query comes 0.2 secs afterwards, and it tells me there is no user-agent:

----------
No.     Time        Source                Destination           Protocol Info
   4773 39.455038   172.20.5.25           172.20.8.58           HTTP     HTTP/1.0 403 Forbidden  (text/html)

Frame 4773 (183 bytes on wire, 183 bytes captured)
Ethernet II, Src: Cisco_e9:52:80 (00:1b:2a:e9:52:80), Dst: Dell_82:ae:26 (00:15:c5:82:ae:26)
Internet Protocol, Src: 172.20.5.25 (172.20.5.25), Dst: 172.20.8.58 (172.20.8.58)
    Version: 4
    Header length: 20 bytes
    Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00)
    Total Length: 169
    Identification: 0x0f9b (3995)
    Flags: 0x04 (Don't Fragment)
    Fragment offset: 0
    Time to live: 63
    Protocol: TCP (0x06)
    Header checksum: 0xc638 [correct]
    Source: 172.20.5.25 (172.20.5.25)
    Destination: 172.20.8.58 (172.20.8.58)
Transmission Control Protocol, Src Port: ndl-aas (3128), Dst Port: 39799 (39799), Seq: 688, Ack: 416, Len: 117
    Source port: ndl-aas (3128)
    Destination port: 39799 (39799)
    Sequence number: 688    (relative sequence number)
    [Next sequence number: 805    (relative sequence number)]
    Acknowledgement number: 416    (relative ack number)
    Header length: 32 bytes
    Flags: 0x18 (PSH, ACK)
    Window size: 6912 (scaled)
    Checksum: 0xa253 [correct]
        [Good Checksum: True]
        [Bad Checksum: False]
    Options: (12 bytes)
        NOP
        NOP
        Timestamps: TSval 3183302837, TSecr 16889212
    [SEQ/ACK analysis]
    TCP segment data (117 bytes)
[Reassembled TCP Segments (804 bytes): #4771(687), #4773(117)]
Hypertext Transfer Protocol
    HTTP/1.0 403 Forbidden\r\n
        Request Version: HTTP/1.0
        Response Code: 403
    Date: Fri, 07 May 2010 12:58:49 GMT\r\n
    Server: Apache\r\n
    Cache-Control: private, s-maxage=0, max-age=0, must-revalidate\r\n
    Content-Encoding: gzip\r\n
    Vary: Accept-Encoding\r\n
    Content-Length: 117
    Content-Type: text/html\r\n
    X-Cache: MISS from sq39.wikimedia.org\r\n
    X-Cache-Lookup: HIT from sq39.wikimedia.org:3128\r\n
    X-Cache: MISS from knsq27.knams.wikimedia.org\r\n
    X-Cache-Lookup: HIT from knsq27.knams.wikimedia.org:3128\r\n
    X-Cache: MISS from knsq5.knams.wikimedia.org\r\n
    X-Cache-Lookup: MISS from knsq5.knams.wikimedia.org:80\r\n
    X-Cache: MISS from proxy.luiss.it\r\n
    X-Cache-Lookup: MISS from proxy.luiss.it:3128\r\n
    Via: 1.0 proxy-ass.luiss.it:3128 (squid/2.7.STABLE3)\r\n
    Connection: close\r\n
    \r\n
    Content-encoded entity body (gzip): 117 bytes -> 120 bytes
Line-based text data: text/html
--------

Final note: requests by Amarok (to Flickr, last.fm...) never show a user-agent; all of them have the 'bad checksum' problem; all but the Wikipedia call send a cookie (which differs between calls, e.g. last.fm vs. Flickr); and all of them except the Wikipedia one work.
Note: this Wireshark capture was taken behind a proxy, but the problem is exactly the same when _not_ behind a proxy (I am at work now and proxied, but if you wish I could run Wireshark from home later).
Comment 3 Sven Krohlas 2010-05-07 15:32:07 UTC
Interesting.

The only difference I can see now is that my systems are already at KDE 4.4.3 and yours is at 4.4.2. We even both use openSUSE.

Amarok uses KIO to get the website. Maybe you configured KDE/Konqueror to send an empty user agent string? Please check that.

But still that does not explain the wrong checksums.
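
On the checksums: in the capture, only the outgoing TCP checksum is flagged, and Wireshark itself hints at "TCP checksum offload", where the network card fills in the checksum after the capture point. That is a plausible reading, not something confirmed here. The TCP checksum cannot be recomputed from the pasted text because the raw segment bytes are missing, but the IPv4 header checksums, which Wireshark reports as correct, can be. The following is a minimal Python sketch of the RFC 1071 Internet checksum applied to the header fields shown in the capture; the helper names are made up for this example.

```python
import socket
import struct


def internet_checksum(data):
    """RFC 1071 ones'-complement checksum over 16-bit big-endian words."""
    if len(data) % 2:
        data += b"\x00"             # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:              # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header(total_length, identification, ttl, src, dst):
    """Build a 20-byte IPv4 header (no options, DF set, TCP), checksum zeroed."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                       # version 4, header length 5 x 32-bit words
        0x00,                       # Differentiated Services Field
        total_length,
        identification,
        0x4000,                     # Don't Fragment set, fragment offset 0
        ttl,
        6,                          # protocol: TCP
        0,                          # checksum field is zero while computing
        socket.inet_aton(src),
        socket.inet_aton(dst),
    )


# Field values taken from frames 4768 (request) and 4773 (response).
request_header = ipv4_header(467, 0x2B8B, 64, "172.20.8.58", "172.20.5.25")
response_header = ipv4_header(169, 0x0F9B, 63, "172.20.5.25", "172.20.8.58")
print(hex(internet_checksum(request_header)))   # 0xa81e, as in the capture
print(hex(internet_checksum(response_header)))  # 0xc638, as in the capture
```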
Comment 4 Paolo Crosetto 2010-05-07 16:35:30 UTC
Sven,

you were right: my Konqueror was set NOT to send a user agent. Thank you, I would _never_ have figured it out without your hint.

So it is no bug. Sorry for having raised it, and thanks for your kind help.

P

PS: about wrong checksums: will look into it, but it works, so if it ain't broke, don't fix it, right? :)