Bug 407086

Summary: Scam detection is too sensitive for URLs that trivially differ and are not a scam
Product: [Applications] kmail2 Reporter: Jonathan Marten <jjm>
Component: general Assignee: kdepim bugs <kdepim-bugs>
Status: REOPENED ---    
Severity: normal CC: dev-kde, montel
Priority: NOR    
Version: Git (master)   
Target Milestone: ---   
Platform: Other   
OS: Linux   
Latest Commit: Version Fixed In: 5.11.1
Attachments: Scam detection message
Screenshot

Description Jonathan Marten 2019-04-30 11:56:47 UTC
Created attachment 119742 [details]
Scam detection message

(This bug really belongs to messagelib, but there doesn't seem to be a Bugzilla category for that. Please reassign if necessary.)

SUMMARY

The scam detection checks for URLs whose displayed text differs from their actual destination.  This is good, but the check seems to be very sensitive and flags URLs that differ only trivially (by redundant percent encoding or a trailing slash).  For example, see the attached message triggered by an Amazon confirmation email - I have partly redacted the URLs to remove personal information, but they were identical before I did so.  The only difference is the %5C <-> / encoding near the end.

Possibly the display and destination URLs need to be decoded and canonicalised (with QUrl::StripTrailingSlash and QUrl::NormalizePathSegments) before comparison.
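
To illustrate the kind of canonicalisation suggested above, here is a minimal sketch in Python using only the standard library. It is not the messagelib implementation and does not use QUrl; the function names canonicalise and trivially_equal and the example.org URLs are hypothetical, chosen only for illustration. It assumes that decoding percent escapes in the path, collapsing "." and ".." segments, and dropping a trailing slash are all acceptable before the displayed and destination URLs are compared.

```python
import posixpath
from urllib.parse import unquote, urlsplit, urlunsplit


def canonicalise(url: str) -> str:
    """Return a normalised form of url for comparison purposes only."""
    parts = urlsplit(url.strip())
    # Decode redundant percent escapes in the path (assumption: safe for comparison).
    path = unquote(parts.path)
    if path:
        # normpath collapses "." and ".." segments and drops a trailing slash.
        path = posixpath.normpath(path)
    if path in ("/", "."):
        path = ""
    # Scheme and host are case-insensitive, so compare them in lower case.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, parts.query, parts.fragment)
    )


def trivially_equal(displayed: str, target: str) -> bool:
    """True when the two URLs differ only in ways the sketch treats as trivial."""
    return canonicalise(displayed) == canonicalise(target)


if __name__ == "__main__":
    print(trivially_equal("https://example.org/repo/", "https://example.org/repo"))
    print(trivially_equal("https://example.org/login", "https://evil.example/login"))
```

With such a comparison, a link would only be flagged when the canonical forms differ, for example when the host differs. Decoding every escape is a simplification: an encoded slash (%2F) inside a path segment is not strictly equivalent to a literal slash, so a real check might decode only unreserved characters.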
Comment 1 Laurent Montel 2019-04-30 21:07:01 UTC
Git commit a21c98334a709bd925faff75e7787a639d300fac by Laurent Montel.
Committed on 30/04/2019 at 21:06.
Pushed by mlaurent into branch 'Applications/19.04'.

Fix Bug 407086 - Scam detection is too sensitive for URLs that trivially differ and are not a scam

FIXED-IN: 5.11.1

M  +8    -0    messageviewer/src/scamdetection/autotests/scamdetectionwebenginetest.cpp
M  +15   -8    messageviewer/src/scamdetection/scamdetectionwebengine.cpp

https://commits.kde.org/messagelib/a21c98334a709bd925faff75e7787a639d300fac
Comment 2 Frank Steinmetzger 2022-11-30 14:56:10 UTC
Created attachment 154169 [details]
Screenshot

Unfortunately, the trailing slash issue has not been resolved. I was about to file a new bug, but the list of possible duplicates showed me this one.

I was just reading a plain-text mail which contains a list of URLs of Gentoo package repositories. For some reason, KMail re-interprets one (and only one) of those links by removing its trailing slash.
Comment 3 Frank Steinmetzger 2022-11-30 14:57:40 UTC
(In reply to Frank Steinmetzger from comment #2)

> I was just reading a plain-text mail which contains a list of URLs of Gentoo
> package repositories. For some reason, KMail re-interprets one (and only
> one) of those links by removing its trailing slash.

Sorry, forgot to mention:
Operating System: Arch Linux
KDE Plasma Version: 5.26.3
KDE Frameworks Version: 5.100.0
Qt Version: 5.15.7
Kernel Version: 6.0.9-arch1-1 (64-bit)
Graphics Platform: X11