Bug 396797 - Online web source "KMyMoney Currency" does not support price pairs without decimal
Summary: Online web source "KMyMoney Currency" does not support price pairs without de...
Status: RESOLVED FIXED
Alias: None
Product: kmymoney
Classification: Applications
Component: importer (show other bugs)
Version: 4.8.2
Platform: Other Microsoft Windows
: NOR normal
Target Milestone: ---
Assignee: KMyMoney Devel Mailing List
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-07-23 17:42 UTC by Ralf Habacker
Modified: 2019-01-28 11:12 UTC (History)
1 user (show)

See Also:
Latest Commit:
Version Fixed In: 4.8.3, 5.0.2
Sentry Crash Report:


Attachments
fx-rate broken setup (26.89 KB, image/jpeg)
2018-07-23 17:42 UTC, Ralf Habacker
Details
part of the code I gor using cURL (638 bytes, text/plain)
2018-07-25 14:48 UTC, shizuma
Details
QRegExp test case (507 bytes, text/x-python)
2018-07-26 06:37 UTC, Ralf Habacker
Details
Qt5 test case (660 bytes, text/x-python)
2018-07-26 07:01 UTC, Ralf Habacker
Details
QRegExp test case (updated) (1.30 KB, text/x-python)
2018-07-27 10:55 UTC, Ralf Habacker
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ralf Habacker 2018-07-23 17:42:34 UTC
Created attachment 114087 [details]
fx-rate broken setup

From shizuma@teksavvy.com:

Fx-rate (Now renamed KmYMoney Currency default never made for me.  Not once, even with the patches applied.  Nor in 4.8.0, nor in 4.8.2.  Have to resort to alphavantage setup mentioned earlier in the yahoo finance thread by myself.
Comment 1 Ralf Habacker 2018-07-23 18:12:44 UTC
The problem is that BTC/CAD price, which returns 

1 Bitcoin =</span><br> 10154 Canadian Dollar

does not have decimals and the current Price regex expects a price with decimals as shown below

1 American Dollar =</span><br> 0.8542 Euro
Comment 2 Ralf Habacker 2018-07-23 18:15:34 UTC
Git commit 3cc0235040fcfa4566cbb262e46348a0721c012d by Ralf Habacker.
Committed on 23/07/2018 at 19:16.
Pushed by habacker into branch '4.8'.

Fix 'Online web source "KMyMoney Currency" does not support price pairs without decimal'

The price regex needs to be changed to '1[ a-zA-Z]+=</span><br */?> *(\d+|\d\.\d+)'
FIXED-IN:4.8.3

M  +1    -1    kmymoney/converter/webpricequote.cpp

https://commits.kde.org/kmymoney/3cc0235040fcfa4566cbb262e46348a0721c012d
Comment 3 Thomas Baumgart 2018-07-23 19:03:41 UTC
Git commit 18cf542243def84fd2acee88170a81262a3209e0 by Thomas Baumgart.
Committed on 23/07/2018 at 19:03.
Pushed by tbaumgart into branch 'master'.

Support online web source providing price pairs without decimal

Manually cherry-picked from 3cc0235040fcfa4566cbb262e46348a0721c012d and
adjusted

M  +1    -1    kmymoney/converter/webpricequote.cpp

https://commits.kde.org/kmymoney/18cf542243def84fd2acee88170a81262a3209e0
Comment 4 shizuma 2018-07-24 16:19:35 UTC
(In reply to Ralf Habacker from comment #1)
> The problem is that BTC/CAD price, which returns 
> 
> 1 Bitcoin =</span><br> 10154 Canadian Dollar
> 
> does not have decimals and the current Price regex expects a price with
> decimals as shown below
> 
> 1 American Dollar =</span><br> 0.8542 Euro

Sorry to pester you, but in my country, they does have decimals.  See attachment https://bugsfiles.kde.org/attachment.cgi?id=114064

The comma is the thousand seperator and the dot is the decimal separator.

Maybe fx-rate uses different locales.
Comment 5 shizuma 2018-07-24 16:35:05 UTC
Please ignore my comment #4.  I had a look at the source code of fx-rate and you're spot on.  The regex you use used required a decimal point whereas no one was provided for amounts greater than 10,000 CAD.

Please accept my sincere apologies.  I'll have a look at 2.8.2 on my 7-SP1 portable and confirm all is well by tomorrow.

Thanks again for solving this so quickly.
Comment 6 shizuma 2018-07-24 18:13:46 UTC
Sad news,

After testing on my 7 SP1 box with you new regex, I still get errors:

Fetching URL https://fx-rate.net/BTC/CAD...
Price found: '10838' (10,838)
Unable to update price for BTC (no price or no date)
Fetching URL https://fx-rate.net/ETH/CAD...
Price found: '625.12' (625.12)
Unable to update price for ETH (no price or no date)
Fetching URL https://fx-rate.net/FLC/CAD...
Unable to update price for FLC (no price or no date)
Fetching URL https://fx-rate.net/XAG/CAD...
Price found: '20.547' (20.547)
Unable to update price for XAG (no price or no date)
Fetching URL https://fx-rate.net/XAU/CAD...
Price found: '2273.5' (2,273.5)
Unable to update price for XAU (no price or no date)
Fetching URL https://fx-rate.net/XMR/CAD...
Unable to update price for XMR (no price or no date)


I've tested your regexes for view-source:https://fx-rate.net/BTC/CAD at https://regex101.com/ to be certain and they work great, both for dates or prices.

Anything else you want me to check?

Must be something else within the code.
Comment 7 shizuma 2018-07-24 19:11:58 UTC
Think I've fixed (for me) your price date issue:

price: '1[ a-zA-Z]+=</span><br */?> *([\d\.]+)\s'
date:  '>updated \d+:\d+:\d+\([A-Z]+\) (\d{1,2}/\d{2}/\d{4})'


Fetching URL https://fx-rate.net/BTC/CAD...
Price found: '10863' (10,863)
Date found: 'Tue Jul 24 2018'
Price for BTC updated (id E000003)
Fetching URL https://fx-rate.net/ETH/CAD...
Price found: '627.3' (627.3)
Date found: 'Tue Jul 24 2018'
Price for ETH updated (id E000005)
Comment 8 Ralf Habacker 2018-07-25 08:31:59 UTC
(In reply to shizuma from comment #6)
> Sad news,
> 
> After testing on my 7 SP1 box with you new regex, I still get errors:
> 
> Fetching URL https://fx-rate.net/BTC/CAD...
> Price found: '10838' (10,838)
> Unable to update price for BTC (no price or no date)
The date is the issue here

> Fetching URL https://fx-rate.net/ETH/CAD...
> Price found: '625.12' (625.12)
> Unable to update price for ETH (no price or no date)
also date regex issue

> Fetching URL https://fx-rate.net/FLC/CAD...
> Unable to update price for FLC (no price or no date)
What is FLC - it is unknown by https://en.wikipedia.org/wiki/ISO_4217

> Fetching URL https://fx-rate.net/XAG/CAD...
> Price found: '20.547' (20.547)
> Unable to update price for XAG (no price or no date)
also date issue

> Fetching URL https://fx-rate.net/XAU/CAD...
> Price found: '2273.5' (2,273.5)
> Unable to update price for XAU (no price or no date)
also date issue

> Fetching URL https://fx-rate.net/XMR/CAD...
> Unable to update price for XMR (no price or no date)
XMR seems not to be supported by this site

> Must be something else within the code.
Some issues seems to be locale specific, as you already suspected - with a de_DE locale the date is returned as

  10:20:44(CEST) 25/07/2018

which matches the date regex. Can you please 

Other issues are unsupported currencies.
Comment 9 Ralf Habacker 2018-07-25 08:44:55 UTC
(In reply to shizuma from comment #7)
> Think I've fixed (for me) your price date issue:
> 
> price: '1[ a-zA-Z]+=</span><br */?> *([\d\.]+)\s'
> date:  '>updated \d+:\d+:\d+\([A-Z]+\) (\d{1,2}/\d{2}/\d{4})'
>updated \d+:\d+:\d+\([A-Z]+\) (\d{1,2}/\d{2}/\d{4})
comparing to the original, which is
 updated\s\d+:\d+:\d+\(\w+\)\s+(\d{1,2}/\d{2}/\d{4})

\s is equal to ' ', so I only see a real difference in using [A-Z]+ instead of \w+, which limits the allowed characters 

  from http://doc.qt.io/archives/qt-4.8/qregexp.html
     \w - Matches a word character (QChar::isLetterOrNumber(), QChar::isMark(), or '_').

and ' ' instead of \s+, which limits the number of spaces to exactly one. 

> Fetching URL https://fx-rate.net/BTC/CAD...
> Price found: '10863' (10,863)
> Date found: 'Tue Jul 24 2018'
> Price for BTC updated (id E000003)
> Fetching URL https://fx-rate.net/ETH/CAD...
> Price found: '627.3' (627.3)
> Date found: 'Tue Jul 24 2018'
> Price for ETH updated (id E000005)

Can you provide the raw html code for the date e.g. the part between 'updated...</span>' from the page source you got ?
Comment 10 shizuma 2018-07-25 14:48:27 UTC
Created attachment 114116 [details]
part of the code I gor using cURL

FLC is Flashcoin, another cryptocurrency.  I know I have some having no entries.  I don't see any way to remove the accounts once they're created.  Or maybe they're securities or categories.  So I keep there just in case someone one day put data on the online currency converter.

Also could be a price issue.  Sometimes, depending on the browser, source page, developer tools i get '=</span><br>', other times i get '=</span><br />'.  That's why I've changed your price regex.
Comment 11 shizuma 2018-07-25 14:54:07 UTC
according to regex101, '\s' matches any whitespace character (equal to [\r\n\t\f\v ])

that's why I've replaced your '\s+' in the date regex.

Anyways it works for me.
Comment 12 shizuma 2018-07-25 15:05:49 UTC
Last problem encountered the '(\d+|\d\.\d+)' you've put in your price regex has the nasty effect of stripping everything after the decimal point, including the decimal point itself on currencies bearing decimals.  That's why I changed this part too.
Comment 13 Ralf Habacker 2018-07-26 06:37:10 UTC
Created attachment 114132 [details]
QRegExp test case
Comment 14 Ralf Habacker 2018-07-26 06:42:44 UTC
(In reply to Ralf Habacker from comment #13)
> Created attachment 114132 [details]
> QRegExp test case

 (In reply to shizuma from comment #12)
> Last problem encountered the '(\d+|\d\.\d+)' you've put in your price regex
> has the nasty effect of stripping everything after the decimal point,
> including the decimal point itself on currencies bearing decimals.  That's
> why I changed this part too.

https://regex101.com/ results differs from what kmymoney 4.8 uses. Running the appended QRegexp test case returns

  622.5
  10722

which was expected.
Comment 15 Ralf Habacker 2018-07-26 07:01:14 UTC
Created attachment 114133 [details]
Qt5 test case

This test case shows that https://regex101.com/ behaves more like QRegularExpression, which is used in the 5.x branch

QRegExp: 622.5
QRegularExpression: 622
QRegExp: 10722
QRegularExpression: 10722

This test case also show that QRegExp and QRegularExpression has subtle differences, which may have an impact on updating any kmymoney 4 based installation to 5.x without further inspection of the used regular expressions.
Comment 16 Ralf Habacker 2018-07-27 10:55:38 UTC
Created attachment 114159 [details]
QRegExp test case (updated)

> Also could be a price issue.  Sometimes, depending on the browser, source
> page, developer tools i get '=</span><br>', other times i get '=</span><br
> />'.  That's why I've changed your price regex.

python qregex.py
622.5  # <br>
10722  # <br>
622.5  # <br />
10722  # <br />
27/07/2018
27/07/2018

The price regex handles this also.
I'm going to apply an updated regex (including a fix for an optional thousand separator to be sure) to the source for being in the next release. See bug 396901 for an issue in this area.
Comment 17 Ralf Habacker 2018-07-27 11:28:05 UTC
Git commit 89a55fa65acbd1e7527f304d04cabd47d2615b66 by Ralf Habacker.
Committed on 27/07/2018 at 10:19.
Pushed by habacker into branch '4.8'.

Adjust price regex to be KMyMoney 5 and thousand separator save

M  +1    -1    kmymoney/converter/webpricequote.cpp

https://commits.kde.org/kmymoney/89a55fa65acbd1e7527f304d04cabd47d2615b66
Comment 18 Thomas Baumgart 2018-07-27 12:05:46 UTC
Git commit bfb370ae34691739bb2f1ef5408720d99eee2b70 by Thomas Baumgart.
Committed on 27/07/2018 at 12:06.
Pushed by tbaumgart into branch 'master'.

Adjust price regex to be KMyMoney 5 and thousand separator save
(cherry picked from commit 89a55fa65acbd1e7527f304d04cabd47d2615b66)

M  +1    -1    kmymoney/converter/webpricequote.cpp

https://commits.kde.org/kmymoney/bfb370ae34691739bb2f1ef5408720d99eee2b70
Comment 19 Ralf Habacker 2019-01-28 11:12:15 UTC
shizuma has reopened this ticket after patches were applied, but he hasn't explained why, so I close this ticket. If you still have a problem with this, please reopen it with a reason.