| Summary: | Handle missing colon | ||
|---|---|---|---|
| Product: | [Applications] KOpeningHours | Reporter: | HubMiner <cal030> |
| Component: | general | Assignee: | Volker Krause <vkrause> |
| Status: | RESOLVED FIXED | ||
| Severity: | normal | ||
| Priority: | NOR | ||
| Version First Reported In: | unspecified | ||
| Target Milestone: | --- | ||
| Platform: | Other | ||
| OS: | Other | ||
| Latest Commit: | https://invent.kde.org/libraries/kopeninghours/commit/024831289df89c2282637eea7bfe774286b20b22 | Version Fixed/Implemented In: | |
| Sentry Crash Report: | |||
Description
HubMiner
2021-11-26 20:16:30 UTC
This is indeed a somewhat common mistake in OSM data. It's unfortunately not easy to support (if possible at all), as 4 digit numbers match year numbers as well. I.e. "1900-2100" is a valid expression, but it's a year range, not a time range.

Thanks for taking a look at this, I agree this is a frequent mistake.
Random ideas:
- Is it possible to parse in stages, so that after likely year placeholders are parsed, the remaining 4 digit numbers are preferably read as hhmm?
- Is it possible to tighten the year definition:
  - 1900: drop or deprioritize 19## as a year
  - 20[0-3]#: recognize this as a year, drop or deprioritize anything after that.
Right, accepting 4 digit numbers that are a valid time and that fall outside of the expected year values should be possible and could probably cover the majority of these cases already. I'll give that a try.

A possibly relevant merge request was started @ https://invent.kde.org/libraries/kopeninghours/-/merge_requests/81

Git commit 024831289df89c2282637eea7bfe774286b20b22 by Volker Krause.
Committed on 05/12/2021 at 10:32.
Pushed by vkrause into branch 'release/21.12'.

Support 4 digit times to the extend possible

The problem with 4 digit times is the ambiguity with year numbers. To solve this we now assume anything that would be a valid time outside of the [2001-2099] range to be a time. That leaves all practically relevant years valid and still covers the vast majority of 4 digit times found in the OSM corpus. This is worth it given how common the 4 digit time mistake is in OSM data.

M +2 -2 autotests/evaluatetest.cpp
M +5 -1 autotests/parsertest.cpp
M +12 -6 src/lib/openinghourslexer.l
M +4 -0 src/lib/openinghoursparser.y

https://invent.kde.org/libraries/kopeninghours/commit/024831289df89c2282637eea7bfe774286b20b22
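
For readers who want to see the rule from the commit message in isolation, here is a minimal standalone sketch. It is not the code from the commit, which changes the flex/bison grammar in `openinghourslexer.l` and `openinghoursparser.y`. The `FourDigitKind` enum and the `classifyFourDigits` function are hypothetical names made up for illustration. Two points are assumptions rather than facts from this report: a "valid time" is taken to mean 00:00 through 24:00, and a 4 digit number that is not a valid time is assumed to keep being read as a year.

```cpp
#include <cassert>
#include <string>

// Hypothetical illustration only, not part of the KOpeningHours API.
enum class FourDigitKind { Time, Year, Invalid };

// Classify a token such as "1900" or "2021" following the rule described in
// the commit message: anything inside [2001, 2099] stays a year, anything
// else that forms a valid hhmm value is treated as a time.
FourDigitKind classifyFourDigits(const std::string &token)
{
    if (token.size() != 4) {
        return FourDigitKind::Invalid;
    }
    for (const char c : token) {
        if (c < '0' || c > '9') {
            return FourDigitKind::Invalid;
        }
    }

    const int value = std::stoi(token); // parsed as base 10, so "0930" gives 930
    if (value >= 2001 && value <= 2099) {
        return FourDigitKind::Year; // practically relevant years always win
    }

    const int hours = value / 100;
    const int minutes = value % 100;
    // Assumption: valid times are 00:00 to 23:59, plus 24:00 as an end time.
    const bool validTime = (hours <= 23 && minutes <= 59) || (hours == 24 && minutes == 0);
    if (validTime) {
        return FourDigitKind::Time;
    }

    // Assumption: everything else (e.g. "1975") is still read as a year.
    return FourDigitKind::Year;
}
```

Under these assumptions both halves of "1900-2100" classify as times, so the expression would be read as 19:00-21:00, while "2021" remains a year. The cost is that years outside 2001 to 2099 that also look like times (1900, 2000, 2100) can no longer be written as years, which the commit message accepts as a trade-off.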