Bug 445785 - Add correction for Russian statements
Status: RESOLVED FIXED
Alias: None
Product: KOpeningHours
Classification: Applications
Component: parser
Version: unspecified
Platform: Other Other
Importance: NOR normal
Target Milestone: ---
Assignee: Volker Krause
Reported: 2021-11-20 03:42 UTC by HubMiner
Modified: 2021-12-14 18:07 UTC
Description HubMiner 2021-11-20 03:42:43 UTC
I am taking a guess when reg-ex vs fixed strings are quoted, please correct as needed.


https://invent.kde.org/libraries/kopeninghours/-/blob/master/src/lib/openinghourslexer.l

откры.*    { yylval->state = State::Open;    return T_STATE; }
закры.*  { yylval->state = State::Closed;  return T_STATE; }
неизв.* { yylval->state = State::Unknown; return T_STATE; }

"рассвет"    { yylval->time = { Time::Dawn,    0, 0 }; return T_EVENT; }
"восход" { yylval->time = { Time::Sunrise, 0, 0 }; return T_EVENT; }
"закат"  { yylval->time = { Time::Sunset , 0, 0 }; return T_EVENT; }
"сумерки"    { yylval->time = { Time::Dusk,    0, 0 }; return T_EVENT; }

 /* Month names in Russian */
"Январь" { yylval->num = 1; return T_MONTH; }
"Февраль" { yylval->num = 2; return T_MONTH; }
"Март" { yylval->num = 3; return T_MONTH; }
"Апрель" { yylval->num = 4; return T_MONTH; }
"Май" { yylval->num = 5; return T_MONTH; }
"Июнь" { yylval->num = 6; return T_MONTH; }
"Июль" { yylval->num = 7; return T_MONTH; }
"Август" { yylval->num = 8; return T_MONTH; }
"Сентябрь" { yylval->num = 9; return T_MONTH; }
"Октябрь" { yylval->num = 10; return T_MONTH; }
"Ноябрь" { yylval->num = 11; return T_MONTH; }
"Декабрь" { yylval->num = 12; return T_MONTH; }


  /* Russian localized day names */
Понедельник|Пон|Пк  { yylval->num = 1; return T_WEEKDAY; }
Вторник|Вто|Вт { yylval->num = 2; return T_WEEKDAY; }
Среда|Сре|Ср { yylval->num = 3; return T_WEEKDAY; }
Четверг|Чет|Чт  { yylval->num = 4; return T_WEEKDAY; }
Пятница|Пят|Пя|Пт  { yylval->num = 5; return T_WEEKDAY; }
Суббота|Суб|Су|Сб { yylval->num = 6; return T_WEEKDAY; }
Воскресенье|Вос|Во { yylval->num = 7; return T_WEEKDAY; }
Comment 1 Bug Janitor Service 2021-11-20 10:27:02 UTC
A possibly relevant merge request was started @ https://invent.kde.org/libraries/kopeninghours/-/merge_requests/78
Comment 2 Volker Krause 2021-11-20 10:52:59 UTC
https://invent.kde.org/libraries/kopeninghours/-/merge_requests/78 adds the day and month names (and fixes long-form time ranges that so far were only partially covered).

The other strings for open/closed and the sun-based events don't seem to appear anywhere in the OSM corpus, or appear only in free-text contexts that we don't parse anyway. If they actually appear in the input data you are working with, I don't mind adding those as well, but I'd need actual samples to add unit tests for them.

Thank you!
Comment 3 HubMiner 2021-11-20 16:18:06 UTC
Thank you for the quick change!

I will look for actual examples. Since I work with Osmose, when I encounter such examples, can I correct them right away and send you a proposed unit test, or do you want to see them in live OSM data?
Comment 4 Volker Krause 2021-11-20 16:51:37 UTC
(In reply to cal030 from comment #3)
> I will look for actual examples. Since I work with Osmose, when I encounter
> such examples, can I correct them right away and send you a proposed unit
> test, or do you want to see them in live OSM data?

That's perfectly fine. I just need something to actually test this with; it doesn't need to be in OSM.
Comment 5 HubMiner 2021-11-23 05:14:34 UTC
Here are some ideas for unit tests:

IN:  рассвет - сумерки
OUT: dawn-dusk

IN:  от рассвета до сумерек
OUT: dawn-dusk

IN:  восход - закат
OUT: sunrise-sunset

IN:  от восхода дo закатa
OUT: sunrise-sunset

// Possibly tokens or junk words list:
// от = from
// дo = until

// Update the sun-based event rules to accommodate word inflection:
рассвет.*   { yylval->time = { Time::Dawn,    0, 0 }; return T_EVENT; }
сумер.?к.*  { yylval->time = { Time::Dusk,    0, 0 }; return T_EVENT; }
восход.*    { yylval->time = { Time::Sunrise, 0, 0 }; return T_EVENT; }
закат.*     { yylval->time = { Time::Sunset , 0, 0 }; return T_EVENT; }


IN:  Среда открыто; Пятница закрыто; Суббота неизвестно
OUT: We open; Fr closed; Sa unknown
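
The stem-based matching proposed above can be tried outside of Flex. The following is only a sketch under stated assumptions, not the library's lexer: the names `Stem`, `kStems`, `matchEvent` and `normalizeSunRange` are hypothetical, the input is assumed to be UTF-8, and words are split on whitespace. Unlike a Flex rule ending in `.*`, which would match everything up to the end of the line, the stem check here is confined to a single word, so filler words such as "от" or "до" are simply skipped.

```cpp
#include <optional>
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical stem table; the stems are taken from the rules proposed above.
// Assumes this source file and the input are both UTF-8 encoded.
struct Stem {
    std::string_view prefix;
    std::string_view event;
};

constexpr Stem kStems[] = {
    {"рассвет", "dawn"},
    {"сумер", "dusk"},    // simplified form of сумер.?к.*
    {"восход", "sunrise"},
    {"закат", "sunset"},
};

// Returns the event name if the word starts with a known stem (byte-wise comparison).
std::optional<std::string_view> matchEvent(std::string_view word) {
    for (const Stem &s : kStems) {
        if (word.substr(0, s.prefix.size()) == s.prefix) {
            return s.event;
        }
    }
    return std::nullopt;
}

// Splits the input on whitespace, keeps words matching a stem, joins them with '-'.
std::string normalizeSunRange(const std::string &input) {
    std::istringstream in(input);
    std::string word;
    std::string out;
    while (in >> word) {
        if (auto ev = matchEvent(word)) {
            if (!out.empty()) {
                out += '-';
            }
            out += *ev;
        }
    }
    return out;
}
```

With this sketch, `normalizeSunRange("от рассвета до сумерек")` yields `dawn-dusk` and `normalizeSunRange("восход - закат")` yields `sunrise-sunset`, matching the expected outputs listed above.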
Comment 6 HubMiner 2021-11-23 05:17:18 UTC
One more:

IN:  с восхода пo закат
OUT: sunrise-sunset

// c - from
// по - until
Comment 7 HubMiner 2021-12-02 18:57:17 UTC
If it's not too late, please add: Вых.*|Выходной = "day off"

IN: Вт Выходной
OUT: Tu off
Comment 8 David Faure 2021-12-04 18:59:31 UTC
Git commit e50c9edd827b0b5f8ee8763a6f4ff1f9ee075424 by David Faure, on behalf of Volker Krause.
Committed on 04/12/2021 at 18:52.
Pushed by dfaure into branch 'release/21.12'.

Add localized Russian month names

M  +1    -0    autotests/parsertest.cpp
M  +15   -0    src/lib/openinghourslexer.l

https://invent.kde.org/libraries/kopeninghours/commit/e50c9edd827b0b5f8ee8763a6f4ff1f9ee075424
Comment 9 David Faure 2021-12-04 18:59:39 UTC
Git commit 15a84ba07448f8d73d916bfbf8d4f58335461b3f by David Faure, on behalf of Volker Krause.
Committed on 04/12/2021 at 18:52.
Pushed by dfaure into branch 'release/21.12'.

Add Russian language support for states and sun-based events

This is a bit ugly in parts as Flex isn't using Unicode but working on
an 8bit input stream, and we need to match Cyrillic letters here.

M  +7    -0    autotests/parsertest.cpp
M  +12   -3    src/lib/openinghourslexer.l

https://invent.kde.org/libraries/kopeninghours/commit/15a84ba07448f8d73d916bfbf8d4f58335461b3f
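
The 8-bit limitation mentioned in the commit message can be illustrated outside of Flex. In UTF-8 every Cyrillic letter occupies two bytes, so a byte-oriented scanner applies `.` or a bracket expression to single bytes rather than to whole letters. The sketch below is not code from the library; `codePointCount` and `toByteEscapes` are hypothetical helpers, and the input is assumed to be valid UTF-8.

```cpp
#include <cstdio>
#include <string>
#include <string_view>

// Counts code points by skipping UTF-8 continuation bytes (bit pattern 10xxxxxx).
std::size_t codePointCount(std::string_view utf8) {
    std::size_t n = 0;
    for (unsigned char c : utf8) {
        if ((c & 0xC0) != 0x80) {
            ++n;
        }
    }
    return n;
}

// Renders every byte as a \xNN escape, i.e. the sequence a byte-oriented
// scanner actually has to match for the given text.
std::string toByteEscapes(std::string_view utf8) {
    std::string out;
    char buf[5];  // '\\', 'x', two hex digits, terminating NUL
    for (unsigned char c : utf8) {
        std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(c));
        out += buf;
    }
    return out;
}
```

For example, the stem "откры" is 5 letters but 10 bytes, and the lowercase letter "о" (U+043E) is the byte pair `\xd0\xbe` while the uppercase "О" (U+041E) is `\xd0\x9e`, which is why matching both cases of a Cyrillic letter has to be spelled out explicitly at the byte level.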
Comment 10 David Faure 2021-12-04 18:59:47 UTC
Git commit 714239588b16ce18934d55433be1c95ee7dca280 by David Faure, on behalf of Volker Krause.
Committed on 04/12/2021 at 18:52.
Pushed by dfaure into branch 'release/21.12'.

Parse localized Russian day names

M  +2    -0    autotests/parsertest.cpp
M  +9    -0    src/lib/openinghourslexer.l

https://invent.kde.org/libraries/kopeninghours/commit/714239588b16ce18934d55433be1c95ee7dca280
Comment 11 HubMiner 2021-12-13 17:45:03 UTC
As always, thank you for implementing this.
Do you have a rough idea when this will be running in Osmose?
Comment 12 Volker Krause 2021-12-14 18:07:00 UTC
(In reply to HubMiner from comment #11)
> Do you have a rough idea when this will be running in Osmose?

I'm not involved myself with the Osmose integration/deployment, so that would be a question for David (cc-ing him).