I am guessing at which patterns should be regexes and which should be quoted fixed strings, so please correct as needed.

https://invent.kde.org/libraries/kopeninghours/-/blob/master/src/lib/openinghourslexer.l

откры.* { yylval->state = State::Open; return T_STATE; }
закры.* { yylval->state = State::Closed; return T_STATE; }
неизв.* { yylval->state = State::Unknown; return T_STATE; }

"рассвет" { yylval->time = { Time::Dawn, 0, 0 }; return T_EVENT; }
"восход" { yylval->time = { Time::Sunrise, 0, 0 }; return T_EVENT; }
"закат" { yylval->time = { Time::Sunset , 0, 0 }; return T_EVENT; }
"сумерки" { yylval->time = { Time::Dusk, 0, 0 }; return T_EVENT; }

/* Month names in Russian */
"Январь" { yylval->num = 1; return T_MONTH; }
"Февраль" { yylval->num = 2; return T_MONTH; }
"Март" { yylval->num = 3; return T_MONTH; }
"Апрель" { yylval->num = 4; return T_MONTH; }
"Май" { yylval->num = 5; return T_MONTH; }
"Июнь" { yylval->num = 6; return T_MONTH; }
"Июль" { yylval->num = 7; return T_MONTH; }
"Август" { yylval->num = 8; return T_MONTH; }
"Сентябрь" { yylval->num = 9; return T_MONTH; }
"Октябрь" { yylval->num = 10; return T_MONTH; }
"Ноябрь" { yylval->num = 11; return T_MONTH; }
"Декабрь" { yylval->num = 12; return T_MONTH; }

/* Russian localized day names */
Понедельник|Пон|Пк { yylval->num = 1; return T_WEEKDAY; }
Вторник|Вто|Вт { yylval->num = 2; return T_WEEKDAY; }
Среда|Сре|Ср { yylval->num = 3; return T_WEEKDAY; }
Четверг|Чет|Чт { yylval->num = 4; return T_WEEKDAY; }
Пятница|Пят|Пя|Пт { yylval->num = 5; return T_WEEKDAY; }
Суббота|Суб|Су|Сб { yylval->num = 6; return T_WEEKDAY; }
Воскресенье|Вос|Во { yylval->num = 7; return T_WEEKDAY; }
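For cross-checking the weekday rules, here is a minimal Python sketch of the same mapping as a lookup table. It assumes weekdays are numbered 1 (Monday) through 7 (Sunday), continuing from Понедельник = 1 in the proposal; the names `WEEKDAYS`, `TOKEN_TO_NUM` and `weekday_number` are made up for illustration and are not part of kopeninghours.

```python
# Hypothetical lookup table mirroring the proposed T_WEEKDAY rules.
# Assumption: weekdays are numbered 1 (Monday) to 7 (Sunday).
WEEKDAYS = {
    1: ["Понедельник", "Пон", "Пк"],
    2: ["Вторник", "Вто", "Вт"],
    3: ["Среда", "Сре", "Ср"],
    4: ["Четверг", "Чет", "Чт"],
    5: ["Пятница", "Пят", "Пя", "Пт"],
    6: ["Суббота", "Суб", "Су", "Сб"],
    7: ["Воскресенье", "Вос", "Во"],
}

# Invert the table into token -> weekday number for direct lookup.
TOKEN_TO_NUM = {tok: num for num, toks in WEEKDAYS.items() for tok in toks}


def weekday_number(token):
    """Return the weekday number for a Russian day token, or None if unknown."""
    return TOKEN_TO_NUM.get(token)


if __name__ == "__main__":
    for tok in ("Понедельник", "Вт", "Сб", "Воскресенье"):
        print(tok, weekday_number(tok))
```

A table like this makes it easy to spot two day names that accidentally share a number, since every key appears exactly once.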
A possibly relevant merge request was started @ https://invent.kde.org/libraries/kopeninghours/-/merge_requests/78
https://invent.kde.org/libraries/kopeninghours/-/merge_requests/78 adds the day and month names (and fixes long-form time ranges, which so far were only partially covered). The other strings, for open/closed and the sun-based events, don't seem to appear anywhere in the OSM corpus, or only appear in a free-text context that we don't parse anyway. If they actually appear in the input data you are working with, I don't mind adding those as well, but I'd need actual samples in order to add unit tests for them. Thank you!
Thank you for the quick change! I will look for actual examples. Since I work with Osmose, when I encounter such examples can I correct them right away and send you a proposed unit test, or do you want to see them in live OSM data?
(In reply to cal030 from comment #3)
> I will look for actual examples. As I work with Osmose and when I encounter
> those examples, can I correct them right away and send you a proposed UT, or
> do you want to see them in live OSM data?

That's perfectly fine. I just need something to actually test this with; it doesn't need to be in OSM.
Here are some ideas for unit tests:

IN: рассвет - сумерки
OUT: dawn-dusk

IN: от рассвета до сумерек
OUT: dawn-dusk

IN: восход - закат
OUT: sunrise-sunset

IN: от восхода до заката
OUT: sunrise-sunset

// Possible tokens or junk-word list:
// от = from
// до = until

// Update to the light-based events to allow for noun declension:
рассвет.* { yylval->time = { Time::Dawn, 0, 0 }; return T_EVENT; }
сумер.?к.* { yylval->time = { Time::Dusk, 0, 0 }; return T_EVENT; }
восход.* { yylval->time = { Time::Sunrise, 0, 0 }; return T_EVENT; }
закат.* { yylval->time = { Time::Sunset , 0, 0 }; return T_EVENT; }

IN: Среда открыто; Пятница закрыто; Суббота неизвестно
OUT: We open; Fr closed; Sa unknown
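One thing worth checking with the `.*` suffixes: in Flex, as in most regex engines, `.` matches anything except a newline and the longest match wins, so a pattern like `откры.*` would also consume whatever follows the word on the same line. The Python sketch below only emulates that behaviour with the `re` module; it is not Flex, and the word-bounded alternative using a Cyrillic letter range is an assumption for illustration, not what kopeninghours does. In Flex itself such a class would have to be spelled out at the byte level.

```python
import re

text = "Среда открыто; Пятница закрыто"

# Emulates the proposed rule: '.' matches anything but a newline, greedily.
greedy = re.compile(r"откры.*")

# Illustrative alternative (an assumption): only consume further
# lowercase Cyrillic letters, so the match stops at the end of the word.
bounded = re.compile(r"откры[а-яё]*")

start = text.index("откры")
greedy_match = greedy.match(text, start).group()
bounded_match = bounded.match(text, start).group()

print(repr(greedy_match))   # 'открыто; Пятница закрыто'
print(repr(bounded_match))  # 'открыто'
```

With the greedy form, the second rule in the input ("Пятница закрыто") would never reach the scanner as separate tokens, which is why a multi-rule input is a useful test case here.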
One more:

IN: с восхода по закат
OUT: sunrise-sunset

// с = from
// по = until
If it's not too late, please also add:

Вых.*|Выходной = "day off"

IN: Вт Выходной
OUT: Tu off
Git commit e50c9edd827b0b5f8ee8763a6f4ff1f9ee075424 by David Faure, on behalf of Volker Krause.
Committed on 04/12/2021 at 18:52.
Pushed by dfaure into branch 'release/21.12'.

Add localized Russian month names

M  +1    -0    autotests/parsertest.cpp
M  +15   -0    src/lib/openinghourslexer.l

https://invent.kde.org/libraries/kopeninghours/commit/e50c9edd827b0b5f8ee8763a6f4ff1f9ee075424
Git commit 15a84ba07448f8d73d916bfbf8d4f58335461b3f by David Faure, on behalf of Volker Krause.
Committed on 04/12/2021 at 18:52.
Pushed by dfaure into branch 'release/21.12'.

Add Russian language support for states and sun-based events

This is a bit ugly in parts as Flex isn't using Unicode but working on
an 8bit input stream, and we need to match Cyrillic letters here.

M  +7    -0    autotests/parsertest.cpp
M  +12   -3    src/lib/openinghourslexer.l

https://invent.kde.org/libraries/kopeninghours/commit/15a84ba07448f8d73d916bfbf8d4f58335461b3f
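To illustrate the 8-bit point from the commit message: in UTF-8 every Cyrillic letter is two bytes, so a byte-oriented scanner sees `закат` as ten input symbols, and the upper- and lower-case forms of a letter are different byte pairs. The Python sketch below just inspects those bytes and builds a byte-level pattern that accepts either case for the first letter. The helper name `either_case_first` is hypothetical, and this is not the code from the commit.

```python
import re

word = "закат"
raw = word.encode("utf-8")
print(len(word), len(raw))  # 5 characters, 10 bytes

lower = "з".encode("utf-8")  # b'\xd0\xb7'
upper = "З".encode("utf-8")  # b'\xd0\x97'

# A bracket class built from these bytes matches ONE byte out of
# {0xD0, 0xB7, 0x97}, not one Cyrillic letter, so it cannot match
# the two-byte letter as a whole.
single_byte_class = re.compile(b"[" + lower + upper + b"]")
print(single_byte_class.fullmatch(lower))  # None


def either_case_first(word):
    """Build a bytes regex matching `word` with its first letter in either case."""
    first, rest = word[0], word[1:]
    alternatives = b"|".join(
        re.escape(c.encode("utf-8")) for c in (first.lower(), first.upper())
    )
    return re.compile(b"(?:" + alternatives + b")" + re.escape(rest.encode("utf-8")))


pattern = either_case_first("закат")
print(bool(pattern.fullmatch("Закат".encode("utf-8"))))  # True
print(bool(pattern.fullmatch("закат".encode("utf-8"))))  # True
```

Spelling out whole byte sequences as alternatives, rather than relying on character classes or case-insensitive flags, is one way to get case variants through a byte-based matcher.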
Git commit 714239588b16ce18934d55433be1c95ee7dca280 by David Faure, on behalf of Volker Krause.
Committed on 04/12/2021 at 18:52.
Pushed by dfaure into branch 'release/21.12'.

Parse localized Russian day names

M  +2    -0    autotests/parsertest.cpp
M  +9    -0    src/lib/openinghourslexer.l

https://invent.kde.org/libraries/kopeninghours/commit/714239588b16ce18934d55433be1c95ee7dca280
As always, thank you for implementing this. Do you have a rough idea when this will be running in Osmose?
(In reply to HubMiner from comment #11)
> Do you have a rough idea when this will be running in Osmose?

I'm not involved with the Osmose integration/deployment myself, so that would be a question for David (cc-ing him).