SUMMARY
[] appears in regexes. It denotes a character class, which matches a single character, but with nothing between the [ and ] no character is allowed, so it can never match. The ICU regex engine I am using rejects this.

STEPS TO REPRODUCE
1. Read doxygen.xml
2. It declares an entity wordsep as "(?:[][,?;()]|\.$|\.?\s)"
3. This entity is used in RegExpr rules

OBSERVED RESULT
The pattern is accepted.

EXPECTED RESULT
It should be reported as an error, and the .xml file corrected.

SOFTWARE/OS VERSIONS
Qt Version: head of https://github.com/KDE/syntax-highlighting
[]] is valid in PCRE (the regex engine used here): a ] that appears as the first character of a character class is a literal and does not close the class (the same applies to [^]]). ICU regex does not seem to support all PCRE syntax; for example, it lacks (?|...) and \R, which are also used.
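
To illustrate the leading-] rule, here is a minimal sketch in Python. Python's re module is not PCRE and is used here only as a stand-in, because it follows the same convention for a ] placed first in a character class. The helper name starts_with_wordsep and the sample strings are made up for this example; the pattern itself is the wordsep entity quoted in the report.

```python
import re

# The wordsep entity as quoted from doxygen.xml in the report.
# Inside the class, the leading ] is a literal, so the class contains
# the characters ] [ , ? ; ( and ).
WORDSEP = r"(?:[][,?;()]|\.$|\.?\s)"

# A ] placed first (or right after ^) is a literal member of the class.
CLOSE_BRACKET = re.compile(r"[]]")       # matches exactly "]"
NOT_CLOSE_BRACKET = re.compile(r"[^]]")  # matches any one character except "]"


def starts_with_wordsep(text: str) -> bool:
    """Return True if text begins with a word separator (hypothetical helper)."""
    return re.match(WORDSEP, text) is not None


if __name__ == "__main__":
    for sample in ["]", "[", ",", " x", ".", "a"]:
        print(repr(sample), starts_with_wordsep(sample))
```

Read this way, the entity does not contain an empty class at all: [][,?;()] is a single class with seven members.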
Thanks, I've switched from ICU to PCRE, which is much faster. Part of the problem is that ICU jumps through hoops to be correct. For example, for German the case-insensitive regex "^ẞ$" matches "SS" (two code points); no other regex implementation I have seen does this. ICU was getting into an internal infinite loop and throwing a "regex out of stack space" error after 0.5 seconds, many times over, which made a .sh file take 30 seconds to syntax-highlight!
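
A small sketch of the ẞ/SS point, again using Python only as an example of a non-ICU engine. It does not reproduce ICU's behaviour; it shows that Python's re does not match across the one-to-two code point expansion, while Unicode full case folding (str.casefold) does treat the two strings as equal. Both function names are hypothetical.

```python
import re


def regex_matches(pattern: str = "^ẞ$", text: str = "SS") -> bool:
    """Case-insensitive search with Python's re (simple, one-to-one case mapping)."""
    return re.search(pattern, text, re.IGNORECASE) is not None


def equal_under_full_case_folding(a: str, b: str) -> bool:
    """Compare two strings using full case folding, which may change their length."""
    return a.casefold() == b.casefold()


if __name__ == "__main__":
    print(regex_matches())                            # re does not match "SS"
    print(equal_under_full_case_folding("ẞ", "SS"))   # both fold to "ss"
```

Matching the way the comment describes requires the engine to consider multi-code-point expansions at every position, which is one plausible reason such matching costs more than a simple one-to-one case comparison.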