SUMMARY

For context-sensitive grammars, I find it fairly difficult to track all necessary context (in the theoretical sense) through the actual context constructs (the KDE syntax highlighting "context" item), because only one can ever be active at once, from what I can tell. This is severely limiting whenever a more complex kind of theoretical context/state has to be tracked on multiple layers, which is sometimes necessary.

I therefore propose that rules can set named state variables tracked by the syntax engine, as well as have preconditions depending on them:

    <keyword attribute="DeclKeyword" context="NamedItem" String="namedcodeblockitemkeywords" state-set="expect_block = 1" />
    <DetectChar attribute="OtherSymbol" context="#stay" char="{" state-check="expect_block == 1" />

(I chose this syntax so that it could later be extended, for example to += or -= for the set part, or >= for the check part. As a start, just = for setting and ==/!= for checking should be enough.)

My apologies if this feature already exists and I just missed it.

STEPS TO REPRODUCE

1. Encounter a situation like this:

"i'm looking to write a kate syntax highlighting ( https://invent.kde.org/frameworks/syntax-highlighting ) but i ran into a problem: in the language i'm writing it for, `{` opens a code block and should therefore be a folding region only if it's the first `{` to follow a specific list of "code block keywords" that fulfills all of the following conditions: 1. it's not nested inside any other `{` or `(` or `[` bracket pairs opened after that code block keyword, and 2. the last non-whitespace character before the `{` bracket isn't part of a specific negative symbol list, and 3. it isn't preceded, before any in-between whitespace, by a keyword from a specific negative keyword list. i might have missed some corner cases but i'm fairly sure this would basically be always correct"

2. Realize that this would be much easier to track if "code block start keywords" could enter a "code block expected" state and different "code block stop keywords" could leave it again. The same goes for "code block allowing symbols" and "code block preventing symbols". With that, the rules should be fairly quick to put together, at least apart from the nesting part.

OBSERVED RESULT

Right now it's too complicated to handle many variants of state (= context, in grammars that require it) if more than one layer of it is required.

EXPECTED

With the proposed addition, it should hopefully be fairly easy to track such additional state.

SOFTWARE/OS VERSIONS

probably all are affected

ADDITIONAL INFORMATION
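To make the intended semantics of state-set and state-check concrete, here is a minimal Python sketch of how I imagine it working. It is only a toy model, not code from the KDE syntax-highlighting engine. The keywords `func` and `class`, the `;` stop symbol and the `BlockStart` attribute are made up for illustration. Rules are tried in order, and a rule with a state-check is skipped unless the variable currently has the required value.

```python
import re

# Toy model of the proposed attributes. Nothing here is part of the real
# KDE syntax-highlighting engine; keywords, symbols and attribute names
# other than DeclKeyword/OtherSymbol are hypothetical.
# Each rule: (pattern, attribute, state_set, state_check)
RULES = [
    (re.compile(r"\b(?:func|class)\b"), "DeclKeyword", ("expect_block", 1), None),
    (re.compile(r"\{"), "BlockStart", ("expect_block", 0), ("expect_block", 1)),
    (re.compile(r"\{"), "OtherSymbol", None, None),
    (re.compile(r";"), "OtherSymbol", ("expect_block", 0), None),
]


def highlight(text):
    """Return (matched_text, attribute) pairs, tracking named state variables."""
    state = {}  # unset variables are treated as 0
    result = []
    pos = 0
    while pos < len(text):
        for pattern, attribute, state_set, state_check in RULES:
            if state_check is not None:
                name, wanted = state_check
                if state.get(name, 0) != wanted:
                    continue  # precondition not met, try the next rule
            match = pattern.match(text, pos)
            if match:
                if state_set is not None:
                    name, value = state_set
                    state[name] = value
                result.append((match.group(), attribute))
                pos = match.end()
                break
        else:
            pos += 1  # no rule matched: treat the character as normal text
    return result


print(highlight("func f { x { } }"))
```

In this model only the first `{` after the keyword becomes a `BlockStart`. The second `{` falls through to the plain `OtherSymbol` rule because `expect_block` was reset to 0. Nesting is deliberately left out, as in the proposal above.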
You can do the same with context switches, even if that naturally is not that nice. I don't see us adding additional variables: they would need to be tracked, which is costly, and they would introduce yet another level of complexity. Given that even complex languages like C++ work OK with the current level of expressiveness, I don't see us going there. If you have time to provide a merge request that shows this can be added without a large impact, one might reconsider.
I don't see how that works without running into scalability issues. Doesn't it basically require an exponential explosion of context constructs? E.g. for 3 independent pieces of state with 3 possible values each, that is already 27 context constructs. Unless you expect me to write a compiler that outputs the syntax highlighting file, so that I can no longer write it by hand, I don't see how that's feasible. The argument that it is expressible therefore seems somewhat theoretical, and not really true in practice as soon as the grammar is more context-sensitive than usual. I would assume C++ to be less context-sensitive because it has line terminators, while the language I'm working on doesn't.
Is there a good place to ask how best to implement this in the XML? I'm slightly at a loss as to how to do it without it turning into chaos, whereas with this feature it would be fairly easy and clean.
I currently have, or plan to add, the following contexts, and this is where it turns into a problem (I use them to mark declared variable names with dsVariable, which seems very helpful for making the names easy to spot, and I might add more later): NamedItem, WithStatementBeforeIn, WithStatementAtLabel. For this feature to work, I think I would need to manage it with new contexts, something like: CodeBlockAtNextBrace, CodeBlockDeferred. Both would need to be able to coincide with every context in the first list, which would give me: NamedItem, NamedItemCodeBlockAtNextBrace, NamedItemCodeBlockDeferred, WithStatementBeforeIn, WithStatementBeforeInCodeBlockAtNextBrace, WithStatementBeforeInCodeBlockDeferred, WithStatementAtLabel, WithStatementAtLabelCodeBlockAtNextBrace, WithStatementAtLabelCodeBlockDeferred. That is 3 x 3 = 9 contexts for just two layers. I hope this shows why I anticipate this turning into a bigger problem. It doesn't seem like a sustainable approach.
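The growth can be reproduced with a few lines of Python. This is only an illustration of the counting argument. The first two lists are taken from the context names above, with an empty string standing for "no code block state". The third layer (`LayerX`, `LayerY`) is purely hypothetical and is there only to show the next step.

```python
from itertools import product

# Context names from the lists above; "" means "no code block state".
BASE_CONTEXTS = ["NamedItem", "WithStatementBeforeIn", "WithStatementAtLabel"]
BLOCK_STATES = ["", "CodeBlockAtNextBrace", "CodeBlockDeferred"]

# Hypothetical third layer, only to show how the count keeps multiplying.
EXTRA_LAYER = ["", "LayerX", "LayerY"]


def combined_contexts(*layers):
    """Build one context name per combination of values across all layers."""
    return ["".join(parts) for parts in product(*layers)]


two_layers = combined_contexts(BASE_CONTEXTS, BLOCK_STATES)
three_layers = combined_contexts(BASE_CONTEXTS, BLOCK_STATES, EXTRA_LAYER)

print(len(two_layers))    # 9
print(len(three_layers))  # 27
```

With separate state variables, each layer would be tracked on its own instead of being multiplied into the context names.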
I'm still stuck on this, for what it's worth. I would be curious to hear if there's a good place to ask this kind of question and figure it out!