120544 – kate: syntax highlighting needs more flexible context rules: pop N contexts, then push a sequence of named contexts

Bug 120544 - kate: syntax highlighting needs more flexible context rules: pop N contexts, then push a sequence of named contexts

Summary: kate: syntax highlighting needs more flexible context rules: pop N contexts, ...

Status:	RESOLVED FIXED

Alias:	None

Product:	kate
Classification:	Applications
Component:	general (show other bugs)
Version:	2.4.1
Platform:	Compiled Sources Linux

Importance:	NOR wishlist
Target Milestone:	---
Assignee:	Christoph Cullmann

URL:
Keywords:

Depends on:
Blocks:

Reported:	2006-01-21 08:44 UTC by esigra
Modified:	2006-09-13 22:39 UTC (History)
CC List:	1 user (show)

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Attachments
Code in the test language specifically created for demonstrating this bug (302 bytes, text/plain) 2006-01-21 14:55 UTC, esigra	Details
Syntax file for the test language. (3.35 KB, text/plain) 2006-01-21 14:56 UTC, esigra	Details
View All Add an attachment

Note You need to log in before you can comment on or make changes to this bug.

Description esigra 2006-01-21 08:44:12 UTC

Version:           2.4.1 (using KDE Devel)
Installed from:    Compiled sources
OS:                Linux

When changing the context in Kate syntax highlighting, it seem like it is only possible to do one of the following:
1. Pop N contexts or
2. Push one named context.

For some languages it is necessary to first pop N contexts, then push a context. Like it is now, you can back up from a hole so to speak, but you will stand in front of it again and may get back into it. You should be able to back up and jump over the hole in one command, so you can not get into it again.

Sometimes it is also necessary/convenient/efficient to push a sequence of several context.

Comment 1 Anders Lund 2006-01-21 10:43:48 UTC

On Saturday 21 January 2006 08:44, sigra@home.se wrote:
> When changing the context in Kate syntax highlighting, it seem like it is
> only possible to do one of the following: 
> 1. Pop N contexts

#pop#pop#pop...

> 2. Push one named context.

There is no such thing push. Please explain what you mean

Comment 2 esigra 2006-01-21 11:09:57 UTC

> > When changing the context in Kate syntax highlighting, it seem like it is 
> > only possible to do one of the following: 
> > 1. Pop N contexts 
> 
> #pop#pop#pop... 
> 
> > 2. Push one named context. 
> 
> There is no such thing push. Please explain what you mean 

There is something like push in kdelibs/kate/part/katehighlight.cpp:
m_contexts.push_back (ctxNew);

The contexts are stored in a stack data structure [http://en.wikipedia.org/wiki/Stack_(data_structure)]. So push is when a new context is entered (with the command context="somecontext").

Comment 3 Anders Lund 2006-01-21 11:35:11 UTC

On Saturday 21 January 2006 11:09, sigra@home.se wrote:
> There is something like push in kdelibs/kate/part/katehighlight.cpp:
> m_contexts.push_back (ctxNew);

To write a highlight file, you need to read the documentation related to that. 
Your local URL is help:/kate/highlight.html. The behavior of the internal 
classes is not for you as a highlight file author.

> The contexts are stored in a stack data structure
> [http://en.wikipedia.org/wiki/Stack_(data_structure)]. So push is when a
> new context is entered (with the command context="somecontext").

Why on earth do you relate this to writing a highlight file????????
In any case, do not. Go read the help:/ url above instead.

Comment 4 Anders Lund 2006-01-21 11:41:11 UTC

A misunderstanding caused by not having read the proper documentation.
Btw, you can get support very easy by writing our mailing list at kwrite-devel@kde.org (ask to be CC'd if you do not subscribe) or by visiting the #kate at the freenode irc network. Look for dhaumann (or dh), or sredna for help with syntax highlight files.

Comment 5 esigra 2006-01-21 12:51:52 UTC

I am subscribed to kwrite-devel@kde.org and this message will go there. I hope dhaumann or sredna are also subscribed.

Sorry for assuming that the highlighting contexts have anything to do with a stack. It just looked like that to me, with those pops. Entering a context seemed to be a push of that context to a stack. The fact that I was quite successfully building a highlighting module under this false assumption further entrenched it in my mind. Thanks for correcting me on that point.

Since I do not have the bleeding edge sources from SVN installed, I was reading the online version of the documentation (http://kate.kde.org/doc/main_hlhowto.php) but neither there nor in my installed version (help:/kate/highlight.html) is it clearly demonstrated, in a way that ordinary people like I can understand, how to do any of the 2 things that I asked for in this report (that is why I had to report them). It even seems to indicate that it is not possible to pop and then enter some context:
<RuleName attribute="(identifier)" context="(identifier|order)" [rule specific attributes] />
I interpret this such that identifier and order are mutually exclusive, which would mean that this bug indeed exists. But since it does not, it must be possible to do something like context="#pop#popsomecontext" to leave the two previous contexts and then enter the context named somecontext. Sorry for not being able to read that from the documentation. Maybe the documentation is somewhat unclear on this point? Maybe I could open a new report for that instead? Or can preferably this report be reclassified to documentation?

I also do not understand how my second point supposed to work? It can not reasonably be context="somecontextsomeothercontext" because then it would be ambiguous wether to enter the context named somecontext and then enter the context named someothercontext, or to just enter the context named somecontextsomeothercontext.

Since I can not figure it out myself, could you please just post the code examples that you made to prove that the two issues reported here are invalid?

Comment 6 Anders Lund 2006-01-21 13:22:04 UTC

On Saturday 21 January 2006 12:51, sigra@home.se wrote:
> I am subscribed to kwrite-devel kde org and this message will go there. I
> hope dhaumann or sredna are also subscribed.


We surely are ;)

> Since I do not have the bleeding edge sources from SVN installed, I was
> reading the online version of the documentation
> (http://kate.kde.org/doc/main_hlhowto.php) but neither there nor in my
> installed version (help:/kate/highlight.html) is it clearly demonstrated,
> in a way that ordinary people like I can understand, how to do any of the 2
> things that I asked for in this report (that is why I had to report them).
> It even seems to indicate that it is not possible to pop and then enter
> some context: <RuleName attribute="(identifier)"
> context="(identifier|order)" [rule specific attributes] /> I interpret this
> such that identifier and order are mutually exclusive, which would mean
> that this bug indeed exists. 


Yes, they are mutually exclusive.

What bug?


> But since it does not, it must be possible to 
> do something like context="#pop#popsomecontext" to leave the two previous
> contexts and then enter the context named somecontext. 


No, you can't.

I'm not sure I understand what you mean though, but i guess you want to change 
the context stack behind you to be able to #pop to the right place later. You 
might be able to work around that by using extra contexts.

You should mark this a wishlist item and reopen it maybe, and the suggested 
behavior can possibly be added with KDE 4.

Comment 7 esigra 2006-01-21 14:55:48 UTC

Created attachment 14326 [details]
Code in the test language specifically created for demonstrating this bug

This file contains some code written in a test language specifically created
for demonstrating this bug. The language only consists of function
declarations, which begin with the keyword "function", followed by an
identifier for the function's name. If the function takes any parameters, a
parameter list follows. The list begins with a '(' and ends with a ')',
Parameters are separated by ';'. After the parameter list comes the keyword
"return" followed by an identifier for the return type. The declaration is then
terminated with ';'. The string "--" starts a comment that lasts to the end of
the line. Any whitespace is allowed.

Comment 8 esigra 2006-01-21 14:56:56 UTC

Created attachment 14327 [details]
Syntax file for the test language.

Comment 9 esigra 2006-01-21 15:03:53 UTC

<context name="function parameter list expecting ; or )" lineEndContext="#stay" attribute="error">
	<IncludeRules context="normal" />
	<DetectChar  char=";" attribute=";" context="#pop" />
	<DetectChar  char=")" attribute="()" context="#pop#pop" />
<!-- This is where the command should have pushed the context named "function expecting return" after popping. Otherwise multiple parameter lists are accepted as valid syntax. As far as I can see, fixing this requires fixing bug 120544.-->
</context>

Comment 10 Anders Lund 2006-01-21 15:10:34 UTC

There is no bug.

You may feel that the current ability of kates highlighting lanugage is not 
sufficient for you, if so file a *wish*. But kate does exactly what it 
intends to with the current rules and attributes.

I already said that i agree that it could be nice to add more flexibility to 
the highlight engine. But there still is no bug.

Discussions about future development should be in our mainglist and irc 
channel, or in wishlist items in the bug system.

Now stop the spamming

Comment 11 Anders Lund 2006-01-21 15:14:55 UTC

On Saturday 21 January 2006 14:56, sigra@home.se wrote:

I can't see from your files what you expect to happen without doing a big 
analysis, and frankly i have better stuff to do with my time.

How ever, the simplicity of your sample text indicates that highlighting it 
would be similarly simple.

Comment 12 esigra 2006-01-21 15:46:02 UTC

So I have constructed a testcase and you call that spamming. It seems like the word bug is really annoying you. You should not take that word so seriously since it can mean different things in different contexts. I can assure you that if there had been any way for me to change the status of the report to wish, I would have done that by now. But I really can not find the link for that here in the bug tracking system. And even if it is classified as wish, it is still considered a bug in the bug tracking system sense of the word; the url would still contain bugs.kde.org and show_bug.cgi and it could still be accessible by bug: in Konqueror. That would have been the case even if it had been classified as a wish from the beginning. So I did not think it is such a big issue to call it a bug in that sense. But hereby I officially admit that the issue is NOT a bug as in program error, but a request for enhanced functionality. I will be very careful with the classification of reports in the bug tracking system from now on.

Now about the testcase. The redundant parameter lists in the last function declaration are highlighted as any other parameter list. The expected behaviour is to show them red with <itemData name="error" defStyleNum="dsError" /> referenced from <context name="function expecting return" lineEndContext="#stay" attribute="error">, which would have been entered after the double pop in the piece of code pasted in comment #9.

Comment 13 Anders Lund 2006-01-21 15:53:45 UTC

Comment 14 Anders Lund 2006-01-21 16:15:13 UTC

On Saturday 21 January 2006 15:46, sigra@home.se wrote:
> Now about the testcase. The redundant parameter lists in the last function
> declaration are highlighted as any other parameter list. The expected
> behaviour is to show them red with <itemData name="error"
> defStyleNum="dsError" /> referenced from <context name="function expecting
> return" lineEndContext="#stay" attribute="error">, which would have been
> entered after the double pop in the piece of code pasted in comment #9.


This is because you didn't ASK it to be highlighted as error. Go fix tour 
file. What you want is fully possible with the current features of kates 
highlight system.

Comment 15 esigra 2006-01-21 16:38:02 UTC

> This is because you didn't ASK it to be highlighted as error. Go fix tour
> file.

You are absolutely right about that. The purpouse of the testcase is to show a situation where the proposed enhancement of functionality would make it easy to get the desired highlighting result. The proposed functionality would make it very easy to ASK for the code in question to be highlighted as error. It would just be a matter of telling it to enter the right context.


> What you want is fully possible with the current features of kates
> highlight system.

It may be so, I do not know. I believe it when I see it. But I doubt that it would be nearly as easy (and efficient) as with the proposed enhanced functionality.

Comment 16 Anders Lund 2006-01-21 17:24:40 UTC

On Saturday 21 January 2006 16:38, sigra@home.se wrote:
> But I doubt that it would be nearly as easy (and efficient) as with the
> proposed enhanced functionality.


Which is why we have wishlist items.

Meanwhile, you can jump to any context you want from with any rule matching.

Comment 17 esigra 2006-01-21 17:42:33 UTC

> Meanwhile, you can jump to any context you want from with any rule matching.

The problem with that comes when popping back at the end. When simply jumping to any context, the current context is still there below:
Suppose the highlighter encounters a function declaration without argument list. The stack of contexts will grow like this:
0. default
1. function expecting identifier
2. function expecting parameter list or return
3. function expecting return
4. function expecting return type
5. function expecting ;

When the highlighter encounters the ; it can unwind to the correct context by doing 5 pops. But if the highlighter encounters a function declaration with argument list, the stack will grow like this instead:
0. default
1. function expecting identifier
2. function expecting parameter list or return
3. function parameter list expecting parameter
4. function parameter list expecting ; or )

At this point, the highlighter has to pop twice before it goes into the context "function expecting return". Otherwise the stack will be different when the ';' is reached and the five pops will not suffice.

Comment 18 Anders Lund 2006-01-21 18:04:13 UTC

On Saturday 21 January 2006 17:42, sigra@home.se wrote:
> The problem with that comes when popping back at the end. When simply
> jumping to any context, the current context is still there below: Suppose
> the highlighter encounters a function declaration without argument list.
> The stack of contexts will grow like this: 0. default
> 1. function expecting identifier
> 2. function expecting parameter list or return
> 3. function expecting return
> 4. function expecting return type
> 5. function expecting ;
>
> When the highlighter encounters the ; it can unwind to the correct context
> by doing 5 pops. But if the highlighter encounters a function declaration
> with argument list, the stack will grow like this instead: 0. default
> 1. function expecting identifier
> 2. function expecting parameter list or return
> 3. function parameter list expecting parameter
> 4. function parameter list expecting ; or )
>
> At this point, the highlighter has to pop twice before it goes into the
> context "function expecting return". Otherwise the stack will be different
> when the ';' is reached and the five pops will not suffice.

You may be able to use the fallthroughcontext system to do the extra pop. It 
allows a context to to switch if no rules match in a iteration of the 
context.

As an alternative, you can use the lookAhead attribute to be able to switch 
context with a match, without moving the highlighters position index.

A slightly offtopic note:
You should try to highlight *correct* code.
You should not try to highlight *incorrect* code

The syntax highlighter is not built for that, you need a language parser to 
detect language syntax errors.

As a scary warning, look at the xml (debug) highlight.

If you are still in doubt, consider covering *ALL* the possible syntax errors 
in your language. Because if you let your users believe that you highlight 
syntax errors, you let them down if you don't do it all the time.

Of course in many cases the highlight system covers simple errors, like 
missing quotation marks. But that is more like a side effect of highlighting 
the string, not a dedicated search for errors.

-anders

Comment 19 esigra 2006-01-21 18:19:28 UTC

There is a quite trivial fix for this problem: Extend the DTD with a common attribute for the detection functions. Call it for example replacecurrentcontext. Default would be false to preserve current behaviour. Example:
<DetectChar  char="(" attribute="()" context="function parameter list expecting parameter" replacecurrentcontext="true" />
It would be equivalent to a #pop before entering the new context.

This extension could be made at any time because it does not break the current interface. It is compatible with all existing valid highlight definitions. In the example above, the stack would first be:

0. default

then

0. default
1. function expecting identifier

then

0. default
1. function expecting parameter list or return

then

0. default
1. function parameter list expecting parameter

then

0. default
1. function parameter list expecting ; or )

then, when the ')' is found

0. default
1. function expecting return

then

0. default
1. function expecting return type

then

0. default
1. function expecting ;

then finally after a pop

0. default

Comment 20 Anders Lund 2006-01-21 18:53:06 UTC

On Saturday 21 January 2006 18:19, sigra@home.se wrote:
> This extension could be made at any time because it does not break the
> current interface.


allowing #pop[#pop..]#newcontext wouldn't break any existing files either.

Comment 21 Matthew Woehlke 2006-08-17 16:59:56 UTC

...and if I haven't already, independantly filed a wish for this *exact* same function using almost the *exact* same implementation (my thought was actualy "context[#pop[#pop...]]"), I'd like to note for the record that I am doing so now.

I have a C Preprocessor hioghlighter that tries to do error detection (a little easier, as you can get most of it that can be gotten from anything short of actual preprocessing). It has something like three or four contexts that ONLY exist because of this limitation; trying to work around this can quickly lead to huge amounts of code duplication, and is just plain really ugly.

I've looked at patching this myself, but I don't understand KATE's guts enough to have a good idea where to look. If anyone would care to help me out here I would very much appreciate it.

Comment 22 Christoph Cullmann 2006-08-19 11:07:24 UTC

I will look at it as soon as I have spare time again ;)
Not sure which style I will use, but should be no problem to implement it, maybe some tweaking here and there ;)
I tend to prefer the contextname#pop#pop style, as this would be more trivial, as even pop would be allowed contextname then, still, while with #pop#pop#pop, is the last pop a context or a "pop" operation? ;)

Comment 23 Dominik Haumann 2006-08-20 12:56:52 UTC

#pop and #stay are reserved keywords. The order of #pop[#pop..]#newcontext looks more straight forward to me. The problem, as you said, is something like #pop#pop#popText, where "popText" is a name of a context. But if we do a QString::split("#"); (or equivalent) this should be no issue, right? :)

Comment 24 esigra 2006-08-21 17:10:48 UTC

I think it looks strange with the contextname#pop syntax. As if you would push contextname and then immediately pop it again. That would do nothing. While #pop#contextname would pop the current context and then push contextname.

Comment 25 Matthew Woehlke 2006-08-21 17:28:08 UTC

My complaint against "#pop#context" is a: it is ambiguous (does not allow a context named 'pop'), and b: it now adds the concept of a normal context being prefixed with '#' (does this mean "#context" would be valid?). Whereas "context#pop" is unambiguous and easy to process, even if it does mean reading from right to left.

I would change the processing rules to be as follows:
- If the string is exactly "#stay", then do not change contexts (current behavior).
- Otherwise, for each trailing "#pop", pop one context. If anything is left when you're done doing this, that is a new context to push ("#stay" at this point would not be valid).

This is clean, simple, and fully supports both the existing and proposed semantics.

I wouldn't want to see the context at the end unless the delimiter is changed to something other than '#' (maybe '##', although that already has the 'external' semantic). Is saying "context stacks are read right-to-left" really that hard for people to understand?

Comment 26 Christoph Cullmann 2006-09-13 22:39:53 UTC

In KDE 4 it's implemented, it works via:

#pop#pop!contextname

this will pop two contexts + push "contextname" context to the stack

arbitary counts of #pop are allowed (#stay may be intermixed for fun, that's just ignored)