Bug 103257 - accommodate overlapping syntax regions (especially for php)
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: syntax
Version: 2.3.2
Platform: unspecified Linux
Importance: NOR wishlist
Target Milestone: ---
Assignee: KWrite Developers
URL:
Keywords:
Duplicates: 125604 126673 143022
Depends on:
Blocks:
 
Reported: 2005-04-05 02:20 UTC by William Kilian
Modified: 2009-07-22 16:25 UTC

See Also:
Latest Commit:
Version Fixed In:


Attachments
example from initial report (50 bytes, text/x-php4-src)
2005-04-05 02:22 UTC, William Kilian
"Better" php.xml (2.91 KB, patch)
2009-07-18 11:09 UTC, James Sleeman
Full version of modified php.xml (for people who just want to grab and use it). (237.16 KB, application/xml)
2009-07-18 11:11 UTC, James Sleeman
Updated php folding for alternate block syntax. (238.92 KB, text/plain)
2009-07-22 15:59 UTC, James Sleeman

Description William Kilian 2005-04-05 02:20:08 UTC
Version:           2.3.2 (using KDE 3.3.2, Gentoo)
Compiler:          gcc version 3.3.5 (Gentoo Hardened Linux 3.3.5-r1, ssp-3.3.2-3, pie-8.7.7.1)
OS:                Linux (i686) release 2.6.10-gentoo-r6-athlon

Valid PHP can have regions that overlap. Here is a trivial example:

	<?php
	
	if (true) {
		?>text in if<?php
	}
	
	?>
	
	html text

It is really nice that I can define regions based on curly braces and on the PHP code start and end delimiters. However, when syntax regions overlap, kate/syntax does not detect that the first region has closed. Contexts work fine with overlapping regions: in the example above, "text in if" and "html text" are properly highlighted as HTML content, and the PHP code is all highlighted properly as well. But the code folding regions do not work as expected.

There are code folding buttons for the open curly brace and for both "<?php" strings, but all three fold the rest of the file. The desired behaviour would be: the first "<?php" hides the top part of the if block but leaves the bottom curly brace visible; the open curly brace hides only the if block; and the second "<?php" hides the bottom curly brace but not the top of the if.

I know coding for overlapping regions is more complicated than coding only for non-overlapping regions. A nice compromise would be to ignore the "?>" and "<?php" strings inside the curly braces since there is neither a "<?php" string inside the if block but before the "?>" string nor a "?>" string inside the if block but after the "<?php" string. Ignoring those delimiters for code folding purposes should both allow the "<?php" at the beginning of the example to end at the "?>" after the if block and allow the curly braces to fold just the if block. However, this compromise might be more complex to code than simply using separate stacks for regions named differently in the syntax file.
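
The "separate stacks" idea can be illustrated with a small sketch. This is not Kate's folding code; the region table, the function name and the naive tokenizer (which ignores strings and comments) are all assumptions made for illustration. Each region name gets its own stack, so a "?>" can close a "<?php" even while a "{" is still open:

```python
import re

# Hypothetical region table: region name -> (opening token, closing token).
REGIONS = {"php": ("<?php", "?>"), "brace": ("{", "}")}

# Naive tokenizer: only the four delimiters, no awareness of strings/comments.
TOKEN_RE = re.compile(r"<\?php|\?>|\{|\}")

EXAMPLE = "<?php\n\nif (true) {\n\t?>text in if<?php\n}\n\n?>\n\nhtml text\n"


def fold_ranges(text):
    """Return (region, start_line, end_line) tuples using one stack per region name."""
    openers = {o: name for name, (o, c) in REGIONS.items()}
    closers = {c: name for name, (o, c) in REGIONS.items()}
    stacks = {name: [] for name in REGIONS}
    ranges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            token = match.group()
            if token in openers:
                stacks[openers[token]].append(lineno)
            elif stacks[closers[token]]:
                # A closer only pops the stack of its own region name.
                name = closers[token]
                ranges.append((name, stacks[name].pop(), lineno))
    return sorted(ranges, key=lambda r: (r[1], r[2]))
```

For the example from this report, fold_ranges(EXAMPLE) gives [("php", 1, 4), ("brace", 3, 5), ("php", 4, 7)]: three overlapping ranges that match the desired behaviour described above.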
Comment 1 William Kilian 2005-04-05 02:22:57 UTC
Created attachment 10517
example from initial report
Comment 2 Thomas Friedrichsmeier 2007-12-03 00:31:31 UTC
*** Bug 143022 has been marked as a duplicate of this bug. ***
Comment 3 Thomas Friedrichsmeier 2007-12-03 00:35:20 UTC
Note: Folding no longer collapses until the end of the file since SVN rev. 744206, but now stops at the "?>". This makes the problem less severe, but we'd still really need overlapping regions to solve it correctly.
Comment 4 Thomas Friedrichsmeier 2007-12-03 01:46:40 UTC
Thinking about it, we don't actually need overlapping regions; we just need deeper nesting, and a good way to define it in the highlighting definition.

Then your example would be mapped as:
1. Start in "HTML" region
2. On "<?"   enter "PHP" region
3. On "{"    enter "Brace" region
4. On "?>"   enter "Embedded HTML" region
5. On "<?"   leave "Embedded HTML" region
6. On "}"    leave "Brace" region
7. On "?>"   leave "PHP" region

The "Brace" region is basically just a region that includes the general PHP rules, but #pops on encountering "}" and dispatches to "Embedded HTML" when encountering "?>". This last bit is the difficult one in the current setup, as from within php.xml we have no way of knowing which context would need to be included as "Embedded HTML". That's why right now we have no choice but to pop back to initial context, instead of starting a new level of nesting as would be required to get folding to work correctly.
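
The seven steps above can be traced with a small context-stack sketch. It only simulates the nesting idea; it is not how php.xml or Kate's highlighter is implemented, and the transition table and function name are assumptions made for illustration:

```python
import re

# Matches "<?php" or "<?", "?>", "{" and "}"; strings and comments are ignored.
TOKEN_RE = re.compile(r"<\?(?:php)?|\?>|\{|\}")

# Hypothetical table: (current context, token) -> "push:<context>" or "pop".
TRANSITIONS = {
    ("HTML", "<?"): "push:PHP",
    ("PHP", "{"): "push:Brace",
    ("PHP", "?>"): "pop",
    ("Brace", "{"): "push:Brace",
    ("Brace", "}"): "pop",
    ("Brace", "?>"): "push:Embedded HTML",  # nest deeper instead of popping
    ("Embedded HTML", "<?"): "pop",
}

EXAMPLE = "<?php\n\nif (true) {\n\t?>text in if<?php\n}\n\n?>\n\nhtml text\n"


def trace_contexts(text):
    """Return the context stack after each token that causes a transition."""
    stack = ["HTML"]
    trace = [tuple(stack)]
    for match in TOKEN_RE.finditer(text):
        token = "<?" if match.group().startswith("<?") else match.group()
        action = TRANSITIONS.get((stack[-1], token))
        if action is None:
            continue  # token has no meaning in the current context
        if action == "pop":
            stack.pop()
        else:
            stack.append(action.split(":", 1)[1])
        trace.append(tuple(stack))
    return trace
```

On the example from the report the stack depth goes 1, 2, 3, 4, 3, 2, 1, so every region is properly nested and each one could be folded on its own.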
Comment 5 William Kilian 2007-12-03 03:12:30 UTC
I took a cursory glance at the php.xml. It looks to me like your idea would work if there were two different phpsource contexts. Make the brace enter the second context instead of just a new region; that second phpsource context could then treat the "?>" as entering "Embedded HTML". The only thing that would mess that up would be a file with unmatched open braces. I don't remember if it's possible to legally do that in PHP (it might be if the matching close brace is in an include()'d file). I haven't done anything in PHP in a while and can't remember what PHP would do in that case.
Comment 6 Thomas Friedrichsmeier 2007-12-07 16:15:15 UTC
*** Bug 126673 has been marked as a duplicate of this bug. ***
Comment 7 Thomas Friedrichsmeier 2007-12-07 16:20:40 UTC
*** Bug 125604 has been marked as a duplicate of this bug. ***
Comment 8 Matthew Schultz 2008-03-06 17:00:18 UTC
I can confirm that this is still a problem with Kate 2.5.8 on KDE 3.5.8. Code folding in PHP does not work in this manner and, for that matter, brace syntax highlighting across these blocks is now broken. It was not broken in previous versions.
Comment 9 James Sleeman 2009-07-18 11:09:44 UTC
Created attachment 35428
"Better" php.xml

The attached patch against php.xml (version 1.36), in my opinion, simply works far better and completely avoids this bug.

Quite simply, it forgets about folding on the <?php and ?> altogether, and this prevents the entire problem of under-zealous or over-zealous folding that this bug (and #143022) is about.

With this patch, one can correctly fold on PHP braces, and any embedded HTML inside is folded away; one can also correctly fold on (foldable) HTML elements like <table>, and any PHP inside is folded away.
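
A rough sketch of that behaviour, under stated assumptions: it is not taken from the patch, it treats only <table> as a foldable HTML element, and the tokenizer ignores strings and comments. The PHP delimiters merely switch a mode flag and never open a fold:

```python
import re

# "<?php"/"<?", "?>", braces, and the opening/closing <table> tag only.
TOKEN_RE = re.compile(r"<\?(?:php)?|\?>|\{|\}|</?table\b")


def fold_ranges(text):
    """Return (kind, start_line, end_line); PHP delimiters never create folds."""
    in_php = False  # mode flag only
    stack = []      # shared stack of (kind, start_line)
    ranges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_RE.finditer(line):
            token = match.group()
            if token.startswith("<?"):
                in_php = True
            elif token == "?>":
                in_php = False
            elif in_php and token == "{":
                stack.append(("brace", lineno))
            elif not in_php and token == "<table":
                stack.append(("table", lineno))
            elif (in_php and token == "}") or (not in_php and token == "</table"):
                kind = "brace" if token == "}" else "table"
                # Close only if the innermost open region is of the same kind.
                if stack and stack[-1][0] == kind:
                    ranges.append((kind, stack.pop()[1], lineno))
    return ranges
```

On the example from the initial report this yields a single range, ("brace", 3, 5), which is the fold for the if block alone.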

It is my opinion that going to lengths, as has been evidenced here in the past, to preserve the ability to fold <?php and ?> gives a very small reward and, due to the problems reported in this bug, produces a very big downside in terms of much reduced folding usability for PHP+HTML, which is, after all, the typical way people write PHP "in the real world".

Simply avoiding the situation, as my patch does, brings, I believe, a large benefit for what is in my opinion an insignificant loss.

I'll attach a full version of the file in a second.
Comment 10 James Sleeman 2009-07-18 11:11:49 UTC
Created attachment 35429
Full version of modified php.xml (for people who just want to grab and use it).
Comment 11 Milian Wolff 2009-07-18 14:10:02 UTC
Yes! Thank you very much, I appreciate your work on this issue. Your new file is already superior to the old one, so if you have commit rights, please go ahead and push it into KDE SVN. Else I'll commit it for you.

There is at least one thing I spotted which you might try to work on. As far as I know that is unrelated to your changes and did not work before either:

<div>
<?php if ( something ): ?>
fuubar
<?php endif; ?>
</div>

With your patch I can fold the div (great feature!), but I cannot fold the conditional, because it doesn't use curly braces to mark the context.
Comment 12 James Sleeman 2009-07-22 03:33:14 UTC
Yes Milian, please do commit it if you think it's ok.

I'll take a look at that other syntax; I don't use it myself, but I think it should be possible.
Comment 13 Milian Wolff 2009-07-22 13:58:40 UTC
Committed, thanks James!
Comment 14 James Sleeman 2009-07-22 15:59:30 UTC
Created attachment 35551
Updated php folding for alternate block syntax.

The attached file provides folding for the PHP alternate block syntax.

The case and default statements (case 'foo':, default:) are not treated as blocks, as we can't reliably say when they end (it could be a break, a return, a closing brace, the next case, etc.), so case and default simply don't fold (unless you make it explicit with braces, of course).
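
As a rough illustration of folding the alternate block syntax (a sketch only, not the rules in the attached file): the two regular expressions and the function name below are assumptions, and the opener pattern is deliberately simple, so it can misfire on things like a ternary inside a condition. Openers are keywords followed by a parenthesised condition and a colon; closers are the matching end keywords; case and default are never matched:

```python
import re

# Opener: keyword, parenthesised condition, then ":" (alternate syntax only).
OPEN_RE = re.compile(r"\b(if|foreach|for|while|switch)\s*\(.*?\)\s*:")
# Closer: endif, endforeach, endfor, endwhile, endswitch.
CLOSE_RE = re.compile(r"\bend(if|foreach|for|while|switch)\b")


def fold_alternate_blocks(text):
    """Return (keyword, start_line, end_line) for keyword: ... endkeyword; blocks."""
    stack = []
    ranges = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        events = [(m.start(), "open", m.group(1)) for m in OPEN_RE.finditer(line)]
        events += [(m.start(), "close", m.group(1)) for m in CLOSE_RE.finditer(line)]
        for _, kind, keyword in sorted(events):  # process in column order
            if kind == "open":
                stack.append((keyword, lineno))
            elif stack and stack[-1][0] == keyword:
                ranges.append((keyword, stack.pop()[1], lineno))
    return ranges
```

For the <div> example from comment 11 this returns [("if", 2, 4)], so the conditional becomes foldable even though it has no curly braces.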
Comment 15 Milian Wolff 2009-07-22 16:25:18 UTC
Committed, thanks James!