Bug 155100 - Syntax highlighting wrong for certain C++ comments (doxygen)
Status: RESOLVED FIXED
Alias: None
Product: kate
Classification: Applications
Component: syntax
Version: unspecified
Platform: Ubuntu Linux
Importance: NOR normal
Target Milestone: ---
Assignee: KWrite Developers
 
Reported: 2008-01-04 18:58 UTC by Christian Convey
Modified: 2008-01-06 21:12 UTC


Description Christian Convey 2008-01-04 18:58:33 UTC
Version:           2.5.8 (using KDE 3.5.8)
Installed from:    Ubuntu Packages
OS:                Linux

Create a file, such as /tmp/x.cpp, with Kate, and then paste the following 3 lines of text into the file:

/**
It will always be the case that root1 < root2 when this returns.
*/

Starting with the '<' character, the rest of the line is highlighted as though it wasn't part of a comment.

Note that if you modify the first line to be "/*" rather than "/**", the problem goes away.
Comment 1 Christian Convey 2008-01-06 15:37:26 UTC
Note that this bug is still present after I updated Kate's C++ syntax highlighting file to version 1.41.
Comment 2 Thomas Friedrichsmeier 2008-01-06 18:56:07 UTC
The problem is that comments starting with "/**" (or "///") are seen as doxygen comments, and therefore get doxygen highlighting. Therefore some issues like this are not really avoidable.

I guess to alleviate the problem, one could limit HTML tag detection in doxygen to only work on "<something", but not on "< something". To do so, in doxygen.xml, the regexps starting an "HTML Tag" would need to be adjusted from "&lt;\s*\/?\s*[a-zA-Z_:][a-zA-Z0-9._:-]*" to "&lt;\/?[a-zA-Z_:][a-zA-Z0-9._:-]*" (4 instances of this regexp in doxygen.xml).

In fact, doxygen 1.5.3 does not appear to treat constructs with spaces between "<" and the "tagname" as an HTML tag. This seems reasonable, as "a < b" might really be something you may want to write in a comment.
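To illustrate the difference between the two regexps, here is a minimal sketch using Python's re module. This is only an approximation: Kate uses its own regexp engine, "&lt;" in doxygen.xml is just the XML-escaped "<", and the names OLD_TAG_START and NEW_TAG_START are made up for this example.

```python
import re

# Old rule: optional whitespace allowed between "<" and the tag name.
OLD_TAG_START = re.compile(r"<\s*/?\s*[a-zA-Z_:][a-zA-Z0-9._:-]*")
# Proposed rule: the tag name (or "/") must follow "<" directly.
NEW_TAG_START = re.compile(r"</?[a-zA-Z_:][a-zA-Z0-9._:-]*")

# The comment line from the bug description.
line = "It will always be the case that root1 < root2 when this returns."

old_hit = OLD_TAG_START.search(line)
new_hit = NEW_TAG_START.search(line)

# The old rule takes "< root2" for the start of an HTML tag.
print(old_hit.group() if old_hit else None)   # < root2
# The proposed rule finds no tag on this line.
print(new_hit.group() if new_hit else None)   # None
# A real tag without a space is still detected.
print(NEW_TAG_START.search("see <b>bold</b>").group())   # <b
```

With the stricter pattern, "a < b" in running text is left alone, while "<b>" and "</b>" still start a tag.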

Ok to commit such a patch?
Comment 3 Dominik Haumann 2008-01-06 19:15:44 UTC
It's a workaround for the specific case mentioned in the bug report.
I've been thinking for a long time already about removing the html-highlighting support. It has lots of false positives, and doxygen itself has nothing to do with html: it simply doesn't touch html tags.
The highlighting would still be good enough, as you usually do *not* have loads of html tags in doxygen comments.

What do you think?
Comment 4 Dominik Haumann 2008-01-06 19:16:26 UTC
Side note: doxygen has \< and \> for this purpose :)
Comment 5 Thomas Friedrichsmeier 2008-01-06 19:36:14 UTC
Doxygen does seem to touch HTML tags, and seems to (very cursory testing only) convert "<" and ">" to "&lt;" and "&gt;" for some things that look like tags but are not. For instance, "<test>bla</test>" will be converted to "&lt;test&gt;bla&lt;/test&gt;" in doxygen 1.5.3.
There is a list of supported HTML tags at http://www.stack.nl/~dimitri/doxygen/htmlcmds.html .

But that aside, I guess it may indeed be a good idea to just remove the html highlighting support from doxygen.xml, entirely, for the reasons you stated in comment #3.
Comment 6 Dominik Haumann 2008-01-06 19:52:35 UTC
Ok, if you want you can fix it, i.e. either apply your patch or remove all the html contexts + rules needed for it.
Comment 7 Thomas Friedrichsmeier 2008-01-06 20:07:12 UTC
SVN commit 758022 by tfry:

When detecting HTML tags, do not allow space between '<' and the tagname.
CCBUG: 155100


 M  +5 -5      doxygen.xml  


WebSVN link: http://websvn.kde.org/?view=rev&revision=758022
Comment 8 Thomas Friedrichsmeier 2008-01-06 20:10:48 UTC
Ok, the above commit fixes the reported problem (for KDE 4.1). As you can see, it's the more conservative approach from comment #2, instead of removing HTML support entirely. I guess we can see if (when?) further false positives are discovered, and can still remove all HTML support then, if needed.

I'll commit to branches/KDE/4.0 in a minute (need a checkout, first).
Comment 9 Dominik Haumann 2008-01-06 20:40:55 UTC
False positives are still email addresses, like <a@b.c> :) I ignored that so far ;) \<a@b.c\> works, of course.
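A rough sketch of why the email case still trips the tag rule, and why the escaped form does not. It assumes that a backslash escape such as \< is consumed before the tag rule is tried; that ordering is an assumption for illustration, not taken from doxygen.xml. find_tag_starts is a hypothetical helper, and Python's re module stands in for Kate's regexp engine.

```python
import re

# Tag-start pattern without whitespace after "<", as in the committed fix.
TAG_START = re.compile(r"</?[a-zA-Z_:][a-zA-Z0-9._:-]*")


def find_tag_starts(text):
    """Return the substrings that would be taken as HTML tag starts.

    Assumption: a backslash escapes the following character, so an
    escaped "<" is never offered to the tag rule.
    """
    hits = []
    i = 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text):
            i += 2  # skip the escaped character, e.g. \< or \>
            continue
        m = TAG_START.match(text, i)
        if m:
            hits.append(m.group())
            i = m.end()
        else:
            i += 1
    return hits


print(find_tag_starts("mail <a@b.c>"))      # ['<a']
print(find_tag_starts("mail \\<a@b.c\\>"))  # []
print(find_tag_starts("root1 < root2"))     # []
```

The address is still caught because "<a" on its own looks like the start of a tag; only the escaped form or a space after "<" avoids the rule.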
Comment 10 Anders Lund 2008-01-06 21:12:21 UTC
On Sunday 06 January 2008, Thomas Friedrichsmeier wrote:
> The problem is that comments starting with "/**" (or "///") are seen as
> doxygen comments, and therefore get doxygen highlighting. Therefore some
> issues like this are not really avoidable.


That should not happen from inside the cpp highlight in any case, but from the 
doxygen one. How does it happen that a rule in doxygen.xml becomes evaluated 
in a c++ comment? There must be a mistake in doxygen.xml.