Bug 386827 - Errors in syntax definitions
Status: RESOLVED FIXED
Alias: None
Product: frameworks-syntax-highlighting
Classification: Frameworks and Libraries
Component: syntax
Version: unspecified
Platform: Other Linux
Importance: VHI normal
Target Milestone: ---
Assignee: KWrite Developers
 
Reported: 2017-11-13 01:39 UTC by Gene Thomas
Modified: 2018-08-13 14:34 UTC



Attachments
list of errors (70.89 KB, text/plain)
2017-11-13 01:39 UTC, Gene Thomas
all errors including trivial ones (75.40 KB, text/plain)
2017-11-13 01:40 UTC, Gene Thomas

Description Gene Thomas 2017-11-13 01:39:54 UTC
Created attachment 108824 [details]
list of errors

I am implementing a terminal editor that loads KDE syntax highlighting .xml definitions. It does extensive error checking, and as a result I have a list of errors in the syntax definitions, e.g. bad <context> or <itemData> references, or invalid regular expressions.

A common fault is [] in a regular expression: an empty character class can never match.
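
For illustration, here is a minimal Python sketch of how such a check might look. It is not the reporter's implementation: `find_empty_classes` is a hypothetical helper, and it assumes the reading that `[]` opens an empty class. Some regex engines instead treat a `]` directly after `[` as a literal, so a hit is a hint to inspect the pattern, not proof of an error.

```python
def find_empty_classes(pattern):
    """Return the offsets of every unescaped '[]' that opens a character class."""
    hits = []
    i = 0
    in_class = False
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if in_class:
            if ch == "]":
                in_class = False  # end of a non-empty class
        elif ch == "[":
            if pattern[i + 1:i + 2] == "]":
                hits.append(i)  # '[' immediately followed by ']'
                i += 2
                continue
            in_class = True
        i += 1
    return hits


print(find_empty_classes("a[]b"))  # prints [1]
```

Tracking whether the scanner is inside a class avoids false hits on patterns such as `[\]]`, where the bracket is escaped.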

syntax-definitio-errors.txt is the list of errors.
syntax-test-all.log also includes trivial errors, such as chr="xx" being longer than one character.

version: latest download from git://anongit.kde.org/syntax-highlighting
Comment 1 Gene Thomas 2017-11-13 01:40:48 UTC
Created attachment 108825 [details]
all errors including trivial ones
Comment 2 Dominik Haumann 2017-11-24 21:38:06 UTC
This is a nice list. We already have a static syntax checker, but it seems to be missing several of these checks.

Could you provide a patch and upload it to phabricator.kde.org?

Related: https://phabricator.kde.org/D8662
Comment 3 Gene Thomas 2017-11-25 16:04:58 UTC
Hello,
   Glad you like my list.

> We already have a static syntax checker.

What is it called/what do I apt install to get it?

> Could you provide a patch and upload to phabricator.kde.org?

What do you mean? I've implemented my own syntax highlighting engine that loads Kate/KPart syntax highlighting .xml definitions.

If it would be helpful, I could extract the email addresses from the .xml files that have errors so that the authors can be emailed.

Gene Thomas.
Comment 4 Dominik Haumann 2017-11-25 21:14:56 UTC
I am proposing that you patch katehighlightingindexer.cpp located at:
https://github.com/KDE/syntax-highlighting/tree/master/src/indexer

This way, everyone will benefit from the fixes, since the syntax-highlighting repository is still the master repository that contains the most up-to-date syntax files.

And of course, it would be nice to get patches for the fixed .xml files as well :-)
Comment 5 Dominik Haumann 2017-12-03 15:24:07 UTC
The following checks are now in the KSyntaxHighlighting Framework 5.42:

- Validate that for all attributes an itemData exists
  https://phabricator.kde.org/R216:2b8b664e15c0dd2945458d9373f2324e0c69056e

- Highlighting indexer: Check DetectChar and Detect2Chars
  https://phabricator.kde.org/R216:9d30439abda876601d8b6bb1ba2785b9d49e88a8

- Highlighting Indexer: Check for duplicate itemDatas
  https://phabricator.kde.org/R216:6b7cd0c8735aecf543abf54bf8dcabb783ce6b98

- Highlighting Indexer: Warn about duplicate contexts
  https://phabricator.kde.org/R216:621e282acbcb98d486e61f3d65d3293bf342489d

- Highlighting Indexer: Check keyword lists
  https://phabricator.kde.org/R216:463dfc78b4be7b11150420c076a8582be8b98607

- Try detecting unused contexts
  https://phabricator.kde.org/R216:cd7f2837aad12f18b908e7e733a891ed71fdd400

As a result, the highlighting indexer now has many more checks and currently raises a large number of warnings. These should be fixed before the 5.42 release.
Comment 6 Dominik Haumann 2017-12-03 16:46:35 UTC
Current state (as of 2017-12-03):

Unused keyword lists:
- "ample.xml" Unused keyword lists: QSet("icprops", "sgfct", "dvafct")
- "ansforth94.xml" Unused keyword lists: QSet("attention")
- "css.xml" Unused keyword lists: QSet("mediatypes_op")
- "dosbat.xml" Unused keyword lists: QSet("not", "else")
- "euphoria.xml" Unused keyword lists: QSet("constants")
- "freebasic.xml" Unused keyword lists: QSet("Assembly Operators")
- "fsharp.xml" Unused keyword lists: QSet("symbols")
- "ilerpg.xml" Unused keyword lists: QSet("pkeywords", "evalopcodes8")
- "metafont.xml" Unused keyword lists: QSet("notDefined", "EnvDelimiters")
- "pango.xml" Unused keyword lists: QSet("endtags", "tags", "int_attributes", "plain_attributes", "color_attributes")
- "pony.xml" Unused keyword lists: QSet("literal", "types")
- "powershell.xml" Unused keyword lists: QSet("operators", "attributes")
- "prolog.xml" Unused keyword lists: QSet("lists ISO", "listing non-ISO", "directives non-ISO", "lists non-ISO", "terms non-ISO", "streams deprecated", "list+is_list non-ISO")
- "scss.xml" Unused keyword lists: QSet("mediatypes_op")
- "vhdl.xml" Unused keyword lists: QSet("forOrWhile", "directions")
- "xonotic-console.xml" Unused keyword lists: QSet("Aliases")
- "build/frameworks/syntax-highlighting/data/css-php.xml" Unused keyword lists: QSet("mediatypes_op")
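
Both keyword-list reports (unused lists and references to non-existing lists) come down to set differences between the <list name="..."> elements a definition declares and the String attributes of its <keyword> rules. Here is a minimal Python sketch, assuming that plain layout. `keyword_list_report` and the sample definition are hypothetical, and this is not the indexer's actual code.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed definition used only for illustration.
SAMPLE = """
<language name="Demo">
  <highlighting>
    <list name="keywords"><item>if</item></list>
    <list name="constants"><item>PI</item></list>
    <contexts>
      <context name="Normal" attribute="Normal Text" lineEndContext="#stay">
        <keyword attribute="Keyword" context="#stay" String="keywords"/>
        <keyword attribute="Keyword" context="#stay" String="Others"/>
      </context>
    </contexts>
  </highlighting>
</language>
"""


def keyword_list_report(xml_text):
    """Return (unused list names, referenced but undefined list names)."""
    root = ET.fromstring(xml_text)
    defined = {lst.get("name") for lst in root.iter("list")}
    used = {rule.get("String") for rule in root.iter("keyword") if rule.get("String")}
    return defined - used, used - defined


unused, missing = keyword_list_report(SAMPLE)
print(sorted(unused), sorted(missing))  # prints ['constants'] ['Others']
```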

Unused contexts (attention: might be used via IncludeRules):
- "abc.xml" Unused contexts: QSet("Part")
- "ample.xml" Unused contexts: QSet("AfterHash", "Region Marker", "Outscoped")
- "ansys.xml" Unused contexts: QSet("functions_arg", "functions")
- "asterisk.xml" Unused contexts: QSet("Commentar 1")
- "boo.xml" Unused contexts: QSet("Single A-comment", "Single Q-comment")
- "cg.xml" Unused contexts: QSet("Commentar/Preprocessor", "Outscoped")
- "cisco.xml" Unused contexts: QSet("Parameter", "String")
- "dosbat.xml" Unused contexts: QSet("Assign")
- "elixir.xml" Unused contexts: QSet("Find closing block brace", "Comment Line", "regexpr_rules")
- "email.xml" Unused contexts: QSet("body-context")
- "fasm.xml" Unused contexts: QSet("Preprocessor")
- "ferite.xml" Unused contexts: QSet("unknown 2", "unknown")
- "fgl-4gl.xml" Unused contexts: QSet("Normal Text 2", "Normal Text 3")
- "fgl-per.xml" Unused contexts: QSet("Normal Text 2", "Normal Text 3")
- "ftl.xml" Unused contexts: QSet("comment", "values")
- "gcc.xml" Unused contexts: QSet("GNUMacros")
- "haml.xml" Unused contexts: QSet("comment2", "Comment Line", "stringx", "string", "comment0")
- "ilerpg.xml" Unused contexts: QSet("EvalOCCont")
- "isocpp.xml" Unused contexts: QSet("DetectIdentifierEnd")
- "jam.xml" Unused contexts: QSet("RuleDefinitionFull")
- "julia.xml" Unused contexts: QSet("curly", "squared", "nested")
- "kotlin.xml" Unused contexts: QSet("symbols")
- "latex.xml" Unused contexts: QSet("ToEndOfLine")
- "metafont.xml" Unused contexts: QSet("ToEndOfLine")
- "modula-2.xml" Unused contexts: QSet("Comment3")
- "nesc.xml" Unused contexts: QSet("Some Context2", "Some Context")
- "perl.xml" Unused contexts: QSet("package_qualified_blank", "end_handle")
- "povray.xml" Unused contexts: QSet("Commentar")
- "protobuf.xml" Unused contexts: QSet("Commentar")
- "ruby.xml" Unused contexts: QSet("Comment Line")
- "sisu.xml" Unused contexts: QSet("indent")
- "stata.xml" Unused contexts: QSet("Comment 2", "Base")
- "systemverilog.xml" Unused contexts: QSet("Define", "Outscoped")
- "tcsh.xml" Unused contexts: QSet("SubstCommand", "HereDoc", "SubstFile", "AssignSubscr", "ProcessSubst")
- "varnishtest.xml" Unused contexts: QSet("varnish_expectation_arg1_quoted-string", "varnish_expectation_arg1_brace-string", "varnish_expectation_arg1_unquoted-string")
- "varnishtest4.xml" Unused contexts: QSet("varnish_expectation_arg1_quoted-string", "varnish_expectation_arg1_brace-string", "varnish_expectation_arg1_unquoted-string")
- "verilog.xml" Unused contexts: QSet("Some Context2", "Port")
- "xharbour.xml" Unused contexts: QSet("logic")
- "xmldebug.xml" Unused contexts: QSet("45:Enumeration or End", "51:unused")
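
One way to approximate the unused-context check is to collect every context name that appears in a context-switching attribute and subtract that set from the declared names. The Python sketch below is a rough illustration, not the indexer's code: the helper names and the sample are hypothetical, the first context is assumed to be the entry point, and IncludeRules references are counted as uses, which is one way of handling the caveat in the heading above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed definition used only for illustration.
SAMPLE = """
<language name="Demo">
  <highlighting>
    <contexts>
      <context name="Normal" attribute="Normal Text" lineEndContext="#stay">
        <DetectChar char="'" context="String"/>
        <IncludeRules context="Shared"/>
      </context>
      <context name="String" attribute="String" lineEndContext="#pop"/>
      <context name="Shared" attribute="Normal Text" lineEndContext="#stay"/>
      <context name="Orphan" attribute="Normal Text" lineEndContext="#pop!Normal"/>
    </contexts>
  </highlighting>
</language>
"""

REF_ATTRS = ("context", "lineEndContext", "lineEmptyContext", "fallthroughContext")


def referenced_name(ref):
    """Extract a local context name from a reference such as '#pop!Name'."""
    if not ref or "##" in ref:  # empty, or pointing into another definition
        return None
    if ref.startswith("#"):
        # '#stay', '#pop', '#pop#pop!Name': only the part after '!' names a context
        _, bang, name = ref.partition("!")
        return name if bang and name else None
    return ref


def unused_contexts(xml_text):
    root = ET.fromstring(xml_text)
    contexts = list(root.iter("context"))
    names = [ctx.get("name") for ctx in contexts]
    used = set(names[:1])  # assumption: the first context is the start context
    for ctx in contexts:
        for elem in ctx.iter():  # the context element itself plus its rules
            for attr in REF_ATTRS:
                name = referenced_name(elem.get(attr))
                if name:
                    used.add(name)
    return set(names) - used


print(unused_contexts(SAMPLE))  # prints {'Orphan'}
```

This treats any reference as a use, even one that comes from a context that is itself unreachable. A stricter checker would walk the references starting from the first context.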

Reference of non-existing keyword list:
- "asp.xml" Reference of non-existing keyword list: QSet("Others")
- "qml.xml" Reference of non-existing keyword list: QSet("keywords")
- "template-toolkit.xml" Reference of non-existing keyword list: QSet("Others")

Reference of non-existing itemData:
- "haml.xml" Reference of non-existing itemData attributes: QSet("Escaped Text", "Ruby embedded in haml", "Array")
- "lilypond.xml" Reference of non-existing itemData attributes: QSet("Tremolo")
- "metafont.xml" Reference of non-existing itemData attributes: QSet("Tex", "Bullet", "Verbatim")
- "relaxng.xml" Reference of non-existing itemData attributes: QSet("Entity Reference")
- "rhtml.xml" Reference of non-existing itemData attributes: QSet("RUBY RAILS ERB Text")
- "rmarkdown.xml" Reference of non-existing itemData attributes: QSet("Markdown", "Document Headers")
- "stata.xml" Reference of non-existing itemData attributes: QSet("String2")
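
The itemData check can be sketched the same way: gather the names declared by <itemData> elements and compare them with every attribute="..." value used by contexts and rules. The Python below is a minimal illustration, assuming names are compared exactly as written. `missing_item_data` and the sample are hypothetical and not taken from the indexer.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed definition used only for illustration.
SAMPLE = """
<language name="Demo">
  <highlighting>
    <contexts>
      <context name="Normal" attribute="Normal Text" lineEndContext="#stay">
        <DetectChar attribute="String2" char="'" context="#stay"/>
      </context>
    </contexts>
    <itemDatas>
      <itemData name="Normal Text" defStyleNum="dsNormal"/>
    </itemDatas>
  </highlighting>
</language>
"""


def missing_item_data(xml_text):
    """Return attribute names that are referenced but never declared as itemData."""
    root = ET.fromstring(xml_text)
    defined = {item.get("name") for item in root.iter("itemData")}
    referenced = set()
    for ctx in root.iter("context"):
        for elem in ctx.iter():  # the context element itself plus its rules
            attr = elem.get("attribute")
            if attr:
                referenced.add(attr)
    return referenced - defined


print(missing_item_data(SAMPLE))  # prints {'String2'}
```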

Duplicate contexts:
- "diff.xml" context duplicate: "File"
- "objectivecpp.xml" context duplicate: "Preprocessor"
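
Duplicate detection only needs a count per context name. Here is a short Python sketch under the same assumptions as before. `duplicate_contexts` and the sample are hypothetical, and the same counting approach would work for duplicate itemData names.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily trimmed definition used only for illustration.
SAMPLE = """
<language name="Demo">
  <highlighting>
    <contexts>
      <context name="Normal" attribute="Normal Text" lineEndContext="#stay"/>
      <context name="File" attribute="Normal Text" lineEndContext="#stay"/>
      <context name="File" attribute="Normal Text" lineEndContext="#pop"/>
    </contexts>
  </highlighting>
</language>
"""


def duplicate_contexts(xml_text):
    """Return the sorted names of contexts declared more than once."""
    root = ET.fromstring(xml_text)
    counts = Counter(ctx.get("name") for ctx in root.iter("context"))
    return sorted(name for name, n in counts.items() if n > 1)


print(duplicate_contexts(SAMPLE))  # prints ['File']
```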

Too many chars in DetectChar / Detect2Chars:
- "xmldebug.xml" line 130 'char' must contain exactly one char: "\"(\\s+|$)"
- "xmldebug.xml" line 135 'char' must contain exactly one char: "'(\\s+|$)"
- "xmldebug.xml" line 374 'char' must contain exactly one char: "\"(\\s+|$)"
- "xmldebug.xml" line 379 'char' must contain exactly one char: "'(\\s+|$)"
- "xmldebug.xml" line 416 'char' must contain exactly one char: "\"(\\s+|$)"
- "xmldebug.xml" line 421 'char' must contain exactly one char: "'(\\s+|$)"
- "xmldebug.xml" line 468 'char' must contain exactly one char: "\"(\\s+|$)"
- "xmldebug.xml" line 473 'char' must contain exactly one char: "'(\\s+|$)"
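
A check of this kind can be sketched in a few lines of Python. The sketch assumes that DetectChar carries a single `char` attribute and Detect2Chars carries `char` and `char1`. `bad_char_attributes` and the sample are hypothetical, and unlike the messages above the sketch does not report line numbers, because ElementTree does not expose them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed definition used only for illustration.
SAMPLE = r"""
<language name="Demo">
  <highlighting>
    <contexts>
      <context name="Normal" attribute="Normal Text" lineEndContext="#stay">
        <DetectChar attribute="Normal Text" context="#stay" char="'(\s+|$)"/>
        <Detect2Chars attribute="Normal Text" context="#stay" char="/" char1="*"/>
      </context>
    </contexts>
  </highlighting>
</language>
"""

CHAR_ATTRS = {"DetectChar": ("char",), "Detect2Chars": ("char", "char1")}


def bad_char_attributes(xml_text):
    """Return (tag, attribute, value) for each value that is not exactly one character."""
    root = ET.fromstring(xml_text)
    problems = []
    for elem in root.iter():
        for attr in CHAR_ATTRS.get(elem.tag, ()):
            value = elem.get(attr, "")  # a missing attribute counts as empty
            if len(value) != 1:
                problems.append((elem.tag, attr, value))
    return problems


for tag, attr, value in bad_char_attributes(SAMPLE):
    print(f"{tag}: '{attr}' must contain exactly one char: {value!r}")
```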
Comment 7 Dominik Haumann 2017-12-17 12:01:06 UTC
Remaining issues:

ContextChecker::processElement: "diff.xml" Duplicate context: "File"
ContextChecker::processElement: "objectivecpp.xml" Duplicate context: "Preprocessor"

AttributeChecker::check: "haml.xml" Reference of non-existing itemData attributes: QSet("Ruby embedded in haml", "Array", "Escaped Text")
AttributeChecker::check: "lilypond.xml" Reference of non-existing itemData attributes: QSet("Tremolo")
AttributeChecker::check: "metafont.xml" Reference of non-existing itemData attributes: QSet("Tex", "Bullet", "Verbatim")
AttributeChecker::check: "relaxng.xml" Reference of non-existing itemData attributes: QSet("Entity Reference")
AttributeChecker::check: "rhtml.xml" Reference of non-existing itemData attributes: QSet("RUBY RAILS ERB Text")
AttributeChecker::check: "rmarkdown.xml" Reference of non-existing itemData attributes: QSet("Document Headers", "Markdown")

KeywordChecker::check: "prolog.xml" Unused keyword lists: QSet("list+is_list non-ISO", "lists ISO", "listing non-ISO", "directives non-ISO", "streams deprecated", "lists non-ISO", "terms non-ISO")
Comment 8 Dominik Haumann 2017-12-17 14:52:27 UTC
@Gene Thomas: Can you please update to today's highlighting files from the KSyntaxHighlighting framework?

I fixed all issues except the regular expression problems you found (Qt does not seem to detect these). It would be nice to get another updated list from you so that we can fix as many issues as possible.
Comment 9 Dominik Haumann 2017-12-23 21:02:13 UTC
@Gene Thomas: ping
Comment 10 Dominik Haumann 2018-05-29 21:05:23 UTC
@Gene Thomas: Friendly ping again.
Comment 11 Dominik Haumann 2018-08-13 14:34:41 UTC
We have improved this heavily, see commits, so I'll close this as fixed. If there are more issues, let's open a new report.