Bug 398314 - Single quotes in literal style block scalars break YAML syntax highlighting
Status: RESOLVED FIXED
Alias: None
Product: frameworks-syntax-highlighting
Classification: Frameworks and Libraries
Component: syntax
Version: 5.50.0
Platform: Other Linux
Importance: NOR normal
Target Milestone: ---
Assignee: Nibaldo G.
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-09-06 09:01 UTC by 林博仁(Buo-ren, Lin)
Modified: 2018-09-28 03:22 UTC

See Also:
Latest Commit:
Version Fixed In: 5.51.0
Sentry Crash Report:


Attachments
sample file (4.27 KB, application/x-yaml)
2018-09-06 09:01 UTC, 林博仁(Buo-ren, Lin)
sample screenshot (64.37 KB, image/png)
2018-09-06 09:01 UTC, 林博仁(Buo-ren, Lin)

Description 林博仁(Buo-ren, Lin) 2018-09-06 09:01:17 UTC
Created attachment 114802
sample file

Refer to the attached sample file and screenshot. The content of a literal style block scalar should allow any printable character:

```
Inside literal scalars, all (indented) characters are considered to be content, including white space characters. Note that all line break characters are normalized. In addition, empty lines are not folded, though final line breaks and trailing empty lines are chomped.

There is no way to escape characters inside literal scalars. This restricts them to printable characters. In addition, there is no way to break a long literal line. 
```

But currently, all content after a single quote is highlighted red until another single quote is encountered.

Refer to: Literal Style - Block Scalar Styles - Block Styles - YAML Ain’t Markup Language (YAML™) Version 1.2 <http://yaml.org/spec/1.2/spec.html#id2795688>
YAML highlight definition file: version 4
Comment 1 林博仁(Buo-ren, Lin) 2018-09-06 09:01:56 UTC
Created attachment 114803
sample screenshot
Comment 2 Nibaldo G. 2018-09-13 08:18:09 UTC
I tried to fix this bug for KDE Frameworks 5.50, but I could not. I had trouble capturing the exact indentation of the Key, since all lines with the Key's indentation plus a space are considered literal.
A quick (and temporary) option would be to highlight only the first line after "|" or ">" as literal...
Comment 3 Nibaldo G. 2018-09-13 08:21:26 UTC
I have re-assigned to "frameworks-syntax-highlighting", version 5.50.0
Comment 4 Nibaldo G. 2018-09-13 09:01:25 UTC
I forgot to mention that this problem also exists in editors such as Atom, Sublime Text and Visual Studio Code. But, they partially solve this problem by highlighting strings and brackets (and other values) only if these are at the beginning of a line (or of a key's value).

A similar solution could also be applied in KSyntaxHighlighting. This would solve errors like the one 林博仁 shows in his code, but there would still be problems, for example, if a quote is at the beginning of a literal line.
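
The heuristic described above, treating a quote as the start of a string only when it begins a line or a key's value, can be sketched in Python. This is only an illustration under assumptions: the pattern is not taken from any of the editors named, and `VALUE_START_QUOTE` and `opens_string` are hypothetical names.

```python
import re

# Hypothetical pattern: a quote opens a string only when it is the first
# character of a value, i.e. after optional indentation, optional "- "
# sequence markers and an optional "key: " prefix.
VALUE_START_QUOTE = re.compile(r"""^\s*(?:- +)*(?:[^\s'"#][^:'"]*: +)?['"]""")


def opens_string(line: str) -> bool:
    """Return True if the line's value starts with a quote character."""
    return VALUE_START_QUOTE.match(line) is not None


print(opens_string("name: 'value'"))        # quoted value of a key
print(opens_string("  it's plain text"))    # apostrophe in the middle of a line
print(opens_string("  'starts a literal"))  # quote at the start of a literal line
```

The second call is not treated as a string, which avoids the large red range from the report. The third call is still detected as a string even when the line sits inside a literal block, which is the remaining problem mentioned above.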
Comment 5 林博仁(Buo-ren, Lin) 2018-09-13 09:08:47 UTC
An improvement is appreciated even if it is not a complete fix; the large range of red text is unbearable.
Comment 6 Nibaldo G. 2018-09-26 20:57:05 UTC
I managed to capture the indentation correctly!
Proposed patch: https://phabricator.kde.org/D15780
Comment 7 Nibaldo G. 2018-09-28 03:22:50 UTC
Git commit c32cf6020931ba47e04f477fd9144230c97c25dc by Nibaldo González.
Committed on 28/09/2018 at 03:22.
Pushed by ngonzalez into branch 'master'.

YAML: add literal & folded block styles

Summary:
Highlight literal blocks after the operators `|`, `|-`, `|+`, `>`, `>-` and `>+`.

To do this correctly, the indentation of the Key or operator is captured (with dynamic rules). Note that in nested block collections, the `-` and `?` characters are considered as part of the indentation (ref. [2] & [3]):

* With Key: Text lines with indentation of the Key plus a space are considered literal. The `-` and `?` operators are considered as part of the indentation:
{F6286907}

* If there is no Key present: the literal/folded operator is at the beginning of the line or there is a `-` or `?` character before it. In the first case, the indentation of the literal/folded operator is captured and, in the second, the indentation of `-` or `?`. In nested blocks or sequences, the indentation of the last operator `?` or `-` is captured:
{F6286908}

* But this implementation has a limitation: it only supports 6 nested operators (`?` and `-`) at most.

This only works with indentations with spaces. If a tab is detected, it is highlighted with "Alert".
The empty lines are also part of the literal block.

Also, some minor improvements are included: the sequences require a dash plus a space.
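
The indentation rule described in this summary can be sketched in Python. This is an illustration of the rule only, not the implementation in `yaml.xml` (which uses dynamic rules in the syntax definition); `HEADER` and `block_scalar_lines` are hypothetical names, the key pattern is deliberately simplified (no quoted keys), and the sketch does not reproduce the 6-level nesting limit or the "Alert" highlighting of tabs.

```python
import re

# Hypothetical header pattern: leading spaces and "- " / "? " indicators
# (counted as indentation), an optional "key: ", then a block scalar
# operator (|, |-, |+, >, >-, >+) at the end of the line.
HEADER = re.compile(r"^(?P<indent>(?: *[-?] )* *)(?P<key>[^\s#:][^:]*: +)?[|>][+-]?\s*$")


def block_scalar_lines(text: str) -> list:
    """Return 0-based indices of lines that are block scalar content."""
    content = []
    parent = None  # captured indentation of the key or operator while a block is open
    for number, line in enumerate(text.splitlines()):
        if parent is not None:
            # Only spaces count as indentation; a tab does not.
            indent = len(line) - len(line.lstrip(" "))
            if line.strip() == "" or indent > parent:
                content.append(number)  # blank lines are part of the block too
                continue
            parent = None  # a line that is not indented deeper closes the block
        match = HEADER.match(line)
        if match:
            prefix = match.group("indent")
            last = max(prefix.rfind("-"), prefix.rfind("?"))
            if match.group("key") or last < 0:
                parent = len(prefix)  # column of the key, or of a bare operator
            else:
                parent = last  # column of the last "-" or "?"
    return content


SAMPLE = """\
notes: |
  it's a literal line
  # not a comment

title: 'quoted'
items:
  - text: >-
      folded line
    id: 1
  - |
   bare block
  - done
"""

print(block_scalar_lines(SAMPLE))  # [1, 2, 3, 7, 10]
```

In the sample, the apostrophe in `it's` and the `#` both fall on lines classified as block content, so a highlighter following this rule would not start a string or a comment there.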

**Source**:
YAML 1.2 Specs:
* [1] Chapter 8: Block Styles: http://yaml.org/spec/1.2/spec.html#style/block/
* [2] 6.1. Indentation Spaces: http://yaml.org/spec/1.2/spec.html#id2777534
* [3] 8.2.1. Block Sequences: http://yaml.org/spec/1.2/spec.html#id2797382
FIXED-IN: 5.51.0

Test Plan:
I verified the changes against:
* https://hackage.haskell.org/package/YamlReference
* https://github.com/haphan/yaml-validator

Reviewers: cullmann, dhaumann, #framework_syntax_highlighting, turbov

Reviewed By: turbov

Subscribers: turbov, kwrite-devel, kde-frameworks-devel

Tags: #kate, #frameworks

Differential Revision: https://phabricator.kde.org/D15780

M  +78   -5    autotests/folding/test.yaml.fold
M  +78   -5    autotests/html/test.yaml.html
M  +77   -4    autotests/input/test.yaml
M  +78   -5    autotests/reference/test.yaml.ref
M  +265  -19   data/syntax/yaml.xml

https://commits.kde.org/syntax-highlighting/c32cf6020931ba47e04f477fd9144230c97c25dc