Bug 91123

Summary: ruby code folding support
Product: [Applications] kate
Reporter: Thibauld Favre <tfavre>
Component: general
Assignee: KWrite Developers <kwrite-bugs-null>
Status: RESOLVED FIXED
Severity: wishlist
CC: langstefan, tfavre
Priority: NOR
Version: 2.3
Target Milestone: ---
Platform: unspecified
OS: Linux
Attachments:
patch to add ruby code folding support to kate
the full ruby.xml file
New version (1.10) of ruby.xml
A test file
MANY enhancements for Ruby syntax highlighting, diff -u against version 1.10
ruby.xml version 1.11 (Source of last submitted patch 8620)
ruby.xml file (1.12)
Updated test file
ruby.xml 1.11, + code folding on else/elsif/rescue/ensure
ruby.xml version 1.13 - adds support for the most common general delimited input formats
Updated testcases
ruby.xml version 1.14 - A couple fixes more
Testcases - updated with a few more examples
ruby.xml 1.14, fix "Member Access" context and "\\" problem.
Add tests, passes "ruby -c" syntax check now
ruby.xml, made the three requested fixes
A few fixes for 1.15
ruby.xml 1.15

Description Thibauld Favre 2004-10-11 16:20:08 UTC
Version:           2.3 (using KDE 3.3.0,  (3.1))
Compiler:          gcc version 3.3.4 (Debian 1:3.3.4-12)
OS:                Linux (i686) release 2.6.7

It would be just awesome to have kate support code folding for ruby.

Thanks,

Thibauld
Comment 1 Dominik Haumann 2004-10-11 21:35:28 UTC
As far as I know this is not that easy, but we'll see what we can do :)
Btw, everybody is allowed to implement new features, so if you want to help 
us just go ahead.

(I CCed the author of the ruby.xml file)

Comment 2 Dominik Haumann 2004-10-12 18:15:31 UTC
Btw, you might want to try using 
# BEGIN comment
# END comment

These are the region markers, which have been available since at least KDE 3.3.0 (maybe even KDE 3.2.x). This way you can add folding markers around the enclosed code.
Comment 3 Dominik Haumann 2004-10-14 23:07:53 UTC
> Example:
>
> # in ruby we can have
> if (condition) do
>    puts "something"
> end
> # and also
> puts "something" if (condition) # this is the 'predicated' case

To make it clearer:
If for example *EVERY* "do" had a corresponding "end" we could add code 
folding for do ... end. But if there is one exception, this won't work.
And afaik there are exceptions, aren't there? :)
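
A minimal Ruby sketch of the ambiguity, assuming plain `if ... end` for the block form (the names `condition`, `log` and `count` are illustrative, not from the thread). The same keyword sometimes opens a region that needs its own "end" and sometimes does not, so a rule that opens a fold on every "if" or "do" would go wrong:

```ruby
condition = true
log = []

# Block form: the keyword opens a region that is closed by "end".
if condition
  log << "block form"
end

# Modifier ("predicated") form: same keyword, but there is no "end" to pair with.
log << "modifier form" if condition

# "do" is paired with "end" here ...
[1, 2].each do |n|
  log << "do/end #{n}"
end

# ... but in "while ... do" the "while" and the "do" share a single "end".
count = 0
while count < 1 do
  count += 1
end

puts log
```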

Comment 4 Thibauld Favre 2004-11-01 23:32:18 UTC
Created attachment 8128 [details]
patch to add ruby code folding support to kate
Comment 5 Thibauld Favre 2004-11-01 23:33:03 UTC
You'll find a patch attached that adds code folding support for ruby. It's the first time I have come to grips with kate, so it might (actually it _has to_) contain awkward things! I haven't tested it extensively so far, but I tested it on some of my files and it looks ok.
It also addresses a ruby highlighting bug with quoted strings that span several lines.
KNOWN PROBLEMS:
- "when" statements aren't supported. I tried but was unable to get them to work correctly, especially when dealing with nested "case" statements.
Comment 6 Thibauld Favre 2004-11-02 09:01:54 UTC
Created attachment 8130 [details]
the full ruby.xml file

Here's the full ruby.xml file because I think my patch was made incorrectly. I also
incremented the file version to 1.09.
Comment 7 Dominik Haumann 2004-11-03 20:10:11 UTC
On Wednesday 03 November 2004 18:43, Sebastian Vuorinen wrote:
> Example:
>
> def my_method()
>  puts "foo"
> end # my_method
>       ^^^
> The comment following end is not highlighted as one. It used to be
> recognized before.
>
> Other than that, good work. The folding seems to handle most cases like
> it should.
Most? Not *all*? I don't like to commit buggy code-folding.

Comment 8 Thibauld Favre 2004-11-04 01:02:41 UTC
Created attachment 8154 [details]
New version (1.10) of ruby.xml

This version solves 2 bugs:
1) the one reported about comments not being highlighted after an "end"
statement
2) there was also a problem with the inline "if" statement, which has been
solved.

Now concerning the code folding support being "mostly" complete, I personally
consider ruby code folding support to be "fully" complete with this file. As I
said, the only thing left is being able to fold "when" statements but I think
this is not a showstopper to commit the file.
Comment 9 Thibauld Favre 2004-11-04 01:06:34 UTC
Created attachment 8155 [details]
A test file

Here's a file containing ruby code samples from the "Programming ruby" book. I
checked and there's no regression in highlighting compared to the old version
of ruby.xml.
Comment 10 Dominik Haumann 2004-11-04 16:32:48 UTC
>Now concerning the code folding support being "mostly" complete, I
>personally consider ruby code folding support to be "fully" complete with
>this file. As I said, the only thing left is being able to fold "when"
>statements but I think this is not a showstopper to commit the file.   
As I said, missing code folding for a special case does not harm at all. What
harms is *wrong* code folding, which will confuse/break the code folding.
So just give me time - at the weekend I will commit your file! :)
Comment 11 Dominik Haumann 2004-11-07 18:10:54 UTC
Committed, thanks.
As a side note: You use many RegExps. Regular expressions are *very* slow. So there is room to improve the performance; for KDE 3.4 (current CVS) there are new features like firstNonSpace="true", so a regexp like "^..." is not necessary anymore.
Comment 12 Stefan Lang 2004-12-11 18:57:43 UTC
Created attachment 8620 [details]
MANY enhancements for Ruby syntax highlighting, diff -u against version 1.10

Improvements for Ruby syntax highlighting through this patch:
[ compared to current CVS version (1.10) of ruby.xml]
* Code folding (class/module level, do - end, def - end , for[do] end .......)
        (Current CVS version: folds on { - } only)
* Recognizes HERE docs (highlights substitutions in HERE docs)
* Highlights method calls (with invocant)
* Highlights constants (e.g.: MY_CONSTANT)
* Highlights class/module names (e.g.: MyClass)
* Highlights global vars (e.g.: $var)
* Handles strings spanning multiple lines correctly.
* CVS version stops string even on \" (same for %w{} %q{}....)
* Recognizes more operators (e.g.: +=, -= (and similar), ..., .., <=> etc.)
* Recognizes (more) number literals in more places
* Highlights Kernel methods (puts, print, etc.)
* Recognizes __END__ line
* Recognizes class/instance var substitution (e.g.: "Name: #@name")
* Multiline regexes if lines end with "\"
and some other things

Note: I will attach the whole new "ruby.xml" also, because nearly
everything changed compared to current CVS version, and the
patch itself is bigger than the new file.
Comment 13 Stefan Lang 2004-12-11 19:01:47 UTC
Created attachment 8622 [details]
ruby.xml version 1.11 (Source of last submitted patch 8620)

The (last submitted) patch 8620 was created with
diff of this attachment against ruby.xml version 1.10.

Read the comment of the above mentioned patch about
improvements.
Comment 14 Stefan Lang 2004-12-11 19:56:52 UTC
Additional note to the patch (8622, ruby.xml 1.11) I sent:
I wrote that I added code folding:
The code folding of 1.10 gets confused very often.
Just look at some Ruby sources from the Ruby 1.8
library (e.g. date.rb).

I tested my new ruby.xml with many library files of Ruby 1.8
and there is only one issue I found in cgi.rb:
<code>
methods += <<-BEGIN + nO_element_def(element) + <<-END
          def #{element.downcase}(attributes = {})
        BEGIN
          end
        END
</code>
The HERE document is only recognized up to BEGIN, so the
"end" between BEGIN and END is interpreted as the end
of a code block.
But code folding works correctly for the first 2000 lines
of this file.
(Note that the problems start much earlier with
ruby.xml 1.10)
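
A reduced, runnable version of that construct (the `element` value and the middle string are made up for illustration) shows why it is hard to highlight: both HERE documents start on the same line, and the "end" between BEGIN and END is string content, not a keyword.

```ruby
element = "DIV"

# Two HERE documents are opened on one line. The first body runs up to BEGIN,
# the second body starts right after it and runs up to END.
source = <<-BEGIN + "  # generated body\n" + <<-END
  def #{element.downcase}(attributes = {})
BEGIN
  end
END

puts source
```

A highlighter that only tracks the first delimiter sees the "end" in the second body as code and closes a fold region too early.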
Comment 15 Thibauld Favre 2004-12-12 00:00:33 UTC
Hi Stefan,

Your comments made me find a few flaws in my code folding code indeed! A few comments on your ruby.xml file though:
* it doesn't handle code folding correctly for this basic code (should be easy to fix) :
-------
rescue SystemCallError
  $stderr.print "IO failed: " + $!
  opFile.close
  File.delete(opName)
  raise
end
-------
* Code folding support is more basic than in version 1.10 : There's no support for "else" and "elsif" statements.

* Regarding date.rb file, at some point I simply don't understand the code :) (I'm far from being a ruby expert). Does the following code make sense ? (see inline comments)
-------
##########
# lot of well formed code... (line 384)
##########
  def self.commercial_to_jd(y, w, d, ns=GREGORIAN)
    jd = civil_to_jd(y, 1, 4, ns)
    (jd - (((jd - 1) + 1) % 7)) +
      7 * (w - 1) +
      (d - 1)
  end
#######
# Still ok for me so far....
#######
  %w(self.clfloor clfloor).each do |name| ##1 This code doesn't belong to any method ! Is it ok ?
    module_eval <<-"end;"
      def #{name}(x, y=1) ##2 Can we define something inside a block of code ?
        q, r = x.divmod(y)
        q = q.to_i
        return q, r
      end
    end;
  end ##3 Considering the code that follows, we're not at class end so to me this last "end" has nothing to do here, or am I missing smth ??
--------
(By the way, your code breaks here too :))

All in all, I agree that my ruby.xml needs improvements but so far yours didn't quite convince me either ;)
Comment 16 Anders Lund 2004-12-12 00:25:44 UTC
Couldn't you two try to cooperate and merge your work?
And if either of you runs cvs HEAD or the 3.4 alpha, please look at optimization options.
Comment 17 Thibauld Favre 2004-12-12 10:44:48 UTC
Created attachment 8630 [details]
ruby.xml file (1.12)

Sure. I think we should capitalize on Stefan's work rather than mine. To me,
the most important thing to do now would be to support "else/elsif"
blocks. Here's Stefan's ruby.xml with a one-line change to support the
"rescue" statement.
Comment 18 Thibauld Favre 2004-12-12 10:48:55 UTC
Created attachment 8631 [details]
Updated test file

Here's an updated test file. I added the problematic ruby code found in cgi.rb
and date.rb.
Comment 19 Stefan Lang 2004-12-12 13:41:27 UTC
> %w(self.clfloor clfloor).each do |name| ##1 This code doesn't belong to any method ! Is it ok ?
>     module_eval <<-"end;"
>       def #{name}(x, y=1) ##2 Can we define something inside a block of code ?
>  q, r = x.divmod(y)
>  q = q.to_i
>  return q, r
>       end
>     end;
>   end ##3 Considering the code that follows, we're not at class end so to me this last "end" has nothing to do here, or am I missing smth ??

ad ##1: It should be OK. A class definition is executable code,
    and the "self" should refer to the current class.

ad ##2: This is the first line of a HERE document,
   which is terminated by "end;" on its own line.

ad ##3: This is the terminating "end" for the "do"
    in the first line of this code snippet.
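
A small runnable sketch of the same pattern, assuming a made-up class `Rounding` and method name `floor_pair` (neither is from date.rb):

```ruby
class Rounding
  # Executable code directly in the class body; "self" is the class here.
  %w(floor_pair).each do |name|
    # Everything up to the line "end;" is a string (a HERE document),
    # so the "def" and "end" inside it are text, not keywords.
    module_eval <<-"end;"
      def #{name}(x, y = 1)
        q, r = x.divmod(y)
        return q.to_i, r
      end
    end;
  end # closes the "do" of the each block
end

p Rounding.new.floor_pair(7, 2)
```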

There are some syntax errors in your test file,
please check it with "ruby -c".

WRT the "rescue" statement:
It must appear inside a "begin" or "def" block,
so the code folding section starts at "begin"/"def" (others?)
and ends with the corresponding "end".

Your example should appear e.g. inside a method definition:

def do_something
  ... # code which throws SystemCallError
rescue SystemCallError
  $stderr.print "IO failed: " + $!
  opFile.close
  File.delete(opName)
  raise
end  # This "end" belongs to "def", not "rescue"
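
A runnable sketch of both placements, using made-up names (`risky_division`, `parsed`): in each case there is exactly one "end", and it closes the "def" or "begin", so "rescue" and "ensure" only split the region rather than open a new one.

```ruby
# rescue/ensure inside a method body: the single "end" closes the "def".
def risky_division(a, b)
  a / b
rescue ZeroDivisionError
  :division_failed
ensure
  # Runs in both cases and is still part of the same def ... end region.
end

# rescue inside an explicit begin block: the single "end" closes the "begin".
parsed = begin
  Integer("not a number")
rescue ArgumentError
  nil
end

p risky_division(6, 3)
p risky_division(1, 0)
p parsed
```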

PS: I am neither an expert on the Ruby language nor on the English one.
Comment 20 Stefan Lang 2004-12-13 00:05:07 UTC
Created attachment 8643 [details]
ruby.xml 1.11, + code folding on else/elsif/rescue/ensure

OK, ruby.xml version 1.11.
Replacement for attachment 8622.

Additional code folding on else/elsif/rescue/ensure.
Also recognizes the "ASCII code" operator, e.g. "?a".
This solves another problem: The previous versions of
ruby.xml interpreted the quote in ?" as string start.
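
For reference, a short Ruby sketch of the literal in question (the method `blank?` is a made-up example). On Ruby 1.8, which this thread targets, `?a` evaluates to the integer 97; on Ruby 1.9 and later it is the one-character string "a".

```ruby
letter = ?a   # character literal
quote  = ?"   # character literal for the double quote; no string starts here

# A method name may also end in "?", which is a different use of the same character.
def blank?(text)
  text.strip.empty?
end

puts letter.inspect
puts quote.inspect
puts blank?("  ")
```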
Comment 21 Sebastian Vuorinen 2004-12-13 13:36:20 UTC
Created attachment 8647 [details]
ruby.xml version 1.13 - adds support for the most common general delimited input formats

The general delimited input formats were highlighted as keywords, which is
mixing apples with oranges. If the user really wants them to be the same color
he can always set it up in kate's 'settings > configure kate > schemas >
highlighting text styles'. I didn't take any stance on what the default color
for GDL input should be, so it uses the dsOthers default for now.

There were also some commonly used formats not supported.

This file includes Stefan's latest 1.11 version.

Sebastian
Comment 22 Sebastian Vuorinen 2004-12-13 13:44:58 UTC
Created attachment 8648 [details]
Updated testcases

I reviewed the highlighting file against the 1.8.2pr3 version of Webrick. I
added snippets from problematic places in the testcase file.

Sebastian
Comment 23 Dominik Haumann 2004-12-13 15:59:41 UTC
Is this now the "final" version to commit? If so, tell us, as a new version pops up every day right now :-)
Comment 24 Stefan Lang 2004-12-14 00:59:48 UTC
> Is this now the "final" version to commit?

IMHO yes (at least for the next four weeks ;)
What do the others say? Sebastian, Thibauld?
Comment 25 Sebastian Vuorinen 2004-12-14 14:34:27 UTC
Created attachment 8655 [details]
ruby.xml version 1.14 - A couple fixes more

This has the following changes:
 - The GDL input had wrong values for some of the marker escaping
 - GDL input format for shell commands can extend to multiple lines now.
   This is useful when piping commands together for example.
 - Member Access context now handles nested modules as it should.
   Ex.: Foo::Bar::baz
 - '/=' requires a space after it now, so it won't get confused with regexps
   anymore. This is the same behaviour as '/'. Not ideal but works.
 - Removed Array, Integer, Float and String from kernel-methods. While these
   classes are built into Ruby, this is an implementation detail that should
   not be exposed to users. If we really want to handle std-lib class names
   there should be a std-lib category for them.
Comment 26 Sebastian Vuorinen 2004-12-14 14:42:17 UTC
Created attachment 8656 [details]
Testcases - updated with a few more examples

A new testcase file that has the old cases and adds a few more.

I reviewed the highlighting against the classes in the Net module. Some of the
things I fixed in 1.14; others are shown in this file for future reference.
Things like the proper handling of heredocs are still open problems.

Sebastian
Comment 27 Sebastian Vuorinen 2004-12-14 15:09:23 UTC
On Tuesday 14 December 2004 01:59, Stefan Lang wrote:
> > Is this now the "final" version to commit?
>
> IMHO yes (at least for the next four weeks ;)
> What do the others say? Sebastian, Thibauld?

It turns out it was not 'final' ;)
I myself would be happy to commit version 1.14.

It still has some problems as shown in the testcase file, but those are hard 
cases, some of which would require additional features from the katepart 
highlighting. 

Handling HEREDOC correctly really would require the ability to store the match 
on which a context is entered. This would make the GDL input formats easier 
to handle too. We wouldn't have to duplicate so much of the rules.

For the moment I haven't got any idea how to go about the remaining issues.

Unless Thibauld has something to add, let's go with version 1.14.

Sebastian

PS. version 1.14 still works with Kate 2.2.1 too. We'll have to see about the 
optimizations in the newer versions at some point.

Comment 28 Anders Lund 2004-12-14 16:31:48 UTC
On Tuesday 14 December 2004 15:09, Sebastian Vuorinen wrote:
> Handling HEREDOC correctly really would require the ability to store the
> match on which a context is entered. This would make the GDL input formats
> easier to handle too. We wouldn't have to duplicate so much of the rules.

The latest versions of kate have a way to produce a dynamic context. We use
that for handling HERE strings in several highlights, for example in perl,
which I use as an example below:

This is the rule that is used for finding the start of a HERE string. Note
that this rule does not match the HERE delimiter, it just looks for it (using
a lookahead assertion in the regex pattern). It redirects to the
'find_here_document' context.
<RegExpr attribute="Operator" context="find_here_document" String="\s*&lt;&lt;(?=\w+|\s*[&quot;'])" beginRegion="HereDocument" />

In that context, we find this rule:
<RegExpr attribute="Keyword" context="here_document" String="(\w+)\s*;?" />
This rule matches an unquoted HERE delimiter optionally followed by some
whitespace and a semicolon. The interesting part is that it captures the
delimiter string by putting parentheses around it. It points to the
'here_document' context.

Here is the start of the 'here_document' context:
<context name="here_document" attribute="String (interpolated)" lineEndContext="#stay" dynamic="true">
        <RegExpr attribute="Keyword" context="#pop#pop" String="%1" column="0" dynamic="true" endRegion="HereDocument"/>

The interesting part is the keyword 'dynamic' in the context and the regex
rule. This means that when the rule is matched, a copy of the context is
created with "%1" in this rule replaced by the capture from the matching rule,
in this case the HERE delimiter.

If the same delimiter is used multiple times, the dynamic context created is 
reused.

Note that if you want to use a dynamic capture in a DetectChar rule, you must
just put 'char="1" dynamic="true"' in it, because the char attribute is a
single char. For rules taking a string, use "%1". Captures other than the
first may be used, just use "%N", of course.

For more examples, look in the perl.xml file (I use dynamic contexts a lot
there for the quoting mechanisms of perl), or php.xml and bash.xml, which also
support HERE strings using dynamic contexts.
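
The capture-then-substitute idea can be sketched in plain Ruby. This only mimics the principle and is not how Kate implements it; the constant `HERE_START` and the method `here_document_ranges` are made-up names for this illustration.

```ruby
# Sketch only: mimics the capture-then-match idea, not Kate's actual engine.
# The group captures the unquoted delimiter, like (\w+) in the rule above.
HERE_START = /<<(\w+)\s*;?/

# Returns [start_line, end_line] index pairs for each HERE document found.
def here_document_ranges(lines)
  ranges = []
  index = 0
  while index < lines.length
    match = HERE_START.match(lines[index])
    if match
      # The "%1" step: build the end rule from the captured delimiter,
      # anchored at the start of the line (column 0).
      end_rule = /\A#{Regexp.escape(match[1])}\b/
      stop = ((index + 1)...lines.length).find { |i| lines[i] =~ end_rule }
      if stop
        ranges << [index, stop]
        index = stop
      end
    end
    index += 1
  end
  ranges
end

source = ["print <<EOT;", "Hello", "EOT", "print <<EOT;", "again", "EOT", "exit"]
p here_document_ranges(source)
```

Because the end rule is built from whatever delimiter was captured, one pair of rules covers every delimiter a user might choose.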

In the hope that this helps,
-anders

Comment 29 Stefan Lang 2004-12-14 19:20:26 UTC
Created attachment 8661 [details]
ruby.xml 1.14, fix "Member Access" context and "\\" problem.

Fixes "Member Access" context, so that e.g.:
MyModule::MyClass.CONSTANT.some_attr.do_something() will work.

Fixed "escaping the escape character", e.g.:
"\\", %|\\| and similar work now.

Classname "POP3Session" in test file works now.
Comment 30 Stefan Lang 2004-12-14 19:30:31 UTC
Created attachment 8663 [details]
Add tests, passes "ruby -c" syntax check now
Comment 31 Dominik Haumann 2004-12-30 19:34:18 UTC
I looked through the ruby.xml file and want to give some hints:

1. ? in keyword lists

<!-- Doesn't work. Because of question mark? Included regex below. -->
<item> autoload? </item>

Fix for this: add ? to the weakDeliminator list in the <general> section:
old: <keywords casesensitive="1"/>
new: <keywords casesensitive="1" weakDeliminator="?"/>

2. In the latest test file you attached on the very bottom there is:
..'text ...'\
       unless ...
and a comment 'we would need to use the \ to continue the line'.

This is possible with the rule:
    <LineContinue attribute="attrib" context="#stay"/>

3. Here Docs. Since KDE 3.3 we have support for so called 'dynamic contexts' as Anders already said in comment #28, maybe you want to look into php.xml or so to see how it is done there.

Finally this list holds me back from committing the file, maybe you want to fix it? :) Btw., feel free to join #kate in irc.kde.org for live help.
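
For point 2, a minimal Ruby sketch of the construct (the names `quiet` and `output` are illustrative): the trailing backslash joins the next line to the statement, so the "unless" is a modifier and must not open a fold region.

```ruby
quiet = false
output = []

# The trailing backslash continues the statement on the next line,
# so "unless quiet" modifies the push and has no "end" of its own.
output << 'text that continues' \
  unless quiet

puts output
```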
Comment 32 Stefan Lang 2004-12-31 00:48:56 UTC
Created attachment 8867 [details]
ruby.xml, made the three requested fixes

All three issues fixed :)
ad 1) Fixed the way you have shown it, the same for
      methods ending in "!"
ad 2) Fixed with the help of LineContinue and a new context.
ad 3) Was fixed by Sebastian with the help of the "dynamic" attribute.
Comment 33 Sebastian Vuorinen 2005-01-02 14:08:07 UTC
On Thu, 2004-12-30 at 23:48 +0000, Stefan Lang wrote:

> All three issues fixed :)
> ad 1) Fixed the way you have shown it, the same for
>       methods ending in "!"
> ad 2) Fixed with the help of LineContinue and a new context.
> ad 3) Was fixed by Sebastian with the help of the "dynamic" attribute.

Some additional comments...
Version number is now 1.15 since up to 1.14 the file did work with
Kate 2.2. You might want to put version 1.14 on the Kate homepage as I
believe there are still many users using older Kate versions (FC 1 for
example) and despite all the problems with version 1.14, it is still the
most correct highlight for the older Kate versions.

Sebastian


Comment 34 Sebastian Vuorinen 2005-01-02 14:56:47 UTC
Created attachment 8894 [details]
A few fixes for 1.15

This contains only two small fixes to Stefan's version.

- We were not using the default styles dsError and dsAlert at all.
  I added entries for these and made the keywords defined in the 'attention'
  block highlight as Alerts.
- The ASCII code operator highlighting was overriding methods ending in ?
  (ex: foobar?(baz)). I corrected this, by requiring the ASCII code operator
  to have a space before it. Not ideal, but using word boundary doesn't quite
  work like it should either.
Comment 35 Stefan Lang 2005-01-03 00:11:28 UTC
On Sunday, 2 January 2005 14:56, Sebastian Vuorinen wrote:
> This contains only two small fixes to Stefan's version.
>
> - We were not using the default styles dsError and dsAlert at all.
>   I added entries for these and made the keywords defined in the
> 'attention' block highlight as Alerts.
> - The ASCII code operator highlighting was overriding methods ending in ?
>   (ex: foobar?(baz)). I corrected this, by requiring the ASCII code
> operator to have a space before it. Not ideal, but using word boundary
> doesn't quite work like it should either.

Is there a special reason for this change:
<!-- Generally a module or class name like "File", "MyModule_1", .. -->
- <RegExpr attribute="Constant" String= "\b[A-Z]+_*([0-9]|[a-z])[_a-zA-Z0-9]*\b" context="#stay"/>
+<RegExpr attribute="Constant" String= "\s[A-Z]+_*([0-9]|[a-z])[_a-zA-Z0-9]*\s" context="#stay"/>

Requiring whitespace around constants is not that good.
The following class/module names aren't highlighted with Kate 2.3:

class MyClass
    include MyModule
end

because newline comes immediately after the class/module name.
What did \b not work for (with regards to constants)?
May I revert that change?
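
The difference can be checked with the two patterns in plain Ruby. This assumes Ruby's regex engine treats \b and \s the same way as Kate's for this pattern, and that each line is matched without its trailing newline; the constant names are made up.

```ruby
WORD_BOUNDARY = /\b[A-Z]+_*([0-9]|[a-z])[_a-zA-Z0-9]*\b/
WHITESPACE    = /\s[A-Z]+_*([0-9]|[a-z])[_a-zA-Z0-9]*\s/

# Lines as a highlighter would see them: one at a time, without the newline.
lines = ["class MyClass", "    include MyModule", "end"]

lines.each do |line|
  # String#[] with a regex returns the matched text, or nil if nothing matches.
  puts "#{line.inspect}: \\b => #{line[WORD_BOUNDARY].inspect}, \\s => #{line[WHITESPACE].inspect}"
end
```

With \s the name at the end of a line has no trailing whitespace character left to match, which is why it stays unhighlighted.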

Cheers,
Stefan

Comment 36 Sebastian Vuorinen 2005-01-03 18:54:55 UTC
Created attachment 8903 [details]
ruby.xml 1.15

I had by mistake introduced a bug while fixing others. That's what you get for
backporting fixes in a hurry ;)

Sebastian
Comment 37 Dominik Haumann 2005-01-14 18:56:00 UTC
CVS commit by dhaumann: 

commit new ruby.xml by Stefan Lang and Sebastian Vuorinen. Thanks!
This file will be in KDE 3.4
CCBUG: 91123


  M +777 -145  ruby.xml   1.15



Comment 38 Dominik Haumann 2005-02-13 20:20:29 UTC
CVS commit by dhaumann: 

optimisations:
- use firstNonSpace="true" instead of ^\s*...
- use column="0" instead of ^...
- replace many reg exprs by Detect2Chars
This version needs KDE 3.4! Please - if you change things, do it in this version
and not from an old one from BRANCH. This was a lot of work. :)
CCBUG: 91123


  M +131 -129  ruby.xml   1.16