Screwtape's Notepad

Intro to Kakoune highlighters

I’ve been writing some syntax-highlighters for Kakoune recently, and although Kakoune comes with decent reference material for how highlighters work, there isn’t really a high-level overview. So, while the thing is fresh in my mind, I thought I’d write down some notes.

The highlighter hierarchy

The most important thing about highlighters is that they exist in a hierarchy. The add-highlighter command creates a new highlighter at a particular path, which determines which buffers it can apply to. Generally, you can give a highlighter any path you want, but the first component of the path must be one of these alternatives:

global
contains highlighters that apply everywhere
buffer
contains highlighters that apply to this buffer, in every window that displays it
window
contains highlighters that apply only to this buffer when it is displayed in the current window
shared
contains highlighters that are defined, but not being used.
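
As a rough illustration of the first three scopes, here are three of Kakoune’s built-in highlighters, each added at a different level. Which highlighter goes in which scope is an arbitrary choice made for this sketch:

# Show line numbers everywhere.
add-highlighter global/ number-lines

# Soft-wrap long lines in this buffer, whichever window displays it.
add-highlighter buffer/ wrap

# Highlight matching brackets in the current window only.
add-highlighter window/ show-matching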

The first three values are basically Kakoune’s standard scopes, and work the same way as they do for mappings, hooks, etc. The last one is a bit special, however.

Some file types can be very complex, even requiring shell-scripts to generate the highlighting instructions, and it can be quite slow to set them all up and tear them all down every time the filetype option changes. Therefore, Kakoune provides the shared highlighter scope, and the ref highlighter, which works like a symlink.

Most syntax-highlighting plugins for Kakoune will include a fragment like this:

# Set up some highlighters in the shared scope
add-highlighter shared/MyFileSharedHighlighters fill module

# When the filetype option in a window is set to our filetype...
hook global WinSetOption filetype=MyFileType %{
    # ...link our shared highlighters into this window...
    add-highlighter window/MyFileWindowHighlighters ref MyFileSharedHighlighters

    # ...and if the filetype option changes again, unlink them.
    hook -once -always window WinSetOption filetype=.* %{
        remove-highlighter window/MyFileWindowHighlighters
    }
}

You can think of each scope as a list of painting instructions. To paint the text of a particular buffer, first all the instructions in the global scope are applied, in the order they were defined, then those in the buffer scope, then those in the window scope. If different highlighters try to paint the same text, they can overwrite each other, and the end result depends on what faces the highlighters use. For example, if one highlighter tries to paint “red text on default background”, and another highlighter uses “default text on blue background”, you can wind up with “red text on blue background”.
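
Here is a small sketch of that last case. The names demo-fg and demo-bg are made up for illustration; both highlighters match the same text, and the faces should combine as described:

# Paint "TODO" with red text on the default background.
add-highlighter window/demo-fg regex TODO 0:red

# Defined later, so applied later: default text colour, blue background.
add-highlighter window/demo-bg regex TODO 0:default,blue

# "TODO" should now show up as red text on a blue background.

# Clean up the demo highlighters.
remove-highlighter window/demo-fg
remove-highlighter window/demo-bg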

If you invoke add-highlighter and give it a full path:

add-highlighter shared/somename ...

…then the resulting highlighter will have exactly that name, which must be unique. However, if you leave off the final component of the path, so it just ends with a slash (/):

add-highlighter shared/ ...

…then Kakoune will make up a unique name for you. There’s no way for a script to find out what name Kakoune picked, so if you need to refer to a highlighter later (such as with the ref highlighter), you’ll need to use a specific name.
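
For instance, a highlighter that has to be removed later is easier to manage with an explicit name. In this sketch, trailing-space is just an illustrative name:

# Named explicitly, so it can be referred to later.
# The quotes are only there to keep the regex in one argument.
add-highlighter window/trailing-space regex '\h+$' 0:Error

# The same path removes it again.
remove-highlighter window/trailing-space

# Left unnamed: fine for a highlighter that nothing will refer to again.
add-highlighter window/ show-matching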

Alternative syntaxes: Regions

Sometimes you don’t want different highlighters to overlap. For example, in Python, “if” should be highlighted as a keyword wherever it appears in code, but not inside a string:

if True:  # "if" is a keyword
    print("what if dog, but too much")  # "if" is not a keyword

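Kakoune’s solution is “regions”, and they work like this: a regions highlighter is a container for individual region highlighters. Each region has a pattern that starts it, a pattern that ends it, and a highlighter that is applied to the text in between. An optional default-region supplies the highlighter for any text that is not inside another region.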

Regions cannot overlap; if the text that starts one region appears between the start and end text of another region, it doesn’t start an overlapping instance. This means that regions are good for areas of a file that follow separate syntax rules, like string literals, or CSS and JavaScript blocks inside HTML.

Here’s a snippet that implements the example we gave above, where “if” is a keyword, but not inside a comment or a string:

# Make a regions highlighter to contain our regions
add-highlighter shared/example1 regions

# A region from a `#` to the end of the line
# is filled with the "comment" face.
add-highlighter shared/example1/ region '#' '\n' fill comment

# A region starting and ending with a double-quote
# is filled with the "string" face.
add-highlighter shared/example1/ region '"' '"' fill string

# Everywhere else, "if" is a keyword.
# The \b word boundaries keep it from matching inside words like "life".
add-highlighter shared/example1/ default-region regex \bif\b 0:keyword
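
To see these regions in action, one option is to link them into the current window with the ref highlighter, as in the earlier plugin fragment. The path window/example1 is only an illustrative choice:

# Link the shared regions into the current window.
add-highlighter window/example1 ref example1

# Try it on a line like this one:
#
#     print("# not a comment")  # if this were code
#
# The `#` inside the string should not start a comment region,
# and the "if" inside the comment should not be painted as a keyword.

# Unlink the regions again when done.
remove-highlighter window/example1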

Combined syntaxes: Groups

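Sometimes, you do want highlighters to overlap, because structures in the text actually overlap, or because it’s a lot easier than trying to make a proper parser. Kakoune’s solution here is “groups”, and they work like this: a group highlighter is a container for other highlighters, and every highlighter in the group is applied to the same text, in the order they were added to the group.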

This is basically the same thing that happens when you define highlighters directly inside the global, buffer, or window scopes. However, if you put all your highlighting rules inside a single group inside the shared scope, you can link the entire group into the window scope with one command, and remove it again just as easily. This is much more convenient than having to link all your highlighters individually.

Here’s a snippet that creates a group that highlights some common features of programming languages:

# Create a group for all our highlighters to live in.
add-highlighter shared/example2 group

# Highlighting for numbers.
add-highlighter shared/example2/ regex (\+|-)?[0-9]+(\.[0-9]+)? 0:value

# Highlighting for booleans.
# The \b word boundaries stop matches inside longer words.
add-highlighter shared/example2/ regex \b(true|false)\b 0:value

# Highlighting for keywords.
add-highlighter shared/example2/ regex \b(if|for|while|return|continue)\b 0:keyword
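
Because every highlighter in a group paints the same text, rules can deliberately overlap. As a small sketch (the TODO rule is a made-up addition, not part of the group above), this makes the rest of a line bold after a TODO marker, while leaving the colours from the other rules in place:

# The "+b" face only adds the bold attribute; it sets no colours,
# so keyword and value colours painted by the other rules show through.
add-highlighter shared/example2/ regex \bTODO\b.* 0:+b

On a line such as “TODO: return 42”, the whole span should come out bold, with “return” still keyword-coloured and “42” still value-coloured.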

Groups in regions

The real power of groups and regions comes in combination. If a region contains a group, then all the highlighters in the group will paint text inside the region, and not outside it.

For example, many languages support backslash-escapes and printf-style formatting instructions inside strings. It’d be nice to visually distinguish those, but we don’t want to highlight them outside of strings, any more than we want to highlight keywords like “if” inside strings. We can do that by modifying our previous region example:

# As before, make a regions highlighter to contain our regions
add-highlighter shared/example3 regions

# A region starting and ending with a double-quote
# is a group of highlighters.
add-highlighter shared/example3/dqstring region '"' '"' group

# By default, a double-quoted string is string-coloured.
add-highlighter shared/example3/dqstring/ fill string

# Some backslash-escaped characters are effectively keywords,
# but most are errors.
add-highlighter shared/example3/dqstring/ regex \
    (\\[\\abefhnrtv\n])|(\\.) 1:keyword 2:Error
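
The example above covers backslash-escapes, but not the printf-style instructions mentioned earlier. One possible extra rule for those is sketched below. The pattern is a deliberately simplified assumption for illustration, not a complete printf grammar:

# Paint simple printf-style conversions such as %d, %5.2f or %%
# with the "value" face. The single quotes keep Kakoune from
# treating the % as the start of an expansion.
add-highlighter shared/example3/dqstring/ regex '%[-+ #0]*[0-9]*(\.[0-9]+)?[diouxXeEfgGcs%]' 0:value

Since this rule is added after the fill highlighter, it paints on top of the string colour rather than being covered by it.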

Bonus: a highlighter skeleton

Now that you know the basics of groups and regions, and how they nest within the highlighter hierarchy, you’re ready to write your own Kakoune syntax highlighters! However, there are a few common conventions for highlighter plugins you might want to follow.

Here’s a basic skeleton of a highlighter plugin for a fictional “example” filetype, which you can use as a starting point:

# Setting up all the highlighters for this syntax
# could be expensive, so we'll define them inside a module
# that won't be loaded until we need it.
#
# Because this module might contain a bunch of regexes with
# unbalanced grouping symbols, we'll use some other character
# as a delimiter.
provide-module examplesyntax %&
    # Define our highlighters in the shared namespace,
    # so we can link them later.
    add-highlighter shared/examplesyntax regions

    # A region from a `#` to the end of the line is a comment.
    add-highlighter shared/examplesyntax/ region '#' '\n' fill comment

    # A region starting and ending with a double-quote
    # is a group of highlighters.
    add-highlighter shared/examplesyntax/dqstring region '"' '"' group

    # By default, a double-quoted string is string-coloured.
    add-highlighter shared/examplesyntax/dqstring/ fill string

    # Some backslash-escaped characters are effectively keywords,
    # but most are errors.
    add-highlighter shared/examplesyntax/dqstring/ \
        regex (\\[\\abefhnrtv\n])|(\\.) 1:keyword 2:Error

    # Everything outside a region is a group of highlighters.
    add-highlighter shared/examplesyntax/other default-region group

    # Highlighting for numbers.
    add-highlighter shared/examplesyntax/other/ \
        regex (\+|-)?[0-9]+(\.[0-9]+)? 0:value

    # Highlighting for booleans.
    # The \b word boundaries stop matches inside longer words.
    add-highlighter shared/examplesyntax/other/ \
        regex \b(true|false)\b 0:value

    # Highlighting for keywords.
    add-highlighter shared/examplesyntax/other/ \
        regex \b(if|for|while|return|continue)\b 0:keyword
&

# When a window's `filetype` option is set to this filetype...
hook global WinSetOption filetype=example %{
    # Ensure our module is loaded, so our highlighters are available
    require-module examplesyntax

    # Link our highlighters from the shared namespace
    # into the window scope.
    add-highlighter window/examplesyntax ref examplesyntax

    # Add a hook that will unlink our highlighters
    # if the `filetype` option changes again.
    hook -once -always window WinSetOption filetype=.* %{
        remove-highlighter window/examplesyntax
    }
}

# Lastly, when a buffer is created for a new or existing file,
# and the filename ends with `.example`...
hook global BufCreate .+\.example %{
    # ...we recognise that as our filetype,
    # so set the `filetype` option!
    set-option buffer filetype example
}