I’ve been writing some syntax-highlighters for Kakoune recently, and although Kakoune comes with decent reference material for how highlighters work, there isn’t really a high-level overview. So, while the thing is fresh in my mind, I thought I’d write down some notes.
The most important thing about highlighters
is that they exist in a hierarchy.
The add-highlighter command
creates a new highlighter
at a particular path,
which determines which buffers it can apply to.
Generally,
you can give a highlighter any path you want,
but the first component of the path must be one of these alternatives:
global, buffer, window, or shared.
The first three values are basically Kakoune’s standard scopes, and work the same way as they do for mappings, hooks, etc. The last one is a bit special, however.
Some file types can be very complex,
even requiring shell-scripts to generate the highlighting instructions,
and it can be quite slow to set them all up
and tear them all down
every time the filetype option changes.
Therefore, Kakoune provides the shared highlighter scope,
and the ref highlighter,
which works like a symlink.
Most syntax-highlighting plugins for Kakoune will include a fragment like this:
# Set up some highlighters in the shared scope
add-highlighter shared/MyFileSharedHighlighters fill module
# When the filetype option in a window is set to our filetype...
hook global WinSetOption filetype=MyFileType %{
    # ...link our shared highlighters into this window...
    add-highlighter window/MyFileWindowHighlighters ref MyFileSharedHighlighters
    # ...and if the filetype option changes again, unlink them.
    hook -once -always window WinSetOption filetype=.* %{
        remove-highlighter window/MyFileWindowHighlighters
    }
}
You can think of each scope as
a list of painting instructions.
To paint the text of a particular buffer,
first all the instructions in the global scope are applied,
in the order they were defined,
then those in the buffer scope,
then those in the window scope.
If different highlighters try to paint the same text,
they can overwrite each other,
and the end result depends on what faces the highlighters use.
For example, if one highlighter tries to paint “red text on default background”,
and another highlighter uses “default text on blue background”,
you can wind up with “red text on blue background”.
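As a rough sketch of how that layering plays out, the two highlighters below paint the same word from different scopes. The word foo and the names demo-fg and demo-bg are made up for illustration; the faces are written in Kakoune's foreground,background form.

```kak
# Global scope is applied first: red text, default background.
add-highlighter global/demo-fg regex \bfoo\b 0:red

# Window scope is applied afterwards: default text, blue background.
add-highlighter window/demo-bg regex \bfoo\b 0:default,blue

# In this window, "foo" should now show up as red text on blue.
# To clean up after experimenting:
remove-highlighter global/demo-fg
remove-highlighter window/demo-bg
```

Because neither face sets the attribute the other one cares about, the order of the two rules doesn't change the result here; it would matter if both set a foreground colour.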
If you invoke add-highlighter and give it a full path:
add-highlighter shared/somename ...
…then the resulting highlighter will have exactly that name,
which must be unique.
However,
if you leave off the final component of the path,
so it just ends with a slash (/):
add-highlighter shared/ ...
…then Kakoune will make up a unique name for you.
There’s no way for a script to find out what name Kakoune picked,
so if you need to refer to a highlighter later
(such as with the ref highlighter),
you’ll need to use a specific name.
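Here is a small sketch of the difference, using a made-up trailing-space highlighter; the name and the regex are illustrative only.

```kak
# Explicit name: we can refer to this highlighter again later,
# for example to remove it.
add-highlighter window/trailing-space regex \h+$ 0:Error
remove-highlighter window/trailing-space

# Trailing slash: Kakoune picks the name for us,
# so a script can't refer to this highlighter later.
add-highlighter window/ regex \h+$ 0:Error
```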
Sometimes you don’t want different highlighters to overlap.
For example,
in Python the word “if” should be highlighted as a keyword wherever it appears,
except inside a string:
if True: # "if" is a keyword
print("what if dog, but too much") # "if" is not a keyword
Kakoune’s solution is “regions”, and they work like this:
First, you create a regions highlighter,
with some specific name.
Inside it, you create any number of region highlighters,
whose start and end are defined by particular regexes.
You can also create a default-region highlighter inside it,
which is used for all the text that isn’t covered by another region.
Regions cannot overlap; if the text that starts one region appears between the start and end text of another region, it doesn’t start an overlapping instance. This means that regions are good for areas of a file that follow separate syntax rules, like string literals, or CSS and JavaScript blocks inside HTML.
Here’s a snippet that implements the example we gave above,
where “if” is a keyword, but not inside a comment or a string:
# Make a regions highlighter to contain our regions
add-highlighter shared/example1 regions
# A region from a `#` to the end of the line
# is filled with the "comment" face.
add-highlighter shared/example1/ region '#' '\n' fill comment
# A region starting and ending with a double-quote
# is filled with the "string" face.
add-highlighter shared/example1/ region '"' '"' fill string
# Everywhere else, "if" is a keyword.
add-highlighter shared/example1/ default-region regex \bif\b 0:keyword
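A region doesn't have to use fill. As a hedged sketch of the HTML case mentioned earlier, a region can hand its contents over to another set of shared highlighters through ref. The names myhtml and css are assumptions: this only does something useful if a highlighter already exists at shared/css, and the tag patterns are deliberately simplistic.

```kak
# Hypothetical filetype: a regions highlighter for HTML-like text.
add-highlighter shared/myhtml regions

# Everything from <style> to </style> follows CSS rules instead,
# assuming a shared/css highlighter has been defined elsewhere.
add-highlighter shared/myhtml/style region <style> </style> ref css
```

Note that with these patterns the tags themselves are part of the region; a real plugin would probably want tighter start and end regexes.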
Sometimes, you do want highlighters to overlap, because structures in the text actually overlap, or because it’s a lot easier than trying to make a proper parser. Kakoune’s solution here is “groups”, and they work like this:
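You create a group highlighter,
with some specific name.
Inside it, you create any number of other highlighters,
all of which apply to the same text.
This is basically the same thing that happens
when you define highlighters directly inside
the global, buffer, or window scopes.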
However,
if you put all your highlighting rules inside a single group
inside the shared scope,
you can link the entire group into the window scope
with one command,
and remove it again just as easily.
This is much more convenient
than having to link all your highlighters individually.
Here’s a snippet that creates a group that highlights some common features of programming languages:
# Create a group for all our highlighters to live in.
add-highlighter shared/example2 group
# Highlighting for numbers.
add-highlighter shared/example2/ regex (\+|-)?[0-9]+(\.[0-9]+)? 0:value
# Highlighting for booleans.
add-highlighter shared/example2/ regex \b(true|false)\b 0:value
# Highlighting for keywords.
add-highlighter shared/example2/ regex \b(if|for|while|return|continue)\b 0:keyword
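Linking that group into a window, and unlinking it again, might then look like the following. The window-side name example2 is just a convenient choice; it doesn't have to match the shared name.

```kak
# Link the whole group into the current window...
add-highlighter window/example2 ref example2

# ...and unlink it again with a single command.
remove-highlighter window/example2
```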
The real power of groups and regions comes in combination. If a region contains a group, then all the highlighters in the group will paint text inside the region, and not outside it.
For example,
many languages support backslash-escapes
and printf-style formatting instructions
inside strings.
It’d be nice to visually distinguish those,
but we don’t want to highlight them outside of strings,
any more than we want to highlight keywords like if
inside strings.
We can do that by modifying our previous region example:
# As before, make a regions highlighter to contain our regions
add-highlighter shared/example3 regions
# A region starting and ending with a double-quote
# is a group of highlighters.
add-highlighter shared/example3/dqstring region '"' '"' group
# By default, a double-quoted string is string-coloured.
add-highlighter shared/example3/dqstring/ fill string
# Some backslash-escaped characters are effectively keywords,
# but most are errors.
add-highlighter shared/example3/dqstring/ regex \
(\\[\\abefhnrtv\n])|(\\.) 1:keyword 2:Error
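The snippet above only covers backslash-escapes. A printf-style rule could be added to the same group along these lines; the regex is my own rough approximation of C-style conversions, not something taken from Kakoune's bundled scripts, and the choice of the value face is arbitrary.

```kak
# printf-style conversions such as %d, %5.2f or %s
# are highlighted as values, but only inside the string region.
# The pattern is single-quoted so that Kakoune doesn't treat
# the leading % as the start of an expansion.
add-highlighter shared/example3/dqstring/ regex '%[-+ #0]*\d*(\.\d+)?[diouxXeEfgGcs%]' 0:value
```

Since highlighters in a group are applied in the order they were defined, this rule paints over the string face set by the earlier fill.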
Now that you know the basics of groups and regions, and how they nest within the highlighter hierarchy, you’re ready to write your own Kakoune syntax highlighters! However, there are a few common conventions for highlighter plugins you might want to follow.
Here’s a basic skeleton of a highlighter plugin for a fictional “example” filetype, which you can use as a starting point:
# Setting up all the highlighters for this syntax
# could be expensive, so we'll define them inside a module
# that won't be loaded until we need it.
#
# Because this module might contain a bunch of regexes with
# unbalanced grouping symbols, we'll use some other character
# as a delimiter.
provide-module examplesyntax %&
# Define our highlighters in the shared namespace,
# so we can link them later.
add-highlighter shared/examplesyntax regions
# A region from a `#` to the end of the line is a comment.
add-highlighter shared/examplesyntax/ region '#' '\n' fill comment
# A region starting and ending with a double-quote
# is a group of highlighters.
add-highlighter shared/examplesyntax/dqstring region '"' '"' group
# By default, a double-quoted string is string-coloured.
add-highlighter shared/examplesyntax/dqstring/ fill string
# Some backslash-escaped characters are effectively keywords,
# but most are errors.
add-highlighter shared/examplesyntax/dqstring/ \
regex (\\[\\abefhnrtv\n])|(\\.) 1:keyword 2:Error
# Everything outside a region is a group of highlighters.
add-highlighter shared/examplesyntax/other default-region group
# Highlighting for numbers.
add-highlighter shared/examplesyntax/other/ \
regex (\+|-)?[0-9]+(\.[0-9]+)? 0:value
# Highlighting for booleans.
add-highlighter shared/examplesyntax/other/ \
    regex \b(true|false)\b 0:value
# Highlighting for keywords.
add-highlighter shared/examplesyntax/other/ \
    regex \b(if|for|while|return|continue)\b 0:keyword
&
# When a window's `filetype` option is set to this filetype...
hook global WinSetOption filetype=example %{
    # Ensure our module is loaded, so our highlighters are available
    require-module examplesyntax
    # Link our highlighters from the shared namespace
    # into the window scope.
    add-highlighter window/examplesyntax ref examplesyntax
    # Add a hook that will unlink our highlighters
    # if the `filetype` option changes again.
    hook -once -always window WinSetOption filetype=.* %{
        remove-highlighter window/examplesyntax
    }
}
# Lastly, when a buffer is created for a new or existing file,
# and the filename ends with `.example`...
hook global BufCreate .+\.example %{
    # ...we recognise that as our filetype,
    # so set the `filetype` option!
    set-option buffer filetype example
}