It's perhaps easiest to explain by looking at an example. This example is adapted from https://microsoft.github.io/language-server-protocol/specifications/specification-3-17/#textDocument_semanticTokens.
Suppose your file had the contents "\n\n     foo  bars\n\n\n  bazzled\n", i.e. rendered with whitespace it would look like this:


     foo  bars


  bazzled
This has three tokens at the following positions (using 0-indexing):
foo, at line 2, char 5
bars, at line 2, char 10
bazzled, at line 5, char 2
Here's one possible valid data array that the server could respond with:
// 1st token,  2nd token,  3rd token
[  2,5,3,0,3,  0,5,4,1,0,  3,2,7,2,0 ]
This is just a compressed form of the following info:
[ { deltaLine: 2, deltaStartChar: 5, length: 3, tokenType: 0, tokenModifiers: 3 },  // first token
  { deltaLine: 0, deltaStartChar: 5, length: 4, tokenType: 1, tokenModifiers: 0 },  // second token
  { deltaLine: 3, deltaStartChar: 2, length: 7, tokenType: 2, tokenModifiers: 0 }   // third token
]
That is, each group of 5 contiguous elements corresponds to one token. These 5 elements have the following meaning, in order:
deltaLine: the line difference between this token and the previous
deltaStartChar: either the start character difference between this token and the previous (if the previous is on the same line), or just the start character of this token (if the previous is on a different line)
length: the length of this token
tokenType: index into the token type legend
tokenModifiers: bit flags for token modifiers
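The grouping described above can be sketched in TypeScript roughly as follows. This is only an illustration under stated assumptions: the RawToken interface and the groupTokens function are hypothetical names chosen for this example and are not part of the protocol or of any library.

```typescript
// Hypothetical shape for one group of 5 integers; the field names follow the list above.
interface RawToken {
  deltaLine: number;
  deltaStartChar: number;
  length: number;
  tokenType: number;
  tokenModifiers: number;
}

// Split the flat data array into one record per token (hypothetical helper).
function groupTokens(data: number[]): RawToken[] {
  if (data.length % 5 !== 0) {
    throw new Error("data length must be a multiple of 5");
  }
  const tokens: RawToken[] = [];
  for (let i = 0; i < data.length; i += 5) {
    tokens.push({
      deltaLine: data[i],
      deltaStartChar: data[i + 1],
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return tokens;
}

// The example array from above yields three records.
const grouped = groupTokens([2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]);
console.log(grouped.length); // 3
```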
So, for example, the second token has deltaLine = 0, meaning it is on the same line as the first token, and deltaStartChar = 5, meaning that it starts 5 characters after the first token starts. The first token doesn't have a token before it, so its position is instead taken to be absolute.
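To make the relative encoding concrete, here is a minimal TypeScript sketch that converts the data array into absolute positions and back. The names AbsoluteToken, decodeTokens and encodeTokens are hypothetical and exist only for this example. The encoder assumes its input is already sorted by position.

```typescript
// Hypothetical record for a token with an absolute, 0-indexed position.
interface AbsoluteToken {
  line: number;
  startChar: number;
  length: number;
  tokenType: number;
  tokenModifiers: number;
}

// Resolve the deltas into absolute positions (client-side view).
function decodeTokens(data: number[]): AbsoluteToken[] {
  const result: AbsoluteToken[] = [];
  let line = 0;
  let startChar = 0;
  for (let i = 0; i + 4 < data.length; i += 5) {
    const deltaLine = data[i];
    const deltaStartChar = data[i + 1];
    line += deltaLine;
    // Same line: offset is relative to the previous token's start.
    // New line: offset is the absolute start character.
    startChar = deltaLine === 0 ? startChar + deltaStartChar : deltaStartChar;
    result.push({
      line,
      startChar,
      length: data[i + 2],
      tokenType: data[i + 3],
      tokenModifiers: data[i + 4],
    });
  }
  return result;
}

// Inverse direction (server-side view); assumes tokens are sorted by position.
function encodeTokens(tokens: AbsoluteToken[]): number[] {
  const data: number[] = [];
  let prevLine = 0;
  let prevChar = 0;
  for (const t of tokens) {
    const deltaLine = t.line - prevLine;
    const deltaStartChar = deltaLine === 0 ? t.startChar - prevChar : t.startChar;
    data.push(deltaLine, deltaStartChar, t.length, t.tokenType, t.tokenModifiers);
    prevLine = t.line;
    prevChar = t.startChar;
  }
  return data;
}

const decoded = decodeTokens([2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]);
console.log(decoded.map((t) => `${t.line}:${t.startChar}`).join(" ")); // 2:5 2:10 5:2
```

Starting the running position at line 0, char 0 is what makes the first token's deltas behave as absolute coordinates.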
The tokenType is an index into the token types legend, which is established during the initialization handshake of the protocol. The legend for the token modifiers is also established during the initialization handshake.
Although the tokenModifiers value above is just an integer, it is interpreted as a bit vector, where each bit indicates whether the corresponding modifier is on or off. For example, the data above assigns the first token (foo) the modifiers 0b11, indicating that both the 0th and the 1st token modifier are active, and all other modifiers do not apply to this token.
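The legend lookup and the bit-vector interpretation can be sketched as follows. The legend contents here are an assumption made for illustration (they mirror the sample legend in the specification linked at the top). A real legend is whatever the server announced during initialization. The function names resolveTokenType and resolveModifiers are likewise hypothetical.

```typescript
// Assumed legend, for illustration only; a real one comes from the initialization handshake.
const legend = {
  tokenTypes: ["property", "type", "class"],
  tokenModifiers: ["private", "static"],
};

// Look up the token type name by index (undefined if the index is out of range).
function resolveTokenType(tokenType: number): string | undefined {
  return legend.tokenTypes[tokenType];
}

// Collect the names of all modifiers whose bit is set.
function resolveModifiers(tokenModifiers: number): string[] {
  const active: string[] = [];
  legend.tokenModifiers.forEach((name, bit) => {
    // Bit n set means the nth modifier in the legend applies to this token.
    if ((tokenModifiers & (1 << bit)) !== 0) {
      active.push(name);
    }
  });
  return active;
}

// First token: tokenType 0, tokenModifiers 3 (0b11), so both modifiers apply.
console.log(resolveTokenType(0), resolveModifiers(3));
// Second token: tokenType 1, tokenModifiers 0, so no modifiers apply.
console.log(resolveTokenType(1), resolveModifiers(0));
```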