How can I use utf8.offset(s, n, i) to find the offset of the n-th character, starting from a specific position in the string? The i parameter is documented as:

Returns the position (in bytes) where the encoding of the n-th character of s (counting from position i) starts...

I understand that i is what I need, but I can't tell whether it is a byte position or a character position. How should I use it?

  • `i` is a byte position (starting from 1). You can use the result returned by `utf8.offset` as the value for `i` in the next invocation of `utf8.offset(s, n, i)` for the same `s` – Egor Skriptunoff Jul 09 '17 at 12:34
  • @EgorSkriptunoff Thanks! So Lua caches the string in this function, correct? –  Jul 09 '17 at 13:36
  • No, Lua does not cache the string. It just starts parsing the string from the byte position you specify in argument `i`, which is why you can use it step by step to avoid unnecessary parsing. – Egor Skriptunoff Jul 09 '17 at 14:39
  • @EgorSkriptunoff Hum, so, for example, the position 1 is equivalent to `i`? –  Jul 09 '17 at 16:15

1 Answer

All string offsets in the Lua manual are in bytes, unless the manual specifically says otherwise. So i is a byte offset, as is utf8.offset's return value.
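
A minimal sketch of what that means in practice, assuming Lua 5.3 or later (where the utf8 library is available). The sample string and the variable names are made up for illustration:

```lua
-- "héllo wörld": "é" and "ö" are encoded as 2 bytes each, so byte
-- positions and character positions drift apart after the first "é".
local s = "h\u{E9}llo w\u{F6}rld"

-- i defaults to 1: byte offset where the 3rd character starts.
-- "h" is byte 1, "é" is bytes 2-3, so the first "l" starts at byte 4.
local third = utf8.offset(s, 3)

-- i is a byte position. Byte 8 is where "w" starts; counting from there,
-- "w" is character 1, "ö" is character 2, "r" is character 3 (byte 11).
local from_w = utf8.offset(s, 3, 8)

-- A returned offset can be passed back in as i, so the scan continues
-- from that point instead of starting over at byte 1.
local seventh = utf8.offset(s, 7)          -- start of "w", the 7th character
local ninth = utf8.offset(s, 3, seventh)   -- same byte as from_w

print(third, from_w, seventh, ninth)       -- 4  11  8  11
```

Note that n is still a count of characters; only i and the return value are byte positions.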

Nicol Bolas