
I'm working with a SQLite FTS3 table as explained here: https://www.sqlite.org/fts3.html

I'm interested in the field end_block, described as:

This field may contain either an integer or a text field consisting of two integers separated by a space character (unicode codepoint 0x20). The first, or only, integer is the blockid that corresponds to the interior node with the largest blockid that belongs to this segment b-tree. Or zero if the entire segment b-tree fits on the root node. If it exists, this node is always an interior node.

The second integer, if it is present, is the aggregate size of all data stored on leaf pages in bytes. If the value is negative, then the segment is the output of an unfinished incremental-merge operation, and the absolute value is current size in bytes.
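To make the quoted description concrete, here is a minimal sketch of how such a value could be split into its parts. The names `EndBlock` and `parse_end_block` are hypothetical helpers made up for this illustration; they are not part of SQLite.

```python
from typing import NamedTuple, Optional, Union


class EndBlock(NamedTuple):
    blockid: int               # largest interior-node blockid, or 0 if everything fits on the root
    leaf_bytes: Optional[int]  # aggregate leaf data size in bytes, None if absent
    incremental_merge: bool    # True if the stored size was negative


def parse_end_block(value: Union[int, str]) -> EndBlock:
    """Split an end_block value as described in the quoted documentation."""
    # The field is either a bare integer or two integers separated by one space.
    parts = str(value).split(" ")
    if len(parts) not in (1, 2):
        raise ValueError(f"unexpected end_block value: {value!r}")
    blockid = int(parts[0])
    if len(parts) == 1:
        return EndBlock(blockid, None, False)
    size = int(parts[1])
    # A negative size marks the output of an unfinished incremental merge;
    # its absolute value is the current size in bytes.
    return EndBlock(blockid, abs(size), size < 0)


print(parse_end_block("0 96"))
```

The parser only interprets the field; it says nothing about how SQLite arrives at the size.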

I'm trying to make a consistency checker to make sure some FTS3 tables haven't been modified.

I need a way to encode strings the way FTS3 does in order to get the block_number, but I haven't been able to find anything on the internet. Some examples of strings and the end_block values they produce:

Good morning! How is it going? - 0 96

Everything is okey - 0 71

Okay I will get back to you once everything is in place - 0 167

EDIT: To clarify the question, what I really need is a method that takes a string such as "Everything is okey" as input and returns the second integer of the end_block field (71).

Any idea?

Arnau EC
  • Do you actually mean "decode"? – CL. May 15 '17 at 09:10
  • Well, it is not decoding per se; I need to be able to replicate what fts3 does and get the block_size of different strings, maybe with a script or something like that. – Arnau EC May 15 '17 at 09:22
  • In the question, you say you need to get the block_number. Please clarify what you actually want to do. – CL. May 15 '17 at 09:26
  • What I need is to input some text and get the second integer of the end_block field. – Arnau EC May 15 '17 at 09:27
  • You have to check the structure of the segments (the B-trees). How exactly those look depends on how the table data was inserted; there is no unique correct value that depends on the text alone. – CL. May 15 '17 at 09:32
  • But there has to be some way to replicate the procedure and get the end_block parameters, right? I could create a database and import things there... but that would be the worst way. – Arnau EC May 15 '17 at 09:35
  • When you insert the same data in a different order, or with different DELETEs between them, you get different B-trees for the *same* data. You cannot replicate the same index; you can only check that the values are consistent. – CL. May 15 '17 at 09:37
  • Okay, that is exactly what I need, how could I check that the values are consistent? Is there any easy way to do it? Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/144234/discussion-between-arnau-ec-and-cl). – Arnau EC May 15 '17 at 09:38
  • All the data structures are described in the documentation you linked to. – CL. May 15 '17 at 10:14
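
One way to act on the "check that the values are consistent" suggestion is the `integrity-check` special command described on the linked FTS3 page. The following is a minimal sketch using Python's `sqlite3` module, assuming the underlying SQLite library was built with FTS3 enabled. The table name `msgs`, the column `body`, and the helper `check_fts_integrity` are made up for illustration.

```python
import sqlite3


def check_fts_integrity(conn: sqlite3.Connection, table: str) -> bool:
    """Return True if the full-text index agrees with the stored content."""
    try:
        # Special commands are sent by "inserting" into the hidden column
        # that has the same name as the table. The table name is
        # interpolated directly, so it must come from a trusted source.
        conn.execute(f"INSERT INTO {table}({table}) VALUES('integrity-check')")
    except sqlite3.DatabaseError:
        # A mismatch is reported as an error; note that this also catches
        # unrelated database errors such as a missing table.
        return False
    return True


conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE msgs USING fts3(body)")
conn.executemany(
    "INSERT INTO msgs(body) VALUES (?)",
    [
        ("Good morning! How is it going?",),
        ("Everything is okey",),
        ("Okay I will get back to you once everything is in place",),
    ],
)
print(check_fts_integrity(conn, "msgs"))
```

This only verifies that the index matches the content stored in the table. As noted in the comments above, it cannot confirm that end_block has one particular value for a given text, because that value depends on how the data was inserted.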

0 Answers