I am currently using Sphinx for indexing a MySQL query with 20+ million records.
I am using a delta index to update the main index and add all new records.
Unfortunately allot of changes to the tables are deleted.
I understand that I can use sql_query_killlist to get all document ID's that need to be deleted or updated. Unfortunately I don't understand how this actually works and the documentation from Sphinx does not have a good enough example for me to understand.
If I use the following example, how could I implement the killlist?
in MySQL
CREATE TABLE sph_counter
(
counter_id INTEGER PRIMARY KEY NOT NULL,
max_doc_id INTEGER NOT NULL
);
in sphinx.conf
source main
{
# ...
sql_query_pre = SET NAMES utf8
sql_query_pre = REPLACE INTO sph_counter SELECT 1, MAX(id) FROM documents
sql_query = SELECT id, title, body FROM documents \
WHERE id<=( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}
source delta : main
{
sql_query_pre = SET NAMES utf8
sql_query = SELECT id, title, body FROM documents \
WHERE id>( SELECT max_doc_id FROM sph_counter WHERE counter_id=1 )
}
index main
{
source = main
path = /path/to/main
# ... all the other settings
}
note how all other settings are copied from main, but source and path are overridden (they MUST be) index delta : main
{
source = delta
path = /path/to/delta
}