1

I'm currently working on a tricky task.

Background:

  • I have a MSSQL 2012 database with multiple tables and a huge amount of rows. Since searching takes quite a while in this database, i search for ways to improve the queries. (And yes I'm using indexes, already had look to the execution plan and stuff like this :-) ) After some investigation i found Column Store Indexes in MSSQL 2012 Enterprise. This gave me a huge performance gain during search.
  • Disadvantage: When the index is enabled it is not possible to INSERT/UPDATE/DELETE data

Target:

  • I want to have the fast search available
  • It should be possible to run multiple workers at a time to INSERT/UPDATE/DELETE data (split the work among multiple workers that access the same tables) that have column store indexes (normally done over night)
  • After a worker has finished the task execution indexes should be rebuild (and the other workers should wait until then)
  • Afterwards workers should proceed and disable the indexes again where it is necessary

Current solution:

Currently there is a solution in place, but it is not working 100% because is still get messages that sometimes UPDATES and INSERTS cannot be performed because column store indexes should be disabled first (but they should be according to what i do) or that there is a deadlock during one of the calls of a stored procedure to manipulate the data.

A short overview what i did. I'm not sure if its the best way to update tables with column store indexes. Also read about partition switch but partitions are currently not used (due to the structure of the data and the search)

I have the following stored procedures in the MSSQL database.

sp_columnstore_entity_disable (Disables the index on the table)

ALTER INDEX [ColumnStoreIndex_Entity] ON dbo.[Entity] DISABLE

sp_columnstore_entity_rebuild (Rebuilds the index on the table)

ALTER INDEX [ColumnStoreIndex_Entity] ON dbo.[Entity] REBUILD

sp_entity_insert_update

-- Whenever this stored procedure is executed index should be disabled in case it active
EXEC sp_columnstore_entity_disable

-- Insert or Update the entity

Program code of the worker looks like this:

// get entities to process
for(int i = 0; i < num_entities; i++)
{
   // do some work
   // insert / update entity
}

// Rebuild column store indexes again
DBRebuildColumnStoreIndexes();

Problem:

Sometimes i receive error messages like Transaction (Process ID) was deadlocked on lock resources with another process and has been chosen as the deadlock victim and my process crashes.

Sometimes i get the error that INSERT or UPDATE is not possible because columnstore index is active.

I already thought about table locks to not run into race condition during rebuild and modification.

I'm glad about any suggestion or help to solve the problem

Anubis
  • 36
  • 4
  • You do realize on a table of a couple of Hundred gigs that you will LOCK the table completely during rebuild? Why would you want to do this? – Namphibian Dec 17 '13 at 03:43
  • Yes i realize that. If there is a better way where i can keep this fast search and do data manipulation with multiple workers at a time please let me know. :) Data manipulation is currently done overnight so it wont affect the regular working time. – Anubis Dec 17 '13 at 07:25
  • 1
    If data manipulation is all done overnight, would it be acceptable if you dropped the columnstore index before any data manipulation begins (for any worker), and you rebuild it after all data manipulation (for all workers) is done? Instead of doing it for every worker. – Blaž Dakskobler Jun 01 '14 at 20:17

1 Answers1

0

On my old team we were considering column store indexing for our archived records that we would want to perform data warehousing tasks on, however the database was a highly transactional database that required 24/7 up time. Our solution was to set up an ETL process that dumped the data into a separate data warehousing database which we enabled column store indexing on. That said, we were using 2014 and enabled clustered column store indexing which allows for insert/updates/deletes but having it in a separate table allowed for maintenance tasks for the table and index without impacting operations.

In my experience, it is best to have a dedicated database for operations and a separate one for reporting. Especially if operationally, you can archive a large number of your records that you would want to keep for reporting.

Andrew O'Brien
  • 1,793
  • 1
  • 12
  • 24