4

I am new to the use of columnstore index. The new different structure of columnstored data raised a question. How we know which data from one column1 (page1) are connected to other column2 (page2).

For example if we have the following representation of a table using traditional rowstore:

row1  1   2  3 -- page1
row2  4   5  6 -- page2

And for columnstore index:

col1  col2  col3
1      2     3
4      5     6

How we know using columnstore index which data are connected to who?

alexithymia
  • 317
  • 2
  • 5
  • 18
  • I'm flagging this to get moved over to DBA stack exchange. Maybe some of the super high-profile SQL Server folks over there can teach us a thing or two. – dfundako Mar 16 '18 at 17:24

2 Answers2

3

There is no explicit connection, just as there is no explicit connection between column values in a row-based table. Even so, we can always go from one to the other by simply enumerating.

Imagine reading off the column groups in a row-based manner (first value of col1, first value of col2, first value of col3) and there's your row. When identical column values are compressed into ranges, imagine they carry numbers telling you how many times they occur -- you can still read off rows this way by simple counting, even though the process is inefficient. Asking for any particular row (SELECT * FROM T WHERE Column = uniquevalue) entails searching for that value in the columnstore, which is very fast, and then using its position to find all the other values in all the other column groups to get back a row, which generally is not, since we need to read through all the values in all the ranges in the worst case. (Of course, a traditional B-tree index can help with this, which is why you'd use those for row lookups.)

Jeroen Mostert
  • 27,176
  • 2
  • 52
  • 85
1

You are not totally getting rid of the relationship between the columns and their rows. The simplified difference is the way the table is stored. Traditional storage is physically stored in a row-wise manner while a columnstore is stored column-wise. The doc link provided here has much more info that I would prefer to not copy and paste.

From the docs:

Key terms and concepts These are key terms and concepts are associated with columnstore indexes.

columnstore A columnstore is data that is logically organized as a table with rows and columns, and physically stored in a column-wise data format.

rowstore A rowstore is data that is logically organized as a table with rows and columns, and then physically stored in a row-wise data format. This has been the traditional way to store relational table data. In SQL Server, rowstore refers to table where the underlying data storage format is a heap, a clustered index, or a memory-optimized table.

So here is an example of how the relationship is retained in TSQL. Run this against a table that has a CS index (Disclaimer: I am not a pro at CS indexing):

SELECT o.name AS table_,
i.name AS index_, 
i.type_desc AS index_type, 
p.partition_number, 
rg.row_group_id, 
cs.column_id, 
c.name AS column_


FROM sys.objects o

INNER JOIN sys.indexes i 
    ON i.object_id = o.object_id

INNER JOIN sys.partitions p
    ON p.object_id = o.object_id
    AND i.index_id = p.index_id

INNER JOIN sys.column_store_row_groups rg
    ON rg.object_id = o.object_id 
    AND i.index_id = rg.index_id

INNER JOIN sys.column_store_segments cs
    ON cs.partition_id = p.partition_id

INNER JOIN sys.columns c
    ON c.object_id = o.object_id
    AND c.column_id = cs.column_id


WHERE o.object_id = OBJECT_ID(your_table_name)
dfundako
  • 8,022
  • 3
  • 18
  • 34
  • The docs do not actually explain how a columnstore is able to reconstitute itself back into rows again. It's a useful thought exercise to spell this out. – Jeroen Mostert Mar 16 '18 at 16:46
  • @JeroenMostert Columnstore indexes are based on column segments which come from rowgroups. If each column segment is 1 column from the rowgroup, I would imagine the people as MS would have retained metadata to say which column segment came from which rowgroup and which rowgroup is aligned with which table. – dfundako Mar 16 '18 at 16:51
  • I don't think that's what the OP meant with the relationship -- at least, that's not what *I* imagined with that question. But I'll cook up my own answer. – Jeroen Mostert Mar 16 '18 at 17:10
  • Thank you for your answer! :) – alexithymia Mar 19 '18 at 11:13