The tree search cost is probably negligible compared to the cost of physical writing to disk and page splitting and logging.
1) You should definitely insert data in bulk, rather than row by row.
2) To reduce page splitting of the point_log_idx1 index you can try to use ORDER BY
in the INSERT
statement. It still doesn't guarantee the physical order on disk, but it does guarantee the order in which point_log_id IDENTITY
would be generated, and hopefully it will hint to process source data in this order. If source data is processed in the requested order, then the b-tree structure of the point_log_idx1 index may grow without unnecessary costly page splits.
I'm using SQL Server 2008. I have a system that collects a lot of monitoring data in a central database 24/7. Originally I was inserting data as it arrived, row by row. Then I realized that each insert was a separate transaction and most of the time system spent writing into the transaction log.
Eventually I moved to inserting data in batches using stored procedure that accepts table-valued parameter. In my case a batch is few hundred to few thousand rows. In my system I keep data only for a given number of days, so I regularly delete obsolete data. To keep the system performance stable I rebuild my indexes regularly as well.
In your example, it may look like the following.
First, create a table type:
CREATE TYPE [dbo].[PointValuesTableType] AS TABLE(
point_id int,
timestamp datetime,
value int
)
Then procedure would look like this:
CREATE PROCEDURE [dbo].[InsertPointValues]
-- Add the parameters for the stored procedure here
@ParamRows dbo.PointValuesTableType READONLY
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
BEGIN TRANSACTION;
BEGIN TRY
INSERT INTO dbo.point_log
(point_id
,timestamp
,value)
SELECT
TT.point_id
,TT.timestamp
,TT.value
FROM @ParamRows AS TT
ORDER BY TT.point_id, TT.timestamp;
COMMIT TRANSACTION;
END TRY
BEGIN CATCH
ROLLBACK TRANSACTION;
END CATCH;
END
In practice you should measure for your system what is more efficient, with ORDER BY
, or without.
You really need to consider performance of the INSERT
operation as well as performance of subsequent queries.
It may be that faster inserts lead to higher fragmentation of the index, which leads to slower queries.
So, you should check the fragmentation of the index after INSERT
with ORDER BY
or without.
You can use sys.dm_db_index_physical_stats to get index stats.
Returns size and fragmentation information for the data and indexes of
the specified table or view in SQL Server.