1

I have a DB with 50 tables having the same structure (same column names, types) clustered Indexed on the Created Date column . Each of these tables have around ~ 100,000 rows and I need to pull all of them for some columns.

select * from customerNY

created date | Name | Age | Gender
__________________________________
25-Jan-2016  | Chris|  25 | M
27-Jan-2016  | John |  24 | M
30-Jan-2016  | June |  34 | F

select * from customerFL

created date | Name | Age | Gender
__________________________________
25-Jan-2016  | Matt |  44 | M
27-Jan-2016  | Rose |  24 | F
30-Jan-2016  | Bane |  34 | M

The above is an example of the tables in the DB. I need an SQL that runs quickly pulling all the data. Currently, I am using UNION ALL for this but it takes a lot of time for completing the report. Is there another way for this where I can pull in data without using UNION ALL such as

select Name, Age, Gender from [:customerNY:customerFL:]

Out of context: Can I pull in the table name in the result?

Thanks for any help. I've been putting my mind to this but I can't find a way to do it quicker.

  • 4
    Fix your data! You should not be storing similar data in different tables with the same format. You should have a single table. If you need performance, then SQL Server offers functionality such as indexes and partitions to speed queries. – Gordon Linoff Jun 20 '18 at 10:58
  • This is exactly the problem you will run into if you split the same entity on different tables. If you want to find customer `Matt` you need 50 different queries. Create a table `Customer` and add a column `State`, then insert all your records from all states in here and index it depending on your needs. 5 million rows isn't much and queries should be pretty fast. – EzLo Jun 20 '18 at 11:29
  • That's a good point but we have already thought of this. Everything else in our system is based on this design and would be a huge no-no from management for this approach. I just want to know if there is another possible way to pull records from multiple tables other than UNION ALL. Thanks. – Rathan Gopi Jun 20 '18 at 11:40
  • Either you union all tables or do your query 50 times saving the result on some variable or temporary table. – EzLo Jun 20 '18 at 11:51
  • That is what I am trying, both ways the time taken is very similar. I'll stick to the initial code. Thanks. – Rathan Gopi Jun 20 '18 at 12:11
  • If you really have no other options, this is the real bottleneck in your user experience and you have plenty of server memory, you could make these tables memory optimised. 5m rows of several columns might not hurt too much. https://www.mssqltips.com/sqlservertip/4612/inmemory-oltp-using-memoryoptimized-tables-to-replace-sql-server-temporary-tables/ – cloudsafe Jun 20 '18 at 12:15

2 Answers2

0

This dynamic SQL approach should meet your criteria, it selects table names from the schema and creates a SELECT statement at runtime for it to execute, and to meet the criteria of the UNION ALL each SELECT statement is given a UNION ALL then I use STUFF to remove the first one.

DECLARE @SQL AS VarChar(MAX)
SET @SQL = ''

SELECT @SQL = @SQL + 'UNION ALL SELECT Name, Age, Gender FROM ' + TABLE_SCHEMA + '.[' + TABLE_NAME + ']' + CHAR(13)
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME LIKE 'Customer%'
SELECT @SQL = STUFF(@SQL,1,10,'')

EXEC (@SQL)

However I do not recommend using this and you should do what people have suggested in the comments to restructure your data.

WhatsThePoint
  • 3,395
  • 8
  • 31
  • 53
0

Memory Optimising the test tables below gave a 7x speed increase compared to the same data in regular tables. Samples are 50 tables of 100000 rows. Please only run this on a test server as it creates filegroups/tables etc.:

    USE [master]
    GO
    ALTER DATABASE [myDB] ADD FILEGROUP [MemOptData] CONTAINS MEMORY_OPTIMIZED_DATA 
    GO
    ALTER DATABASE [myDB] ADD FILE ( NAME = N'Mem', FILENAME = N'C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\DATA' ) TO FILEGROUP [MemOptData] --Change Path for your version
    Go
    use [myDB]
    go
set nocount on
declare @loop1 int = 1
declare @loop2 int = 1
declare @NoTables int = 50
declare @noRows int = 100000
declare @sql nvarchar(max)

while @loop1 <= @NoTables
    begin
    set @sql = 'create table [MemCustomer' + cast(@loop1 as nvarchar(6)) + '] ([ID] [int] IDENTITY(1,1) NOT NULL,[Created Date] date, [Name] varchar(20), [Age] int, Gender char(1), CONSTRAINT [PK_Customer' + cast(@loop1 as nvarchar(6)) + '] PRIMARY KEY NONCLUSTERED 
(
    [ID] ASC
)) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)'
    exec(@sql)
    while @loop2 <= @noRows
        begin
        set @sql = 'insert into [MemCustomer' + cast(@loop1 as nvarchar(6)) + '] ([Created Date], [Name], [Age], [Gender]) values (DATEADD(DAY, ROUND(((20) * RAND()), 0), DATEADD(day, 10, ''2018-06-01'')), (select top 1 [name] from (values(''bill''),(''steve''),(''jack''),(''roger''),(''paul''),(''ozzy''),(''tom''),(''brian''),(''norm'')) n([name]) order by newid()), FLOOR(RAND()*(85-18+1))+18, iif(FLOOR(RAND()*(2))+1 = 1, ''M'', ''F''))'
        --print @sql
        exec(@sql)
        set @loop2 = @loop2 + 1
        end
    set @loop2 = 1
    set @loop1 = @loop1 + 1
    end
    ;with cte as (
    Select * from MemCustomer1 
    UNION
    Select * from MemCustomer2 
    UNION
    ...
    UNION
    Select * from MemCustomer50
    )
    select * from cte where [name] = 'tom' and age = 27 and gender = 'F'
cloudsafe
  • 2,444
  • 1
  • 8
  • 24
  • Ahh ! Nice to hear this improves the time consumed (by 7x !! ) but unfortunately, the company I am working in uses Express where as this requires Enterprise. Thanks anyway, someday I may need this. – Rathan Gopi Jun 22 '18 at 10:54
  • @RathanGopi Shame. It is available in 2016 on Standard Edition. – cloudsafe Jun 22 '18 at 10:58