SQL Server How to add a linked server to the same instance without performance impact

Question

in my company, we have several environments with MS SQL database servers (SQL 2008 R2, SQL 2014). For the sake of simplicity, let us consider just a TEST environment and a PROD environment and two sql servers in each. Let the servers be called srTest1, srTest2, srProd1, srProd2 and each be running a default MS SQL Server instance. We work with multiple databases, say DataDb, ReportDb, DWHDb.

We want to keep the same source code in T-SQL for both TEST and PROD, but the problem is the architecture or distribution of the above mentioned databases in each environment:

TEST:

srTest1 - DataDb
srTest2 - DWHDb, ReportDb

PROD:

srProd1 - DataDb, ReportDb
srProd2 - DWHDb

Now, say, in ReportDb, we write stored procedures with many SELECTs referencing tables and other objects in DataDb and DWHDb. In order to have source code as universal as possible, we decided to create linked servers for each database on each db server in each environment and name them with respect to the database they're created for. Therefore, there'll be these linked servers:

lnkDataDb, lnkReportDb and lnkDWHDb on srTest1,
lnkDataDb, lnkReportDb and lnkDWHDb on srTest2,
lnkDataDb, lnkReportDb and lnkDWHDb on srProd1,
lnkDataDb, lnkReportDb and lnkDWHDb on srProd2.

And we'll adjust the source in the stored procs accordingly. For instance:

Instead of

SELECT * FROM DataDb.dbo.Contact

We'll write

SELECT * FROM lnkDataDb.DataDb.dbo.Contact

The example above is reasonable for a situation where the database from which you execute the query (ReportDb) lies on a different server than that with the referenced table (DataDb). Which is the case for the TEST environment. But not so in PROD. It is performance I'm here concerned about. The SQL Server will treat that SELECT as a "remote query" no matter whether, in fact, it is a reference to a local object or not.

Now, it comes the most important part:

If you check these 3 queries for their actual execution plans, you'll see an interesting thing:

(1) SELECT * FROM DataDb.dbo.Contact
(2) SELECT * FROM srProd1.DataDb.dbo.Contact
(3) SELECT * FROM lnkDataDb.DataDb.dbo.Contact

The first two (query #1 and #2) have the same execution plan (the fastest possible) even if you use the four-part name manner of referencing the table Contact in #2. The last query has a different plan (remote query, thus slower).

The question is:

Can you somehow create a linked server to self (the same sql server instance, the default instance actually) as an "alias" to the name of the host (srProd1) in order for the SQL server to be forced to understand it as local and not issue "remote execution" plans?

Thanks a lot for any hints

Pavel

score 2 · Answer 1 · answered Nov 07 '16 at 10:39

Recently I found a workaround which seems to solve this kind of issues more efficiently and more elegantly than the solution with self-pointing linked servers.

If you work (making reports, for example) with multiple databases on multiple SQL servers and the physical distribution of the databases on the servers is a challenge since it may differ from one environment to another (e.g. TEST vs PROD), I suggest this:

Use three-part db object names whenever possible. If the objects are local, then execution plans are also local, and thus effective.

Example:

SELECT * FROM DataDb.dbo.Contact

If you happen to run the above query from within a different SQL server instance (residing on a different physical machine, for example, but this not necessarily, the other SQL server instance could be installed even on the same machine), briefly if you're about to use a four-part name:

SELECT * FROM lnkDataDb.DataDb.dbo.Contact

Then you can circumvent that using the following trick:

Let's assume lnkDataDb points to srTest2 and you're executing your queries from srTest1. Now, you'll create a "fake" database DataDb on your local server (srTest1). This fake DataDb shall contain no real db objects (no tables, no views, no stored procedures, no UDFs etc.). There shall only be synonyms defined in it. (And there also shall be the same schemas in it as those in the real DataDb on srTest2). These synonyms shall be named exactly the same way as their real db-object counterparts in DataDb on srTest2. Example:

-- To be executed on srTest1.

EXEC sp_addlinkedserver
  @server       = N'lnkDataDb',
  @srvproduct   = N'',
  @provider     = N'SQLNCLI',
  @datasrc      = N'srTest2'
;
GO

CREATE DATABASE [DataDb];
GO

USE [DataDb];
GO

CREATE SYNONYM dbo.Contact FOR lnkDataDb.DataDb.dbo.Contact;
GO

Now, if you want to SELECT rows from the table dbo.Contact residing in the database DataDb on srTest2 and you're executing your query from srTest1, you'll use a simple three-part table name:

SELECT * FROM DataDb.dbo.Contact

Of course, on srTest1, this is not a table, that's just a synonym referencing the same-named table on srTest2. However, that's the trick, you use the same query syntax as if you were executing it on srTest2 where the real db object resides.

There are disadvantages of this approach:

On the local server, at the beginning, there must not be a database with the same name as the remote one. Because you're about to create a "fake" database with that name to reflect the names of remote db objects.
You're creating one database that is almost empty, thus increasing the mess of various databases residing on your local SQL server. This might provoke reluctance of your database admin if they prefer having as few databases as possible.
If you're developing your T-SQL scripts in SQL Server Management Studio, for example, using synonyms cuts you off from the convenience of the IntelliSense feature.

Advantages outweigh the above-mentioned disadvantages, though:

Your scripts work in any environment (DEV, TEST, PROD) without the need to change any part of the source code.
If the other database you're querying data from resides on the same SQL server instance as your script, you also use the three-part name convention and your SQL server evaluates the query in execution plan as local which is OK. (This is what the original question of this post was searching to solve.)
If the other database you're querying data from resides on another SQL server instance, you still use a "local syntax manner" of a SQL query (with the synonym) which, only at runtime, evaluates in a remote execution plan. Which is also fine because the db object actually is remote.

To summarize

The query executes as local if the referenced object is local, the query executes as remote if the referenced object is remote, but the T-SQL script is always the same. You don't have to change a letter in it.

SQL Server How to add a linked server to the same instance without performance impact

1 Answers1