
I have two servers. One is mine and the other belongs to another company. On the second server I can't create any databases or add any functions or stored procedures, but I need to pull information back from it to join against my own database.

For example,

select l.fieldA, r.fieldB from localTBL l
left join linkedserver.remoteDB.dbo.remoteTBL r on l.ID = r.ID

or

select r.fieldA, r.fieldB from linkedserver.remoteDB.dbo.remoteTBL r
where r.ID in (select l.ID from localTBL l)

I tried this, but the performance was horrible.

Is it possible to do this with better performance?

AstroCB
3 Answers


For better performance with linked servers, use OPENQUERY. Otherwise, you bring back all the data from the remote server first and apply the WHERE clause afterwards.

In your situation, run the subquery first and return the list of values to a variable. Then use that variable in your OPENQUERY.
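
As a rough sketch of that idea (assuming integer IDs and a manageable number of rows in localTBL; the object names follow the question, and the dynamic SQL wrapper is needed because OPENQUERY only accepts a literal query string):

-- Build a comma-separated list of the local IDs (assumes integer IDs)
DECLARE @ids nvarchar(max);
SELECT @ids = STUFF((SELECT ',' + CAST(l.ID AS varchar(12))
                     FROM localTBL l
                     FOR XML PATH('')), 1, 1, '');

-- OPENQUERY only accepts a literal string, so build the statement dynamically;
-- the IN filter is then evaluated entirely on the remote server
DECLARE @sql nvarchar(max) = N'
SELECT r.ID, r.fieldB
FROM OPENQUERY(linkedserver,
    ''SELECT ID, fieldB FROM remoteDB.dbo.remoteTBL WHERE ID IN (' + @ids + ')'') AS r;';

EXEC sp_executesql @sql;

The remote rows returned this way could then be joined back to localTBL to pick up fieldA; how well this scales depends on how long the ID list gets.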

Dan Bracuk
  • I mostly agree with the sentiment of using OPENQUERY, but this is not as black and white as you make it sound. Using a distributed query (linkedserver.database.schema.table) **may** return all the data first and create a copy in tempdb if it deems it appropriate, then apply the filter. This answer makes it sound like it will always do this. – GarethD Jan 24 '14 at 14:09
  • In addition, a distributed query can take advantage of statistics on the linked server, so if a where clause means that not many rows will be returned it won't create a full copy, and it will adjust any further interactions accordingly (e.g. may deem a nested loop join more efficient than a hash match because there are only 50 rows). OPENQUERY cannot do this, and (AFAIK) always assumes 10,000 rows will be returned. Which can result in using a less efficient join back to other data. – GarethD Jan 24 '14 at 14:10

A CTE can be used to bring only the information you require across the wire and then perform the join on the calling server. Something like:

DECLARE @Id AS int;
SELECT @Id = 45;

-- The CTE restricts the remote query to the row(s) matching @Id
WITH cte (ID, fieldB)
AS
(
    SELECT ID, fieldB
    FROM linkedserver.remoteDB.dbo.remoteTBL
    WHERE ID = @Id
)
SELECT lt.fieldA, cte.fieldB
FROM localTBL lt
    INNER JOIN cte ON lt.ID = cte.ID
ORDER BY lt.ID;
Eric G.

Yep. Performance will be horrible. It's down to the network between you and the other company, and any authentication and authorisation that has to be done along the way.

This is why Linked Servers aren't used very much, even within a single company: Performance is usually bad. (I've never seen a Linked Server in a separate company and can only sympathise!)

Unless you can upgrade the network link between you, there's not much you can do when querying a linked server.

This setup sounds like a short-term solution to a problem which needed a fast fix, and which has lasted longer than expected. If you can get a business case for spending money on it, two alternatives are:

Cheapest alternative: cache the data locally. Have a background service running which pulls the latest version of the data out of the Linked Server tables into a set of tables in the local database, then run your queries against the local tables. This does depend on how changeable the remote data is, and how up-to-date your queries have to be. For example, if you're doing things like getting yesterday's sales data, you might be able to do an overnight pull. If you need more up-to-date data, maybe an hourly pull. You can get quite picky sometimes, and if the data structures support it, only pull out data which has changed since the last pull: that makes each pull much smaller and allows more frequent ones.
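
As a rough sketch of what such an incremental pull might look like (the LastModified column on the remote table and the local dbo.remoteTBL_cache table are assumptions for illustration, not something the question guarantees):

-- Hypothetical incremental pull, e.g. run from a scheduled SQL Agent job.
-- Assumes the remote table exposes a LastModified datetime column and that a
-- local cache table dbo.remoteTBL_cache with the same columns already exists.
DECLARE @lastPull datetime;
SELECT @lastPull = ISNULL(MAX(LastModified), '19000101') FROM dbo.remoteTBL_cache;

-- Stage only the changed rows locally; the filter limits what crosses the wire
SELECT r.ID, r.fieldB, r.LastModified
INTO #delta
FROM linkedserver.remoteDB.dbo.remoteTBL AS r
WHERE r.LastModified > @lastPull;

-- Replace rows that changed, then insert the delta (changed plus brand-new rows)
DELETE c
FROM dbo.remoteTBL_cache AS c
    INNER JOIN #delta AS d ON d.ID = c.ID;

INSERT INTO dbo.remoteTBL_cache (ID, fieldB, LastModified)
SELECT ID, fieldB, LastModified
FROM #delta;

DROP TABLE #delta;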

More expensive, involving work by both you and the other company: re-architect it so that the other company pushes changes to you as they happen, via a WCF service (or something similar) you expose. This can then update your local copy as the data comes in.

simon at rcl