2

This may be a stupid question, but I am trying to understand what happens under the covers when you are retrieving data from SQL Server. Let's assume we are selecting data from an indexed view to eliminate ambiguity.

So my understanding is that the query optimizer will treat this the same as a table. OK, but what are the steps that occur between that view being called and the actual data returning to the client? Data is retrieved from the physical file structure in SQL Sever and I assume some kind of streaming occurs as it is returned to the calling client? What are the steps in-between?

Let's now compare calling this view directly on the server versus some remote client somewhere. How is the data returned to the remote client? Let's assume this is through ODBC, but does SQL Server itself return the data in the same way regardless of transport? So, would it retrieve the results and then pass these on to the client or would it stream those results back in some way over the transport mechanism?

I hope this makes sense. Thanks in advance for any enlightenment! :-)

Jon Archway
  • 4,852
  • 3
  • 33
  • 43
  • 1
    Your question has a lot of areas that it covers. Mainly the way data goes back to the client is different based on the cursor you choose to use, but it can also change based on the transport mechanism as well. I am not an expert in this area, I know both end fairly well, but not to middle. I just know it works. :) – David Parvin Nov 16 '11 at 18:31
  • It has a lot of things in the engine to deliver proper data, since index building and analyzing to loading pre-compiled view structures; your question can be have a great answer from some network enginner, it could explain how data is transfered, local and remote, protocols, layers and models. It's really interesting,but a bit complex since it mixes different areas. – Hamikzo Nov 16 '11 at 19:24
  • "selecting data from an indexed view to eliminate ambiguity" I suspect the indexed view is not doing what you think it is doing. – Hogan Nov 16 '11 at 19:33
  • Sorry, the ambiguity part was mentioned simply to give a firm example rather than just say I am returning some data. – Jon Archway Nov 17 '11 at 08:30

3 Answers3

3

When a query is launched into execution it will eventually start producing results, one row at a time. It doesn't matter if is a query from a table, from an indexed view, from a table constructor expression or whatever. Eventually is going to reach a stage in which it will have a result row ready and will need to send it to the client. The Tabular Datastream Protocol specifications describe what is exactly the format under which 'send' occurs. It doesn't matter the protocol used (sockets, net pipes, shared memory), the format is the same under all protocols. Client side drivers all implement the parsing of the TDS stream and then transform the data in the TDS format into the appropriate format of the client API.If is ODBC then the data is moved into the buffers specified in the column binding when SQLBindCol was called. An OleDB client would specify the memory area via the DBBINDING structures. Managed SqlClient apps do no specify the bindings as the manged memory management is different and shuns pointers, but instead the SqlClient itself copies the data into objects that are then returned when SqlDataReader.GetValue is called. As clients are satisfied inspecting the row values they call the API's version of NextRow (IRowset::GetNextRows, SQLFetch, SqlDataReader.Read etc) until the API return 'no more rows'.

This marshaling from the server back to the client continues until all the rows are produced and sent back. If the client delays processing for long time (is stuck in processing a value and does not call the PAI's flavour of NextRow) then eventually the transport flow control kicks in and the server would block in an ASYNC_NETWORK_IO wait type, until the client resumes the iteration of the result and unblocks the transport flow control. A somehow related discussion is Speeding up the rate that IIS/.NET/LINQ retrieves data from the Network Buffers.

Community
  • 1
  • 1
Remus Rusanu
  • 288,378
  • 40
  • 442
  • 569
0

Yes....

I'll use ODBC to illustrate. It's basically an 'interface'. You talk to SQL server through an ODBC driver, it translates ODBC to sql server and sql server to odbc, sql server doesn't do anything different.

Equally asking for your data on a client PC via TCP/IP or inside a trigger doesn't change how the query optimiser figures out what to do or how the underlying data is read from disk.

A key part of good software design is modularisation. This bit talks to the disk system, this bit optimises, this bit sends data down a socket.

Don't try and relate how these bits on that disk drive ended up as these pixels on this monitor, aside from being damn difficult, it negates programming as an approach to problem solving.

Tony Hopkinson
  • 20,172
  • 3
  • 31
  • 39
  • Thanks for the reply. I am trying to figure out the performance difference between returning data directly versus over a network. Where the latency will be greatest and how much impact that is. Let's say the view is returning thousands of records, would we start to get some kind of buffering on the server before latency on the network and therefore this would make the cost negligable? Just trying to quantify this some how. – Jon Archway Nov 16 '11 at 19:10
  • Valuable but not really anything to do with sql server, also it will be heavily dependant on how busy your network is. Server wise all you can do is play with how much data / how often and which way you go is again dependant on how the network is set up. Rule one of client server though is if you can do it on the server do so. If you are maxing it out and want to delegate teh operation toa client it's not going to be time critical anyway. – Tony Hopkinson Nov 16 '11 at 21:22
0

To cover the area of how data is retrieved, when you make a call to SQL to retrieve some data SQL will first check to see if the data pages you are looking for is in memory in order to speed up the delivery of data. If not then it will retrieve this data from the data files on disk and read them back into memory. From there your data is presented back to the client. In the case of a view this is a object that just has a underlining SQL statement that builds this view. So this statement will be executed to build the view then whatever predicates you have passed to the view will be evaluated and passed to the client.

For how the data is delivered to the client this will depend on if you server is listening over TCP/IP(this is the most common), named pipes, shared memory. In terms of ODBC, SQL will deliver the data to the ODBC driver and encapsulate the data in a TCP/IP packet and deliver it to the client on whatever port you are connected to (SQL default is 1433).

Hope this helps.

RodMeans
  • 49
  • 1
  • 3