31

I have a SqlDataReader and need to read a varbinary(max) column from it using the SqlDataReader.GetBytes() method. This method populates a byte array and therefore needs to know what length of data to read.

This is where I get confused. Clearly I want to read all the data that has been returned from the database in this row/column, so what 'length' parameter should I pass?

As far as I can see, the SqlDataReader doesn't provide any method to discover what length of data is available, so this method seems fairly awkward to me.

I'm tempted to just pass int.MaxValue here and forget about the issue but something about this doesn't sit right with me.

I am aware that I can instead call

byte[] value = (byte[])dataReader["columnName"];

…and this seems to completely take care of the length issue internally. However, I am working with a set of complicated code generation templates that have been built around the SqlDataReader.GetXXXX() methods, so I am tied into using GetBytes and need to understand its proper usage.

marc_s
Martyn
  • SQL has a function to ask for the length of a varbinary - you must put it into the SQL: http://msdn.microsoft.com/en-us/library/ms173486.aspx - DATALENGTH. `SELECT Field, DATALENGTH(Field) FROM Table` and you can see the length. – TomTom Jun 21 '12 at 09:37
  • Thanks, I did consider using DATALENGTH in the SQL but it isn't really an acceptable solution. As you can probably guess this all forms part of a data access layer, so forcing conventions such as this in hand written stored procedures to support basic functionality isn't great. – Martyn Jun 21 '12 at 09:46
  • @Martyn while I'm here... there are libraries *dedicated* to making sure you don't need to write tedious ADO.NET code... "dapper" is (in my heavily biased opinion) excellent for hiding away these concerns, while remaining very light-weight and not being intrusive. – Marc Gravell Jun 21 '12 at 09:51
  • Forcing conventions? A DAL that is not optimally using the SQL layer is one thing - broken. This is not "convention", this is how the data storage layer is designed. – TomTom Jun 21 '12 at 10:08
  • @Marc Gravell I'm maintaining a legacy product here, hence the mention of code templates that I can't really deviate from too much. How I would love to be using an off-the-shelf DAL or ORM! But that's a discussion that you'd need to have with my boss :) – Martyn Jun 21 '12 at 10:12
  • @TomTom Yes.. If someone is writing a stored procedure then they would have to write it in a certain way and include the DATALENGTH for it to work properly. This is why that idea was ruled out a while ago. – Martyn Jun 21 '12 at 10:14
  • @Martyn but then you basically tell people not to use SQL Server properly. See, this is the way SQL Server works, like it or not - you either do that, or have limitations, or maintain your own length field somewhere. Programming an SP in SQL Server while trying not to use the SQL Server API = not smart, sorry. – TomTom Jun 23 '12 at 18:33
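For reference, the DATALENGTH approach discussed in the comments might look roughly like this sketch (the table, column, and parameter names are hypothetical; `conn` is an open SqlConnection). Note that for varbinary(max), DATALENGTH returns bigint:

```csharp
using System.Data.SqlClient;

// Hypothetical schema: Documents(Id int, Data varbinary(max)).
// DATALENGTH gives the byte length of the column up front, so the
// buffer can be sized exactly before calling GetBytes.
using (var cmd = new SqlCommand(
    "SELECT Data, DATALENGTH(Data) FROM Documents WHERE Id = @id", conn))
{
    cmd.Parameters.AddWithValue("@id", id);
    using (var reader = cmd.ExecuteReader())
    {
        if (reader.Read())
        {
            long length = reader.GetInt64(1); // bigint for (max) types
            byte[] data = new byte[length];
            reader.GetBytes(0, 0, data, 0, (int)length);
        }
    }
}
```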

2 Answers

47

When dealing with varbinary(max), there are two scenarios:

  • the length of the data is moderate
  • the length of the data is big

GetBytes() is intended for the second scenario, where you use CommandBehavior.SequentialAccess to ensure that you are streaming the data rather than buffering it. In particular, with this usage you would typically be writing into a stream, in a loop. For example:

// moderately sized buffer; note: 8040 bytes is one SQL Server page
byte[] buffer = new byte[8040];
long offset = 0;
long read; // GetBytes returns a long
while ((read = reader.GetBytes(col, offset, buffer, 0, buffer.Length)) > 0) {
    offset += read;
    destination.Write(buffer, 0, (int)read); // push downstream
}
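A minimal sketch of the setup this loop assumes (connection string, query, and file path are placeholders) could look like the following; the key detail is passing CommandBehavior.SequentialAccess to ExecuteReader so the data is streamed rather than buffered in the row:

```csharp
using System.Data;
using System.Data.SqlClient;
using System.IO;

using (var conn = new SqlConnection(connectionString))
using (var cmd = new SqlCommand("SELECT Payload FROM Blobs WHERE Id = @id", conn))
{
    cmd.Parameters.AddWithValue("@id", id);
    conn.Open();
    // SequentialAccess: columns/bytes must be read strictly forward,
    // but the BLOB is never fully materialized in memory.
    using (var reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess))
    using (var destination = File.Create(path))
    {
        if (reader.Read())
        {
            byte[] buffer = new byte[8040];
            long offset = 0;
            long read;
            while ((read = reader.GetBytes(0, offset, buffer, 0, buffer.Length)) > 0)
            {
                offset += read;
                destination.Write(buffer, 0, (int)read);
            }
        }
    }
}
```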

However! If you are using moderately sized data, then your original code:

byte[] data = (byte[])reader[col];

is fine! There is nothing wrong with this approach, and in fact the Get* API is broken in a few cases - GetChar() being a notable example (hint: it doesn't work).

It doesn't matter that you have existing code that uses Get* - in this case, the cast approach is perfectly appropriate.
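If you are nevertheless stuck with GetBytes but want the whole value as a byte[], a sketch of the buffering route (reusing the reader, column index, and buffer from the streaming example above) would be to collect the chunks in a MemoryStream, at the cost of an extra byte[] allocation from ToArray():

```csharp
using System.IO;

byte[] buffer = new byte[8040];
long offset = 0;
long read;
using (var ms = new MemoryStream())
{
    // Loop until GetBytes reports no more data, accumulating each chunk.
    while ((read = reader.GetBytes(col, offset, buffer, 0, buffer.Length)) > 0)
    {
        offset += read;
        ms.Write(buffer, 0, (int)read);
    }
    byte[] data = ms.ToArray(); // extra allocation vs. the simple cast
}
```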

Marc Gravell
  • 1
    Thanks, this is a great answer! In my particular case, the entire data will always be read into a byte array and not buffered or pushed into another stream. In this case, is there any performance advantage to writing the read loop manually as opposed to just calling GetBytes once and passing int.MaxValue as the length? – Martyn Jun 21 '12 at 10:00
  • 2
    @Martyn `GetBytes` is a streaming API; you can **never** pass a big value into such an API and assume it read everything - you **always** need to loop until you get back a non-positive value. My main guess here (unchecked) is that *internally* it already knows the length (at least, in the buffered usage), and is going to allocate a right-sized array and just block-copy the data. If you do it via `GetBytes()` you'll need to write it to a `MemoryStream` and then call `ms.ToArray()` at the end. The problem there is: additional (unnecessary) `byte[]` allocations. Up to you whether that is a problem. – Marc Gravell Jun 21 '12 at 10:34
  • Yep, of course you're right. The looping is required because the data might be longer than int.MaxValue. Thanks for your help - accepted. – Martyn Jun 21 '12 at 10:43
  • 4
    @Martyn no, you misunderstand me. The looping is required because it is perfectly legal to only return 10 bytes, even if the underlying data has 100,000 and the buffer has room for 1000. You have to loop because **the only way to know when you've reached the end is to keep reading** – Marc Gravell Jun 21 '12 at 17:02
3

You could probably do this - found on MSDN. It could serve your purpose:

// myReader is an executing SqlDataReader; bw is a BinaryWriter
// over the destination stream.
int bufferSize = 100;                  // chunk size per GetBytes call
byte[] outbyte = new byte[bufferSize]; // buffer that GetBytes fills
long retval;                           // number of bytes returned
long startIndex = 0;                   // current position within the BLOB

// Read the bytes into outbyte[] and retain the number of bytes returned.
retval = myReader.GetBytes(1, startIndex, outbyte, 0, bufferSize);

// Continue reading and writing while there are bytes beyond the size of the buffer.
while (retval == bufferSize)
{
    bw.Write(outbyte);
    bw.Flush();

    // Reposition the start index to the end of the last buffer and fill the buffer.
    startIndex += bufferSize;
    retval = myReader.GetBytes(1, startIndex, outbyte, 0, bufferSize);
}

// Write the remaining buffer. (The MSDN sample writes retval - 1 bytes here,
// an off-by-one bug that drops the final byte; retval is correct.)
bw.Write(outbyte, 0, (int)retval);
bw.Flush();

http://msdn.microsoft.com/en-us/library/87z0hy49%28v=vs.71%29.aspx#Y132

Or this one

int ndx = rdr.GetOrdinal("<ColumnName>");
if (!rdr.IsDBNull(ndx))
{
    // Passing a null buffer makes GetBytes return the length of the field.
    long size = rdr.GetBytes(ndx, 0, null, 0, 0);
    byte[] values = new byte[size];

    int bufferSize = 1024;
    long bytesRead = 0;
    int curPos = 0;

    while (bytesRead < size)
    {
        // Never request more than remains, so we don't run past the array.
        int chunk = (int)Math.Min(bufferSize, size - bytesRead);
        long read = rdr.GetBytes(ndx, curPos, values, curPos, chunk);
        bytesRead += read;
        curPos += (int)read; // advance by what was actually read, not by bufferSize
    }
}
Anand
  • 1
    Let me guess - you took that straight from the documentation of the method? RTFM question ;) – TomTom Jun 21 '12 at 09:37
  • Yes :) Trust MSDN to the core!! – Anand Jun 21 '12 at 09:40
  • 1
    Thanks, I've seen this on MSDN previously and although it demonstrates the obvious way to retrieve data using this method, it doesn't provide any background information or any hints as to what the correct usage is for particular scenarios. – Martyn Jun 21 '12 at 10:04
  • Thanks for the 2nd example. Used this to pull a GIF blob out of SQLite when reader.GetValue was returning a string. – spedsal Sep 19 '13 at 20:11