1

I have a method that converts a file to bytes so that I can later send it over the internet. Because I plan to send large files, I send the file in chunks instead of all at once. Each chunk is an array of bytes (byte[]). I am new to all this, so I wanted to save each chunk in a list of chunks (List<byte[]>) before sending it. My class looks like this:

public class SomeClass
{

    public List<byte[]> binaryFileList;

    public void SendChunk(byte[] data, int index)
    {
        binaryFileList.Add(data);
        // later I will add code in here to do something with data
    }

    public void test(string path)
    {
        binaryFileList = new List<byte[]>();

        System.IO.FileStream stream = new System.IO.FileStream(path,
            System.IO.FileMode.Open, System.IO.FileAccess.Read);

        var MaxChunkSize = 10000;
        byte[] chunk = new byte[MaxChunkSize];
        while (true)
        {
            int index = 0;
            // There are various different ways of structuring this bit of code.
            // Fundamentally we're trying to keep reading in to our chunk until
            // either we reach the end of the stream, or we've read everything we need.
            while (index < chunk.Length)
            {
                int bytesRead = stream.Read(chunk, index, chunk.Length - index);

                if (bytesRead == 0)
                {
                    break;
                }
                index += bytesRead;
            }
            if (index != 0) // Our previous chunk may have been the last one
            {
                SendChunk(chunk, index); // index is the number of bytes in the chunk
            }
            if (index != chunk.Length) // We didn't read a full chunk: we're done
            {
                return;
            }
        }


    }
}

and when I execute:

SomeClass s = new SomeClass();
s.test(@"A:\Users\Tono\Desktop\t.iso");

The binaryFileList list gets populated with chunks of the file A:\Users\Tono\Desktop\t.iso.

The problem came when I tried to create a file from that data. While debugging, I noticed that the items in binaryFileList changed as I added more data. Let me show you what I mean:

[Screenshot: debugger view of binaryFileList after the first item is added]

Notice that in this debugging session it is the first time I add an item to binaryFileList, and you can see each byte of that item in the array.

Now I will let the method run more times, adding more items to binaryFileList.

So now binaryFileList has 278 items instead of one, as in the last picture:

[Screenshot: debugger view of binaryFileList with 278 items]

So everything looks OK so far, right? But recall that the first item of binaryFileList contained an array of bytes that were almost all 0s. Take a look at the first item of binaryFileList now:

[Screenshot: debugger view of the first item of binaryFileList]

And as I keep adding items to binaryFileList, note how the first item changes:

[Screenshot: debugger view of the first item of binaryFileList with changed contents]

In other words, binaryFileList is a list of byte[], and when I add a byte[] to binaryFileList, the other byte[] items should not change. But they do change! Why?

Tono Nam

6 Answers

2

The following line has to go inside the loop:

byte[] chunk = new byte[MaxChunkSize];

You create the chunk only once and overwrite it each time with new data. What you store in your list is just a reference to this chunk, not a copy of it.
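A minimal sketch of that aliasing, assuming a hypothetical three-byte buffer in place of the question's file stream (the names AliasDemo and FirstByteAfterOverwrite are made up for illustration). The list entry and the local variable refer to the same array, so a later write shows up through the list:

```csharp
using System;
using System.Collections.Generic;

public static class AliasDemo
{
    // Adds a buffer to a list, overwrites the buffer afterwards,
    // and returns what the list entry now contains at position 0.
    public static byte FirstByteAfterOverwrite()
    {
        var list = new List<byte[]>();
        var buffer = new byte[] { 1, 2, 3 };  // hypothetical stand-in for chunk

        list.Add(buffer);   // stores a reference to buffer, not a copy
        buffer[0] = 99;     // simulates the next stream.Read overwriting the buffer

        return list[0][0];
    }

    public static void Main()
    {
        Console.WriteLine(FirstByteAfterOverwrite());  // prints 99, not 1
    }
}
```

The same thing happens in the question: every entry of binaryFileList refers to the one chunk array, so each Read changes what all of them appear to contain.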

Achim
1

You are passing the same byte[] instance as chunk every time you call stream.Read.

Daniel A. White
1

You are reading into the same chunk each time, and adding that same chunk, with the latest values, to the list each time. To fix this, you need to create a new byte[] each time:

    while (true)
    {
        // *** need to create new array each time...
        var chunk = new byte[MaxChunkSize];

        int index = 0;
        // There are various different ways of structuring this bit of code.
        // Fundamentally we're trying to keep reading in to our chunk until
        // either we reach the end of the stream, or we've read everything we need.
        while (index < chunk.Length)
        {
            int bytesRead = stream.Read(chunk, index, chunk.Length - index);

            if (bytesRead == 0)
            {
                break;
            }
            index += bytesRead;
        }
        if (index != 0) // Our previous chunk may have been the last one
        {
            SendChunk(chunk, index); // index is the number of bytes in the chunk
        }
        if (index != chunk.Length) // We didn't read a full chunk: we're done
        {
            return;
        }
    }
James Michael Hare
  • James, I got that method from http://stackoverflow.com/questions/5659189/how-to-split-a-large-file-into-chunks-in-c/5659258#5659258, and since the file is still open, the next time I call the Read method it will read the next bytes. That method works when writing the data right away... – Tono Nam Aug 23 '11 at 19:39
  • Whoever down voted, what was the reason? Many of us came up with the same answer at the same time while the others were replying. Sucks to get a down-vote and not know why. – James Michael Hare Aug 23 '11 at 19:39
  • @Tono: Yes, but you aren't using it right away; you are storing it in a List and then looping around. The array was created once (say at theoretical location 1000) and then a reference to the byte array at 1000 is pushed into the List. When you loop around and read again, you are reading the new bytes into the byte[] at location 1000 again! This is the problem. You need a new byte[] if you are going to be storing them off somewhere (or you need to clone it before you add it to the list). Either way, you need to create an array for each one you want to store. – James Michael Hare Aug 23 '11 at 19:41
  • James, I voted down and will change it. I voted down because that method is not reading into the same chunk every time. It reads a different chunk every time. – Tono Nam Aug 23 '11 at 19:42
  • @Tono: So if you consume the contents of the byte[] immediately, it's not an issue, but once you store it and loop to get the next one, it will be overwritten. That's why you need a new byte[] each time: so that each entry in the list is a new array, not just another entry pointing to the same byte[]. – James Michael Hare Aug 23 '11 at 19:43
  • @Tono: Yeah, one of the things about S.O. is it's easy for several people to be typing the same answer at the same time in varying degrees of descriptiveness and not be aware of it until they already submitted it. – James Michael Hare Aug 23 '11 at 19:46
0

My guess would be that you are re-using the byte array that you are passing to the SendChunk method. Arrays are reference types, so you should create a new byte array for each method call.

Chris Shain
0

You keep adding your buffer variable to the list, which puts the same reference to the same object in each list position.

You need to allocate a new byte[] and copy the buffer array into that new array, or alternatively allocate a new buffer array for each chunk you read.
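A rough sketch of the first option (copying), assuming a hypothetical helper named ChunkReader.ReadChunks that is not part of the question's code. It reuses one scratch buffer for reading but stores an independent copy of each chunk, sized to the number of bytes actually read:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ChunkReader
{
    // Reads the stream in chunks of at most maxChunkSize bytes and returns
    // an independent, exactly-sized copy of each chunk.
    public static List<byte[]> ReadChunks(Stream stream, int maxChunkSize)
    {
        var chunks = new List<byte[]>();
        var buffer = new byte[maxChunkSize];  // scratch buffer, reused for every read

        while (true)
        {
            int filled = 0;
            // Keep reading until the buffer is full or the stream ends.
            while (filled < buffer.Length)
            {
                int bytesRead = stream.Read(buffer, filled, buffer.Length - filled);
                if (bytesRead == 0)
                {
                    break;
                }
                filled += bytesRead;
            }

            if (filled == 0)
            {
                return chunks;  // the previous chunk was the last one
            }

            var copy = new byte[filled];  // a new array for each stored chunk
            Array.Copy(buffer, copy, filled);
            chunks.Add(copy);

            if (filled < buffer.Length)
            {
                return chunks;  // short chunk: end of stream
            }
        }
    }
}
```

For example, calling ChunkReader.ReadChunks(new MemoryStream(new byte[25]), 10) yields three chunks of 10, 10, and 5 bytes. Sizing each copy to the bytes read also means the final chunk carries no leftover bytes from an earlier read.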

Here's an excellent two-part post on references by Eric Lippert that I suggest you read to better understand the issue:

http://blogs.msdn.com/b/ericlippert/archive/2011/03/07/references-and-pointers-part-one.aspx

http://blogs.msdn.com/b/ericlippert/archive/2011/03/10/references-and-pointers-part-two.aspx

Eric J.
0

The problem is because you allocated the chunk array only once:

byte[] chunk = new byte[MaxChunkSize];

and you are reading each new portion of the file into the same array again and again. Remember that an array is passed by reference when used as a method parameter, so the list ends up holding the same array every time. Move the allocation inside your loop and you should be fine.

michalczerwinski