I have a StringBuilder of length 1,539,121,968. Calling StringBuilder.ToString() on it fails with an OutOfMemoryException. I tried creating a char array instead, but I was not allowed to create such a big array.

I need to store it as a byte array in UTF-8 format. Is that possible?

serializer
  • Is there any chance you could avoid building such a huge StringBuilder to start with? It's feasible to stream this, but it's a fair amount of work... it may well be better to rethink your approach. – Jon Skeet Oct 26 '16 at 15:53
  • The maximum size of any single object in .NET is 2GB. However, take a look at [this](https://msdn.microsoft.com/en-us/library/hh285054(v=vs.110).aspx). – itsme86 Oct 26 '16 at 15:54
  • Just *don't* create such a StringBuilder, that's not its job. A StringBuilder is meant to make building a string easier, not to act as a buffer. Use a `StreamWriter` with UTF8 encoding instead that outputs whatever you want to the destination. Use a large buffer if you find that there are too many writes to the disk. The default is 8KB I think – Panagiotis Kanavos Oct 26 '16 at 15:56
  • I'm curious what it is you are doing? If you are really going from an arbitrary byte array => string then I'm pretty sure you want `Base64String`, not a "UTF8" `string`, as UTF8 cannot represent arbitrary bytes and you will lose information if it's not a byte array that "actually" has ascii characters in it. – Quantic Oct 26 '16 at 16:04
  • @Quantic the OP may just want to write text to a UTF8 file – Panagiotis Kanavos Oct 26 '16 at 16:07
  • @PanagiotisKanavos You are probably right, he already has the `StringBuilder` created so presumably he's just going "text in c# => text in file". He probably meant to say "char array" not "byte array". – Quantic Oct 26 '16 at 16:09
  • Finally I want to store it in bytearray as described. But maybe better to use StreamWriter to a MemoryStream instead? – serializer Oct 26 '16 at 16:10

1 Answer

I'd suggest looking at the documentation for streams, as this might help.
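As a rough illustration of the stream route suggested in the comments, here is a minimal sketch that copies the StringBuilder to a stream in small char chunks through a `StreamWriter` with UTF-8 encoding, so `ToString()` is never called. The class name `StringBuilderStreaming`, the method `WriteUtf8` and the chunk size are my own assumptions for illustration, not something from your code.

```csharp
using System;
using System.IO;
using System.Text;

public static class StringBuilderStreaming
{
    // Hypothetical helper: writes the builder's contents to 'destination' as UTF-8
    // without ever materialising the whole text as one string or one char[].
    public static void WriteUtf8(StringBuilder source, Stream destination, int chunkChars = 8192)
    {
        var buffer = new char[chunkChars];

        // UTF8Encoding(false) means no byte order mark is emitted;
        // leaveOpen: true keeps the destination stream usable by the caller afterwards.
        using (var writer = new StreamWriter(destination, new UTF8Encoding(false), 4096, leaveOpen: true))
        {
            for (int offset = 0; offset < source.Length; offset += chunkChars)
            {
                int count = Math.Min(chunkChars, source.Length - offset);

                // Copy one window of chars out of the builder, then hand it to the writer.
                // The writer's encoder keeps state, so a surrogate pair that straddles
                // two chunks is still encoded correctly.
                source.CopyTo(offset, buffer, 0, count);
                writer.Write(buffer, 0, count);
            }
        }
    }
}
```

You could pass a `FileStream` as the destination, or a `MemoryStream` if you really want the bytes in memory. Bear in mind that a `MemoryStream` is backed by a single byte array, so it only works if the encoded text stays under the 2GB object limit mentioned in the comments.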

Another way to approach it would be to split it up. As for your last comment, stating that you wish to store it as a byte array in UTF-8, you'd need to work from a char[], as otherwise you'd lose your encoding. I'd recommend splitting it into many smaller strings (or char[]s) stored in separate objects that can easily be reconstructed. Something like this might suffice, create many StringSlices:

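using System.Collections.Generic;
using System.Linq;

public class StringSlice
{
     // The text of this slice and its position in the original sequence.
     public string Str { get; }
     public int Index { get; }

     public StringSlice(string str, int index)
     {
          this.Str = str;
          this.Index = index;
     }

     // Sorts the slices by Index and returns their strings in order.
     // The result stays a list of pieces: joining them into a single string would hit
     // the same 2GB object limit, so buffer on disk if you need one contiguous output.
     public static List<string> ReconstructString(IEnumerable<StringSlice> parts)
     {
          return parts.OrderBy(p => p.Index).Select(p => p.Str).ToList();
     }
}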

In essence, what you would be doing here is similar to the way internet packets are split and reconstructed. I'm not entirely sure I've answered your question, but hopefully this goes some way towards helping.
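To make the splitting idea a bit more concrete for the UTF-8 case, here is a minimal sketch, assuming you are happy to end up with a list of smaller byte arrays rather than one giant array. `Utf8Chunker` and `EncodeInChunks` are hypothetical names of mine, and the position of each piece in the list plays the role of the Index above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class Utf8Chunker
{
    // Hypothetical helper: encodes the builder piece by piece, one byte[] per chunk.
    // Concatenating the pieces in list order gives the UTF-8 encoding of the whole text.
    public static List<byte[]> EncodeInChunks(StringBuilder source, int chunkChars = 1 << 20)
    {
        var pieces = new List<byte[]>();
        var chars = new char[chunkChars];

        // A stateful encoder remembers a trailing high surrogate between calls,
        // so a surrogate pair split across two chunks is not corrupted.
        Encoder encoder = Encoding.UTF8.GetEncoder();

        for (int offset = 0; offset < source.Length; offset += chunkChars)
        {
            int count = Math.Min(chunkChars, source.Length - offset);
            bool last = offset + count >= source.Length; // flush encoder state on the final chunk

            source.CopyTo(offset, chars, 0, count);

            // Size the output exactly, then encode this chunk into its own array.
            int byteCount = encoder.GetByteCount(chars, 0, count, last);
            var bytes = new byte[byteCount];
            encoder.GetBytes(chars, 0, count, bytes, 0, last);
            pieces.Add(bytes);
        }

        return pieces;
    }
}
```

Each piece stays far below the 2GB object limit, and the pieces can be written out or sent on one at a time in order. If you need the chunks to be independently decodable, you would have to cut on character boundaries instead, which this sketch does not attempt.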

Graham
ScottishTapWater