I have a stream that contains several \0 characters. I need to replace textual parts of this stream, but when I do

StreamReader reader = new StreamReader(stream);
string text = reader.ReadToEnd();

text contains only the beginning of the stream (because of the \0 character). So

text = text.Replace(search, replace);
StreamWriter writer = new StreamWriter(stream);
writer.Write(text);

will not do the expected job, since I don't get the "full" stream. Any idea how to access the full data and replace some textual parts?

EDIT: An example of what I see in Notepad:

stream
H‰­—[oã6…ÿÛe)Rêq%ÙrlËñE±“-úàÝE[,’íKÿþŽDjxÉ6ŒÅ"XkÏáGqF   að÷óð!SN>¿¿‰È†/$ËÙpñ<^HVÀHuñ'¹¿à»U?`äŸ?
¾fØø(Ç,ükøéàâ+ùõ7øø2ÜTJ«¶Ïäd×SÿgªŸF_ß8ÜU@<Q¨|œp6åâ-ªÕ]³®7Ûn¹ÚÝ|‰,¨¹^ãI©…Ë<UIÐI‡Û©* Ǽ,,ý¬5O->qä›Ü
endstream 
endobj
8 0 obj
<<
/Type /FontDescriptor
/FontName /Verdana
/Ascent 765
/Descent -207
/CapHeight 1489
/Flags 32
/ItalicAngle 0
/StemV 86
/StemH 0
/FontBBox [ -560 -303 1523 1051 ]
/FontFile2 31 0 R
>>
endobj
9 0 obj

I want to replace /FontName /Verdana with /FontName /Arial on the fly, for example.

Nicolas Voron
  • No, `ReadToEnd` doesn't use `\0` as an "end of stream" character. Your diagnosis may be messed up by it though. Try printing the length. Where does this data come from, and should it *really* have these characters in? Is it possible you're just using the wrong encoding? – Jon Skeet Aug 06 '13 at 15:01
  • I admit that I don't really know whether these characters are present; I just suspect it. The fact is that `ReadToEnd()` doesn't give me the full file text. This file is a PDF which contains `stream ... endstream` parts – Nicolas Voron Aug 06 '13 at 15:05
  • Ah, right. Wish you'd said so to start with. See my answer. – Jon Skeet Aug 06 '13 at 15:09
  • @NicolasVoron: What do you plan to do with the PDF? Show it to the user? Extract Text? – Brian Aug 07 '13 at 13:56
  • @Brian The PDF I'm trying to read is a sort of template. I want to modify some tags in it. No display or text extraction, just replacing some known tag values which are visible in Notepad (see my edit). – Nicolas Voron Aug 07 '13 at 14:48

2 Answers

I can't duplicate your results. The code below creates a string with a \0 in it, writes it to a file, and then reads it back. The resulting string has the \0 in it:

        string s = "hello\x0world";
        File.WriteAllText("foo.txt", s);
        string t;
        using (var f = new StreamReader("foo.txt"))
        {
            t = f.ReadToEnd();
        }
        Console.WriteLine(t == s);  // prints "True"

I get the same results if I do var t = File.ReadAllText("foo.txt");
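
Following the suggestion in the comments to print the length, a quick check along these lines may help confirm whether the string really stops at the first \0. This is only a minimal sketch: the sample data, the `NullCharCheck` class, and the `ReadAll` helper are made up for illustration and assume UTF-8 text.

```csharp
using System;
using System.IO;
using System.Text;

public static class NullCharCheck
{
    // Reads all text from the stream, the same way the question does.
    public static string ReadAll(Stream stream)
    {
        using (var reader = new StreamReader(stream, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical sample data: 11 characters with a \0 in the middle.
        byte[] data = Encoding.UTF8.GetBytes("hello\0world");
        string text = ReadAll(new MemoryStream(data));

        Console.WriteLine(text.Length);        // 11: the \0 and everything after it are present
        Console.WriteLine(text.IndexOf('\0')); // 5: position of the first \0
    }
}
```

If the length matches the expected size, the data is all there and the truncation is more likely happening in whatever is used to display the string.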

Jim Mischel

Ah, now we're getting to it...

This file is a PDF

Then it's not a text file. That's a binary file, and should be treated as a binary file. Using StreamReader on it will lose data. You'll need to use a different API to access the data in it - one which understands the PDF format. Have a look at iTextSharp or PDFTron.
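
To see why StreamReader loses data on binary content, it can help to round-trip a few non-text bytes through a text encoding. The sketch below assumes UTF-8, which is what StreamReader and StreamWriter use when no encoding is specified; the four sample bytes and the `BinaryRoundTrip` and `ThroughText` names are hypothetical and chosen only for illustration.

```csharp
using System;
using System.Linq;
using System.Text;

public static class BinaryRoundTrip
{
    // Decodes the bytes as UTF-8 text and encodes the text back to bytes,
    // which is effectively what a StreamReader/StreamWriter pair does.
    public static byte[] ThroughText(byte[] original)
    {
        string text = Encoding.UTF8.GetString(original);
        return Encoding.UTF8.GetBytes(text);
    }

    public static void Main()
    {
        // 'H' followed by three bytes that do not form valid UTF-8.
        byte[] original = { 0x48, 0x89, 0xAD, 0x97 };
        byte[] roundTripped = ThroughText(original);

        // Invalid sequences are decoded to replacement characters,
        // so the original bytes cannot be recovered.
        Console.WriteLine(original.SequenceEqual(roundTripped)); // False
    }
}
```

Pure ASCII parts of the file, such as `/FontName /Verdana`, survive this round trip, but the compressed `stream ... endstream` sections do not, so writing the decoded text back would corrupt them.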

Jon Skeet
  • It sounds like iTextSharp is not designed for use with Metro style apps and may rely on parts of the .NET Framework that aren't available to Metro style apps. In fact, this file is a combination of text and binary data. Is there no other way to modify it? – Nicolas Voron Aug 06 '13 at 15:13
  • @NicolasVoron: If it's a PDF, then yes, it's a mixture of text content and binaries - but you need to understand the file format in order to work with it. If iTextSharp doesn't work for you, have a look for libraries which do - but abandon any thoughts of just using `StreamReader`, which is designed for *just* plain text. – Jon Skeet Aug 06 '13 at 15:14
  • @NicolasVoron: Have just added a link to PDFTron, which may be more appropriate. – Jon Skeet Aug 06 '13 at 15:15
  • Thank you, Jon. PDFTron isn't free, but I'll search for another one. – Nicolas Voron Aug 06 '13 at 15:18
  • @NicolasVoron: You never said it had to be free :) It sounds like there are a lot of constraints which weren't explicitly mentioned to start with. (It's easy to miss tags, for example. If you're aware of an unusual requirement, it's best to state that in the question.) I wouldn't be surprised to find that there are no good PDF libraries available for WinRT at the moment. – Jon Skeet Aug 06 '13 at 15:21
  • You're right (you can remove the "good" from your sentence; it's just as true). In fact, due to this lack of PDF libraries, I was trying to make a template and edit parts of it myself (there are so few things to change)... It seems to be a bad idea, in the end! – Nicolas Voron Aug 06 '13 at 15:23
  • And please forgive my badly asked question; I will try to make the next one clearer and more complete ;) – Nicolas Voron Aug 06 '13 at 15:25