StreamReader.CurrentEncoding differs when run through Chocolatey or Octopus Deploy

Question

I've got an Azure VM with a number of files on it. Some of these files are pretty messed up: for example, they contain a UTF-8 BOM along with non-UTF-8 characters, in particular smart quotes like so:

<option ef="“Late”" />
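For reference, here is a minimal sketch of how one might confirm whether a given file actually starts with the UTF-8 BOM (the bytes EF BB BF). The class and method names (BomCheck, HasUtf8Bom) are hypothetical and not part of my utility; the sketch only inspects the first three bytes and says nothing about the rest of the content.

```csharp
using System;
using System.IO;

static class BomCheck
{
    // Hypothetical helper: true when the data starts with the UTF-8 BOM (EF BB BF).
    public static bool HasUtf8Bom(byte[] data)
    {
        return data.Length >= 3
            && data[0] == 0xEF
            && data[1] == 0xBB
            && data[2] == 0xBF;
    }

    static void Main(string[] args)
    {
        if (args.Length != 1)
        {
            Console.Error.WriteLine("usage: BomCheck <file>");
            return;
        }

        // Read the raw bytes so that no decoding happens before the check.
        byte[] data = File.ReadAllBytes(args[0]);
        Console.WriteLine(HasUtf8Bom(data) ? "UTF-8 BOM present" : "no UTF-8 BOM");
    }
}
```

Running this against the same file in each environment would show whether the bytes on disk are really identical in both cases.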

In order to fix this, I have a small C# utility that opens a StreamReader on each:

StreamReader sr = new StreamReader(filename, Encoding.ASCII, true);

It then calls .ReadToEnd() and checks CurrentEncoding. If I run this process in a PowerShell window, it returns System.Text.ASCIIEncoding, which is what I expect because of the smart quotes and because that is what it returns when run anywhere else. If I run it inside a Chocolatey package or through Octopus Deploy, CurrentEncoding equals System.Text.UTF8Encoding.

I'm calling .ReadToEnd() because MSDN says that performing a read will set the encoding correctly. What is different about Chocolatey and Octopus Deploy that makes StreamReader guess the wrong encoding?
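To make the behaviour easier to reproduce, here is a small self-contained sketch that mirrors the constructor call above on two temporary files, one with a UTF-8 BOM and one without. The names EncodingProbe and Detect are hypothetical. It reports WebName rather than the type name, since the concrete type returned by CurrentEncoding may vary between runtimes. As far as I understand it, the third constructor argument only enables byte order mark detection, so I would expect the BOM case to report utf-8 and the other case to keep the ASCII encoding passed in.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingProbe
{
    // Hypothetical helper: writes the bytes to a temp file, reads it the same way
    // as in the question, and returns the WebName of the resulting CurrentEncoding.
    public static string Detect(byte[] content)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, content);
            using (var sr = new StreamReader(path, Encoding.ASCII, true))
            {
                sr.ReadToEnd(); // CurrentEncoding can change after the first read
                return sr.CurrentEncoding.WebName;
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        byte[] body = Encoding.ASCII.GetBytes("<option />");
        byte[] bom = { 0xEF, 0xBB, 0xBF };

        // Prepend the UTF-8 BOM to the same body.
        byte[] withBom = new byte[bom.Length + body.Length];
        Buffer.BlockCopy(bom, 0, withBom, 0, bom.Length);
        Buffer.BlockCopy(body, 0, withBom, bom.Length, body.Length);

        Console.WriteLine("no BOM:   " + Detect(body));
        Console.WriteLine("with BOM: " + Detect(withBom));
    }
}
```

If the two environments print different results for the same input bytes, that would point at the runtime or process setup; if they print the same results, the files being read are probably not byte-for-byte identical.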

What do you mean by "accurate"? An encoding is variable by nature. — Tigran, May 03 '17 at 18:06
What makes you think that's not accurate? If chocolatey changes the encoding to UTF8 then your process will be launched as UTF8. — Gusman, May 03 '17 at 18:07
@Tigran I know these files are ASCII because everywhere this is run except through my new chocolatey package, it detects them as ASCII and converts them to UTF8. — sirdank, May 03 '17 at 18:13
@Gusman Please see my comment above and also my post update. — sirdank, May 03 '17 at 18:16
As long as a file has no BOM there is no implicit encoding, the raw data can be interpreted differently. — , May 03 '17 at 18:20
Just a note, it says "The value can be different after the first call to any Read method". That doesn't mean it will always set the correct encoding; if the file has a BOM or any other tag that identifies the encoding, then the stream will change it. — Gusman, May 03 '17 at 18:21
@LotPings It still seems like the same code, run against the same file, should produce the same result even if it could plausibly be interpreted differently under different circumstances. — sirdank, May 03 '17 at 18:23
@Gusman You are correct but it would appear, since I'm specifying ASCII at the time of instantiation, that running under chocolatey is actually producing an incorrect change to UTF8. — sirdank, May 03 '17 at 18:25
Maybe if you post all the info at once instead of small pieces on edits we could see the full picture... — Gusman, May 03 '17 at 18:26
@sirdank We are going to need a lot more context here - plus this looks like a bug - best to log it at https://github.com/chocolatey/choco/issues so that we can triage it better and prioritize it. HTH — ferventcoder, May 05 '17 at 03:50
@ferventcoder QA said it happens through chocolatey as well as octopus deploy so I'm no longer sure it's related to chocolatey. I can still submit this as a bug if you think I ought to. — sirdank, May 05 '17 at 18:45
