2

I recently needed to do an ISNULL in SQL on a varbinary image.
So far so (ab)normal. I very quickly wrote a C# program to read in the file no_image.png from my desktop, and output the bytes as a hex string.

That program started like this:

byte[] ba = System.IO.File.ReadAllBytes(@"‪D:\UserName\Desktop\no_image.png");
Console.WriteLine(ba.Length);
// From here, change ba to hex string

And as I had used ReadAllBytes countless times before, I figured no big deal.
To my surprise, I got a "NotSupported" exception on ReadAllBytes.

I found that the problem was that when I right click on the file, go to tab "Security", and copy-paste the object-name (start marking at the right and move inaccurately to the left), this happens.

And it happens only on Windows 8.1 (and perhaps 8), but not on Windows 7.

202A

When I output the string in question:

public static string ToHexString(string input)
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder();

    // Append each UTF-16 code unit as hex (at least two digits)
    foreach (char c in input)
    {
        sb.Append(((int)c).ToString("X2"));
    }

    return sb.ToString();
} // End Function ToHexString

string str = ToHexString(@"‪D:\UserName\Desktop\cookie.png");
string strRight = " (" + ToHexString(@"D:\UserName\Desktop\cookie.png") + ")"; // Correct value, for comparison

string msg = str + Environment.NewLine + "  " + strRight;
Console.WriteLine(msg);

I get this:

202A443A5C557365724E616D655C4465736B746F705C636F6F6B69652E706E67
   (443A5C557365724E616D655C4465736B746F705C636F6F6B69652E706E67)

First thing, when I look up 20 2A in ASCII, it's [space] + *

Since I see neither a space nor a star, I googled 20 2A, and the first thing I get is paragraph 202a of the German penal code http://dejure.org/gesetze/StGB/202a.html

But I suppose that is just an unfortunate coincidence, and it is actually the Unicode control character 'LEFT-TO-RIGHT EMBEDDING' (U+202A) http://www.fileformat.info/info/unicode/char/202a/index.htm

Is that a bug, or is that a feature?
My guess is, it's a buggy feature.

Stefan Steiger

3 Answers

3

Filenames that contain RLO/LRO overrides are commonly created by malware. E.g. “exe” read backwards spells “malware”. You probably have an infected host, or the origin of the .png is infected.

Remus Rusanu
  • Well, since the "filename" in question is the Windows drive letter and not the actual filename, you're suggesting Windows is malware? Then all files on my PC are infected *aaaah* The funny thing is, this probably isn't that far from the truth ;) I can also get that RLO/LRO if I just make a screenshot, in fact with any file on drives C & D, which are the only writable drives I have. Besides, this is hardly an infected host: it is a brand-new machine (1 week), and I don't have any administration rights on it. But interesting that exe just happens to be a palindrome ;) – Stefan Steiger Jun 04 '14 at 13:43
  • Speculating here, but perhaps Win8 has some [builtin defenses](http://an7isec.blogspot.com/2014/03/winrar-file-extension-spoofing-0day.html) against such malware and you're seeing an expression of the defense at work... – Remus Rusanu Jun 04 '14 at 13:55
  • Just wanted to say, the name of the malware is Windows 8 with Arabic/Hebrew (or generally right-to-left languages) i18n. – Stefan Steiger Sep 29 '16 at 08:46
3

The issue is that the string does not begin with a letter D at all - it just looks like it does.

It appears that the string is hard-coded in your source file.

If that's the case, then you have pasted the string from the security dialog. Unbeknownst to you, the string you pasted begins with the LRE character (U+202A, LEFT-TO-RIGHT EMBEDDING). This is an invisible character which takes no space, but tells the renderer to lay out the text that follows from left to right.

You just need to delete the character.

To do this, position the cursor AFTER the D in the string. Use the Backspace or Delete to Left key <x] to delete the D. Use the key again to delete the invisible control character. One more time to delete the ". Now retype the " and the D.

A similar problem could occur wherever the string came from - e.g. from user input, command line, script file etc.
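Where the path comes from such a source and cannot be fixed by hand, one option is to strip the bidirectional formatting characters before using it. The following is only a minimal sketch: `PathCleaner` and `StripBidiControls` are hypothetical names, and the set of characters removed (U+200E, U+200F, U+202A to U+202E, U+2066 to U+2069) is an assumption about what is unwanted here, since a file name could in principle legitimately contain them.

```csharp
using System;
using System.Text;

public static class PathCleaner
{
    // Assumption: these are the characters we want gone.
    // U+200E/U+200F are the LTR/RTL marks, U+202A..U+202E the
    // embeddings/overrides (incl. U+202A from the question),
    // U+2066..U+2069 the isolate controls.
    private static bool IsBidiControl(char c)
    {
        return c == '\u200E' || c == '\u200F'
            || (c >= '\u202A' && c <= '\u202E')
            || (c >= '\u2066' && c <= '\u2069');
    }

    // Returns a copy of the input with all bidi control characters removed.
    public static string StripBidiControls(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (!IsBidiControl(c)) sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this, a call such as `System.IO.File.ReadAllBytes(PathCleaner.StripBidiControls(path))` would receive the path without the leading U+202A.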

Note: The security dialog shows the filename beginning with this left-to-right control character to ensure that characters are displayed in left-to-right order, which is necessary for the hierarchy to be understood correctly when RTL characters are involved. E.g. a filename c:\folder\path\to\file in Arabic might be c:\folder\مسار/إلى/ملف. The "gotcha" is that the Arabic parts read in the other direction, so the word "path", which according to Google Translate is مسار, is the rightmost word, making it appear as if it were the last element of the path, when in fact it is the element immediately after "c:\folder\".

Because security object paths have a hierarchy which is in conflict with the RTL text layout rules, the security dialog always displays RTL text in LTR mode. That means that the Arabic words will be mangled (letters in the wrong order) on the security tab. (Imagine it as if it said "elif ot htap"). So the meaning is just about discernible, but from the point of view of security, the semantics are preserved.

Ben
  • LoL, right-to-left languages reverse the hierarchy - that's a very interesting point. Mixing names is even more interesting. The part before the note is nothing new, but ok for others that have the same problem. – Stefan Steiger Sep 29 '16 at 08:55
0

This question bothered me a lot, how would it be possible that a deterministic function would give 2 different results for identical input? After some testing, it turns out that the answer is simple.

If you look through it in your debugger, you will see that the 'D' char in your @"‪D:\UserName\Desktop\cookie.png" (first use of Hex function) is NOT the same char as in @"D:\UserName\Desktop\cookie.png" (second use).

You must have used some other 'D'-like character, probably by unwanted keyboard shortcut or by messing with your Visual Studio character encoding.

It looks exactly the same, but in reality it's not even a single char (try watching the c variable in your ToHexString function).

If you change to the normal 'D' in your first example, it will work fine.
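For those who would rather not step through a debugger, the same inspection can be sketched in code. The helper below (`CharInspector.Describe` is a hypothetical name, not part of the question's code) lists every char of a string with its code point and Unicode category, so that invisible characters show up as entries of their own.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class CharInspector
{
    // Returns one entry per UTF-16 code unit, formatted as
    // "U+XXXX Category", e.g. "U+0044 UppercaseLetter" for 'D'.
    public static List<string> Describe(string input)
    {
        if (input == null) throw new ArgumentNullException(nameof(input));

        var lines = new List<string>();
        foreach (char c in input)
        {
            UnicodeCategory category = char.GetUnicodeCategory(c);
            lines.Add("U+" + ((int)c).ToString("X4") + " " + category);
        }
        return lines;
    }
}
```

Printing the result for the pasted literal from the question, for instance with `foreach (string line in CharInspector.Describe(path)) Console.WriteLine(line);`, should list `U+202A Format` ahead of the entry for `D`.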

Kamil T
  • I know that. But if you look closely and compare str with strRight, you see that the D is just the same D; the problem is that there is a text-flow control character in front of it that you don't see, and that you can only remove by deleting the D. It seems to me this is a buggy Windows hack which is meant to make the file name display from left to right on right-to-left systems - a hack that has some nasty unintended side effects. – Stefan Steiger Jun 04 '14 at 13:30
  • Try to copy the first and the second 'D' into the Watch in debugger. Try `"D" == "‪D"` (first 'D' and the second one). The output will be `false`. The text-flow control character you're talking about isn't BEFORE the first 'D' - it is IN that char ;) – Kamil T Jun 04 '14 at 13:33
  • @Quandary Or, the other way - just type whole path, instead of copying it and pasting from Properties window. You'll see that it will work fine. – Kamil T Jun 04 '14 at 13:33
  • Yes, in the char, that's exactly the point. No, I'm too lazy to type the path, but it works if you mark the path from left to right, which is a really annoying requirement ;) These are 'dead' characters, like the circumflex on the a: â, or the Spanish tilde on n: ñ – Stefan Steiger Jun 04 '14 at 13:47