
I am trying to output the ASCII character 131 (ƒ - Latin small letter f with hook) to a message box but for some strange reason, it appears as an empty string. I have the following VB.NET code:

Dim str As String = Convert.ToChar(131)
MessageBox.Show(str, "test", MessageBoxButtons.OK, MessageBoxIcon.Information)
Debug.Print(str)

In the above, the message box doesn't show anything, but the Debug.Print statement shows the character properly in the "Immediate Window". I have about 70 other ASCII characters that all work fine with this method; only a select few show up as blank (131 and the EN dash, 150).

For example, the following works:

str = Convert.ToChar(164)
MessageBox.Show(str, "test", MessageBoxButtons.OK, MessageBoxIcon.Information)
Debug.Print(str)

I also tried converting to UTF8 but I get the same behavior as in the first code snippet:

Dim utf8Encoding As New System.Text.UTF8Encoding(True)
Dim encodedString() As Byte
str = Convert.ToChar(131)
encodedString = utf8Encoding.GetBytes(str)
Dim str2 As String = utf8Encoding.GetString(encodedString)
MessageBox.Show(str2, "test", MessageBoxButtons.OK, MessageBoxIcon.Information)
Debug.Print(str2)

Is this an encoding problem? Thank you for any insight.

EDIT: Just to clarify, I'm not actually trying to output the character to a message box. That code was just a test. I'm trying to pass the character as a string to a function that uses it in a 3rd party xml editor control, but it shows up as blank. Even while debugging in Visual Studio, you can see its value being equal to "".

EDIT 2: Thanks to some investigation prompted by the accepted answer below, I discovered that I was using the wrong Unicode character. For this ƒ character, the code to use was ToChar(402). This worked perfectly. Thank you all.

http203
  • The character “ƒ” is not an ASCII character. This may or may not be relevant. This character has different code numbers in different character codes. – Jukka K. Korpela Mar 05 '13 at 17:13
  • Looking through the default font in vb.net (MS Sans Serif) I can't see that symbol, pasting into notepad and changing to Sans Serif the symbol shows as a strange char, could it be a font issue? – bendataclear Mar 05 '13 at 17:22
  • The MessageBox uses the default system font. If that character is not present in the default system font, then you will not see the character. You could roll your own MessageBox, or perhaps look at the [Extended Message Box Library](http://www.news2news.com/vfp/?solution=5) which looks like it allows you to change the font (as well as other things) on the standard MessageBox. – codechurn Mar 05 '13 at 17:36
  • [**Use `Option Strict On`**!](http://stackoverflow.com/a/14840761/1968) – Then this code doesn’t even compile. – Konrad Rudolph Mar 05 '13 at 18:45
  • @konrad-rudolph I have it on and it works for me... where are you getting an error? – http203 Mar 05 '13 at 18:47
  • @http203 Hmm. I lack a VB compiler at the moment but the first line should not compile: you are treating a char as a string. – Konrad Rudolph Mar 06 '13 at 09:56

1 Answer


As others have noted, the “ƒ” character is not an ASCII character. ASCII is strictly a 7-bit format (codes 0 to 127), and the "Extended ASCII" characters above 127 are completely different depending on the encoding you are referencing. For example, Windows code page 1250 has a blank for character 131 (0x83), but code page 1252 has the “ƒ” character in that slot.
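To make the difference concrete, here is a minimal console sketch. It assumes the .NET Framework (as in this thread), where code page 1252 is available without extra setup; the module name `CodePageDemo` is hypothetical. It contrasts `Convert.ToChar(131)`, which treats 131 as a Unicode code point, with decoding the byte 131 as Windows-1252.

```vbnet
Imports System
Imports System.Text

Module CodePageDemo
    Sub Main()
        ' Convert.ToChar(131) yields Unicode code point U+0083,
        ' a C1 control character with no visible glyph.
        Dim direct As Char = Convert.ToChar(131)
        Console.WriteLine(Convert.ToInt32(direct))   ' 131
        Console.WriteLine(Char.IsControl(direct))    ' True

        ' Decoding the single byte 131 as Windows-1252 yields U+0192 instead.
        Dim decoded As String = Encoding.GetEncoding(1252).GetString(New Byte() {131})
        Console.WriteLine(Convert.ToInt32(decoded(0)))   ' 402
    End Sub
End Module
```

A control character has nothing to draw, which is consistent with the string looking empty in a message box or a debugger tooltip.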

I use 1252 in the example below, but if you are converting a larger body of encoded ASCII text you should be sure to properly identify the encoding in use and use the correct codepage to convert.

The best way to handle this, I think, is just to convert everything to Unicode and stay away from extended ASCII except where it is absolutely necessary for legacy reasons. To get the “ƒ” character, however, you can do, for example :

Imports System.Text

and then :

Dim enc1252 As Encoding = Encoding.GetEncoding(1252)
Dim bArr(0) As Byte
bArr(0) = CByte(131)

Dim str2 As String = Encoding.Unicode.GetString( _
                     Encoding.Convert(enc1252, Encoding.Unicode, bArr))

MessageBox.Show(str2, " test", MessageBoxButtons.OK, _
                MessageBoxIcon.Information)

Visual Studio and .NET strings use Unicode natively, however, so if you just need to show the "ƒ" character and don't need to convert any legacy text, you can simply do:

MessageBox.Show("ƒ", " test", MessageBoxButtons.OK, _
                MessageBoxIcon.Information)
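If a literal in the source file is inconvenient, the character can also be built from its Unicode code point. The following is only a sketch under the same .NET Framework assumption, with a hypothetical module name `CodePointDemo`; it checks that code points U+0192 and U+2013 match what Windows-1252 produces for bytes 131 and 150 (the two values reported as blank in the question).

```vbnet
Imports System
Imports System.Text

Module CodePointDemo
    Sub Main()
        Dim enc1252 As Encoding = Encoding.GetEncoding(1252)

        ' Build the characters directly from their Unicode code points.
        Dim fHook As String = Convert.ToChar(&H192).ToString()    ' U+0192, decimal 402
        Dim enDash As String = Convert.ToChar(&H2013).ToString()  ' U+2013, decimal 8211

        ' Compare with the strings obtained by decoding the legacy bytes.
        Console.WriteLine(fHook = enc1252.GetString(New Byte() {131}))   ' True
        Console.WriteLine(enDash = enc1252.GetString(New Byte() {150}))  ' True
    End Sub
End Module
```

Going through the encoding is the safer choice when converting a whole body of legacy text; the code point form is enough when only one or two known characters are needed.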
J...
  • Thank you for the informative solution! (using the 1252 encoding worked within my context, simply passing the hardcoded character to the function did _not_ work) – http203 Mar 05 '13 at 18:24
  • @http203 Which version of VisualStudio are you using? – J... Mar 05 '13 at 18:39
  • @J 2008 at the moment. Btw, I discovered that ToChar(402) outputs the character correctly... what code page is Visual Studio using, and what official documentation should I be referencing? – http203 Mar 05 '13 at 18:46
  • @http203 This is because 402 (0x192) is the Unicode code point of the ƒ character (U+0192). It is not a UTF-8 byte value: .NET strings are UTF-16, and UTF-8 would encode this character as the two bytes C6 92. Visual Studio, as far as I am aware, uses Unicode exclusively (except for C++, I think, where you can specify in the project settings to use ANSI)... I'm a bit puzzled as to why the hardcoded character doesn't work for you. If you don't need to convert a body of text (i.e. you just want the ƒ) then yes, you can just use character 402 directly and skip the route through the conversion. Still, you should be able to just use Unicode directly in the IDE... – J... Mar 05 '13 at 18:47
  • @J Ok, I investigated to answer your question. When I paste the f character you have here, it works hardcoded. When I was copying it from the Immediate Window from the Debug.Print statement (result from ToChar(131) as a string), it didn't work because it was actually a different character. I copied both into notepad, yours shows up and the other doesn't. – http203 Mar 05 '13 at 18:54
  • Your conversion code is way too complicated (it does redundant work). You can also directly initialise the char array, no need for a separate assignment. I’m also of the firm conviction that your array declaration syntax should be regarded as outdated, and not be used. Use the .NET syntax instead. This leaves us with two lines instead of five: https://gist.github.com/klmr/5098252 – Konrad Rudolph Mar 06 '13 at 10:05