1

How do I convert (as an example):

Señor Coconut Y Su Conjunto - Introducciõn

to:

SeÃ±or Coconut Y Su Conjunto - IntroducciÃµn

I've got an app that creates m3u playlists, but when a track's filename, artist or title contains non-ASCII characters, the music player doesn't read the entry properly and the track doesn't get played.

I've discovered that if I write the track out as:

#EXTINFUTF8:76,SeÃ±or Coconut Y Su Conjunto - IntroducciÃµn
#EXTINF:76,Señor Coconut Y Su Conjunto - Introducciõn
#UTF8:01-IntroducciÃµn.mp3
01-Introducciõn.mp3

Then the music player will read it correctly and play the track.

My problem is that I can't find the information I need to be able to do the conversion properly.

I've tried the following:

    byte[] byteArray = Encoding.UTF8.GetBytes(output);
    foreach (Byte b in byteArray)
    {
        playList.Write(b);
    }

where playList = new StreamWriter(filename, false); but I just get a series of numbers output:

#EXTINFUTF8:76,83101195177111114326711199111110117116328932831173267111110106117110116111 - 731101161141111001179999105195181110

which I guess are the numerical values of the characters rather than the characters themselves.
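For illustration, a minimal sketch of why digits appear. The sample text is a shortened stand-in for `output`, and a `StringWriter` replaces the file-backed `StreamWriter` so the result stays in memory; both are assumptions for the demo. Text writers have no `Write(byte)` overload, so each byte is widened to `int` and written as decimal text:

```csharp
using System;
using System.IO;
using System.Text;

// Shortened stand-in for the `output` string: "Se" followed by n-tilde (U+00F1).
string sample = "Se\u00F1";
byte[] utf8Bytes = Encoding.UTF8.GetBytes(sample);

// StringWriter stands in for the StreamWriter; both share the TextWriter Write overloads.
var asNumbers = new StringWriter();
foreach (byte b in utf8Bytes)
{
    // There is no Write(byte), so this binds to Write(int)
    // and emits the decimal digits of each byte value.
    asNumbers.Write(b);
}

Console.WriteLine(asNumbers.ToString()); // prints 83101195177
```

The digits 83, 101, 195, 177 match the start of the output shown above: `S`, `e`, and the two UTF-8 bytes of `ñ`.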

It's been a while since I've done this kind of low-level character manipulation and I'm a little rusty.

UPDATE

I've now got:

    byte[] byteArray = Encoding.UTF8.GetBytes(output);
    foreach (Byte b in byteArray)
    {
        playList.Write(Convert.ToChar(b));
    }

to do the output and at first glance it appeared to be working. The file as seen in Notepad++ is showing the correct information. However, the first track still isn't being played.

ChrisF
  • playList.Write("{0}", b); this converts byte to number. – Andrey Apr 13 '10 at 21:20
  • @Andrey - I realise that ;) I don't know what to replace it with. – ChrisF Apr 13 '10 at 21:21
  • see my comments for my answer – Andrey Apr 13 '10 at 21:23
  • Off-topic comment: The word in the filename ought to be "introducción", not "introducciõn". :-) – asveikau Apr 13 '10 at 21:24
  • @asveikau - off topic answer: you may well be right. On topic answer: I've looked, but have clearly used the wrong search terms as I've not found what I need. Hence my question here. – ChrisF Apr 13 '10 at 21:29
  • How are you opening playList? If you open it as a `TextWriter` with the UTF-8 encoding, you should just be able to write your strings and have them converted for you. Otherwise, what's happening is you're writing `Write(b)` which is the same as `Write((int)b)` which is the integer values. – lavinio Apr 13 '10 at 21:43
  • @lavinio - I'm opening playList like this: `var playList = new StreamWriter(filename, false);` – ChrisF Apr 13 '10 at 21:49
  • Have you tried casting the byte to a char? playlist.Write((char)b); (or look for a WriteByte method). I'm not sure if you can get a new char(b) if the cast doesn't work. If Write doesn't take a char you may have to form a one-char string: new string(c, 1); (or maybe works without ,1 as well). – Rob Parker Apr 13 '10 at 21:59
  • @Rob - I was just about to post an update. I've just tried `playList.Write(Convert.ToChar(b));` and it worked (well almost - the first track isn't getting played) – ChrisF Apr 13 '10 at 22:01
  • what exactly is 'playlist'? If it's a Stream it ought to work. If it's a TextWriter then it won't. You have to solve writing the #EXTINF leader and the Byte array to the same destination. – H H Apr 13 '10 at 22:10
  • @Henk - `playList` is a `StreamWriter` - I should have mentioned that in the question – ChrisF Apr 13 '10 at 22:17
  • In case you missed this in the comments on Andrey's answer... If it doesn't like extended-ASCII (high-bit), you might try Encoding.UTF7 instead. Presumably that would keep it to 7-bit values with the high-bit clear. – Rob Parker Apr 13 '10 at 22:53

2 Answers

2

You want the whole stream to be in UTF-8. Try:

    StreamWriter playList = new StreamWriter(filename, false, System.Text.Encoding.UTF8);

Now, to write to the stream, just pass your String named output like this:

    playList.Write(output);

The stream will now all be in the proper encoding, so you should also just be able to do something like:

    playList.WriteLine("#EXTINFUTF8:76,Señor Coconut Y Su Conjunto - Introducciõn");
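
As a self-contained sketch of the approach above: the file name `playlist-demo.m3u` in the temp directory and the track text are placeholders, not values from the original app. One detail worth knowing is that `new StreamWriter(filename, false)` already encodes as UTF-8 but without a byte order mark, whereas passing `Encoding.UTF8` also writes the three-byte mark at the start of a new file. Whether a particular player depends on that mark is an assumption to verify, not something established here.

```csharp
using System;
using System.IO;
using System.Text;

// Placeholder path for the demo; the real app supplies its own filename.
string demoPath = Path.Combine(Path.GetTempPath(), "playlist-demo.m3u");

// The writer encodes every string it is given, so no GetBytes call is needed.
using (var playList = new StreamWriter(demoPath, false, Encoding.UTF8))
{
    playList.WriteLine("#EXTINFUTF8:76,Se\u00F1or Coconut Y Su Conjunto - Introducci\u00F5n");
    playList.WriteLine("01-Introducci\u00F5n.mp3");
}

// Inspect the raw bytes that reached the disk.
byte[] written = File.ReadAllBytes(demoPath);
bool startsWithBom = written.Length >= 3
    && written[0] == 0xEF && written[1] == 0xBB && written[2] == 0xBF;
Console.WriteLine("BOM present: " + startsWithBom);

// Decoding the file as UTF-8 recovers the original text.
string roundTrip = File.ReadAllText(demoPath, Encoding.UTF8);
Console.WriteLine(roundTrip.Contains("Se\u00F1or"));
```
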
lavinio
  • Yes - that did it. I realised this myself last night as I was dropping off to sleep & was just about to try it when I saw your answer. – ChrisF Apr 14 '10 at 10:04
0

Well, try writing in the encoding the player expects, and that is UTF-8 (I guess)!

    byte[] bytesToWrite = Encoding.UTF8.GetBytes(yourString);

See the #UTF8 in your sample?
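
If the byte array route is preferred, the bytes need to go to a byte-oriented `Stream` rather than a text writer, otherwise they are formatted as text. A rough sketch, using a placeholder path and sample line that are not from the original app:

```csharp
using System;
using System.IO;
using System.Text;

// Placeholder path and text for the demo.
string bytesDemoPath = Path.Combine(Path.GetTempPath(), "playlist-bytes-demo.m3u");
string line = "01-Introducci\u00F5n.mp3\r\n";

byte[] bytesToWrite = Encoding.UTF8.GetBytes(line);

// FileStream.Write copies the bytes unchanged; nothing is re-encoded.
using (var stream = new FileStream(bytesDemoPath, FileMode.Create, FileAccess.Write))
{
    stream.Write(bytesToWrite, 0, bytesToWrite.Length);
}

// Dump the file as hex to confirm the o-tilde was stored as the pair C3 B5.
byte[] onDisk = File.ReadAllBytes(bytesDemoPath);
Console.WriteLine(BitConverter.ToString(onDisk));
```

Mixing this with a `StreamWriter` on the same file takes care, since the header text and the raw bytes must reach the same destination in order; using one text writer with the right encoding avoids that problem.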

Andrey
  • I should have added that I've tried that, but just get a series of numbers. – ChrisF Apr 13 '10 at 21:17
  • then you output bytes to file incorrectly. i bet (10 bucks :) ) that it IS utf8. just output it correctly – Andrey Apr 13 '10 at 21:19
  • use StreamWriter and pass Encoding.UTF8 to ctor. it will do the trick – Andrey Apr 13 '10 at 21:19
  • create StreamWriter with this ctor http://msdn.microsoft.com/library/f5f5x7kt.aspx then call WriteLine(output) or Write(output). in this case you don't need to call GetBytes – Andrey Apr 13 '10 at 21:24
  • That just outputs exactly the same character string "Señor" not "SeÃ±or" (for example). – ChrisF Apr 13 '10 at 21:29
  • how do you check that? using notepad? open file with hex editor or smart notepad++. Señor is same as SeÃ±or depending on encoding you open it with. – Andrey Apr 13 '10 at 21:32
  • or when using notepad click Open and pick ASCII there – Andrey Apr 13 '10 at 21:32
  • I've got an m3u file generated by MediaMonkey and I'm checking it in Notepad++. It displays the same in Notepad too. I know "Señor" and "SeÃ±or" are the same, but I need to have the file with no high ASCII characters (as we used to call them) for MM to understand it. – ChrisF Apr 13 '10 at 21:35
  • Try Encoding.UTF7. That should keep the high bit out of it. – Rob Parker Apr 13 '10 at 22:02
  • click in notepad++, click Encoding, Character Set, Western European, ISO 8859-1 then copy-paste SeÃ±or there. then click Encoding, Encode in UTF-8 and witness how SeÃ±or turns into Señor. you need just to output your file in C# in UTF8. I gave you the link to msdn. – Andrey Apr 14 '10 at 10:49
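
The two spellings discussed in these comments are the same bytes viewed through different encodings. A small sketch that reproduces the effect in code rather than in Notepad++, assuming ISO 8859-1 as the single-byte encoding (the one named in the last comment):

```csharp
using System;
using System.Text;

string original = "Se\u00F1or"; // "Senor" with n-tilde (U+00F1)
Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

// Encode as UTF-8: the n-tilde becomes the two bytes C3 B1.
byte[] utf8 = Encoding.UTF8.GetBytes(original);

// Misread those bytes as ISO 8859-1: each byte becomes one character,
// so C3 B1 shows up as A-tilde followed by a plus-minus sign.
string misread = latin1.GetString(utf8);

// Reverse the mistake: back to the same bytes, then decode as UTF-8.
string repaired = Encoding.UTF8.GetString(latin1.GetBytes(misread));

Console.WriteLine(misread.Length);      // 6 characters instead of 5
Console.WriteLine(repaired == original); // True
```

So a file that looks wrong in an editor set to a single-byte character set can still be valid UTF-8; a hex view of the bytes is the more reliable check.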