0

I am processing a lot of CSV files that contain people data. Occasionally the names use non-ASCII characters like á, and those all become � symbols in the DataTable. How do I prevent this? I just want to leave the names exactly as they are in the file, without making any changes.

Thanks,

Laurence
  • The most common reason for this is that it is actually encoded in ISO-8859-1 and interpreted as UTF-8. For less common reasons, the same principle applies. – Esailija Jul 26 '12 at 12:58
  • Brilliant, and thanks, Esailija. As you said, that was the reason. Would you like to post your comment as an answer? – Laurence Jul 26 '12 at 14:40
  • Only if it was helpful to you and resolved your issue :P – Esailija Jul 26 '12 at 14:41
  • Your comment resolved my issue, Esailija. If you post it as an answer, I will accept it. Thanks. – Laurence Jul 26 '12 at 15:08

3 Answers

1

The most common reason for this is that the file is actually encoded in ISO-8859-1 but is interpreted as UTF-8. For less common causes the same principle applies: something is in a different encoding than it claims to be.
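To make the mismatch concrete, here is a minimal, language-neutral sketch in Python (the question itself is about .NET, so treat this only as an illustration of the principle). The sample name "José" is made up for the example. It shows that ISO-8859-1 bytes decoded as UTF-8 produce the � replacement character, while decoding with the encoding the bytes were written in keeps the name intact.

```python
# A made-up sample name, stored the way an ISO-8859-1 file would store it:
# one byte per character, with "é" as the single byte 0xE9.
raw = "José".encode("iso-8859-1")

# Mismatch: reading those bytes as UTF-8. 0xE9 announces a multi-byte
# sequence that never completes, so the decoder substitutes U+FFFD.
wrong = raw.decode("utf-8", errors="replace")

# Match: decoding with the encoding the bytes were actually written in.
right = raw.decode("iso-8859-1")

print(repr(wrong))
print(repr(right))
```

In .NET the same idea would presumably mean handing the reader an explicit encoding, for example `Encoding.GetEncoding("ISO-8859-1")`, instead of relying on the default.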

Esailija
0

Change the character encoding in the database or decode it when you read from the DB.

NoAlias
0

While processing, you need a reader of some kind (for example a StreamReader). I suggest configuring it with an explicit encoding, such as System.Text.UnicodeEncoding or System.Text.UTF32Encoding.
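As a rough sketch of what "configuring the reader" looks like, the Python snippet below opens a CSV file with an explicitly named encoding. The helper `read_rows`, the sample rows, and the choice of ISO-8859-1 are all assumptions made for illustration. Which encoding to pass depends on how the file was actually written, and in .NET the corresponding step would be passing an `Encoding` instance to the `StreamReader` constructor.

```python
import csv
import os
import tempfile


def read_rows(path, encoding):
    """Read every CSV row as a list of strings, decoding with `encoding`."""
    # newline="" lets the csv module handle line endings itself.
    with open(path, encoding=encoding, newline="") as handle:
        return list(csv.reader(handle))


# Build a small ISO-8859-1 sample file so the sketch runs on its own.
with tempfile.NamedTemporaryFile(
    "w", encoding="iso-8859-1", newline="", suffix=".csv", delete=False
) as sample:
    csv.writer(sample).writerows([["id", "name"], ["1", "José"], ["2", "Zoë"]])
    sample_path = sample.name

# Read it back with the same encoding it was written in.
rows = read_rows(sample_path, "iso-8859-1")
os.remove(sample_path)
print(rows)
```

Naming the encoding at the point where the file is opened keeps the decision in one place, so every downstream consumer (such as a table loader) receives already-correct strings.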

Minus