Questions tagged [non-unicode]

Unicode is intended to be a universal character set covering the characters of all writing systems, along with technical symbols and punctuation. But Unicode isn't supported on every system, and many other character sets exist.

This tag is for encoding questions that deal with non-Unicode character sets. They can be about conversion from or to Unicode, or about handling special characters on systems that do not support Unicode.

Some common character sets:

  • ASCII. 7-bit. Only unaccented Latin characters.
  • ISO-8859-1 (a.k.a. Latin-1). 8-bit. Western European Latin characters.
  • ISO-8859-15 (a.k.a. Latin-9). 8-bit. Like ISO-8859-1, but with a few characters replaced to add the euro sign, among others.
  • CP-1252. 8-bit. Western European Latin characters, used by Windows.
  • CP-850. 8-bit. Western European Latin characters, used by DOS.
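The differences between these character sets show up when the same text is encoded into each of them. The following is a minimal sketch in Python, assuming the standard library codec names `ascii`, `iso-8859-1`, `iso-8859-15`, `cp1252` and `cp850`; the helper names `encode_or_none` and `encoding_table` are hypothetical and exist only for this illustration.

```python
# Codec names as Python's standard library spells them.
CHARSETS = ["ascii", "iso-8859-1", "iso-8859-15", "cp1252", "cp850"]


def encode_or_none(text, charset):
    """Return the bytes for text in charset, or None if it cannot be represented."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None


def encoding_table(text):
    """Map each charset name to the encoded bytes (or None) for text."""
    return {charset: encode_or_none(text, charset) for charset in CHARSETS}


if __name__ == "__main__":
    for sample in ("e", "\u00e9", "\u20ac"):  # e, e with acute accent, euro sign
        for charset, encoded in encoding_table(sample).items():
            shown = encoded.hex() if encoded is not None else "not representable"
            # ascii() keeps the output printable even on a non-Unicode console.
            print(f"{ascii(sample):10} {charset:12} {shown}")
```

With these codecs, "é" is the byte E9 in ISO-8859-1, ISO-8859-15 and CP-1252 but 82 in CP-850, and "€" can be encoded only in ISO-8859-15 (A4) and CP-1252 (80). Decoding bytes with the wrong one of these charsets is a common source of the garbled text described in the questions under this tag.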
63 questions
1 vote, 1 answer

How to script tables in SSMS that contain non-Unicode text

I'm working with some tables in SQL Server that store text using 8-bit characters rather than Unicode -- varchar rather than nvarchar. Some of the text contains characters with values outside the ASCII range, for example curly quotes,…
howard39
1 vote, 0 answers

Replacement character (black diamond question mark) after every character in text

I wrote a simple script in Colab to pull text files from my drive, put them into a string, run them through a function, and print them out. Some text files are saved as ANSI and the text comes out fine. Some text files were saved as Unicode and…
John G.
1 vote, 1 answer

Show text in Modern Greek in Excel

I am working with the Italian-language version of Microsoft Office 2007 and am building a small library management program with VBA, including buttons, msgboxes, etc. My issue is that I can't write Greek in the VBA code and I can't view the msgboxes…
lisarko8077
1 vote, 3 answers

Delete weird ANSI character and convert accented ones using Python

I've downloaded a bunch of Spanish tweets using the Twitter API, but some of them have strange ANSI characters that I don't want there. I have around 18000 files and I want to remove those characters. I have all my files encoded as UTF-8. For…
Ignacio
1 vote, 1 answer

JSON, Unicode: a way to detect that XXXX in \uXXXX does not correspond to a Unicode character?

The JSON specification says that a character may be escaped using this notation: \uXXXX (where XXXX are four hex digits). However, not every set of four hex digits corresponds to a Unicode character. Are there tools that can scan a JSON document…
Roger Costello
1 vote, 1 answer

Non-English letters with Slick Util

I'm using LWJGL with Slick Util in my game. But now I can't add a Russian locale because slick.TrueTypeFont doesn't render Cyrillic letters (Яиssiaп Vodка, Сомяаd, Уер). Does anyone know how to fix it?
Herzmann
1 vote, 1 answer

Why am I getting "�" characters?

I've written a quick-and-dirty utility to parse a text file, but in some cases it's writing out a "�" character. My utility reads from a .txt file which contains "records" in this format: Biography Title:George F. Kennan: An American Life…
B. Clay Shannon-B. Crow Raven
1 vote, 1 answer

SSIS Export to RDB doesn't work with NonUnicode Page File (8859-9)

I'm trying to export data from a table in MS SQL Server 2008 R2 to an RDB database. But I'm having problems exporting Hebrew strings to RDB because my SQL Server is Unicode and my RDB is non-Unicode. Here are the details: I'm using Oracle RDB Data…
1 vote, 2 answers

How to draw char of Wingding.ttf font with Java Graphics.DrawString?

I am trying to draw characters from the Wingding.ttf font with Java Graphics.DrawString. But the resulting image contains only a rectangle instead of the character. BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_4BYTE_ABGR); Graphics graphics =…
Viktor
1 vote, 2 answers

How can we detect whether a div contains a Unicode character with jQuery

I want to find all elements in my web page that contain at least one Unicode character, so that I can apply a specific CSS class to those elements (maybe they are completely Unicode or maybe they are…
Mehrdad201
0 votes, 1 answer

Non-Unicode filenames on webserver

Recently I decided to install an intranet web application (Open Atrium) in our company, but almost immediately I found myself in trouble. We have a huge tree of directories and subdirectories with mixed Greek (non-Unicode) and English filenames. It…
Jim
0 votes, 0 answers

Spell check for non-Unicode font in MS Word

I am using Shree lipi NXT software to type Gujarati and Hindi, but MS Office cannot tell whether non-Unicode words are spelled correctly; in short, the MS Office spell check does not work in my case. So I want a macro which can…
0 votes, 0 answers

SSIS converts non-Unicode to Unicode, but the text is garbled

I have a package that copies data from an Oracle database to a SQL database. I used a data conversion to convert DT_STR -> Unicode string, but my data (which is Vietnamese) comes out garbled: empnm | ----------------+ BUI TH? TH??NG | CAO V?N TIN …
Le Thi Linh
0 votes, 1 answer

How to convert single-byte charset (non-ASCII) ByteArray into Kotlin UTF8 String (How to avoid ��?)

I have an API that produces results in a specific single-byte charset (WIN 1257) and I am reading this result in Kotlin as: val connection = URL("http://192.168.1.21:92/someAPI").openConnection() as HttpURLConnection var byteArray: ByteArray =…
TomR
0 votes, 1 answer

UnicodeEncodeError while transferring ".eml" file to Google Cloud Platform (gsutil v4.6.1 on Linux)

While transferring file(s) from a Linux system to Google Cloud Platform using the gsutil cp command, it fails on some old ".eml" files when trying to process their content (not just the file name!), which contains non-English characters not encoded in…