Questions tagged [utf-16le]

UTF-16LE is the little endian variety of UTF-16 without BOM.

UTF-16LE is the little-endian variant of UTF-16. While text in UTF-16 might be expected to signal its endianness by starting with a byte order mark (BOM), text in UTF-16LE should not. Like UTF-16, UTF-16LE encodes every code point in either two or four bytes.
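As a rough illustration of the two points above (no BOM, and two or four bytes per code point), here is a minimal Python 3 sketch. The sample string is made up for the example; the codec names `utf-16-le` and `utf-16` are Python's standard ones.

```python
import codecs

# Hypothetical sample: an ASCII letter, Greek alpha, and an emoji outside the BMP.
text = "A\u03b1\U0001F600"

# "utf-16-le" writes code units low byte first and never adds a BOM.
le = text.encode("utf-16-le")

# The generic "utf-16" codec prepends a BOM in the machine's native byte order.
generic = text.encode("utf-16")

print(le.hex(" "))             # 41 00 b1 03 3d d8 00 de
print(len(le))                 # 2 + 2 + 4 bytes = 8
print(len(generic) - len(le))  # the BOM accounts for 2 extra bytes
print(generic[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```

The emoji takes four bytes because it is stored as a surrogate pair (D83D DE00), each half written low byte first.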

UTF-16LE is the encoding used by the Windows API and by many frameworks built on it. Most stored text on Windows is actually in a different encoding. Text in both formats on Windows often starts with a BOM, which can confuse software not expecting it.
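The BOM confusion mentioned above can be sketched in Python 3 as follows. The helper name `decode_utf16le` and the sample bytes are assumptions made for this example, not part of any library.

```python
import codecs

def decode_utf16le(data: bytes) -> str:
    """Decode UTF-16LE bytes, dropping a leading BOM if one is present.

    Hypothetical helper for illustration only.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        data = data[len(codecs.BOM_UTF16_LE):]
    return data.decode("utf-16-le")

# Simulated file content as a Windows tool might write it: BOM, then text.
raw = codecs.BOM_UTF16_LE + "MY LINE".encode("utf-16-le")

# Decoding with "utf-16-le" keeps the BOM as a U+FEFF character,
# so prefix checks on the decoded text can fail unexpectedly.
naive = raw.decode("utf-16-le")
print(naive.startswith("MY LINE"))  # False

clean = decode_utf16le(raw)
print(clean.startswith("MY LINE"))  # True
```

If the input is known to always carry a BOM, decoding with Python's generic `utf-16` codec is an alternative, since that codec consumes the BOM itself.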

For more details, consider the UTF-16 tag instead.

82 questions
1
vote
0 answers

reading a greek character from a file

I am trying to read a line from a file (in utf-16-le) that contains Greek letters. Here is the code: f = codecs.open("dump.tmp", "r",'utf-16-le') fr = f.readlines() for line in fr: if line.startswith("MY LINE …
user741592
  • 875
  • 3
  • 10
  • 25
1
vote
3 answers

Writing unicode with python - what is wrong with this character

With Python 2.7 I am reading as unicode and writing as utf-16-le. Most characters are correctly interpreted. But some are not, for example, u'\u810a', also known as unichr(33034). The following code does not write correctly: import codecs with…
philshem
  • 24,761
  • 8
  • 61
  • 127
1
vote
1 answer

How to save ASCII as Unicode (UTF-16LE) in C/C++ (cpp)?

As you may remember, Windows Notepad offers a choice of encodings in its "Save As.." function: ASCII (default), UTF-8, Unicode and Big Endian. I need to make a program which does something with the text of an ASCII .txt file and saves the result as a Unicode .txt file. As i…
shahan
  • 79
  • 7
0
votes
3 answers

Python network convert byte

I'm trying to implement a CRC verification on a network based protocol. CRC calculation is done via the PyCRC lib. PyCRC will generate a checksum for the given packet and return a result like this: CB3D9FD1 When I try to send it on the wire, somehow…
n00bz0r
  • 87
  • 9
0
votes
0 answers

.Net 6.0 Minimal API with UTF-16LE encoding

I am trying to build a small Web API in .Net (Minimal Web API with .Net 6.0) but I need the response to be of type application/json with UTF-16LE encoding. Is there a way to do so? I built the API using the minimal template and it works fine…
niceGuy
  • 1
  • 1
0
votes
0 answers

Pandas. What is the correct way to read two files together?

There are two files. https://www.imf.org/-/media/Files/Publications/WEO/WEO-Database/2022/WEOApr2022all.ashx https://www.imf.org/-/media/Files/Publications/WEO/WEO-Database/2022/WEOOct2022all.ashx I need to safely (to have correct data) read them…
0
votes
1 answer

How to read utf-16le file with and test regex matches against it without converting to utf8

I have a 1.5GB text file with the UTF-16LE encoding. I want to read it and test regex matches with the lines of the file. Right now I use the following two crates. encoding_rs = "0.8.31" encoding_rs_io = "0.1.7" The code to read the file looks like…
Fajela Tajkiya
  • 629
  • 2
  • 10
0
votes
0 answers

Issue with splitting text file into smaller files by rows and bytes

I have several "UTF-16-LE with BOM" encoded files that are roughly 10~50MB in size. I'm trying to split these files into smaller files no bigger than 1MB (e.g., "File1.txt" into "File1-part-0.txt", "File1-part-1.txt", and so on). After running my…
Andy Garcia
  • 79
  • 1
  • 10
0
votes
0 answers

Unable to convert UTF-16 encoded .CSV to UTF-8 in Shiny (R)

I have been having trouble converting a UTF-16LE encoded .CSV file into UTF-8. I know that I can manually re-save the file into the desired encoding, but I want this functionality to be built into the Shiny app as my user base is not that tech savvy…
0
votes
0 answers

Convert from UTF16 LE to ANSI in Python

I have several files encoded in UTF-16LE and I want to convert them to ANSI. I found some suggestions on Stack Overflow (Convert from ANSI to UTF-8) but they don't work. That is, I can convert the files, but there are spaces between the words and…
Alex
  • 11
  • 3
0
votes
0 answers

Change UTF 16 encoding to UTF 8 encoding for files in AWS S3

My main goal is to have AWS Glue move files stored in S3 to a database in RDS. My current issue is that the format in which I get these files has a UTF 16 LE encoding and AWS Glue will only process text files with UTF 8 encoding. See…
iclim
  • 1
  • 3
0
votes
0 answers

PHP iconv spits out gibberish when used on UTF-16LE

I'm trying to decode a UTF-16LE file to UTF-8. The problem is that I keep getting back kanji, and I don't know what might be the cause. The code in question looks as follows: echo("before: ".$line); $line = iconv('UTF-16LE', 'UTF-8', $line); // $line =…
0
votes
0 answers

Can UTF-16LE be converted into a MySQL LOAD DATA INFILE type format without garbling Chinese and other languages? If so how?

I'm working on a big dataset encoded in UTF-16LE that holds 1 billion records containing text strings in over 50 languages (not all known to me). I need to get these into our database, MySQL 5.7, using LOAD DATA INFILE (for import speed) but I just…
0
votes
1 answer

Unable to use encoding UTF-16LE on Android Version 9

I have an application which creates a CSV file. The file is then imported by an Excel macro. The macro needs the file to be encoded as UTF-16LE. The problem is that I am not able to use this encoding on some devices. Until now, I used the…
Olli
  • 658
  • 5
  • 26
0
votes
0 answers

css not linking to Html encoding issue

I have an HTML file with UTF-16 LE encoding, and the issue is that the CSS file is not linking to the HTML. If I copy all the content of the HTML file to a new HTML file, it works fine. After several attempts to understand the issue, I came to know the file is encoded…
amir6565
  • 13
  • 4