Questions tagged [encoding]

Encoding is a set of predefined rules for reversibly transforming a piece of information from one representation into a completely different representation. The reverse transformation is called decoding. This tag is rather generic, but it is mainly used for binary encoding schemes such as Base64 and hexadecimal.

There are a lot of different applications:

  • Character encoding, which is how the computer turns characters such as a, which humans can recognize, into bytes, which computers can process.
  • Video encoding, which is used to transform between videos and bytes.
  • URL encoding, which is used to transform between plain text and valid URIs. Also known as percent-encoding.
  • XML encoding, which is used to transform between plain text and valid XML.
  • Compression, which is used to compress/decompress bytes.
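As a rough illustration of the applications listed above, the following Python sketch round-trips one sample string through several encodings using only the standard library. The sample text and variable names are assumptions chosen for this example; they do not come from any of the questions below.

```python
import base64
import zlib
from urllib.parse import quote, unquote
from xml.sax.saxutils import escape, unescape

# Hypothetical sample text containing a non-ASCII character and
# characters that are reserved in URLs and XML.
text = "Café & <tea>"

# Character encoding: text <-> bytes.
utf8_bytes = text.encode("utf-8")
decoded_text = utf8_bytes.decode("utf-8")

# Binary-to-text encodings: bytes <-> ASCII-safe text.
b64 = base64.b64encode(utf8_bytes)
hex_text = utf8_bytes.hex()

# URL (percent) encoding: text <-> URI-safe text.
url_encoded = quote(text, safe="")
url_decoded = unquote(url_encoded)

# XML escaping: text <-> text that is safe inside XML character data.
xml_escaped = escape(text)
xml_unescaped = unescape(xml_escaped)

# Compression: bytes <-> (usually smaller) bytes.
compressed = zlib.compress(utf8_bytes)
decompressed = zlib.decompress(compressed)

print(url_encoded)   # Caf%C3%A9%20%26%20%3Ctea%3E
print(xml_escaped)   # Café &amp; &lt;tea&gt;
print(hex_text)      # 436166c3a92026203c7465613e
```

Each step is reversible: decoding the encoded value gives back the original text or bytes, which is what distinguishes encoding from lossy transformations such as hashing.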
24174 questions
7
votes
3 answers

Printing utf8 strings in Sublime Text's console with Windows

When running this code with python myscript.py from Windows console cmd.exe (i.e. outside of Sublime Text), it works: # coding: utf8 import json d = json.loads("""{"mykey": {"readme": "Café"}}""") print d['mykey']['readme'] Café When running it…
Basj
  • 41,386
  • 99
  • 383
  • 673
7
votes
2 answers

Reading UTF-8 with BOM in ruby 2.5.0

Is there a way to read files encoded in UTF-8 with a BOM (byte order mark) on Ruby v2.5.0? On Ruby 2.3.1 this used to work: csv = CSV.open(file_path, encoding: 'bom|utf-8') However, on 2.5.0 the following error occurs: ArgumentError: unknown…
romeu.hcf
  • 73
  • 1
  • 7
7
votes
2 answers

Wrong encoding of google cloud translate and Java

I'm trying to use Google Cloud Translate. I think the problem is that Google Cloud Translate uses UTF-8 and the JVM uses UTF-16, so I get some typos in the translations. For instance: public static void main(String... args) throws Exception { //…
7
votes
2 answers

How to check character encoding of a file in Linux

I have some text files that are encoded in different character encodings, such as ASCII, UTF-8, Big5, and GB2312. Now I want to know their exact character encodings so that I can view them in a text editor; otherwise, they show garbled characters. I…
user4785733
7
votes
0 answers

C#, .NET Core, Encoding.UTF8.GetBytes is not consistent

Update. I created a test project on GitHub, where you can see the tests are passing on Appveyor (Windows) and failing on Travis (both Linux and OSX). https://github.com/nopara73/UTF8Problems/ I have a .NET Core 2 xUnit test project, where I am…
nopara73
  • 502
  • 6
  • 24
7
votes
1 answer

How do I use requests.put() to upload a file using Python?

I am trying to use the requests library in Python to upload a file into Fedora commons repository on localhost. I'm fairly certain my main problem is not understanding open() / read() and what I need to do to send data with an http request. def…
awscott
  • 73
  • 1
  • 1
  • 3
7
votes
2 answers

What is the proper encoding to use with item Reader

I'm using Spring Batch to read CSV files. When I open these files with Notepad++, I see that the encoding used is ANSI. Now when reading a line from a file, I notice that accented characters are not shown correctly. For example let's take…
Feres.o
  • 283
  • 1
  • 4
  • 16
7
votes
1 answer

Read binary data with Node.js stream

I'm trying to read a binary file with fs createReadStream. Assuming that we know the "misunderstanding" of binary and latin1 as values for the encoding option, and that by default calling toString on the data chunk will use utf-8, I have tried to user…
loretoparisi
  • 15,724
  • 11
  • 102
  • 146
7
votes
2 answers

How to open a file with utf-8 non encoded characters?

I want to open a text file (.dat) in Python and I get the following error: 'utf-8' codec can't decode byte 0x92 in position 4484: invalid start byte but the file is encoded using UTF-8, so maybe there is some character that cannot be read. I am…
StudentOIST
  • 189
  • 2
  • 7
  • 21
7
votes
1 answer

R, Rstudio Console Encoding Windows

Is there a way to change the console encoding in RStudio on Windows? This is not about reading files or sourcing scripts in a specific encoding but about changing the console encoding (the encoding Sys.getlocale yields). This is usually not a big…
snaut
  • 2,261
  • 18
  • 37
7
votes
2 answers

StringIO generated csv file that includes BOM

I'm trying to generate a CSV that opens correctly in Excel, but using StringIO instead of a file. output = StringIO("\xef\xbb\xbf") # Tried just writing a BOM here, didn't work fieldnames = ['id', 'value'] writer = csv.DictWriter(output,…
user31415629
  • 925
  • 6
  • 25
7
votes
1 answer

Convert file from Little-endian UTF-16 Unicode English text, with CRLF line terminators to Ascii encoding

A big thanks to everyone who helped me in my previous scenarios. I'm sure that somebody has asked a similar question before. This is my question: my file is Little-endian UTF-16 Unicode English text, with CRLF line terminators…
mac_online
  • 350
  • 1
  • 5
  • 18
7
votes
2 answers

C# Web Api action method automatically decoding query parameter

I have a C# Web Api end point in a controller that has a parameter. This parameter accepts an encrypted string and this string will contain characters like "/", "&", "+" etc. So whenever I call my Api endpoint from javascript, I encode it using…
nak
  • 846
  • 2
  • 10
  • 26
7
votes
1 answer

Django - pdf response has wrong encoding - xhtml2pdf

I'm working on an invoice PDF generator on my Django website. I use xhtml2pdf. It seems to be working but the encoding is not correct. There are wrong signs/characters when I use diacritics. This is a view: def render_to_pdf(template_src,…
Milano
  • 18,048
  • 37
  • 153
  • 353
7
votes
1 answer

How can I encode 0000 to 11110 in 4B/5B encoding scheme

In the 4B/5B encoding scheme, the dataword 0000 is encoded to the codeword 11110; similarly, 0001 is encoded to 01001, etc. Here the result of an XOR operation between two codewords will be another valid codeword. For example, the XOR of 11110 and 01001 is another…
S. M. Fahad Ahmad
  • 379
  • 1
  • 2
  • 11