
I am spinning up a Rails UI that talks to a Grape API. This is the second instance of this program. The first instance works well. The second instance's Grape API, however, appears to be corrupting data before sending it over the wire.

I need the image to go from file > json > http > db. Right now I am doing that by sending the file like so: file > string > encode to url-safe base64 > to_json > http > decode > save to sqlite3 db with ActiveRecord. Based on the output below, I'm led to believe the image data is corrupted by my converting it to base64. However, since the Grape API is all JSON, the characters must be encoded before sending (since, at least as far as Ruby's JSON library is concerned, invalid UTF-8 == invalid JSON).

So I either have to know:

  1. How to allow Grape API to send non-JSON (raw file string) or
  2. How to decode the string and avoid the error message

Opening a file and converting its contents to url-safe Base64.

file_as_string = File.open("#{folder}/#{file_name}", "rb:UTF-8") do |image|
  image.read
end
Base64.urlsafe_encode64(file_as_string)
 => "iVBORw0K ... # truncated for length

Things go weird right away. IRB does the expected - encodes as UTF-8.

file_as_string.encoding.name
 => "UTF-8"

BUT. The server logs ASCII-8BIT. I cannot explain this. Every file is topped with Ruby's magic UTF-8 comment. Linux $LANG is set to en_US.UTF-8.

OK, but when Base64 converts I lose the plot anyway. Even in IRB, starting with UTF-8, it down-converts. Why US-ASCII? Regardless, why is compatibility lost?

Base64.urlsafe_encode64(file_as_string).encoding.name
 => "US-ASCII"
Base64.urlsafe_decode64(Base64.urlsafe_encode64(file_as_string)).encoding.name
 => "ASCII-8BIT"
Base64.urlsafe_decode64(Base64.urlsafe_encode64(file_as_string)).encode("UTF-8")
Encoding::UndefinedConversionError: "\x89" from ASCII-8BIT to UTF-8
    from (irb):27:in `encode'
    from (irb):27
    from /home/me/.rvm/rubies/ruby-2.2.1/bin/irb:11:in `<main>'

Note that this IRB error is the same one I get a) when I don't base64 encode the string before Grape calls to_json and b) when I decode the string, assign it to a model attribute, and call .save on the Rails side.

The file itself is binary (if that matters?)

$ file -bi /path/to/file.png
image/png; charset=binary

Solutions I've tried, or am unwilling to try:

Sending over the raw image.read

This is a JSON API, so Grape converts to JSON before sending the data over the wire -- meaning any response must be valid JSON, as far as I understand it. If I try to send the raw string over, the automatically-called .to_json throws the same error.

Force-encoding the results

The output is not a readable png.

Downgrading

The original instance is Ruby 1.9.2 and CentOS 6.3. The new instance is Ruby 2.2.1 and CentOS 7. I'm generally committed to moving forward, so I'd rather develop some solution, even if not backward compatible, than roll back Ruby and my OS.

Not using UTF-8

Rails's config/application.rb has the line config.encoding = "utf-8", and config/environment.rb has the lines Encoding.default_external = Encoding::UTF_8; Encoding.default_internal = Encoding::UTF_8. I hope not to have to give up UTF-8 compatibility just for this one issue.


So is there a way to serve a file directly in Grape, bypassing the to_json call? Or is there a different encoding safe for JSON-ing and sending over http?

Sam

1 Answer

PNG files do not have character encoding. You should open the file without declaring the character encoding. You do not need to concern yourself with character sets even after base64 encoding.
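
As a rough sketch of what opening the file "without declaring the character encoding" can look like, the snippet below compares File.binread with the "rb:UTF-8" mode from the question. The temp file and the eight-byte PNG signature are stand-ins for a real image and are assumptions made only for illustration.

```ruby
require "tempfile"

# Assumption: the 8-byte PNG signature stands in for a real image file.
png_bytes = "\x89PNG\r\n\x1A\n".b

sample = Tempfile.new(["sample", ".png"])
sample.binmode
sample.write(png_bytes)
sample.close

# Binary read: no character encoding is declared, so the string is tagged as binary.
binary = File.binread(sample.path)

# The mode used in the question: identical bytes, but labelled as UTF-8.
labelled = File.open(sample.path, "rb:UTF-8") { |f| f.read }

puts binary.encoding                  # the binary encoding (ASCII-8BIT)
puts labelled.encoding                # UTF-8
puts labelled.valid_encoding?         # false: the label does not match the bytes
puts binary.bytes == labelled.bytes   # true: only the label differs
```

Either string base64-encodes to the same result, since Base64 works on bytes; the binary read simply avoids carrying a misleading UTF-8 label around.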

Once the file is base64 encoded, the result is a 7-bit ASCII string, hence encoding.name reports "US-ASCII". This is the string you should pass to your framework.
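
A minimal sketch of that hand-off, using Ruby's standard json library in place of Grape itself: the base64 text serializes to JSON without complaint and decodes back to the original bytes. The "image" key and the stand-in bytes are hypothetical, chosen only for the example.

```ruby
require "base64"
require "json"

# Assumption: the 8-byte PNG signature stands in for the full image contents.
raw = "\x89PNG\r\n\x1A\n".b

# Base64 output contains only ASCII characters, so it is safe inside JSON.
encoded = Base64.urlsafe_encode64(raw)
payload = { "image" => encoded }.to_json   # "image" is a hypothetical key name

# Receiving side: parse the JSON, then decode back to bytes.
received = JSON.parse(payload)
decoded = Base64.urlsafe_decode64(received["image"])

puts encoded.ascii_only?   # true
puts decoded == raw        # true: the bytes survive the round trip
```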

Do not call .encode() on the string before base64 encoding - this will surely corrupt the string.

To clarify:

  1. file_as_string is neither UTF-8 nor ASCII. It has no character encoding, as it's a binary file. file_as_string.encoding.name is irrelevant to you.
  2. Base64.urlsafe_encode64(file_as_string).encoding.name = "US-ASCII" is correct as you've effectively made a binary file into a text/character string by encoding it to base64. This does have character encoding - 7bit ASCII. This is what you should be passing to Grape to put on the wire.
  3. Base64.urlsafe_decode64(Base64.urlsafe_encode64(file_as_string)).encoding.name is irrelevant as the result is a binary string again. It has no character encoding. Trying to .encode() this will corrupt the data.
  4. Your IRB fails because you're asking Ruby to convert a binary string to UTF-8 text encoding. That's like taking a picture and asking to convert it to French.
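
The difference between the two calls can be sketched as follows. The \x89 in the error message is simply the first byte of every PNG file's signature. encode tries to translate bytes into characters and raises on that byte, while force_encoding only changes the label and leaves the bytes alone, which is why relabelling does not damage the image. The stand-in bytes are an assumption for illustration.

```ruby
require "base64"

# Assumption: the 8-byte PNG signature stands in for the full image contents.
raw = "\x89PNG\r\n\x1A\n".b
round_tripped = Base64.urlsafe_decode64(Base64.urlsafe_encode64(raw))

# encode attempts a real conversion and fails on the first non-ASCII byte.
begin
  round_tripped.encode("UTF-8")
  encode_failed = false
rescue Encoding::UndefinedConversionError => e
  encode_failed = true
  puts e.message
end

# force_encoding only relabels the string; no byte is touched.
relabelled = round_tripped.dup.force_encoding("UTF-8")

puts encode_failed                     # true
puts relabelled.bytes == raw.bytes     # true: same data, different label
puts relabelled.valid_encoding?        # false: it is still not real UTF-8 text
```

The relabelled string is still not valid UTF-8, so anything that later validates or transcodes it may raise again; keeping the data base64 encoded until the last moment sidesteps that.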
Alastair McCormack
  • You're right, but unfortunately, the end result I need is saving the string to SQLite 3 database column. (I've tried to clarify this.) I'm calling `.encode()` in my IRB reproduction because Rails is calling `.encode()` before it saves the string to database. IRB is just demonstrating the string cannot be encoded to UTF-8 and the resulting error. – Sam Apr 10 '15 at 07:19
  • @Sam let's be careful not to change the scope of the question to include Rails. Get the Grape part working before turning your attention to Rails. Your IRB does not represent what's going on behind the scenes as Grape should not be calling `.encode()` on the decoded string. Don't alter the .png on disk and don't set the encoding. Just pass the base64 string to Grape. Confirm that this is working with `curl` and manually base64 decode the result back to an image. Only then should you start playing with Rails. Once you have the base64 string in Rails, you should be able to save to the db – Alastair McCormack Apr 10 '15 at 07:49
  • Sorry -- not trying to change the scope. The IRB was just attempted diagnosis of the issue. My question was and still is about the encodings -- which I do not understand how a UTF-8 string can have invalid UTF-8 characters by simply being run through `.encode64` then through `.decode64`. I would like that answered. However, I would mainly like it answered in service of getting the API and Rails app to play nicely. – Sam Apr 10 '15 at 07:56
  • Hi @Sam, I've just updated my answer to clarify a few things. Perhaps you can paste the actual error you're getting from Grape, which might be more indicative of the actual error :) – Alastair McCormack Apr 10 '15 at 08:49
  • When Grape `.call`s my method and the string is NOT base64 encoded -> `Encoding::UndefinedConversionError: "\x89" from ASCII-8BIT to UTF-8`. When Rails `.save`s the value to model instance attribute -> `Encoding::UndefinedConversionError: "\x89" from ASCII-8BIT to UTF-8`. However, I am going to accept your answer because it got me to focus on the correct part. Ignoring the Grape side and `force_encoding("UTF-8")` the base64 AND the decoded string works. I still do not understand why ... Where does \x89 come from, and why doesn't killing it with fire ruin the image? Oh well – Sam Apr 10 '15 at 09:25