I am using an AWS Lambda function to receive a multipart request with attachments and upload them to S3.

However, the Lambda function replaces some characters with the replacement character, so the attachments end up corrupted.

I tested with a PNG file. Sample content: \x89PNG\r\n\u001A\n\u0000\u0000\u0000

All other characters are received as they are, but \x89 (or, in general, any \x** byte) is replaced by the replacement character (U+FFFD).

I extract the attachment as a string, file_str, write it to a file, and then upload that file to S3.

File.open(file_path, 'w') do |f|
  f << file_str
end

Thanks in advance.

Marcelo Fonseca
Avinash142857
  • My guess is the problem is not in writing the file but in the encoding of `file_str`. Whatever is doing the string encoding is forcing it to be valid UTF-8. You need to look earlier in the process – Max Feb 27 '19 at 20:10
  • You are correct. I tried the same with Rails and got the error even before reaching the action in the controller. Error: "Encoding::UndefinedConversionError: "\x89" from ASCII-8BIT to UTF-8" – Avinash142857 Feb 28 '19 at 03:06
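
One way to check Max's suggestion is to compare the raw bytes with what they look like after something has forced them into valid UTF-8. The following is only a sketch: the sample bytes are the start of a PNG header, and `describe_bytes` is a hypothetical helper, not part of the original code.

```ruby
# Hypothetical helper: summarizes how Ruby sees a string.
def describe_bytes(str)
  { encoding: str.encoding, bytesize: str.bytesize, first_byte: str.getbyte(0) }
end

# The first bytes of a PNG file, held as a binary (ASCII-8BIT) string.
png_header = "\x89PNG\r\n\x1A\n".b

# Simulates an upstream step that insists on valid UTF-8:
# scrub replaces the invalid byte \x89 with U+FFFD.
damaged = png_header.dup.force_encoding(Encoding::UTF_8).scrub

p describe_bytes(png_header)
p describe_bytes(damaged)
```

If `file_str` looks like `damaged` at the point where it is written (UTF-8 encoding, a larger byte count, and U+FFFD where \x89 should be), the original bytes are already gone and have to be preserved earlier in the request handling.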

1 Answer

You need to open the file in binary mode to write binary data.

#                      ⇓ THIS
File.open(file_path, 'wb') do |f|
  f << file_str
end

Without binary mode, you are trying to store the content as UTF-8, and \x89 is not a valid UTF-8 byte sequence.
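
As a rough round-trip check, the sketch below writes sample bytes in binary mode and reads them back. The sample bytes and the temporary file name are assumptions for illustration, and this only helps if `file_str` still holds the original bytes when it reaches this point.

```ruby
require 'tmpdir'

# Sample bytes standing in for the uploaded attachment (assumption).
file_str = "\x89PNG\r\n\x1A\n\x00\x00\x00".b

# Dir.mktmpdir returns the value of its block, here the bytes read back.
round_trip = Dir.mktmpdir do |dir|
  file_path = File.join(dir, 'sample.png') # hypothetical file name

  # 'wb' opens the file for writing in binary mode.
  File.open(file_path, 'wb') { |f| f << file_str }

  # File.binread returns the raw bytes as a binary string.
  File.binread(file_path)
end

puts round_trip == file_str
```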

Aleksei Matiushkin