
I have a function in my code that takes a string representing the url of an image and creates a File object from that string, to be attached to a Tweet. This seems to work about 90% of the time, but occasionally fails.

require 'open-uri'
attachment_url = "https://s3.amazonaws.com/FirmPlay/photos/images/000/002/443/medium/applying_too_many_jobs_-_daniel.jpg?1448392757"
image = File.new(open(attachment_url))

If I run the above code it raises TypeError: no implicit conversion of StringIO into String. If I change open(attachment_url) to open(attachment_url).read I get ArgumentError: string contains null byte. I also tried stripping the null bytes out of the data like so, but that made no difference either.

image = File.new(open(attachment_url).read.gsub("\u0000", ''))

Now if I try the original code with a different image, such as the one below, it works fine. It returns a File object as expected:

attachment_url = "https://s3.amazonaws.com/FirmPlay/photos/images/000/002/157/medium/mike_4.jpg"

I thought maybe it had something to do with the params in the original url, so I stripped those out, but it made no difference. If I open the images in Chrome they appear to be fine.

I'm not sure what I'm missing here. How can I resolve this issue?

Thanks!

Update

Here is the working code I have in my app:

filename = self.attachment_url.split(/[\/]/)[-1].split('?')[0]
stream = open(self.attachment_url)
image = File.open(filename, 'w+b') do |file|
    stream.respond_to?(:read) ? IO.copy_stream(stream, file) : file.write(stream)
    open(file)
end

Jordan's answer works except that calling File.new returns an empty File object, whereas File.open returns a File object containing the image data from stream.
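
To illustrate that difference, here is a minimal sketch that uses a temporary directory and a made-up binary payload (dir, data and the file names are placeholders, not part of my app). As far as I can tell, File.new ignores a block entirely, while File.open runs the block, closes the file, and returns the block's value instead of a File:

```ruby
require 'tmpdir'

dir = Dir.mktmpdir
data = "\xFF\xD8\x00demo".b   # made-up binary payload, not a real JPEG

# File.new ignores the block (Ruby prints a warning), so nothing is written.
# The returned File is empty and opened write-only.
new_path = File.join(dir, 'new.jpg')
from_new = File.new(new_path, 'wb') { |file| file.write(data) }
from_new.close

# File.open runs the block, closes the file, and returns the block's value:
# here the byte count from File#write, not a File object.
open_path = File.join(dir, 'open.jpg')
from_open = File.open(open_path, 'wb') { |file| file.write(data) }
```

This would also explain the two symptoms from the comments: a File opened with 'wb' cannot be read from, and a block that ends with a write or IO.copy_stream call returns an integer, so the block has to end with an expression that yields a readable File.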

Daniel Bonnell
  • Not clear from your description: Is it the case, that it *always* fails for certain jpg files and *always* succeeds for certain other files, or is it that for a given URI, it sometimes fails and sometimes succeeds? – user1934428 Dec 08 '15 at 17:02
  • It always fails for certain files but never for others. For example, the first file link will always fail, while the second will always succeed. I suspect there may be a problem with the file itself, so I'm wondering how to detect and circumvent that. – Daniel Bonnell Dec 08 '15 at 17:27
  • If it's a binary file (like a JPEG) you don't want to strip null bytes. Those are part of the image data. – Jordan Running Dec 08 '15 at 17:42

2 Answers


The reason you're getting TypeError: no implicit conversion of StringIO into String is that open sometimes returns a Tempfile object and sometimes returns a StringIO object, which is unfortunate and confusing. Which one you get depends on the size of the file: small responses are kept in memory as a StringIO, while larger ones are written to a Tempfile. File.new can open a Tempfile because it has a path, but a StringIO has none. See this answer for more information: open-uri returning ASCII-8BIT from webpage encoded in iso-8859 (although I don't recommend using the ensure-encoding gem mentioned therein, since it hasn't been updated since 2010 and Ruby has had significant encoding-related changes since then).
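
Here is a minimal sketch of that difference. It uses hand-built stand-ins rather than a real download, so small_body, large_body and the 20,000-byte payload are assumptions for illustration only, not what open-uri produces for your URLs:

```ruby
require 'stringio'
require 'tempfile'

# Stand-ins for what open-uri hands back (assumption: no network access here).
small_body = StringIO.new("\xFF\xD8tiny".b)   # short response: in-memory buffer
large_body = Tempfile.new('open-uri-demo')    # long response: file on disk
large_body.binmode
large_body.write("\xFF\xD8".b + "x" * 20_000)
large_body.flush

# A Tempfile has a path, so File.new can open it.
from_tempfile = File.new(large_body)
tempfile_size = from_tempfile.size
from_tempfile.close

# A StringIO has no path, so File.new raises TypeError.
error_message =
  begin
    File.new(small_body)
    nil
  rescue TypeError => e
    e.message
  end

large_body.close!   # close and delete the temporary file
```

That would be consistent with the same code working for one image and failing for another: the failing image is probably the smaller of the two.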

The reason you're getting ArgumentError: string contains null byte is that you're trying to pass the image data as the first argument to File.new:

image = File.new(open(attachment_url).read)

The first argument of File.new should be a filename, and null bytes aren't allowed in filenames on most systems. Try this instead:

image_data = open(attachment_url)

filename = 'some-filename.jpg'

# Use File.open here, not File.new: File.new ignores the block, so nothing would be written.
File.open(filename, 'wb') do |file|
  if image_data.respond_to?(:read)
    IO.copy_stream(image_data, file)
  else
    file.write(image_data)
  end
end

The above opens the file (creating it if it doesn't exist; the b in 'wb' tells Ruby that you're going to write binary data), then writes the data from image_data to it, using IO.copy_stream if it's an IO-like object such as a StringIO or Tempfile and File#write otherwise, then closes the file again.
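
If you need a File object to pass along afterwards, one option is to wrap this in a small helper that writes the data and then reopens the file for reading. The following is only a sketch under my own assumptions: save_download is a hypothetical name, and the bytes and temporary path are made up so that it runs without a network connection.

```ruby
require 'stringio'
require 'tmpdir'

# Hypothetical helper: persist downloaded data and return a readable File.
def save_download(data, path)
  File.open(path, 'wb') do |file|
    if data.respond_to?(:read)
      IO.copy_stream(data, file)   # IO-like source: StringIO or Tempfile
    else
      file.write(data)             # plain String
    end
  end
  # The block form has already closed the write handle; reopen for binary reading.
  File.open(path, 'rb')
end

jpeg_like = "\xFF\xD8\x00\x10JFIF\x00".b   # made-up bytes, including nulls
path = File.join(Dir.mktmpdir, 'some-filename.jpg')

image = save_download(StringIO.new(jpeg_like), path)
image_size = image.size
image_bytes = image.read
image.close
```

Reopening with 'rb' means the caller gets a handle that is positioned at the start of the file and opened for reading, and the null bytes in the data are written through untouched.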

Jordan Running
  • Thanks for the solution and also for clarifying the difference between the two methods of writing to a file. I feel like I'm getting a better understanding of File/IO in Ruby now. – Daniel Bonnell Dec 08 '15 at 20:34
  • Well I thought it was working, but then once I plugged it into my code it broke again. Now I'm getting `IOError: not opened for reading` consistently with every image. – Daniel Bonnell Dec 08 '15 at 21:03
  • Also, when I call `image.size` it returns `0`. – Daniel Bonnell Dec 08 '15 at 21:08
  • It's hard to know where the error is coming from without the complete error message and line number. As for `image.size` returning `0` I have no idea, since I did not use the variable name `image` in my code. – Jordan Running Dec 08 '15 at 21:11
  • I just added my code to my post. As you can see, I'm saving the `File` object to `image`. When I call `tweet = company.twitter_client.update_with_media(JSON.parse(self.posts)['twitter'], image)` I get the error `IOError: not opened for reading` from `/Users/ACIDSTEALTH/.gem/ruby/2.0.0/gems/multipart-post-2.0.0/lib/composite_io.rb:102:in `read'`. – Daniel Bonnell Dec 08 '15 at 21:19
  • Nvm. I figured it out. Needed to use `File.open`, not `File.new`. I updated my post with the working code. – Daniel Bonnell Dec 08 '15 at 22:16
  • What's the purpose of `open(file)` at the end of your `File.open` block? – Jordan Running Dec 08 '15 at 22:40
  • If I don't include that, the value of `image` is a `Fixnum` of the file size. If I include `open(file)`, the value of `image` is a valid `File` object. Then if I pass `image` to my post tweet method, it works. – Daniel Bonnell Dec 08 '15 at 23:23

If you use Paperclip, they have a method to copy to disk.

def raw_image_data
  attachment.copy_to_local_file.read
end

Change attachment to whatever variable you used, of course.

Eddie