49

I am using the TMail library. For each attachment in an email, attachment.content_type sometimes returns not just the content type but also the file name. Examples:

image/jpeg; name=example3.jpg

image/jpeg; name=example.jpg

image/jpeg; name=photo.JPG

image/png

I have an array of valid content types like this:

VALID_CONTENT_TYPES = ['image/jpeg']

I would like to check whether the content type string contains any of the elements of the valid content types array.

What would be the best way of doing so in Ruby?

Hommer Smith

5 Answers

123

There are multiple ways to accomplish that. You could check each string until a match is found using Enumerable#any?:

str = "alo eh tu"
['alo','hola','test'].any? { |word| str.include?(word) }

Though it might be faster to convert the array of strings into a Regexp:

words = ['alo','hola','test']
r = /#{words.join("|")}/ # assuming there are no special chars
r === "alo eh tu"
Phrogz
cydparser
  • 1
    To be safe, you should escape the words in the regex (in case there are any regex special characters present): `r = /#{words.map{|w|Regexp.escape(w)}.join('|')}/` – Phrogz Apr 18 '12 at 18:47
  • @steenslag Thanks! I had never seen that method (present since at least 1.8.6!). – Phrogz Apr 18 '12 at 18:50
  • @steenslag So it's not necessary to do the join? I can just do union and it does the escaping? Awesome... – Hommer Smith Apr 18 '12 at 18:54
  • 19
    I tried both and tried benchmarking it 1_000_000x: `.any? # => ( 0.877526)` `r = Regexp.union(*words); r === string # => ( 17.374344)` Just for reference. – index Apr 25 '14 at 04:50
  • Your Regexp was exactly what I needed. Thank you! – Buildzzz Nov 15 '17 at 00:09
  • 4
    A few years late, but @index's benchmark still works and still holds. Machines just process it faster now: `.any? # => ( 0.160000 ); union => ( 6.410000 )` – Shrinath Oct 05 '18 at 03:04
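
The Regexp.union approach mentioned in the comments above could be applied to the question's data roughly as follows. This is only a sketch: the helper name valid_content_type? and the \A anchor (which requires the valid type to appear at the start of the string) are assumptions, not part of the original answer.

```ruby
# Sample list taken from the question.
VALID_CONTENT_TYPES = ['image/jpeg']

# Regexp.union escapes each string and joins the alternatives with "|".
# Assumption: the valid type must appear at the start of the string (\A).
VALID_TYPE_PATTERN = /\A#{Regexp.union(VALID_CONTENT_TYPES)}/

# Hypothetical helper name, used for illustration only.
def valid_content_type?(content_type)
  VALID_TYPE_PATTERN.match?(content_type)
end

puts valid_content_type?('image/jpeg; name=example3.jpg') # true
puts valid_content_type?('image/png')                     # false
```

Given the benchmarks quoted in the comments, any? may still be the faster option for a short list; the regex mainly saves you from escaping by hand.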
3

If image/jpeg; name=example3.jpg is a String:

("image/jpeg; name=example3.jpg".split(/;\s*/) & VALID_CONTENT_TYPES).any?

i.e. split attachment.content_type into its parts (the type itself plus any parameters) and intersect that array with VALID_CONTENT_TYPES. If the intersection (the elements common to both arrays) is non-empty, the type is valid.

That's at least one of many ways.
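
A slightly more defensive variant of the same idea is sketched below. It keeps only the part before the first ";" and normalises whitespace and case, since media type names are case-insensitive. The helper names bare_media_type and allowed_media_type? are hypothetical, introduced here for illustration.

```ruby
# Sample list taken from the question.
VALID_CONTENT_TYPES = ['image/jpeg']

# Hypothetical helper: returns the bare media type (e.g. "image/jpeg")
# by dropping everything from the first ";" onward.
def bare_media_type(content_type)
  content_type.split(';', 2).first.to_s.strip.downcase
end

# Hypothetical helper: true if the bare media type is in the allowed list.
def allowed_media_type?(content_type)
  VALID_CONTENT_TYPES.include?(bare_media_type(content_type))
end

puts allowed_media_type?('image/jpeg; name=photo.JPG') # true
puts allowed_media_type?('image/png')                  # false
```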

Simon Bagreev
3

If we just want to know whether a match exists:

VALID_CONTENT_TYPES.any? { |type| attachment.content_type.start_with?(type) }

If we want the matches themselves, this returns the valid types that the content type starts with:

VALID_CONTENT_TYPES.select { |type| attachment.content_type.start_with? type }
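
As a side note, String#start_with? accepts several prefixes at once, so the existence check can also be written without a block. The sketch below uses a Struct as a stand-in for the TMail attachment object; that stand-in is an assumption made only so the snippet runs on its own.

```ruby
# Sample list taken from the question.
VALID_CONTENT_TYPES = ['image/jpeg']

# Stand-in for a TMail attachment (assumption): only content_type is modelled.
Attachment = Struct.new(:content_type)

attachment = Attachment.new('image/jpeg; name=photo.JPG')

# start_with? returns true if the string begins with any of the given prefixes.
puts attachment.content_type.start_with?(*VALID_CONTENT_TYPES) # true
```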
angusiguess
2
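# true if the content type, with any trailing parameters stripped, is in the list
# (sub always returns a string; gsub! returns nil when nothing is replaced, and "\1" in double quotes is not a backreference)
VALID_CONTENT_TYPES.include? attachment.content_type.sub(/\A(image\/[a-z]+).*\z/m, '\1')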
noob
0

I think we can divide this question into two parts:

  1. How to clean undesired data
  2. How to check if cleaned data is valid

The first is well answered above. For the second, I would do the following:

(cleaned_content_types - VALID_CONTENT_TYPES).empty?

The nice thing about this solution is that you can easily store the undesired types in a variable and list them later, as in this example:

VALID_CONTENT_TYPES = ['image/jpeg']
cleaned_content_types = ['image/png', 'image/jpeg', 'image/gif', 'image/jpeg']

undesired_types = cleaned_content_types - VALID_CONTENT_TYPES
if undesired_types.size > 0
  error_message = "The types #{undesired_types.join(', ')} are not allowed"
else
  # The happy path here
end
bonafernando