
I'm using the ApiAuth gem to authenticate API requests. Now I need to write a shell script that uses cURL to send test requests. So I need to generate an MD5 of the POST body and base64 encode it so that it matches what ApiAuth does on the server:

My shell script:

query="{\"document\":{\"recipient_id\":\"$ACCESS_ID\",\"data\":{\"id\":\"$ACCESS_ID\"}},\"vendor_string\":\"test\",\"patient\":{\"document\":{\"recipient_id\":\"$ACCESS_ID\",\"data\":{\"id\":\"$ACCESS_ID\"}}}}"

# need to figure how to get a base64 encoded md5 the same way Ruby does
content_md5=$(echo -n "$query" | openssl md5 -binary | base64)
content_type='application/json'
request_uri="$API_BASE/test"
httpdate=$(date -u +"%a, %_d %b %Y %H:%M:%S GMT")
accept_header='application/vnd.test+json; version=1'

canonical_string="$content_type,$content_md5,$request_uri,$httpdate"
signature=$(echo -n "$canonical_string" | openssl dgst -sha1 -hmac "$SECRET_KEY" -binary | base64)

curl -H "Authorization: APIAuth $ACCESS_ID:$signature" \
     -H "Content-MD5: $content_md5" \
     -H "Date: $httpdate" \
     -H "Accept: $accept_header" \
     -H "Content-type: $content_type" \
     -d "$query" \
     -v \
     "$request_uri"

The first thing that fails is comparing the Content-MD5 that I send with the content MD5 that ApiAuth calculates:

https://github.com/mgomes/api_auth/blob/master/lib/api_auth/base.rb#L37

def authentic?(request, secret_key)
  return false if secret_key.nil?

  return !md5_mismatch?(request) && signatures_match?(request, secret_key) && !request_too_old?(request)
end

Here the md5_mismatch?(request) method returns true, so authentic? returns false. It uses these methods to calculate the MD5:

https://github.com/mgomes/api_auth/blob/master/lib/api_auth/request_drivers/action_controller.rb

def calculated_md5
  if @request.env.has_key?('RAW_POST_DATA')
    body = @request.raw_post
  else
    body = ''
  end
  md5_base64digest(body)
end

https://github.com/mgomes/api_auth/blob/master/lib/api_auth/helpers.rb

def b64_encode(string)
  if Base64.respond_to?(:strict_encode64)
    Base64.strict_encode64(string)
  else
    # Fall back to stripping out newlines on Ruby 1.8.
    Base64.encode64(string).gsub(/\n/, '')
  end
end

def md5_base64digest(string)
  if Digest::MD5.respond_to?(:base64digest)
    Digest::MD5.base64digest(string)
  else
    b64_encode(Digest::MD5.digest(string))
  end
end

So I'm thinking it boils down to matching exactly what's going on with:

Digest::MD5.base64digest

My attempt was:

content_md5=$(echo -n "$query" | openssl md5 -binary | base64)

How can I make the bash script equivalent to the ruby method?

I've tried with and without the -binary flag.

I've checked that $query in bash is exactly the same as @request.raw_post in Ruby, and there are no trailing newlines since I'm using echo -n.

Update:

Output from bash:

echo $query
{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}},"vendor_string":"kipusystems","patient":{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}}}}

echo $content_md5
Lsb/vxJKHUxyRAqMhOMeOw==

Output from ruby:

puts body
{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}},"vendor_string":"kipusystems","patient":{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}}}}

puts md5_base64digest(body)
/DdffT+N+sZZjaTC5TJNcg==

I selected and copied the $query and body strings out of the terminals that ran the bash script and the Rails server respectively. They're exactly the same in that sense. How can I further narrow down this problem?

Update 2: Maybe some character encoding issue?

I pasted this literal text into the (mac bash) shell prompt:

echo -n "{\"document\":{\"recipient_id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\",\"data\":{\"id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\"}},\"vendor_string\":\"kipusystems\",\"patient\":{\"document\":{\"recipient_id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\",\"data\":{\"id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\"}}}}" | openssl dgst -md5 -binary | base64

And that outputs: /DdffT+N+sZZjaTC5TJNcg== which is good! That's what the Ruby side outputs. Ok cool.

But when I run my shell script with that exact literal command I just pasted above, it outputs: Lsb/vxJKHUxyRAqMhOMeOw== which is the same as the content-md5 I originally started with (script originally posted).

When I run echo $LANG I get en_US.UTF-8.

Update 3:

I run the shell script with:

sh script.sh

And that outputs Lsb/vxJKHUxyRAqMhOMeOw== when I echo out this command:

echo -n "{\"document\":{\"recipient_id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\",\"data\":{\"id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\"}},\"vendor_string\":\"kipusystems\",\"patient\":{\"document\":{\"recipient_id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\",\"data\":{\"id\":\"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o\"}}}}" | openssl dgst -md5 -binary | base64

Update 4:

Weird! I had been running the shell script posted above with sh script.sh, and that showed me an MD5 that was different from what I was seeing in Ruby. Now I chmod +x'ed the script and ran it directly (script.sh), and I get the correct MD5!!

But the signatures_match? method in ApiAuth still returns false :'(

DiegoSalazar
  • Just curious. Have you tested to make sure that the string you're encoding is 100% the same on both sides of your testing? I'd test the Ruby Digest::MD5.base64digest of a string against the same string being encoded using openssl/base64 and verify that you can get the same result that way first, to remove any possible disparity between tests where the inputs were potentially different. – Jim Sep 25 '14 at 01:55
  • btw, out of curiosity, and for my edification, is there a reason why you are explicitly using openssl to create an md5 hash of your string vs using md5sum? – Jim Sep 25 '14 at 02:08
  • Is there a difference? I don't have md5sum installed. – DiegoSalazar Sep 25 '14 at 02:08
  • And yes, as explained in the last line of the question, I've done a simple copy paste comparison of the string on both sides. I believe they're both UTF-8 but haven't checked the bash side. Thanks for taking interest, I'm stumped! – DiegoSalazar Sep 25 '14 at 02:10
  • md5sum is a pretty standard tool and is installed by default in many Linux distributions. Let me add something to this post to show what I mean. – Jim Sep 25 '14 at 02:16
  • I understand, but wouldn't the output from two different md5 algorithm implementations be the same given the same input? – DiegoSalazar Sep 25 '14 at 02:21
  • Well, that's the thing. Your inputs are not the same because of how you're deriving your md5 hash using openssl...I believe. – Jim Sep 25 '14 at 02:23
  • Cannot duplicate. `$ foo='{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}},"vendor_string":"kipusystems","patient":{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}}}}'` `$ echo -n $foo | openssl md5 -binary | base64` `/DdffT+N+sZZjaTC5TJNcg==` – Ignacio Vazquez-Abrams Sep 25 '14 at 03:27
  • If you don't quote your variables, Bash will perform whitespace splitting and wildcard substitution on them. `echo $query` is potentially very different from `echo "$query"`! – tripleee Sep 25 '14 at 03:36
  • Thanks @tripleee but I tried both quoting and not quoting and both produce the same output :/ – DiegoSalazar Sep 25 '14 at 03:37
  • Do you get the same `echo` inside the script? (If you use `#!/bin/sh` chances are you're not.) Try `printf '%s' "$query"` instead; it's more portable in any event. – tripleee Sep 25 '14 at 03:41
  • Interesting. I wonder what's going on there. I solved the md5 problem by chmod'ing the script and running it without `sh`. – DiegoSalazar Sep 25 '14 at 03:56

2 Answers

I'd probably attack this a different way. I suspect you're getting different results because the two sides are not hashing the same input. The one thing that jumps out at me is that you're emitting the MD5 digest from openssl as raw bytes and then base64 encoding that. I think it would be cleaner to use md5sum to get the hash of what you want to compare against:

$ echo -n 'this is a test' | md5sum -- | cut -d ' ' -f 1
54b0c58c7ce9f2a8b551351102ee0938

irb(main):028:0* Digest::MD5.hexdigest('this is a test')
=> "54b0c58c7ce9f2a8b551351102ee0938"

So, for your case, I'd change:

content_md5=$(echo -n "$query" | openssl md5 -binary | base64)

to

content_md5=$(echo -n "$query" | md5sum -- | cut -d ' ' -f 1)

...and see if that works better for you. It's an initial stab in the dark.

So with the base64 encoding:

$ echo -n 'this is a test' | md5sum -- | cut -d ' ' -f 1 | base64
NTRiMGM1OGM3Y2U5ZjJhOGI1NTEzNTExMDJlZTA5MzgK

irb(main):037:0> require 'base64'
=> true
irb(main):038:0> require 'digest'
=> true
irb(main):039:0> Base64.strict_encode64(Digest::MD5.hexdigest('this is a test'))
=> "NTRiMGM1OGM3Y2U5ZjJhOGI1NTEzNTExMDJlZTA5Mzg="

That's closer. The last byte differs and I'm not 100% sure why. I'm looking into that. Just wanted to get this in here because it may just be that by removing the last byte on both outputs you will have what you need.
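The one-character difference between the two outputs above can be reproduced in isolation: `cut` ends its output with a newline, and `base64` encodes that newline too, so the shell result ends in `K` where Ruby's ends in `=`. Separately, both commands encode the 32-character hex digest rather than the 16 raw digest bytes, which is why they are about 44 characters long while `Digest::MD5.base64digest` returns 24. A minimal sketch, assuming `openssl`, `od`, and `base64` are on the PATH; the function names `md5_hex` and `md5_b64` are made up for illustration:

```shell
#!/bin/sh
# Illustrative helpers only; the names are not part of any tool.

# Hex digest of the exact bytes in "$1" (counterpart of Digest::MD5.hexdigest).
md5_hex() {
  printf '%s' "$1" | openssl md5 -binary | od -An -tx1 | tr -d ' \n'
}

# Base64 of the 16 raw digest bytes (counterpart of Digest::MD5.base64digest).
md5_b64() {
  printf '%s' "$1" | openssl md5 -binary | base64
}

hex=$(md5_hex 'this is a test')

# Hex digest plus a trailing newline, as in the md5sum | cut pipeline.
printf '%s\n' "$hex" | base64

# Hex digest without the newline, as in Base64.strict_encode64(hexdigest).
printf '%s' "$hex" | base64

# Raw digest bytes: 24 characters, the form the Content-MD5 header needs.
md5_b64 'this is a test'
```

Only the last form is comparable with what ApiAuth computes, so the `-binary` switch in the original script was already the right choice.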

Update: my results now match the OP's results from Update 2

irb(main):050:0* Digest::MD5.base64digest('{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}},"vendor_string":"kipusystems","patient":{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}}}}')
=> "/DdffT+N+sZZjaTC5TJNcg=="
irb(main):051:0> 
[1]+  Stopped                 irb
$ echo -n '{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}},"vendor_string":"kipusystems","patient":{"document":{"recipient_id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o","data":{"id":"lwzvZixLvVLL50qasfoO2YvMz9UzNlxg8HBOEj8NV_o"}}}}' | openssl dgst -md5 -binary | base64
/DdffT+N+sZZjaTC5TJNcg==
Jim
  • In my case, ApiAuth isn't using `Digest::MD5.hexdigest`, it is using `Digest::MD5.digest`, which I see outputs bytes which is why I used the `-binary` switch on the bash side. – DiegoSalazar Sep 25 '14 at 02:24
  • Right. Before I go into your last comment, I need to call out that my suggestion is flawed in that I'm base64'ing the content_md5 string, which doesn't match up with Ruby's Digest::MD5.base64digest 1:1 so I'd have to look into that and fix it. – Jim Sep 25 '14 at 02:27
  • with respect to your comment, I am not 100% clear if the output from MD5.digest's output of the bytes is in actual byte form. When I checked it, it was not the true byte representation, but I'm not sure if Ruby is printing the friendly form of the string for readability or not. I've only been coding in Ruby for a couple of years. – Jim Sep 25 '14 at 02:29
  • On Mac, I'm using `md5`, which should be equivalent to Linux's `md5sum`. Sadly, I was using that initially until I switched to openssl's implementation of md5. You helped me think of a few things. Thanks for taking interest so far! :) – DiegoSalazar Sep 25 '14 at 02:32
  • Ah! Mac! I missed that tidbit of goodness. Let me see if I can muster some more data for ya. Does md5 not return the correct md5sum? – Jim Sep 25 '14 at 02:36
  • Perhaps http://stackoverflow.com/questions/8996820/how-to-create-md5-hash-in-bash-in-mac-os-x helps? I'm not sure how you were invoking md5, but this seems to be roughly the same usage as md5sum. – Jim Sep 25 '14 at 02:40
  • I posted an update to the question with some inspiration from your suggestions, thanks! – DiegoSalazar Sep 25 '14 at 02:52
  • Thanks for posting that link, I did have a look there prior to posting. – DiegoSalazar Sep 25 '14 at 02:52
  • Just keep in mind that ApiAuth uses `Digest::MD5.base64digest(string)` or `Base64.strict_encode64(Digest::MD5.digest(string))`. They don't use `hexdigest`, which outputs only hex chars, whereas the former output all ASCII. – DiegoSalazar Sep 25 '14 at 02:59
  • Hrm. So gotta find out why the base64digest method is returning a 24 byte string vs Linux's base64 command which returns a 45 byte string. – Jim Sep 25 '14 at 03:10
  • Just updated mine to show that my results are the same as yours. So I surmise your web charset encoding is not the same as your shell env (UTF-8). – Jim Sep 25 '14 at 03:22
  • Interesting. So maybe I'm writing my shell script incorrectly? – DiegoSalazar Sep 25 '14 at 03:23
  • I'm not sure but I'd first check to see what the encoding is set for on your site. You said that your $LANG is currently UTF-8. Is that what you're getting when running the script from CLI? What does the site set the charset at? – Jim Sep 25 '14 at 03:27
  • Yes my shell's LANG is UTF-8. The site is still running locally from my Rails project running Ruby 2.1.1, which I believe is UTF-8. I posted another detail about how I run the shell script. – DiegoSalazar Sep 25 '14 at 03:30
  • Posted another update! One step closer! Thanks so far! – DiegoSalazar Sep 25 '14 at 03:40

Turns out I just needed to run my test script directly and not with the sh command.

I'm not sure, but this may be related to what @tripleee alluded to: the top of the script declares #!/bin/bash, and running the script with sh ignores that line and uses whatever sh is instead.

Note: On Mac using iTerm 2 with bash.
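A likely explanation, which the thread does not confirm, is that `echo -n` is not portable. POSIX leaves its behavior unspecified, and some `sh` implementations (reportedly including bash invoked as `sh` on macOS) print the `-n` literally and keep the trailing newline, so the digest is computed over different bytes. The sketch below avoids `echo` altogether, as suggested in the comments on the question. `content_md5` is a hypothetical helper name and the payload is a shortened stand-in, not the real request body:

```shell
#!/bin/sh
# Sketch only: assumes openssl and base64 are installed.

# Hash exactly the bytes of "$1". printf '%s' never appends a newline and
# never treats the data as options or escape sequences, in any POSIX shell.
content_md5() {
  printf '%s' "$1" | openssl md5 -binary | base64
}

query='{"vendor_string":"test"}'

# Comparing byte counts exposes an echo that printed "-n" or kept the newline:
# if the two numbers differ, echo -n is not doing what the script expects.
printf '%s' "$query" | wc -c
echo -n "$query" | wc -c

content_md5 "$query"
```

Written this way, the script should produce the same digest whether it is started as `sh script.sh`, `bash script.sh`, or directly through its shebang line.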

DiegoSalazar
  • `sh` runs Bash in compatibility mode, or possibly a completely different program (on many Linuxes, `/bin/sh` is [Dash](http://linux.die.net/man/1/dash)). The Bash manual page has a long but probably not exhaustive [list of things which work differently](http://www.gnu.org/s/bash/manual/html_node/Bash-POSIX-Mode.html) with `sh`. – tripleee Sep 25 '14 at 12:53