When decoding a Gmail message with the official Google APIs client, I get unexpected results from the body data returned by a messages.get request with format: 'full'.

The code used to retrieve the message in full format:

// Fetch the email data
const gmailMessageData = await gmailService.users.messages.get({
    userId: 'me', id: message.id, format: 'full'
});

// The emails being received don't contain any parts, so I don't have any additional logic that combines parts together.
// Parse the response
const bodyData = gmailMessageData.data.payload.body.data
    .replace(/-/g, '+')
    .replace(/_/g, '/');

const decodedData = Buffer.from(bodyData, 'base64').toString('utf-8');

The value of decodedData is:

<br />mail is sent as a courtesy from [redacted]. <br /> reply to: [redacted]. <br />sent on behalf of [redacted] <br /> <br />ation: [redacted] <br /> <br />ration Type: [redacted] <br />val Action: [redacted] <br /> <br /> <br />ID: [redacted] <br />

However, when I retrieve the same email with format: 'raw', using the code below:

const gmailMessageData = await gmailService.users.messages.get({
    userId: 'me', id: message.id, format: 'raw'
});

const rawData = gmailMessageData.data.raw;

const decodedData = Buffer.from(rawData, 'base64url').toString('utf-8');

the decoded data is as follows (showing only the body, and ignoring the quoted-printable formatting that still needs to be cleaned up):

This email is sent as a courtesy from [redacted].=0D<br />=0APlease rep= ly to: [redacted].=0D<br />=0AEmail sent on behalf = of [redacted]=0D<br />=0A=0D<br />=0AApplication: [redacted]= =0D<br />=0A=0D<br />=0ARegistration Type: [redacted] =0D<br />=0A= Retrieval Action: [redacted] =0D<br />=0A=0D<b= r />=0A=0D<br />=0AGuest ID:[redacted]=0D<br />=0A=0D<br />

The decoded raw body matches the content I expect, while the decoded full body is missing characters.
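The =0D, =0A and trailing = sequences in the raw output are quoted-printable encoding. A minimal sketch of the cleanup step mentioned above could look like the following. decodeQuotedPrintable is a hypothetical helper written for illustration, not part of the Gmail client, and it assumes the encoded input is plain ASCII and that the decoded bytes are UTF-8.

```javascript
// Hypothetical helper: decodes a quoted-printable string into UTF-8 text.
// Assumes the encoded input itself contains only ASCII characters.
function decodeQuotedPrintable(input) {
    // Remove soft line breaks: "=" immediately followed by a line ending
    const unfolded = input.replace(/=\r?\n/g, '');
    const bytes = [];
    for (let i = 0; i < unfolded.length; i++) {
        const hex = unfolded.slice(i + 1, i + 3);
        if (unfolded[i] === '=' && /^[0-9A-Fa-f]{2}$/.test(hex)) {
            // "=XX" encodes one byte as two hex digits
            bytes.push(parseInt(hex, 16));
            i += 2;
        } else {
            // Literal character, copied through as a single byte
            bytes.push(unfolded.charCodeAt(i) & 0xff);
        }
    }
    return Buffer.from(bytes).toString('utf-8');
}

// Usage with a made-up fragment shaped like the raw body above
const sample = 'sent as a courtesy.=0D<br />=0APlease rep=\r\nly to: someone';
console.log(JSON.stringify(decodeQuotedPrintable(sample)));
```

Printing through JSON.stringify keeps the decoded carriage returns and line feeds visible as \r and \n instead of letting the output device interpret them.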

I have tried several libraries as listed below, but still have been unable to resolve the issue of missing characters:

  • base64url
  • js-base64
  • urlsafe-base64

I have also tried passing 'base64url' directly to Buffer.from in the full snippet, which makes the replace() calls unnecessary, still to no avail.
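For reference, the three decoding variants described so far can be compared side by side. This is a sketch on a made-up string (the text value is an assumption chosen so that its encoding contains both - and _), and it needs a Node version that supports the 'base64url' encoding (available in the Node 16 and 17 releases listed below).

```javascript
// Made-up sample; its base64url form starts with "Pz8_fn5-",
// so both URL-safe characters are exercised.
const text = '???~~~ sample body <br />';
const urlSafe = Buffer.from(text, 'utf-8').toString('base64url');

// 1. Manual alphabet replacement, then standard base64
const viaReplace = Buffer.from(
    urlSafe.replace(/-/g, '+').replace(/_/g, '/'),
    'base64'
).toString('utf-8');

// 2. Standard base64 on the untouched string; Node's 'base64' decoder
//    also accepts the URL-safe alphabet when decoding
const viaBase64 = Buffer.from(urlSafe, 'base64').toString('utf-8');

// 3. Explicit base64url
const viaBase64Url = Buffer.from(urlSafe, 'base64url').toString('utf-8');

console.log(viaReplace === text, viaBase64 === text, viaBase64Url === text);
```

If all three agree on real data as well, the choice between them is unlikely to be what drops characters.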

I have tested this code on the following platforms, and all produce the same result, so I don't believe it is system related:

  • Windows 10 20H2 - Node 16
  • macOS 12.0.1 - Node 17
  • CentOS 8 - Node 17
  • Ubuntu 20.04 - Node 17
  • Node-17-alpine Docker container

It also doesn't seem to depend on how the variable is output: I have sent it to a web browser using express, printed it to the console, and written it to a file, and all three show the same result.

I have no idea what else to try at this point.
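One check that may be worth adding here: the raw output above shows =0D (a carriage return) before each <br />, and a terminal that honors a bare carriage return moves the cursor back to the start of the line, so whatever is printed next can overwrite the beginning of that line. Whether that is what happens in this case is an assumption, not something confirmed in this post. The sketch below uses a made-up stand-in for decodedData and prints it in an escaped form so that control characters cannot affect the display.

```javascript
// Stand-in for the decoded body (made-up text with \r before each <br />)
const decodedData =
    'This email is sent as a courtesy.\r<br />\nPlease reply to: someone.\r<br />\n';

// JSON.stringify escapes control characters, so \r and \n show up literally
const visible = JSON.stringify(decodedData);

// Count carriage returns to see whether any survived decoding
const carriageReturns = (decodedData.match(/\r/g) || []).length;

console.log(visible);
console.log(`length=${decodedData.length}, carriage returns=${carriageReturns}`);
```

If the escaped form contains the full text, the decoding itself is intact and only the display is affected.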

Edit (follow-up): After removing the replace() calls from the full snippet as suggested and decoding directly as base64, and also as base64url, I still see the same issue on every system listed above.

1 Answer


Your replace() calls are causing the problem. You only need to base64 decode.

kevintechie
  • After removing the replace() calls, trying both base64 and base64url decoding, and running it through all of the tests above, the problem still isn't resolved :( – MischiefCoding Jan 04 '22 at 20:37
  • I haven't been able to repro what you're seeing. I don't have the original message to test with, so that could be the problem. Have you tried just sending a plain text message (without the HTML)? The only other difference I can tell is that I'm using node 14. – kevintechie Jan 05 '22 at 09:02
  • Sadly, the email sending is out of my control; they only send the message in HTML format. After further testing yesterday, though, I was able to decode it and get the expected output using an online converter, so I am wondering whether it has something to do with the Node version. – MischiefCoding Jan 05 '22 at 13:04
  • I used an online converter to create a sample payload from the application (with sensitive data changed), and Node decoded that sample just fine. It seems like there might be an issue with the output from Gmail. – MischiefCoding Jan 05 '22 at 13:13