
I'm building a web app that uses EvaporateJS to upload large files to Amazon S3 using Multipart Uploads. I noticed an issue where every time a new chunk was started the browser would freeze for ~2 seconds. I want the user to be able to continue to use my app while the upload is in progress, and this freezing makes that a bad experience.

I used Chrome's Timeline to look into what was causing this and found that it was SparkMD5's hashing. So I've moved the entire upload process into a Worker, which I thought would fix the issue.

Well the issue is now fixed in Edge and Firefox, but Chrome still has the exact same problem.

Here's a screenshot of my Timeline: [Timeline screenshot]

As you can see, during the freezes my main thread is doing basically nothing, with <8ms of JavaScript running during that time. All the work is occurring in my Worker thread, and even that is only running for ~600ms or so, not the 1386ms that my frame takes.

I'm really not sure what's causing the issue. Are there any gotchas with Workers that I should be aware of?

Here's the code for my Worker:

var window = self; // For Worker-unaware scripts

// Shim to make Evaporate work in a Worker
var document = {
    createElement: function() {
        var href = undefined;

        var elm = {
            set href(url) {
                var obj = new URL(url);
                elm.protocol = obj.protocol;
                elm.hostname = obj.hostname;
                elm.pathname = obj.pathname;
                elm.port = obj.port;
                elm.search = obj.search;
                elm.hash = obj.hash;
                elm.host = obj.host;
                href = url;
            },
            get href() {
                return href;
            },
            protocol: undefined,
            hostname: undefined,
            pathname: undefined,
            port: undefined,
            search: undefined,
            hash: undefined,
            host: undefined
        };

        return elm;
    }
};

importScripts("/lib/sha256/sha256.min.js");
importScripts("/lib/spark-md5/spark-md5.min.js");
importScripts("/lib/url-parse/url-parse.js");
importScripts("/lib/xmldom/xmldom.js");
importScripts("/lib/evaporate/evaporate.js");

// Evaporate expects a global DOMParser; provide one from the xmldom shim
DOMParser = self.xmldom.DOMParser;

var defaultConfig = {
    computeContentMd5: true,
    cryptoMd5Method: function (data) { return btoa(SparkMD5.ArrayBuffer.hash(data, true)); },
    cryptoHexEncodedHash256: sha256,
    awsSignatureVersion: "4",
    awsRegion: undefined,
    aws_url: "https://s3-ap-southeast-2.amazonaws.com",
    aws_key: undefined,
    customAuthMethod: function(signParams, signHeaders, stringToSign, timestamp, awsRequest) {
        return new Promise(function(resolve, reject) {
            var signingRequestId = currentSigningRequestId++;

            postMessage(["signingRequest", signingRequestId, signParams.videoId, timestamp, awsRequest.signer.canonicalRequest()]);
            queuedSigningRequests[signingRequestId] = function(signature) {
                queuedSigningRequests[signingRequestId] = undefined;
                if(signature) {
                    resolve(signature);
                } else {
                    reject();
                }
            }
        });
    },
    //logging: false,
    bucket: undefined,
    allowS3ExistenceOptimization: false,
    maxConcurrentParts: 5
}

var currentSigningRequestId = 0;
var queuedSigningRequests = [];

var evap; // Evaporate instance, assigned once the "init" message has been handled
var filekey;
onmessage = function(e) {
    var messageType = e.data[0];
    switch(messageType) {
        case "init":
            var globalConfig = {};
            for(var k in defaultConfig) {
                globalConfig[k] = defaultConfig[k];
            }
            for(var k in e.data[1]) {
                globalConfig[k] = e.data[1][k];
            }

            var uploadConfig = e.data[2];

            Evaporate.create(globalConfig).then(function(evaporate) {
                evap = evaporate;

                filekey = globalConfig.bucket + "/" + uploadConfig.name;

                uploadConfig.progress = function(p, stats) {
                    postMessage(["progress", p, stats]);
                };

                uploadConfig.complete = function(xhr, awsObjectKey, stats) {
                    // The XHR object can't be structured-cloned, so only pass
                    // the cloneable parts back to the main thread.
                    postMessage(["complete", awsObjectKey, stats]);
                };

                uploadConfig.info = function(msg) {
                    postMessage(["info", msg]);
                }

                uploadConfig.warn = function(msg) {
                    postMessage(["warn", msg]);
                }

                uploadConfig.error = function(msg) {
                    postMessage(["error", msg]);
                }

                evap.add(uploadConfig);
            });
            break;

        case "pause":
            evap.pause(filekey);
            break;

        case "resume":
            evap.resume(filekey);
            break;

        case "cancel":
            evap.cancel(filekey);
            break;

        case "signature":
            var signingRequestId = e.data[1];
            var signature = e.data[2];
            queuedSigningRequests[signingRequestId](signature);
            break;
    }
}

Note that it relies on the calling thread to provide the AWS public key, bucket name, region and object key, plus the input File object, all of which are supplied in the 'init' message. When it needs something signed, it sends a 'signingRequest' message to the parent thread, which is expected to provide the signature in a 'signature' message once it has been fetched from my API's signing endpoint.
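
For illustration, the main-thread side of this protocol is shaped roughly like the sketch below. This is not my real code; the worker filename, fetchSignature, updateProgressUI, onUploadComplete and the config variables are placeholders:

var uploadWorker = new Worker("/js/upload-worker.js"); // placeholder path

uploadWorker.onmessage = function(e) {
    switch (e.data[0]) {
        case "signingRequest":
            var signingRequestId = e.data[1];
            // fetchSignature() stands in for my async call to the API's signing
            // endpoint; it resolves with the signature string for this request.
            fetchSignature(e.data[2], e.data[3], e.data[4]).then(function(signature) {
                uploadWorker.postMessage(["signature", signingRequestId, signature]);
            }, function() {
                uploadWorker.postMessage(["signature", signingRequestId, null]);
            });
            break;

        case "progress":
            updateProgressUI(e.data[1], e.data[2]); // placeholder UI callback
            break;

        case "complete":
            onUploadComplete(e.data[1], e.data[2]); // placeholder completion handler
            break;
    }
};

// Kick off the upload once the user has chosen a file and my API has said
// where to upload it.
uploadWorker.postMessage([
    "init",
    { aws_key: awsPublicKey, bucket: bucketName, awsRegion: awsRegion }, // overrides for defaultConfig
    { name: awsObjectKey, file: fileInput.files[0] }                     // upload config passed to Evaporate's add()
]);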

Joshua Walsh
  • Does this help at all? https://github.com/TTLabs/EvaporateJS/issues/257 – user650881 Jan 18 '17 at 06:23
  • Not really. I'm aware of EvaporateJS's overhead and I was experiencing performance issues using it, which is the reason I started using Worker threads. My question is why the UI thread is still freezing, even when all the work is happening in a Worker. – Joshua Walsh Jan 19 '17 at 06:20
  • Out of curiosity, did you ever resolve this? – tony19 Jan 27 '17 at 02:08
  • Nope, still not resolved. – Joshua Walsh Jan 28 '17 at 05:05
  • @JoshuaWalsh Did you end up resolving this? I'm having a similar issue - haven't tried the web workers yet, as my research indicates that that does not seem to solve this. I wonder if you did manage to fix the UI issues one way or another? – Hemal May 02 '18 at 01:09
  • @Hemal I'm not experiencing the issue any more, but I haven't changed anything. I believe that it was a Chrome bug that's now fixed. If your use case is similar to mine, I believe moving to Web Workers is likely to resolve it. There are some extra details about my specific issue in this GitHub issue: https://github.com/TTLabs/EvaporateJS/issues/308 – Joshua Walsh May 02 '18 at 04:56
  • Cheers @JoshuaWalsh! – Hemal Jun 06 '18 at 05:11

1 Answer


I can't give a very good example or analyze what you are doing with only the Worker code, but I strongly suspect that the issue has to do with either the reading of the chunk on the main thread or some unexpected processing of the chunk on the main thread. Maybe post the main thread code that calls postMessage to the Worker?

If I were debugging it right now, I'd try moving your FileReader operations into the Worker. If you don't mind the Worker blocking while it loads a chunk, you could also use FileReaderSync.
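
For example, a minimal sketch of reading a chunk inside the Worker with FileReaderSync (file, chunkStart and chunkEnd are placeholders; SparkMD5 is the library already imported in your Worker):

// Inside the Worker: FileReaderSync is only available to Workers, so this
// blocks the Worker while it reads, never the UI thread.
function readChunk(file, start, end) {
    var reader = new FileReaderSync();
    return reader.readAsArrayBuffer(file.slice(start, end));
}

// e.g. read and MD5 a chunk entirely off the main thread
var buffer = readChunk(file, chunkStart, chunkEnd);
var md5b64 = btoa(SparkMD5.ArrayBuffer.hash(buffer, true));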

Post-comments update

Does generating the presigned URL require hashing the file content + metadata + a key? Hashing the file content is going to take O(n) in the size of the chunk, and if the hash is the first operation that reads from the Blob, the loading of the file content could be deferred until the hashing starts. Unless you are compelled to keep the signing in the main thread (you don't trust the worker with key material?), that would be another good thing to bring into the worker.

If moving the signing into the Worker is too much, you could have the Worker do something to force the Blob to be read and/or pass the ArrayBuffer (or Uint8Array, or what have you) of file content back to the main thread for signing; this would ensure that reading the chunk does not occur on the main thread.
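
A sketch of that second option (chunkBlob, chunkIndex, signChunk and worker are placeholders, not your existing names):

// In the Worker: force the chunk to actually be read here...
var buffer = new FileReaderSync().readAsArrayBuffer(chunkBlob);
// ...then hand the bytes to the main thread as a transferable (no copy).
postMessage(["chunkBytes", chunkIndex, buffer], [buffer]);

// On the main thread: the bytes arrive already read, so only the
// signing/hashing work itself happens here.
worker.onmessage = function(e) {
    if (e.data[0] === "chunkBytes") {
        signChunk(e.data[1], e.data[2]); // placeholder for your signing helper
    }
};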

ellisbben
  • I don't have access to the code right now (and won't have access to it again until after my bounty is over) so sadly I can't post the main thread code. I'll try my best to provide an overview of how it works though. First I want to clarify that my FileReader operations are actually happening within the Worker. You can't see the calls in the code I posted because EvaporateJS handles the file reading, but on the main thread I don't use any FileReader functions or even have EvaporateJS loaded. – Joshua Walsh Jan 21 '17 at 07:13
  • Here are the main thread operations: (1) The user selects their file using an input[type=file] element. The main thread grabs the file name and makes a request to my API to find out where to upload the file. The main thread then uses postMessage to send the 'init' message to the Worker. (2) The Worker begins the upload process. Whenever it needs an AWS presigned URL, it sends a postMessage called 'signingRequest' to the main thread. (3) The main thread asynchronously calls my API, with a signing request. Once it has the signature, it sends a postMessage called 'signature' to the Worker. – Joshua Walsh Jan 21 '17 at 07:17
  • (4) The worker sends a 'progress' message to the main thread when the progress is updated. (5) The main thread then calls an Angular digest to update my UI with the new progress. (6) The Worker sends a 'complete' message when the upload is complete. – Joshua Walsh Jan 21 '17 at 07:18
  • I've tried disabling the digest mentioned in step 5, as I know at times Angular can be a little bit bloated, but this didn't improve anything. My only suspicion now is that somehow the main thread is blocking between (2) and (3), but I'm not sure how this would be the case as my API call is asynchronous and the worker doesn't even block when waiting for the reply. Also the timing of the freezes doesn't seem to correspond with the timing of those API requests, it corresponds instead with the timing of the MD5 hashing. I might try a different MD5 library, but I don't have high hopes for that. – Joshua Walsh Jan 21 '17 at 07:20
  • Sorry, I didn't realise you'd updated this because edits don't cause notifications. The presigned URL requires a hash of that chunk of the file, yes, so I'm not able to defer the loading of the file content. Additionally, the signing has to be done server-side, and in my main thread I have helper methods for communicating with my server, including ones for handling authentication. I'd rather not require the Worker to also include this code. Are you suggesting that the data is being lazily read? I'll try forcing it, but the hashing already occurs in the Worker so I think it must already be read. – Joshua Walsh Jan 24 '17 at 09:08
  • Oh, if the hashing already happens in the worker then I don't really see how the content could get lazily read in the main thread. – ellisbben Jan 24 '17 at 20:57