
We have a web app (built using AngularJS) that we're gradually adding PWA 'features' to (service worker, launchable, notifications, etc). One of the features our web app has is the ability to complete a web form while offline. At the moment, we store the data in IndexedDB when offline, and simply encourage the user to push that data to the server once they're online ("This form is saved to your device. Now you're back online, you should save it to the cloud..."). We will do this automatically at some point, but that's not necessary at the moment.

We are adding a feature to these web forms, whereby the user will be able to attach files (images, documents) to the form, perhaps at several points throughout the form.

My question is this - is there a way for a service worker to handle file uploads? To somehow - perhaps - store the path to the file to be uploaded while offline, and push that file up once the connection has been restored? Would this work on mobile devices, and do we have access to that 'path' on those devices? Any help, advice or references would be much appreciated.

Braiam
user7043436
    You may want to check this [documentation](https://developers.google.com/web/ilt/pwa/caching-files-with-service-worker#cache_falling_back_to_the_network) to know how you'll handle the majority of requests if you're making your app offline-first. Other patterns will be exceptions based on the incoming request. Also, this [documentation](https://developers.google.com/web/fundamentals/instant-and-offline/web-storage/offline-for-pwa) if offline storage interests you. – abielita Aug 17 '17 at 17:23

4 Answers


One way to handle file uploads/deletes and almost everything, is by keeping track of all the changes made during the offline requests. We can create a sync object with two arrays inside, one for pending files that will need to be uploaded and one for deleted files that will need to be deleted when we'll get back online.

tl;dr

Key phases


  1. Service Worker Installation


    • Along with static data, we make sure to fetch dynamic data, such as the main listing of our uploaded files (in the example, a GET to /uploads returns JSON data with the files).

      Service Worker Install

  2. Service Worker Fetch


    • When handling the service worker fetch event, if the fetch fails, we have to handle the requests for the files listing, the requests that upload a file to the server, and the requests that delete a file from the server. If the request matches none of these, we return a match from the default cache.

      • Listing GET
        We get the cached object of the listing (in our case /uploads) and the sync object. We concat the default listing files with the pending files, remove the deleted files, and return a new Response object with a JSON result, just as the server would have returned it.
      • Uploading PUT
        We get the cached listing files and the sync pending files from the cache. If the file isn't present, then we create a new cache entry for that file, and we use the mime type and the blob from the request to create a new Response object that will be saved to the default cache.
      • Deleting DELETE
        We check the cached uploads, and if the file is present we delete the entry from both the listing array and the cached files. If the file is pending, we just delete the entry from the pending array; otherwise, if it's not already in the deleted array, we add it. We update the listing, files, and sync object caches at the end.

      Service Worker Fetch

  3. Syncing


    • When the online event gets triggered, we try to synchronize with the server. We read the sync cache.

      • If there are pending files, then we get each file Response object from cache and we send a PUT fetch request back to the server.
      • If there are deleted files, then we send a DELETE fetch request for each file to the server.
      • Finally, we reset the sync cache object.

      Syncing to server
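The listing merge described in step 2 boils down to a small pure function. Here is a sketch of that logic on its own (outside the service worker, so it is easy to test):

```javascript
// Sketch of the listing merge described above: the offline listing is the
// cached server listing, plus the sync object's pending uploads, minus the
// files marked for deletion.
function mergeListing(files, sync) {
  return files
    .concat(sync.pending)
    .filter(f => !sync.deleted.includes(f));
}

// Example: one upload pending and one file deleted while offline
mergeListing(['a.jpg', 'b.jpg'], { pending: ['c.jpg'], deleted: ['b.jpg'] });
// → ['a.jpg', 'c.jpg']
```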

Code implementation


(Please read the inline comments)

Service Worker Install

const cacheName = 'pwasndbx';
const syncCacheName = 'pwasndbx-sync';
const pendingName = '__pending';
const syncName = '__sync';

const filesToCache = [
  '/',
  '/uploads',
  '/styles.css',
  '/main.js',
  '/utils.js',
  '/favicon.ico',
  '/manifest.json',
];

/* Start the service worker and cache all of the app's content */
self.addEventListener('install', function(e) {
  console.log('SW:install');

  e.waitUntil(Promise.all([
    caches.open(cacheName).then(async function(cache) {
      let cacheAdds = [];

      try {
        // Get all the files from the uploads listing
        const res = await fetch('/uploads');
        const { data = [] } = await res.json();
        const files = data.map(f => `/uploads/${f}`);

        // Cache all uploads files urls
        cacheAdds.push(cache.addAll(files));
      } catch(err) {
        console.warn('PWA:install:fetch(uploads):err', err);
      }

      // Also add our static files to the cache
      cacheAdds.push(cache.addAll(filesToCache));
      return Promise.all(cacheAdds);
    }),
    // Create the sync cache object
    caches.open(syncCacheName).then(cache => cache.put(syncName, jsonResponse({
      pending: [], // For storing the pending files that will later be synced
      deleted: []  // For storing the files that later will be deleted on sync
    }))),
  ])
  );
});

Service Worker Fetch

self.addEventListener('fetch', function(event) {
  // Clone request so we can consume data later
  const request = event.request.clone();
  const { method, url, headers } = event.request;

  event.respondWith(
    fetch(event.request).catch(async function(err) {
      const { headers, method, url } = event.request;

      // A custom header that we set to indicate the requests come from our syncing method
      // so we won't try to fetch anything from cache, we need syncing to be done on the server
      const xSyncing = headers.get('X-Syncing');

      if(xSyncing && xSyncing.length) {
        return caches.match(event.request);
      }

      switch(method) {
        case 'GET':
          // Handle listing data for /uploads and return JSON response
          break;
        case 'PUT':
          // Handle upload to cache and return success response
          break;
        case 'DELETE':
          // Handle delete from cache and return success response
          break;
      }

      // If we meet no specific criteria, then lookup to the cache
      return caches.match(event.request);
    })
  );
});

function jsonResponse(data, status = 200) {
  return new Response(data && JSON.stringify(data), {
    status,
    headers: {'Content-Type': 'application/json'}
  });
}

Service Worker Fetch Listing GET

if(url.match(/\/uploads\/?$/)) { // Failed to get the uploads listing
  // Get the uploads data from cache
  const uploadsRes = await caches.match(event.request);
  let { data: files = [] } = await uploadsRes.json();

  // Get the sync data from cache
  const syncRes = await caches.match(new Request(syncName), { cacheName: syncCacheName });
  const sync = await syncRes.json();

  // Return the files from uploads + pending files from sync - deleted files from sync
  const data = files.concat(sync.pending).filter(f => sync.deleted.indexOf(f) < 0);

  // Return a JSON response with the updated data
  return jsonResponse({
    success: true,
    data
  });
}

Service Worker Fetch Uploading PUT

// Get our custom headers
const filename = headers.get('X-Filename');
const mimetype = headers.get('X-Mimetype');

if(filename && mimetype) {
  // Get the uploads data from cache
  const uploadsRes = await caches.match('/uploads', { cacheName });
  let { data: files = [] } = await uploadsRes.json();

  // Get the sync data from cache
  const syncRes = await caches.match(new Request(syncName), { cacheName: syncCacheName });
  const sync = await syncRes.json();

  // If the file exists in the uploads or in the pendings, then return a 409 Conflict response
  if(files.indexOf(filename) >= 0 || sync.pending.indexOf(filename) >= 0) {
    return jsonResponse({ success: false }, 409);
  }

  caches.open(cacheName).then(async (cache) => {
    // Write the file to the cache using the request body we cloned at the beginning
    const data = await request.blob();
    cache.put(`/uploads/${filename}`, new Response(data, {
      headers: { 'Content-Type': mimetype }
    }));

    // Write the updated files data to the uploads cache
    cache.put('/uploads', jsonResponse({ success: true, data: files }));
  });

  // Add the file to the sync pending data and update the sync cache object
  sync.pending.push(filename);
  caches.open(syncCacheName).then(cache => cache.put(new Request(syncName), jsonResponse(sync)));

  // Return a success response with fromSw set to true so we know this response came from the service worker
  return jsonResponse({ success: true, fromSw: true });
}

Service Worker Fetch Deleting DELETE

// Get our custom headers
const filename = headers.get('X-Filename');

if(filename) {
  // Get the uploads data from cache
  const uploadsRes = await caches.match('/uploads', { cacheName });
  let { data: files = [] } = await uploadsRes.json();

  // Get the sync data from cache
  const syncRes = await caches.match(new Request(syncName), { cacheName: syncCacheName });
  const sync = await syncRes.json();

  // Check if the file is already pending or deleted
  const pendingIndex = sync.pending.indexOf(filename);
  const uploadsIndex = files.indexOf(filename);

  if(pendingIndex >= 0) {
    // If it's pending, then remove it from pending sync data
    sync.pending.splice(pendingIndex, 1);
  } else if(sync.deleted.indexOf(filename) < 0) {
    // If it's not in pending and not already in sync for deleting,
    // then add it for delete when we'll sync with the server
    sync.deleted.push(filename);
  }

  // Update the sync cache
  caches.open(syncCacheName).then(cache => cache.put(new Request(syncName), jsonResponse(sync)));

  // If the file is in the uploads data
  if(uploadsIndex >= 0) {
    // Updates the uploads data
    files.splice(uploadsIndex, 1);
    caches.open(cacheName).then(async (cache) => {
      // Remove the file from the cache
      cache.delete(`/uploads/${filename}`);
      // Update the uploads data cache
      cache.put('/uploads', jsonResponse({ success: true, data: files }));
    });
  }

  // Return a JSON success response
  return jsonResponse({ success: true });
}

Syncing

// Get the sync data from cache
const syncRes = await caches.match(new Request(syncName), { cacheName: syncCacheName });
const sync = await syncRes.json();

// If there are pending files, send them to the server
if(sync.pending && sync.pending.length) {
  sync.pending.forEach(async (file) => {
    const url = `/uploads/${file}`;
    const fileRes = await caches.match(url);
    const data = await fileRes.blob();

    fetch(url, {
      method: 'PUT',
      headers: {
        'X-Filename': file,
        'X-Syncing': 'syncing' // Tell the SW fetch handler that we are syncing, so it ignores this fetch
      },
      body: data
    }).catch(err => console.log('sync:pending:PUT:err', file, err));
  });
}

// If there are deleted files, send a DELETE request for each to the server
if(sync.deleted && sync.deleted.length) {
  sync.deleted.forEach(async (file) => {
    const url = `/uploads/${file}`;

    fetch(url, {
      method: 'DELETE',
      headers: {
        'X-Filename': file,
        'X-Syncing': 'syncing' // Tell the SW fetch handler that we are syncing, so it ignores this fetch
      }
    }).catch(err => console.log('sync:deleted:DELETE:err', file, err));
  });
}

// Update and reset the sync cache object
caches.open(syncCacheName).then(cache => cache.put(syncName, jsonResponse({
  pending: [],
  deleted: []
})));
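Note that the snippet above resets the sync object even if some of the requests failed. A variant (hypothetical, with the request function injected so the logic can be tested without a network) keeps the failed entries for the next sync attempt:

```javascript
// Keep only the entries whose requests failed, instead of resetting blindly.
// `sendRequest(method, file)` is an injected stand-in for the fetch calls above;
// the returned object is what would be written back to the sync cache.
async function syncAndKeepFailures(sync, sendRequest) {
  const pending = [];
  for (const file of sync.pending) {
    try { await sendRequest('PUT', file); }
    catch (err) { pending.push(file); } // Failed upload: retry next time
  }
  const deleted = [];
  for (const file of sync.deleted) {
    try { await sendRequest('DELETE', file); }
    catch (err) { deleted.push(file); } // Failed delete: retry next time
  }
  return { pending, deleted };
}
```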

Example PWA


I have created a PWA example that implements all of this, which you can find and test here. I have tested it with Chrome and Firefox on desktop, and with Firefox for Android on a mobile device.

You can find the full source code of the application (including an express server) in this Github repository: https://github.com/clytras/pwa-sandbox.

Christos Lytras
  • Thank you for your solution! Your efforts are greatly appreciated. I'm going to actually award the bounty to both your answer and I will start another for the other answer, as I think they both provide some good starting points. One question for you... have you used the Background Sync API? It would be easy to add to your example, but I'm curious of its limitations. In my case, I'll be uploading reasonably large files (~20MB to ~100MB), and I'm wondering about time limits and such. – Brad Mar 12 '20 at 19:33
  • @Brad I've checked `sync` API and it doesn't have any kind of support for passing custom data with sync registrations. There is [`fetch-sync`](https://github.com/sdgluck/fetch-sync) and you can also check this Q/A [Pass custom data to service worker sync?](https://stackoverflow.com/questions/41798009/pass-custom-data-to-service-worker-sync/). The same logic can be used but with a different approach perhaps directly put files to cache and then register a sync event for each request. – Christos Lytras Mar 13 '20 at 00:56
  • For the big file sizes, there is no guarantee that the files will get uploaded, thus the logic at the `doSync` can change and only remove a file from the `pending` array if the `fetch` promise resolves and not if it's rejected; same for `deleted` array. Of course, `doSync` will have to get called at some initial point to check if there are pending files from some failed sync requests which can be done even with a `sync` registration at the beginning. – Christos Lytras Mar 13 '20 at 00:56

When the user selects a file via an `<input type="file">` element, we are able to get the selected file(s) via `fileInput.files`. This gives us a `FileList` object, each item in it being a `File` object representing the selected file(s). `FileList` and `File` are supported by HTML5's structured clone algorithm.

When adding items to an IndexedDB store, it creates a structured clone of the value being stored. Since `FileList` and `File` objects are supported by the structured clone algorithm, this means that we can store these objects in IndexedDB directly.

To perform those file uploads once the user goes online again, you can use the Background Sync feature of service workers. Here's an introductory article on how to do that. There are a lot of other resources for that as well.

In order to include file attachments in your request once your background sync code runs, you can use `FormData`. `FormData` allows adding `File` objects to the request that will be sent to your backend, and it is available from within the service worker context.

Arnelle Balane
    How many photos (or max size) is it possible to upload and store offline with it ? – John Jun 07 '18 at 16:06
    But the behavior I desire is that while the file is waiting to be uploaded, a regular fetch to the URL would get the file from the service worker cache, so that my client code using the maybe-uploaded-maybe-not-uploaded file is indifferent to whether it actually has been uploaded or not. –  Aug 04 '18 at 17:10

The Cache API is designed to store a request (as the key) and a response (as the value) in order to cache content from the server for the web page. Here, we're talking about caching user input for future dispatch to the server. In other terms, we're not trying to implement a cache, but a message broker, and that's not currently something handled by the Service Worker spec (Source).

You can figure it out by trying this code:

HTML:

<button id="get">GET</button>
<button id="post">POST</button>
<button id="put">PUT</button>
<button id="patch">PATCH</button>

JavaScript:

if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/service-worker.js', { scope: '/' }).then(function (reg) {
    console.log('Registration succeeded. Scope is ' + reg.scope);
  }).catch(function (error) {
    console.log('Registration failed with ' + error);
  });
}

document.getElementById('get').addEventListener('click', async function () {
  console.log('Response: ', await fetch('50x.html'));
});

document.getElementById('post').addEventListener('click', async function () {
  console.log('Response: ', await fetch('50x.html', { method: 'POST' }));
});

document.getElementById('put').addEventListener('click', async function () {
  console.log('Response: ', await fetch('50x.html', { method: 'PUT' }));
});

document.getElementById('patch').addEventListener('click', async function () {
  console.log('Response: ', await fetch('50x.html', { method: 'PATCH' }));
});

Service Worker:

self.addEventListener('fetch', function (event) {
    event.respondWith(fetch(event.request).then(function (response) {
        // Clone before the body is consumed by cache.put
        var responseClone = response.clone();
        caches.open('v1').then(function (cache) {
            cache.put(event.request, responseClone);
        }).catch(e => console.error(e));
        return response;
    }));
});

Which throws:

TypeError: Request method 'POST' is unsupported

TypeError: Request method 'PUT' is unsupported

TypeError: Request method 'PATCH' is unsupported

Since the Cache API can't be used for this, and following the Google guidelines, IndexedDB is the best solution as a data store for ongoing requests. The implementation of a message broker is then the responsibility of the developer, and there is no single generic implementation that will cover all of the use cases. There are many parameters that will determine the solution:

  • Which criteria will trigger the use of the message broker instead of the network? window.navigator.onLine? A certain timeout? Other?
  • Which criteria should be used to start trying to forward ongoing requests on the network? self.addEventListener('online', ...)? navigator.connection?
  • Should requests respect the order or should they be forwarded in parallel? In other terms, should they be considered as dependent on each other, or not?
  • If run in parallel, should they be batched to prevent a bottleneck on the network?
  • In case the network is considered available, but the requests still fail for some reason, which retry logic to implement? Exponential backoff? Other?
  • How to notify the user that their actions are in a pending state while they are?
  • ...

This is really too broad for a single Stack Overflow answer.
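To illustrate just one of those bullets, a capped exponential backoff schedule for retries might look like this (the base delay and cap are arbitrary choices, not prescribed by any spec):

```javascript
// Capped exponential backoff: 1s, 2s, 4s, 8s, ... up to a 60s ceiling.
// `attempt` is the zero-based retry count.
function backoffDelay(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

backoffDelay(0); // → 1000
backoffDelay(3); // → 8000
```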

That being said, here is a minimal working solution:

HTML:

<input id="file" type="file">
<button id="sync">SYNC</button>
<button id="get">GET</button>

JavaScript:

if ('serviceWorker' in navigator) {
  navigator.serviceWorker.register('/service-worker.js', { scope: '/' }).then(function (reg) {
    console.log('Registration succeeded. Scope is ' + reg.scope);
  }).catch(function (error) {
    console.log('Registration failed with ' + error);
  });
}

document.getElementById('get').addEventListener('click', function () {
  fetch('api');
});

document.getElementById('file').addEventListener('change', function () {
  fetch('api', { method: 'PUT', body: document.getElementById('file').files[0] });
});

document.getElementById('sync').addEventListener('click', function () {
  navigator.serviceWorker.controller.postMessage('sync');
});

Service Worker:

self.importScripts('https://unpkg.com/idb@5.0.1/build/iife/index-min.js');

const { openDB, deleteDB, wrap, unwrap } = idb;

const dbPromise = openDB('put-store', 1, {
    upgrade(db) {
        db.createObjectStore('put');
    },
});

const idbKeyval = {
    async get(key) {
        return (await dbPromise).get('put', key);
    },
    async set(key, val) {
        return (await dbPromise).put('put', val, key);
    },
    async delete(key) {
        return (await dbPromise).delete('put', key);
    },
    async clear() {
        return (await dbPromise).clear('put');
    },
    async keys() {
        return (await dbPromise).getAllKeys('put');
    },
};

self.addEventListener('fetch', function (event) {
    if (event.request.method === 'PUT') {
        let body;
        event.respondWith(event.request.blob().then(file => {
            // Retrieve the body then clone the request, to avoid "body already used" errors
            body = file;
            return fetch(new Request(event.request.url, { method: event.request.method, body }));
        }).then(response => handleResult(response, event, body)).catch(() => handleResult(null, event, body)));

    } else if (event.request.method === 'GET') {
        event.respondWith(fetch(event.request).then(response => {
            return response.ok ? response : caches.match(event.request);
        }).catch(() => caches.match(event.request)));
    }
});

async function handleResult(response, event, body) {
    const getRequest = new Request(event.request.url, { method: 'GET' });
    const cache = await caches.open('v1');
    await idbKeyval.set(event.request.method + '.' + event.request.url, { url: event.request.url, method: event.request.method, body });
    const returnResponse = response && response.ok ? response : new Response(body);
    cache.put(getRequest, returnResponse.clone());
    return returnResponse;
}

// Function to call when the network is supposed to be available

async function sync() {
    const keys = await idbKeyval.keys();
    for (const key of keys) {
        try {
            const { url, method, body } = await idbKeyval.get(key);
            const response = await fetch(url, { method, body });
            if (response && response.ok)
                await idbKeyval.delete(key);
        }
        catch (e) {
            console.warn(`An error occurred while trying to sync the request: ${key}`, e);
        }
    }
}

self.addEventListener('message', sync);

Some words about the solution: it caches the PUT request for future GET requests, and it also stores the PUT request in an IndexedDB database for future sync. For the key, I was inspired by Angular's TransferHttpCacheInterceptor, which serializes backend requests on the server-side rendered page for reuse by the browser-rendered page. It uses `<verb>.<url>` as the key, which assumes a request will override another request with the same verb and URL.

This solution also assumes that the backend does not return 204 No Content in response to a PUT request, but 200 with the entity in the body.
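The override behavior of the `<verb>.<url>` key can be sketched with a plain Map standing in for the IndexedDB store:

```javascript
// A later request with the same verb and URL replaces the earlier pending one,
// so only the most recent body for that (method, url) pair gets synced.
function storePending(store, method, url, body) {
  store.set(`${method}.${url}`, { url, method, body });
  return store;
}

const store = new Map();
storePending(store, 'PUT', '/api', 'first version');
storePending(store, 'PUT', '/api', 'second version');
// store.size === 1, and the stored body is 'second version'
```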

Guerric P
    Thank you, very much, for your experimentation and explanation. I still wonder about the Cache API though. Can't we do something like `cache.put(event.request.url, new Response(event.request.body))`? In other words, take the request body (the file), and cache it for future GET requests by making a new response like we'd like to see for future GET requests. As for your other points, you're right in that they are important, but I don't think they're necessary to address for the purpose of this question. Example code can be left as simple as possible to work. – Brad Mar 08 '20 at 15:46
  • Thanks for your feedback, I will provide a minimal solution soon – Guerric P Mar 08 '20 at 15:53
  • @GuerricP `Cache` API can be used with the `fetch` and `PUT`, `DELETE` and `PATCH` requests. You're making assumptions based on errors that come from server responses. You're trying to request a single page (`50x.html`) which accepts only `GET` and `HEAD` requests. If the server handles `PUT`, `PATCH` and `DELETE` and it has `CORS`, then `fetch` can handle these requests and `Cache` API can be used to `Cache` any request, even custom ones. Check this [repo](https://github.com/clytras/pwa-sandbox) that I include in my answer to see an `expressjs` server which can handle all these requests. – Christos Lytras Mar 11 '20 at 18:23
  • @ChristosLytras actually, these errors are not related to my backend, but to the cache implementation: https://chromium.googlesource.com/chromium/src/+/781c15c373baee26d5447ff9157370411f61b18f/third_party/blink/renderer/modules/cache_storage/cache.cc#712 in your code you are not caching `PUT` and `DELETE` requests, you are caching `GET` requests. – Guerric P Mar 11 '20 at 19:04
  • @GuerricP Thanks again for your answer. I'm awarding your answer a bounty as well as the other answer, but Stack Overflow says I have to wait a few days, so I'll come back to it. :-) I think your answer provides a good starting point, and also points out a lot of the other considerations that are important. – Brad Mar 12 '20 at 19:35
  • @Brad that's very nice of you. I'm glad it helped it was a very interesting subject! – Guerric P Mar 12 '20 at 19:58
  • @GuerricP I know exactly what I am doing in my code, I wrote every line of it. Fetch API totally supports all the requests. With `Cache` API of course, you want to emulate the `fetch` request using a [`Response`](https://developer.mozilla.org/en-US/docs/Web/API/Response) object and save these requests. In my example I'm implementing all the logic using just `Cache` API. – Christos Lytras Mar 12 '20 at 20:12
  • @christoslytras i can't be more factual. Your opinion is your own – Guerric P Mar 12 '20 at 20:44
  • @GuerricP we're disqusing here. The fact that `fetch` API supports all methods is not my opinion, it's how `fetch` works. My point is that you don't need `IDBDatabase` at all to handle `GET`, `DELETE` and `PATCH`, it can be done by just using `Cache` API. – Christos Lytras Mar 12 '20 at 22:40
  • Let's discuss but please verify your statements before posting. When passing requests as first parameter of `cache.put` they are set as `GET` requests because the purpose is to use `caches.match` after that, and serve the corresponding response. So yes your code works because you create one cache per HTTP method, and you're interpreting what's in the caches with your own code, but that's using the cache API in a way it's not intended to be used – Guerric P Mar 13 '20 at 13:07

I also stumbled upon this lately. Here is what I am doing to store the upload in IndexedDB and return a response while offline.

const storeFileAndReturnResponse = async function (request, urlSearchParams) {
  let requestClone = request.clone();

  let formData = await requestClone.formData();

  let tableStore = "fileUploads";

  let fileList = [];
  let formDataToStore = [];
  // Use formData.entries() to iterate the collection - this assumes you used <input type="file">
  for (const pair of formData.entries()) {
    let fileObjectUploaded = pair[1];
    //content holds the arrayBuffer (blob) of the uploaded file
    formDataToStore.push({
      key: pair[0],
      value: fileObjectUploaded,
      content: await fileObjectUploaded.arrayBuffer(),
    });

    let fileName = fileObjectUploaded.name;
    fileList.push({
      fileName: fileName,
    });
  }

  // NOTE: `parentId` and `idbContext` (a promise resolving to an open IndexedDB
  // connection, e.g. from the `idb` library) are assumed to be defined elsewhere
  let payloadToStore = {
    parentId: parentId,
    fileList: fileList,
    formDataKeyValue: formDataToStore,
  };
  (await idbContext).put(tableStore, payloadToStore);

  return {
    UploadedFileList: fileList,
  };
};
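A hypothetical counterpart to the snippet above (not part of the original answer) would rebuild a `FormData` from the stored payload so it can be re-sent once the connection returns:

```javascript
// Rebuild a FormData from the payload shape saved by storeFileAndReturnResponse.
// `pair.content` holds the ArrayBuffer captured from the uploaded file.
function rebuildFormData(payload) {
  const formData = new FormData();
  for (const pair of payload.formDataKeyValue) {
    formData.append(pair.key, new Blob([pair.content]), pair.value.name);
  }
  return formData;
}

// The result can then be sent with fetch(url, { method: 'POST', body: formData })
```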
lpizzinidev
Arulvel