I have a JSONEncoder encoding a 20 MB file, which takes ages to process. If the data it's processing changes, I'd like to cancel the encoding and restart it, but I can't think of a way to do this. Any ideas? I could call JSONEncoder.encode again, but then I would have two 30-second processes running, with double the memory and processor overhead. It would be lovely to be able to cancel the previous one.

EDIT: Some of you requested to see my encoder. Here's the one which I'd say causes the biggest bottleneck...

func encode(to encoder: Encoder) throws {
    try autoreleasepool {
        var container = encoder.container(keyedBy: CodingKeys.self)
        try container.encode(brush, forKey: .brush)

        if encoder.coderType == CoderType.export {
            let bezierPath = try NSKeyedUnarchiver.unarchivedObject(ofClass: UIBezierPath.self, from: beziersData)
            let jsonData = try UIBezierPathSerialization.data(with: bezierPath, options: UIBezierPathWritingOptions.ignoreDrawingProperties)
            let bezier = try? JSONDecoder().decode(DBBezier.self, from: jsonData)
            try container.encodeIfPresent(bezier, forKey: .beziersData)
        } else {
            try container.encodeIfPresent(beziersData, forKey: .beziersData)
        }
    }
}
user139816
  • Encapsulate your encoder in a cancellable object, such as NSOperation or with Combine maybe? – Eric Aya Jun 23 '21 at 10:35
  • Which version of Swift are you using? – Salman Khakwani Jun 23 '21 at 10:37
  • If your model data allows it, consider doing this in chunks/subtasks. Say the 20 MB is divided into 100 chunks that you process one by one in a for loop: before starting each subtask you can check whether the encoding has been cancelled. If it has, you can return at that point without encoding the rest, and clean up any in-progress files. As long as you do this in one shot, it can't be cancelled once it has started. – Tarun Tyagi Jun 23 '21 at 11:15
  • @SalmanKhakwani Thanks for your answer. I'm using Swift 5 – user139816 Jun 23 '21 at 11:38
  • I am also wondering why a 20 MB file causes performance issues. JSONEncoder isn't the fastest you can get, but it is not so slow that 20 MB should be an issue. You might consider a faster alternative that creates a custom representation from 20 MB of JSON in 1/10th of the time, or even faster. – CouchDeveloper Jun 24 '21 at 11:30
  • JSONEncoder is not cancellable - no matter what solutions will be suggested, when using it, it will always cost the same resources to create the JSON from the representation till it succeeds or fails, unless you kill the thread where it will be executed. – CouchDeveloper Jun 24 '21 at 13:14
  • Given today's gigahertz processors, 20 MB should be encoded in a fraction of a second. Are you sure there is nothing else going wrong? Could you post a mock of the data you are trying to encode so that we can judge whether there might be other improvements? Your Codable would probably be interesting too. – Patru Jun 24 '21 at 13:56
  • @Patru the bulk of the data is bezier paths such as... {"y":93,"x1":1038.5,"type":"QuadraticCurveTo","x":1039.25,"y1":91},{"type":"MoveTo","x":1039.25,"y":93}, – user139816 Jun 29 '21 at 03:15
  • @Patru I've added the encoder in the main question. – user139816 Jun 29 '21 at 03:33
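
The chunk-by-chunk approach suggested in the comments above could look roughly like this. It is only a sketch: `chunked`, `processChunks` and the `isCancelled` closure are hypothetical names, and it assumes the data can be split into independent slices at all.

```swift
import Foundation

/// Splits `items` into slices of at most `size` elements (hypothetical helper).
func chunked<T>(_ items: [T], size: Int) -> [ArraySlice<T>] {
    precondition(size > 0, "chunk size must be positive")
    return stride(from: 0, to: items.count, by: size).map { start in
        items[start..<min(start + size, items.count)]
    }
}

/// Calls `process` for each chunk in order, polling `isCancelled` before every chunk.
/// Returns the number of chunks that were processed before finishing or stopping.
func processChunks<T>(
    _ chunks: [ArraySlice<T>],
    isCancelled: () -> Bool,
    process: (ArraySlice<T>) throws -> Void
) rethrows -> Int {
    var done = 0
    for chunk in chunks {
        if isCancelled() { break }   // stop at a chunk boundary, never mid-chunk
        try process(chunk)
        done += 1
    }
    return done
}
```

The chunk that is already running always completes, so the chunk size decides how quickly a cancellation takes effect.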

1 Answer

You can use an OperationQueue and add your long-running task to it as an operation.

var queue: OperationQueue?
//Initialisation
if queue == nil {
    queue = OperationQueue()
    queue?.maxConcurrentOperationCount = 1
}
queue?.addOperation {
    // Cancelling does not interrupt code that is already running: encodeHugeJSON() has to check
    // the operation's isCancelled property between steps and return early once it is set.
    self.encodeHugeJSON()
}

You can also cancel the task whenever you want using the following code:

//Whenever you want to cancel the task, you can do it like this
queue?.cancelAllOperations()
queue = nil

What is an Operation Queue:

An operation queue invokes its queued Operation objects based on their priority and readiness. After you add an operation to a queue, it remains in the queue until the operation finishes its task. You can’t directly remove an operation from a queue after you add it.
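
A minimal sketch of that cooperative check, assuming the long-running work can be expressed as a sequence of steps. `SteppedOperation`, `stepCount` and `step` are hypothetical names, not Foundation API; only `Operation`, `isCancelled` and `cancel()` come from Foundation.

```swift
import Foundation

/// Runs `stepCount` units of work and checks `isCancelled` before each one.
/// Hypothetical example type; each step stands in for encoding one chunk.
final class SteppedOperation: Operation {
    private let stepCount: Int
    private let step: (Int) -> Void
    private(set) var completedSteps = 0

    init(stepCount: Int, step: @escaping (Int) -> Void) {
        self.stepCount = stepCount
        self.step = step
        super.init()
    }

    override func main() {
        for index in 0..<stepCount {
            // cancel() only sets a flag, so the operation has to poll it itself.
            if isCancelled { return }
            step(index)
            completedSteps += 1
        }
    }
}
```

With an operation like this on the queue, cancelAllOperations() takes effect at the next step boundary. A single uninterrupted JSONEncoder.encode call inside one step still runs to completion.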

Salman Khakwani
  • [OperationQueue.cancelAllOperations](https://developer.apple.com/documentation/foundation/nsoperationqueue/1417849-cancelalloperations) docs say `Canceling the operations does not automatically remove them from the queue or stop those that are currently executing.` So it cancels all **pending** tasks in the queue, the **in-flight** tasks can't be cancelled. – Tarun Tyagi Jun 23 '21 at 11:18
  • @TarunTyagi +1 Thanks for this information. I think the OP can check the operation's isCancelled flag and abandon the ongoing task when they find that it has been set. – Salman Khakwani Jun 23 '21 at 11:35
  • I agree with @TarunTyagi and if memory is an issue it would actually be worse to start a new operation once the old one is cancelled instead of waiting for the current encoding to finish. – Joakim Danielson Jun 23 '21 at 11:37
  • @TarunTyagi This was how I thought an Operation Queue behaved. So it seems there's no way to cancel it once it's 'in flight'. Breaking it into smaller chunks seems like the only option currently, which I was hoping to avoid. – user139816 Jun 23 '21 at 11:43
  • @user139816 `Breaking it into smaller chunks seems like the only option` is the most sensible way to go about this. It allows mid-flight cancellation of the encoding process: say the process has 100 steps and is at step 15 when the user cancels. You mark the process as cancelled, it still completes step 15, and before starting step 16 it checks for cancellation and exits. – Tarun Tyagi Jun 23 '21 at 11:47
  • @TarunTyagi `let user = realm.objects(DBUser.self).filter("id = '\(guid)'").first let jsonData = try! encoder.encode(user)` This will run through all properties of the DBUser object, and all child objects until it's finished. So to break this into chunks, I would somehow encode one level deep, ignoring child objects, then process child objects separately as a 'chunk', continuing this process throughout the entire object tree. Glueing the chunks together along the way? – user139816 Jun 23 '21 at 12:00
  • @user139816 Sounds right, this can't be done from the top user level, you have to go one level deep and from there encode all properties one by one. – Tarun Tyagi Jun 23 '21 at 12:02
  • @TarunTyagi Seems messy! But I think it's the only and best option. I appreciate your help – user139816 Jun 23 '21 at 12:04
  • Does it have to be one file, otherwise there might be other options? – Joakim Danielson Jun 23 '21 at 13:20
  • @SalmanKhakwani was right. I missed his comment in regards to isCancelled. You can simply test for this Bool in a lengthy loop then it will break out when the operation is cancelled. Am marking this as the correct answer. – user139816 Jul 13 '21 at 12:04
  • The top rated comment here by @TarunTyagi is incorrect, as I mentioned above, it is possible to cancel an in flight operation by using isCancelled. – user139816 Jul 13 '21 at 12:08
  • @user139816 The context was - `the **in-flight** operations are not automatically removed from queue AND system also doesn't stop executing them` which is correct. If your setup has the flexibility to check for `isCancelled` before executing next step in a long running operation, only then you can avail going down this path. As discussed earlier via dividing into short subtasks/chunks of work - so you can check at each step before proceeding further to cancel early. In this case, each for loop iteration acts as one step/chunk of work. – Tarun Tyagi Jul 13 '21 at 14:19
  • @user139816 To add more clarity to this, you can not cancel an in-flight `JSONDecoder().decode` call. There is no way to do this. However if you are decoding 100s of different things - one by one - then of course you can check at each step and that allows for the checking `isCancelled` mid-flight (say after completing 15 json decode calls, before going to 16th json decode call out of total expected 100 json decode calls to complete an operation). In this case each individual json decode call is a subtask/chunk of work mentioned in the earlier comments. – Tarun Tyagi Jul 13 '21 at 14:25
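
One way the "encode the children separately and glue the chunks together" idea from this thread might be sketched. `Stroke`, `encodeArrayIncrementally` and `shouldContinue` are hypothetical names, and the sketch assumes the bulk of the data is a flat array of child objects that can each be encoded on their own.

```swift
import Foundation

// Hypothetical child object; stands in for one stroke of the drawing.
struct Stroke: Codable, Equatable {
    let brush: String
    let points: [Double]
}

/// Encodes each child with its own JSONEncoder.encode call and glues the
/// fragments into a single JSON array. `shouldContinue` is polled before
/// every child; the function returns nil as soon as it reports false.
func encodeArrayIncrementally<T: Encodable>(
    _ children: [T],
    shouldContinue: () -> Bool
) throws -> Data? {
    let encoder = JSONEncoder()
    var output = Data("[".utf8)
    for (index, child) in children.enumerated() {
        guard shouldContinue() else { return nil }   // discard the partial output
        if index > 0 { output.append(Data(",".utf8)) }
        output.append(try encoder.encode(child))
    }
    output.append(Data("]".utf8))
    return output
}
```

Inside an Operation, `shouldContinue` could simply return `!isCancelled`. The glued result decodes back as an ordinary `[Stroke]` array, so readers of the file should not need to know it was written in pieces.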