
I am interested in developing a library that would sync a Core Data model across devices via the Parse mobile backend. I want to mirror the functionality that iCloud Core Data sync attempts to provide.

Why not use iCloud or Ensembles? I am currently using iCloud Core Data sync in a production app and it is not working well for me. I also want to provide authentication independent of the Apple ID, which is another reason I want to get away from iCloud. As far as Ensembles is concerned, I am not sure whether it will still work with Dropbox, due to the deprecation of the Dropbox Sync API.

I haven’t begun to develop the library yet. I am looking for feedback on my plan, which is outlined below. The design is based on this SO answer.

General design of the library:

  1. The library would provide a standard Core Data stack that would set up the persistent store coordinator and managed object context. All of the standard Core Data CRUD operations would go through an interface provided by the library.

  2. Each time a CUD operation takes place, a sync operation object would be saved to Parse in the background, containing all of the information needed to reproduce the operation. This includes the type of operation that took place, a unique identifier for the object that was operated on, and, in the case of a create operation, the parent object and relationship.

  3. Each operation would have a change_id number associated with it. Every time the device downloads and executes an operation, it would store the latest change_id associated with that operation.

  4. Prior to uploading each sync operation, the device would send a request to the server to check that the latest change_id stored on the server matches the one stored locally. If the change_id on the server is higher, the device would first download all of the newer sync operations and execute them, then upload its own sync operations.

  5. Conflicts (two devices editing the same value while offline) would be resolved by determining which device changed the value last.
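
To make the plan above concrete, here is a minimal in-memory sketch of steps 2 to 5. Everything in it is an assumption for illustration: `SyncOperation`, `InMemoryServerLog`, and `Device` are hypothetical names, the server log stands in for Parse, and a plain dictionary stands in for the Core Data store. It is not how Parse or Core Data actually work, only one way the change_id check and last-writer-wins rule could fit together.

```swift
import Foundation

// Hypothetical types for illustration only; none of these come from Parse or Core Data.
enum OperationKind { case create, update, delete }

struct SyncOperation {
    let kind: OperationKind
    let objectID: String            // unique identifier of the object operated on
    let modifiedAt: Date            // when the device made the change
    var attribute: String? = nil    // updates only: which attribute changed
    var value: String? = nil        // updates only: the new value
    var parentID: String? = nil     // creates only: the parent object
    var relationship: String? = nil // creates only: relationship on the parent
    var changeID: Int = 0           // assigned by the server log on upload
}

struct Stamped {
    let value: String
    let modifiedAt: Date
}

/// Stand-in for the Parse backend: an append-only log of operations.
final class InMemoryServerLog {
    private(set) var operations: [SyncOperation] = []
    var latestChangeID: Int { operations.last?.changeID ?? 0 }

    func operations(after changeID: Int) -> [SyncOperation] {
        return operations.filter { $0.changeID > changeID }
    }

    func append(_ operation: SyncOperation) -> SyncOperation {
        var stored = operation
        stored.changeID = latestChangeID + 1
        operations.append(stored)
        return stored
    }
}

/// Stand-in for one device: objectID -> attribute -> stamped value.
final class Device {
    private(set) var lastChangeID = 0
    private(set) var store: [String: [String: Stamped]] = [:]

    /// Applies a local or downloaded operation to the local store.
    func apply(_ operation: SyncOperation) {
        switch operation.kind {
        case .create:
            if store[operation.objectID] == nil { store[operation.objectID] = [:] }
        case .delete:
            store[operation.objectID] = nil
        case .update:
            applyUpdate(operation)
        }
        lastChangeID = max(lastChangeID, operation.changeID)
    }

    /// Step 5: an update only wins if it is newer than the value already stored.
    private func applyUpdate(_ operation: SyncOperation) {
        guard let attribute = operation.attribute,
              let value = operation.value,
              var attributes = store[operation.objectID] else { return }
        if let existing = attributes[attribute],
           existing.modifiedAt >= operation.modifiedAt { return }
        attributes[attribute] = Stamped(value: value, modifiedAt: operation.modifiedAt)
        store[operation.objectID] = attributes
    }

    /// Step 4: catch up with the server first, then upload pending operations.
    func sync(pending: [SyncOperation], with server: InMemoryServerLog) {
        if server.latestChangeID > lastChangeID {
            for operation in server.operations(after: lastChangeID) {
                apply(operation)
            }
        }
        for operation in pending {
            lastChangeID = server.append(operation).changeID
        }
    }
}
```

In this sketch a device calls `apply` on its own edits immediately and passes the same operations to `sync(pending:with:)` later. Note that comparing `modifiedAt` values relies on device clocks agreeing, which they often do not; a real implementation would likely need a more robust ordering than wall-clock time.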

Am I missing anything here? What are some potential pitfalls with this approach? I hear that sync is hard. Should this type of undertaking be left to the most experienced developers?

ChemDev

1 Answer


I'm not the least biased responder, because I am the developer of the Ensembles framework, but let me pitch in some thoughts.

As for Ensembles itself, it is a backend-agnostic framework. Yes, it does work with iCloud and the Dropbox Sync API, but also with CloudKit, the Dropbox Core API (which is not deprecated), and WebDAV. There is also a custom Node.js server, available with one package, which allows you to host the data yourself using Heroku and S3.

So even if you don't want to stick with Apple, there are other options. But even more than that, you can write your own backend adaptor class. Most are around 500 lines of code, and you can base it off one of the existing classes. This would allow you to make a backend that stores data and authenticates with Parse, and leave the merging of data to Ensembles. Another advantage of this is that you can easily move to other backends in future, or offer them as options. (CloudKit is definitely worth a look.)

But let's assume you are determined not to use someone else's framework. In that case, yes, your approach sounds broadly right.

Rather than making CRUD operations go through an interface, you can just observe NSManagedObjectContextDidSaveNotification and extract the changes from the userInfo dictionary.
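
As a rough sketch of that idea, the following pulls the inserted, updated, and deleted objects out of a did-save notification. `SavedChanges`, `savedChanges(from:)`, and `observeSaves(of:handler:)` are hypothetical helper names chosen for this example; the notification name and the `NSInsertedObjectsKey`, `NSUpdatedObjectsKey`, and `NSDeletedObjectsKey` keys are the ones Core Data provides.

```swift
// Core Data is only available on Apple platforms, hence the guard.
#if canImport(CoreData)
import CoreData

/// Hypothetical container for the three change sets in a did-save notification.
struct SavedChanges {
    var inserted: Set<NSManagedObject> = []
    var updated: Set<NSManagedObject> = []
    var deleted: Set<NSManagedObject> = []
}

/// Extracts the changed objects from the notification's userInfo dictionary.
/// Missing keys are treated as empty sets.
func savedChanges(from notification: Notification) -> SavedChanges {
    let info = notification.userInfo ?? [:]
    func objects(_ key: String) -> Set<NSManagedObject> {
        return info[key] as? Set<NSManagedObject> ?? []
    }
    return SavedChanges(inserted: objects(NSInsertedObjectsKey),
                        updated: objects(NSUpdatedObjectsKey),
                        deleted: objects(NSDeletedObjectsKey))
}

/// Registers for saves of one context. Keep the returned token and pass it to
/// NotificationCenter.default.removeObserver(_:) when you stop observing.
func observeSaves(of context: NSManagedObjectContext,
                  handler: @escaping (SavedChanges) -> Void) -> NSObjectProtocol {
    return NotificationCenter.default.addObserver(
        forName: .NSManagedObjectContextDidSave,
        object: context,
        queue: nil
    ) { notification in
        handler(savedChanges(from: notification))
    }
}
#endif
```

The handler would be the natural place to turn each changed object into a sync operation. With `queue: nil` the block runs on whichever thread posted the notification, so the managed objects should be used right there rather than handed to another queue.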

I'm sure you will find lots of little things you didn't think about, and it's these details that tend to make sync hard. One such example is that you need to build something robust enough to handle failures such as the Parse operations not completing before the app quits. You probably need to have a change tag on every object, so you can retrieve the ones that changed since the last sync.
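
One possible way to survive an upload that does not finish before the app quits is to keep pending operations in a small on-disk queue and remove each one only after the backend confirms it. The sketch below assumes a hypothetical `Outbox` class and `PendingOperation` type, and the `upload` closure stands in for whatever Parse call would actually be used.

```swift
import Foundation

/// Hypothetical record of one operation waiting to be uploaded.
struct PendingOperation: Codable, Equatable {
    let id: UUID
    let payload: String
}

/// A queue of pending operations that is rewritten to disk on every change,
/// so unsent operations are still there after a relaunch.
final class Outbox {
    private let fileURL: URL
    private(set) var pending: [PendingOperation]

    init(fileURL: URL) {
        self.fileURL = fileURL
        if let data = try? Data(contentsOf: fileURL),
           let saved = try? JSONDecoder().decode([PendingOperation].self, from: data) {
            pending = saved
        } else {
            pending = []
        }
    }

    /// Records an operation before any network activity is attempted.
    func enqueue(_ operation: PendingOperation) throws {
        pending.append(operation)
        try persist()
    }

    /// Uploads operations in order and stops at the first failure.
    /// An operation leaves the queue only after `upload` reports success.
    func flush(upload: (PendingOperation) -> Bool) throws {
        while let next = pending.first {
            guard upload(next) else { break }
            pending.removeFirst()
            try persist()
        }
    }

    private func persist() throws {
        let data = try JSONEncoder().encode(pending)
        try data.write(to: fileURL, options: .atomic)
    }
}
```

If the app dies after an upload succeeds but before the queue is rewritten, the same operation will be sent again on the next launch, so the server side would need to tolerate duplicates, for example by ignoring an `id` it has already seen.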

If your app has a small amount of data, building this system is not terribly difficult, but as your data starts to get bigger, you need to start using things like batching to keep in-memory data low on iOS. This sort of thing can take a lot of time. For example, Ensembles 2 has pretty much an identical API to Ensembles 1, but I spent about 4 months just rewriting things like batching to be memory efficient.
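
As a very small illustration of the batching idea (not how Ensembles does it), the helper below walks a list in fixed-size chunks so that only one chunk's worth of work is in flight at a time. `forEachBatch` is a hypothetical name; in a Core Data app the items would typically be object identifiers, with the objects for each batch loaded and released inside the closure.

```swift
/// Calls `body` once per batch of at most `size` consecutive items.
func forEachBatch<T>(of items: [T], size: Int, _ body: ([T]) throws -> Void) rethrows {
    precondition(size > 0, "batch size must be positive")
    var start = 0
    while start < items.count {
        let end = min(start + size, items.count)
        try body(Array(items[start..<end]))
        start = end
    }
}
```

For example, `forEachBatch(of: identifiers, size: 500) { batch in ... }` would process 1,200 identifiers in three calls of 500, 500, and 200. Core Data's own `fetchBatchSize` on `NSFetchRequest` addresses a related problem on the fetching side.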

I built a prototype app using the approach you describe (app was social, not syncing, hence no Ensembles). I used CloudKit, which is very similar to Parse. It was about 1000 lines of Swift code to get the whole data upload/download working OK, with a local Core Data cache. It's certainly do-able, especially if you know Core Data well already. Otherwise there might be a learning curve.

My advocacy of a framework like Ensembles is simply that it has already solved many of the small details you will need to solve, and it will not lock you into a particular backend. If Parse decided to raise their fees, you would be free to move elsewhere.

Drew McCormack
  • Drew, thanks so much for taking the time to provide a detailed and thoughtful answer to my question! Great to hear from someone who has actually successfully developed a sync library. Ensembles seems like an interesting option; however, I am leaning towards developing my own library because it will be easier to tailor it to my own needs and requirements. With that said, if you had a prebuilt solution which used Parse as a backend that didn’t require much custom code, I would definitely purchase it at your current price point. If I feel that way, there must be others that feel that way too :) – ChemDev Jul 20 '15 at 00:40
  • Probably too late for you now, but I will look at adding a parse.com backend. – Drew McCormack Jul 21 '15 at 07:01
  • @ChemDev I speak as someone who is very well versed on Core Data, and who has also developed their own syncing solution... and then realized that I didn't make it as generic as I wanted when I had a project that needed to use S3 as the backing store. I was so impressed with Ensembles that I paid for version 2, and it was the best money I've spent in years. You have access to the source code, so you can change and adapt as you see fit - but you probably only need to write backend plugins - which is what I did - and now my OS X app is natively syncing directly to AWS S3 (with multiple users). – Jody Hagins Jul 22 '15 at 04:08
  • Jody, thanks for the comment. I'm going to give Ensembles a try and will report back. – ChemDev Jul 22 '15 at 12:13
  • Drew, the multi-user sync is intriguing. How is this handled with CloudKit in terms of authenticating users who share an ensemble? Is this a feature that is available out of the box, or is custom code required? – ChemDev Jul 23 '15 at 00:27
  • It is not handled by Ensembles itself. There are so many ways to do multi-user, that it would be a problem. So that is left to you. For example, you could create a group of users that share a particular Ensemble directory in the cloud. User management is left as an exercise for the developer. – Drew McCormack Jul 23 '15 at 07:16