Mistitled

Rich Siegel's Tumblr

The Gathering Storm: Our Travails with iCloud Sync

Preface

Recently, and with increasing frequency, developers have been speaking out about the difficulties they’ve had trying to implement support for iCloud in their products, and journalists have been reporting the news. I told our story to Ars Technica, and they’ve made their own report. In this post, I provide the background information behind our contributions to the Ars article.

Introduction

iCloud, the service provided by Apple, Inc. for sharing data among multiple Macs and iOS devices, brings with it the promise of power and convenience. However, many developers are having problems trying to fulfill the iCloud promise for their customers:

And we at Bare Bones Software have had difficulties with our digital junk drawer product, Yojimbo, as well.

iCloud Syncing 101

iCloud is the Apple-sanctioned sync service; support for it is built in to current operating systems, and it is available to all Mac and iOS users. As such, using iCloud is a natural and rational decision for a developer to make when deploying Mac and/or iOS applications that need to share data across multiple Apple devices for a single user.

In concept, the service is pretty simple. A central iCloud server holds the truth: the canonical version of the user’s data for an app. As the user manipulates an app’s data, iCloud tracks and reconciles the changes into the central truth, and makes sure that all copies of the data, on each computer, are brought up to date. In order for this to work, though, a lot has to happen behind the scenes. What we casually refer to as iCloud is many parts, each with a role to play.

  • The Ubiquity system runs locally on your Mac, manages data in your home/Library/Mobile Documents/ folder, identifies changed files, and submits them to the iCloud service. In reverse, the Ubiquity system receives data from the iCloud service and stores it locally.

  • The iCloud service, which runs on Apple’s servers and receives data from each Mac logged in to your iCloud account; and supplies data to each Mac in response to requests made by the Ubiquity system.

  • Code in the OS and application frameworks which communicate with the Ubiquity system to move data back and forth.

Applications that wish to support iCloud may do so in one of three basic ways:

  • Key-value storage (KVS): This is the simplest form of support, in which keys (item names, loosely speaking) are paired with values (item content). KVS is very limited; Apple has established both quantity and size limits on KVS usage in iCloud clients. So while it is an excellent solution for preferences and simple configuration data, it is not appropriate for document storage, and is unsuited for applications (such as Yojimbo) which must store multiple types of indexed items of various sizes, ranging from a few words of text to dozens of megabytes of PDF data.

  • Documents in the Cloud: This is another simple form of iCloud support, in which applications using the system’s standard document model may store discrete documents in such a way that the iCloud system synchronizes them between multiple devices (including iOS devices). Documents in the Cloud are supported fairly widely by applications on the Mac, and this method works well for document-oriented tools such as TextEdit or the iWork products. However, because the synchronization is done only at the level of a single document, the Documents in the Cloud model lacks the ability to discretely sync changes from multiple clients, or take advantage of other database functions.

  • Core Data syncing: This is where the rubber meets the road for database-backed applications. Core Data is the application-level database framework supplied by OS X and iOS that provides the means for applications to store items, and data about those items, in a single database. Yojimbo, our product, was one of the very first to ship using Core Data storage — we’ve used it since 2006, and it works great for storing data locally in the way that the product needs it to. Syncing database changes with iCloud, however, is a very complicated and difficult job for Core Data.

So, what’s so hard?

"Don’t you just turn on iCloud syncing, and it all Just Works?”

"Just works" might be the customer experience; but developers must do more to support iCloud (or any new OS technology) than simply flipping a switch. Things can go wrong, and they often do.

Opaque failure modes

In general, when iCloud data doesn’t synchronize correctly (and this happens, in practice, often), neither the developer nor the user has any idea why.

Sometimes, iCloud simply fails to move data from one computer to another. If you know the secret incantation to maximize the amount of logging done by the Ubiquity daemon, there are helpful folks at Apple who may be able to tell you what the entrails mean after the fact. Sometimes you’ll get errors such as, “The file upload timed out.” Apart from not being semantically relevant (Yojimbo doesn’t request any file uploads, because syncing is done at the Core Data object level), the message is also unhelpful because it doesn’t provide any information that either the user or developer can apply to diagnose and resolve the problem.

Corrupted baselines are another common obstacle. While attempting to deploy iCloud sync on Mac OS X 10.7, we ran into a situation in which the baseline (a reference copy of the synchronization data) would become corrupted in various easily encountered situations. There is no recovery from a corrupted baseline, short of digging in to the innards of your local iCloud storage and scraping everything out, and there is no visible indication that corruption has occurred — syncing simply stops.

Finally, one of the most maddening issues: the iCloud-just-says-no problem. Sometimes, when initializing the iCloud application subsystem, it will simply return an opaque internal error. When it fails, there’s no option to recover — all you can do is try again (and again…) until it finally works. And when it does initialize successfully, it can take an extremely long time. In some cases, we’ve seen delays of up to 25 minutes while waiting for the iCloud stack to initialize. There’s no discernable consistency or rationale for when it says no and when it finally says yes…you can just keep trying, and eventually it might just work, or not.

User Experience Shortfalls

Did you know that when you turn off the “Documents & Data” syncing option in the iCloud system preferences, the iCloud system deletes all of your locally stored iCloud data? Or when you sign out of iCloud, the system moves your iCloud data to a location outside of your application’s sandbox container, and the application can no longer use it?

This is part of iCloud’s central design philosophy: if you are using iCloud for data storage, the assumption appears to be that you are using iCloud exclusively for data storage. If you stop using iCloud on a given device, the system assumes that you are no longer interested in your stored data, and it gets removed from your device. Or if you sign out of iCloud (whether or not you intend to log in again), the system stashes your iCloud data in an undisclosed location. Applications which are sandboxed (as iCloud clients must be) are not able to see into this location, so when you sign out of iCloud, your iCloud data becomes inaccessible.

Consider the implications for a Core Data shoebox application such as Yojimbo: You dutifully make your Core Data storage ubiquitous, and then you sign out of iCloud. Poof, there goes your Yojimbo database.

There are ways of coping with this, at least in theory. The recovery from iCloud signout involves taking the opportunity to migrate all of your Core Data storage from Mobile Documents to the private sandbox container on your Mac. We found, to our dismay, that the practical reality didn’t hold up to theory — part of the problem is that you don’t get notified until after the data has been made inaccessible, and once in that state, there’s no choice but to use Core Data to make a copy of the data that’s just been sequestered. And of course, given a database of sufficient size, the process of using Core Data to relocate the database ties up the application in an unresponsive state, without visible progress, for as long as it takes. (And woe betide you if something goes wrong in the middle of it.)

Bugs

Of course, virtually every piece of software has bugs. The Core Data/iCloud sync interface is not immune, and unfortunately the bugs there are the ones that have caused us the most difficulty. There are bugs in Ubiquity on 10.7.x, which is why we had to (reluctantly) make the decision to require 10.8.2 as the minimum OS for supporting iCloud. Even on the latest 10.8 OS update there are bugs. Yes, we’ve reported them — in fact, we’ve gotten hands-on, in-person assistance from the Core Data and Ubiquity engineers at Apple. And even so, some bugs persist or are intractable. Here’s a sampling of what we’ve run into:

Object class confusion

In some circumstances (and we haven’t been able to figure out which, yet), iCloud actually changes the object class of an item when synchronizing it. Loosely described, the object class determines the type of the object in the database — in Yojimbo’s case, this would be a Note, or a Password, or a Web Archive, and so forth. When this bug occurs, iCloud synchronization sometimes causes PDFs to be changed to Web Archives; and sometimes causes Web Archives to be changed to PDFs. Of course, the object’s data or properties aren’t changed; only its object class; and when this happens, things fall apart when trying to view an object of the wrong class.

Broken relationships, lost containers

In many Core Data clients, objects which appear as single items in the UI actually have multiple parts that are held together by relationships. For example, a Web Archive item consists of a WebArchiveBlob item which contains the actual data, and a WebArchive item which maintains other relevant information about the item. (Splitting items up in this way makes it possible to change the properties of an item and not require the entire item to be resynchronized.) The WebArchive item contains relationships that identify its associated WebArchiveBlob; and the WebArchiveBlob contains relationships that identify the particular WebArchive as its owner.

This is all fine, except when iCloud fumbles. In some cases (again, not all the time), iCloud may do one of the following:

  1. Owner relationships in an item’s data will point to the wrong owner;

  2. Owner items get lost in synchronization and never appear on computers other than the one on which they were created. (This leads to the item never appearing in the UI on any other machine.) When this happens, bogus relationships get created between blob items and an arbitrary unrelated owner.

In some cases, we can repair bad relationships; it’s a process that requires significant amounts of time and memory, but can be done. Far more serious, though, is the case in which the owner item doesn’t get synced. It’s simply lost, and will never show up on a synced computer. There’s no way for us to know what it was, or reconstruct it; and no way to fix the incorrect relationships that got created for the blobs.

Lost blobs

Sometimes (without any apparent consistency or repeatability), the associated data for an object (for example, the PDF data for a PDF item, or the web archive data for a Web Archive item) would simply fail to show up on the destination machine. Sometimes it would arrive later (much later — minutes or hours). When the blob data doesn’t show up, we have no choice but to wait — the application can’t display what isn’t there, and there are no mechanics by which Core Data can make a request for immediate delivery of the data from iCloud.

Moreover, there’s no way to detect when the data finally arrives; all the application can do is repeat the request to Core Data ad infinitum until the data is available. (The folks at Apple did provide a helpful workaround in which we can peek in an undocumented location to discover when the data is finally available, though.)

It’s not just us

As mentioned earlier, quite a few Mac and iOS developers have been discovering, the hard way, what we’ve also found. It’s been more than a year since we undertook the job of converting Yojimbo to use iCloud for synchronization, and despite our best efforts, we have been unable to get it working reliably to the point that we can ship a new version to customers. And we find it noteworthy that, to the best of our knowledge, none of Apple’s currently shipping products on Mac OS X use iCloud to sync Core Data. (One might think that Address Book or Calendars would, but in fact they use CardDAV and CalDAV, respectively. These are both HTTP-based protocols for communicating directly with a server, and do not rely on Core Data or the Ubiquity system.)

Alternatives

We, and other developers similarly affected by these challenges, have considered using other services for syncing. Another possibility would be for developers to implement their own sync services.

Syncing is an enormously complicated problem to solve in a reliable way. It’s not an impossible problem to solve, but we’re a small, independent software shop, and we have to be judicious about how we spend our time and resources. In addition, providing a custom sync service also requires a significant capital expense: the master copy of the data has to live somewhere, and that in turn requires investment in data-center-grade server hardware, software, applications, staff, and infrastructure services to keep it all running.

Dropbox is an excellent service for sharing files between multiple computers. However, Dropbox is not suitable for synchronizing between Core Data applications; you can’t just dump your database in a Dropbox and have it work as reliably as we’d like. (We do provide a technical note which describes the process for anyone willing to try to sync Yojimbo with Dropbox; the caveats and the steps required to make it work between desktop installations should shed some light on the issues involved.)

As for other alternatives, a couple of motivated developers have undertaken to engineer their own sync services for use by folks who are writing Core Data applications that require synchronization. Unfortunately, none of the alternatives we have investigated so far have been suitable “out of the box”, but there are some good candidates. We’re continuing along this line of inquiry, because a robust sync service that supports our needs is something that our software was conceived to use.

It is very important to note, however, that using a third-party sync service has implications beyond the simple one of whether syncing works or not. It is vitally important that whatever service we choose is robust and reliable on the back end (where sync state is maintained and the data is stored); it won’t do anyone any good if the sync service doesn’t work reliably.

In addition, third-party services are not free. Data synchronization requires storage and bandwidth; those come at an ongoing cost, and the provider must either pass the cost along to the customer, or go out of business. So, developers who use third-party sync services have to make some difficult decisions about how their software is priced and sold, if they wish to have a product with any longevity.

"What’s next?"

Yojimbo was designed with synchronization as a core feature; so as far as we’re concerned, sync needs to work, and uphold our standards for performance and reliability. We and other affected developers are continuing to iterate with Apple regarding the technical problems we’ve run into. However, if iCloud sync can’t be made to work, perhaps another service will do the job.

We don’t yet know how circumstances will unfold, nor can we tell how long it’ll take to overcome the challenges. But we will continue to work on providing synchronization in Yojimbo, in whatever form it may take.

fin

  1. michaelthegeek reblogged this from rms2
  2. bluebluu reblogged this from rms2
  3. fusionlab reblogged this from rms2
  4. gottino reblogged this from rms2 and added:
    The iCloud Sync Oddyssey
  5. sergiomallardo reblogged this from rms2