Sparkle 2.0: Incremental Updates

With so many people using Sparkle, every update consumes a huge amount of bandwidth. Incremental updates could save much of it.

xdelta3 seems to have good efficiency, but it only makes diffs for a single binary file; it doesn't know anything about added or removed files. All the packages—zsync, bsdiff, diffball—seem to work this way. We'd want to package a whole bunch of binary diffs together along with adds and removes... that'd be a hassle. Any ideas? Update: Mojopatch looks like an excellent candidate.

We could perhaps support "Upgrade-In-Place" for incremental updates. Rather than using binary diffs, the app bundle (or whatever other bundle) for upgrading would include only the changed files; it would be extracted into the existing bundle, replacing files as it went. A small utility could come with Sparkle for developers to produce such a bundle, given the old version and the new version. An attempt by the user to upgrade from any version older than "the old version" would trigger a full download rather than the partial one. -evands
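
As a rough illustration of the developer utility described above, here is a minimal sketch in Python. It is not Sparkle code: the function name `build_partial_bundle` and the directory arguments are hypothetical, and it assumes both versions are available as unpacked directories on disk.

```python
import filecmp
import shutil
from pathlib import Path


def build_partial_bundle(old_dir, new_dir, out_dir):
    """Copy into out_dir every file in new_dir that is new or differs from old_dir.

    Returns the relative paths that were copied, in sorted order.
    Hypothetical helper for illustration; not part of Sparkle.
    """
    old_dir, new_dir, out_dir = Path(old_dir), Path(new_dir), Path(out_dir)
    copied = []
    for new_file in sorted(new_dir.rglob("*")):
        if not new_file.is_file():
            continue
        rel = new_file.relative_to(new_dir)
        old_file = old_dir / rel
        # shallow=False forces a byte-by-byte comparison instead of
        # trusting file size and modification time.
        if old_file.is_file() and filecmp.cmp(old_file, new_file, shallow=False):
            continue
        target = out_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(new_file, target)
        copied.append(rel.as_posix())
    return copied
```

Note that a bundle built this way only adds and replaces files. Files that were removed in the new version would stay behind after extraction, so a real tool would probably also need to ship a list of paths to delete.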

Ryan over at icculus.org has Mojopatch, which builds diffs for packages. If memory serves, those patch files are self-contained and executable, so they can update their intended applications on their own. Maybe we can contribute to the Mojopatch project and fill in some of the missing pieces it needs (like a UI for building the patches, since it is command-line only at the moment, and support for xdelta3, since it currently uses xdelta 1.13) to give us a great tool for incremental update patches. -HowardGMac

Mojopatch looks really, really promising. I tested it tonight, and it produced extremely small patches. It seems to be able to handle adding and deleting of files. This is excellent. It doesn't really have a programmatic interface, but I'll look into adapting their command-line tool into something Sparkle can call internally. Thanks for the recommendation, Howard. —Andy Matuschak

Patching isn't necessarily the ideal solution to updating large programs. One problem with patching is that if you have some large binary blob (like, say, the application itself, or a video) that gets updated repeatedly, and someone needs to upgrade across 10 versions that all modify that file, all 10 patches together may be bigger than just downloading the whole thing. There are also some administrative headaches: you need to make sure that every version is identified with a unique version number, that you never accidentally release two versions with the same version number, and that you deal with separate patch streams for beta users vs. stable users, etc. One approach that can solve many of these problems is treating it somewhat like rsync. Instead of maintaining a set of patches that will get you to where you want to go, you just store the latest version on the server. To determine which files are out of date, you generate an MD5 or SHA-1 hash of each file and store the hashes in a manifest file. Then you can simply compare the manifest of the installed program to the manifest of the version on the server, and download each file that doesn't match. To guarantee atomicity, and to reduce the download time in cases where you have many copies of the same file, you should be able to access each file on the server by its hash; you just request the hash of each file that you need to update, and once they're all downloaded you can copy them to where they need to be on disk. I have used this technique for an automatic updater for a program that, including many video files, was about 2 GB, and it was quite simple to implement and pretty robust. —Brian Campbell
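
The manifest comparison described above could be sketched roughly as follows. This is an illustration in Python, not the updater Brian wrote: the function names and the manifest layout (relative path mapped to SHA-1 hex digest) are assumptions, and the list of paths to delete is an addition that the comment does not mention.

```python
import hashlib
from pathlib import Path


def file_sha1(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are not read into memory at once."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to the SHA-1 of its contents."""
    root = Path(root)
    return {
        path.relative_to(root).as_posix(): file_sha1(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def plan_update(installed, latest):
    """Compare two manifests.

    Returns (hashes_to_fetch, paths_to_write, paths_to_delete).
    """
    # Every path whose hash is missing or different locally must be rewritten.
    paths_to_write = {p: h for p, h in latest.items() if installed.get(p) != h}
    # Requesting files by hash means identical files are downloaded only once.
    hashes_to_fetch = set(paths_to_write.values())
    # Assumption: files absent from the latest manifest should be removed.
    paths_to_delete = sorted(set(installed) - set(latest))
    return hashes_to_fetch, paths_to_write, paths_to_delete
```

Once every hash in `hashes_to_fetch` has been downloaded to a staging area, the files can be copied to the locations in `paths_to_write` in one pass, which is where the atomicity comes from.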

Yeah, that'd certainly be a simpler method. My plan was to do sort of a hybrid: a patch would upgrade from the last version to the current. If the user has something older or something self-modifying, it would just download the full latest version from the file. I guess that's not on a per-file basis, but it could happen. —Andy Matuschak

The downside I see for the checksum-comparison approach is that it'll make some big, complicated appcasts. If it were done, I'd seriously recommend building a utility for developers that'll build that portion of the appcast out of something - likely their just-compiled .app package. Couldn't something like all 3 options be included (with a dev utility)? Updates that use patches where available (include a "patch-from" attribute, so you can have multiple patches available), then individual files, and then full re-downloads as a fallback (if others are not specified) or with a "required" style flag. One other complication I see is if people modify their app, especially things like app icons, toolbar icons, etc. What should be done then? Wipe out all their work, selectively preserve it, or...? -Groxx
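
To make the three-tier fallback concrete, here is one way the selection logic might look. This is only a sketch in Python: `UpdateItem` and its fields are a hypothetical in-memory form of an appcast entry, not Sparkle's actual schema, and skipping the patch when the bundle has been modified is one possible answer to the question above, not a settled decision.

```python
from dataclasses import dataclass, field


@dataclass
class UpdateItem:
    """Hypothetical appcast entry; field names are assumptions for illustration."""
    version: str
    full_url: str
    patches: dict = field(default_factory=dict)  # "patch-from" version -> patch URL
    file_manifest_url: str = ""   # empty if no per-file manifest is published
    full_required: bool = False   # the "required"-style flag


def choose_strategy(item, installed_version, bundle_modified=False):
    """Return (strategy, url), preferring the smallest download that is safe."""
    if item.full_required:
        return "full", item.full_url
    # A binary patch only applies cleanly to the exact files it was built from,
    # so a modified bundle skips straight to the next option.
    if not bundle_modified and installed_version in item.patches:
        return "patch", item.patches[installed_version]
    if item.file_manifest_url:
        return "files", item.file_manifest_url
    return "full", item.full_url
```

With this ordering, a developer who publishes only a full download still gets working updates, and each extra piece of appcast data just unlocks a cheaper path.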