Sparkle 2.0: Statistical Information

I've decided to get with the times. Sparkle+-style statistical information is okay in my book so long as we get the user's permission first. The question is: where do we ask for it?

I don't want to ask when we're installing Sparkle. We're already asking enough of the user—to install some unknown program with an unfamiliar name and icon. It would make it look even sketchier and less trustworthy if we also had a check box for statistical information. Sparkle will not be that annoying uncle who first asks if he can come over and then asks if he can have some bacon.

The best solution I can think of is to ask the next time Sparkle presents an interface: the first time an update is detected. I'm thinking that we can ask either in a check box in the update alert or in a dialog when the user chooses to install the updates. The former is less obtrusive, but we've already got one check box in there, and this kind of question feels more dialog-y to me.

Do we offer this collection per application, or is this going to be all or none? I for one think per application is not out of line, but I may be in the minority. -HowardGMac

Nah, that's more complexity than I want. I don't think it's a huge deal if some users, who for some reason are paranoid about some apps but not others, don't end up sending in their info. -Andy Matuschak

Presumably the statistical information will be sent to the application's author; there are definitely some authors I trust more than others, so a per-application setting here would be much more sensible.

My take on the matter: If you have to ask, you probably shouldn't be collecting this information to begin with. If this data is truly innocuous, however, then the solution is clear. Ultimately, instead of worrying about when to ask the user, I believe the focus should be on satisfactorily anonymizing the statistical data submitted. For example, consider establishing some kind of central Sparkle Mac stats database that developers could query at their leisure for app-specific breakdowns using various one-way hashing functions and the like. Once collected, the stats would be fungible and completely dissociated from users' identities. Stats are sent when the user updates Sparkle, so there is no dependency on an additional server. After all, how often will hardware and OS specs really change for each user? The issue of per-application trust disappears, as do even the most security-conscious users' objections, so why ask users at all? Or you could ask yourself whether every independent Mac developer really, truly needs his/her own Omni Update Stats page. -Scrod
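To make the one-way hashing idea concrete, here is a minimal Python sketch of how a client might replace an identifying value with a digest before submitting stats. Nothing here is Sparkle's actual behavior: the field names, the salt, and the functions `anonymize_id` and `build_report` are all hypothetical, chosen only for illustration.

```python
import hashlib


def anonymize_id(raw_id: str, salt: str) -> str:
    """Return a SHA-256 hex digest of salt + raw_id.

    The salt is a hypothetical per-application value; the same input
    always maps to the same digest, so repeat submissions can be
    de-duplicated without storing the raw identifier.
    """
    return hashlib.sha256((salt + raw_id).encode("utf-8")).hexdigest()


def build_report(app_name, app_version, os_version, install_id, salt):
    """Build a stats report (hypothetical schema) with no raw identifier in it."""
    return {
        "app": app_name,
        "app_version": app_version,
        "os_version": os_version,
        # Only the digest leaves the machine, never install_id itself.
        "install": anonymize_id(install_id, salt),
    }


report = build_report("ExampleApp", "1.2", "10.5", "machine-1234", "example-salt")
```

One caveat on the design: hashing only hides the input if the input is hard to guess. A digest of a short or predictable identifier can be recovered by hashing every candidate, so this is a starting point rather than a guarantee of anonymity.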

Speaking for Adium here, I can say that one of the useful pieces of info for us is what percentage of our user base is on a given version of Adium. Other handy ones are system language, IM protocols used, and OS version. The rest are pretty much just fluff, although I could see situations where they might be useful. You can see what we collect here: http://sparkle.adiumx.com (any odd spikes are due to bugs in the reporting mechanism, which are hopefully worked out by now). I do like the idea of an aggregated stats db with developer views of it for apps, but that sounds like it could be more trouble than it's worth. Also note that for privacy and performance reasons we avoid tracking correlations between stats, which I think is a good policy. -DavidSmith
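As a rough illustration of that no-correlation policy, the Python sketch below tallies each field independently, so the server can answer "what percentage of users are on version X" without recording which OS version arrived alongside which app version. The field names and the functions `aggregate` and `percentages` are assumptions made for this example, not a description of Adium's actual stats server.

```python
from collections import Counter, defaultdict


def aggregate(reports):
    """Tally each field on its own; combinations of fields are never stored."""
    totals = defaultdict(Counter)
    for report in reports:
        for field, value in report.items():
            totals[field][value] += 1
    return totals


def percentages(counter):
    """Convert one field's counts into percentages of all reports for that field."""
    total = sum(counter.values())
    return {value: 100.0 * count / total for value, count in counter.items()}


# Hypothetical sample reports.
reports = [
    {"app_version": "1.2", "os_version": "10.5"},
    {"app_version": "1.2", "os_version": "10.4"},
    {"app_version": "1.1", "os_version": "10.5"},
    {"app_version": "1.2", "os_version": "10.5"},
]
totals = aggregate(reports)
version_share = percentages(totals["app_version"])
```

Once the individual reports are discarded, only the per-field counters remain, which is also cheaper to store and query than full per-user records.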

I think statistical data collection should be included in Sparkle by default, but I also think that the user's permission should be obtained first. I don't know how you intend to distribute the framework if it isn't going to be included on a per-application basis, but if you are using an Installer.app package, then including an additional stage in the installation to ask permission to send stats isn't that heinous. If the demographics are to be of any use to the developers using Sparkle, they would need to be retrievable on a per-app basis. They could also reveal some pretty interesting patterns! -- KeithDuncan

This is a delicate legal issue. The handling of personal data (and the definition of 'personal data') differs between countries. There is probably no approach that pleases everyone. The only safe route is not to collect any data at all. So I suggest you at least provide some means for a specific application to disable the retrieval of 'statistical' information. (In case you want to dive into details, you may start here: http://de.wikipedia.org/wiki/Bundesdatenschutzgesetz (German) - I recommend asking your lawyer to interpret it, though. ;-)) --AndreasM