January 7, 2012

The future of the App Store is source-only distribution

Following up on the ClearSky idea, let's look at what the applications that run on your and your friends' computers will be like.

Obviously, the Apple app store model of cryptographic signing is useless as a measure of trust in what an application does in a decentralized scenario (note that this is different from using signing to establish authenticity during application distribution). It is actually useless for Apple app store apps as well - I know of at least two apps that made it into the Apple app store that keep an open, non-password-protected telnet port on your iPhone. So centralized quality control does not work.

Will virtualization be able to solve the safety problem? What virtualization is really about is names and meta-machines. In the case of running a VMware virtual machine on your real PC, your real PC is the meta-machine from the point of view of the VMware one, and memory addresses are the names. As long as the software in the VMware box has no access to the memory addresses belonging to your real PC, it cannot escape and do anything that the VMware virtual machine cannot already do (you can think of virtualization as the plot of The Matrix).
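To make the "names" idea concrete, here is a toy sketch in Python. It is not how VMware works (real hypervisors lean on hardware address translation), and the class names HostMemory and GuestMemory are made up for illustration. The guest only ever sees addresses 0..size-1, and every one of them is translated into a fixed slice of the host's memory, so the guest has no name for anything outside that slice.

```python
class HostMemory:
    """The 'real' machine (the meta-machine): a flat array of bytes."""

    def __init__(self, size):
        self.cells = bytearray(size)


class GuestMemory:
    """A guest's view of memory (hypothetical toy model).

    Guest addresses 0..size-1 map to host addresses base..base+size-1.
    """

    def __init__(self, host, base, size):
        if base < 0 or size < 0 or base + size > len(host.cells):
            raise ValueError("guest region must lie inside host memory")
        self._host = host
        self._base = base
        self._size = size

    def _translate(self, addr):
        # The only way to reach host memory is through this check.
        if not 0 <= addr < self._size:
            raise IndexError("guest address out of range")
        return self._base + addr

    def read(self, addr):
        return self._host.cells[self._translate(addr)]

    def write(self, addr, value):
        self._host.cells[self._translate(addr)] = value
```

Python itself does not enforce the boundary (nothing stops code from touching `_host` directly), so treat this as a picture of the idea, not a sandbox.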

Even if the virtualization is unbreakable and the virtual machine does everything you need without posing a danger to your data (and how would it do that if it needs to modify your data to be useful?), you cannot tell whether the app that the virtual machine runs has secret back-doors or information-leaking channels (both can be accomplished using steganography to hide things in the data the app consumes and produces as part of its regular operation). Checking compiled binaries for these is simply not feasible.
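As a rough illustration of such a leaking channel, here is a minimal Python sketch of least-significant-bit steganography. The function names embed and extract are my own, and the "cover" stands in for any data an app legitimately produces (pixel values, audio samples). Each output byte differs from the original by at most 1, yet the output carries a hidden message.

```python
def embed(cover: bytes, secret: bytes) -> bytes:
    """Hide `secret` in the lowest bit of each byte of `cover`."""
    # Most significant bit of each secret byte goes first.
    bits = [(byte >> i) & 1 for byte in secret for i in range(7, -1, -1)]
    if len(bits) > len(cover):
        raise ValueError("cover is too small for this secret")
    out = bytearray(cover)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # overwrite only the lowest bit
    return bytes(out)


def extract(stego: bytes, n_bytes: int) -> bytes:
    """Recover `n_bytes` of hidden data from the lowest bits of `stego`."""
    result = bytearray()
    for b in range(n_bytes):
        value = 0
        for i in range(8):
            value = (value << 1) | (stego[b * 8 + i] & 1)
        result.append(value)
    return bytes(result)
```

Nothing here needs to break out of the virtual machine: the app only writes the output it was supposed to write anyway.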

The only alternative that is left is the one advocated by Richard Stallman - you can't trust a program unless you can read its source code. While it is still possible to hide back-doors in source code, it requires a very large and extremely messy code base to hide them effectively. And very large and extremely messy code bases tend to lead to shit applications that no one really wants to use (unless corporate IT makes you). Applications with clean code bases that are easy to audit are nice; use them.

Note that this model of software distribution (source-only) is not new - in fact, it has been the most popular method of software distribution for the last 10 years. This is how JavaScript works. And JavaScript has also shown that language-level virtualization (i.e., "sandboxing") is extremely effective as a security mechanism - there have been very few exploits where an attacker was able to escalate privileges outside of the JavaScript virtual machine.

There are two ways to get around the "easily auditable source" ideal with JavaScript. The most popular way is to use code obfuscation tools. The other way (which in the past year has gained more and more recognition) is to use JavaScript itself to implement another virtual machine (kind of like Russian nesting dolls).
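The nesting-dolls trick is easy to see in miniature. Below is a toy stack machine, sketched in Python for brevity (the technique discussed above does the same thing in JavaScript); the instruction names push, add and mul are invented for this example. You can audit every line of the interpreter and still learn nothing about what a given program does, because the program arrives as plain data.

```python
def run(program):
    """Interpret a list of (op, arg) pairs on a tiny stack machine."""
    stack = []
    for op, arg in program:
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# The "real" application is just data handed to the interpreter.
program = [
    ("push", 2),
    ("push", 3),
    ("add", None),
    ("push", 4),
    ("mul", None),
]
print(run(program))  # computes (2 + 3) * 4, prints [20]
```

The auditable source is the interpreter; the behaviour lives in the program, which could just as well be downloaded at run time.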

While there is no way to prevent these two techniques (and in fact it is undesirable to do so; but it doesn't stop Apple from trying), you should certainly have the freedom to use applications written and audited by people you trust. The centralized, signed app store approach used by Apple destroys this continuum of trust by putting all applications on an equal level - "approved by Apple" doesn't mean much when the approval process is secretive, arbitrary, and does not guarantee quality or security.

Are there any other benefits to source-only distribution besides security? Plenty: complete portability, high performance (compile to native code), tiny download sizes, easy dependency management, etc.

One way to encourage the ideal of "easily auditable source" is via licensing. This is where the innovation of Henry Poole comes in handy - the Affero GPL is the most business-friendly of all Free Software licenses (more on that in a later post) and when used effectively will enable disruptive new business models. The "platform" part of the hypothetical ClearSky virtual machine will benefit tremendously from being licensed under the AGPL.
