In this post, I would like to highlight a few insights from a paper that I wrote together with a colleague of mine, Stefan Kolb. The paper is titled Application Migration Effort in the Cloud – The Case of Cloud Platforms and has been accepted at the 8th IEEE International Conference on Cloud Computing (CLOUD). I’m writing this on rather short notice, since the conference starts this week in New York City. As first author, Stefan will be there to present the paper. I will be there as well, since I was lucky enough to receive an IEEE Student Travel Award for the conference.
But on to the paper! First of all, you can find a preprint version here. In a nutshell, the paper describes a case study of porting a real-world application among different PaaS vendors. This is mainly Stefan’s work, and my contribution stems from applying my measurement framework for installability to measure application deployability in the PaaS setting. Application portability and migration in cloud environments have been shifting more and more into the focus of research in recent years. By now, many vendors are trying to position themselves in the platform market, and many studies confirm performance and scalability benefits if you develop your application on a cloud platform. The problem is that most platforms are mutually incompatible, so it is really a great opportunity for vendors to lock you into their environment forever and ever. No sane user is happy about a lock-in effect, and that is why research on application portability in cloud environments is important.
The Application, PaaS Clouds and Yard
So what we did in the paper was to take the Blinkist web application by Blinks Labs GmbH and migrate it to seven PaaS vendors, namely IBM Bluemix, cloudControl, AWS Elastic Beanstalk, EngineYard, Heroku, OpenShift, and Pivotal Web Services. We selected the vendors based on the application’s requirements (Ruby 2.0.0 and Rails 4) using PaaSify.it, which is the most comprehensive and up-to-date database of PaaS cloud vendors and their capabilities. It was also created by Stefan, who actively maintains it. As a next step, Stefan migrated the application and automated its deployment to each of the above-mentioned platform vendors. Now, it wouldn’t be his work if it didn’t turn into a useful tool as a by-product: Lo and behold, Yard, a Docker-based mini deployment system, lets you streamline the deployment of your application to any of the mentioned cloud platforms.
Based on this setting, we repeated the deployment process a hundred times and calculated different deployability metrics. These metrics are all based on source code or its execution behavior at run-time. I won’t describe them in detail here, but you can find their definition in the paper. I’ll also skip presenting raw data, which is contained in the paper as well. Instead, I’ll highlight two, in my opinion, interesting findings.
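To give a rough idea of what such a repeated measurement can look like, here is a minimal Python sketch. It is not the tooling we used for the paper. The names measure_deployments and summarize are made up for this post, and the deploy callable is a placeholder for whatever triggers a deployment on a given platform.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure_deployments(deploy: Callable[[], None], runs: int = 100) -> List[float]:
    """Call `deploy` `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        deploy()  # hypothetical: blocks until the application is up and running
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: List[float]) -> Dict[str, float]:
    """Aggregate per-run durations into a few descriptive statistics."""
    return {
        "runs": len(durations),
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        # The sample standard deviation needs at least two data points.
        "stdev": statistics.stdev(durations) if len(durations) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Stand-in for a real deployment: just sleep for a moment.
    fake_deploy = lambda: time.sleep(0.01)
    print(summarize(measure_deployments(fake_deploy, runs=3)))
```

In a real setup, deploy would wrap the platform-specific deployment commands, and you would also want to record failed runs separately instead of letting an exception abort the whole series.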
Container-Based PaaS Systems Deploy Much Faster Than VM-Based Systems
All container-based PaaS platforms in the case study deployed faster than the virtual machine-based ones. “Faster” means that the duration from provisioning the platform to having a running application available was much shorter for container-based solutions. The performance differences are quite large. On average, deployment on container-based systems was three times as fast as on VM-based systems. The time scale we are talking about is minutes, from around 6 minutes on average for the fastest container-based PaaS to around 30 minutes on average for the slowest VM-based PaaS. These performance differences do not originate from VM startup as opposed to container startup, but really from application deployment itself. Even if we subtract startup times from the comparison, the difference between container-based and VM-based systems stays the same.
“The performance of the initial setup does not matter, because you only do it once”, I hear you say? We also did the same measurements for redeployment, that is, without the initial provisioning phase. On average, all container-based PaaS systems are still faster than all VM-based systems, although by a much smaller factor. Nevertheless, setup performance is a strong argument in favor of container-based PaaS systems.
Application Deployment on Container-Based PaaS Systems Requires Less Effort
The difference between container-based systems and VM-based systems can be seen not only in performance, but also in the effort required to achieve deployment. We measured effort in terms of the artifacts needed to achieve deployment. We count the size of the deployment scripts you have to write, the number of parameters you need to configure these scripts, the number of source code lines you have to replace for an environment, and the complexity of the application packages you need to build for deployment. Taking everything together, all container-based PaaS systems again require smaller scripts, fewer input parameters, fewer source code changes, and less complex archives than VM-based systems.
There is a big BUT in this observation. In a VM-based system, you have more control over your environment. This means that you can adjust it more easily to your needs, but you also have to do more of the initial adjustment yourself, as seen above. This can be both an advantage and a disadvantage, and really depends on your actual use case and whether you need fine-grained control over your environment or not.
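To make this kind of artifact counting a bit more tangible, here is a small Python sketch. It is only an illustration under simplifying assumptions and does not reproduce the metric definitions from the paper. All names in it (DeploymentEffort, script_size, changed_lines, measure_effort) are invented for this post, and package complexity is left out entirely.

```python
import difflib
from dataclasses import dataclass
from typing import List


@dataclass
class DeploymentEffort:
    script_lines: int          # size of the deployment script
    parameters: int            # number of parameters the script needs
    changed_source_lines: int  # source lines touched for this platform


def script_size(script: str) -> int:
    """Count non-blank lines that do not start with '#'.

    Assumes shell-style comments; a shebang line is skipped as well.
    """
    return sum(
        1
        for line in script.splitlines()
        if line.strip() and not line.strip().startswith("#")
    )


def changed_lines(before: str, after: str) -> int:
    """Count lines removed from `before` plus lines added in `after`.

    A replaced line therefore counts twice: once removed, once added.
    """
    diff = difflib.ndiff(before.splitlines(), after.splitlines())
    return sum(1 for line in diff if line.startswith(("- ", "+ ")))


def measure_effort(
    script: str, parameters: List[str], before: str, after: str
) -> DeploymentEffort:
    """Collect the three simplified effort counts for one platform."""
    return DeploymentEffort(
        script_lines=script_size(script),
        parameters=len(parameters),
        changed_source_lines=changed_lines(before, after),
    )
```

Comparing the resulting counts across platforms gives a simple, repeatable proxy for how much work a deployment takes, although which lines and parameters you count is of course a design decision.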
There are more insights in the paper that I did not discuss here, such as deployment reliability or pricing. I hope I got you interested and encourage you to look at the paper itself.
Finally, it is clear that we won’t stop at this point, and there are many more things to do when it comes to application portability in cloud environments. In the paper at hand, we looked at a RoR application, but are the results really generalizable to different ecosystems? We just considered the portability of the application itself, but what about the management APIs of the different systems? Can these be unified? And what about the actual performance of the application during operation, and not just deployment, on the different platforms?
Stay tuned for more to come!