ZeroBanana: Software for Humans


OpenStack Orchestration Juno Update

As the Juno (2014.2) development cycle ramps up, now is a good time to review the changes we saw in Heat during the preceding Icehouse (2014.1) cycle and have a look at what is coming up next in the pipeline. This update is also available as a webinar that I recorded for the OpenStack Foundation, as are the other PTL updates. The RDO project is collecting a list of written updates like this one.

While absolute statistics are not always particularly relevant, a comparison between the Havana and Icehouse release cycles shows that the Heat project continues to grow rapidly. In fact, Heat was second only to Nova in number of commits for the Icehouse release. As well as building contributor depth, we are rotating the PTL position to build leadership depth, so the project is in very healthy shape.

Changes in Icehouse

The biggest change in Icehouse is the addition of software configuration and deployment resource types. These enable template authors to define software configurations separately from the servers on which they are to be deployed. Among other things, this makes artifacts much easier to reuse. Software deployments can integrate with your existing configuration management tools - in some cases the shims to do so are already available, and we expect to add more during the Juno cycle.
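To make the separation concrete, here is a minimal sketch of a template in which the configuration, the server and the deployment that ties them together are three distinct resources. It is built as a Python dictionary purely for illustration (templates are normally written in YAML). The resource names, the image and flavor values and the script body are placeholders of my own; only the resource types and property names are Heat's.

```python
import json

# Every resource name, the image/flavor values and the script body are
# placeholders chosen for illustration; only the resource types are Heat's.
template = {
    "heat_template_version": "2013-05-23",
    "resources": {
        # Defined once, independently of any server, so it can be reused.
        "install_app": {
            "type": "OS::Heat::SoftwareConfig",
            "properties": {
                "group": "script",
                "inputs": [{"name": "greeting"}],
                "config": '#!/bin/sh\necho "$greeting" > /tmp/greeting\n',
            },
        },
        # The server knows nothing about the software it will run.
        "app_server": {
            "type": "OS::Nova::Server",
            "properties": {
                "image": "my-image",
                "flavor": "m1.small",
                "user_data_format": "SOFTWARE_CONFIG",
            },
        },
        # The deployment binds one configuration to one server.
        "deploy_app": {
            "type": "OS::Heat::SoftwareDeployment",
            "properties": {
                "config": {"get_resource": "install_app"},
                "server": {"get_resource": "app_server"},
                "input_values": {"greeting": "hello"},
            },
        },
    },
}


def deployments_using(tmpl, config_name):
    """Return the names of deployments that reference the given config."""
    return sorted(
        name
        for name, res in tmpl["resources"].items()
        if res["type"] == "OS::Heat::SoftwareDeployment"
        and res["properties"]["config"] == {"get_resource": config_name}
    )


if __name__ == "__main__":
    print(json.dumps(template, indent=2))
```

Deploying the same configuration to a second server would mean adding only another server and another deployment resource; the configuration itself stays untouched.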

The Heat Orchestration Template format (HOT) is now frozen at version 2013-05-23. Any breaking changes we make to it in future will be accompanied by a bump in the version number, so you can start using the HOT format with confidence that templates should continue to work in the future.

In order to enable that, template formats and the intrinsic functions that they provide are now pluggable. In Icehouse this is effectively limited to different versions of the existing template types, but in future operators will be able to easily deploy arbitrary template format plugins.
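The idea of tying the available intrinsic functions to the template version can be sketched in a few lines. This is a toy model and not Heat's actual plugin interface: the registry, the function signatures and the `resolve` helper are all my own assumptions, and only the names `get_param` and `str_replace` come from the template format itself.

```python
def get_param(args, context):
    """Look up a parameter value by name."""
    return context["parameters"][args]


def str_replace(args, context):
    """Substitute each key of args['params'] into args['template']."""
    text = args["template"]
    for placeholder, value in args["params"].items():
        text = text.replace(placeholder, str(value))
    return text


# Hypothetical registry: template version -> intrinsic functions it offers.
# A new template version would simply add another entry here.
FUNCTIONS_BY_VERSION = {
    "2013-05-23": {"get_param": get_param, "str_replace": str_replace},
}


def functions_for(template):
    """Pick the function set matching the template's declared version."""
    version = template.get("heat_template_version")
    try:
        return FUNCTIONS_BY_VERSION[version]
    except KeyError:
        raise ValueError("unsupported template version: %r" % (version,))


def resolve(snippet, functions, context):
    """Recursively evaluate single-key maps that name a known function."""
    if isinstance(snippet, dict):
        if len(snippet) == 1:
            (name, args), = snippet.items()
            if name in functions:
                resolved_args = resolve(args, functions, context)
                return functions[name](resolved_args, context)
        return {k: resolve(v, functions, context) for k, v in snippet.items()}
    if isinstance(snippet, list):
        return [resolve(item, functions, context) for item in snippet]
    return snippet
```

Because the function set is looked up by version, a template that declares an old version keeps its old semantics even after newer versions add or change functions.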

Heat now offers custom parameter constraints - for example, you can specify that a parameter must name a valid Glance image - that provide earlier and better error messages to template users. These are also pluggable, so operators can deploy their own, and more will be added in the future.
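As a rough illustration, the first part of the sketch below shows how a parameter declares such a constraint (`glance.image` is the constraint name; the parameter name is a placeholder). The second part is a toy validator of my own, with a hard-coded set standing in for the image service, to show why checking up front gives a better error than a failed server build. It is not how Heat implements constraints.

```python
# Template fragment: the "image" parameter must name a valid Glance image.
parameters = {
    "image": {
        "type": "string",
        "description": "Image to boot the server from",
        "constraints": [{"custom_constraint": "glance.image"}],
    },
}

# Hypothetical stand-in for a lookup against the image service.
KNOWN_IMAGES = {"fedora-20", "ubuntu-14.04"}

# Hypothetical registry of pluggable checks, keyed by constraint name.
CUSTOM_CONSTRAINTS = {
    "glance.image": lambda value: value in KNOWN_IMAGES,
}


def validate(parameters, values):
    """Return a list of error messages; an empty list means all is well."""
    errors = []
    for name, schema in parameters.items():
        for constraint in schema.get("constraints", []):
            kind = constraint.get("custom_constraint")
            check = CUSTOM_CONSTRAINTS.get(kind)
            if check is not None and not check(values[name]):
                errors.append(
                    "parameter %r: value %r fails constraint %r"
                    % (name, values[name], kind)
                )
    return errors


if __name__ == "__main__":
    print(validate(parameters, {"image": "no-such-image"}))
```

An operator-supplied constraint would, in this model, just be one more entry in the registry.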

There are now OpenStack-native resource types for autoscaling, meaning that you can now scale resource types other than AWS::EC2::Instance. In fact, you can scale not just OS::Nova::Server resources, but any type of resource (including provider resources). Eventually there will be a separate API for scaling groups along the lines of these new resource types.
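A scaling group wraps a definition of the resource to be scaled, which is what allows it to scale anything. The helpers below are a sketch of my own that builds the two native resources as Python dictionaries; the helper names, default values and the `my_app_tier.yaml` provider template are hypothetical, while the resource types and property names are those of the native resources.

```python
def scaling_group(resource_type, properties, min_size, max_size):
    """Build a native scaling group around an arbitrary resource type."""
    return {
        "type": "OS::Heat::AutoScalingGroup",
        "properties": {
            "min_size": min_size,
            "max_size": max_size,
            # The scaled unit: any resource type, not just a server.
            "resource": {"type": resource_type, "properties": properties},
        },
    }


def scale_up_policy(group_name, step=1, cooldown=60):
    """Build a policy that grows the named group by `step` members."""
    return {
        "type": "OS::Heat::ScalingPolicy",
        "properties": {
            "adjustment_type": "change_in_capacity",
            "auto_scaling_group_id": {"get_resource": group_name},
            "scaling_adjustment": step,
            "cooldown": cooldown,
        },
    }


resources = {
    # Scaling plain servers (image and flavor values are placeholders).
    "web_group": scaling_group(
        "OS::Nova::Server",
        {"image": "my-image", "flavor": "m1.small"},
        min_size=1,
        max_size=5,
    ),
    # Scaling a provider resource: a whole template treated as one unit.
    # "my_app_tier.yaml" is a hypothetical provider template.
    "tier_group": scaling_group("my_app_tier.yaml", {}, min_size=1, max_size=3),
    "web_scale_up": scale_up_policy("web_group"),
}
```

The policy refers to its group by resource name, so an alarm that triggers the policy never needs to know what kind of resource is being scaled.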

The heat-engine process is now horizontally scalable (though not yet stateless). Each stack is processed by a single engine at a time, but incoming requests can be spread across multiple engines. (The heat-api processes, of course, are stateless and have always been horizontally scalable.)
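The rule that only one engine works on a given stack at a time, while different stacks can be handled by different engines, amounts to a per-stack lock. Here is a deliberately simplified, in-memory sketch of that idea. The class and method names are my own; a real deployment needs the lock to live somewhere all engines can see, which this toy does not attempt.

```python
import threading


class StackLocks:
    """Toy in-memory stand-in for a shared lock table keyed by stack ID."""

    def __init__(self):
        self._owners = {}  # stack ID -> ID of the engine holding the lock
        self._mutex = threading.Lock()

    def try_acquire(self, stack_id, engine_id):
        """Claim the stack for this engine; return False if another holds it."""
        with self._mutex:
            owner = self._owners.setdefault(stack_id, engine_id)
            return owner == engine_id

    def release(self, stack_id, engine_id):
        """Give up the stack, but only if this engine actually holds it."""
        with self._mutex:
            if self._owners.get(stack_id) == engine_id:
                del self._owners[stack_id]


if __name__ == "__main__":
    locks = StackLocks()
    print(locks.try_acquire("stack-1", "engine-A"))
    print(locks.try_acquire("stack-1", "engine-B"))
    print(locks.try_acquire("stack-2", "engine-B"))
```

In this model a request for a locked stack is simply refused; requests for other stacks proceed on whichever engine receives them.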

The API is growing additions to help operators manage a Heat deployment - for example to allow a cloud administrator to get a list of all stacks created by all users in Heat. These improvements will continue into Juno, and will eventually result in a v2 API to tidy up some legacy cruft.

Finally, Heat no longer requires a user to be an administrator in order to create some types of resources. Previously resources like wait conditions required the admin role, because they involved creation of a user with limited access that could authenticate to post data back to Heat. Creating a user requires admin rights, but in Icehouse Heat creates the user itself in a separate domain to avoid this problem.

Juno Roadmap

Software configurations made their debut in Icehouse, and will get more powerful still in Juno. Template authors will be able to specify scripts to handle all of the stages of an application’s life-cycle, including delete, suspend/resume, and update.

Up until now if the creation of a stack or the rollback of an update failed, or if an update failed with rollback disabled, there was nothing further you could do with the stack apart from delete it. In Juno this will finally change - you will be able to recover from a failure by doing another stack update.

There also needs to be a way to cancel a stack update that is still in progress, and we plan to introduce a new API for that.

We are working toward making autoscaling more robust for applications that are not quite stateless (examples include TripleO and Platforms as a Service like OpenShift). The plan is to allow notifications prior to modifying resources to give the application the chance to quiesce the server (this will probably be extended to all resources managed by Heat), and also to allow the application to have a say in which nodes get removed on scaling down.

At the moment, Heat relies very heavily on polling to detect changes in the state of resources (for example, while a Nova server is being built). In Juno, Heat will start listening for notifications to reduce the overhead involved in polling. (Polling is unlikely to go away altogether, but it can be reduced markedly.) In the long term, beyond the Juno horizon, this is leading to continuous monitoring of a stack’s status, but for now we are laying down the foundations.

There will also be other performance improvements, particularly with respect to database access. TripleO relies on Heat and has some audacious goals for deployment sizes, so that is driving performance improvements for all users. We can now profile Heat using the Rally project, so that should help us to identify more bottlenecks.

In Juno, Heat will gain an OpenStack-native Heat stack resource type, and it will be capable of deploying nested stacks in remote regions. That will allow users to deploy multi-region applications using a single tree of nested stacks.
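A sketch of what such a tree might look like: one parent template containing a nested stack per region, each deploying the same child template. The `OS::Heat::Stack` type with its `context` and `template` properties is shown as I understand the design; since the feature is still being developed, treat the property names as assumptions. The region names, resource names and the child template's contents are placeholders.

```python
import json

# Placeholder child template, deployed once per region.
child = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "server": {
            "type": "OS::Nova::Server",
            "properties": {"image": "my-image", "flavor": "m1.small"},
        },
    },
}


def regional_stack(region_name, child_template):
    """Build a nested stack resource targeted at one region."""
    return {
        "type": "OS::Heat::Stack",
        "properties": {
            # Assumed property: selects the region the stack is created in.
            "context": {"region_name": region_name},
            # The child template is passed as a string.
            "template": json.dumps(child_template),
        },
    }


# One parent template, one nested stack per (placeholder) region name.
parent = {
    "heat_template_version": "2013-05-23",
    "resources": {
        "app_" + region: regional_stack(region, child)
        for region in ("RegionOne", "RegionTwo")
    },
}
```

Creating the parent stack in one region would then fan out to all of the others, and deleting it would clean them all up again.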

Adopting and abandoning stack resources makes it possible to transition existing applications to and from Heat’s control. These features are actually available already in Icehouse, but they are still fairly rough around the edges; we hope they will be cleaned up for Juno. This is always going to be a fairly risky operation to perform manually, but it provides a viable option for automatic migrations (Trove is one potential user).

Operations Considerations

There are a few changes in the pipeline that OpenStack operators should take note of when planning their future upgrades.

Perhaps the most pressing is version 3 of the Keystone API. Heat increasingly relies on features available only in the v3 API. While there is a v2 shim to allow basic functionality to work without it for now, operators should look to start testing and deploying the v3 API alongside v2 as soon as possible.

Heat has now adopted the released Oslo messaging library for RPC messages (previously it used the Oslo incubator code). This may require some configuration changes, so operators should be aware of it when upgrading to Juno.

Finally, we expect the Heat engine to begin splitting into multiple servers. The first one is likely to be an “observer” process tasked with listening for notifications, but expect more to follow as we distribute the workload more evenly across systems. We expect everything split out from the Heat engine to be horizontally scalable from the beginning.


OpenStack Orchestration and Configuration Management

At the last OpenStack Summit in Hong Kong, I had a chance meeting in the hallway with a prominent Open Source developer, who mentioned that he would only be interested in Heat once it could replace Puppet. I was slightly shocked by that, because it is the stated goal of the Heat team not to compete with configuration management tools—on the principle that a good cloud platform will not dictate which configuration management tool you use, and nor will a good configuration management tool dictate which cloud platform you use. Clearly some better communication of our aims is required.

There is actually one sense in which Heat could be seen to replace configuration management: the case where the configuration on a (virtual) machine never changes, and therefore requires no management. In an ideal world, cloud applications are horizontally scalable and completely stateless so, rather than painstakingly updating the configuration of a particular machine, you simply kill it and replace it with a new one that has the configuration you want. Preferably not in that order. However, I do not see this as a core part of the value that orchestration provides, although orchestration can certainly make the process easier. What enables this approach is the architecture of the application combined with the self-service, on-demand nature of an IaaS cloud.

Take a look at the example templates provided by the Heat project and you will find a lot of ways to spin up WordPress. WordPress makes for a great demo, because you can see the result of the process in a very tangible way. The downside is that it may be misleading people about what Heat is and how it adds value.

It would be easy to imagine that Heat is simply a service for provisioning servers and configuring the software on them, but that is actually the least-interesting part to me. There are many tools that will do that (Puppet, Juju, &c.); what they cannot do is to orchestrate the interactions among all of the OpenStack infrastructure in an application. That part is unique to Heat, and it is what allows you to treat your infrastructure configuration as code in the same way that configuration management allows you to treat your software configuration as code.

Figure: Diagram of the solution spaces covered by orchestration and configuration management tools.

I am sometimes asked “Why should I use Heat instead of Puppet?” If you are asking that question then my answer is that you should probably use both. (In fact, Heat is actually a great way to deploy both the Puppet master and any servers under its control.) Heat allows you to manage the configuration of your virtual infrastructure over time, but you still need a strategy for managing the software configuration of your servers over time. It might be that you pre-build golden images and just discard a server when you want to update it, but equally you might want to use a traditional configuration management tool.

With the addition of the Software Deployments feature in the recent Icehouse (2014.1) release, Heat has moved into the software orchestration space. This makes it easier to define and combine software components in a modular way. It also creates a cleaner interface at which to inject parameters obtained from infrastructure components (e.g. the IP address of the database server you need to talk to). That notwithstanding, Heat remains agnostic about where that data goes, with a goal of supporting any configuration management system, including those that have yet to be invented and those that you rolled yourself.
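The database example can be sketched as follows: the deployment's input is wired to an attribute of an infrastructure resource, and Heat passes the value along without caring what consumes it. The resource names, image and flavor values and the script are placeholders of my own, and the `first_address` attribute is used on the assumption that the server resource exposes it. The small helper at the end simply lists which resources a deployment depends on.

```python
resources = {
    "db_server": {
        "type": "OS::Nova::Server",
        "properties": {"image": "my-db-image", "flavor": "m1.medium"},
    },
    "web_server": {
        "type": "OS::Nova::Server",
        "properties": {
            "image": "my-web-image",
            "flavor": "m1.small",
            "user_data_format": "SOFTWARE_CONFIG",
        },
    },
    "web_config": {
        "type": "OS::Heat::SoftwareConfig",
        "properties": {
            "group": "script",
            "inputs": [{"name": "db_host"}],
            "config": '#!/bin/sh\necho "db=$db_host" > /etc/myapp.conf\n',
        },
    },
    "web_deployment": {
        "type": "OS::Heat::SoftwareDeployment",
        "properties": {
            "config": {"get_resource": "web_config"},
            "server": {"get_resource": "web_server"},
            # Infrastructure data injected at the deployment boundary.
            "input_values": {
                "db_host": {"get_attr": ["db_server", "first_address"]},
            },
        },
    },
}


def referenced_resources(snippet):
    """Collect resource names reached via get_resource or get_attr."""
    found = set()
    if isinstance(snippet, dict):
        for key, value in snippet.items():
            if key == "get_resource" and isinstance(value, str):
                found.add(value)
            elif key == "get_attr" and isinstance(value, list) and value:
                found.add(value[0])
            else:
                found |= referenced_resources(value)
    elif isinstance(snippet, list):
        for item in snippet:
            found |= referenced_resources(item)
    return found
```

Swapping the shell script for a Puppet manifest or a Chef recipe would change only the configuration resource; the wiring from the database server to the `db_host` input stays the same.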

If you would like to hear more about this with an antipodean accent, I will be speaking about it at the OpenStack Summit in Atlanta on Monday, in a talk with Steve Hardy entitled ‘Introduction to OpenStack Orchestration’. I plan to talk about why you should consider using Heat to deploy your applications, and Steve will show you how to get started.

Our colleague Steve Baker will be speaking (also with an antipodean accent) about ‘Application Software Configuration Using Heat’ on Tuesday.


OpenStack and Platforms as a Service

The long-term relationship between Platforms as a Service and OpenStack has been the subject of much hand-wringing (most of it in the media) over the past month or so. The ongoing expansion of the project has many folks wondering where exactly the dividing line between OpenStack and its surrounding ecosystem will be drawn, and the announcement of the Solum related project has fuelled speculation that the scope will grow to encompass PaaS.

One particular clarification is urgently needed: Solum is not endorsed in any way by the OpenStack project. The process for that to happen is well-defined and requires, amongst other criteria, that the implementation is mature. Solum as announced comprised exactly zero lines of code, since the backers wisely elected to develop in the open from the beginning.

More subtly, my impression (after attending the Solum session at the OpenStack Summit two weeks ago and speaking to many of the folks involved in starting the project) is that Solum is not intended to be a PaaS as such. I have long been on record as saying that a PaaS is one of the few cloud-related technologies that do not belong in OpenStack. My reason is simple: OpenStack should not anoint one platform or class of platforms when there are so many possible platforms. Today’s PaaS systems offer many web application platforms as a service: you can get Ruby web application platforms and Java web application platforms and Python web application platforms… just about any kind of platform you like, so long as it’s a web application platform. That was the obvious first choice for PaaS offerings to target, but there are plenty of niches that could also use their own platforms. For example, our friends (and early adopters of Heat) at XLcloud are building an open source PaaS for high-performance computing applications.

Though Solum is still in the design phase, I expect it to be much less opinionated than a PaaS. Solum, in essence, is the ‘as-a-Service’ part of Platform as a Service. In other words, it aims to provide the building blocks to deliver any platform as a service on top of OpenStack with a consistent API (no doubt based on the OASIS CAMP standard). It seems clear to me that, by commoditising the building blocks for a PaaS, this is likely to be a catalyst for many more platforms to be built on OpenStack. I do not think it will damage the ecosystem at all, and clearly neither do a lot of PaaS vendors who are involved with Solum, such as ActiveState (who are prominent contributors to and users of Cloud Foundry) and Red Hat’s OpenShift team.

Assuming that it develops along these lines, if OpenStack were to eventually reject Solum from incubation solely for reasons of scope it would call into question the relevance of OpenStack more than it would the relevance of Solum. Solum’s trajectory toward success or failure will be determined by the strength of its community well in advance of it being in a position to apply for incubation.

Finally, I would like to clarify the relationship between Heat and PaaS. The Heat team have long stated that one of our goals is to provide the best infrastructure orchestration with which to deploy a PaaS. We have no desire for Heat to include PaaS functionality, and we rejected a suggestion to implement CAMP in Heat when it was floated at the Havana Design Summit.

One of the development priorities for the Icehouse cycle, the Software Configuration Provider blueprint, is actually aimed at feature-parity with a different OASIS standard, TOSCA. We are working on it simply because the Heat team went to the Havana Design Summit in Portland and every user we spoke to there asked us to. The proposed features promise to make Heat more useful for deploying enterprise applications, platforms as a service, Hadoop and other complex workloads.


An Introduction to Heat in Frankfurt

It was my privilege to attend the inaugural Frankfurt OpenStack Meetup last night in… well, Frankfurt (am Main, not the other one). It was great to meet such a diverse set of OpenStack users, from major companies to students and everywhere in between.

I gave a talk entitled ‘OpenStack Orchestration with Heat’, and for those who missed it, that link will take you to a handout which covers all of the material:

An introduction to the OpenStack Orchestration project, Heat, and an explanation of how orchestration can help simplify the deployment and management of your cloud application by allowing you to represent infrastructure as code. Some of the major new features in Havana are also covered, along with a preview of development plans for Icehouse.

Thanks are due to the organisers (principally, Frederik Bijlsma), my fellow presenter Rhys Oxenham, and especially everyone who attended and brought such excellent questions. I am confident that this was the first of many productive meetings for this group.


Hadoop on OpenStack

The latest project to be incubated in OpenStack for the Icehouse (2014.1) release cycle is Savanna, which provides MapReduce as a service using Apache Hadoop. Savanna was approved for incubation by the OpenStack Technical Committee in a vote last night.

In what is becoming a recurring theme, much of the discussion centred around potential overlap with other programs—specifically Heat (orchestration) and Trove (database provisioning). The main goal of Savanna should be to provide a MapReduce API, but in order to do so it has to implement a cluster provisioning service as well.

The Savanna team have done a fair amount of work to determine that Hadoop is too complex a workload for Heat to handle at present, but they have not approached the Heat team about closing the gap. That is unfortunate, because we are currently engaged in an effort to extend Heat to more complex workloads, and Hadoop is a canonical example of the kind of thing we would like to support. (It is doubly unfortunate, given that the obstacles cited appear comparatively minor.) This will have to change, because there was universal agreement that Savanna should move to integrating with Heat rather than roll-your-own orchestration.

The final form of any integration with Trove, however, remains unclear. The Savanna team maintain that there is no overlap because Trove provides a database as a service and Hadoop is not a database, but this is too glib for my liking. Trove is essentially a pretty generic provisioning service, and while its user-facing function is to provision databases, that would be a poor excuse for maintaining multiple provisioning implementations in OpenStack. And, while it would be wrong to describe Hadoop as a database per se, it would be fair to say that Hadoop has a database. Trove is already planning a clustering API. In my opinion, the two teams will need to work together to come up with a common implementation, whether in the form of a common library, a common service or a direct dependency on Trove.

The idea of allowing Savanna to remain part of the wider OpenStack ecosystem without officially adopting it was, of course, considered. Hadoop can be considered part of the Platform rather than the Infrastructure layer, so naturally there was inquiry into whether it makes sense for OpenStack to anoint that particular platform rather than implement a more generic service (though it is by no means clear that the latter is feasible). Leaving aside that Amazon already implements Hadoop in the form of its Elastic MapReduce service, the Hadoop ecosystem is so big and diverse that worrying about locking users in to it is a bit like worrying about OpenStack locking users in to Linux. It does, of course, but there is still a world of choice there.

The final source of differing opinions simply related to timing. Some folks on the committee felt that an integration plan for Heat and/or Trove should be developed prior to accepting Savanna into incubation. Incubation confers integration with the OpenStack infrastructure (instead of StackForge) and Design Summit session slots, both of which would be highly desirable. The Technical Committee’s responsibility is to bless a team and a rough scope, so the issue was whether the latter is sufficiently clear to proceed.

This objection was overcome, as the committee voted to accept Savanna for incubation, albeit by a smaller margin than some previous votes. The team now has their work cut out to integrate with the other OpenStack projects, and nobody should be surprised if Savanna ends up remaining in incubation through the J cycle. Nonetheless, we welcome them to the OpenStack family and look forward to working with them before and during the upcoming Summit to develop the roadmap for integration.