Terraform provider for Cobbler

At Container Solutions we constantly push the boundaries of the tools we work with. While exploring programmable infrastructure we combine the available tools in new ways. Sometimes it works and sometimes it doesn't, but nevertheless we like to contribute things we learned back to the community.

One example of our efforts concerns Terraform. We have researched Terraform extensively and have used it with different platforms, some more exotic than others. When we applied Terraform to provisioning bare-metal servers, we found that the environment in which we were provisioning them lacked a few networking services. When a machine starts up in an unconfigured state, it will try to boot over the network using BOOTP, so you need services like DHCP, a PXE server and possibly a DNS server. Cobbler is a system that bundles these services, so we decided to use it. We quickly found that the proper way for Terraform to manage systems in Cobbler would be through a Provider. No such provider existed yet, so we decided to write our own.
[Figure: Terraform provider for Cobbler diagram]

Structure of a Provider

My mate Carlos already started a series detailing how to write a provider, so I won't repeat that here. Instead I'll explain a bit about the specifics of dealing with Cobbler. From Carlos' post it's clear that we need to define a Provider with three items: a Schema, which is a collection of parameters for configuring the Provider; a ResourcesMap, which lists the resources that will be configurable using this Provider; and a ConfigureFunc, which details how to set up the connection. For each of the resources we will need a Schema as well, plus methods describing how to Create, Read, Update and Delete these resources. I could paste some snippets of the code here, but you might as well head over to GitHub to check out the full source code.

Provider

The provider is fairly straightforward: it has a Schema that contains three strings, namely a url, a username and a password. For now we have three resources defined in the ResourcesMap: a system, a kickstart file and a snippet.
The ConfigureFunc for the provider returns an HTTP client that talks to the Cobbler server. We abstracted the code that actually interfaces with Cobbler into a separate client library to keep the Provider code cleaner. We simply take the url, username and password values from the .tf configuration file provided by the user and pass them as arguments to a new Client object.
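
As a rough illustration of that hand-off, the sketch below shows a configure step that turns the three settings into a client. It uses only the Go standard library. The names `providerConfig`, `Client`, `NewClient` and `configure` are hypothetical stand-ins, not the actual provider or client library code, and in a real provider the values would arrive through Terraform's schema types rather than a plain struct.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// providerConfig holds the three provider settings named in the text.
type providerConfig struct {
	URL      string
	Username string
	Password string
}

// Client is a hypothetical stand-in for the separate Cobbler client library.
type Client struct {
	endpoint   *url.URL
	username   string
	password   string
	httpClient *http.Client
}

// NewClient validates the settings and prepares an HTTP client.
// It does not contact the server.
func NewClient(cfg providerConfig) (*Client, error) {
	if cfg.URL == "" || cfg.Username == "" || cfg.Password == "" {
		return nil, errors.New("url, username and password are all required")
	}
	u, err := url.Parse(cfg.URL)
	if err != nil {
		return nil, fmt.Errorf("invalid url: %v", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return nil, fmt.Errorf("unsupported url scheme %q", u.Scheme)
	}
	return &Client{
		endpoint:   u,
		username:   cfg.Username,
		password:   cfg.Password,
		httpClient: &http.Client{Timeout: 30 * time.Second},
	}, nil
}

// configure plays the role of the ConfigureFunc: whatever it returns is
// what the resource methods later receive as their connection handle.
func configure(cfg providerConfig) (interface{}, error) {
	client, err := NewClient(cfg)
	if err != nil {
		return nil, err
	}
	return client, nil
}
```

Returning an explicit `nil` on failure, rather than passing the result of `NewClient` straight through, avoids handing back a non-nil interface that wraps a nil pointer.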

System Resource

Next we define the System Resource. This has a slightly more elaborate Schema, which contains a map of network interfaces. However, most fields in the Schema are just strings, so it's not really special either. The interesting stuff here is in the Create method. Cobbler expects you to perform a series of requests to create a resource (a System in this case) and modify its properties, and then, once you're done, to send a sync request that commits your changes. This involves updating the configuration files and restarting the various services under Cobbler's control, e.g. the DHCP server. The good thing is that this encourages you to set up a bunch of changes and commit them all at once. In fact, if you call the sync after each change, Cobbler will try to restart the services repeatedly and it breaks very quickly.
The unfortunate thing is that Terraform doesn't accommodate this behaviour: there is no way to run a post-execution hook or anything similar. So we decided to call the sync in a goroutine. The first Create call starts the goroutine and waits for a signal from it over a channel. We set a timeout when creating the resource, and perform the sync only after that timeout has passed. Each call to the Create method resets the timeout to one second in the future, so the sync runs one second after the last Create has been executed. The channel makes sure the calling thread waits for the goroutine; otherwise the Terraform run would end before the sync had actually been called. The timeout of one second is a bit arbitrary, but it seems to do the job.
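
The deferred sync described above can be sketched roughly as follows. This is a minimal illustration using only the Go standard library, not the provider's actual code: `debouncedSync`, `syncCycle`, `Touch` and `simulateCreates` are hypothetical names, and the `commit` function is a stand-in for the real Cobbler sync request.

```go
package main

import (
	"sync"
	"sync/atomic"
	"time"
)

// syncCycle represents one pending sync shared by several Create calls.
type syncCycle struct {
	done chan struct{} // closed once the sync has run
	err  error         // result of the sync; read only after done is closed
}

// Wait blocks until the shared sync has run and returns its result.
func (c *syncCycle) Wait() error {
	<-c.done
	return c.err
}

// debouncedSync runs commit once no change has been registered for the
// length of the quiet period.
type debouncedSync struct {
	mu       sync.Mutex
	quiet    time.Duration
	deadline time.Time
	current  *syncCycle   // nil when no sync is pending
	commit   func() error // stand-in for the Cobbler sync request
}

func newDebouncedSync(quiet time.Duration, commit func() error) *debouncedSync {
	return &debouncedSync{quiet: quiet, commit: commit}
}

// Touch registers a change: it pushes the deadline into the future and
// starts the waiting goroutine if none is running yet.
func (d *debouncedSync) Touch() *syncCycle {
	d.mu.Lock()
	defer d.mu.Unlock()
	d.deadline = time.Now().Add(d.quiet)
	if d.current == nil {
		d.current = &syncCycle{done: make(chan struct{})}
		go d.run(d.current)
	}
	return d.current
}

// run sleeps until the deadline has passed without being moved again,
// then performs the sync and releases every waiting caller.
func (d *debouncedSync) run(c *syncCycle) {
	for {
		d.mu.Lock()
		wait := time.Until(d.deadline)
		if wait <= 0 {
			d.current = nil // a later Touch starts a fresh cycle
			d.mu.Unlock()
			break
		}
		d.mu.Unlock()
		time.Sleep(wait)
	}
	c.err = d.commit()
	close(c.done)
}

// simulateCreates runs n concurrent "Create" calls against one debouncer
// and reports how many times the sync was actually executed.
func simulateCreates(n int, quiet time.Duration) int {
	var calls int32
	d := newDebouncedSync(quiet, func() error {
		atomic.AddInt32(&calls, 1)
		return nil
	})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// The requests that create and modify the system would go here.
			_ = d.Touch().Wait()
		}()
	}
	wg.Wait()
	return int(atomic.LoadInt32(&calls))
}
```

In this sketch every caller that arrives during the quiet period waits on the same channel, so a batch of creates shares one sync. The quiet period is a parameter here; the one-second value mentioned above would be passed as `time.Second`.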

To do

We haven't implemented the Read and Update methods yet. They just return nil for now, because we thought they were not important for our immediate use case. While diving further into Terraform provider internals we came to understand the need for a Read method, which is to sync the current state of the infrastructure (Cobbler) with Terraform's internal state as reflected in the state file. So this is definitely on our list of things to implement.
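
To make the role of Read a bit more concrete, here is a minimal sketch of the reconciliation it performs. The `resourceState` type and `lookupFunc` are hypothetical stand-ins for Terraform's resource data and for a Cobbler query; this is not the provider's code. Clearing the ID mirrors the way Terraform's provider SDK signals that a resource no longer exists.

```go
package main

// resourceState is a simplified stand-in for what Terraform records
// about one resource in its state file.
type resourceState struct {
	ID    string
	Attrs map[string]string
}

// lookupFunc is a hypothetical Cobbler query: it returns the current
// attributes of the named system, or false if the system does not exist.
type lookupFunc func(name string) (map[string]string, bool)

// readSystem refreshes s from what Cobbler currently reports.
func readSystem(s *resourceState, lookup lookupFunc) {
	attrs, found := lookup(s.ID)
	if !found {
		// The system was removed outside Terraform: forget it, so that
		// the next plan proposes to create it again.
		s.ID = ""
		s.Attrs = nil
		return
	}
	// Overwrite recorded attributes with the real ones so drift shows up.
	s.Attrs = attrs
}
```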

Another thing we need to change is the way we handle logging in to Cobbler. We currently log in at the beginning of the Create and Delete methods, which causes a new login for every resource created or deleted. We should move the login to the ConfigureFunc, so that the login happens only once and the token that is returned is reused throughout the Terraform run.
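
A sketch of what that could look like, under the assumption that the Cobbler calls sit behind a small interface: the `backend` interface, `client`, `configure` and `fakeBackend` are all hypothetical names introduced for illustration, and the fake only counts logins so the effect is visible without a server.

```go
package main

import (
	"errors"
	"fmt"
)

// backend is a hypothetical abstraction over the Cobbler calls used here.
type backend interface {
	Login(username, password string) (token string, err error)
	CreateSystem(token, name string) error
}

// client carries the token obtained when the provider is configured.
type client struct {
	api   backend
	token string
}

// configure logs in exactly once and keeps the returned token.
func configure(api backend, username, password string) (*client, error) {
	token, err := api.Login(username, password)
	if err != nil {
		return nil, fmt.Errorf("cobbler login failed: %v", err)
	}
	return &client{api: api, token: token}, nil
}

// createSystem reuses the stored token instead of logging in again.
func (c *client) createSystem(name string) error {
	return c.api.CreateSystem(c.token, name)
}

// fakeBackend records how often Login is called.
type fakeBackend struct {
	logins  int
	systems []string
}

func (f *fakeBackend) Login(username, password string) (string, error) {
	f.logins++
	return "token-1", nil
}

func (f *fakeBackend) CreateSystem(token, name string) error {
	if token != "token-1" {
		return errors.New("invalid token")
	}
	f.systems = append(f.systems, name)
	return nil
}
```

With this shape, creating any number of systems through one configured `client` results in a single `Login` call on the backend.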

Also, we currently don't call sync after deleting resources, which we really should, because the DHCP server also needs to be notified of resources that have been removed.

Other Resources

The code for the kickstart and snippet resources is quite simple. They just take a path to a text file on the local system and post its contents to Cobbler under the specified name.

Wrapping up

As you can see, there's still a lot missing, but basically it works. We can create a system in Cobbler using Terraform and have our bare-metal server boot over the network, with its boot image served by Cobbler. The exciting thing is that our efforts did not go unnoticed by the HashiCorp folks, who kindly proposed to merge the code into the Terraform tree. We're currently in the process of getting our PR approved. As you can probably guess, we're very pleased to get a chance to give something back to the community, which after all is one of Arnold Schwarzenegger's 6 rules of success.
