As we covered in our recent blog post on Garden's Pulumi plugin, Infrastructure as Code (IaC) tools like Terraform and Pulumi provide a declarative way to manage cloud resources.
In this blog post, we'll do a more direct head-to-head comparison, and examine the relative pros and cons of each tool.
I'll note at the outset that Terraform and Pulumi are definitely not the only great IaC tools out there! Bringing the others into the comparison would be out of scope for this blog post and require a lengthier discussion.
Primer: Why do I need an Infrastructure as Code tool?
The underlying APIs for creating, updating and deleting cloud resources like S3 buckets, hosted databases or network configuration are generally imperative: To inspect or change the state of the system using those APIs, you make fine-grained REST-style requests to view, list, create, update or delete resources.
While these imperative APIs do provide you with full control of the underlying systems, you'd quickly run into some problems if you used them directly to manage your infrastructure:
- How do I get a complete picture of all the components of my infrastructure?
- If someone tweaks a field or changes state somewhere, how can I know how my production system is actually configured? Not knowing that can easily cause problems in prod when updating infrastructure.
- If I change configuration values that are used in several places in my infra, how can I be sure what changes would be applied if I made the update?
- How can I make sure that several updates to my infra aren't performed at the same time?
Even moderately complex cloud systems are composed of a myriad of components managed by a bunch of different APIs: Things like databases, network configuration, Kubernetes clusters and RBAC rules.
As your team grows and more people get involved with setting up infrastructure, it gets harder and harder to keep track of all the moving parts. And adding new components or making significant changes becomes downright scary!
How both Terraform and Pulumi solve these problems
Terraform and Pulumi solve these problems (and more) by giving you a uniform way to declare the desired state of your infrastructure.
Instead of directly calling imperative APIs to create/update/delete infrastructure resources when planning or executing updates to your infra, you use your IaC tool to compare the declared state with what's currently live, and figure out the minimal set of operations needed to update the live system to correspond to your declaration.
You can then review the set of operations that would be performed, and be secure in the knowledge that when you apply those operations, exactly those operations are what gets executed.
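To make the "compare declared state with live state" idea concrete, here is a minimal sketch in TypeScript. It is not how Terraform or Pulumi are implemented: the `State` and `Operation` types, the `plan` function and the bucket names are all hypothetical, and real tools also handle dependencies between resources, replacements and much more.

```typescript
// Hypothetical, simplified model: a resource is a name plus flat string properties.
type Props = { [key: string]: string };
type State = { [name: string]: Props };

type Operation =
  | { kind: "create"; name: string; props: Props }
  | { kind: "update"; name: string; changed: string[] }
  | { kind: "delete"; name: string };

// Compare the declared state with the live state and return the operations
// needed to make the live state match the declaration.
function plan(declared: State, live: State): Operation[] {
  const ops: Operation[] = [];
  for (const name of Object.keys(declared).sort()) {
    const want = declared[name];
    if (!(name in live)) {
      ops.push({ kind: "create", name: name, props: want });
      continue;
    }
    const have = live[name];
    // Collect every property key that appears on either side, without duplicates.
    const keys = Object.keys(want)
      .concat(Object.keys(have))
      .filter((key, index, all) => all.indexOf(key) === index);
    const changed = keys.filter((key) => want[key] !== have[key]).sort();
    if (changed.length > 0) {
      ops.push({ kind: "update", name: name, changed: changed });
    }
  }
  // Anything that is live but no longer declared gets deleted.
  for (const name of Object.keys(live).sort()) {
    if (!(name in declared)) {
      ops.push({ kind: "delete", name: name });
    }
  }
  return ops;
}

const declared: State = {
  "assets-bucket": { acl: "private", versioning: "enabled" },
  "logs-bucket": { acl: "private" },
};
const live: State = {
  "assets-bucket": { acl: "private", versioning: "disabled" },
  "old-bucket": { acl: "public-read" },
};

// The "preview" step: print the operations so they can be reviewed before applying.
const operations = plan(declared, live);
console.log(JSON.stringify(operations, null, 2));
```

Running `plan` against a live state that already matches the declaration returns an empty list, which is the situation you want to be in after a successful apply.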
The description of your infrastructure you've written down for your IaC tool (as HCL configuration files when using Terraform, or as programs when using Pulumi) now also serves as a complete specification of your infrastructure.
This also makes it a lot easier to keep track of the various components in your system, which is a nice bonus!
At a high level, Terraform and Pulumi have a lot in common. Both of them:
- ... operate by declaring resources, diffing the declared state with the live state and applying the resulting set of changes.
- ... support previewing changes before applying them.
- ... support applying a pre-generated plan—although Pulumi only added this functionality in February 2022, and still labels it as experimental, whereas it's been part of Terraform for a long time.
- ... come with state-management features to prevent concurrent updates from clashing with each other and putting infrastructure into an inconsistent state.
- ... come with an in-depth library of providers for working with a wide variety of cloud APIs.
- ... can replace kubectl and helm as your deployment tools for Kubernetes resources.
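The point about preventing concurrent updates can be illustrated with a small locking sketch. This is an in-memory toy, not the locking mechanism of either tool: the `StateBackend` class, the `withLock` helper and the owner names are hypothetical, and real backends keep the lock in shared storage so that different machines can see it.

```typescript
// Hypothetical in-memory stand-in for a state backend that supports locking.
class StateBackend {
  private lockHolder: string | null = null;

  // Try to take the lock; returns false if someone else already holds it.
  acquire(owner: string): boolean {
    if (this.lockHolder !== null) {
      return false;
    }
    this.lockHolder = owner;
    return true;
  }

  // Only the current holder can release the lock.
  release(owner: string): void {
    if (this.lockHolder === owner) {
      this.lockHolder = null;
    }
  }

  holder(): string | null {
    return this.lockHolder;
  }
}

// Run an update only while holding the lock, and always release it afterwards.
function withLock<T>(backend: StateBackend, owner: string, update: () => T): T {
  if (!backend.acquire(owner)) {
    throw new Error("state is locked by " + String(backend.holder()));
  }
  try {
    return update();
  } finally {
    backend.release(owner);
  }
}

const backend = new StateBackend();
const outcome = withLock(backend, "alice", () => {
  // While alice's update is in progress, a second update is refused.
  let bobBlocked = false;
  try {
    withLock(backend, "bob", () => "bob applied");
  } catch (error) {
    bobBlocked = true;
  }
  return { appliedBy: "alice", bobBlocked: bobBlocked };
});
console.log(outcome);
```

Once alice's update finishes, the lock is released in the `finally` block, so a later update by anyone else can proceed.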
All that said, Terraform and Pulumi do differ in important ways.
The essential difference: Configuration vs. programming
This is the most important difference, and the source of the primary pros and cons of Terraform and Pulumi, respectively.
Terraform stacks are described with HCL, a configuration language developed by HashiCorp (the folks behind Terraform). Here's an example (taken from here):
Pulumi, in contrast, uses full-powered programming languages like TypeScript, Go or Python—it does support YAML, but you wouldn't be using Pulumi in the first place if that was your language of choice.
Here's the same S3 bucket declaration done in TypeScript with Pulumi (taken from here).
Note: If you want more side-by-side examples of how Terraform and Pulumi use the various cloud APIs, I recommend comparing the code examples in the Terraform and Pulumi registries (the code examples above were pulled from there).
While HCL is a lot more expressive than YAML, supports local variable declarations (among other things) and comes with lots of useful helper functions, it's not a full-powered programming language.
Most importantly, there's no way to define your own functions. As we'll see below, there are both significant benefits and drawbacks to this constraint.
The argument for config languages: Low power = low risk
At this point you might be asking yourself: Why would people ever choose a less powerful language when a more powerful language is available for the task?
The answer boils down to: Enforcing simplicity at the language level eliminates many classes of bugs.
No user-defined functions means that any errors in your Terraform configuration tend to have a local cause—you don't need to write unit tests for Terraform's helper functions, and you can generally assume that they're correct. Terraform is a mature piece of software, and these helper functions have been extensively field-tested.
This lack of abstraction also forces you to express your intent more directly, even if that means repeating yourself more than you'd like: Terraform config looks very similar to the resource definitions that are generated by evaluating the configuration files.
The more logic you write, the more testing you need, and the more ways there are for things to go wrong: The more complex the logic, the less obvious it is how the concrete resources generated by the program will look.
And in some situations—e.g. for sensitive infrastructure and cloud resources—keeping your deployment logic relatively "dumb", repetitive and explicit is a solid strategy for preventing unpleasant surprises.
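The trade-off between explicit and generated declarations can be seen in a few lines of plain TypeScript. This sketch uses a hypothetical `Bucket` type and made-up bucket names rather than real Terraform or Pulumi resources. Both lists describe the same four buckets, but only the first one can be checked by reading it.

```typescript
// Hypothetical resource shape, for illustration only.
type Bucket = { name: string; acl: string };

// Explicit and repetitive: what you read is exactly what gets created.
const explicitBuckets: Bucket[] = [
  { name: "logs-dev", acl: "private" },
  { name: "logs-prod", acl: "private" },
  { name: "assets-dev", acl: "public-read" },
  { name: "assets-prod", acl: "public-read" },
];

// Generated: shorter, but the reader has to evaluate the loops and the
// conditional in their head to know which buckets end up public.
const generatedBuckets: Bucket[] = [];
for (const purpose of ["logs", "assets"]) {
  for (const env of ["dev", "prod"]) {
    generatedBuckets.push({
      name: purpose + "-" + env,
      acl: purpose === "assets" ? "public-read" : "private",
    });
  }
}

const sameResult =
  JSON.stringify(explicitBuckets) === JSON.stringify(generatedBuckets);
console.log("same buckets:", sameResult);
```

With four buckets the generated version is still easy to follow. The concern in the text is what happens as the conditions multiply and a one-character change to the logic silently alters the ACL of a production bucket.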
The argument for programming languages: Expressiveness and abstraction
Pulumi, on the other hand, embraces full-powered programming languages: TypeScript, Python, Go and more.
The benefits here are obvious:
- You have access to the same type-checking and debugging tools you use for day-to-day programming.
- You can define your own helper functions and classes to extract common patterns, reduce repetition and improve testability.
- Less repetition means fewer code locations to check for errors—of course, the flip side is that an error in a helper function means a potential error at every usage site of the helper!
- You can unit-test your logic with your testing framework of choice—whereas Terraform stacks can generally only be integration-tested.
- You can use any third-party library in your deployment logic—not only the helpers and utilities provided by Terraform.
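As a rough sketch of the helper-function and unit-testing points above, here is a hypothetical `teamBucket` helper with a couple of framework-free checks. The `BucketSpec` type, the naming convention and the `check` function are assumptions made for illustration. In a real Pulumi program the returned spec would be passed on to a provider's resource constructor, and the tests would more likely live in your usual test framework.

```typescript
// Hypothetical spec type; a real program would pass these values to a resource constructor.
type BucketSpec = {
  name: string;
  acl: "private" | "public-read";
  tags: { [key: string]: string };
};

// Helper that captures a (made-up) team convention in one place:
// names follow <team>-<purpose>-<env>, and buckets are private by default.
function teamBucket(team: string, env: string, purpose: string): BucketSpec {
  for (const part of [team, env, purpose]) {
    if (!/^[a-z0-9]+$/.test(part)) {
      throw new Error("invalid name part: " + JSON.stringify(part));
    }
  }
  return {
    name: [team, purpose, env].join("-"),
    acl: "private",
    tags: { team: team, env: env },
  };
}

// Minimal stand-in for a test framework's assertion function.
function check(condition: boolean, message: string): void {
  if (!condition) {
    throw new Error("test failed: " + message);
  }
}

// Unit tests run without touching any cloud API.
function testTeamBucket(): void {
  const bucket = teamBucket("payments", "prod", "logs");
  check(bucket.name === "payments-logs-prod", "name follows the convention");
  check(bucket.acl === "private", "buckets are private by default");
  check(bucket.tags.env === "prod", "env tag is set");

  let rejected = false;
  try {
    teamBucket("Payments Team", "prod", "logs");
  } catch (error) {
    rejected = true;
  }
  check(rejected, "invalid names are rejected");
}

testTeamBucket();
console.log("all checks passed");
```

This also shows the flip side mentioned above: every caller of `teamBucket` inherits its behavior, so a bug in the helper affects every bucket created through it, which is exactly why the helper deserves tests.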
Which one is right for your team?
Terraform is probably the right choice for you if:
- You (and the engineers that you think will end up maintaining your infra/deployment logic) are more comfortable writing and maintaining configuration files than programming.
- Your infra/deployment logic is relatively straightforward, doesn't require a high degree of dynamism, and you're reasonably certain that this will still be the case 1-2 years from now.
- Your team doesn't have engineering resources to devote toward unit-testing deployment logic.
- You tend toward more conservative, battle-tested technology choices.
On the other hand, Pulumi is probably the right choice for you if:
- You (and the engineers that you think will end up maintaining your infra/deployment logic) prefer programming to writing and maintaining configuration files.
- Your infra/deployment logic is relatively complex, and is likely to benefit from a full-powered programming language, or you think it's heading toward significant complexity/dynamism in the next 6-12 months.
- Your team is willing to devote engineering resources to unit-testing your deployment logic, just like any other part of your application.
Terraform and Pulumi are both excellent, visionary pieces of technology. I had a lot of fun doing the background reading for this blog post, and it's very cool to see all the thoughtful engineering and design that's being put into developer tooling these days.
Our systems are becoming more and more complex, and it's a good thing that we're dedicating some serious brainpower to creating the automation we need to handle that complexity!
(If you were wondering, Garden works with both Terraform and Pulumi.)