This blog post was co-authored by Garden CEO Jon Edvald and Rookout head of product Oded Keret and is a recap of their March 10th webinar. It also appeared on the Rookout blog.
You can find a full recording of the webinar here.
Key takeaways:
- Debugging in the world of microservices and Kubernetes can be painful, especially when it comes to reproducing an issue.
- Why? Because there are so many moving parts: differences between environments, configurations, and network behavior; environment-specific datasets; ever-shifting code versions; and more.
- Garden and Rookout offer a powerful combination that allows a developer to write code and test against their own production-like environment, quickly find the root cause of a problem when something breaks, redeploy and re-test a fix, then validate the fix across all environments.
Troubleshooting customer issues in production is a difficult job. These are the issues that impact the business the most, so stress levels are almost always high. And it’s never fun to be measured against an SLA, which can feel like fighting a losing battle.
And it’s especially hard in the world of microservices and Kubernetes, because it’s so difficult to recreate a reliable replica of production in your local development environment.
Indeed, when we ask users about the toughest challenge they face when actually solving production problems, the answer is almost always reproducing the issue.
Alas, “cannot reproduce” is one of the classic problems in software engineering, and sometimes we ask ourselves, how is this still a problem in today’s advanced, cloud native ecosystem? Well, because there are so many moving parts in modern applications, such as:
- Differences between the customer environment, the testing environment, and your local environment
- The configuration and network are never the same
- A customer’s production dataset might cause the bug, and you don’t have access to it
- Code versions keep shifting, and every microservice has its own development cycle with teams pushing separately
- The Kubernetes deployment itself varies and is hard to reproduce outside of production
So, how do you troubleshoot when the Kubernetes command line is limited in terms of what it can offer, you can’t deploy a production-like environment locally, and ssh-ing into a remote environment is out of the question? Most of our users turn to a couple of these tried-and-true tactics.
There’s logging and tracing, which requires spending a lot of time writing code that prints the logs you’ll need if anything goes wrong. This means that when a new problem comes along, you’ll need to add a logline, push it, wait for it to be deployed, then wait for the issue to be reproduced again before you can actually get the log of what happened. You’re already dreaming of the next coffee you’re going to drink in that long wait time, right?
And, if all else fails, there’s pushing to CI and praying that the thing you fixed happens to be the thing that was causing the problem. But we know that’s almost never the case.
Neither option is particularly helpful, and we believe there’s a better approach. That’s where Rookout and Garden come in.
What is Rookout?
Rookout makes debugging accessible in any environment: it lets software engineers handle the complexity of modern applications by seeing into their code in real time, as it’s running. Developers can debug a local environment running on their laptops, a complex microservices environment running in the cloud, or a customer’s environment.
Instrumenting Rookout in your application is a matter of adding the Rookout token as an environment variable or as a parameter in the code itself, and it’s a one-time change. And once that’s done, you have access to every application that’s been instrumented with Rookout. If you had the same application running in multiple environments (for example, in a test environment and in a customer environment), you’d also be able to filter and pick a specific environment.
Rookout uses a technology called Non-Breaking Breakpoints (you may have heard of similar technologies called logpoints or tracepoints). Basically, these let you set a breakpoint at a line of code and get data without stopping your application, which is critical—in a live, dynamic, cloud environment, you can’t just stop the code.
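To make the idea concrete, here is a toy sketch of a non-breaking breakpoint in plain Python. It is not how Rookout is implemented, and none of the names below come from the Rookout SDK: `Snapshotter` and `checkout_total` are hypothetical and exist only for illustration. The sketch uses the standard library's `sys.settrace` hook to copy the local variables at a chosen line while the function keeps running to completion.

```python
import sys


def checkout_total(prices, discount):
    subtotal = sum(prices)
    total = subtotal * (1 - discount)  # the line we want to observe
    return total


class Snapshotter:
    """Toy non-breaking breakpoint (hypothetical helper, not a Rookout API).

    Records a copy of the local variables each time a chosen line of
    `func` is about to execute. The function is never paused.
    """

    def __init__(self, func, line_offset):
        self.code = func.__code__
        # line_offset counts physical lines below the `def` line.
        self.lineno = self.code.co_firstlineno + line_offset
        self.snapshots = []

    def _global_trace(self, frame, event, arg):
        # Only install a line-level tracer for the function we care about.
        if event == "call" and frame.f_code is self.code:
            return self._local_trace
        return None

    def _local_trace(self, frame, event, arg):
        # A "line" event fires just before the line runs.
        if event == "line" and frame.f_lineno == self.lineno:
            self.snapshots.append(dict(frame.f_locals))
        return self._local_trace

    def __enter__(self):
        self._previous = sys.gettrace()
        sys.settrace(self._global_trace)
        return self

    def __exit__(self, *exc_info):
        sys.settrace(self._previous)
        return False


with Snapshotter(checkout_total, line_offset=2) as snap:
    result = checkout_total([10.0, 20.0], 0.5)

print("result:", result)
print("captured locals:", snap.snapshots)
```

The call returns its normal result, and the snapshot holds the values that were in scope just before the observed line ran. A production tool has to do this with far less overhead and across many processes, but the contract is the same: collect data at a line, never stop the program.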
Rookout looks and feels like an IDE—you can see the source code for all of the different services in your application alongside data that helps you to debug. You have the benefit of seeing data at the code level without having to stop your code from running—which is something that you have to be able to do when debugging a live, microservices application.
What is Garden?
Garden is part Kubernetes development tool, part automation engine that builds, tests, and deploys your application. It allows you to fully define the relationships between every part of a system, including how each component is built and tested.
The aim is that every developer or CI pipeline can just run a single command such as `garden deploy` or `garden test` (because tests are a native element in Garden) to spin up a production-like environment and run your full suite of tests, including integration tests.
One of the most important things about Garden is something we call the Stack Graph. The Stack Graph visualizes all the different components in your system and all the steps involved in going from a bunch of source code, through builds and deployments, plus any additional tasks that need to happen (like seeding a database), all the way down to tests. These could be unit tests that have no runtime dependencies, but also integration and end-to-end tests that need running instances of your stack.
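One way to picture the Stack Graph is as a dependency graph of build, deploy, task, and test steps. The sketch below is a toy model, not Garden's implementation or its configuration format: the action names (`build.api`, `task.seed-db`, and so on) and the `execution_batches` helper are made up for illustration. It uses Python's standard `graphlib` module (Python 3.9+) to work out which steps can run at the same time and which have to wait.

```python
from graphlib import TopologicalSorter

# Hypothetical stack: each key is a step, each value is the set of steps
# it depends on. These names are illustrative, not Garden syntax.
stack_graph = {
    "build.api": set(),
    "build.web": set(),
    "deploy.postgres": set(),
    "task.seed-db": {"deploy.postgres"},
    "deploy.api": {"build.api", "task.seed-db"},
    "deploy.web": {"build.web", "deploy.api"},
    "test.api-unit": {"build.api"},    # unit test: needs only the build
    "test.web-integ": {"deploy.web"},  # integration test: needs a running stack
}


def execution_batches(graph):
    """Group steps into batches; every step in a batch can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        batches.append(ready)
        sorter.done(*ready)
    return batches


batches = execution_batches(stack_graph)
for number, batch in enumerate(batches, start=1):
    print(number, batch)
```

In this model the unit test is ready as soon as its build finishes, while the integration test lands in the last batch because it needs every deployment beneath it. That is the distinction drawn above between tests with no runtime dependencies and tests that need running instances of your stack.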
With Garden, you have a framework and toolkit to reason about your whole system, deploy it all in a consistent manner, and deploy a full environment where you can run tests while you code. These environments are as production-like as you can get, way more so than running docker-compose or using a homegrown set of bash scripts.
Another key aspect of Garden, especially with Kubernetes: you can run `garden deploy` and point it at a remote Kubernetes cluster, but it’ll feel like you’re working in a local environment.
Garden and Rookout: A Better Dev and Debugging Workflow
Editor’s note: in this blog post, we’re going to describe at a high level how Garden and Rookout complement each other during the development process.
If you’d like a much more detailed overview, including a demo that shows the two products working side-by-side, please take a look at the webinar recording.
Given what we know about Garden and Rookout, here’s what an end-to-end development and debugging process might look like with the two products.
- A developer uses Garden to spin up a production-like environment for coding and running tests. It only takes one command to spin up the environment and ensure that the most up-to-date version of every service has been built and tested, and the Stack Graph provides a visual representation of the entire stack. It’s easy to point this environment at a remote Kubernetes cluster, so the developer doesn’t actually have to have Docker and Kubernetes running on their laptop and can still use the IDE and other tools of their choice.
- The developer runs the full suite of integration tests while coding, and one of the tests fails. It’s easy to pinpoint which test failed in Garden (and the Stack Graph visualizes it), but we don’t get a lot of insight about what went wrong. So what are our options to figure out what’s causing the test to fail? We could go put in a bunch of console logs. Or we could look at the test code. Or bend over backwards to try and somehow attach a debugger to a process that’s running in a remote Kubernetes cluster, but we all know that’s far from a delightful experience.
- Luckily, Rookout is already instrumented in our application, so we have a much better option for debugging. We can get to the root cause without shutting down our environment and (gasp!) without adding loglines and redeploying. Within a few minutes, we’ve been able to identify our issue and have all the context we need to fix it (or assign it to the responsible developer).
- Once the issue is fixed, we can redeploy and re-test our app with Garden. Again, we can run the full suite of tests directly from our development environment and get fairly fast feedback—we don’t have to push to CI and wait just to be able to run integration tests. And this time around, all of our tests pass. It’s looking promising!
- Rookout lets us validate the change across all environments to be sure the bug was actually fixed. Rookout is especially well-suited for validating fixes in a complex Kubernetes environment, including in production.
Because all the deploys and builds in Garden happen within a Kubernetes cluster, it’s something that can easily be shared across developers who are working on a problem together. Same goes for Rookout—each dev can have their own instance running. This collaborative aspect is pretty impressive.
Wrapping up and next steps
If you’d like a deeper look at what we covered in this post, we recommend you head on over to the webinar recording.
To get started with Garden, you can check out the docs for the open source Garden Core. If you’d prefer a high-level overview of our product, we’ve got you covered, too.
And if you’re looking to create a faster and more efficient Kubernetes development process, we’re happy to set up time to talk and see how we can help. Feel free to get in touch with us to schedule a call.
If you want to see how Rookout can help speed up your developers’ Kubernetes debugging processes and ultimately help you solve customer issues 5x faster (imagine how much more time for coffee you’d have), head over to their website or get in touch to see how they can make that magic happen for you. ;)