Rosemary Wang is a developer advocate with HashiCorp. She's also the author of Essential Infrastructure as Code: Patterns and Practices published by Manning. I have her on the show today to talk about her book and infrastructure as code in general. We talk a little bit about developer advocacy. It really was a great conversation. I just enjoyed talking to Rosemary so much. So check it out.
Before we get there, as always, a reminder that Streaming Audio is brought to you by Confluent Developer. That's developer.confluent.io. If you want to get started learning Kafka, ksqlDB, Confluent Cloud, there are video courses, there's a library of event-driven architecture patterns. There are executable tutorials, long-form explainer pieces, all kinds of educational content, everything you need to get started. So check it out.
When you sign up for Confluent Cloud for the first time to do exercises there, use the code PODCAST100, you'll get an extra hundred dollars of free usage, definitely worth it. developer.confluent.io. And now, let's talk to Rosemary.
Hello and welcome to another episode of Streaming Audio. I am your host, Tim Berglund, and I'm joined on the podcast today by HashiCorp developer advocate and author of Essential Infrastructure as Code, Rosemary Wang. Rosemary, welcome to the show.
Thank you for having me, Tim.
I got the title right, didn't I?
Yes, you did.
Okay. I was saying it and I was like, "Wait, am I saying the right title?"
I think it's usually, folks get a little bit, they reverse it, something blank of blank and I don't mind either way, but I thought it made sense.
Good. Good. Okay. Yeah. So you have a book in early access by Manning and I'd like to talk about the book today and I'd like to talk about what you do at HashiCorp also if we could dig into that if we have time.
And really just this idea of infrastructure as code. This is nominally a Kafka podcast. Kafka is, I Googled this before we started recording. Kafka is infrastructure.
I a hundred percent checked and it turns out that it is. So I don't know, this seems relevant. So yeah. Talk to us. When you say infrastructure as code, what do you mean? And let's just go from there.
Of course. So it's been interesting to hear about the evolution of infrastructure as code. Some people consider infrastructure as code, not purely infrastructure as code. You'll also hear infrastructure as software, infrastructure as configuration, but for me, I group all of the ideas behind infrastructure as code as a set of practices that align with software development practices, but you apply them to infrastructure, or specifically to infrastructure, which is a broad definition for a lot of things, as you pointed out. Kafka is infrastructure. And I think what's very interesting about applying infrastructure as code, trying to use software development practices like version control, continuous integration, delivery, deployment even, it's challenging. It's a heuristic. It's not a hundred percent perfect in infrastructure. And so the result is that when we try to make the theoretical application work in infrastructure, we kind of get a little confused. It breaks down a little bit. And so, what I wanted to understand was what are the ways that we can apply these practices and be practical about them? We can't apply everything. So how do they work in infrastructure specifically?
I like it. And I think struggling with the name, people say infrastructure as software, infrastructure as code, I've been trying to champion infrastructure as YAML, and I'm not getting any uptake on that. I don't really understand what the problem is, but...
I laugh a lot because people love to judge the domain-specific language, right? Kubernetes is a big technology that a lot of folks are talking about in the industry.
And there's a lot of YAML there.
Yeah. And there's a lot of YAML. But I came from networking and that had a lot of YAML, too. There was none of... And I laugh a little bit because, at the end of the day, whether you're using a domain-specific language and it ends up being a markup language like YAML, or you're using a programming language, at the end of the day, the practices to collaborate, organize, and make sure you're doing it securely are the same.
Yes, there are some technical differences, but you have to still collaborate and scale in the same way.
And you know, 15 years ago, it was fashionable to hate XML. Now it's fashionable to hate YAML. I'm excited to learn in 15 years what it will be fashionable to hate then and we'll still get good things done. So you said applying software development practices to infrastructure and I have an idea what you might mean by that, but go into detail there. What practices?
Sure. I think the main ones are version control. This one, I think people can take to an extreme in the infrastructure space as GitOps, right? The idea of making your changes through version control. The reason why you would do this is that you're trying to audit and track changes. You're trying to control the kind of collaboration that would make configuration changes in your infrastructure.
Previously whenever I would make changes to any kind of system, I would put in a change ticket and someone would try to figure out if it conflicted with something else, and I would wait and see after I applied it whether or not it worked. And sometimes it went well. At worst, I didn't know if it broke something until three weeks later. And so the idea with applying some of these software development practices is saying, "Okay, let's add testing to this workflow. Let's try to build a way that we organize the infrastructure changes and understand what's happening before they go to production. Let's build security in, earlier into our systems, rather than waiting until after we've gone to production and our security team says this practice isn't great. And we should probably go back and secure it."
There are a lot of other factors in software development, including continuous integration and continuous delivery, continuous deployment. That's particularly challenging in the infrastructure change realm, right? Do you have a change advisory board if you automatically deploy everything to production with good testing? So there are many, many more. So the book goes through a whole series of these from how do you write it? Applying clean code practices and applying.
Yeah. I had to go back and dust off my software design patterns book for it a bit and I laughed I was like, "I promise it's not as dense." My hope is that it's not as dense as the design patterns book and it's applied and scoped to how you write infrastructure, in particular. So modularizing dependency management, we go through, there's a chapter that's upcoming on security, and there will be a chapter on cost, which is the one I'm most scared to write, so.
Oh, wow. Yeah.
Yeah. We're progressing through all of these different practices in software that we consider but we don't necessarily talk about with our team very early on.
Wow. So this sounds, I mean, I think I'm famously not an Ops guy. And in contemporary terms, not a DevOps person. No one mistakes me for that. That's not the kind of stuff I normally talk about. It's not where I spent my time, and so it would certainly be possible for me not to have my finger on the pulse of things here, but I don't hear other people talking like this. And I guess I'm... the question I'm about to ask sounds like I'm a hundred percent asking you to toot your own horn, but go ahead. Do it. Is this a fairly fresh and new approach? Is there a movement that you're participating in? Or are you really kind of saying, "Hey folks, follow me, here are some new ideas?"
There are other resources on infrastructure as code, and some folks have borrowed the language of software development practices and applied it and talked about it in great detail. But what I realized was that we weren't talking about it from a big picture view. A lot of these books, if you were coming from the development space, which is what I am aspiring to, I always laugh, I'm an aspiring software developer because I can never get it right. And no matter how many times I paired with some fantastic senior developers, principal developers, in various languages, it's not an intuition that I immediately developed. And so, what I wanted to try to figure out was if we're talking a lot about Dev and Ops, and here we are trying to apply development practices to operations, we should try to speak about the software development theories, but in an infrastructure-specific way, and that gap was really hard for me to fill when I was learning and applying some of the stuff I was learning in the software development space as someone who is operational.
I wanted to improve my automation. I knew my automation that I was doing like scripts, configuration management, it was brittle. I couldn't change it that easily. It wasn't very modular. I couldn't easily share it with my teammates, and I knew there had to be something better that I should be doing. And so, trying to understand and draw upon the theories and applying them to infrastructure really helped me. So my hope basically was that if I wrote this, it would find a common language for development teams to understand this if they're picking up operations, but also operations teams trying to fit this into a development life cycle.
I like it a lot. It feels like another one of these things and I'm just, I'm blatantly promoting your book here. Everybody, you should probably read Rosemary's book if you operate things at all. But that's my take so far. But it feels like one of these things where we started, I don't know, 10 years ago there was Chef and Puppet going on, and there were files in Git repos and maybe even on GitHub, and that was all a thing. And maybe the more doctrinaire, you mentioned GitOps. The more doctrinaire GitOps kind of thing is, "Well, if the pull request gets merged, then it happens. There's a hook. And do it." And that's cool. There're all kinds of good things about that, but it feels like when DevOps started to become a word and I almost hate to use that word, because now people are just going to fight about what DevOps means.
But just this movement that we're talking about out here, the software development principle that was in use there was source control. And that's what you started by saying. Like, "Well yeah, okay. Now my infrastructure is described in a text file and that's in a Git repo and there's a button I can push that turns that text file into infrastructure." And maybe that button is like a bunch of levers and knobs and a button, maybe it's not really just a button. But that's the principle: code is a text file. I put code in a Git repo, and now I describe my infrastructure with a text file, and I put that infrastructure in a Git repo, and that's very good. I mean, that was a sea change right there, and that's a software development principle applied to infrastructure.
But it kind of sounds to me like there was a whole bunch of promissory note that was going unpaid. And you're trying to say, "Hey, wait a second. You know, all this other stuff. Being a developer is not just checking source code into a Git repo. There are these other things that you do." And you're trying to say those now apply. Now that we've got the tooling, it's ubiquitous. They're moving targets, you've got different tools coming and going and it's YAML or it's Ruby DSL or whatever. But now, let's be serious about the fact that we are writing infrastructure code. We're software developers. It's just that the domain isn't insurance or healthcare or retail or manufacturing, the domain is systems.
Be that all the way. Is that?
Okay. I like that.
That's perfect. I think what really got to me was that a lot of developers wanted to take it, that I met and worked with, they wanted to take advantage of all these managed offerings that were out there. I mean, you could say Kafka as a service is a managed offering and people.
And I'm a huge fan of Confluent Cloud.
I mean, people don't necessarily want to take on the operational aspects, but they still need to take on configuration. They still need to take on how they connect to it, as well, and having structure around it and working and at least distributing this information across your team, as well as your security team, is very challenging. This is a book that's almost like a crash course of all the development practices that a software developer might have learned while trying to understand microservices, architectures, and continuous deployment, and just crammed into one book applied to infrastructure, which is challenging. But I hope it helps, at least.
Yeah. But no, that sounds like a good way to organize it. Because all those things can be summarized in a few pages, right? You don't have to... There are book-length treatments available for all of them, but the summary and the introduction are just a few pages and then apply to infrastructure. Now, do you have, obviously you have examples, so where do you go in terms of tooling? Where are you right now? What do the examples get written in?
Yeah, that's a fantastic question, and one that I will acknowledge has been fairly contentious.
Yeah. Yeah. I know, right? I can't get it right. So I am, full disclosure as someone who works for HashiCorp, I am very familiar with Terraform. I used it a lot in engineering and I've used quite a bit of Python as well and quite a bit of Golang. And so, in this book, there were two options, two or three different options that I could entertain. The first was to figure out how to write my own infrastructure as code syntax that was neither Terraform nor any other current tool that existed.
That sounds like a great idea.
Yeah. Fantastic idea. And you know, it provides it, but the problem is that you get folks who say, "No, I really want an example. I really want to understand how this actually applies when I run it and try to do this in my real system." And so, in my first iteration, I said, "Okay, I'll just use Terraform," because HashiCorp Configuration Language has its limitations, but it is intent driven, right? So if you read it, there's intent, you can say, "Oh, this is the resource attribute. This is the resource." You don't need as much knowledge of the direct syntax. The real focus is these are the configurations and these are the practices.
Yeah. I mean, the idea is you're describing a state that you would like to obtain.
In the world.
Exactly. It's declarative and so it was [crosstalk 00:15:30].
Yeah. It is very structured. So I was like, okay. And after the first round, some folks were like, "We don't want it in Terraform. If syntax changes, this book is not really as relevant. The examples aren't as relevant." So it's very complicated. It is Python. Yeah, Python. So it's an imperative wrapper that is around declarative Terraform JSON, which will eventually, eventually Terraform will pick it up and use imperative Golang. So it's a declarative sandwich of sorts.
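The "declarative sandwich" Rosemary describes can be illustrated with a minimal sketch, which is not taken from the book. It assumes only that Terraform accepts configuration written in its JSON syntax, where resources are nested as resource, then type, then name. The resource type "example_server" and the function names server_resource and render are hypothetical placeholders, not part of any real provider or of the book's toolchain.

```python
import json


def server_resource(name, size="small", tags=None):
    """Build one resource block in Terraform's JSON syntax.

    "example_server" is a made-up resource type used only for illustration;
    a real configuration would use a type defined by an actual provider.
    """
    return {
        "resource": {
            "example_server": {
                name: {
                    "name": name,
                    "size": size,
                    "tags": tags or {},
                }
            }
        }
    }


def render(config):
    """Serialize the configuration dictionary to a JSON string."""
    return json.dumps(config, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Imperative Python decides what to build; the output is purely declarative.
    print(render(server_resource("web", tags={"team": "streaming"})))
```

With a real provider's resource type in place of the placeholder, output like this could be saved to a file ending in .tf.json, which Terraform reads alongside ordinary configuration files.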
But kind of a toolchain for the book, so you have some generalized way of...
Yeah. But I mean, you got to make that call at some point. I mean the original.
You mentioned design patterns. The original Gang of Four patterns book was C++.
Because well, I mean, that was the language that was available and prudent to use. And you can't just talk about ideas. You have to show code at some point, so...
No, exactly. And I think C++ and, and reading that book, I sort of was like, "I may not know C++ to the deepest extent that was necessary to fully understand it," but I was still able to try to get a sense of the practices and how I would think about applying it to other languages, and that was my hope for the examples in this book, too.
Absolutely. Yeah. No, back, back... Well, I can disclose my age. That's fine. But when that book was new, I was never good at C++. I mean, C++ was a pretty big language. There was a lot to be good at, but you could kind of fight through it and that was easy enough.
You mentioned managed services and the intersection of managed services and infrastructure management or just DevOps. Now, we were talking before we started recording, just sort of joking that, "Oh, well I've used managed services, of course, you don't need... You don't have any operational concerns." Saying that like, of course, no one would really say that, but you do hear people say that. So, talk to us about that tension. Like if I use it, we'll just go with Confluent Cloud, because this is Streaming Audio. There's a lot I don't have to operate about Confluent. There's a lot I never have to configure, but it's not zero. So how does that all fit together in your world? In your mind?
There're so many managed offerings now, you're not going to just be using one usually. And the problem is if you were using one, it's okay if you weren't necessarily recording that configuration in version or source control. You could maybe go in and create it and then voila, you can use it, and hope that no one goes in and changes it, right? But the reality is that we are choosing a lot of infrastructure types. We're choosing a managed service for one thing and then maybe we're building something, servers on a cloud provider, while trying to still communicate to an Oracle database in our data center, right?
And as long as we have these more complex topologies, you might need to consider standardizing on an infrastructure as code practice, no matter if it's managed or data center, or any other kind of resource. Just because, at the end of the day, we are all engineers and we're all looking for the opportunity to learn and grow, and your teams are going to change. Your teams are going to shift, your business domain will shift, and you need a way to collaborate effectively and communicate some of the knowledge that you have about your infrastructure.
It's a little bit like the... I joke, it's like a little bit like a tiny configuration in hiding, right? If someone goes in, they make a break glass change, meaning they will go in and make a manual change for the sake of emergency. And on one day, they switch teams. They no longer have access to that system anymore. But they've made that change. They forgot to log it somewhere. They forgot to tell someone. You roll out a change and all of a sudden your system implodes, right?
And it's usually the weakest link in all of these different topologies that you have across multiple infrastructure components. [crosstalk 00:19:53] And you are none the wiser that it exists.
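The hidden break-glass change described here is a form of configuration drift: the live system no longer matches what is recorded in version control. As a rough sketch under simplifying assumptions (both sides are flat dictionaries, and the names find_drift and ABSENT are hypothetical), detecting it amounts to comparing the declared settings with the actual ones:

```python
ABSENT = "<absent>"  # marker for a key that exists on only one side


def find_drift(declared, actual):
    """Return {key: (declared_value, actual_value)} for every mismatch.

    `declared` is what version control says; `actual` is what the live
    system reports. Both are assumed to be flat dictionaries.
    """
    drift = {}
    for key in sorted(set(declared) | set(actual)):
        want = declared.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


if __name__ == "__main__":
    declared = {"retention_hours": 168, "partitions": 6}
    # Someone raised retention by hand during an incident and opened a port.
    actual = {"retention_hours": 720, "partitions": 6, "open_port": 9092}
    for key, (want, have) in find_drift(declared, actual).items():
        print(f"{key}: declared={want} actual={have}")
```

Real tools do this comparison against provider APIs and nested resource schemas; the point of the sketch is only that the check is possible once the declared state is written down somewhere.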
And you then the next day get to write the outage blog post.
Yeah. Exactly. You get to do the root cause analysis and everything else. And I think one of the things about managed services to understand is that you can create it and someone else can operate it, but if you needed to create a duplicate for some reason or another, especially if you're working in the data, with data. If you want to do something more immutable, a strategy that's more immutable for changing your data infrastructure, for example blue-green, something that's lower risk. Because that is important. You want to protect your data. You want to do something a little bit lower risk. You may not want to do something in place. Then infrastructure as code is something that will help you apply it, reproduce it, compose it with new changes, and evolve it over time without necessarily affecting other parts of your system.
And I wanted to ask you about immutability. It strikes me that the nature and the purpose of your book doesn't seem like it needs to take a position. And I can't necessarily infer from what you're saying about applying software development practices to infrastructure as code what your position might be. So this is a little bit orthogonal. I'm just kind of curious because I like asking Ops people this question, but what do you think about immutable infrastructure? Is it a good idea? Is it something you should pursue at almost any cost? Where are you on that spectrum?
I'm a "Pursue when you can, but understand you won't do it all the time." Immutable infrastructure is expensive. There are some things that are expensive in terms of cost and time.
So the amount of time.
Like engineering cost. A lot of you.
To do it.
Even financial too. Because immutable infrastructure means you're creating new infrastructure every time, and it doesn't really help you to delete the infrastructure before you recreate it.
You usually run for an intermittent period of time, two environments.
Duplicates of each other. And depending on how big your environment is, if you're doing this, let's say you have a network, right? And that network supports a huge Kafka cluster. If you decided to do a quote-unquote immutable change, immutable approach, to your network, you'd duplicate that network, plus a new Kafka cluster, any higher-level resources, and then you run that environment for a certain period of time. And that's, well, increased financial cost too.
Seems like zero people would actually do that. But yeah. So you're going to have the time that you keep the old version up and the number of times PRs get merged and auto deploys happen.
Because of course, we're cool. And so that's going to give you a percentage. You could easily be using double the amount of cloud resources.
In a day.
On that. Clearly not an Ops person, but that makes sense.
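Tim's back-of-the-envelope estimate can be written down as a small calculation. This is a simplified sketch, not a pricing model: it assumes each deploy keeps a full duplicate environment running for a fixed overlap window, and that at most one duplicate exists at a time, so the extra cost is capped at double. The function name blue_green_overhead and its parameters are hypothetical.

```python
def blue_green_overhead(deploys_per_day, overlap_hours):
    """Fraction of extra resource-hours per day spent on duplicate environments.

    0.0 means no extra cost; 1.0 means the bill has doubled because a
    duplicate environment is effectively running around the clock.
    """
    duplicate_hours = min(deploys_per_day * overlap_hours, 24.0)
    return duplicate_hours / 24.0


if __name__ == "__main__":
    # Four merged PRs a day, old environment kept for two hours each time.
    print(f"{blue_green_overhead(4, 2):.0%} extra")
    # Deploying every hour with a two-hour overlap hits the cap.
    print(f"{blue_green_overhead(24, 2):.0%} extra")
```

Under these assumptions, deploying often with a long overlap is what pushes usage toward double, which matches the concern raised in the conversation.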
Yeah. And there's an engineering cost to it. When you consider the testing that you need to do, the concerns you might have about the differences between a net new environment versus making a change in an update, just an in-place update. So I don't push for immutable infrastructure for everything. I think it is important to do it as much as you can, but I also recognize it's not completely possible, and it may not be practical. If you're updating a tag on your cloud resource, you might not need to have an immutable copy. You don't need another copy of that server to update a tag. [crosstalk 00:23:35] You could update it. Yeah. [crosstalk 00:23:37] Exactly. You could update it in place in like 30 seconds and that would be fine.
The world actually is mutable.
Yeah, exactly. And that's mutable. So I think it's a matter of balancing the two and I usually do it from a cost, both from engineering effort as well as financial cost, kind of view because.
No, that's huge. And stuff like Terraform is trying to help you put a tooling layer in there that gets you a good part of the way.
Yeah, exactly. A lot of tools will do it. CloudFormation, for example, any of the other infrastructure as code tooling out there will generally put a layer for you to understand the types of changes that can be done mutably and immutably. Someone else made that decision or did that testing for you. You don't have to go and figure out which one is which anymore.
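The layer Rosemary mentions can be pictured as a lookup from "which attribute changed" to "update in place or replace". The following is a toy sketch of that idea only; the attribute names in REPLACE_ON_CHANGE and the function plan_change are hypothetical and do not reflect how any particular tool or provider classifies changes.

```python
# Hypothetical set of attributes that cannot be changed on a running resource.
REPLACE_ON_CHANGE = {"image", "zone"}


def plan_change(old, new):
    """Classify a change as 'no-op', 'update-in-place', or 'replace'."""
    changed = {k for k in set(old) | set(new) if old.get(k) != new.get(k)}
    if not changed:
        return "no-op"
    if changed & REPLACE_ON_CHANGE:
        # At least one changed attribute requires building a new resource.
        return "replace"
    return "update-in-place"


if __name__ == "__main__":
    server = {"image": "base-v1", "zone": "a", "tags": {"env": "dev"}}
    retagged = {**server, "tags": {"env": "prod"}}
    reimaged = {**server, "image": "base-v2"}
    print(plan_change(server, retagged))
    print(plan_change(server, reimaged))
```

A tag edit falls through to an in-place update, as in the 30-second example from the conversation, while a new image forces the immutable path.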
Right. Hey, tell us about what you do at HashiCorp. You are also a developer advocate, always good to talk to a fellow professional. What do you focus on there?
I focus on Consul and Vault. So Consul is the service mesh open source tool and Vault is the secrets management open source tool.
Awesome. So you spend your time helping people understand them and help persuade people that it's a good idea to use them if it solves a problem they have.
I think it's a little bit funny to me because, at the end of the day, both of them are in some ways a developer story, more than they're an operations story. So, and that's where it's been.
We talk a lot about operations because, in some ways, a lot of the HashiCorp tools focus on infrastructure automation, but Consul and Vault, more than others, it's effectively a developer story because service mesh is in some ways offering abstractions for observability, offering abstractions for traffic management, traffic shaping, like circuit breaking, retry. Vault is a secrets manager that is offering abstractions for secrets injection. And in both cases, it's an abstraction that takes the functionality away from the application code. And in some ways, you can focus the application code on doing the actual application functionality and not necessarily these operational. I mean, I say that I gesture with quotes as I do this, but operational pieces.
They kind of... Yeah. If you're not watching the YouTube version, you should watch the YouTube version, you can see Rosemary's gesture with quotes. But I mean, those are operational concerns. It makes sense to me that a person like you would be the one who would handle those, and that they would be under the HashiCorp umbrella, which as you said, is all heavily Ops and infrastructure focused.
Because I mean, yeah. Service mess...
We all know what you think about it, now.
Little Freudian slip there. Service mesh is solving it. It is infrastructure and it is solving what I consider to be largely operational concerns. I think an argument can be made. There's a continuum there, and hey, there should be, right?
It's good that we see that continuum.
Yeah, exactly. I will say that it's not as, it's not as... Service mesh is hard. Service mesh is non-obvious. You can generally agree on a definition as like an infrastructure layer that abstracts. It abstracts away sort of the networking, whatever networking you want to talk about, from the application side. But is it operational? Is it more development? No one can fully agree.
Right. Right. No, that's a fuzzy boundary and again, it's appropriate to, I think that some of our tooling and some of our attention is consciously in that fuzzy boundary between those two places.
Yeah. I guess we're going to get many comments about DevOps again, but...
Oh good, yeah.
I guess it's...
Everybody, in the comments, tell us what DevOps means. It's fine. We'll be there. Awesome. Well, hey everybody, if you listen past the end, I'll have a discount code for Rosemary's book. It sounds like if this is a thing that you do, sounds like a really good book. I would encourage everybody to read it. And my guest today has been Rosemary Wang. Rosemary, thanks for being a part of Streaming Audio.
Thank you for having me.
And there you have it. If you're still listening, you get a special discount code for sticking with us all the way to the end. You can use the code PODCON19. That's P-O-D-C-O-N-1-9, to get 40% off all Manning publications in all formats. Just enter PODCON19 during checkout on Manning.com and that 40% off is all yours. Enjoy it. And don't forget to check out Confluent Developer, that's developer.confluent.io to get started learning Apache Kafka.
If you take a free video course on Confluent Developer and sign up to do exercises in Confluent Cloud, you can use the code PODCAST100 to get an extra $100 in free usage. So I hope this podcast was useful to you. And if you want to discuss it more or ask a question, you can always reach out to me on Twitter @TLBerglund. That's T-L-B-E-R-G-L-U-N-D. Or you can leave a comment on the YouTube video or reach out to us in Community Slack or in the Community Forum. There's a sign-up link and a link to the forum in the show notes if you'd like to join. And while you're at it, please subscribe to our YouTube channel and to this podcast, wherever fine podcasts are sold. If you subscribe through Apple Podcasts, please be sure to leave us a review there. That helps other people discover the podcast, which we think is a good thing. Thanks for your support and we'll see you next time.
Managing infrastructure as code (IaC) instead of using manual processes makes it easy to scale systems and minimize errors. Rosemary Wang (Developer Advocate, HashiCorp, and author of “Essential Infrastructure as Code: Patterns and Practices”) is an infrastructure engineer at heart and an aspiring software developer who is passionate about teaching patterns for infrastructure as code to simplify processes for system admins and software engineers familiar with Python, provisioning tools like Terraform, and cloud service providers.
The definition of infrastructure has expanded to include anything that delivers or deploys applications. Infrastructure as software or infrastructure as configuration, according to Rosemary, are ideas grouped behind infrastructure as code: the process of automating infrastructure changes in a codified manner, which also applies to DevOps practices, including version control, continuous integration, continuous delivery, and continuous deployment. Whether you’re using a domain-specific language or a programming language, the practices used to collaborate between you, your team, and your organization are the same—create one application and scale systems.
The ultimate result and benefit of infrastructure as code is automation. Many developers take advantage of managed offerings like Confluent Cloud—fully managed Kafka as a service—to remove the operational burden and configuration layer. Still, as long as complex topologies exist, such as servers on a cloud provider connecting to external databases, there is great value to standardizing infrastructure practices. Rosemary shares four characteristics that every infrastructure system should have:
In addition, Rosemary and Tim discuss updating infrastructure with blue-green deployment techniques, immutable infrastructure, and developer advocacy.
If there's something you want to know about Apache Kafka, Confluent or event streaming, please send us an email with your question and we'll hope to answer it on the next episode of Ask Confluent.