Sunday, February 23, 2020

The Slippery Slope of Docker Dev Environments


Recently, I built up a local development environment that uses Docker for some critical integration test paths. As I put the finishing touches on this work, I realized there were some far-reaching implications that I had not taken into consideration before I started down this road, namely:

  • It required the developer to have docker and docker-compose on their local machine
  • There was a considerable amount of configuration required for the environment to actually work
  • The shell script I wrote to “alleviate” some of these configuration woes succeeded only in obfuscating how the system actually worked
  • The shell script I wrote also ended up being rather myopic—it works great in certain environments, but you’re on your own if you’re working in, say, Windows
  • I spent the better part of a day banging my head against some simple database connectivity issues, only to realize my database container wasn’t configured properly
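As an illustration of that last point, a first diagnostic for this kind of connectivity problem is simply checking whether the database port is reachable from the host at all. The sketch below is a minimal example, not what the original environment used; the function name `wait_for_port` and the host and port in the usage note are assumptions.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only proves the port is reachable.
            # It says nothing about credentials or database configuration.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait briefly and retry.
            time.sleep(interval)
    return False
```

Calling `wait_for_port("localhost", 5432)` would check Postgres's default port, assuming the container publishes it to the host. If this returns False, the problem is likely in the container's port mapping or startup rather than in the application's connection settings.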

With all the time invested in this, the result did end up benefitting my team and ultimately helped with some challenges we were having on integration tests. But what was more interesting to me were the challenges it created, not to mention the spirited discussion about the work in the pull request I submitted before it was ultimately merged.

What is more, this environment eventually served a single purpose—to provide integration test clarity, rather than a holistic development environment, like I had initially hoped. The net result was that we moved this environment away from a developer’s machine, and ended up deploying it in a containerized form on our cloud provider to create an integration test resource.

Silver linings aside, my efforts were pretty much a failure, especially considering my initial motives.

How could I have gone so astray, with all of my hard work resulting in little more than a fancy test environment? I decided to dig in more to this issue of container-based development environments, and what I’ve learned since has dramatically changed how I’ll approach the problem in the future.

The current state of containers

Plenty of surveys assure us that Docker adoption continues to rise, especially as infrastructure grows and becomes more complex. A June 2018 survey by Datadog states that about 25% of companies deploy some form of infrastructure with Docker. Half of those environments are orchestrated by some means, and the size of deployments has grown by 75% from 2017 to 2018. According to these sources, the Docker “revolution” is in full swing, with no sign of slowing or stopping. (I’m still curious what the 75% majority of companies are using for their deployments, but I digress.)

The same 2018 Datadog survey mentions that the most widely used Docker images are “Nginx, Redis, and Postgres”. This makes sense to me, as running containers for dependencies of your application seems to be the first step in containerization. Docker Compose provides a relatively straightforward tool for multi-container applications; it also seems to be a great tool to allow developers to run specific, lower-level infrastructure for their own environments. To wit, you set up a docker-compose.yml file in your project and you’re ready to go.
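To make that concrete, here is a rough sketch of what such a Compose file declares, written as a Python dict and serialized to JSON. Compose can read JSON files because JSON is valid YAML. The service names, image tags, ports, and password below are assumptions chosen for illustration, not anything from a real project.

```python
import json

# Hypothetical stack: service names, image tags, ports, and the
# password are illustrative assumptions.
compose = {
    "version": "3",
    "services": {
        "db": {
            "image": "postgres:12",
            "environment": {"POSTGRES_PASSWORD": "example"},
            "ports": ["5432:5432"],
        },
        "cache": {
            "image": "redis:5",
            "ports": ["6379:6379"],
        },
    },
}


def render_compose(config: dict) -> str:
    """Serialize the config as JSON, which Compose accepts as YAML."""
    return json.dumps(config, indent=2)
```

Writing the output of `render_compose(compose)` to a file named, say, docker-compose.json and running `docker-compose -f docker-compose.json up -d` should start both dependencies. Most projects would write the same structure directly as YAML in docker-compose.yml.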

Of these 25% of companies running Docker in a production environment, just how many are using Docker as a developer tool? The 2019 Stack Overflow Survey reports that 38.4% of those surveyed use containers for development work, but about half of respondents are not using any container technology today. I wondered if there was a way to further understand why developers just aren’t using Docker as much as I initially assumed they were. I decided to dig in a little more and do some cursory research on how developers feel about Docker.

Many developers loathe Docker for their own environments, and with good reason: introducing a container seems to slow feedback cycles between a developer and the environment they’re building on. Containerizing development environments also seems to create an unnecessary abstraction for the operator that needs to be able to dig right into the code, the runtime environment, and even the lower-level operating system, all while they are building a feature.

Advocates of moving to Docker as a developer tool state the benefits are just as telling. A dev environment that uses containers should also result in parity across the development team. If everyone uses containers for their database, cache, or other miscellaneous infrastructure, then getting set up to write code should be as simple as running docker-compose up, and you’ve got a full development environment at your fingertips. Assuming your team is willing to run containers locally, you’ll never create disparity between your development environment and your production environment. No more nasty surprises when running a brew upgrade: the container will always be in lock step with your needs.

Frankly, I empathize with both groups. As a developer with one foot firmly planted in the Operations side of the house, I think the benefits for running containers are massive. I’m not convinced that these benefits cleanly map over to developer workflows, however. I believe the disparity between experiences with Docker in a development environment exists because Docker isn’t a tool for developers. However, I don’t think this means dev teams shouldn’t consider leveraging some of Docker for their own needs. But I feel approaching Docker as yet another operations tool can help alleviate some of the pain of running Docker in a local development environment.

Is the container a developer-friendly abstraction?

A container, according to Docker, is:

a standard unit of software that packages up code and all its dependencies so the application runs quickly and reliably from one computing environment to another

A slightly more perspicuous definition follows:

Containers are an abstraction at the app layer that packages code and dependencies together.

Sounds great, right? Well yes, specifically if you’re trying to deploy code. In other words, if my main concern is getting our code to run just about anywhere with no surprises, the container seems to be the best abstraction yet—I don’t really need to know much about the code that we’re running, rather, I need predictability in how we’re going to run it. As an operator, this is extremely powerful.

As a developer, this abstraction will probably create some headaches though. Containers don’t really concern themselves much with what is running. The operating system is purposefully abstracted away, as are any of the dependencies necessary to run said container. The application layer itself is just as ephemeral as the underlying OS, and accessing these layers of abstraction requires knowledge about the way in which Docker wants to run code.

For all I’ve said about the intended purpose of containers, I think it would be misguided to say that a container is only something for your operations team. I think developers can benefit specifically from containerizing some parts of their environments. The challenge is creating a truly developer-friendly, container-based workflow.

Common pitfalls of a Docker-powered development environment

Ultimately, I choose to use Docker quite a bit for my own development work. It works well for me, especially for the sort of work I do on a day-to-day basis. With my focus on DevOps and SRE here at Test Double, it makes a lot of sense that I’d be working with containers daily.

Naturally, this might not be a good fit for your team. But if you do want to explore using Docker in development environments, I’d call your attention to some common pitfalls I’ve experienced firsthand.

Assuming universal benefit of containerization

Probably the biggest pitfall in going down the road of a containerized development environment is assuming that the benefits you’re getting in deployments are the same benefits a developer will experience locally.

Similarly, assuming that a team wants to work this closely with Docker may not fit your team dynamic. If your developers are relatively siloed from operations work, it’s probably not a safe assumption that they’d like to work with containers locally, every day. But if you’re working on a team where developers are doing some operations work, it could benefit your group, assuming you provide other alternatives to just a containerized development environment.

The reality of running containers locally is that they consume a lot of resources. On my machine alone, in a typical work day, Docker consumes about 36GB of storage. I wouldn’t say this is an egregious amount of system usage for my particular make and model of workstation, but I can easily see how this will grow considerably the more I include containers in my workflow. Docker also consistently tops out in my Activity Monitor for CPU, memory, and disk resources.

What is more, while this might not be a huge amount of drag on my machine, it doesn’t mean it squares well with others’ machines, and ultimately the decision to dedicate so many resources to Docker should be up to the developer’s own personal preference. To wit, your laptop ain’t a server, and it probably shouldn’t require container resources that are built to server standards.

But even beyond system resources, the developer environment just doesn’t square well with whatever system is running your code. In the olden days of yore, this disparity was so great that we often had to tunnel in to an environment that was exactly like our production server, and some of us (sadly, myself included) were even forced to make live code changes on these systems if something went horribly wrong!

Developers need to focus on writing maintainable, reliable, and well-tested code. I’d argue that working in a limited environment reinforces better code practices and decisions—you’re forced to lean on writing clean, maintainable, and actionable code, rather than hoping your server is configured in such a way to deal with performance bottlenecks and inefficient implementations.

The challenge with containerizing something in your stack is to reduce the amount of cognitive overhead required to run the container. Developers already have a lot of context they need to build up and maintain while working. If your local Docker environment breaks this context up with, say, figuring out if a container is running, you’ll only cause frustration in the long term.
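To show how much there is to remember for even that small question, here is a minimal sketch of a "is my container running?" check built on `docker ps`. The helper names are hypothetical, and the container name you pass in depends on how your project names things.

```python
import subprocess
from typing import List


def ps_command(name: str) -> List[str]:
    """Build a `docker ps` call that prints the names of matching running containers."""
    return ["docker", "ps", "--filter", f"name={name}", "--format", "{{.Names}}"]


def is_running(name: str, ps_output: str) -> bool:
    """The name filter matches substrings, so compare against exact names."""
    return name in ps_output.split()


def container_is_running(name: str) -> bool:
    """Run the command for real; raises FileNotFoundError if docker is not installed."""
    result = subprocess.run(ps_command(name), capture_output=True, text=True)
    return result.returncode == 0 and is_running(name, result.stdout)
```

Note the subtlety the exact-match step covers: filtering on `db` would also list a container called `project_db_1`. Details like this are exactly the kind of context a developer has to carry around.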

Similarly, requiring developers to jump through containers with docker run commands can add to the cognitive overhead of context switching. Not only is jumping into a container antithetical to the patterns developers have built up when developing in their own local environments, it requires additional mastery of the docker CLI itself, which creates drag on developers achieving their goals. I also can’t help but feel a little strange when I have to jump into a Docker container to debug something. It smacks of a sin I’ve mentioned before: tunneling into a production server.

This heuristic follows with other services. If you’re working on a team that is all in on microservices, and other teams need various services to build their own features, you’ll want to proceed carefully with providing these images as containers. In these situations, it actually might be more beneficial for the team to stand up the service themselves, in their own environment, than to obfuscate the service with a container.

For these internal services specifically, it may be worth auditing the documentation provided for this service instead of containerizing it. Poorly maintained documentation combined with a container can create enough cognitive dissonance and frustration that developers abandon the Docker environment. I’d also wager that all of us should probably look more carefully at our documentation, and audit it frequently, rather than building something new.

The survey results I’ve mentioned above called out Nginx, Redis, and Postgres as being very popular among teams embracing containers. It’s pretty obvious why these are so popular. Unless your team is in the business of writing their own RDBMS or web server / load balancer, you’ll benefit greatly by leveraging open source applications like these in your stack.

But until containerization, your operations team might not have gotten the same benefit that your development team did when the decision was made to include these in your stack. By containerizing these dependencies, operations teams can benefit from a similar sort of accelerator that not writing your own RDBMS would give your developers.

Containerizing Postgres provides a considerable number of options for deploying, monitoring, and scaling this critical dependency. It also reduces some of the overhead for updates, upgrades, and management of this system. This is all great for operations teams, but does it square with developer needs?

In a word, no. Even with the simplicity of running docker-compose up -d to set up an application stack, it doesn’t work well with the mental model that a vast majority of developers have for running their environment locally. The disparity between the two workflows arises from a fundamental difference.

Developers want to be able to dig in to the abstractions they need, and requiring a team to use a completely new tool, with a completely different method of running their local environment, is a big ask. Specifically, developers need to be able to run database migrations, jump into a database CLI, and track database logs. The same follows for any critical component of your stack.
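For reference, each of those three database tasks has a Compose equivalent. The sketch below just assembles the commands so they can be compared side by side. The service names `db` and `web`, the `bin/migrate` script, and the `postgres` user are all assumptions; substitute whatever your project actually uses.

```python
from typing import Dict, List, Sequence


def db_commands(
    db_service: str = "db",                     # hypothetical Compose service name
    app_service: str = "web",                   # hypothetical Compose service name
    migrate: Sequence[str] = ("bin/migrate",),  # hypothetical migration script
) -> Dict[str, List[str]]:
    """Map each everyday database task to a docker-compose command."""
    return {
        # Run migrations in a one-off app container, removed afterwards.
        "migrate": ["docker-compose", "run", "--rm", app_service, *migrate],
        # Open psql inside the already-running database container.
        "cli": ["docker-compose", "exec", db_service, "psql", "-U", "postgres"],
        # Follow the database logs.
        "logs": ["docker-compose", "logs", "-f", db_service],
    }
```

Printing `" ".join(db_commands()["cli"])` gives `docker-compose exec db psql -U postgres`, which is noticeably more to type and remember than a bare `psql`. That gap is the mental-model mismatch in miniature.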

That’s not to say that there aren’t developers happily using Docker locally for all sorts of things. But I’d wager the developers using Docker have figured out how to make it work seamlessly in their workflow, whether they’ve just made a habitual change or even if they’ve scripted things to better work with their own vision of their environment.

Novel scripting around container commands

Speaking of scripting, if you’re building a local environment like this, and you’re trying to reduce the overhead on developers having to docker run various things, your initial impulse (like mine) might be to script away many of the rote tasks associated with container-based work.

This predilection isn’t necessarily misguided. After all, we’re taught to script away redundant tasks so we can focus on less repetitive work. But beware of scripting away too much, especially if your team is relatively new to the container space. If this is going to be adopted by many developers on various teams, it’s worth considering how motivated and interested your team is in maintaining a bespoke bash script for running containers in their own local environment.

I mentioned that there is overhead in working with the docker CLI, but with a team of developers motivated to work with containers, it might make more sense to provide them time and training on using Docker’s own tool, rather than potentially obfuscating the inner workings of Docker itself. It is indeed a tool that must be mastered, but by avoiding novel scripting around these commands, you reduce the cognitive strain of troubleshooting and encourage a wider understanding of the tool itself.

To emphasize my greatest concern on bespoke container wrappers, novel scripting around containers adds more code, and thus more things to maintain. It’s on your team to own this overhead. If the group determines that it is a net benefit for their workflows, I say go forth. The pitfall we want to avoid though is an abandoned, bespoke shell script that only serves as a potential time sink for those trying to work in your environment, especially during a developer onboarding process.
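If a team does decide a wrapper is worth owning, one way to keep it from hiding Docker is to have it print every command before running it. The following is only a sketch of that idea under my own assumptions; `run_visible` is a hypothetical helper, not part of any Docker tooling.

```python
import shlex
import subprocess
import sys
from typing import List


def run_visible(cmd: List[str], dry_run: bool = False) -> int:
    """Echo the exact command to stderr, then run it and return its exit code."""
    # Quoting each part means the echoed line can be copied into a shell as-is.
    print("+ " + " ".join(shlex.quote(part) for part in cmd), file=sys.stderr)
    if dry_run:
        # Show what would run without touching Docker at all.
        return 0
    return subprocess.run(cmd).returncode
```

With this, `run_visible(["docker-compose", "up", "-d"])` shows `+ docker-compose up -d` before doing anything, so the wrapper teaches the underlying CLI instead of replacing it. The `dry_run` flag is there for onboarding, so someone new can see what the wrapper would do without running it.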

Dockerizing with no alternatives

If you are interested in moving to a containerized development environment, I think you should spend a considerable amount of time building easy alternatives for running systems locally. In other words, don’t assume that using Docker in a local environment is an all-or-nothing decision. Document the steps for setting up your application as a standard local deployment, and provide an alternative for those interested in running containers. Let your team determine if this workflow fits.
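A setup script can reflect the same principle by treating containers as optional. This is a small sketch under my own assumptions: the setup labels `local` and `containers` are hypothetical, and the `which` parameter exists only so the lookup can be swapped out in tests.

```python
import shutil
from typing import Callable, List, Optional


def available_setups(which: Callable[[str], Optional[str]] = shutil.which) -> List[str]:
    """List the documented ways to run the project on this machine."""
    # The non-container path is always offered, so nobody is forced onto Docker.
    setups = ["local"]
    # Only offer the container path when both tools are actually installed.
    if which("docker") and which("docker-compose"):
        setups.append("containers")
    return setups
```

On a machine without Docker this returns only `["local"]`, and the developer is pointed at the standard setup instructions rather than a failing container command.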

To sum up, going all in on containers without understanding the trade-offs for developer environments will make life difficult for your dev team.

The End

There is no right way to set up a developer environment. While encouraging your team to embrace containerization might provide some added benefit, the ultimate goal is to improve developer workflow. If we sacrifice the ability to write excellent code in favor of a technical abstraction, we’re creating inefficiencies.

As with anything, the ultimate decision should be made as a team choice, and no environment, whether traditional or containerized, should be retained because of personal preference. Work with your team to identify the pain points in developing locally, and you’ll find the right balance.



