Kubernetes Migration – Moving on to Day 0

In this multi-part series, we are taking a deep dive into everything from the technical review and assessment required to make the right decision about a substantial infrastructure migration, through to the practical, hands-on work of migrating from legacy systems to Kubernetes and cloud-native environments. Our first post laid out the high-level checklist and guide for migrating to Kubernetes; next, we dove into Day -1 considerations, focusing on the fundamentals of properly assessing and planning the migration.

This post sits roughly between Steps 2 and 3 of our step-by-step guide for migrating to Kubernetes (Post #1 in the series). It will help you understand, in a little more depth, everything you need to line up on Day 0 in order to get your systems and stacks ready to support cloud-native architecture.

Day 0 – How Your Technology Stack Will Impact Success

In our previous post, we discussed what you need to think about before you even decide to migrate to Kubernetes, and what will define your migration strategy’s success. Once you have decided that the migration is the right decision for your organization, Day 0 is about getting all of your ducks in a row and making the right technology decisions to support the move. In this section, we’ll look at the two critical technical factors you need to consider and evaluate as part of your project roadmap and scope, because they will impact the success of your migration: your general tooling stack and security.

Cloud Native Tooling

As noted in previous posts, it often surprises organizations that their existing tooling is simply not well-equipped to support cloud-native operations. Deciding to migrate to cloud-native infrastructure will directly impact both your existing tools and your engineering workflows. This includes aligning with best practices in continuous integration and deployment (CI/CD), integrating the monitoring and observability tools best suited to your systems, and leveling up security to what your cloud-native systems require.

Replacing legacy tooling may not be critical in your first phase of migration if you choose to migrate simple, non-mission-critical applications initially. However, once you decide to migrate applications that touch customer data or directly impact client operations, you will need to level up your tooling to deliver sufficient coverage. This is true for both operations and security.

In addition, keep in mind that Kubernetes is an ecosystem, not just a tool or standalone platform. Kubernetes itself was, from its conception, built to be lean, and much of its complementary functionality and support comes from peripheral tooling built natively to work with the platform, such as KEDA, Argo CD, and Flux, to name a few. Once you are fully onboarded to Kubernetes, it will be hard to run your systems over the long term without adopting the native tooling that supports it.
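
To make the ecosystem point concrete, here is a minimal sketch of a KEDA ScaledObject. The workload name, Prometheus address, and query are hypothetical placeholders, and it assumes KEDA is already installed in the cluster; the point is that this peripheral tooling plugs into Kubernetes through its own custom resources rather than sitting outside the platform:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: queue-worker-scaler          # hypothetical name
spec:
  scaleTargetRef:
    name: queue-worker               # the Deployment KEDA will scale up and down
  minReplicaCount: 0
  maxReplicaCount: 20
  triggers:
    - type: prometheus               # scale on an application metric instead of raw CPU
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        query: 'sum(rate(http_requests_total{app="queue-worker"}[2m]))'
        threshold: "100"
```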

Real-World Cloud Native Application Examples 

Below are some practical examples that will help you understand how and why the tooling really matters, and which decisions you’ll need to make at which crossroads to best manage and maintain your cloud-native operations in the long term. We emphasized this in our previous posts, and we will stress it again: the deployment is eventually the easy part (and honestly, it ain’t that easy). The Day 2 operations of managing your systems over the long term will be the greatest lift in your migration journey, so don’t underestimate how much the best-suited tooling can support you along the way.

CI/CD

With regular cloud operations, many organizations got by using their CI tools for deployment (CD) as well, and this worked just fine for those systems. However, CI/CD in cloud-native operations looks very different from typical cloud ops.

With legacy CI tools, the modus operandi is largely running pipelines with build and test suites; once those pass, the builds are either deployed to production or staging, or promoted to the next release candidate, in the “push methodology”. This works fine in the world of the cloud.

This differs from cloud-native operations, where CI and CD are decoupled for better governance and deployment across many clusters. The push methodology tends to break down in multi-cluster scenarios, and tools like Argo CD have been built to provide the more relevant “pull methodology” that is better suited to cloud-native systems.

This means that instead of pushing your changes once they are approved, a cloud-native CD tool essentially tracks your repository as the single source of truth, in the GitOps approach, and ensures that your deployments have not drifted from your repository’s configuration before deploying. In addition, deployment is better optimized for multi-cluster rollouts: changes are rolled out in small batches, gradually by cluster, and you receive feedback on whether each batch succeeded. Once everything looks good, the rest of the deployment is rolled out to all of your clusters, so that deployments don’t break down at scale.
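
As a rough sketch of what this looks like in practice with Argo CD (the repository URL, path, and names below are hypothetical placeholders, not a prescription), an Application resource points at a Git repository and keeps the cluster reconciled against it, pruning removed resources and self-healing when the live state drifts from what is in Git:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: payments-service                                        # hypothetical application name
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/deploy-configs.git  # Git as the single source of truth
    targetRevision: main
    path: payments/overlays/production
  destination:
    server: https://kubernetes.default.svc                      # the cluster (one of possibly many) to reconcile
    namespace: payments
  syncPolicy:
    automated:
      prune: true      # remove resources that were deleted from Git
      selfHeal: true   # revert manual changes that drift from the repo
```

Extending this pattern to many clusters is then a matter of defining one such Application per cluster (or generating them with an ApplicationSet), rather than scripting pushes from your CI pipelines.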

Legacy tooling simply does not support the cloud-native pull methodology well, and this is something to weigh when deciding how much you’d like to evolve your systems and workflows versus settling for a mild upgrade (a refactor vs. a full re-architecture, including the tooling to support it).

Monitoring and Observability

One of the areas that requires explicit attention when migrating to Kubernetes is monitoring and observability. It provides the lifeline and critical visibility into the layers of complexity that a microservices architecture and new delivery paradigms introduce. Like CI/CD tooling, you won’t necessarily need to change these tools all on day one; but if you eventually want the level of monitoring this new, more complex system requires, observability is the hub through which you’ll see where its moving parts break down.

Cloud-native observability will be the building blocks upon which your monitoring is built, and legacy systems, despite claiming to, don’t readily support what cloud-native stacks require. Observability will ultimately be the window into your availability, and ensuring it is actually native to your stack will be the difference between uptime and costly downtime. Many tools will claim they are cloud native, but to properly support cloud-native observability there are some critical capabilities your tooling needs: the ability to integrate with your newly adopted cloud-native tooling and workflows, correlation across the layers that comprise your observability (metrics, logs & traces), and easy scalability. Legacy monitoring agents not optimized for cloud-native systems largely run scripts and push their output into collection-based tooling, which will not give you sufficient visibility into your multi-layer microservices systems.
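
As a rough illustration of what “native to your stack” means (assuming the Prometheus Operator is installed and your workload exposes a named metrics port; the names here are hypothetical), a ServiceMonitor lets the monitoring stack discover and scrape workloads declaratively, by label, rather than relying on hand-maintained agent scripts:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: payments-api            # hypothetical workload name
  namespace: monitoring
  labels:
    release: prometheus         # must match the operator's serviceMonitorSelector
spec:
  selector:
    matchLabels:
      app: payments-api         # scrape any Service carrying this label
  namespaceSelector:
    matchNames:
      - payments
  endpoints:
    - port: metrics             # the named Service port exposing /metrics
      interval: 30s
```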

Security

As we’ve learned over the decades since the cloud was introduced, each platform and infrastructure comes with its own highly specific security considerations, and Kubernetes is no different. The security required for running Linux-based virtual machines is radically different from the security considerations of running Kubernetes, despite both being Linux-based. Both the configuration and infrastructure layers are distributed differently, and if you don’t have applications running in containers today, you may not yet have encountered the vulnerabilities associated with image streams, which are a common source of risk for cloud-native stacks.

Many times, your entire build process will change, from how you build your images to the way they are ultimately shipped to production. This too comes with myriad security considerations, not just at the code layer, in how you build your applications, but also in the delivery process, in how that code reaches and runs in production. While incumbent security platforms have all made acquisitions and built added layers of functionality to support the brave new cloud-native world, emerging cloud-native security tooling is likely better suited and will deliver greater, more customized security for your cloud-native environments.
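
To make some of this concrete, here is a minimal, hypothetical Deployment fragment (the names, registry, and digest are placeholders) showing two controls that typically change when you move to containers: pinning images by digest so that the artifact your pipeline built is exactly what ships, and a restrictive security context governing how it runs:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-api                  # hypothetical workload name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: orders-api
  template:
    metadata:
      labels:
        app: orders-api
    spec:
      containers:
        - name: orders-api
          # pinned by digest (placeholder shown) so the image cannot silently change under a mutable tag
          image: registry.example.com/orders-api@sha256:<digest>
          securityContext:
            runAsNonRoot: true
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true
            capabilities:
              drop: ["ALL"]
```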

As noted before, if you are only starting with internal services, that’s one thing; but the moment you intend to start delivering customer-related data and services in your cloud-native systems, you will need more robust security in place to prevent breaches and data leakage.
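
As one small example of what “more robust” can mean in practice, a default-deny NetworkPolicy (assuming your cluster’s network plugin enforces NetworkPolicy; the namespace here is a hypothetical one holding sensitive workloads) ensures that nothing talks to those workloads, or out of them, until you explicitly allow it:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: customer-data        # hypothetical namespace for customer-facing workloads
spec:
  podSelector: {}                 # applies to every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
  # no ingress or egress rules are listed, so all traffic is denied until
  # explicit allow policies are added for the flows you actually need
```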

Day 1 and Day 2 of Kubernetes Migration – Coming Next

In this post, we dove into the tooling, workflows, and processes that support a successful migration to Kubernetes. While not everything needs to be done on Day 1, these are good practices for long-term Kubernetes maintenance and management, which will ultimately be the biggest lift for your organization.

In our next posts, we will provide a definitive list of all of the Kubernetes configurations, setup, common issues, and challenges, to help dispel some of the FUD associated with migrating to Kubernetes. We’ll also learn why Kubernetes applications break, and how you’ll need to re-architect legacy applications in order to port them to your newly minted cloud-native architecture. All of this and more is coming in this ongoing series of posts to help you onboard to Kubernetes with as little pain as possible.