Kubernetes Migration from Day Minus One (-1) to Day Two (2)

Kubernetes is now well past the hyped-up buzzword stage and has become nearly the de facto platform for microservices, enabling the flexibility and scalability modern engineering organizations require. It’s no surprise, then, that many organizations still running on legacy platforms are exploring how to migrate to cloud-native platforms like Kubernetes.

In this series, we will take a look at Kubernetes migration from ideation, through assessment, to practical implementation. Our previous post provided a high-level checklist and guide for migrating to Kubernetes, compiled from supporting numerous clients on their journeys toward their cloud-native migration goals. In this post, we’re going to dig a little deeper and focus on the assessment and planning of the migration.

This post starts with Day -1 (yes, minus one) and focuses on everything you need to think about before you even get started and make the choice to migrate to K8s. It covers how to make the right decisions, both those that will impact your short-term goal of migrating to a new platform (Day 0) and those that will support the migration in the long term, the day after (Day 1 and Day 2 operations).

Day Minus One (-1) – Preparing to Migrate to Kubernetes

Like all strategic business projects, the very first thing we need to think about when choosing to migrate to Kubernetes is the goals and outcomes for which we are optimizing. In our previous post, we mentioned many of the challenges that become blockers for migration projects to newer platforms and cloud-native technology.

These are often a direct by-product of not having an adequate definition of what success looks like. When we define the outcomes and goals of the migration project, not only will we be able to scope the project better (which provides a better-defined project roadmap and milestones), but we will also arrive at a much more realistic timeline for completion.

Defining Success 

When you make the decision to migrate to Kubernetes, the first question you’ll need to ask yourselves is: what is the purpose of this migration? For some it’s cost optimization; for others it’s the modernization of the infrastructure (formerly known by the favorite corporate buzzword “digital transformation”). Each of these outcomes will define a different course of action, and with it a different scope, project roadmap, and ultimately timeframe for delivery.

Be very intentional and explicit about whether you are just upgrading existing platforms to a better and more performant 2.0 version, or doing a complete hard reset of everything you currently have. There is also a middle ground: starting small and gradually growing as greater expertise is achieved.

Each of these choices has its own considerations in terms of budget, learning curve, adoption for your engineering teams, transitions, and more.  

Pro Tip: From both a technical and a business perspective, a good practice is to focus first on the services that would benefit enough from migration, as measured by any of the KPIs you defined as success metrics for your end goal, including timeframes, the applications to migrate, and budget considerations. Some prefer to first migrate new applications that drive company innovation, while others prefer to start with non-mission-critical applications that carry little risk but have proven value to the organization.
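
One way to make this prioritization concrete is a simple scoring pass over the candidate applications. The sketch below is illustrative only: the `Candidate` fields, the 1 to 5 scales, the application names, and the "value minus weighted risk" score are all assumptions made for the example, not something prescribed in this series. You would substitute the KPIs you defined as your own success metrics.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical record for one application being considered for migration."""
    name: str
    business_value: int      # assumed scale: 1 (low) to 5 (high)
    risk: int                # assumed scale: 1 (low) to 5 (high)
    mission_critical: bool


def rank_candidates(candidates, risk_weight=1.0, allow_mission_critical=True):
    """Order candidates so that high-value, low-risk applications come first.

    The score (value minus weighted risk) is an illustrative assumption,
    not a standard formula.
    """
    eligible = [
        c for c in candidates
        if allow_mission_critical or not c.mission_critical
    ]
    return sorted(
        eligible,
        key=lambda c: c.business_value - risk_weight * c.risk,
        reverse=True,
    )


# Hypothetical portfolio, used only to show the call.
apps = [
    Candidate("billing", business_value=5, risk=5, mission_critical=True),
    Candidate("internal-wiki", business_value=3, risk=1, mission_critical=False),
    Candidate("new-recommendations", business_value=5, risk=2, mission_critical=False),
]

for app in rank_candidates(apps, allow_mission_critical=False):
    print(app.name)
```

Setting `allow_mission_critical=False` models the more cautious of the two approaches described above, where mission-critical applications are left out of the first phase; raising `risk_weight` pushes riskier applications further down the list.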

Reviewing the Existing Tech Stack

Understanding your technology starting point will be a critical factor in understanding the scope and complexity of your migration process. If we take an example from a company that we helped on their migration journey, both their monitoring tools and CI/CD tools were outdated and not the optimal fit for a modernized cloud-native stack.  

This means that either migrating to new tooling becomes part of the process, or the starting point will be limited to completely new (greenfield) applications, which sometimes provides less immediate business value. We need to understand how much tolerance our organization has for change and, likewise, the budget at our disposal to complete the migration successfully. These will all factor into the technology decisions in our migration process.

This is also the point at which you need to assess the level of work you need to invest to reach your cloud-native migration end goal; in other words, how far from your goal you are today. Are you running containerized applications, or are you running typical apps on virtual machines or servers? Do you have any kind of orchestration in place, or will this be your first foray into this realm? If we dig a little deeper into the second step of our previous post, this would be the point at which you understand whether your migration will be a lift-and-shift, a refactoring, or a full re-architecture.

A word on homegrown tooling, which many organizations have built and to which teams are accustomed: it also adds a layer of complexity to the migration process, which needs to be factored into decision-making, scope, and timelines.

All of this together will factor into the scope and timelines of your migration project, and will help you have a more realistic understanding of what you need to invest from an economic and time perspective to achieve the migration goals.

Pro Tip: The hard truth is that cloud native technology requires much more expertise and dedicated tooling than other infrastructure or platforms, and this should be top of mind when making the decision to migrate to cloud native stacks. Plan the migration to new tooling as carefully as you would plan the migration to your new infrastructure; you can time it around license expirations to maximize cost benefits. The applications you choose to migrate will dictate what tooling is required and should be budgeted for Day 1, and likewise what can wait until Day 2, when greater system stability and in-house knowledge and expertise have accrued.

Evaluating the Knowledge Gaps

One factor that is often overlooked or underestimated is the current knowledge of engineers and the knowledge gaps that need to be overcome for a successful migration.  With talented engineers on our teams, we often underestimate the learning curve in a very Dunning-Kruger-esque fashion, and this too can impact our outcomes and timelines.  

Some of the considerations you need to think about when making the decision to adopt cloud native infrastructure are the number of engineers you have on your team today and their appetite for learning a new platform like Kubernetes, which has its own adoption challenges. Another factor that is often overlooked is how much it will cost, from a learning and budget perspective, to bring in a completely new tool. Many new tools? While learning a new platform? None of these factors should be underestimated when defining your timelines and budgets.

An additional consideration is what your community and ecosystem look like. Do you have advisors or external technical staff who can support such a migration? For example, one of our clients (a large enterprise) was wooed by all of the big cloud providers, and the winner was the provider that committed resources to helping facilitate the migration to its platform. This is a great way to gain additional resources and expertise in a complex process like an infrastructure migration.

Pro Tip: All new tools and platforms come with some level of complexity and an adoption curve. Factor the learning curve into your migration timeline to ensure that, the day after the migration, your teams will have the skills required to manage your new systems, which will provide greater confidence in the outcomes. Don’t discount the importance of advisors and allies on the journey, who can not only help you level up your skill set but also provide a safety net during the migration.

Next Up – Day 0, Day 1, and Day 2 of Kubernetes Migration

All of this together will factor into your final migration roadmap, what success looks like, which applications make sense to migrate in phase one, the tooling and stacks to support a successful migration, as well as the in-house knowledge you currently have and need to level up to be able to see the project through successfully.  These all matter equally and should not be overlooked in terms of time and resource management when defining realistic deadlines.

In this post, we focused on those critical crossroads of decisions that need to be made prior to even starting your migration, and the practical considerations regarding your in-house knowledge, tooling, systems, and how you define success for your migration project. Many of these will factor into how large or small the scope of your migration is, depending on the fundamental changes you may need to make around tooling, and the learning curve your team may need to overcome before you can adopt new cloud native infrastructure.  

We recommend leveraging the community and ecosystem, and all of the resources at your disposal, to ensure the success of the migration. Surrounding your team with support and advisors will provide a measure of safety and confidence in high-stress processes like large infrastructure migration projects.

Pro Tip: Good KPIs and benchmarks for identifying success include timeline-related benchmarks, such as a target number of applications to migrate by a target date, as well as cost-related benchmarks, such as timing tooling migrations around existing license expirations (which combines both time and cost factors). These types of concrete KPIs provide practical metrics to help define the success of the overall migration project.
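
To show how such KPIs might be tracked in practice, here is a minimal sketch. Everything in it is an assumption made for illustration: the `MigrationKpi` structure, the straight-line pacing model, and the tool names and dates are hypothetical, and a real project would likely pace its milestones unevenly.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class MigrationKpi:
    """Hypothetical timeline KPI: migrate `target_apps` applications by `target_date`."""
    target_apps: int
    start_date: date
    target_date: date


def is_on_track(kpi, migrated_apps, today):
    """Compare actual progress with a straight-line plan between the two dates.

    Linear pacing is a simplifying assumption, not a recommendation.
    """
    total_days = (kpi.target_date - kpi.start_date).days
    elapsed_days = (today - kpi.start_date).days
    if total_days <= 0:
        expected_fraction = 1.0  # degenerate window: the full target is already due
    else:
        # Clamp to [0, 1] so dates outside the window behave sensibly.
        expected_fraction = min(max(elapsed_days / total_days, 0.0), 1.0)
    return migrated_apps >= kpi.target_apps * expected_fraction


def tools_expiring_before(license_expirations, deadline):
    """Return tool names whose licenses expire on or before `deadline`, soonest first."""
    due = [
        (expiry, tool)
        for tool, expiry in license_expirations.items()
        if expiry <= deadline
    ]
    return [tool for expiry, tool in sorted(due)]


# Hypothetical figures, used only to show the calls.
kpi = MigrationKpi(target_apps=20, start_date=date(2024, 1, 1), target_date=date(2024, 12, 31))
licenses = {
    "legacy-ci": date(2024, 9, 30),
    "legacy-monitoring": date(2024, 6, 30),
    "apm": date(2025, 3, 1),
}

print(is_on_track(kpi, migrated_apps=10, today=date(2024, 7, 1)))
print(tools_expiring_before(licenses, deadline=date(2024, 12, 31)))
```

The first function covers the timeline benchmark (applications migrated by a target date), and the second covers the license-expiration benchmark by listing which tools come up for renewal inside the migration window.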

In our next post, we’ll dive into Day 0 considerations for migrating to Kubernetes, once you have made the decision to migrate and are ramping up toward actually making the move. We will review the technology stacks you need to line up ahead of time, and those that can wait until your systems have stabilized. We will cover everything from CI/CD to monitoring, observability, and security, alongside real-world examples to help you make the right technology decisions for your stacks.