Kubernetes provides an abundance of benefits, but anyone using it knows that running the platform independently takes significant effort and skill. Rather than shouldering that burden alone, organizations can pay for a managed Kubernetes service instead.
This is where Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS), and Amazon Elastic Kubernetes Service (EKS) come in. GKE, AKS, and EKS are the three leading managed Kubernetes services that enable organizations to outsource their Kubernetes (K8s) needs to a third-party vendor that takes responsibility for setting up, maintaining, and upgrading Kubernetes.
The need for managed Kubernetes
Kubernetes is a popular container orchestration platform that provides a rich set of features, including self-recovery, workload management, batch execution, and progressive application deployment.
The main advantage of Kubernetes is that it enables organizations to automate container orchestration tasks to ensure effectiveness and developer productivity. However, automating these tasks requires a lot of work, and the Kubernetes learning curve is steep. A managed Kubernetes service helps organizations set up and operate their Kubernetes workloads.
Managed Kubernetes benefits
A managed Kubernetes vendor may offer various services, such as hosting infrastructure with pre-configured environments, full Kubernetes hosting and operations, and dedicated support. The vendor does much (or all) of the grunt work, including configurations, and may also guide their customers through the decision-making process.
Once the initial setup is operational, a managed Kubernetes vendor provides tools to automate routine processes, including scaling, updates, monitoring, and load-balancing. Managed Kubernetes vendors that offer a hosting platform typically manage the underlying infrastructure, including configuration and maintenance.
This is part of our series of guides about Kubernetes troubleshooting.
What Is GKE?
GKE is an orchestration and management system for Docker containers and container clusters running on public Google Cloud services. GKE is based on Kubernetes, which was initially developed by Google and later released as an open source project.
GKE employs Kubernetes to manage clusters, ensuring organizations can easily deploy clusters using features like pre-configured workload settings and auto-scaling. GKE does most of the cluster configuration, enabling organizations to use regular Kubernetes commands to deploy and manage applications, set up policies, and monitor workloads.
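As a sketch of the workflow described above, the following commands create a GKE cluster and then deploy an application with regular Kubernetes tooling. The cluster name, zone, and image are placeholders chosen for illustration:

```shell
# Create a GKE cluster (cluster name and zone are placeholders)
gcloud container clusters create demo-cluster --zone us-central1-a --num-nodes 2

# Fetch credentials so kubectl can talk to the new cluster
gcloud container clusters get-credentials demo-cluster --zone us-central1-a

# Deploy and expose an application using standard Kubernetes commands
kubectl create deployment hello-web --image=nginx
kubectl expose deployment hello-web --type=LoadBalancer --port=80
```

GKE handles the node provisioning and control plane setup behind these commands; everything after `get-credentials` is plain Kubernetes.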
What Is AKS?
AKS manages hosted Kubernetes environments and provides capabilities that simplify the deployment and management of containerized applications in the Azure cloud.
The AKS environment includes many features, including automated updates, easy scaling, and self-healing. AKS manages the Kubernetes cluster master for free; organizations manage the agent nodes in their cluster and are billed only for the VMs those nodes run on.
You can create a cluster using the Azure CLI or the Azure portal. Once you create a cluster, you can use Azure Resource Manager templates to automate Kubernetes cluster creation. These templates let you specify various aspects, including networking, monitoring, and Azure Active Directory (AD) integration. AKS uses these specs when automating cluster deployment.
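For example, a minimal AKS cluster can be created with the Azure CLI. The resource group name, cluster name, and location below are placeholders:

```shell
# Create a resource group to hold the cluster (name and location are placeholders)
az group create --name demoResourceGroup --location eastus

# Create the AKS cluster; Azure manages the control plane
az aks create --resource-group demoResourceGroup --name demoAKSCluster \
  --node-count 2 --generate-ssh-keys

# Merge the cluster's credentials into your local kubeconfig
az aks get-credentials --resource-group demoResourceGroup --name demoAKSCluster
```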
What Is Amazon EKS?
Amazon EKS enables organizations to easily run Kubernetes on-premises and in the AWS cloud. Amazon EKS is certified Kubernetes-conformant, ensuring that existing applications running on upstream Kubernetes are compatible with Amazon EKS.
Amazon EKS automatically manages the scalability and availability of the Kubernetes control plane nodes, which are in charge of key tasks like scheduling containers, storing cluster data, and managing application availability.
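A common way to stand up an EKS cluster with its managed control plane is the eksctl CLI, which provisions the control plane and a worker node group in one step. The cluster name and region below are placeholders:

```shell
# Create an EKS cluster with a managed control plane and two worker nodes
eksctl create cluster --name demo-cluster --region us-east-1 --nodes 2
```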
The Showdown: GKE vs. AKS vs. EKS
Updates and Upgrades
- GKE provides the most flexibility, supporting many Kubernetes versions – four minor versions and 12 versions in total, from 1.14 through 1.17. It provides automated upgrades for nodes and the control plane, detects and repairs unhealthy nodes, and offers release channels that automatically test new versions.
- AKS quickly updates to support the latest Kubernetes versions and also provides support for minor patches. AKS utilizes a structured approach to supported versions, encouraging customers to update old versions so they can take advantage of new Kubernetes features. It also offers automated upgrades for nodes and release channels.
- EKS supports the same number of minor versions, but only four versions are available in total. The main advantage of EKS is that it continues to support version 1.15, the most commonly used Kubernetes version in production at the time of writing.
- Both AKS and EKS require some manual work for upgrades, for example, when upgrading the Kubernetes control plane.
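As an illustration of the manual upgrade steps mentioned above, these Azure CLI commands check which versions an AKS control plane can move to and then apply an upgrade. The resource names and version number are placeholders:

```shell
# List the Kubernetes versions this cluster can upgrade to
az aks get-upgrades --resource-group demoResourceGroup --name demoAKSCluster --output table

# Upgrade the cluster to a specific version (placeholder version number)
az aks upgrade --resource-group demoResourceGroup --name demoAKSCluster \
  --kubernetes-version 1.17.9
```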
Availability
- Google Cloud is available in 25 regions and 77 zones, with plans to expand the network to more regions.
- Microsoft Azure has data centers in more regions than any other cloud provider (more than 60, including regions in Africa and India), and Azure services are available in 140 countries.
- AWS is not far behind Azure with 84 Availability Zones (AZ) in 26 regions around the world, with plans to add 24 more AZs and eight more regions in Canada, Australia, Israel, India, New Zealand, Switzerland, United Arab Emirates (UAE), and Spain.
Service Level Agreements
A service level agreement (SLA) is a contract between a vendor and customers that specifies the services provided by the vendor. Cloud providers offer different SLAs that guarantee uptimes based on the vendor’s availability zones and regions.
Uptime SLAs offered by the three providers:
- GKE splits its managed Kubernetes SLA by deployment type, offering 99.95% uptime for regional deployments and 99.5% for zonal deployments.
- AKS guarantees 99.95% uptime when availability zones are enabled and 99.9% when they are disabled.
- EKS provides a 99.95% uptime SLA.
Pricing
All three providers offer a managed version of the Kubernetes control plane, which manages infrastructure and performs essential processes required to run Kubernetes worker nodes. The key difference relates to pricing:
- GKE initially offered the control plane for free, but now charges a fee of $0.10/hour per cluster.
- AKS offers the Kubernetes control plane free.
- EKS, like GKE, charges $0.10/hour per cluster for the control plane.
Except for the specific charges for the Kubernetes control plane (see the section above), none of the three providers charges extra for the managed Kubernetes service itself. Instead, users pay for the cloud resources used by their Kubernetes clusters, such as cloud instances/VMs, virtual private clouds (VPCs), and data transfer, according to each cloud provider’s regular pricing.
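To put the control plane fee in perspective, a quick back-of-the-envelope calculation (assuming a 730-hour month) shows what a single GKE or EKS cluster's control plane costs per month:

```shell
# $0.10 per cluster-hour x ~730 hours in a month (AKS charges $0 for the control plane)
awk 'BEGIN { printf "USD %.2f\n", 0.10 * 730 }'
```

This prints `USD 73.00`, i.e. roughly $73 per month per cluster before any worker node or networking costs.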
Auto Scaling
Kubernetes can seamlessly scale nodes, ensuring the cluster can optimally use resources. This feature helps save time and reduce costs, automatically provisioning the appropriate amount of resources for each workload.
- GKE offers a highly reliable auto scaling solution. It lets users specify the desired VM size and the required number of nodes within a node pool. Google Cloud uses these instructions to automate the process.
- AKS offers auto-scaling based on the Kubernetes Cluster Autoscaler. It identifies pods that could not be scheduled to nodes, and automatically scales the number of nodes to accommodate them. Users can customize cluster scaling settings.
- EKS manages auto scaling via the Kubernetes Cluster Autoscaler and the Karpenter open source autoscaling project.
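For instance, the node-pool bounds that GKE's auto scaling works within can be set at cluster creation time. The cluster name, zone, and node counts below are placeholders:

```shell
# Create a GKE cluster whose node pool scales between 1 and 5 nodes
gcloud container clusters create demo-cluster --zone us-central1-a \
  --num-nodes 3 --enable-autoscaling --min-nodes 1 --max-nodes 5
```

Google Cloud then adds or removes nodes within that range as pod demand changes.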
Operating Systems
All three solutions support common operating systems including Windows and Linux. In addition:
- GKE provides its container-optimized operating system (COS), a simplified and hardened Linux version that enables quicker container deployment and scaling.
- EKS provides Bottlerocket, Amazon’s container-optimized OS, which runs containers using containerd rather than the standard Docker engine.
Bare Metal Clusters
A bare metal cluster is deployed on a cloud architecture without a virtualization layer (VMs). It helps reduce infrastructure overhead significantly and provides application deployments with access to more storage and computing resources. As a result, it increases the overall computing power, helping reduce downtime and latency for application requests.
Here is how the three providers handle bare metal clusters:
- Google offers Google Anthos, which lets customers run managed Kubernetes clusters on local bare metal infrastructure.
- Azure does not offer a managed bare metal solution.
- Amazon offers EKS Anywhere, which lets customers run a local version of EKS on their own bare metal servers.
Container Image Services
Each cloud vendor offers its own container image service, integrated with its respective managed Kubernetes service:
- Google shifted from its Google Container Registry service to Artifact Registry, which supports more image formats. Artifact Registry offers various features, including integrated image scanning and binary authorization to enhance security.
- Azure offers the Azure Container Registry (ACR), a paid registry with features like image scanning, image signing, immutable tags, and a financially backed SLA. Azure bills this service according to daily usage rates based on the amount of required storage but does not charge customers for network bandwidth. Premium plans also offer geo-redundancy.
- Amazon offers Elastic Container Registry (ECR), a paid service that includes similar features to ACR – a financially-backed SLA, image scanning, and immutable image tags. ECR is geo-redundant by default, offering cross-region and cross-account support that eliminates the need to manage redundancy across zones.
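As a sketch of how these registries integrate with standard Docker tooling, the following commands push a locally built image to ECR. The account ID, region, repository, and image names are placeholders:

```shell
# Authenticate Docker to ECR (account ID and region are placeholders)
aws ecr get-login-password --region us-east-1 | docker login \
  --username AWS --password-stdin 123456789012.dkr.ecr.us-east-1.amazonaws.com

# Tag and push a locally built image to an ECR repository
docker tag my-app:latest 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
docker push 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-app:latest
```

The equivalent workflows for ACR and Artifact Registry follow the same pattern: authenticate the Docker client, tag the image with the registry hostname, and push.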
Resource Monitoring
- GKE uses Stackdriver (now part of Google Cloud’s operations suite) to monitor resources in Kubernetes clusters. Stackdriver monitors master and worker nodes and all Kubernetes components across the platform, and also handles logging.
- AKS lets you use Azure Monitor to assess the health of containers and provides Application Insights to help monitor Kubernetes components.
- EKS does not include native resource monitoring, requiring integration with third-party tools like Prometheus.
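One common way to fill that gap on EKS is installing the community Prometheus stack via Helm. The release and namespace names below are placeholders:

```shell
# Add the community Prometheus chart repository and install the monitoring stack
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm install monitoring prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```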
RBAC and Network Policies
All three providers configure Kubernetes deployments with default role-based access control (RBAC), and allow you to limit network access to the Kubernetes API endpoint of your cluster.
However, RBAC and secure authentication alone do not protect the API server, leaving it exposed to attacks that attempt to compromise the cluster. To protect against compromised cluster credentials, apply a classless inter-domain routing (CIDR) allowlist or give the API server an internal, private IP address.
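On GKE, for example, such a CIDR allowlist can be applied with the master authorized networks feature. The cluster name, zone, and CIDR range below are placeholders:

```shell
# Restrict GKE API server access to an allowlisted CIDR range (placeholder values)
gcloud container clusters update demo-cluster --zone us-central1-a \
  --enable-master-authorized-networks \
  --master-authorized-networks 203.0.113.0/24
```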
Beyond this, here are the key differences between the providers:
AKS:
- Requires enabling network policies at cluster creation time.
- Provides policy management features through Azure Policy.
- Integrates with Azure Active Directory (AD) for user authentication.
- Supports using Kubernetes RBAC with Azure AD user identities.
EKS:
- Uses RBAC to maintain its core Kubernetes security controls by default in all clusters.
- Provides a Pod Security Policy with a permissive policy by default.
- Requires you to install and manage upgrades for the Calico CNI on your own.
- Lets you manage networking via managed node groups. This creates a security concern, because all nodes in a managed node group must be able to send traffic out of the virtual private cloud (VPC) and have a public IP address; placing the nodes on private subnets helps mitigate this issue.
GKE:
- Offers network policy with firewall rules at the pod level via the Network Policy API.
- Supports defense-in-depth, protecting applications at several levels, including ingress traffic, east-west traffic, and inter-pod traffic.
- Allows applications to host data from different users in a multi-tenancy model, with network policy rules to prevent pods and services in one namespace from accessing another.
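The namespace-isolation rule described above can be sketched as a standard Kubernetes NetworkPolicy, applied here via kubectl. The policy and namespace names are placeholders:

```shell
# Deny ingress from other namespaces: only pods in tenant-a may reach tenant-a pods
kubectl apply -f - <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: deny-from-other-namespaces
  namespace: tenant-a
spec:
  podSelector: {}          # applies to every pod in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}  # allows traffic only from pods in the same namespace
EOF
```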
Kubernetes Troubleshooting with Komodor
Regardless of which managed Kubernetes service an organization uses, the troubleshooting process remains complex and, without the right tools, can be stressful, ineffective, and time-consuming. Some best practices can help minimize the chances of things breaking down, but eventually, something will go wrong – simply because it can.
This is the reason why we created Komodor, a tool that helps dev and ops teams stop wasting their precious time looking for needles in (hay)stacks every time things go wrong.
Acting as a single source of truth (SSOT) for all of your k8s troubleshooting needs, Komodor offers:
- Change intelligence: Every issue is a result of a change. Within seconds we can help you understand exactly who did what and when.
- In-depth visibility: A complete activity timeline, showing all code and config changes, deployments, alerts, code diffs, pod logs, and more. All within one pane of glass with easy drill-down options.
- Insights into service dependencies: An easy way to understand cross-service changes and visualize their ripple effects across your entire system.
- Seamless notifications: Direct integration with your existing communication channels (e.g., Slack) so you’ll have all the information you need, when you need it.
If you are interested in checking out Komodor, use this link to sign up for a Free Trial.