#024 – Kubernetes for Humans Podcast with Gabriele Bartolini [EDB]

Itiel Shwartz: Hello everyone, and welcome to another episode of the Kubernetes for Humans podcast. Today with me on the show is Gabriele. Gabriele, can you introduce yourself?

Gabriele Bartolini: Yeah, hi, thanks for having me. I’m Gabriele Bartolini, VP of Cloud Native at EDB. EDB is one of the major contributors to the open-source PostgreSQL database, and two years ago we open-sourced an operator for PostgreSQL in Kubernetes called CloudNativePG. My mission over the last few years has been to bring PostgreSQL to Kubernetes. I’ve been using PostgreSQL for almost 25 years now and am a contributor to the project, so I’m happy to be here.

Itiel Shwartz: That’s a lot of years with PostgreSQL! I think we talked about it prior to the show: my company is using AWS RDS, and we’re quite happy with it. We’re big fans of PostgreSQL; almost all of our usage is PostgreSQL. Now, when I think about databases on Kubernetes, it always feels a bit suspicious because, at the end of the day, AWS is very stable, but Kubernetes is not necessarily that stable. Kubernetes has the concept of treating containers like cattle, not pets, but databases are usually our pets, right? So, why should someone like me, or anyone else, run a database on Kubernetes?

Gabriele Bartolini: Yeah, it’s always about the “why.” I’ll tell you why we are on Kubernetes now because I think that’s the reason why, one day, we might not need Kubernetes anymore. I’m a big DevOps person. I was actually the first person to invite Gene Kim to Italy about 10 years ago. Gene Kim is my idol in terms of all the work he’s done with the IT Revolution series of books, all the research into how organizations can become better places for everyone—a safer environment to build innovation. I’ve always been fascinated by what are now called DORA capabilities, like version control systems, trunk-based development, continuous integration, continuous delivery, automated testing, shift-left on security, and so on.

Over the years, we’ve tried to automate everything we do when developing software, and that’s how we came into Kubernetes. At the time, I was very skeptical about running databases on Kubernetes, but when local persistent volumes were introduced, that’s when we started to look at Kubernetes more seriously. In 2019, we began this crazy journey. I remember being in San Diego at KubeCon—that was my first KubeCon—trying to speak about using PostgreSQL on local storage, and everyone was skeptical, saying it was an anti-pattern. But it’s actually a choice you have. Anyway, I’m happy to discuss this more later.

The main reason is what I call the microservice database approach. You can empower developers to own the database and make it part of their development lifecycle, which increases velocity and allows them to deliver multiple times a day. This is possible because of one of PostgreSQL’s foundational capabilities: transactional DDL. You can do change management in a safe way because schema changes can be rolled back. For example, if a new version of your application ships with a migration, you can apply all the changes in a single transaction. If one of them fails, you roll back the whole migration, and PostgreSQL enables you to do that. That’s why we can do end-to-end testing with Kubernetes.
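To make the transactional DDL point concrete, here is a minimal sketch of an all-or-nothing migration. It is an illustration only, not something shown in the episode. To keep it runnable without a database server, it assumes SQLite (from Python's standard library, which also rolls back DDL) as a stand-in for PostgreSQL. The table names and the `apply_migration` helper are hypothetical. Against PostgreSQL the same pattern would wrap the migration statements between `BEGIN` and `COMMIT` or `ROLLBACK`.

```python
import sqlite3


def apply_migration(conn, statements):
    """Run every statement in one transaction; undo all of them if any fails."""
    conn.execute("BEGIN")
    try:
        for stmt in statements:
            conn.execute(stmt)
    except sqlite3.Error:
        # One failure discards every schema change made since BEGIN.
        conn.execute("ROLLBACK")
        return False
    conn.execute("COMMIT")
    return True


def table_names(conn):
    """List the user tables currently present in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [row[0] for row in rows]


def column_names(conn, table):
    """List the column names of a table (the name is the second PRAGMA field)."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# isolation_level=None disables implicit transactions, so BEGIN/COMMIT/ROLLBACK
# are issued explicitly by apply_migration.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Hypothetical migration whose last step fails (the table does not exist).
bad_migration = [
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)",
    "ALTER TABLE users ADD COLUMN email TEXT",
    "ALTER TABLE missing_table ADD COLUMN note TEXT",
]
bad_result = apply_migration(conn, bad_migration)

# The same migration without the broken step is applied as a whole.
good_result = apply_migration(conn, bad_migration[:2])

print(bad_result, good_result, table_names(conn), column_names(conn, "users"))
```

After the failed attempt the schema is exactly as it was before the migration started, which is what makes repeated, automated migrations safe to run in a delivery pipeline.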

Itiel Shwartz: Interesting. What kind of usage patterns are suitable for running a database on Kubernetes? Is it suitable for every kind of database workload?

Gabriele Bartolini: I think we can talk about database-as-a-service versus databases in Kubernetes. That’s a key topic. For example, databases in Kubernetes don’t come for free. You need to know both Kubernetes and PostgreSQL. If you don’t have the skills for PostgreSQL and don’t want to worry about database management, that’s where database-as-a-service makes sense. But if you want to own your data and keep it inside Kubernetes to remove any vendor lock-in, that’s something you can do now. 

In terms of usage patterns, what we love about Kubernetes are the scheduling points. With a declarative approach, using affinity and anti-affinity rules, you can decide where your PostgreSQL database runs. This gives you a separation between the logical organization of Kubernetes and the physical infrastructure, making it entirely transparent to users. We started with a fail-fast approach, testing whether we could reach the same level of consistency and performance on bare metal as we did outside of Kubernetes. I wrote a blog article in 2020 explaining how we did our tests, and it turned out that Kubernetes was performing very well. That’s when we decided to go all-in.
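As a rough illustration of the declarative scheduling rules mentioned above, the sketch below builds a standard Kubernetes pod `affinity` stanza in Python and prints it as JSON. The field names follow the core Kubernetes Pod API. The label keys (`example.com/node-pool`, `example.com/pg-cluster`), the values, and the `build_affinity` helper are hypothetical. This is not the configuration format of any particular operator, which may expose its own affinity settings.

```python
import json


def build_affinity(cluster_label: str, node_pool: str) -> dict:
    """Return a pod `affinity` stanza for one PostgreSQL instance.

    - nodeAffinity: only schedule onto nodes labeled for the given pool.
    - podAntiAffinity: never place two instances of the same database
      cluster on the same node (topologyKey = node hostname).
    """
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [
                    {
                        "matchExpressions": [
                            {
                                "key": "example.com/node-pool",  # hypothetical label
                                "operator": "In",
                                "values": [node_pool],
                            }
                        ]
                    }
                ]
            }
        },
        "podAntiAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": [
                {
                    "labelSelector": {
                        # hypothetical label carried by every instance pod
                        "matchLabels": {"example.com/pg-cluster": cluster_label}
                    },
                    "topologyKey": "kubernetes.io/hostname",
                }
            ]
        },
    }


affinity = build_affinity(cluster_label="orders-db", node_pool="postgres")
print(json.dumps(affinity, indent=2))
```

The point of the design is that the rule describes intent (dedicated nodes, one instance per node) and leaves the actual placement to the scheduler, so users never need to name a specific machine.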

Itiel Shwartz: How common is it now? How many companies are running databases on Kubernetes compared to, say, when you were in San Diego? What’s the trend, and who is adopting this approach?

Gabriele Bartolini: We started with the Postgres Operator, and I think the main differentiator between our operator and others is that ours was designed primarily for Kubernetes users. The others were using tools that existed outside of Kubernetes and were adapted to run inside it. We extended the Kubernetes controller and don’t use StatefulSets; we manage PersistentVolumeClaims directly. This approach has been adopted by others, like Strimzi with Kafka.

In terms of adoption, I’m really happy. While I can’t name many companies, I can say that we have two important adopters: IBM Cloud Pak, where all the PostgreSQL runs with our operator, and Google Cloud. EDB also has a database-as-a-service platform called BigAnimal, which runs on our operator. Other database services based on PostgreSQL are using it as well, like Tembo.

The good thing about PostgreSQL is that it’s a very versatile technology. Over the years, it has learned from other technologies, and you can use XML, JSON, PL/pgSQL, and even vector search extensions like pgvector. It’s a platform that performs well across most use cases, even if it’s not the absolute best in every one.

Itiel Shwartz: Do you think in five years most PostgreSQL instances will run on Kubernetes, or will managed services still have their place?

Gabriele Bartolini: That’s hard to say; it depends on the organization. But I think, especially in Europe with the new Data Act, there’s a possibility that organizations will have to move workloads on-premises again, running their own cloud with Kubernetes, setting up hybrid clouds, or even using multi-cloud. Kubernetes works the same way across these environments, so organizations will have the choice. In my opinion, it’s not only feasible right now, but it might also be the best way to run PostgreSQL everywhere.

Itiel Shwartz: That’s fascinating. I think we’re going to check it out—maybe in a quarter or two—as we’re also a strong PostgreSQL and Kubernetes company. We’re looking to see if we can get it to work as well as or similar to RDS.

Gabriele Bartolini: Great! I think that’s pretty much it. It’s been a pleasure. I love talking about databases on Kubernetes. Thanks a lot.

[Music]

Gabriele Bartolini is a long-time open-source programmer and entrepreneur. Gabriele has a degree in Statistics from the University of Florence. After having consistently contributed to the growth of 2ndQuadrant and its members through nurturing a lean and DevOps culture, he is now leading the Cloud Native initiative at EDB.

Gabriele lives in Prato, a small but vibrant city located in the northern part of Tuscany, Italy – famous for having hosted the first European PostgreSQL conferences. His second home is Melbourne, Australia, where he studied at Monash University and worked in the ICT sector. He loves playing the Blues with his Fender Stratocaster, but his major passions are called Elisabeth and Charlotte!

Itiel Shwartz is CTO and co-founder of Komodor, a company building the next-gen Kubernetes management platform for Engineers.

He previously worked at eBay, Forter, and Rookout as the first developer.

A backend and infra developer turned ‘DevOps’, he is an avid public speaker who loves talking about infrastructure, Kubernetes, Python, observability, and the evolution of R&D culture. He is also the host of the Kubernetes for Humans Podcast.

Please note: This transcript was generated using automatic transcription software. While we strive for accuracy, there may be slight discrepancies between the text and the audio. For the most precise understanding, we recommend listening to the podcast episode.