Troubleshooting K8s with Marino at KubeCon EU 2024 | Fireside Chat

At KubeCon EU 2024, our friends from Equinix held a Developers Fireside Chat with conference-goers. Marino Wijay was one of the guests who got to chat with the team and share their perspective on Kubernetes.

In the video below, Komodor’s own Marino discusses the platform’s focus on providing reliability and proactive solutions for Kubernetes operations. He explains that Komodor helps users assess their Kubernetes operations, ensuring the right security guardrails and distributions are in place. The platform also helps with right-sizing workloads and offers features for troubleshooting and cluster operations. Marino mentions that Komodor leverages AI for log analysis and highlights the importance of security and access controls in the cloud-native ecosystem. He advises new attendees at KubeCon to pace themselves, network, and take breaks to avoid overload.

Key Points Covered:

  • Platform benefits
  • Centralizing Kubernetes troubleshooting
  • Focus on proactive rather than reactive solutions
  • SREs instrumental in developing Komodor
  • Komodor tackles multiple Kubernetes problems

The following is an AI-generated transcript of the conversation:

0:05

we’re back at KubeCon 2024 in

0:08

Paris CloudNativeCon

0:11

I’m here I’m Chris PIV here

0:14

with Equinix by the way here’s Marino Wijay

0:17

did I say that right that’s right Marino

0:19

Wijay you’re with Komodor that’s correct

0:21

I’m with Komodor tell me about

0:23

Komodor give me the 10,000 foot so I I

0:28

uh I’ll I’ll talk a little bit about my

0:29

history and it’ll help you understand

0:31

the the mindset of why Komodor and

0:33

what it’s all about because I don’t like

0:35

to just kind of jump into pitches or

0:37

anything uh back in my sysops days when

0:41

we had an issue we didn’t have one

0:44

centralized platform to be able to see

0:46

where those issues were originating from

0:48

what what was going on how to backtrack

0:51

and try and figure out what caused that

0:52

issue the RCA the root cause analysis

0:55

and you would just have to jump through

0:56

so many different systems but back then

0:58

we were talking about physical physical

1:00

hardware with operating systems we’re

1:02

talking about virtualization like

1:04

vSphere Hyper-V that was the name of the

1:07

game back then but you didn’t have a

1:09

single platform to see it all and

1:11

everything that we did was very

1:14

reactive fast forward to the kubernetes

1:16

conversation kubernetes is complex you

1:19

know there’s a lot going on there’s so

1:21

many moving pieces but it’s an elegant

1:23

system when you when you look at it from

1:25

about 10,000 ft right yeah very elegant

1:28

but when things start to go wrong what

1:30

do you end up doing kubectl logs pod

1:33

name whatever or you’re digging into

1:35

like the name space and figuring out if

1:37

it has the right label or you’re looking

1:39

into so many other details that you’re

1:41

using kubectl as the way to figure

1:44

out what’s wrong now Komodor enters

1:47

the chat because they live on this this

1:51

idea of providing reliability first and

1:55

think about the reliability Story versus

1:57

Disaster Recovery let’s be proactive

1:59

tell you about what could go wrong and

2:01

help you fix it beforehand oh right all

2:04

right yeah and so that’s not it though

2:06

it’s it’s a platform that really helps

2:08

you assess your kubernetes operations to

2:11

help you understand if you’ve got the

2:12

right security guardrails if you’ve got

2:15

the right distributions running if

2:17

you’re out of date or up to date if

2:19

someone messed up with an image tag like

2:22

we’re we’re talking about very specific

2:23

things where kubernetes can go wrong and

2:26

it’s either within our control or

2:27

outside of our control but now we have a

2:29

platform that tells us all of this and

2:31

that is what Komodor is okay so some of

2:34

the things you mentioned there I know

2:35

there’s other open-source tools that

2:37

try to tackle pieces of it I don’t are

2:40

you leveraging those and then

2:42

coordinating them all or are you

2:43

tackling all those problems like

2:45

yourself like so allowed images and

2:48

stuff like that you know we’re tackling

2:50

all those problems together as a

2:51

platform as a platform okay the idea is

2:54

you don’t want to have to jump through

2:56

different tools to be able to discern

2:58

what is going on and the data might not

3:00

even correlate to begin with you want

3:03

the data to correlate and you want a

3:04

system to do that for you and there’s no

3:06

like AI magic going on behind the scenes

3:09

this is literally SRE magic like SREs

3:13

have put their logic into building

3:15

something like this out in fact when I

3:17

when I joined Komodor about two and a

3:19

halfish weeks ago one of the first

3:21

onboarding plan one of the first things

3:23

I had to do as part of my onboarding

3:24

plan was to meet people one of those

3:27

people was uh near B atar and he

3:30

actually helped develop this platform

3:31

because he is an SRE and he saw all these

3:35

different problems when he was working

3:37

with kubernetes developing it trying to

3:39

scale it and he decided he wanted to

3:42

translate this into a platform which is

3:43

why Komodor exists today now I I bring

3:46

this up because when you have someone

3:49

that is faced with the problem and then

3:52

is tasked with innovating a solution to

3:54

that problem that is effectively what we

3:56

have here we have the right people that

3:58

know the problem space very well know

4:01

how to fix kubernetes but understand

4:03

that because it’s a growing ecosystem

4:04

and things change CRDs break all the time

4:07

right well you need a system in place to

4:10

be able to catch that and help SREs for

4:12

the tomorrow figure that out so that

4:14

makes me wonder like how do you get the

4:15

new experience the new problems in like

4:19

cuz if he’s now he’s he’s selling the

4:20

solution but now he’s no longer doing

4:23

the the day-to-day groundwork so like how

4:25

do you get that new stuff in yeah a lot

4:27

of that is updating we have to constantly

4:29

update like we can’t keep the platform

4:31

static I mean it’s a SaaS-based model and

4:35

if your SaaS is not up to date folks are

4:37

not going to use it consistently right

4:39

and for you’re a little motivated then

4:41

yeah all right but it’s beautiful

4:43

because it’s not just that it’s the

4:44

security guardrails too it’s the cost

4:46

aspect too I didn’t even talk about cost

4:49

and it’s not like I’m going to click a

4:50

button and boom I just immediately save

4:52

$5,000 or something a month or however

4:54

much you you use what it really is a a

4:58

good analysis of is how to right-size

5:00

your workloads and right sizing is a

5:03

conversation that is often missed yeah

5:05

I’ve been seeing some traction on that

5:07

so you guys offer that as well then

5:09

absolutely okay um all right so now that

5:13

we’ve got the overview like what does it

5:15

look like if you want to start going

5:17

with this like how complicated is it to

5:19

to get Komodor to enter your

5:22

chat well as long as you’re cool with

5:24

using a SaaS and you got access to the

5:28

internet which I’m sure you

5:30

um go to komodor.com and sign up and

5:33

you can get a free trial pretty quickly

5:36

and you can onboard your first few

5:37

clusters and you get to see what’s going

5:39

wrong or going well okay all right all

5:41

right so um and then the the interface

5:44

is all web based then I okay it’s all

5:46

it’s all web based so there’s nothing

5:48

you need to download what goes into your

5:50

cluster is an agent that runs that

5:52

collects a lot of the telemetry data a

5:54

lot of the uh data around logs events

5:58

anything that we feel that is not noise

6:00

as well we’ll bring into our platforms

6:02

for deeper analysis okay

6:05

so the big theme at this KubeCon is AI

6:09

right you said you’re not really doing

6:11

the AI thing you’re doing like the

6:13

actual intelligence I guess that’s also

6:15

AI huh anyway you got like actual

6:18

experience AE maybe uh Sr Sr AE um

6:24

actual Sr a ASR all right we’re getting

6:28

we’re getting somewhere so you’ve got

6:30

that going on where do you see do you

6:33

see like AI having a place maybe with

6:36

Komodor maybe without Komodor maybe

6:38

just in the general Cloud native

6:39

ecosystem we do we do we actually do

6:41

have a little area where we’re

6:43

leveraging some AI and it’s just because

6:45

look when you when you’re thinking about

6:48

how ChatGPT has helped us summarize

6:50

like long articles into like five key

6:53

bullet points that is exactly what we

6:55

want to use it for and it’s something

6:58

like I have all of these

7:00

mhm cut out the noise and just tell me

7:02

what’s going wrong oh my gosh that’s it

7:04

that’s really what we’re using it for

7:06

and as somebody who actually

7:07

maintains a kubernetes controller I am

7:10

guilty of maybe triggering the same error

7:13

message two to three times each time the

7:16

error occurs yeah so somebody on the other

7:19

end that can maybe compact that down

7:22

would be really

7:23

wonderful all right all right all right

7:27

so do you guys have any uh new new

7:29

announcements or or fresh features or

7:31

anything recent that you want to talk

7:33

about yeah we’ve really uh doubled down

7:34

on our reliability features as well as

7:37

um features around cluster operations

7:39

because we’ve really wanted to to

7:41

demonstrate that it’s not just about

7:43

troubleshooting it’s not just about

7:45

reliability there are multiple problems

7:47

and multiple areas of kubernetes that

7:49

you need to look at if you look at it

7:51

with what just one tool you can’t run it

7:53

to the tool sprawl situation so we want

7:55

to solve that and we’ve been strongly

7:57

emphasizing how you have to treat this

7:58

as a reliability platform versus a

8:00

troubleshooting one one of the largest

8:02

things that we really focused on was

8:04

creating multi- tiers of of folks that

8:07

either don’t really understand

8:08

kubernetes or folks that deeply do and

8:11

the reason for this is if you hand off

8:13

the keys to a developer they may not

8:15

always know what kubernetes is or how to

8:17

work with it but if something’s going

8:19

wrong you kind of want to help guide

8:21

them to the right right approach and

8:23

that’s what we do okay as well right do

8:25

you have like tools in there for like

8:27

day two needs like backups mic

8:30

anything like that so these are still uh

8:31

work in progress there’s still a lot of

8:33

additional capabilities that we’re

8:35

working through as well and we’re

8:37

growing the platform yeah that’s great

8:39

though cuz like it it seems like the

8:41

next big area well it’s been for a while

8:43

right but it’s like it’s real easy to

8:46

install kuber well I think it’s easy to

8:48

install kubernetes I don’t know if

8:49

everyone does it’s just maintaining it

8:52

going on all right so you guys are here

8:54

you got a booth what are you Booth j18 I

8:56

think um right and then you you have any

9:00

experiences going on in your booth you

9:01

just talking the product or well we’re

9:03

doing a bunch of demos uh so you’re

9:05

welcome to come by take a look at see

9:06

what’s going on but we also have a

9:08

raffle going on oo um so folks can

9:11

absolutely try for it I think it’s a Nintendo

9:14

Switch and some other cool gear

9:16

like an Apple Watch as well Apple watch

9:19

and stuff yeah and um yeah I mean you

9:21

know one of the cool things is like when

9:23

your when your CTO and CEO are on the

9:25

floor giving demos as well it’s a

9:27

different experience because

9:29

um in all honesty I don’t see that a lot

9:32

right I don’t

9:33

see your U your leadership team getting

9:36

right in there like deeply talking about

9:39

something and then demoing to you or

9:41

demoing to like someone that’s walking

9:43

up to the booth right so that’s

9:45

important too that would be pretty sweet

9:47

yeah I love it uh all right so you’re

9:49

at KubeCon what are you looking forward to

9:51

doing while you’re here what have you

9:53

been enjoying so I I’ve been really

9:56

enjoying seeing a lot of companies

9:59

transition and consolidate into

10:01

platforms right and look there platform

10:04

engineering is a very umbrella term and

10:06

I bring this up because there are

10:07

different areas of platform engineering

10:09

there’s Network platform engineering

10:11

there’s reliability engineering

10:14

there’s other facets around storage and

10:16

backup and data management and you you

10:19

cannot tie these all together still like

10:21

you can yes you can you can have an or

10:23

overarching um Team or or function that

10:26

kind of oversees a lot of this and can

10:28

provide some stitching but each has

10:31

their own domain and I’m noticing that

10:33

consolidation happening much more

10:35

quickly and why I say this is if you

10:37

look back to what’s been happening over

10:39

the last six seven 8 months there’s a

10:42

lot of consolidation going on the

10:45

biggest consolidation I wanted to call

10:47

out was Cisco’s acquisition of

10:50

Isovalent of of what of Isovalent of

10:53

Isovalent okay and I call he just did

10:55

another acquisition that yeah so that

10:58

that’s important too right but it’s

11:02

important because when you look at a

11:05

platform you expect a lot out of that

11:07

platform but the reality is like it’s

11:10

what I said much earlier on you can’t

11:11

have one platform that does it all so

11:13

you have to break that up into smaller

11:15

pieces and what I think Cisco is trying

11:17

to do is create a a platform of

11:19

platforms and those sub platforms are

11:22

the pieces that Cisco’s been acquiring

11:24

now Cilium in itself Isovalent and Cilium the

11:27

open source technology in itself is a

11:29

platform as a network platform and

11:31

that’s why I think there’s like a

11:33

strong um I guess movement around

11:36

network platform engineering

11:38

altogether because you’ve got service mesh

11:39

CNI and you’ve got the physical

11:41

networking and the virtual networking

11:42

that wraps around it too which is

11:44

something I think very much ties into

11:46

what Equinix Metal and Equinix is doing

11:48

as well right yeah ex award yeah and we

11:51

we like to be the building ground for

11:52

you to put all your platforms on top of

11:55

exactly you know and a so that’s one of

11:57

the biggest themes uh the the other

11:59

biggest theme I’ve noticed is that yes

12:02

there is AI but I don’t think it’s AI is

12:06

in the name of everything that people

12:08

are doing they actually are implementing

12:09

it somehow right whether it’s a simple

12:12

API call out to OpenAI so that we can

12:16

do some quick analysis of something or

12:18

get some quick data or something like

12:20

what we’re doing the log analysis right

12:23

but there are other folks out there that

12:25

are actually injecting AI into

12:27

Kubernetes directly K8sGPT for example

12:30

I’m sure you might have like had

12:30

conversations about that already right

12:33

um not here but yes in general yeah the

12:38

the other Trend which is you know it’s

12:40

always going to be a trend is is

12:42

security right security is going to be a

12:45

never never non-stop ending journey and

12:50

it’s really about authentication and

12:51

authorization and yesterday when I was

12:53

walking the floor I noticed that uh Mark

12:56

from Tremolo Security was being

12:58

interviewed and he he offers up a

13:00

technology called OpenUnison as well

13:03

and you know as I start to see what he’s

13:05

doing and some of the other Security

13:07

based companies out there there really

13:11

is a lot of gaps when it comes to

13:14

access a lot of companies haven’t

13:16

implemented access controls like they

13:18

thought they have and so companies like

13:20

what Mark does exists like what even

13:22

what we do at Komodor exist to solve

13:25

that problem as well not just for

13:26

kubernetes but even for other kinds of

13:29

real estate as

13:32

well I love it so for people who might

13:35

be new to the to the show do you have

13:37

any advice for

13:38

them yeah

13:40

so KubeCon is a big big event it’s a

13:44

multi-day event it’s not just KubeCon

13:46

it’s it’s Rejekts then it’s the co-los

13:49

and then it’s KubeCon and by the time you

13:51

get to the first day of KubeCon if you’ve

13:53

been to Rejekts you’re probably already

13:54

like pretty much drained by you’re done

13:56

yeah so this time around I only I

13:59

arrived for the first day of KubeCon I

14:00

actually made it for like half day of

14:02

co-lo which is great because I got the

14:04

opportunity to um connect with some

14:06

folks I caught a few talks I caught a

14:09

talk on um on in on basically Co

14:12

causality and instrumentation of

14:15

observability and then yesterday I

14:17

caught a a chaos engineering workshop

14:20

and you know it’s really cool to see

14:21

these perspectives because it’s not just

14:23

about kubernetes today it’s about what’s

14:25

beyond so I think for new folks I think

14:29

there are sessions out there for beginner

14:30

kubernetes but there’s also advanced

14:33

stuff out there as well if you’re

14:34

looking for that also I think the

14:36

biggest thing you can do here is connect

14:38

and network there’s so many people out

14:40

here that are solving problems they

14:42

might have solved your problem you just

14:44

got to talk to them and connect with

14:45

them and find out what they’re up to but

14:48

I also will say like you got to pace

14:49

yourself wear good shoes drink lots of

14:52

water because you’re my mouth’s already

14:54

getting dry right now like I’m like

14:56

looking for water I’m like only going to

14:58

walk over here hand me some water uh

15:01

drink lots of water and uh you know

15:04

hydrate eat take breaks when you can get

15:07

out in the sun or get out and get some

15:08

fresh air because being in here like it

15:11

it can get intense can be overloading at

15:14

times and you need to kind of give

15:15

yourself a little bit of space and a bit

15:17

of a break so that you can get back into

15:19

it and also one last thing don’t try to

15:21

get to all the sessions get to the

15:23

Keynotes if you can and get to a few

15:25

sessions right the hallway track is the

15:27

best track to learn everything else I’ve

15:29

been hearing a lot of plugs for hallway

15:30

track this year all right well speaking

15:32

of taking breaks I think it’s time for

15:34

us to take a break thank you for Marino

15:36

and we are going to stay tuned for more

15:39

of Equinix at KubeCon CloudNativeCon EU

15:44

Paris how many more names do we have

15:47

2024 we’ll see you for more stay tuned

15:50

right here on the Equinix Developers channel

15:52

on YouTube Take Care everyone
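
A side note on the log-noise point from around 7:07 in the conversation, where a controller ends up emitting the same error message two or three times per error: the idea of compacting those repeats can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The function name `collapse_repeats` and the sample log lines are hypothetical, and nothing here reflects how Komodor's own log analysis is implemented.

```python
from collections import Counter


def collapse_repeats(lines):
    """Collapse duplicate log lines into (message, count) pairs.

    Messages are compared after stripping surrounding whitespace, and the
    result keeps the order in which each message first appeared.
    """
    # Counter is a dict subclass, so it keeps first-insertion order (Python 3.7+).
    counts = Counter(line.strip() for line in lines if line.strip())
    return list(counts.items())


if __name__ == "__main__":
    # Hypothetical controller output: one error reported three times.
    sample = [
        "reconcile failed: image tag not found",
        "reconcile failed: image tag not found",
        "retrying in 5s",
        "reconcile failed: image tag not found",
    ]
    for message, count in collapse_repeats(sample):
        print(f"{count}x {message}")
```

Exact-match counting like this only handles byte-identical repeats; lines that differ by a timestamp or request ID would need to be normalized first.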