
Bradley Andersen

AI Inference Needs a Global Resilience Layer

Background: AI Inference

First, some background. What is AI inference?

When you ask ChatGPT "Can you explain [X] to me?", what happens? A server somewhere converts that question into tokens and passes them through a trained model. The model performs a large number of calculations and generates a response one token at a time until the answer is complete. The server then sends that answer back to you.

Note: we're not talking about training a model here, but about using an already-trained one.
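
To make the request flow above concrete, here is a minimal Python sketch of the tokenize, generate, detokenize loop. It is a toy under stated assumptions, not how ChatGPT or any real model works: the vocabulary, the `NEXT_TOKEN` lookup table standing in for a trained model, and the function names `tokenize`, `generate`, `detokenize`, and `answer` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical vocabulary; a real tokenizer has tens of thousands of entries.
VOCAB: List[str] = ["<eos>", "can", "you", "explain", "x", "sure", "it", "is", "simple"]
TOKEN_ID: Dict[str, int] = {tok: i for i, tok in enumerate(VOCAB)}
EOS = TOKEN_ID["<eos>"]

# Stand-in for a trained model: a fixed table mapping the last token to the next one.
# A real model computes a probability distribution over the whole vocabulary instead.
NEXT_TOKEN: Dict[int, int] = {
    TOKEN_ID["x"]: TOKEN_ID["sure"],
    TOKEN_ID["sure"]: TOKEN_ID["it"],
    TOKEN_ID["it"]: TOKEN_ID["is"],
    TOKEN_ID["is"]: TOKEN_ID["simple"],
    TOKEN_ID["simple"]: EOS,
}


def tokenize(text: str) -> List[int]:
    """Convert text to token IDs (whitespace split; unknown words raise KeyError)."""
    return [TOKEN_ID[word] for word in text.lower().split()]


def generate(prompt_ids: List[int], max_new_tokens: int = 16) -> List[int]:
    """Produce new tokens one at a time until <eos> or the token limit."""
    if not prompt_ids:
        return []
    context = list(prompt_ids)
    new_tokens: List[int] = []
    for _ in range(max_new_tokens):
        next_id = NEXT_TOKEN.get(context[-1], EOS)
        if next_id == EOS:
            break
        context.append(next_id)      # each new token becomes part of the context
        new_tokens.append(next_id)
    return new_tokens


def detokenize(ids: List[int]) -> str:
    """Convert token IDs back to text."""
    return " ".join(VOCAB[i] for i in ids)


def answer(question: str) -> str:
    """Full inference path: text in, tokens through the 'model', text out."""
    return detokenize(generate(tokenize(question)))


if __name__ == "__main__":
    print(answer("Can you explain x"))
```

The point of the sketch is the shape of the loop: every generated token is appended to the context before the next one is produced, which is why a response arrives token by token rather than all at once.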

Welcome to the K8GB Blog!

We're excited to launch the official K8GB blog! This new platform will serve as your go-to resource for everything related to K8GB - the cloud native global load balancing solution for Kubernetes.