AI Inference Needs a Global Resilience Layer
Background: AI Inference
First, some background. What is AI inference?
When you ask ChatGPT "Can you explain [X] to me?", what happens? A server somewhere converts that question into tokens and passes them through a trained model. The model performs a large number of calculations to generate a response one token at a time until the answer is complete, and that answer is then delivered back to you.
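To make that loop concrete, here is a minimal sketch in Python. It is not how ChatGPT or any real serving stack is implemented: the whitespace tokenizer, the `TOY_MODEL` lookup table, the `<eos>` marker, and the function names are all hypothetical stand-ins chosen for illustration. The only part meant to mirror real inference is the shape of the loop: tokenize the prompt, ask the model for one next token, append it to the context, and repeat until the model signals that it is done.

```python
EOS = "<eos>"  # hypothetical end-of-sequence marker

def tokenize(text: str) -> list[str]:
    """Toy tokenizer: lowercase and split on whitespace.
    Real systems use subword tokenizers; this is an assumption for brevity."""
    return text.lower().split()

# Stand-in for a trained model: a fixed table mapping the most recent
# token to the next one. A real model computes this from learned weights.
TOY_MODEL = {
    "inference?": "running",
    "running": "a",
    "a": "trained",
    "trained": "model",
    "model": EOS,
}

def next_token(context: list[str]) -> str:
    """Return the next token given everything seen so far."""
    if not context:
        return EOS
    return TOY_MODEL.get(context[-1], EOS)

def generate(prompt: str, max_new_tokens: int = 16) -> list[str]:
    """Generate a response one token at a time."""
    context = tokenize(prompt)
    output: list[str] = []
    for _ in range(max_new_tokens):
        token = next_token(context)
        if token == EOS:
            break  # the model has decided the answer is complete
        output.append(token)
        context.append(token)  # each new token becomes part of the input
    return output

if __name__ == "__main__":
    print(" ".join(generate("What is inference?")))
```

Note that every generated token requires another call to the model, so a long answer means many sequential model calls on the server.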
Note: this is about using an already-trained model, not about training one.