Pop Goes The Stack Less

Pop Goes The Stack Less - Detailed Analysis & Overview

Everyone's chasing flash, but a quiet revolution is happening where the real money is: predictive AI.

Uptime used to mean reliability. But in the LLM era, five nines just means your liar is always available. Real reliability now ...

GPUs get all the attention, but in inference, the real bottleneck is often memory, specifically the KV cache. In this episode of ...

Multi-model AI isn't a buzzword anymore; it's how organizations are actually operating. In this episode of ...

AI is no longer a lab tool; it's showing up in pipelines, production systems, and the places where “seemed like a good idea” ...

Remember when ... were quiet little endpoints that waited politely for humans to click buttons? Yeah, that's over. Now you've ...

"It's just a chat" is the most dangerous sentence in AI. In this episode of ...

The perimeter isn't where you left it. Agents are on the move, APIs are on fire, and your infrastructure is about as ready for this as a ...

The 2025 API Threat Report is out, and shocker: we're still getting wrecked by injection, data leaks, and BOLA. That's Broken ...

Prompt injection has been the headline security problem for the last year, but have we been guarding the wrong layer?

Agents break the old rules of observability. Latency, throughput, and error rates still matter, but once software starts making ...

Why do researchers keep describing large language models like aliens? Because in enterprise environments, they often behave ...

Coming to you from the Hub, Joel Moses and guest co-pilot Oscar Spencer cut through the conference ...

Prompt injection isn't some new exotic hack. It's what happens when you throw your admin console and your users into the same ...

Ops used to be a world of YAML, caffeine, and careful deploy rituals. Now it's probabilistic models, token-based cost surprises, ...

Recorded live at ... in Las Vegas, this episode of ...

Agents are popping up everywhere: tiny bots spinning up for a task, then dying off. They shouldn't carry long-lived credentials any ...

OpenClaw is what happens when the industry looks at autonomous agents and decides they should have more autonomy, more ...

Photo Gallery

Pop Goes the Stack | Less small talk, more substance | AI/ML

Pop Goes the Stack: Five nines of wrong - Detecting drift and errors in AI systems | LLM

Pop Goes the Stack | KV cache is the real inference bottleneck (Not GPUs) | Agentic AI

Pop Goes the Stack | Model routing isn’t load balancing (And that’s why you’re not ready) | AI

Pop Goes the Stack | DevOps meets agents: Risk, audit, and the Deming playbook | AI

Pop Goes the Stack | MCP tools and AI risks: The case for slow, secure adoption | AI API

Pop Goes the Stack | Unstructured Integration: The surface area risking AI privacy & compliance | AI

Pop Goes the Stack | The perimeter has shifted | Agentic AI

Pop Goes the Stack | BOLA exploits: The #1 API threat and how to stop it | API Security

Pop Goes the Stack: Why Prompt Filters Fail Against LLM Attacks | GenAI

Pop Goes the Stack | Measuring what matters: Observability for agents | Agentic AI

Pop Goes the Stack | Alien autopsy of LLMs: Constitutions, deception, guardrails | AI
