The pizza problem: what AI doesn’t get about testing FOSSASIA Summit 2026

FOSSASIA Summit 2026 Sunday, 8 March, 2026 9:00 AM (Asia/Bangkok) To Tuesday, 10 March, 2026 7:30 PM (Asia/Bangkok)


When AI agents first hit the scene, my team and I were all in. Our code generators fetched Jira ticket details through the Jira MCP, and we used testing and automation agents to write test cases and then automate them. We felt incredibly productive: within minutes we had dozens of tests and test cases for every change that arrived on the QA environment.

But that feeling was an illusion. The cracks started to show quickly. Our test case agent, for all its speed, wrote tests that were nonsensical or redundant or, worse, missed the most critical edge cases that our business logic depended on. We were left with a mountain of brittle, unmaintainable assets and a false sense of security. We realised we weren't a team of expert testers anymore; we were janitors for a machine, cleaning up its mistakes.

Our mistake was forgetting that AI doesn’t explore. It only guesses the next most probable option. To explain, let me use a story from my own work. Suppose I take my team of 15 out for lunch and ask, “Where should we eat?” If I always pick the most common choice, it’s chicken tikka biryani every time. That’s low temperature. If I only take the top 5 answers, I get chicken tikka biryani, pizza, ramen, burgers and pasta. That’s top-k. If I stop once 90% of opinions are covered, maybe it’s pizza plus biryani. That’s top-p.

This is how generative AI decides. It creates variation, but it never notices that our favourite biryani place has closed or that a new pizza restaurant has opened. Humans notice. That is why exploratory testing cannot be replaced by AI. Don't believe me? Try asking an LLM to create a new, unheard-of dad joke :)

In this session, I will give the audience both a story and a playbook: how to explain AI’s limits when someone tells you to “just use AI for testing,” and how to build a 'Human–AI–Human' workflow that embraces AI as a tool for what it is good at: boilerplate and tedious work.
Key takeaways of the session: <ul><li>Understand why AI generates variety but not true exploration (temperature, top-k and top-p explained simply with the lunch story).</li><li>Have clear language to push back when managers or teams suggest replacing testers with AI.</li><li>Learn a Human–AI–Human workflow: tester guides input → AI drafts → tester edits and applies judgment.</li></ul>
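
The lunch story maps quite directly onto how sampling settings narrow a list of candidates. Below is a minimal Python sketch of that idea, not the speaker's own material: the preference scores, the extra "salad" option and the function names are all hypothetical, chosen only so that the numbers roughly match the story.

```python
import math

# Hypothetical preference scores for the team lunch (illustrative, not from the talk).
scores = {
    "chicken tikka biryani": 60,
    "pizza": 40,
    "ramen": 4,
    "burgers": 3,
    "pasta": 2,
    "salad": 1,
}


def to_probs(raw_scores):
    """Normalise raw scores so they sum to 1."""
    total = sum(raw_scores.values())
    return {dish: score / total for dish, score in raw_scores.items()}


def apply_temperature(probs, temperature):
    """Rescale probabilities; a low temperature sharpens them toward the top choice.

    Assumes every probability is strictly positive.
    """
    weights = {dish: math.exp(math.log(p) / temperature) for dish, p in probs.items()}
    total = sum(weights.values())
    return {dish: w / total for dish, w in weights.items()}


def ranked(probs):
    """Return (dish, probability) pairs, most popular first."""
    return sorted(probs.items(), key=lambda item: item[1], reverse=True)


def top_k(probs, k):
    """Keep only the k most popular dishes."""
    return [dish for dish, _ in ranked(probs)[:k]]


def top_p(probs, p):
    """Keep the most popular dishes until their combined share reaches p."""
    kept, cumulative = [], 0.0
    for dish, prob in ranked(probs):
        kept.append(dish)
        cumulative += prob
        if cumulative >= p:
            break
    return kept


probs = to_probs(scores)
sharpened = apply_temperature(probs, temperature=0.1)

print(max(sharpened, key=sharpened.get))  # low temperature: the usual favourite
print(top_k(probs, 5))                    # top-k: the five most popular answers
print(top_p(probs, 0.9))                  # top-p: stop once 90% is covered
```

With these made-up scores, top-k with k=5 keeps biryani, pizza, ramen, burgers and pasta, and top-p with p=0.9 keeps only biryani and pizza. Every function only reshuffles or trims options that already have a score. A restaurant that opened last week is not in `scores`, so no setting can ever select it, and a closed one keeps its score until a person removes it.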