A technical deep dive into evaluating coding agents on real-world Business Central tasks. Results show resolution rates of up to roughly 70%, but they also highlight gaps in reliability and in handling complex scenarios. Learn how the benchmark is built and what the results mean for AL development in practice.
Speaker(s)
Klaus Marius Hansen is a Principal Software Engineer at Microsoft, based at the Microsoft Development Center Copenhagen in Lyngby, Denmark. Over his ten years at Microsoft, Klaus has worked across the Dynamics and Power platforms and now focuses on AI, specifically the offline and online evaluation of AI-powered experiences. He is passionate about building rigorous evaluation frameworks that ensure AI features deliver real value to SMB customers and partners.