W04.1.2 Exploiting analytical modeling for efficient deployment of emerging AI workloads on diverse accelerator hardware
As AI models continue to evolve, efficiently mapping them onto accelerator hardware becomes increasingly important and increasingly complex. This talk gives an accessible introduction to the fundamentals of AI accelerator mapping and hardware modeling, showing how performance, energy, and memory efficiency depend on the interaction between workload structure, hardware architecture, and execution strategy. We will outline analytical methods that make these trade-offs visible and support systematic design space exploration. We will then use these foundations to examine two timely directions: emerging sequence workloads such as state space models, and novel accelerator platforms such as AMD's AIE-based NPUs. Together, these examples illustrate how mapping and modeling techniques can bridge established accelerator principles with the requirements of new workloads and new hardware targets, while connecting naturally to practical deployment flows through an MLIR-based compiler interface.
