W04.3.2 Computing close to memory: a co-design perspective
Next-generation computing architectures will have to confront the demise of scaling laws and the unabated growth of AI workloads. Against this backdrop, Compute Memories (CMs) are especially promising, since they drastically reduce ever-more-costly data movements while offering massive parallelism. Nonetheless, the development of CMs is hampered by the paucity of exploration frameworks for investigating hardware/software co-designed solutions. In this talk, I illustrate two complementary approaches that address this challenge, based on open hardware and on system simulation frameworks, respectively. The talk also details the architecture of domain-specific CMs for AI devised using these strategies, each achieving a >100X performance increase over traditional processor-centric execution. I will highlight their differences in capabilities, target scenarios, and implementation philosophies.
