Abstract: We propose a co-design approach for compute-in-memory inference for deep neural networks (DNNs). We use multiplication-free function approximators based on the ℓ1 norm along with a co-adapted ...
Abstract: In recent years, the demand for fast and high-capacity memory has surged due to the emergence of generative AI models such as GPT. To address this need, High Bandwidth Memory (HBM) has ...