TY - GEN
T1 - Flexible Instruction Set Architecture for Programmable Look-up Table based Processing-in-Memory
AU - Connolly, Mark
AU - Sutradhar, Purab Ranjan
AU - Indovina, Mark
AU - Ganguly, Amlan
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Processing in Memory (PIM) is a novel computing paradigm that is still in a nascent stage of development. As a result, standardized and modular Instruction Set Architectures (ISAs) for PIM devices are noticeably lacking. In this work, we present the design of an ISA aimed primarily at a recent programmable Look-up Table (LUT) based PIM architecture. Our ISA performs three major tasks: i) controlling the flow of data between the memory and the PIM units, ii) reprogramming the LUTs to perform the operations required by a particular application, and iii) executing sequential steps of operation within the PIM device. A microcoded architecture for the Controller/Sequencer unit keeps circuit overhead to a minimum while offering the programmability to support any custom operation. We provide a case study of CNN inference, large matrix multiplication, and bitwise computation on the PIM architecture equipped with our ISA, and present performance evaluations based on this setup. We also compare its performance with that of several other PIM architectures.
KW - Convolutional neural network
KW - Deep neural network
KW - DRAM
KW - Instruction set architecture
KW - Look-up table
KW - Microcode
KW - Processing in memory
UR - http://www.scopus.com/inward/record.url?scp=85123913538&partnerID=8YFLogxK
UR - https://doi.org/10.1109/ICCD53106.2021.00022
U2 - 10.1109/ICCD53106.2021.00022
DO - 10.1109/ICCD53106.2021.00022
M3 - Conference contribution
AN - SCOPUS:85123913538
T3 - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
SP - 66
EP - 73
BT - Proceedings - 2021 IEEE 39th International Conference on Computer Design, ICCD 2021
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 39th IEEE International Conference on Computer Design, ICCD 2021
Y2 - 24 October 2021 through 27 October 2021
ER -